aboutsummaryrefslogtreecommitdiff
path: root/en_GB/Introduction to Information Security/introduction_to_information_security.md
blob: fbdd9ad504524ede34b33bb96a427333d3235068 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
# Introduction to Information Security - Lernzettel

## Security

### Security objectives

- Confidentiality  
  Contents of objects cannot be read by third parties.
- Integrity  
  Whether or not a message has been modified between origin and receiver.
- Availability  
  Guaranteed access to the information for permitted parties.
- Access Control  
  Only permitted parties are allowed to access the information.
- Non-repudiation  
  Proof that an entity was involved in some event.

### CIA

- Confidentiality
- Integrity
- Availability

### Perkerian hexad

- Confidentiality
- Integrity
- Availability
- Utility  
  Ensures that the information is useful and insensitive to e.g. lost keys.
- Possession or Control  
  Be sure that the possessor is in control of the information at all times.
- Authenticity  
  Verification of claimed identities. Notice: In most cases, this just proves entities (e.g. machines), not humans.
  Also, there must be a point in time where authentication starts. If this step is taken automatically by a machine (e.g. session start),
  there is no valid inference to the actual human.

### Secrecy

Confidentiality+.  
Not only provides hidden contents, but also hides the fact that there is content at all.

### Strategy

1. Prevention
2. Detection
3. Reaction

## Reliability

Reliability addresses consequences of accidential errors. Reliability checks if service interruptions cause low or medium disturbance.

## Safety

Safety addresses catastrophic influences on the environment (e.g. human life). Safety checks if service interruptions cause very high disturbance and even harm.

## Authentication

### Modes

As a user, you can be authenticated on the basis of

- Something you know (e.g. password)
- Something you hold (e.g. ID card)
- Who you are (e.g. biometrics)
- What you do (e.g. behaviour analysis)
- Where you are (e.g. geo-location)

### Passwords

- Assure correct receiver of the initial password. Communication might be intercepted.
- Call back already authenticated entities, which are authorized to hand over the password.
- Force the user to change the password immediately after first login.
- Provide multi-factor authentication to let the user to be able to reset forgotten passwords without costly helpdesks.

### Guessing passwords

- Brute force
- Intelligent search (alphabet limits, length limits)

### Password protection

- No expiry dates  
  Studies have shown that this results in worse passwords.
- No restrictions in password alphabet  
  Studies have shown that this leads to less possibilities in exhaustive guessing and therefore leads to worse passwords.
- Set a minimum length instead  
  Has a higher impact than complexity. Set the maximum to at least 64.
- No hints
- Show passwords while typing  
  Doing the opposite motivates the user to choose shorter passwords.
- Allow passwords to be pasted  
  This enables secure password managers to be used.
- Forbid commonly used passwords  
  Makes dictionary attacks difficult.
- Limit number of failed password attempts

### Challenge Response Authentication

1. Authenticator knows the password.
2. User identifies himself and requests authentication.
3. Authenticator sends *nonce* (random, temporary number) to the user (challenge).
4. User computes the one-way-function result of the concatenation of password and nonce, sends the result to the authenticator.
5. Authenticator computes the same.
6. Authenticator compares his computed value with the users computed value. If they match, authentication is successful.

### HTTP Digest Authentication

Same as *Challenge Response Authentication*, but the compare value is computed as:  
$$
\text{digest} = \text{h}(\text{h}(\text{username}:\text{realm}:\text{password}):\text{nonce}:\text{h}(\text{method}:\text{digest-uri}))
$$, $\text{h}$ being a one-way-function, $\text{:}$ being the concatenation operator.

### HTTP Session authentication

Authenticating every single *POST* and *GET* request is inefficient.
Therefore, authentication is done at the beginning of the session and remains until the end of the session.

#### Unilateral Authentication

Only one party is authenticated, while the other is an anonymous entity. This is common practice in the web with HTTPS.

#### Mutual Authentication

Every communication party is authenticated to each other.

### Biometrics

#### Use cases

| Use case | Cardinality | Description |
| -------- | ----------- | ----------- |
| Identification | 1:n | Identify the user from a set of users in a database. |
| Verification | 1:1 | Verifies the single claimed identity by comparing captured patterns to the stored patterns. |

#### False match rate (FMR)

How often is a false match attempt successful, which it should not be? Best case: $\text{FMR} = 0$.
It can be interpreted as a measure of quality of the specific scheme. The lower the value, the better.  
$$
\text{FMR} = \frac{\text{\# successful false matches}}{\text{\# attempted false matches}}
$$

#### False non-match rate (FNMR)

How often is a genuine match attempt rejected, which it should not be? Best case: $\text{FNMR} = 0$.  
$$
\text{FNMR} = \frac{\text{\# rejected genuine matches}}{\text{\# attempted genuine matches}}
$$

#### Fitting Rate

A value (in %) indicating how much the captured pattern fits the stored pattern in the database.

##### Examples

A *Fitting Rate* of 100% indicates that all of the captured pattern data fits the data stored in the database (unlikely, due to noise).
A *Fitting Rate* of 50% indicates that half the pattern data fits the data stored in the database.

#### Matching Threshold

A value (in %) determining the minimum *Fitting Rate* for an matching check to be considered as matching.
A lower *Matching Threshold* raises the amount of false positive matching checks, but lowers the amount of rejected genuine matching checks.
*FMR* increases, *FNMR* decreases.
A higher *Matching Threshold* raises the amount of rejected genuine matching checks, but lowers the amount of false positive matching checks.
*FMR* increases, *FNMR* decreases.

#### Equal Error Rate (EER)

The value of *Matching Threshold*, at which $\text{FMR} = \text{FNMR}$.

#### Failure-To-Capture Rate (FTC)

Frequency of failing to capture a sample.

#### Failure-To-Extract Rate (FTX)

Frequency of failing to extract a feature of a sample.

#### Failure-To-Acquire Rate (FTA)

Frequency of failing to acquire a biometric feature.  
$$
\text{FTA} = \text{FTC} + \text{FTX} \times (1 - \text{FTC})
$$

#### False Accept Rate (FAR)

$$
\text{FAR} = \text{FMR} \times (1 - \text{FTA})
$$

#### False Reject Rate (FRR)

$$
\text{FRR} = \text{FTA} + \text{FNMR} \times (1 - \text{FTA})
$$

#### False Positive Identification Rate (FPIR)

Probability of some sample to match at least one of the entries in the database.  
$$
\text{FPIR} = (1 - \text{FTA}) \times (1 - (1 - \text{FMR})^{n})
$$
, $\text{n}$ being the database size.

#### Biometrics in remote authentication

*FPIR* scales up with increased *n* (database size), which makes it unusable in remote authentication with large databases.

##### Examples

Using a biometric scheme with $\text{FMR} = 0.01\%$ and a database of size $\text{n} = 80000$ results in $\text{FPIR} = (1 - 0) \times (1 - (1 - 0.0001)^{80000}) = 99.97\%$.

## Encryption

### Kerckhoffs' Principle

Do not rely on the secrecy of the algorithm; only the keys have to be secret.  
*Security by obscurity* is nonsense.

### Cipher

#### Block Cipher

*Block Cipher* encrypts long sequences of data with the same key. Single bit errors in ciphertext cause bit errors on half of the cleartext on average.

#### Stream Cipher

*Stream cipher* encrypts short sequences of data with a changing key per sequence, coming from a *key stream*, generated by a *key generator*.
Security of ciphertext depends on the security of the *key generator*. Single bit errors in ciphertext cause single bit errors in cleartext.
This is commonly used in noisy channels.

### Public Key Encryption

*A* encrypts message with public key of *B* (publicly available via *Public Key Infrastructure* (PKI)).
This message is only decryptable with the private key of *B* (only available to *B*).  
Public keys need to be bound to the actual receiver! You have to make sure the public key you have is actually the key of the receiver
and not somebody you think is the receiver (receiving machine being used by many users, spoofing).

### Message Authentication Codes (MAC)

*Message Authentication Codes* are used to verify the integrity of a message (proof, that the message has not been modified between sender and receiver).  

1. Sender and receiver share a common secret key *k*.
2. Sender computes $\text{MAC}_\text{sent} = \text{h}(\text{k}, \text{x})$, *h* being a one-way-function, *x* being the message.
3. Sender sends message *x* with $\text{MAC}_\text{sent}$.
4. Receiver receives the message and $\text{MAC}_\text{sent}$ and calculates $\text{MAC}_\text{received} = \text{h}(\text{k}, \text{x'})$ with *x'* being the received message.
5. Receiver compares $\text{MAC}_\text{sent}$ and $\text{MAC}_\text{received}$. If they match, the message is considered not modified.

### Digital Signatures

*Digital Signatures* are used to verify the integrity of a message, same as *MAC*.
Compared to *MAC*, it does not rely on shared secret keys. Instead, it uses *Private Key* for signing, and *Public Key* to verify.  

1. Sender computes $\text{sig} = \text{h}(\text{private}, \text{message})$.
2. Sender sends message and appends signature $\text{sig}$.
3. Receiver verifies signature $\text{sig}$ using *Public Key* of the sender.

## Sessions

### TLS Sessions

1. Client proposes a suite of ciphersuites to the server.
2. Server picks a ciphersuite and sends the certificate (containing the public key of the server) to the client.
3. Client generates *pre-master-secret*, encrypts it with the servers public key and sends it to the server.
4. Only the server can decrypt the *pre-master-secret* with his private key (authentication).
5. Session is established and symmetrically encrypted using shared but secret key.

### Forward Secrecy

*Forward Secrecy* ensures that even if the private key of the server gets compromised, old session keys are not compromised.
Therefore, secure sessions remain secure, even with revealed private keys afterwards.  
TLS provides *Forward Secrecy*, because with a compromised public key, new sessions can be forged, but old sessions are not compromised,
because they are encrypted with *master-secret*.

### Diffie-Hellman Key Agreement

#### Principle

1. *A* has a secret box, applies a lock on it and sends it to *B*.
2. *B* adds his lock to the package and sends the result to *A*.
3. *A* removes his lock and sends the result to *B*.
4. *B* removes his lock and has the secret box.

#### Discrete Logarithm Problem (DLP)

Group $(\text{G}, \circ)$, Generator $\text{g}$ so that $\forall \text{g'} \in \text{G} : \exists \text{n} \in \mathbb{N} : \text{g'} = \text{g}^{\text{n}} = \text{g} \circ ... \circ \text{g}$.

- Discrete exponentiation (calculating $\text{g}^\text{n}$) is easy.
- Taking discrete logarithm (calculating $\text{x}$ for given $\text{g'} = \text{g}^\text{x}$) is computationally infeasable.
- Normally, $\text{G} = \mathbb{Z}/\mathbb{Z}_\text{p}$ (elements of $\mathbb{Z}$ modulo some prime $\text{p}$).
- Exponention modulo $\text{p}$ is commutative: $(\text{g}^\text{x})^\text{y} \mod \text{p} = \text{g}^{\text{x}\text{y}} \mod \text{p} = (\text{g}^\text{y})^\text{x} \mod \text{p}$.
- Order of *g*: Smallest number *n*, so that $\text{g}^\text{n} = 1$.

#### Algorithm

1. *A* and *B* agree on primes *p*, *q* and a generator *g* of order *q*.
2. *A* picks random number *x*, sends $\text{X} = \text{g}^\text{x} \mod \text{p}$ to *B*.
3. *B* picks random number *y*, sends $\text{Y} = \text{g}^\text{y} \mod \text{p}$ to *A*.
4. *B* computes shared secret $\text{X}^\text{y} = \text{g}^{\text{x}\text{y}} \mod \text{p}$.
5. *A* comoutes shared secret $\text{Y}^\text{x} = \text{g}^{\text{y}\text{x}} \mod \text{p}$.
6. *A* and *B* now have the same shared secret, since $\text{X}^\text{y} = (\text{g}^\text{x})^\text{y} \mod \text{p} = \text{g}^{\text{x}\text{y}} \mod \text{p} = (\text{g}^\text{y})^\text{x} \mod \text{p} = \text{Y}^\text{x}$.

#### Remarks

Diffie-Hellman does not provide authentication. When establishing a connection between *A* and *B*, an attacker *C* could intercept the connection during
key agreement and distribute his own keys to *A* and *B* respectively.  
In TLS, the Diffie-Hellman values coming from Server to Client are signed with the servers public key to prevent this.

### HTTP sessions

HTTP sessions per se do no end with TCP connections.  
To check for Man-In-The-Middle attacks, TCP session state is included in the HTTP digest.
Then the server also checks against his session state. If they are not equal, there is a MitM.

## Certificates

*Certificates* are used for authentication, mostly by the server in TLS sessions.  
They contain the public key of the owner amongst other information (issue date, expiration date etc.).
By default, public keys are not bound to real world entities. To create this binding, some kind of authority of trust has to propagate it.
To do so, *Certificates* contain a signature of the *Certificate Authority* (CA). *Certificate Authorities* also have their own certificates,
signed by an above *Certificate Authority*. This forms a hierarchy of authorization, with *Root Certificate Authorities* on top.
*Certificates* for those are pre-installed on the machines.

### X.509

*X.500* is a global repository of named entities (people, computers etc.).  
*X.509* is a certificate format binding public keys to those database entries. This is the de facto industry standard.

### Public Key Infrastructure (PKI)

Infrastructure providing the service of public key distribution.

#### Certificate Classes

- Domain Validated SSL Certificate (DV cert)  
  Cheap. Verify owner of domain via *whois* and DNS records. There are reserved email addresses (e.g. postmaster@example.com),
  which can be used aswell. This is a potential security issue! MX delegation to anonymous email services has to be carefully minded!
- Organization Validation SSL Certificate (OV cert)  
  Medium. CA checks same as *DV* + company identity checked by third parties.
- Extended Validation SSL Certificate (EV cert)  
  Expensive. CA checks same as *OV* + official record matching.

## Crypto Analysis

### Side Channel Analysis

*Side Channel Analysis* focuses on analyzing unintended *side-channels*, which might not have been respected enough in security design.  
*Side-channels* are all communication channels not used for main communication.
In many embedded devices, IO pins are the main channel of communication. Any other channel is a *side-channel*.  
The attack focus moves from the logical layer (algorithms) to the physical layer (time, power etc.).

### Simple Power Analysis (SPA)

Every instruction/data consumes a specific amount of power. This power is measurable in execution time.
So measuring current power consumption allows the attacker to conclude to the currently executed instruction / accessed data!

#### Defences

- Execute leaking instructions on full time anyways, even if they are not needed.
- Make executions input-independent

### Differential Power Analysis (DPA)

Similar to *SPA*, but *DPA* focuses on the change of values. Precisely, is analyzes the effects correlated to the change of values.  
Data depends on inputs. Now the attacker tries different inputs and observes data changes by constantly measuring the power.

### Timing Analysis

Runtime of algorithms might depend on input data. If the data can be split into chunks, currect chunks might take a different time
than incorrect ones. Also, the time it takes might depend on the number of correct chunks, (e.g. n correct chunks and k incorrect chunks
may take a different time to process than n-1 correct and k+1 incorrect ones).  

#### Example

The algorithm that checks a password checks the input char by char, from left to right. *It stops checking the rest of the characters, if one character is wrong.*
The password is `IntroSec`.

1. The attacker guesses `testtest`. The algorithm takes 100 ns to process.
2. The attacker guesses `Iesttest`. The algorithm takes 110 ns to process. Now the attacker learned that the first character is likely to be correct.
3. The attacker guesses `Itsttest`. The alogrithm takes 110 ns to process. The attacker detects no significant change in processing time,
   so it is likely that there is no new correct char.
4. The attacker continues, learning char by char for each processing time increase, until he got the full password.

## Software Security

### Buffers

Logically, values are assigned to variables.
In reality, for every variable, a portion of memory on the call stack gets allocated (*buffer*), and the data gets written to that buffer.
Data written to the buffer is written upwards from the address allocated to the buffer.

## Threat scenarios

No security issues without threat models! E.g. a password is considered safe without any provided threat model.

### Passive Attack

Attacker only reads data from a communication channel.

### Active Attack

Attacker inserts, alters or deletes data on a communication channel.

### Smurf attack

Attacker sends out ICMP ping request with spoofed sender IP address of the victim to the broadcast of some network.
All recipients will answer the ping, and send the answer packets to the IP address they think was the sender, which is the victims IP address.
In a network with 100 nodes, a single broadcast ICMP request results in 100 answers sent to the victim, causing a denial of service.

### TCP session hijacking

Attacker pretends to be a trusted host.

1. Attacker sends *SYN* packet with spoofed address of the trusted host to the server.
2. Server replies with *SYN ACK*, with some sequence number *y*, but sends the response to the trusted host he thinks sent the *SYN*.
3. Attacker now has to guess the sequence number *y* in order to reply with an *ACK* and sequence number *y + 1*.
4. If guessed correctly, the attacker can send TCP packets as trusted host.

### Password compromise

Old threat model: One machine, one password. One compromised password means one compromised machine.
New threat model: Multiple machines, one or similar passwords. One compromised machine can cause other compromised passwords.

### Password spoofing attack

Attacker presents a fake login screen to the victim.
Victim enters his password and the attacker captures the data forwarded by the fake login screen.

#### Countermeasures

- System authentication to the user
- Display number of failed logins  
  Indicates compromised password to the user.

### Cookie poisoning

1. Session begins, server issues session ID (SID), which gets stored as a client cookie.
2. A client changes his own SID cookie according to a SID of another client, therefore hijacking the session and being able to do everything the other client can do.  
   He can do this by  
2.1 Brute-force.  
2.2 Having access to the cookie and simply reading it out (security failure).

#### Countermeasures

- Picking random, hard to guess SIDs
- Storing cookies in a safe place
- Include a MAC with the SID to check for integrity

### Buffer Overflow

If the size of the data to be written to the buffer is larger than the size of the memory allocated to that buffer,
the upper buffer limit gets exceeded. If additionally those limits are not checked, memory not assigned to the buffer
gets overwritten.

1. Attacker knows that there is a buffer with 80 bytes allocated.
2. Attacker invokes the function writing to the buffer, with shellcode beginning from byte 80 in the data array.
3. This shellcode overwrites the return address of the current stack function in that program.
4. As soon as the program exits its function, it executes the attackers shellcode *with this programs privileges*.

#### Examples of vulnerable functions

##### strcpy

```C
char *strcpy(char *destination, const char *source );
```

`strcpy(...)` works with `NULL` terminated strings as `destination` and `source`.
If `destination` is not `NULL` terminated, the behaviour of this function is undefined and might cause a buffer overflow
and overrides in wrong memory.

#### Defences

- Check size of data input and compare to actual allocated buffer size.
- Use functions requiring explicit size definition.
- Check for integer overflows.
- Protect return address using *canaries*.
  Special instructions placed before actually returning to check for correct return address.
- Mark potential shellcode memory as non-executable.