Most of my work belongs to the companies I have worked for, but this repository is a recent personal project that I am actually proud of: portable-geli. GELI (see geli(8)) is FreeBSD's block-device encryption framework. It consists of a kernel side, which performs the actual encryption and decryption of blocks of data, and a user-space side, which lets the user control various aspects of the cryptography, such as defining keys and choosing encryption algorithms.
The primary goal of this project is to make it possible to attach GELI-encrypted block devices on other operating systems such as GNU/Linux while staying 100% compatible with the FreeBSD implementation. Currently only GNU/Linux is supported, but support for other operating systems such as OpenBSD (my daily operating system) is under serious consideration. I am also planning to add GELI support to the Linux kernel, but that work has not started yet, and I'm not sure whether it would be acceptable to the upstream maintainers.
Before explaining the internals of the project, I should mention that after completing it I rewrote almost the entire project, replacing most of my code with FreeBSD's actual code (kernel and user space), for two reasons. The main reason was to avoid mistakes that leave cryptographic vulnerabilities, which are very hard to find and debug. The other was that this decision will (hopefully) help me improve the code faster and may lead to some contributions to FreeBSD's code as well. So most of the descriptions below, except the NBD part, match the FreeBSD implementation. I hope this helps others interested in cryptography as well as FreeBSD enthusiasts.
GELI uses the last sector of the block device to store its metadata (struct eli_metadata), which describes the properties of the encrypted device. (This is the opposite of LUKS, which uses the first sectors.) Although the metadata structure is fixed in shape, the stored values and their interpretation depend on the version of geli (md_version) that created the device. There are 7 different versions; portable-geli currently supports the latest one. Some of the most notable fields in this metadata are the encryption algorithm (md_ealgo), the authentication algorithm (md_aalgo), the random salt (md_salt) and finally the encryption master keys (md_mkeys).
struct eli_metadata {
char md_magic[16];
uint32_t md_version;
uint32_t md_flags;
uint16_t md_ealgo;
uint16_t md_keylen;
uint16_t md_aalgo;
uint64_t md_provsize;
uint32_t md_sectorsize;
uint8_t md_keys;
int32_t md_iterations;
uint8_t md_salt[ELI_SALTLEN];
uint8_t md_mkeys[ELI_MAXMKEYS * ELI_MKEYLEN];
uint8_t md_hash[MD5_DIGEST_LENGTH];
} __attribute__((packed));
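To make the on-disk layout concrete, here is a minimal sketch (in Python rather than C, to keep it short) that unpacks a buffer laid out like the packed struct above. The numeric constants (ELI_SALTLEN = 64, two slots of 192 bytes, a 16-byte MD5 digest), the little-endian byte order, and the idea that md_hash covers every byte before it are assumptions of this sketch, and parse_metadata and metadata_hash_ok are hypothetical helper names.

```python
import hashlib
import struct

# Assumed values; the text above does not state them numerically.
ELI_SALTLEN = 64
ELI_MAXMKEYS = 2
ELI_MKEYLEN = 192
MD5_DIGEST_LENGTH = 16

# "<" means little-endian with no padding, mirroring the packed struct.
METADATA_FORMAT = (
    "<16s"  # md_magic
    "I"     # md_version
    "I"     # md_flags
    "H"     # md_ealgo
    "H"     # md_keylen
    "H"     # md_aalgo
    "Q"     # md_provsize
    "I"     # md_sectorsize
    "B"     # md_keys
    "i"     # md_iterations
    f"{ELI_SALTLEN}s"                  # md_salt
    f"{ELI_MAXMKEYS * ELI_MKEYLEN}s"   # md_mkeys
    f"{MD5_DIGEST_LENGTH}s"            # md_hash
)

FIELD_NAMES = (
    "md_magic", "md_version", "md_flags", "md_ealgo", "md_keylen",
    "md_aalgo", "md_provsize", "md_sectorsize", "md_keys",
    "md_iterations", "md_salt", "md_mkeys", "md_hash",
)


def parse_metadata(sector: bytes) -> dict:
    """Unpack the leading bytes of the metadata sector into a dict."""
    values = struct.unpack_from(METADATA_FORMAT, sector, 0)
    return dict(zip(FIELD_NAMES, values))


def metadata_hash_ok(sector: bytes) -> bool:
    """Assumption: md_hash is the MD5 of all metadata bytes before it."""
    size = struct.calcsize(METADATA_FORMAT)
    body = sector[:size - MD5_DIGEST_LENGTH]
    stored = sector[size - MD5_DIGEST_LENGTH:size]
    return hashlib.md5(body).digest() == stored
```

With these assumed sizes the structure occupies 511 bytes, so it fits in a single 512-byte sector.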
The master-keys field (md_mkeys) can hold up to two master keys. Each master-key slot holds an initialization vector (IV) field and a DATA field, both filled with random data, and a HASH field, which is calculated using the user's passphrase.
The DATA field is the actual key used to encrypt the disk.
   64     64     64     64     64     64
┌──────┬──────┬──────┬──────┬──────┬──────┐
│  IV  │ DATA │ HASH │  IV  │ DATA │ HASH │
└──────┴──────┴──────┴──────┴──────┴──────┘
   First key's slot     Second key's slot
     (192 bytes)           (192 bytes)
The above diagram shows the unencrypted layout of the master keys. The actual master keys field (md_mkeys) is always encrypted on disk using a symmetric-key algorithm such as AES, and the components of a master key can be retrieved only with the correct key. This field is encrypted using the user's passphrase, but the passphrase is not used directly as the encryption key. Instead, a derived key is first calculated with a key derivation function (KDF), which produces a much stronger key of the desired length (512 bits) than the passphrase itself. GELI uses HMAC with the SHA-512 hash function as its key derivation function, and also applies the PBKDF2 algorithm to add computational cost to brute-force attacks.
# pseudocode:
derived-key=HMAC(k,m) = H((k'⊕opad) || H((k'⊕ipad) || m))
H=SHA-512
k is empty, so k' is a zero-filled buffer of one hash block (128 bytes for SHA-512)
ipad=the byte 0x36 repeated to the length of k'
opad=the byte 0x5c repeated to the length of k'
m=PKCS5v2(salted-passphrase, iteration) or salted-passphrase if iteration is 0
The || symbol means string concatenation
The ⊕ symbol means XOR
[Figure: call graph]
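The derivation can be sketched with Python's standard library. This is a sketch under assumptions, not the project's code: it assumes PKCS5v2 means PBKDF2 with HMAC-SHA-512 producing 64 bytes, and that the "salted passphrase" is the salt followed by the passphrase. The names derive_user_key and hmac_by_hand are hypothetical. The HMAC key is a zero-filled buffer of one SHA-512 block (128 bytes), and hmac_by_hand spells out the ipad/opad construction (0x36 and 0x5c) so the two can be compared.

```python
import hashlib
import hmac

SHA512_BLOCK = 128  # SHA-512 block size in bytes
USERKEYLEN = 64     # 512-bit derived key


def derive_user_key(passphrase: bytes, salt: bytes, iterations: int) -> bytes:
    """Derive the 512-bit key from the passphrase (sketch, see assumptions)."""
    if iterations == 0:
        # Assumption: "salted passphrase" is salt || passphrase.
        message = salt + passphrase
    else:
        # Assumption: PKCS5v2 is PBKDF2-HMAC-SHA-512 with a 64-byte output.
        message = hashlib.pbkdf2_hmac(
            "sha512", passphrase, salt, iterations, dklen=USERKEYLEN
        )
    zero_key = bytes(SHA512_BLOCK)  # k' = zero-filled block
    return hmac.new(zero_key, message, hashlib.sha512).digest()


def hmac_by_hand(key_block: bytes, message: bytes) -> bytes:
    """H((k' xor opad) || H((k' xor ipad) || m)) for a full-block key."""
    inner_pad = bytes(b ^ 0x36 for b in key_block)
    outer_pad = bytes(b ^ 0x5C for b in key_block)
    inner = hashlib.sha512(inner_pad + message).digest()
    return hashlib.sha512(outer_pad + inner).digest()
```

Raising the iteration count only changes the cost of computing the message, which is what slows down a brute-force search over passphrases.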
The derived-key is then used to calculate the HMAC of the IV and DATA fields together and store it into the HASH field of the master-key.
┌───────────────────┬─────────────────────┬─────────────────────────┐
| IV-DATA | HASH |
└───────────────────┴─────────────────────┴─────────────────────────┘
Unencrypted Master Key slot (192 bytes)
# pseudocode:
HASH=HMAC(hmac-key,IV-DATA)
hmac-key=HMAC(derived-key, '\x00')
[Figure: call graph]
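As a small illustration of this step, the sketch below builds an unencrypted 192-byte slot from a derived key and the two random fields. It assumes HMAC here means HMAC-SHA-512 (so the HASH field is 64 bytes); seal_slot and hmac_sha512 are hypothetical helper names.

```python
import hashlib
import hmac

FIELD_LEN = 64  # IV, DATA and HASH are 64 bytes each


def hmac_sha512(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha512).digest()


def seal_slot(derived_key: bytes, iv: bytes, data: bytes) -> bytes:
    """Return IV || DATA || HASH, the unencrypted master-key slot."""
    if len(iv) != FIELD_LEN or len(data) != FIELD_LEN:
        raise ValueError("IV and DATA must be 64 bytes each")
    # hmac-key = HMAC(derived-key, '\x00')
    hmac_key = hmac_sha512(derived_key, b"\x00")
    # HASH = HMAC(hmac-key, IV-DATA)
    return iv + data + hmac_sha512(hmac_key, iv + data)
```

In real use the IV and DATA fields would come from a cryptographic random source such as os.urandom(64).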
Then the whole master-key slot is encrypted using the AES-CBC algorithm.
┌───────────────────┬─────────────────────┬─────────────────────────┐
| ENCRYPTED-MASTER-KEY |
└───────────────────┴─────────────────────┴─────────────────────────┘
Encrypted Master Key slot (192 bytes)
# pseudocode:
ENCRYPTED-MASTER-KEY=AES-CBC-ENCRYPT(encryption-key, IV-DATA-HASH)
encryption-key=HMAC(derived-key, '\x01')
[Figure: call graph]
Now the master-key slot is fully encrypted indirectly using the user's passphrase. Remember that the only way to access the master-key’s components is to decrypt the master-key slot using the correct passphrase.
┌───────────────────┬───────────────────┬───────────────────┐
| ENCRYPTED-MASTER-KEY |
└───────────────────┴───────────────────┴───────────────────┘
Encrypted Master Key slot (192 bytes)
┌───────────────────┬───────────────────┬───────────────────┐
| IV DATA | HASH |
└───────────────────┴───────────────────┴───────────────────┘
Decrypted Master Key slot (192 bytes)
# pseudocode:
decrypted-master-key=AES-CBC-DECRYPT(encryption-key, encrypted-master-key)
encryption-key=HMAC(derived-key, '\x01')
[Figure: call graph]
After decrypting the master-key slot, the HASH field of the decrypted slot should be equal to the hash recalculated from the slot's IV and DATA sections using the passphrase. If they are equal, the given passphrase is the correct one. For any given passphrase, this process is tried on both master-key slots before giving up.
┌───────────────────┬───────────────────┬───────────────────┐
| IV DATA | HASH |
└───────────────────┴───────────────────┴───────────────────┘
Decrypted Master Key slot (192 bytes)
┌───────────────────┐
| RECALCULATED-HASH |
└───────────────────┘
# pseudocode:
RECALCULATED-HASH=HMAC(hkey, IV-DATA)
hkey=HMAC(derived-key, '\x00')
[Figure: call graph]
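The check over both slots can be sketched as follows. The sketch assumes the slots passed in have already been decrypted with the candidate key (the AES step is left out because it needs a third-party library) and that HMAC means HMAC-SHA-512. find_valid_slot is a hypothetical name, and the constant-time comparison is a choice of this sketch.

```python
import hashlib
import hmac

SLOT_LEN = 192     # IV (64) + DATA (64) + HASH (64)
IV_DATA_LEN = 128  # the part covered by HASH


def hmac_sha512(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha512).digest()


def find_valid_slot(derived_key: bytes, decrypted_slots: list):
    """Return (index, IV-DATA) of the first slot whose HASH verifies, else None."""
    # hkey = HMAC(derived-key, '\x00')
    hkey = hmac_sha512(derived_key, b"\x00")
    for index, slot in enumerate(decrypted_slots):
        iv_data = slot[:IV_DATA_LEN]
        stored_hash = slot[IV_DATA_LEN:SLOT_LEN]
        recalculated = hmac_sha512(hkey, iv_data)
        # compare_digest avoids leaking where the first mismatch occurs.
        if hmac.compare_digest(stored_hash, recalculated):
            return index, iv_data
    return None
```

A None result means the passphrase did not open either slot.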
If the passphrase is verified, this metadata is then processed and stored in a more efficient structure in memory at run-time (struct eli_softc). Only the IV and DATA sections of the unencrypted master key components are needed and will be stored in memory until the encrypted device is finally detached.
struct eli_softc {
u_int sc_version;
u_int sc_crypto;
uint8_t sc_mkey[ELI_DATAIVKEYLEN];
uint8_t sc_ekey[ELI_DATAKEYLEN];
TAILQ_HEAD(, eli_key) sc_ekeys_queue;
uint64_t sc_ekeys_total;
uint64_t sc_ekeys_allocated;
u_int sc_ealgo;
u_int sc_ekeylen;
uint8_t sc_akey[ELI_AUTHKEYLEN];
u_int sc_aalgo;
u_int sc_akeylen;
u_int sc_alen;
SHA256_CTX sc_akeyctx;
uint8_t sc_ivkey[ELI_IVKEYLEN];
SHA256_CTX sc_ivctx;
int sc_nkey;
uint32_t sc_flags;
int sc_inflight;
off_t sc_mediasize;
size_t sc_sectorsize;
u_int sc_bytes_per_sector;
u_int sc_data_per_sector;
} eli_sc;
The decrypted block device is at least one sector smaller than the encrypted block device, since the metadata sector is not exposed. Unless it is configured to use only a single key, GELI derives a different key for every 2^20 sectors, starting from key number zero, and uses a unique initialization vector (IV) for each sector, equal to its sector offset. Deriving the key again every time a block is requested would be wasteful, so a number of keys can be calculated beforehand and inserted into a sorted list of calculated keys. The process of decrypting and encrypting blocks of data is as follows:
To read the decrypted sector m1, the content of sector b1 on the encrypted device is read and decrypted with the following algorithm. To write sector m1, its content is encrypted and replaces sector b1 on disk.
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬─────────┐
| b0 b1 b2 b3 b4 b5 b6 b7 bi metadata
└────┴─┬──┴────┴────┴────┴────┴────┴────┴────┴────┴─────────┘
Encrypted block device (on physical disk)
│
┌────┬─▼──┬────┬────┬────┬────┬────┬────┬────┬────┐
| m0 m1 m2 m3 m4 m5 m6 m7 mi
└────┴─┬──┴────┴────┴────┴────┴────┴────┴────┴────┘
| Decrypted block device (virtual block device)
▼
# pseudocode:
keyno=i/2^20 (integer division, i is the sector number)
hmac-data={'ekey', keyno}
encryption-key is IV-DATA if md_flags has the ENC_IVKEY flag set, otherwise DATA.
data-encryption-key=HMAC(encryption-key, hmac-data)
ivi=i || padded-zero
mi=AES-XTS-DECRYPT(data-encryption-key, ivi, bi)
bi=AES-XTS-ENCRYPT(data-encryption-key, ivi, mi)
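The key and IV selection for a sector can be sketched like this, leaving out the AES-XTS call itself since it needs a third-party library. Several details are assumptions of the sketch: keyno is encoded as an 8-byte little-endian integer after the 'ekey' tag, the IV is the sector number in little-endian padded with zeros to 16 bytes, and HMAC means HMAC-SHA-512. The sketch returns the full 64-byte HMAC output and does not trim it to the cipher's key length. It also caches keys in a plain dict rather than the sorted list described above. SectorKeys and sector_iv are hypothetical names.

```python
import hashlib
import hmac
import struct

KEY_SHIFT = 20  # one key per 2^20 sectors


class SectorKeys:
    """Derive and cache per-range data-encryption keys (sketch)."""

    def __init__(self, encryption_key: bytes):
        # encryption_key is IV-DATA or DATA, depending on md_flags.
        self._encryption_key = encryption_key
        self._cache = {}

    def key_for_sector(self, sector: int) -> bytes:
        keyno = sector >> KEY_SHIFT  # same as sector // 2**20
        key = self._cache.get(keyno)
        if key is None:
            # hmac-data = {'ekey', keyno}; the encoding is an assumption.
            hmac_data = b"ekey" + struct.pack("<Q", keyno)
            key = hmac.new(self._encryption_key, hmac_data, hashlib.sha512).digest()
            self._cache[keyno] = key
        return key

    def cached_keys(self) -> int:
        return len(self._cache)


def sector_iv(sector: int, iv_len: int = 16) -> bytes:
    """iv = i || padded-zero (assumed little-endian, zero-padded)."""
    return struct.pack("<Q", sector).ljust(iv_len, b"\x00")
```

Sectors 0 through 2^20 - 1 share key number 0, so sequential I/O hits the cache almost every time.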
The final part is to provide the user with a virtual block device. With the Linux Network Block Device (NBD), requests to a virtual block device can be handled by a user-space program such as portable-geli. So when geli attaches an encrypted block device, it first verifies the user's passphrase and then creates an NBD device of the proper size (md_provsize). Requests to this virtual block device are received by the geli process, which acts on each one according to its command type, either read or write.
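To show the shape of that request loop, here is a minimal sketch of handling a single NBD request in user space. It is not how portable-geli is written. The header layouts and magic numbers follow the Linux kernel's nbd.h as I understand it (28-byte big-endian request, 16-byte reply), and the encrypt and decrypt callables, the in-memory bytearray standing in for the disk, and the name handle_request are hypothetical.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513
NBD_REPLY_MAGIC = 0x67446698
NBD_CMD_READ = 0
NBD_CMD_WRITE = 1
EINVAL = 22

# Big-endian: magic, type, handle[8], from (offset), len
REQUEST = struct.Struct(">II8sQI")
# Big-endian: magic, error, handle[8]
REPLY = struct.Struct(">II8s")


def handle_request(header: bytes, payload: bytes, disk: bytearray,
                   encrypt, decrypt) -> bytes:
    """Serve one read or write request and return the bytes to send back."""
    magic, req_type, handle, offset, length = REQUEST.unpack(header)
    if magic != NBD_REQUEST_MAGIC:
        raise ValueError("bad NBD request magic")
    command = req_type & 0xFFFF  # upper bits carry command flags
    ok = REPLY.pack(NBD_REPLY_MAGIC, 0, handle)
    if command == NBD_CMD_READ:
        ciphertext = bytes(disk[offset:offset + length])
        return ok + decrypt(offset, ciphertext)
    if command == NBD_CMD_WRITE:
        disk[offset:offset + length] = encrypt(offset, payload[:length])
        return ok
    return REPLY.pack(NBD_REPLY_MAGIC, EINVAL, handle)
```

The encrypt and decrypt callables receive the byte offset so that a real implementation could pick the per-sector key and IV from it.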
I'm a software engineer interested in kernel programming, and a Senior System Developer and project manager of the router project at Zharfpouyan. I'm currently looking forward to working with bigger teams on more exciting projects. You can find me on GitHub.