The sodium R package provides bindings to libsodium: a modern, easy-to-use software library for encryption, decryption, signatures, password hashing and more.

The goal of Sodium is to provide the core operations needed to build higher-level cryptographic tools. It is not intended for implementing standardized protocols such as TLS, SSH or GPG. Sodium only supports a limited set of state-of-the-art elliptic curve methods, resulting in a simple but very powerful tool-kit for building secure applications.

                                  | Authenticated Encryption    | Encryption only                 | Authentication only
Symmetric (secret key):           |                             | data_encrypt / data_decrypt     | data_tag
Asymmetric (public+private key):  | auth_encrypt / auth_decrypt | simple_encrypt / simple_decrypt | sig_sign / sig_verify

Using Sodium

All Sodium functions operate on binary data, called ‘raw’ vectors in R. Use charToRaw and rawToChar to convert between strings and raw vectors. Alternatively, hex2bin and bin2hex convert between binary data and strings in hex notation:

test <- hash(charToRaw("test 123"))
str <- bin2hex(test)
print(str)
[1] "e8b785b02e702c0b7edc9683130db36c91e0241ba0c489ff1e20cbb4fa3920f9"
hex2bin(str)
 [1] e8 b7 85 b0 2e 70 2c 0b 7e dc 96 83 13 0d b3 6c 91 e0 24 1b a0 c4 89 ff 1e
[26] 20 cb b4 fa 39 20 f9
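
The text conversion mentioned above follows the same pattern as the hex conversion. A minimal sketch of both round trips, using only the four helpers named in this section (the variable names are illustrative):

```r
library(sodium)

# String -> raw vector -> string
bytes <- charToRaw("test 123")
text <- rawToChar(bytes)

# Raw vector -> hex string -> raw vector
hex <- bin2hex(bytes)
back <- hex2bin(hex)

stopifnot(identical(text, "test 123"))
stopifnot(identical(back, bytes))
```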

Random data generator

The random() function generates n bytes of unpredictable data, suitable for creating secret keys.

secret <- random(8)
print(secret)
[1] 6c d7 1a 09 68 9c 96 53

The implementation is platform-specific; see the docs for details.
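
As a small sketch, the calls below draw a 32-byte key and a 24-byte nonce, the sizes used by the secret key encryption example later in this document (there the key comes from hash(), which returns 32 bytes by default). The variable names are illustrative.

```r
library(sodium)

key <- random(32)    # same length as the default hash() output
nonce <- random(24)  # nonce length used with data_encrypt() later on

stopifnot(is.raw(key), length(key) == 32)
stopifnot(is.raw(nonce), length(nonce) == 24)
```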

Hash functions

Sodium has several hash functions including hash(), shorthash(), sha256(), sha512() and scrypt(). The generic hash() is usually recommended. It uses BLAKE2b with a configurable output size between 16 bytes (128 bits) and 64 bytes (512 bits).

# Generate keys from passphrase
passphrase <- charToRaw("This is super secret")
hash(passphrase)
 [1] 98 5c 9b b6 f6 92 d5 26 10 80 99 25 3e a5 a6 66 67 13 fd 88 10 b6 12 74 86
[26] c8 e9 5c 44 07 45 f5
hash(passphrase, size = 16)
 [1] eb 6c df 04 18 40 16 28 c1 b0 2e 76 f3 e6 bd 89
hash(passphrase, size = 64)
 [1] d0 89 68 30 26 1d 1b 85 76 dc ad 20 c9 58 0a fb b1 d0 62 ba 10 d6 80 f6 cb
[26] c6 ae 2d 42 57 ee a0 65 fd b0 e8 90 02 ae b3 e0 4f 88 df ba ea 26 bb 47 3f
[51] 29 5a a4 06 cd b8 05 78 83 31 66 dc 7b 24
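
A plain hash is fast to compute, which also makes a weak passphrase fast to guess. The scrypt() function listed above is deliberately expensive and takes a salt. The sketch below assumes the scrypt(buf, salt, size) signature from the package documentation, with a 32-byte salt; the salt is not secret but has to be stored so the same key can be derived again.

```r
library(sodium)

passphrase <- charToRaw("This is super secret")

# Random 32-byte salt: not secret, but keep it to re-derive the key later
salt <- random(32)

key <- scrypt(passphrase, salt = salt, size = 32)
key_again <- scrypt(passphrase, salt = salt, size = 32)

# Same passphrase and salt reproduce the key; a new salt gives a new key
stopifnot(identical(key, key_again))
stopifnot(!identical(key, scrypt(passphrase, salt = random(32), size = 32)))
```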

The shorthash() function is a special 8 byte (64 bit) hash based on SipHash-2-4. It is useful for e.g. hash tables, but because the output is so short it should not be considered collision-resistant.
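
A short sketch of the hash table use case, assuming shorthash(buf, key) takes a 16-byte key as described in the package documentation. The bucket() helper and the table size of 64 are made up for illustration.

```r
library(sodium)

key <- random(16)  # shorthash() needs a 16-byte key

# Hypothetical helper: map a string to one of n_buckets slots
bucket <- function(x, n_buckets = 64) {
  h <- shorthash(charToRaw(x), key)
  as.integer(h[1]) %% n_buckets + 1
}

h <- shorthash(charToRaw("some lookup key"), key)
stopifnot(length(h) == 8)
stopifnot(bucket("some lookup key") == bucket("some lookup key"))
```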

Secret key encryption

Symmetric encryption uses the same secret key for both encryption and decryption. It is mainly useful for encrypting local data, or as a building block for more complex methods.

Most encryption methods require a nonce: a piece of non-secret unique data that is used to randomize the cipher. This allows for safely using the same key for encrypting multiple messages. The nonce should be stored or shared along with the ciphertext.

key <- hash(charToRaw("This is a secret passphrase"))
msg <- serialize(iris, NULL)

# Encrypt with a random nonce
nonce <- random(24)
cipher <- data_encrypt(msg, key, nonce)

# Decrypt with same key and nonce
orig <- data_decrypt(cipher, key, nonce)
identical(iris, unserialize(orig))
[1] TRUE
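
One common way to keep the nonce with the ciphertext is to prepend it. This is a sketch of that convention, not something the package prescribes; the 24-byte split matches the nonce length used above.

```r
library(sodium)

key <- hash(charToRaw("This is a secret passphrase"))
msg <- charToRaw("some private data")

# Sender: encrypt, then store nonce and ciphertext as one raw vector
nonce <- random(24)
bundle <- c(nonce, data_encrypt(msg, key, nonce))

# Receiver: split the bundle again and decrypt
nonce_in <- bundle[1:24]
cipher_in <- bundle[-(1:24)]
orig <- data_decrypt(cipher_in, key, nonce_in)

stopifnot(identical(rawToChar(orig), "some private data"))
```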

Because the secret has to be known by all parties, symmetric encryption by itself is often impractical for communication with third parties. For this we need asymmetric (public key) methods.

Secret key authentication

Secret key authentication is called tagging in Sodium. A tag is basically a hash of the data together with a secret key.

key <- hash(charToRaw("This is a secret passphrase"))
msg <- serialize(iris, NULL)
mytag <- data_tag(msg, key)

To verify the integrity of the data at a later point in time, simply re-calculate the tag with the same key:

stopifnot(identical(mytag, data_tag(msg, key)))

The secret key protects against forgery of the data and tag by an intermediate party, which would be possible with a regular checksum.
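
To make this concrete, the sketch below shows that neither a modified message nor a different key reproduces the original tag. The message contents and the second passphrase are invented for the example.

```r
library(sodium)

key <- hash(charToRaw("This is a secret passphrase"))
msg <- charToRaw("amount=100")
mytag <- data_tag(msg, key)

# A modified message does not match the stored tag
forged <- charToRaw("amount=900")
forged_matches <- identical(mytag, data_tag(forged, key))

# Without the right key, the tag for the original message cannot be rebuilt
wrong_key <- hash(charToRaw("a guessed passphrase"))
wrong_key_matches <- identical(mytag, data_tag(msg, wrong_key))

stopifnot(!forged_matches, !wrong_key_matches)
```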

Public key encryption

Where symmetric methods use the same secret key for encryption and decryption, asymmetric methods use a key-pair consisting of a public key and private key. The private key is secret and only known by its owner. The public key on the other hand can be shared with anyone. Public keys are often published on the user’s website or posted in public directories or keyservers.

key <- keygen()
pub <- pubkey(key)

In public key encryption, data encrypted with a public key can only be decrypted using the corresponding private key. This allows anyone to send somebody a secure message by encrypting it with the receiver's public key. The encrypted message will only be readable by the owner of the corresponding private key.

# Encrypt message with pubkey
msg <- serialize(iris, NULL)
ciphertext <- simple_encrypt(msg, pub)

# Decrypt message with private key
out <- simple_decrypt(ciphertext, key)
stopifnot(identical(out, msg))
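
Since public keys are meant to be published, it can help to see one travel as text. The sketch below publishes the key in hex with bin2hex, rebuilds it on the sender's side with hex2bin, and checks that an unrelated private key cannot open the message. It assumes simple_decrypt signals an error on failure, which tryCatch turns into NULL here; the variable names are illustrative.

```r
library(sodium)

# Receiver: create a keypair and publish the public key as hex text
key <- keygen()
pub_hex <- bin2hex(pubkey(key))

# Sender: rebuild the public key from the published text and encrypt
pub <- hex2bin(pub_hex)
ciphertext <- simple_encrypt(charToRaw("hello"), pub)

# Receiver: decrypt with the private key
out <- simple_decrypt(ciphertext, key)
stopifnot(identical(rawToChar(out), "hello"))

# Anyone else: a different private key does not work
other_key <- keygen()
attempt <- tryCatch(simple_decrypt(ciphertext, other_key),
                    error = function(e) NULL)
stopifnot(is.null(attempt))
```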

Public key authentication (signatures)

Public key authentication works the other way around. The owner of the private key creates a ‘signature’ (an authenticated checksum) for a message, in a way that allows anyone who knows the corresponding public key to verify the integrity of the message and the identity of the sender.

Currently sodium requires a different type of key-pair for signatures (Ed25519) than for encryption (Curve25519).

# Generate signature keypair
key <- sig_keygen()
pubkey <- sig_pubkey(key)

# Create signature with private key
msg <- serialize(iris, NULL)
sig <- sig_sign(msg, key)
print(sig)
 [1] ac 8e db bd 7f 2f f7 22 c4 12 e8 37 d1 69 64 11 d1 69 d6 e7 49 77 5b fd bd
[26] a9 ef fe 3f 0d 3a ce a5 70 33 60 4f 6a 59 f5 e5 a6 09 28 ae e9 93 bf 0a d8
[51] 0d 42 7b 57 4e bd b2 2a 34 4c 40 5c 0c 02
# Verify the signature with the public key
sig_verify(msg, sig, pubkey)
[1] TRUE

Signatures are useful when the message itself is not confidential but integrity is important. A common use is in software repositories, which include an index file with checksums for all packages, signed by the repository maintainer. This allows client package managers to verify that the binaries were not manipulated by intermediate parties during the distribution process.
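
The repository scenario can be sketched in a few lines. Everything here is hypothetical: the package name, the file contents and the one-line index format are invented, and the verified() helper wraps sig_verify in tryCatch so that a failed check, however the package reports it, simply yields FALSE.

```r
library(sodium)

# Maintainer: signature keypair
key <- sig_keygen()
pubkey <- sig_pubkey(key)

# Maintainer: build a one-line index with a checksum and sign it
pkg <- charToRaw("pretend this is a package tarball")
index <- charToRaw(paste("mypkg_1.0.tar.gz", bin2hex(sha256(pkg))))
sig <- sig_sign(index, key)

# Hypothetical helper: TRUE only if the signature checks out
verified <- function(data, sig, pubkey) {
  tryCatch(isTRUE(sig_verify(data, sig, pubkey)), error = function(e) FALSE)
}

# Client: the genuine index verifies, a modified one does not
bad_pkg <- charToRaw("a manipulated tarball")
bad_index <- charToRaw(paste("mypkg_1.0.tar.gz", bin2hex(sha256(bad_pkg))))

stopifnot(verified(index, sig, pubkey))
stopifnot(!verified(bad_index, sig, pubkey))
```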

Public key authenticated encryption

Authenticated encryption implements best practices for secure messaging. It requires that both sender and receiver have a keypair and know each other’s public key. Each message gets authenticated with the key of the sender and encrypted with the key of the receiver.

# Bob's keypair:
bob_key <- keygen()
bob_pubkey <- pubkey(bob_key)

# Alice's keypair:
alice_key <- keygen()
alice_pubkey <- pubkey(alice_key)

# Bob sends encrypted message for Alice:
msg <- charToRaw("TTIP is evil")
ciphertext <- auth_encrypt(msg, bob_key, alice_pubkey)

# Alice verifies and decrypts with her key
out <- auth_decrypt(ciphertext, alice_key, bob_pubkey)
stopifnot(identical(out, msg))

# Alice sends encrypted message for Bob
msg <- charToRaw("Let's protest")
ciphertext <- auth_encrypt(msg, alice_key, bob_pubkey)

# Bob verifies and decrypts with his key
out <- auth_decrypt(ciphertext, bob_key, alice_pubkey)
stopifnot(identical(out, msg))
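
The authentication half is easiest to see when it fails. In the sketch below a third party, here called Mallory (a made-up name for this example), encrypts a message for Alice with her own key. Alice expects the message to come from Bob, so decryption with Bob's public key is rejected. The code assumes auth_decrypt signals an error on failure, which tryCatch turns into NULL.

```r
library(sodium)

bob_pubkey <- pubkey(keygen())

alice_key <- keygen()
alice_pubkey <- pubkey(alice_key)

# Mallory has her own keypair and knows Alice's public key
mallory_key <- keygen()
msg <- charToRaw("Trust me, I am Bob")
ciphertext <- auth_encrypt(msg, mallory_key, alice_pubkey)

# Alice checks the message against Bob's public key: this fails
as_bob <- tryCatch(auth_decrypt(ciphertext, alice_key, bob_pubkey),
                   error = function(e) NULL)
stopifnot(is.null(as_bob))

# Checked against Mallory's public key, the same message opens fine
as_mallory <- auth_decrypt(ciphertext, alice_key, pubkey(mallory_key))
stopifnot(identical(as_mallory, msg))
```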

Note that even though public keys are not confidential, you should not exchange them over the same insecure channel you are trying to protect. If the connection is being tampered with, the attacker could simply replace the key with another one to hijack the interaction.
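
One way to act on this advice is to compare a short fingerprint of the key over a second channel, such as a phone call or a printed card. The sketch below is only an illustration of that idea: the fingerprint() helper is hypothetical and simply hex-encodes a 16-byte hash() of the public key.

```r
library(sodium)

# Hypothetical helper: short, readable digest of a public key
fingerprint <- function(pubkey) bin2hex(hash(pubkey, size = 16))

# Bob publishes his key and reads the fingerprint out over another channel
bob_pubkey <- pubkey(keygen())
spoken <- fingerprint(bob_pubkey)

# Alice compares the key she received over the network with what Bob said
received <- bob_pubkey
stopifnot(identical(fingerprint(received), spoken))

# A key swapped in by an attacker would not match
swapped <- pubkey(keygen())
stopifnot(!identical(fingerprint(swapped), spoken))
```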