Today we are going to learn about cryptography. I keep my posts pretty conversational, so if you find that annoying, fuck off.
Got pretty into cryptography recently and wanted to understand a bit more about security IRL, so I threw together an experimental project. It started out just looking at sending encrypted messages back and forth across a server without the server ever being able to access the content traversing its system. As almost everybody knows, this is called end-to-end encryption (I’m just using as many words as I think will help everyone in the audience understand, so if it sounds juvenile for a moment, I’m not talking to you).
The key components I found necessary for this system to succeed (determined iteratively over time, and definitely not right out of the gate) were the ones that would allow us to create a large group communication system in an end-to-end encrypted fashion.
Those components are as follows:
With these 5 components, we might be able to achieve our goal and create an end-to-end encrypted system which could very well support a large group communication setting.
Let’s dive in…
On Device Security
On-device security is first and foremost when developing an end-to-end encrypted system, where the entire point is that the security work of encryption and decryption happens only on the user’s device, where the user themselves has the most control.
Our core belief is in keeping things as consistent and robust as possible. While we are open to the idea that some folks might disagree with this decision, we have opted to go with a pure Rust, WebAssembly web framework. We’ve chosen this route to keep our only 2 officially supported languages Rust and CSS, as well as to take advantage of shared libraries and the speed of Rust + WASM doing a ton of cryptographic math.
the only languages we officially support are Rust and CSS
In short, Rust and WebAssembly for performance needs… BUT WAIT there’s more (and we’re in the “Device Security” section, so duh). WebAssembly has some cool sandboxing and security guarantees which potentially open us up to some neat surprises in the future (no more on that now. if we like this idea, folks might be able to hear more someday). Anyway, it’s cool to be isolated from the host. And because we chose WebAssembly (you may have inferred this by now), we’re going wholesale on this whole PWA thing.
So on top of the WASM (memory safety, sandboxing, etc.) we’re ALSO able to run in the browser… safely. Which means, we can go on any device, anywhere, that supports some reasonably late browser.
WebAssembly is cool, you should check it out
“How can you say ‘safely’ dumbass? don’t you know that you will just summon the nerds who will want to hurt your feelings.” And I say to you, “Please send your nerds.” Honestly I want some eyes on this, but I’m pretty sure my best at this point is actually fairly solid.
So when a user sets up their account, they’ll need to set up a passcode (or biometrics if that’s an easy option from the browser, but I’ve honestly been too lazy for like weeks now to just look that up… could look up bluetooth for data transfer… anyway, back to the story). They’ll set up this passcode because it will be the only thing which allows them to decrypt the contents of their username-based database partition, which is full of only AES256 encrypted records, with the private key for the database locked behind biometrics or passcode (in the worst case scenario, I might allow for just like some made up key that’s secured by a remote passcode, but honestly that might just be a dumbass idea).
All local content is AES256 encrypted in an on-device key/value store
So, I don’t know if you have much better of an option than that, but the next phase rolls us into how, what seems like an unreasonable amount of encryption, turns into an actually very manageable amount…
Alright, so the question is, “wow, it already sounds like you’re doing an awful lot of encryption… how can your devices handle that much heat?” and the answer is [u8; 32] and [u8; 12]. The core items which are ever being encrypted or decrypted are of fixed length and are tiny. There is one exception, and that is the actual content, which needs to be decrypted very infrequently and only once during the full decryption process.
despite the nested nature of our encryption patterns, it remains “fast enough” due to the fact that we are mostly encrypting and decrypting other keys of relatively small size
So, when we say that “all the data stored on device is encrypted with AES256,” what that actually means is that there is a central, private key, held behind a passcode/biometric, and that private key is used to unlock values in a table whose values are addressed using the “plaintext” x25519 public key, which points to an x25519 private key.
x25519 private key is used for creating shared secrets with other entities, public
“Okay, fuck. what the hell are you talking about? Why have you now introduced me to x25519?”
Okay, fair enough. AES256 - as I think most know - is a symmetric encryption algorithm which is very popular (I think payment processors use it to encrypt credit card numbers and stuff). But some of you might not know about x25519 (you might. if you do, that bit? not for you). There is another kind of cryptographic algorithm out there, and that is the asymmetrical kind.
Asymmetrical encryption can be used to facilitate public/private key encryption, and it will be a foundational component of our e2ee system. But one of the things that makes it a little more complicated is that x25519 keys, and the shared secrets they produce, are fixed in length (32 bytes each). This is part of why (but not directly why) you’re seeing the blending of these two kinds (asymmetrical and symmetrical) of cryptographic approaches.
“hybrid encryption,” or the use of symmetrical and asymmetrical encryption, together, is a way of using public/private key encryption with larger “balls” of content
So, to explain how we’re using x25519, we first have to understand its purpose: we need to be able to create shared secrets with counterpart entities, so that we can encrypt content which we would like to only be decryptable by another entity which can generate the same shared secret. And the way that we can (but seems to me like “have to”) do this, while still allowing for offline communication, is by utilizing the concept of “batched forward secrecy” (which is a term I think I made up). Essentially, what one entity does is generate a brand new public/private x25519 pair, write them to the on-device datastore (the private one being encrypted, as we said, with the AES256 key which is behind the passcode/biometrics), and register the newly generated public x25519 key on the entity’s publicly viewable profile.
Anyone who the entity has allowed onto their “encryption list” (it’s really just like a follow request for a private account) can then use that public key to encrypt for them.
The way that an external entity would use your pub key “to encrypt” is as follows:
- they grab down your pub key off your profile
- they generate a new public and private key
- they generate a shared secret with your public key
- they encrypt for you
- they send up the payload with the metadata of your public key so that the message can be anonymously addressed to you, and their public key, so you can recreate the shared secret.
- then they throw away all references to both the public, and private key, as they have their own encrypted copy and don’t need to keep your details
Okay, cool. So now that you know why the public keys matter (and hopefully how they relate to the database storage and the AES256 encryption part… if you’re still lost, “AES256 hides sensitive keys, safely. x25519 facilitates shared secret creation, and the ‘public/private’ key encryption part!”), we can talk about WHAT the shared secret is actually used to encrypt. You might be like “ah, lol. they’re just gonna be rubes and encrypt the entire payload for every single person who’s basically ‘subscribed’ to you, because you’ve just invented mailing lis…” *fades out into more just annoying-ness* but no, we’re not. What we’re actually going to do is slow our roll and instead focus on our strategy for encrypting the actual content, which doesn’t require us to repeat ourselves.
And… the answer is…
I don’t know if this metaphor is the most accurate. I’m pretty sure somebody’s gonna get mad at me because it has other special significance to them, but for now I’m gonna call it an onion and I’ll change it if I hate it.
Basically, you generate some bullshit. That bullshit is actually a private key, a public key, and a nonce for ChaCha20, which is another fucking dope ass form of encryption. It is also a symmetric encryption algorithm, but as distinct from AES256’s being a block-cipher, ChaCha20 is a stream-cipher.
“Okay, what the fuck? Are you just trying to show us that you know what stuff is?”
Well, kinda. I like knowing what stuff is, and so do you. BUT, here’s the thing, AES256 doesn’t make as much sense here because (from what I understand, and I could be mistaken) it’s too slow when it has to run in software without hardware AES instructions, which is the situation inside WASM. And the payload sizes have so much more variability, which I personally think means ChaCha20, despite adding complexity, does in fact add value.
ChaCha20 isn’t just a flex, it’s a performance thing
Okay, so we’re past the part where people either respect me or they don’t, and that’s fine. So double down on how much you hate me, or keep your mind open and let’s explore more together.
So, we know we have our content, and we know it’s eligible for encryption. We know we don’t want to encrypt it more than once. So, what we do is use that beautiful private/public ChaCha20 key pair, and its so very useful little nonce (that’s our [u8; 12] btw. everybody else is 256 bits), and we encrypt the shit out of that content.
Now we’re left with some encrypted content, a private ChaCha20 key, and the public + nonce (which need to be here but can be less central). We’ve also got a bunch of other shit that you forgot about while you were mad at me or distracted with me about whatever shit I said that didn’t make sense to you: WE NEED TO DISTRIBUTE THIS ENCRYPTED CONTENT TO A BUNCH OF OTHER PEER ENTITIES. So, how we do that is we say, “oh fuck, we’ve got the 4 pieces mentioned before, and a whole long ass list of these public keys we need to pair with our brand newly generated private x25519 key to create a shared secret” and we’re like, “hell yea, we gotta use it to encrypt the private ChaCha20 key (we just used to encrypt the content) with the shared secret we generated.”
Anyway, nobody is probably following, so let me map it out:
    key_sets
      public_sender
      public_receiver
      nonce                    # <- for immediately below
      chacha20(chacha20_key)   # <- internal key is for decryption of chacha(content) below
    nonce                      # <- for immediately below
    chacha20(content)
So, what we have is a long list of key sets, one unique to every receiver. But the content must only be encrypted once. This means every receiver on this record should only increase the message size by the size of the 3 keys and nonce.
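Here’s a rough sketch of that record as Rust types, mostly to make the per-receiver cost concrete. The struct and field names are my guesses at the layout mapped out above, not the real schema, and it assumes plain ChaCha20 with no authentication tag, so a wrapped 32-byte key stays 32 bytes (an AEAD like ChaCha20-Poly1305 would add 16 bytes per wrapped key).

```rust
pub type Key32 = [u8; 32];
pub type Nonce12 = [u8; 12];

/// One entry per receiver (hypothetical field names).
pub struct KeySet {
    pub public_sender: Key32,
    pub public_receiver: Key32,
    pub nonce: Nonce12,             // nonce used to wrap the content key
    pub wrapped_content_key: Key32, // chacha20(chacha20_key)
}

/// The whole message: many key sets, one copy of the encrypted content.
pub struct Record {
    pub key_sets: Vec<KeySet>,
    pub nonce: Nonce12,             // nonce used for the content itself
    pub encrypted_content: Vec<u8>, // chacha20(content)
}

/// 3 keys and a nonce: 32 + 32 + 32 + 12 bytes per receiver.
pub const KEY_SET_BYTES: usize = 32 + 32 + 32 + 12;

impl Record {
    /// Raw payload size, ignoring any serialization overhead.
    pub fn payload_bytes(&self) -> usize {
        self.key_sets.len() * KEY_SET_BYTES + 12 + self.encrypted_content.len()
    }

    /// Find the one key set addressed to a given receiver public key.
    pub fn key_set_for(&self, receiver: &Key32) -> Option<&KeySet> {
        self.key_sets.iter().find(|ks| &ks.public_receiver == receiver)
    }
}
```

Under those assumptions each extra receiver costs 108 bytes, no matter how big the content is.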
So, you’re still thinking, “goddamn, this is really a lot of encryption, even if it’s just on small numbers.” And, you’re right, but candidly, (as we’re about to find out) I couldn’t really find a much better way of handling some kind of useful forward secrecy. And I think we’ve found the right balance.
So I am not a cybersecurity expert; as I have said, this is just an experiment and I am “doing my best.” But I think this is close enough to what we would want to see from forward secrecy as a general principle, and maybe holds space for some performance gains which, maybe, negatively impact the security profile of one single message.
The problem of “close enough” on forward secrecy is that I’m not sure of a great way for it to fully qualify as “forward secret” at any one time without both communicating entities being online at the exact same time.
So instead, what I think we can do is, like DNS or a routing table, have our entities offer up their current public x25519 key, and for the time range that that particular public key is active, all content is encrypted using it. Public keys will be cycled on a cadence, but a record of them will be maintained locally, as each one lives in the “key table” as an address and is necessary to reference the location of the locally stored, corresponding x25519 private keys (which help you create shared secrets and decrypt the old messages which are either stored on the device or stored in the remote service).
“batched forward secrecy” is achieved by frequent enough key cycling
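A minimal sketch of what that local “key table” might look like, assuming an in-memory map keyed by the public key. `KeyTable`, `KeyBatch`, and `rotate` are names I made up for this, the encrypted private key is treated as an opaque blob (the AES256 part is out of scope here), and timestamps are plain integers handed in by the caller.

```rust
use std::collections::HashMap;

pub type Key32 = [u8; 32];

/// One "batch": a key pair plus the window it was active for.
pub struct KeyBatch {
    pub encrypted_private: Vec<u8>, // AES256-encrypted x25519 private key, opaque here
    pub active_from: u64,
    pub active_until: Option<u64>,  // None while this is the current key
}

/// Local table addressed by the plaintext public key.
#[derive(Default)]
pub struct KeyTable {
    batches: HashMap<Key32, KeyBatch>,
    current: Option<Key32>,
}

impl KeyTable {
    /// Start a new batch: close the old key's window, keep it for old messages.
    pub fn rotate(&mut self, public: Key32, encrypted_private: Vec<u8>, now: u64) {
        if let Some(old) = self.current.take() {
            if let Some(batch) = self.batches.get_mut(&old) {
                batch.active_until = Some(now);
            }
        }
        self.batches.insert(
            public,
            KeyBatch { encrypted_private, active_from: now, active_until: None },
        );
        self.current = Some(public);
    }

    /// The key to publish on the profile right now.
    pub fn current(&self) -> Option<&Key32> {
        self.current.as_ref()
    }

    /// Look up an old (or current) batch by the public key on a message.
    pub fn lookup(&self, public: &Key32) -> Option<&KeyBatch> {
        self.batches.get(public)
    }
}
```

Old batches never get deleted in this sketch, which matches the idea that you still need them to decrypt old messages. Actually expiring them would be the step that buys real forward secrecy, at the cost of losing that history.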
Users will obviously have to maintain a copy of their keys, and exports should be really easy or people will just be fine losing data, or maybe we can sync keys across bluetooth, or maybe YubiKeys are sick, or maybe password managers alone are sufficient but whatever. Experiment.
we gotta figure out how to make password managers cooler
So what entities end up doing is using one public key for a period of time (why I referred to it as “batched” earlier). The user stores the public key and the time duration for which it was active, so that they can (as close to anonymously as I could fuckin’ figure out) query for their messages with that public key, and decrypt them later with the private key (and yes, it’s still fucking encrypted in the db) that is the companion to the sender’s public key, and which is indexed by the aforementioned “public key” or “receivers” key. Then when you get em back, you decrypt them by getting all the shit you need and unwrapping key, after key, until finally you realize “this is a lot, but I can’t really tell if it’s unreasonable”
Alright, we talked about this for a second up there, and it’s about to get even more fucking important when we introduce the idea of “pods” which we have only, maybe, brushed up against at this point.
In the end-to-end encrypted system we’re creating for ourselves, the bounds of encryption are not “between peers” but “through pods.” All messages sent are sent to “a pod.” All “one to one” user engagement is only something simulated by a client, or enforced by the restrictive size of 2 for their shared pod.
Anonymity is a wonderful goal, and it is one that we are always doing our best to satisfy, but is honestly one of the harder parts of this problem to solve (if you have thoughts, we’re open).
it is not knowable by our system, which pods you are associated to
In order to prevent the user from having to display their residency in pods, or identify themselves by any measure other than their “batch time” public key, which only they should hold a record of, locally, we include only information about “pod slots.” We don’t want to know which pods you’re in, and we don’t want anyone else to either (unless of course you share with them in your encrypted content). Currently the thing we are trying to sort out is whether it’s viable to hide the relationship between customer ids in a payment processor. I’m sure it’s possible and want to do it (assuming there isn’t some horrible gotcha that I haven’t thought about yet).
the only way to track down whether a message has been associated to you (as a recipient or sender) is with a public key tied to that particular message. That key is only valid for a “batch” of messages, and does not give you the ability to decrypt any content within the message itself.
Anyway, it’s impossible for us to tell what groups you’re actually in because it’s encrypted, inside your “ball” (we call the encrypted shit that no one can look at “the ball” or “their ball”). So what happens (and this will transition us into the “verification” step) is that when you go to initiate a session, you pull down your “ball”, you decrypt it, and you see your own list of pods that you’ve selected to be in (and have been “accepted” into). Your public profile displays your number of slots, so it does all the work to make sure that you’re never exceeding your limit or cheating the system, and (this is where signing comes in in a min) every message sent will be verifiable and signed by every user.
Okay, so everything is encrypted. Everyone is in very secret sneaky rooms, called “pods” and they’re sending each other encrypted “balls” of content, and everything is anonymous so no one knows what the fuck anything is.
Okay, so it’s not all chaos; after all, idk honestly… just like it when people say “after all” and there’s a conclusion that sums it up nicely.
Anyway, not chaos, but is anonymous and must remain that way, but we also need people to get their shit when they ask for it.
So, before we can explain verification, we have to explain how the data is stored:
The data is stored in a giant key/value store, where every key is just a big, long, metadata-y string, which does the work of allowing us super efficient query-time performance via range queries, and yea, basically just range queries over timestamps. This shit is super simple. We have a key structure that looks like POD#<id>#<timestamp>#<uuid>, and each key points at a message. So when you’re looking for the message history of your given pod, you say “hey, start at this timestamp, and go backwards until I have enough shit to be useful” (i.e. get begins_with='POD#<id>#<now>' direction='reverse'), and then, on the backend server, it’s ripping through the key_sets (described above) and grabbing just the keyset that is necessary for decrypting the payload locally (boo over-fetching).
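To make the key scheme concrete, here’s a small sketch that fakes the key/value store with a `BTreeMap` from the Rust standard library. The function names are made up, and zero-padding the timestamp to 20 digits is an assumption of mine so that sorting keys as strings also sorts them by time. The real store and its query API will differ.

```rust
use std::collections::BTreeMap;

/// Build a key shaped like `POD#<id>#<timestamp>#<uuid>`.
/// Assumption: the timestamp is zero-padded so lexicographic order == time order.
pub fn message_key(pod_id: &str, timestamp_ms: u64, uuid: &str) -> String {
    format!("POD#{}#{:020}#{}", pod_id, timestamp_ms, uuid)
}

/// Walk backwards from `now_ms` through one pod's keys, newest first,
/// stopping after `limit` entries.
pub fn recent_messages<'a>(
    store: &'a BTreeMap<String, Vec<u8>>,
    pod_id: &str,
    now_ms: u64,
    limit: usize,
) -> Vec<&'a str> {
    // Every key for this pod starts with this prefix.
    let lower = format!("POD#{}#", pod_id);
    // '~' sorts after hex digits and '-', so messages stamped exactly at
    // `now_ms` are still inside the range.
    let upper = format!("POD#{}#{:020}#~", pod_id, now_ms);
    store
        .range(lower..=upper)
        .rev()
        .take(limit)
        .map(|(key, _)| key.as_str())
        .collect()
}
```

Calling `recent_messages(&store, "a", now, 50)` is the in-memory equivalent of the reverse `begins_with` query described above. Without the padding, a timestamp like 1000 would sort before 999 and the range would quietly return the wrong messages.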
any payload’s content, sent with a JWT (everything other than signup and signin) will be signed with the private key, tied to the public signature on the JWT
Okay, so that’s how data is stored, and I honestly don’t know why it was a prerequisite to this: when you sign in, you get a JWT and it has your public signature on it. Every time you send up content, it is signed with the private signing key that corresponds to the public signature, so that we ensure all tokens are being used only with content which is signed by the same private signing key as the user who logged in with their username, password, and (hopefully) an OTP. Signing is done with
content verification is done with signatures and utilizes the
To sign in, the only credentials you need are:
- OTP (like with 1Password or an Authenticator app)
- private signature / signing key (this is the only “extra” thing, because the “private key” for content, is cycled constantly)
I think over time people will get more used to a lot of login credentials as we all got pretty used to OTPs via SMS and email. Or who knows.
you can verify the authenticity of the message poster by checking their signature on a decrypted payload
In addition to the encrypted content being signed beforehand and verified by the backend application layer, it is stored in the database “unsigned”, so as to protect the anonymity of the message sender. BUT the content internal to the encrypted “ball” has also been signed, for the message receiver to verify when they receive it.
I’m not really sure what the conclusion is supposed to be. I threw a bunch of stuff out there. At the end of the day, the 3 (but really 2) logical components of the system are “entities” and “events”, with a differentiation between ‘users’ and ‘pods’ as separate entities.
All content is encrypted for any user “subscribed” to the pod’s content. The mod admission and removal hasn’t been detailed yet, but can be an evolving component of the system. On the whole, we are essentially describing an end-to-end encrypted mailing list, where instead of being given an email that you have to index and make sure you send stuff to, you have a public, temporary key, which you use to encrypt and address the message with.
I feel strongly that with the primitives described (User, Pod and Message), and with the encryption scheme (is it a “protocol”?) that I describe, we could create any meaningful app, with maximum security, shared login, everywhere, on every device.
Anyway, cool if not tho.
Massive group chats that require people to generate a shared secret, then ChaCha20 encrypt the other (content) ChaCha20 private key with that shared secret, for every single user, might be really annoying. BUT I actually think most users won’t mind, given that the time it takes to encrypt positively correlates to the number of recipients. Decryption speed is constant, so basically if that’s too slow, you’re just SOL. Really it’s about prefetching, and doing that accurately when the only context you have on a “pod” is the timestamps on its surrounding “events” or “messages.”