I've been working on Arkfile for over a year now, building and co-designing the system largely with AI tools. Along the way I've learned a lot about CGO (Go + C), the OPAQUE authentication protocol, and rqlite database innards.
I'm excited to continue building it out.
As I have been working on it, I've adapted to the realities of what I'm trying to build. Where the rubber meets the road, building truly zero-knowledge systems is hard.
I had a mental model going in of performing client-side encryption and decryption of files.
I've adjusted from thinking that I must do all of this in Go compiled to WebAssembly on the client side to accepting the tradeoffs of the Web Crypto API and TypeScript for the browser side of the app.
I've also built out a large codebase dedicated to command-line users, which uses strictly Go (or CGO) code via three CLI tools: arkfile-admin (sysadmin tool), arkfile-client (auth/account access tool), and cryptocli (offline crypto utility).
So client-side encryption and decryption of files is certainly doable in multiple ways, and I think leveraging TypeScript for the browser and Go for the command line makes sense. It is easy enough to understand, and I've developed a new unified config system that keeps the server, the web app users, and the command-line users aligned on the critical crypto parameters and algorithms, as well as on password strength requirements. For this I've developed two JSON config files: one for password requirements and one for password hashing parameters. When building the server app and the Go CLI tools, these configs are embedded into the executables. Browser clients query the server's API for the configs and cache them client-side. The configs specify things like the minimum character count for each type of password (account: 14; custom: 14; shared file: 18), the minimum bits of entropy per password (60 bits), and the specific Argon2id values used in securely hashing passwords: memory (256m), iterations (8t), and parallelism (4p).
- password requirements: https://github.com/arkfile/Arkfile/blob/main/crypto/password-requirements.json
- argon2id params: https://github.com/arkfile/Arkfile/blob/main/crypto/argon2id-params.json
Where I didn't have a mental model previously was in removing, obfuscating, or encrypting all additional data and metadata about users and files.
Most web apps start with the assumption that you sign up with an email address. I've removed that and require only a username of your choosing.
Most web apps log IP addresses somewhere. I've stripped these out completely. Where necessary, to bolster the server's defenses, I've employed an ephemeral, pseudonymous Entity ID that is derived from the IP address via HMAC and is thus irreversible. This Entity ID can be used for rate limiting where necessary, such as on registration attempts and shared-file URL enumeration attempts. Resetting these IDs every day means we don't inadvertently block different visitors who happen to use the same IP address later on.
Additionally, all file metadata, including the SHA-256 digests of the original plaintext content and the filenames, is encrypted client-side too. These are recoverable only by the original user. We also append a small, random amount of padding to the end of each client-encrypted file before uploading it to the object storage backend, to reduce the likelihood that files can be identified purely by their post-encryption size.
These enhancements came about over time through hard-won trial and error (multiple refactors) and back-and-forth conversations (even three- and four-party conversations) between me and the best models of the day, starting with Claude 3.7 and continuing to today's Claude 4.5, Opus, ChatGPT 5, DeepSeek, etc.
As I've gone on in the development of Arkfile, I've developed a couple of mental models and approaches to incremental improvement in the app.
The first is my "dev reset" script. This basically nukes the entire current installation and redeploys the app on the local machine for dev/test purposes, including seeding the database with a fixed admin dev/test user for end-to-end testing purposes.
Then, with this dev reset tool, small and large changes alike can be rolled out consistently every time. If something breaks in the build pipeline, it is immediately surfaced. If the app fails to start up, that is instantly clear. And from there we have a baseline for redeploying the app and testing against the server's API from either the Go CLI utils or the browser web app.
From there I've built a number of scripts to perform end-to-end testing. Ideally, we don't see regressions once we hit a certain milestone, such as validating that OPAQUE authentication works for registration and login for a specific test user. If we do see regressions, they are at least immediately obvious, because we run the same test script after every redeployment of the app. The current iteration of the end-to-end test script is named 'e2e-test.sh'.
- End-to-end test script: https://github.com/arkfile/Arkfile/blob/main/scripts/testing/e2e-test.sh
The near-term goal is to make sure that OPAQUE auth works the same way using the libopaque CGO and WASM library packages, which are integrated with the Go CLI tools for CLI users and with the web app for browser clients.
The end-to-end test script is most useful for validating the Go CLI app usage. For the web app, I am working on a hybrid approach combining manual testing through the browser, automated testing with agentic AI agents, and bash scripts for programmatic testing.
After proving out the Auth system further, I will do the same for File encryption/decryption and then File Sharing.
More to come.