Warehouse is a distributed object storage system (an alternative to S3) that is fully self-hostable.

Features:

  • Scalable: Deploy as many volume servers as you want to expand the storage pool
  • Optimized for small files - based on the Haystack paper and inspired by SeaweedFS
  • Web UI - easily manage volume servers, buckets, objects, and API keys
  • Fine-grained API keys - create API keys with access to only what you allow

Docs:

Feature Goals

  • Basic Bucket CRUD
  • Basic Object CRUD
  • Web UI - In progress
  • Authentication
  • Golang client
  • TypeScript client
  • Graph based upload processing (replacement for a message broker)
  • Cache server (would reconstruct chunks as well)
  • FFmpeg integration
  • TensorFlow integration
  • In memory database instead of SQLite

Unfortunately, I've run out of time to implement all of these features!

I'll post updates during Stardance by Hack Club.

Optimizations

  1. Optimized for small files
    • Usually, reading a file means first reading its metadata from disk (unless it's in cache) and then doing a second read for the file contents. This adds overhead.
    • Each file usually has over 128 bytes of metadata overhead (256+ bytes in ext4!)
    • Warehouse solves this problem by:
      1. Having a very small metadata overhead (17 bytes)
      2. Storing all metadata in memory
    • This means that each read from a volume server is only one disk read, lowering latency
  2. Direct connections
    • Some file storage systems proxy data through an intermediate server on its way to the underlying storage
    • This increases bandwidth usage, since every byte is transferred twice
    • Warehouse uses direct connections (with JWTs) so that clients upload to and read directly from the volume server where the file (or chunk) is located
  3. Chunking
    • Warehouse supports large files using chunks
    • Each file is split into 80 MiB chunks by the client, which means less work for the server
    • The client uploads chunks directly to multiple volumes, spreading out the work and increasing upload speed through concurrency
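
The client-side chunking described in item 3 can be sketched roughly as follows. This is a minimal illustration in Go, not Warehouse's actual client code: only the 80 MiB chunk size comes from this README, while `chunkRange` and `planChunks` are hypothetical names.

```go
package main

import "fmt"

// chunkSize is the 80 MiB chunk size stated in the README.
const chunkSize int64 = 80 << 20

// chunkRange describes one chunk as a half-open byte range [Start, End).
// The type and its fields are hypothetical, not part of Warehouse's API.
type chunkRange struct {
	Index int
	Start int64
	End   int64
}

// planChunks splits a file of fileSize bytes into consecutive ranges of at
// most size bytes each. The last range may be shorter than the others.
func planChunks(fileSize, size int64) []chunkRange {
	if fileSize <= 0 || size <= 0 {
		return nil
	}
	var chunks []chunkRange
	for start := int64(0); start < fileSize; start += size {
		end := start + size
		if end > fileSize {
			end = fileSize
		}
		chunks = append(chunks, chunkRange{Index: len(chunks), Start: start, End: end})
	}
	return chunks
}

func main() {
	// A 200 MiB file yields two full 80 MiB chunks and one 40 MiB remainder.
	for _, c := range planChunks(200<<20, chunkSize) {
		fmt.Printf("chunk %d: bytes [%d, %d)\n", c.Index, c.Start, c.End)
	}
}
```

Because the ranges do not overlap, a client could hand each one to its own goroutine and upload them to different volume servers concurrently.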