Data Directory Layout Specifications
Everything club persists at runtime lives under a single directory (/data in Docker, configurable elsewhere). The tree is organised into four durability tiers so backups, restores, and disk-cleanup scripts can target exactly the right slice.
This is the authoritative layout reference. Every other operational doc (backups, upgrades, self-hosting) links here.
The whole tree at a glance

    /data/
    ├── db/                          🔒 PRIMARY: back up
    │   ├── club.db                  SQLite database
    │   ├── club.db-shm              SQLite shared-memory file (auto)
    │   └── club.db-wal              SQLite write-ahead log (auto)
    │
    ├── blobs/                       🔒 PRIMARY: back up
    │   └── <package>/
    │       ├── <version>/
    │       │   ├── artifacts/
    │       │   │   ├── package.tar.gz    the archive
    │       │   │   └── package.json      sidecar metadata (S3/GCS only)
    │       │   └── screenshots/
    │       │       ├── 0                 raw bytes, mime type in DB
    │       │       ├── 1
    │       │       └── …
    │       └── dartdoc/                  only when DARTDOC_BACKEND=blob
    │           └── latest/
    │               ├── blob              concatenated, per-file gzipped
    │               └── index.json        BlobIndex: path → {start,end}
    │
    ├── cache/                       🟡 REGENERABLE: skip from backups
    │   ├── dartdoc/                 only when DARTDOC_BACKEND=filesystem
    │   │   └── <package>/
    │   │       └── latest/
    │   │           ├── index.html
    │   │           ├── __404error.html
    │   │           └── …
    │   ├── sdks/                    Dart/Flutter SDK installs
    │   │   ├── flutter-3.24.0/
    │   │   └── flutter-3.27.0/
    │   └── pub-cache/               pana's PUB_CACHE
    │       └── hosted/pub.dev/<pkg>-<ver>/
    │
    ├── logs/                        🟢 OBSERVABILITY: optional
    │   └── scoring.log              pana run log (append-only)
    │
    └── tmp/                         🟢 EPHEMERAL: never restore
        └── uploads/
            └── <upload-id>.tar.gz   in-flight publish tarball

Tier semantics
Each top-level directory has one job. The tiers exist specifically so operators can reason about what to back up, what to wipe, and what to ignore.
| Tier | Dir | Survives container restart? | Back up? | Safe to rm -rf? |
|---|---|---|---|---|
| Primary (DB) | db/ | Yes, must | Yes | No — loses all metadata |
| Primary (blobs) | blobs/ | Yes, must | Yes | No — loses tarballs + screenshots + blob-mode dartdoc |
| Regenerable | cache/ | Preferred | No | Yes — regenerates on demand |
| Observability | logs/ | Preferred | Only if you care about history | Yes |
| Ephemeral | tmp/ | Doesn't matter | No | Yes |
A correctly-configured backup is just two directories: db/ + blobs/. Everything else rebuilds itself:
- cache/dartdoc/ regenerates on the next scoring pass per package.
- cache/sdks/ re-downloads on first boot when pana scheduling kicks in.
- cache/pub-cache/ populates as scoring fetches dependencies.
- logs/scoring.log is append-only observability.
- tmp/uploads/ is per-request scratch that expires along with the DB session.
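Because only db/ and blobs/ hold primary state, a disk-cleanup script can confine itself to the other tiers. The following is a minimal sketch rather than a shipped tool: the function name clean_regenerable is hypothetical, and it assumes the default directory names shown in the tree.

```shell
#!/bin/sh
# Hypothetical helper: empty the regenerable and ephemeral tiers of a data dir.
# It never touches db/ or blobs/.
clean_regenerable() {
  data_dir=$1
  # Refuse an empty or missing path so a typo cannot expand to "/cache".
  [ -n "$data_dir" ] && [ -d "$data_dir" ] || return 1
  for tier in cache tmp; do
    if [ -d "$data_dir/$tier" ]; then
      # Remove the tier's contents but keep the directory itself.
      find "$data_dir/$tier" -mindepth 1 -delete
    fi
  done
}
```

Calling clean_regenerable /data would wipe cache/ and tmp/ only. Running it while the server is stopped is the cautious choice, since an in-flight publish keeps its tarball under tmp/uploads/.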
Path controls (env vars)
Every path is overridable. Defaults listed assume DATA_DIR=/data.
| Path | Env var | YAML key | Default |
|---|---|---|---|
| SQLite DB file | SQLITE_PATH | db.sqlite_path | /data/db/club.db |
| Blob store root | BLOB_PATH | blob.path | /data/blobs |
| Dartdoc (filesystem mode) | DARTDOC_PATH | dartdoc_path | /data/cache/dartdoc |
| Flutter/Dart SDKs | SDK_BASE_DIR | — | /data/cache/sdks |
| Pana pub-cache | — | — | /data/cache/pub-cache (hardcoded today) |
| Logs directory | LOGS_DIR | — | /data/logs |
| Temp uploads | TEMP_DIR | temp_dir | /data/tmp/uploads |
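The table can be read as "explicit env var first, otherwise a default derived from DATA_DIR". The sketch below mirrors that reading for use in operator scripts. It is an assumption that the server derives its defaults exactly this way, YAML-key precedence is not modelled, and the names resolve_paths and PUB_CACHE_DIR are hypothetical.

```shell
#!/bin/sh
# Resolve each path as the table describes: an explicit env var wins,
# otherwise fall back to the default under DATA_DIR.
resolve_paths() {
  DATA_DIR=${DATA_DIR:-/data}
  SQLITE_PATH=${SQLITE_PATH:-$DATA_DIR/db/club.db}
  BLOB_PATH=${BLOB_PATH:-$DATA_DIR/blobs}
  DARTDOC_PATH=${DARTDOC_PATH:-$DATA_DIR/cache/dartdoc}
  SDK_BASE_DIR=${SDK_BASE_DIR:-$DATA_DIR/cache/sdks}
  LOGS_DIR=${LOGS_DIR:-$DATA_DIR/logs}
  TEMP_DIR=${TEMP_DIR:-$DATA_DIR/tmp/uploads}
  # No override exists today; the table lists this path as hardcoded.
  # PUB_CACHE_DIR is a name invented for this sketch.
  PUB_CACHE_DIR=$DATA_DIR/cache/pub-cache
}
```

A backup or cleanup script can call resolve_paths once and then refer to the variables instead of repeating literal /data paths.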
db/ — SQLite database
Three files managed together:
| File | Role | Lock ownership |
|---|---|---|
| club.db | The primary database: all tables, FTS5 index, triggers, pragmas | SQLite main process |
| club.db-shm | WAL shared memory (the wal-index), memory-mapped by open connections | SQLite runtime |
| club.db-wal | Write-ahead log; holds committed changes not yet checkpointed into club.db, so readers need it to see a consistent snapshot | SQLite runtime |
Back all three up together or none at all. The sqlite3 .backup command handles this correctly; cp while the server is running is unsafe.
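A minimal sketch of a live snapshot using the sqlite3 command-line shell's built-in .backup dot-command follows. The function name snapshot_db and the example paths are hypothetical, and the sketch assumes neither path contains a single quote.

```shell
#!/bin/sh
# Take a consistent snapshot of a live SQLite database with the sqlite3 CLI.
# .backup copies the database through SQLite's backup API, so the result
# includes changes that are still sitting in the WAL.
snapshot_db() {
  src=$1    # e.g. /data/db/club.db
  dest=$2   # e.g. /backups/club.db
  sqlite3 "$src" ".backup '$dest'"
}
```

The resulting file is a single self-contained database, so it can be archived without its -shm and -wal companions.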
blobs/ — Blob store root
Everything addressable through the BlobStore interface lives here. The layout is identical whether the backend is filesystem or S3/GCS (in the latter case, the prefix becomes the bucket’s key structure).
Per-version layout

    blobs/<package>/<version>/
    ├── artifacts/
    │   ├── package.tar.gz   ← the archive
    │   └── package.json     ← {size, sha256, createdAt}; S3/GCS only
    └── screenshots/
        ├── 0                ← raw bytes, MIME type stored in DB
        ├── 1
        └── …

- package.tar.gz is the literal archive the club publish / dart pub publish client uploaded, atomically renamed into place after SHA-256 verification.
- package.json is a tiny sidecar that exists only on the S3 and GCS backends, holding the fields object-storage metadata can’t cheaply return (SHA-256 of upload-time bytes + pinned createdAt). The filesystem backend uses stat(2) and on-the-fly hashing instead.
- Screenshots are referenced by their zero-based index in the pubspec’s screenshots: declaration, not by filename. MIME types are stored in the DB rather than inferred from the filesystem.
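The verify-then-rename step for package.tar.gz can be illustrated on the filesystem backend as follows. This is a sketch under stated assumptions, not the server's code: sha256_of and install_archive are hypothetical names, and the rename is only atomic when the upload and the blob root share a filesystem.

```shell
#!/bin/sh
# Print the SHA-256 hex digest of a file, using whichever tool is installed.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# Hypothetical helper: move an uploaded tarball into the blob tree only if its
# digest matches the expected value. On a mismatch nothing is moved.
install_archive() {
  upload=$1; expected_sha=$2; blob_root=$3; package=$4; version=$5
  dest_dir=$blob_root/$package/$version/artifacts
  [ "$(sha256_of "$upload")" = "$expected_sha" ] || return 1
  mkdir -p "$dest_dir"
  # mv is a rename(2) when source and destination are on one filesystem.
  mv "$upload" "$dest_dir/package.tar.gz"
}
```

Hashing before the rename means a reader of the blob tree never observes a partially written or unverified archive at the final path.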
Per-package dartdoc (blob mode only)
When DARTDOC_BACKEND=blob (see Dartdoc Serving Specifications), an additional subtree sits under the package root:

    blobs/<package>/dartdoc/latest/
    ├── blob         ← concatenated, gzipped-per-file contents
    └── index.json   ← BlobIndex mapping path → {start, end} within blob

Only latest/ exists; older versions’ dartdoc is discarded on regeneration. The blobId inside index.json embeds a millisecond timestamp so cache entries keyed on it expire naturally without explicit invalidation.
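To make the blob plus index pairing concrete, the sketch below pulls one file out of the concatenated blob given the {start, end} pair recorded for its path. It assumes offsets are byte positions, that start is zero-based and end is exclusive, and that each slice is an independent gzip stream; none of this is confirmed here beyond the "concatenated, gzipped-per-file" description. The function name extract_blob_entry is hypothetical, and looking up the pair in index.json is left to the caller.

```shell
#!/bin/sh
# Hypothetical helper: print one file from the concatenated blob.
# ASSUMPTION: offsets are bytes, start is zero-based, end is exclusive.
extract_blob_entry() {
  blob=$1; start=$2; end=$3
  # tail -c +N counts from 1, so begin at byte start+1,
  # then keep exactly end-start bytes and decompress that slice.
  tail -c "+$((start + 1))" "$blob" | head -c "$((end - start))" | gunzip -c
}
```

Compressing per file is what makes this kind of random access possible: a single slice can be decompressed without reading the rest of the blob.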
cache/ — Re-derivable state
cache/dartdoc/ (filesystem mode only)
Full HTML tree as emitted by dart doc:

    cache/dartdoc/<package>/latest/
    ├── index.html
    ├── __404error.html
    ├── <library_name>/…     one directory per top-level library
    ├── static-assets/       CSS, JS, fonts
    └── …

Regenerated on every successful scoring pass for the current latest version. If deleted, /documentation/<pkg>/latest/ returns 404 until the next scoring run. The latest-only policy means older versions never leave a directory here.
cache/sdks/
Dart/Flutter SDK installations managed by the admin-settings SDK page. Each directory is a full SDK install (~1–3 GB). The admin UI knows how to discover these on startup and re-index them; you can safely delete the whole directory and the admin UI will report “no SDK available” until you reinstall.
cache/pub-cache/
Shared PUB_CACHE directory used by pana when it resolves a package’s dependencies during scoring. Grows with the union of all transitive deps seen so far. Bounded only by the set of scored packages; self-hosted registries with a small package count stay under ~1 GB.
logs/ — Observability
Append-only log files. Currently just one:
| File | What it holds |
|---|---|
| scoring.log | Every pana run: start, finish, exit code, truncated stderr on failure |
Rotate externally if you care about size bounds (logrotate, fluent-bit, etc.). The server never rotates on its own because pana’s cadence is low enough that unbounded growth takes many months.
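Where logrotate is not available, a size-based copy-then-truncate rotation can be sketched in a few lines. The function name rotate_log and the single-generation policy are hypothetical, and the approach assumes the server holds scoring.log open in append mode, so truncating in place does not break its file handle. Lines written between the copy and the truncate can be lost.

```shell
#!/bin/sh
# Hypothetical copy-then-truncate rotation for scoring.log.
# ASSUMPTION: the writer appends to an open handle, so the file must be
# truncated in place instead of being renamed away.
rotate_log() {
  log=$1; max_bytes=$2
  [ -f "$log" ] || return 0
  size=$(wc -c < "$log" | tr -d ' ')
  if [ "$size" -gt "$max_bytes" ]; then
    cp "$log" "$log.1"   # keep one previous generation
    : > "$log"           # truncate without replacing the inode
  fi
}
```

For example, rotate_log /data/logs/scoring.log 10485760 from a daily cron entry would keep the live file under roughly 10 MB.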
tmp/ — Ephemeral
Per-request scratch only.
tmp/uploads/
The dart pub publish upload protocol is three-step (reserve → upload bytes → finalize). The tarball lives here between step 2 and step 3. Entries older than the DB’s upload-session TTL (default 10 min) can be safely deleted at any time; the finalize path will reject their session IDs anyway.
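A periodic sweep of abandoned uploads might look like the sketch below. The function name prune_stale_uploads is hypothetical, and it assumes a tarball's modification time is a reasonable proxy for the age of its upload session.

```shell
#!/bin/sh
# Hypothetical sweeper: delete upload tarballs older than the session TTL.
prune_stale_uploads() {
  uploads_dir=$1
  ttl_minutes=${2:-10}   # matches the documented default TTL
  [ -d "$uploads_dir" ] || return 0
  # -mmin +N selects files last modified more than N minutes ago.
  find "$uploads_dir" -type f -name '*.tar.gz' -mmin "+$ttl_minutes" \
    -exec rm -f {} +
}
```

Running prune_stale_uploads /data/tmp/uploads from cron is safe at any time under that assumption, because finalize rejects expired session IDs regardless of whether the file still exists.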
Back-up matrix
This is what goes in your cron job. Everything else is derived or ephemeral.

    # Primary state: covers everything the registry needs to serve.
    # Archive db/ only while the server is stopped, or from a sqlite3 .backup
    # snapshot; copying live SQLite files is unsafe.
    tar czf /backups/club-$(date +%Y%m%d).tar.gz \
      -C /data db blobs

    # Optional: include logs for historical observability.
    tar czf /backups/club-logs-$(date +%Y%m%d).tar.gz \
      -C /data logs

Everything under cache/ and tmp/ is intentionally excluded. For the full restore procedure including PostgreSQL and S3/GCS variants, see Backup & Restore.
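A quick sanity check on the resulting archive can catch a misconfigured job before it matters. The sketch below only inspects the top-level directory names stored in the tarball; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: list the top-level directories stored in a backup archive.
backup_top_dirs() {
  tar tzf "$1" | cut -d/ -f1 | sort -u
}

# Succeeds only when the archive holds exactly blobs and db at the top level.
verify_primary_backup() {
  [ "$(backup_top_dirs "$1" | tr '\n' ' ')" = "blobs db " ]
}
```

This confirms the shape of the archive, not the integrity of the database inside it; a restore test remains the stronger check.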
Switching DARTDOC_BACKEND between modes
- filesystem → blob: on the next scoring run per package, output starts landing under blobs/<pkg>/dartdoc/latest/. The stale cache/dartdoc/<pkg>/ tree becomes unused; it is safe to rm -rf /data/cache/dartdoc once every package has re-scored.
- blob → filesystem: on the next scoring run per package, output lands under cache/dartdoc/<pkg>/latest/. The stale blobs/<pkg>/dartdoc/ subtree is unused; it is safe to delete via DELETE <pkg>/dartdoc/* (S3/GCS) or rm -rf /data/blobs/<pkg>/dartdoc (filesystem) once every package has re-scored.
In both cases, until a package’s dartdoc is regenerated, its /documentation/<pkg>/latest/ endpoint returns a 404 in the new backend. This is expected; clients will retry once the next scoring pass completes.
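Before deleting the stale tree after a filesystem → blob switch, it helps to confirm that every package really has re-scored. The sketch below lists packages that still have a filesystem dartdoc tree but no blob-mode index.json. It assumes both stores are on local disk under the default layout, and the function name pending_dartdoc_migration is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight for a filesystem -> blob switch: print every package
# that has a cache/dartdoc/ tree but no blob-mode index.json yet.
# ASSUMPTION: package names are the immediate children of cache/dartdoc/.
pending_dartdoc_migration() {
  data_dir=$1
  for dir in "$data_dir"/cache/dartdoc/*/; do
    [ -d "$dir" ] || continue   # the glob matched nothing
    pkg=$(basename "$dir")
    [ -f "$data_dir/blobs/$pkg/dartdoc/latest/index.json" ] || echo "$pkg"
  done
}
```

An empty result from pending_dartdoc_migration /data suggests the old cache/dartdoc tree is no longer needed.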
Related specs
- Dartdoc Serving Specifications: request routing, cache mechanics, MIME safelist.
- Validation & Scoring Specifications: what runs before anything lands in blobs/ or cache/.
- Environment Variables: every path’s env-var override.
- Backup & Restore: script templates and restore procedure.