stonks-oracle/docs/backup-restore.md
Celes Renata 88ad1e8d99 feat: comprehensive docs, unit tests, docker-compose app services
- Add scheduler and ingestion unit tests (test_scheduler_unit.py, test_ingestion_unit.py)
- Add all 13 app services + dashboard to docker-compose.yml
- Add full documentation suite: API reference, Helm reference, Docker deployment guide,
  3 architecture diagrams (K8s, Docker Compose, data pipeline), AI agent guide,
  backup/restore guide, observability/metrics reference, per-service docs
- Add intelligence pipeline deep-dive docs with Mermaid diagrams
- Update README with documentation index and links
- Add specs for comprehensive-quality-docs, intelligence-pipeline-deep-dive,
  sanitized-pipeline-docs
2026-04-22 02:56:41 +00:00

Backup and Restore Guide

This guide documents every backup and restore script in the Stonks Oracle platform: CLI options, storage locations, retention policies, and disaster-recovery procedures.

Overview

Stonks Oracle provides two tiers of backup tooling:

| Tier | Scripts | Scope | Storage |
| --- | --- | --- | --- |
| Local (kubectl-based) | backup-db.sh, restore-db.sh, backup-redis.sh | Individual data stores, streamed to the operator's machine | ~/backups/stonks-oracle/ (local filesystem) |
| Cluster (Kubernetes Job) | backup.sh, restore.sh | Full platform (PostgreSQL + all MinIO buckets) | NFS share at 192.168.42.8:/volume1/Kubernetes/stonks |

All scripts live in the scripts/ directory and require kubectl access to the cluster.


Local Backup Scripts

backup-db.sh — PostgreSQL Database Backup

Creates a compressed pg_dump of the stonks database and optionally uploads it to MinIO.

Usage:

./scripts/backup-db.sh                   # backup to local file
./scripts/backup-db.sh --upload-minio    # backup + upload to MinIO

CLI Arguments:

| Argument | Required | Description |
| --- | --- | --- |
| --upload-minio | No | Upload the backup file to the stonks-backups MinIO bucket after creating it |

Environment Variables:

| Variable | Default | Description |
| --- | --- | --- |
| BACKUP_DIR | ~/backups/stonks-oracle | Local directory where backup files are stored |

What it captures:

  • Full pg_dump of the stonks database (all tables, data, sequences)
  • Dump flags: --no-owner --no-privileges --clean --if-exists
  • Output format: gzip-compressed SQL (.sql.gz)

How it works:

  1. Runs pg_dump inside the PostgreSQL pod (postgresql-1 in the postgresql-service namespace) and streams the compressed output to the local machine
  2. Validates the backup is non-empty and counts tables as a sanity check
  3. If --upload-minio is specified, attempts to create the stonks-backups bucket (if it doesn't exist) and stages the file for upload
  4. Prunes old backups, keeping only the last 7 files matching stonks-*.sql.gz
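
The validation in step 2 could look roughly like the sketch below. It is not taken from backup-db.sh: the function name verify_backup is hypothetical, and counting CREATE TABLE statements is an assumed way to "count tables" in a plain-SQL dump.

```shell
# Sanity-check a gzip-compressed SQL dump before trusting it.
# verify_backup is a hypothetical helper name, not a function from backup-db.sh.
verify_backup() {
  local file="$1"

  # Reject missing or zero-byte files.
  if [ ! -s "$file" ]; then
    echo "backup is missing or empty: $file" >&2
    return 1
  fi

  # gzip -t tests archive integrity without writing any output.
  if ! gzip -t "$file" 2>/dev/null; then
    echo "backup is not a valid gzip file: $file" >&2
    return 1
  fi

  # Count CREATE TABLE statements as a rough completeness check.
  local tables
  tables=$(gzip -dc "$file" | grep -c '^CREATE TABLE' || true)
  echo "tables: $tables"
  [ "$tables" -gt 0 ]
}
```

For example, `verify_backup ~/backups/stonks-oracle/stonks-20260415-180000.sql.gz` prints the table count and returns non-zero if the dump is empty, corrupt, or contains no tables.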

Storage:

  • Local path: ~/backups/stonks-oracle/stonks-<YYYYMMDD-HHMMSS>.sql.gz
  • MinIO bucket (optional): stonks-backups

Retention: Keeps the last 7 backups. Older files matching stonks-*.sql.gz in the backup directory are automatically deleted.
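
A minimal sketch of this retention rule follows, assuming the timestamped file names sort chronologically (which the YYYYMMDD-HHMMSS pattern guarantees). The function name prune_old_backups is hypothetical and the actual script may implement pruning differently.

```shell
# Keep the newest N dumps in a directory and delete the rest.
# prune_old_backups is a hypothetical helper, not copied from backup-db.sh.
prune_old_backups() {
  local dir="$1"
  local keep="${2:-7}"

  # Names embed YYYYMMDD-HHMMSS, so a reverse sort lists newest first;
  # tail then selects everything after the first $keep entries.
  ls -1 "$dir"/stonks-*.sql.gz 2>/dev/null | sort -r | tail -n +"$((keep + 1))" |
  while IFS= read -r old; do
    rm -f -- "$old"
  done
}
```

Calling `prune_old_backups "$BACKUP_DIR" 7` leaves the seven most recent stonks-*.sql.gz files in place. Files that do not match the pattern, such as Redis RDB snapshots, are not touched.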


backup-redis.sh — Redis State Backup

Triggers a Redis BGSAVE and copies the RDB dump file to the local machine.

Usage:

./scripts/backup-redis.sh

CLI Arguments: None.

Environment Variables:

| Variable | Default | Description |
| --- | --- | --- |
| BACKUP_DIR | ~/backups/stonks-oracle | Local directory where the RDB file is stored |
| REDIS_PASSWORD | PSCh4ng3me! | Redis authentication password |

What it captures:

  • Redis RDB snapshot (dump.rdb) containing all in-memory state: deduplication markers, queue contents, rate-limit counters, cached values

How it works:

  1. Triggers BGSAVE on the Redis master pod (redis-master-0 in the redis-service namespace)
  2. Waits 5 seconds for the background save to complete, then logs the LASTSAVE timestamp
  3. Copies the RDB file from the pod. Tries /data/dump.rdb first, then falls back to /var/lib/redis/dump.rdb and /bitnami/redis/data/dump.rdb
  4. Prints Redis keyspace statistics for verification
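
A fixed 5-second wait can be too short for a large dataset. As an alternative to what the script does, the sketch below polls the rdb_bgsave_in_progress field of Redis `INFO persistence` until the save finishes. The function names are hypothetical, and the command that fetches the INFO output is passed in by the caller.

```shell
# Read `INFO persistence` output on stdin and print the rdb_bgsave_in_progress
# value: 1 while a background save is running, 0 once it has finished.
bgsave_in_progress() {
  # Redis replies use CRLF line endings, so strip the carriage returns first.
  tr -d '\r' | awk -F: '$1 == "rdb_bgsave_in_progress" { print $2 }'
}

# wait_for_bgsave TIMEOUT_SECONDS COMMAND...
# COMMAND must print `INFO persistence` output; it is polled once per second.
wait_for_bgsave() {
  local timeout="$1"
  shift
  local waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if [ "$("$@" | bgsave_in_progress)" = "0" ]; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "BGSAVE still running after ${timeout}s" >&2
  return 1
}
```

A possible invocation, reusing the pod and namespace named above: `wait_for_bgsave 60 kubectl exec -n redis-service redis-master-0 -- redis-cli -a "$REDIS_PASSWORD" INFO persistence`.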

Storage:

  • Local path: ~/backups/stonks-oracle/redis-<YYYYMMDD-HHMMSS>.rdb

Retention: No automatic pruning. Old Redis backups accumulate and must be cleaned up manually.


restore-db.sh — PostgreSQL Database Restore

Restores a pg_dump backup into the stonks database with full service scale-down/scale-up.

Usage:

./scripts/restore-db.sh <backup-file.sql.gz>
./scripts/restore-db.sh ~/backups/stonks-oracle/stonks-20260415-180000.sql.gz

If called without arguments, lists available backups in ~/backups/stonks-oracle/.

CLI Arguments:

| Argument | Required | Description |
| --- | --- | --- |
| <backup-file.sql.gz> | Yes | Path to the gzip-compressed SQL backup file to restore |

What it restores:

  • All tables, data, sequences, and indexes in the stonks database
  • Re-grants ALL PRIVILEGES to the stonks user on all tables and sequences after restore

Service scale-down/scale-up procedure:

  1. Terminates active connections — Runs pg_terminate_backend() for all connections to the stonks database
  2. Scales down all deployments in the stonks-oracle namespace to 0 replicas to prevent reconnections
  3. Waits 10 seconds for pods to terminate
  4. Restores the backup using psql --single-transaction (piped from zcat)
  5. Re-grants permissions to the stonks user
  6. Verifies the restore by counting tables
  7. Scales all deployments back to 1 replica, then scales ingestion and parser to 2 replicas
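
The scale-down and scale-up steps can be sketched as a wrapper around the restore command. This is an illustration under stated assumptions, not the contents of restore-db.sh: the helper names run and with_services_stopped are hypothetical, the deployment names ingestion and parser are inferred from step 7, and the sketch omits the connection-termination, re-grant, and verification steps.

```shell
# Hypothetical dry-run wrapper: with DRY_RUN=1 it prints the command
# instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run a command while every deployment in the namespace is scaled to zero,
# then scale back up even if the command failed.
with_services_stopped() {
  local ns="stonks-oracle"
  local status=0

  run kubectl scale deployment --all -n "$ns" --replicas=0
  run sleep 10                      # give pods time to terminate

  "$@" || status=$?                 # the restore step itself

  run kubectl scale deployment --all -n "$ns" --replicas=1
  # Deployment names below are assumed from the prose ("ingestion and parser").
  run kubectl scale deployment/ingestion deployment/parser -n "$ns" --replicas=2
  return "$status"
}
```

Setting DRY_RUN=1 and calling `with_services_stopped echo restoring` prints the kubectl commands without touching the cluster, which is a cheap way to review the sequence. Scaling back up after a failed restore is a design choice in this sketch; an operator may prefer to leave services down until the database is known to be good.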

Data loss implications:

WARNING: This replaces ALL data in the stonks database with the backup contents. Any data written after the backup was taken is permanently lost. The script requires interactive confirmation — you must type yes to proceed.


Cluster Backup Scripts (Kubernetes Jobs)

backup.sh — Full Platform Backup (PostgreSQL + MinIO)

Runs a Kubernetes Job that backs up both PostgreSQL and all MinIO buckets to an NFS share.

Usage:

bash scripts/backup.sh

CLI Arguments: None.

What it captures:

  • PostgreSQL: Full pg_dump in custom format (-Fc) as stonks.pgdump
  • MinIO buckets (8 buckets mirrored):
    • stonks-raw-market — Raw market data from Polygon.io
    • stonks-raw-news — Raw news articles
    • stonks-raw-filings — Raw SEC filings
    • stonks-normalized — Normalized documents
    • stonks-llm-prompts — LLM prompt logs
    • stonks-llm-results — LLM extraction results
    • stonks-lakehouse — Parquet fact tables for Trino
    • stonks-audit — Audit trail artifacts
  • Manifest: manifest.json with backup name, timestamp, and bucket list

How it works:

  1. Deletes any previous stonks-backup Job
  2. Creates a Kubernetes Job using postgres:18-alpine with NFS volume mount and MinIO credentials from cluster secrets
  3. Inside the Job container:
    • Runs pg_dump with credentials from stonks-config ConfigMap and stonks-core-secrets Secret
    • Installs the MinIO client (mc) and mirrors each bucket to the NFS backup directory
    • Writes a manifest.json and updates the latest symlink
  4. Waits up to 600 seconds (10 minutes) for the Job to complete
  5. Job auto-cleans after 300 seconds (ttlSecondsAfterFinished)
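
The manifest step could be written along these lines. The JSON field names (backup_name, timestamp, buckets) are assumptions based on the description above, not the actual keys written by backup.sh, and write_manifest is a hypothetical helper name.

```shell
# write_manifest DIR BACKUP_NAME BUCKET...
# Writes DIR/manifest.json; the field names are illustrative assumptions.
write_manifest() {
  local dir="$1"
  local name="$2"
  shift 2

  # Build a comma-separated list of quoted bucket names.
  local buckets=""
  local b
  for b in "$@"; do
    buckets="${buckets:+$buckets, }\"$b\""
  done

  cat > "$dir/manifest.json" <<EOF
{
  "backup_name": "$name",
  "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "buckets": [$buckets]
}
EOF
}
```

Inside the Job this would be called once after the last bucket is mirrored, for example `write_manifest "$DIR" "$STAMP" stonks-raw-market stonks-raw-news`, with the remaining bucket names appended.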

Storage:

  • NFS path: 192.168.42.8:/volume1/Kubernetes/stonks/<backup-name>/
  • Directory structure:
    stonks-backup-YYYYMMDD-HHMMSS/
    ├── stonks.pgdump              # PostgreSQL custom-format dump
    ├── manifest.json              # Backup metadata
    └── minio/
        ├── stonks-raw-market/     # Mirrored bucket contents
        ├── stonks-raw-news/
        ├── stonks-raw-filings/
        ├── stonks-normalized/
        ├── stonks-llm-prompts/
        ├── stonks-llm-results/
        ├── stonks-lakehouse/
        └── stonks-audit/
    
  • A latest symlink always points to the most recent backup

Retention: No automatic pruning on NFS. Old backups must be cleaned up manually.


restore.sh — Full Platform Restore (PostgreSQL + MinIO)

Runs a Kubernetes Job that restores both PostgreSQL and MinIO buckets from an NFS backup.

Usage:

bash scripts/restore.sh                    # restore from "latest" symlink
bash scripts/restore.sh <backup-name>      # restore a specific backup

CLI Arguments:

| Argument | Required | Description |
| --- | --- | --- |
| <backup-name> | No | Name of the backup directory on NFS. Defaults to latest (symlink to most recent backup) |

What it restores:

  • PostgreSQL: Full database restore using pg_restore --clean --if-exists --no-owner --no-acl
  • MinIO buckets: All 8 buckets mirrored back with mc mirror --overwrite

How it works:

  1. Prints a warning and gives 5 seconds to abort (Ctrl+C)
  2. Deletes any previous stonks-restore Job
  3. Creates a Kubernetes Job that:
    • Validates the backup exists (stonks.pgdump file present)
    • Restores PostgreSQL using pg_restore with --clean (drops and recreates objects)
    • Installs mc and mirrors each bucket back from NFS to MinIO
    • Verifies the restore by querying row counts for key tables (companies, documents, intelligence, impacts, trends, recommendations)
  4. Waits up to 600 seconds for the Job to complete
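
The backup-selection and validation logic (default to latest, require stonks.pgdump) might be sketched as follows. resolve_backup is a hypothetical helper name and the mount point is supplied by the caller; the real Job may perform this check differently.

```shell
# resolve_backup ROOT [NAME]
# Prints the real directory of the chosen backup, or fails if it has no dump.
resolve_backup() {
  local root="$1"
  local name="${2:-latest}"
  local dir="$root/$name"

  # A usable backup must contain the PostgreSQL custom-format dump.
  if [ ! -f "$dir/stonks.pgdump" ]; then
    echo "backup not found: $dir/stonks.pgdump" >&2
    return 1
  fi

  # Resolve the "latest" symlink so logs show the actual backup name.
  (cd "$dir" && pwd -P)
}
```

With the NFS share mounted at /backup, `resolve_backup /backup` follows the latest symlink, while `resolve_backup /backup stonks-backup-20260422-030000` selects a specific backup directory (the name here is only an example).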

Data loss implications:

WARNING: This will DROP and recreate all objects in the stonks database. All MinIO bucket contents are overwritten. Any data written after the backup was taken is permanently lost. The script provides a 5-second abort window before proceeding.

Post-restore steps:

After the restore completes, restart all services to pick up the restored state:

kubectl rollout restart deployment -n stonks-oracle --all

MinIO Upload Option (--upload-minio)

The backup-db.sh script supports --upload-minio for off-host storage of database backups. When enabled:

  1. The script connects to MinIO through an ingestion pod in the stonks-oracle namespace
  2. Creates the stonks-backups bucket if it doesn't already exist
  3. Stages the backup file for upload

This provides a second copy of the database backup on object storage, separate from the operator's local filesystem. The full cluster backup (backup.sh) stores backups on NFS and does not use this flag — it backs up MinIO bucket contents rather than uploading database dumps to MinIO.
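
For comparison, an upload made directly from the operator's machine with the MinIO client could look like the sketch below. This is a different technique from the script's pod-based approach: it assumes mc is installed locally and that an alias (named backup here) was configured beforehand with `mc alias set`. The function name upload_backup is hypothetical.

```shell
# upload_backup FILE [ALIAS]
# Copies a dump into the stonks-backups bucket using a preconfigured mc alias.
upload_backup() {
  local file="$1"
  local alias_name="${2:-backup}"
  local bucket="stonks-backups"

  # Create the bucket only if it is missing.
  mc mb --ignore-existing "${alias_name}/${bucket}" || return 1

  # Copy the dump under its original file name.
  mc cp "$file" "${alias_name}/${bucket}/$(basename "$file")"
}
```

For example, `upload_backup ~/backups/stonks-oracle/stonks-20260415-180000.sql.gz` stores the object as stonks-backups/stonks-20260415-180000.sql.gz.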


Full Nuke and Rebuild Procedure

When a complete platform reset is needed (corrupted state, major schema changes, fresh start), follow this procedure:

Step 1: Tear Down Services

bash ~/sources/kube/stonks-oracle/runmelast.sh

Run this from gremlin-1. The script performs a Helm uninstall, which removes all Kubernetes resources in the stonks-oracle namespace. Database, MinIO, and Redis data are preserved because those services run in separate namespaces.

Step 2: Terminate Database Connections

kubectl exec -n postgresql-service postgresql-1 -c postgres -- \
  psql -U postgres -c \
  "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'stonks' AND pid <> pg_backend_pid();"

Step 3: Drop the Database

kubectl exec -n postgresql-service postgresql-1 -c postgres -- \
  psql -U postgres -c "DROP DATABASE IF EXISTS stonks;"

Step 4: Flush Redis

Clear all stonks:* keys to reset deduplication markers, queue contents, and cached state. The -r flag stops xargs from running DEL with no arguments when no keys match:

kubectl exec -n redis-service redis-master-0 -- \
  redis-cli -a 'PSCh4ng3me!' --scan --pattern 'stonks:*' | \
  xargs -r -L 100 kubectl exec -n redis-service redis-master-0 -- \
  redis-cli -a 'PSCh4ng3me!' DEL

Step 5: Redeploy

bash ~/sources/kube/stonks-oracle/runmefirst.sh

Run this from gremlin-1. The script performs:

  • Database creation and migration (all infra/migrations/*.sql files applied in order)
  • Helm install with secrets injected via --set flags
  • Rolling restart of all deployments

Step 6: Re-seed the Symbol Registry

POSTGRES_HOST=postgresql-rw.postgresql-service.svc.cluster.local \
POSTGRES_PASSWORD='St0nks0racl3!' \
POSTGRES_USER=stonks \
POSTGRES_DB=stonks \
.venv/bin/python -m services.symbol_registry.seed

This populates the 50 tracked companies across 10 sectors and 46 competitor relationships.


Daily Database Backup (cron)

Run backup-db.sh daily on a machine with kubectl access. The built-in retention keeps the last 7 backups automatically.

# Daily database backup at 2:00 AM
0 2 * * * /path/to/stonks-oracle/scripts/backup-db.sh --upload-minio >> /var/log/stonks-backup.log 2>&1

Weekly Full Backup (cron)

Run the full cluster backup weekly to capture both PostgreSQL and MinIO data on NFS:

# Weekly full backup (PostgreSQL + MinIO) on Sundays at 3:00 AM
0 3 * * 0 /path/to/stonks-oracle/scripts/backup.sh >> /var/log/stonks-full-backup.log 2>&1

Redis Backup Before Deployments

Redis state is transient (queues, dedup markers, caches) and rebuilds naturally. Back up Redis before major deployments or database resets as a precaution:

./scripts/backup-redis.sh

Kubernetes CronJobs

For fully automated in-cluster backups, create a CronJob based on the same Job spec used by backup.sh:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: stonks-backup
  namespace: stonks-oracle
spec:
  schedule: "0 2 * * *"          # Daily at 2:00 AM UTC
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 3600
      backoffLimit: 1
      template:
        spec:
          restartPolicy: Never
          volumes:
            - name: nfs-backup
              nfs:
                server: 192.168.42.8
                path: /volume1/Kubernetes/stonks
          containers:
            - name: backup
              image: postgres:18-alpine
              volumeMounts:
                - name: nfs-backup
                  mountPath: /backup
              envFrom:
                - configMapRef:
                    name: stonks-config
                - secretRef:
                    name: stonks-core-secrets
              env:
                - name: MINIO_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: stonks-core-secrets
                      key: MINIO_ACCESS_KEY
                - name: MINIO_SECRET_KEY
                  valueFrom:
                    secretKeyRef:
                      name: stonks-core-secrets
                      key: MINIO_SECRET_KEY
              command: ["sh", "-c"]
              args:
                - |
                  set -e
                  apk add --no-cache curl ca-certificates
                  STAMP="stonks-backup-$(date +%Y%m%d-%H%M%S)"
                  DIR="/backup/${STAMP}"
                  mkdir -p "${DIR}/minio"

                  # PostgreSQL backup
                  PGPASSWORD="${POSTGRES_PASSWORD}" pg_dump \
                    -h "${POSTGRES_HOST}" -p "${POSTGRES_PORT}" \
                    -U "${POSTGRES_USER}" -d "${POSTGRES_DB}" \
                    --no-owner --no-acl -Fc \
                    -f "${DIR}/stonks.pgdump"

                  # MinIO backup
                  curl -sL https://dl.min.io/client/mc/release/linux-amd64/mc -o /usr/local/bin/mc
                  chmod +x /usr/local/bin/mc
                  mc alias set backup "http://${MINIO_ENDPOINT}" "${MINIO_ACCESS_KEY}" "${MINIO_SECRET_KEY}" --api S3v4

                  for bucket in stonks-raw-market stonks-raw-news stonks-raw-filings stonks-normalized stonks-llm-prompts stonks-llm-results stonks-lakehouse stonks-audit; do
                    mc mirror "backup/${bucket}" "${DIR}/minio/${bucket}/" 2>/dev/null || true
                  done

                  ln -sfn "${STAMP}" /backup/latest
                  echo "Backup complete: ${DIR}"

Summary:

| What | Frequency | Script | Retention |
| --- | --- | --- | --- |
| Database only | Daily | backup-db.sh --upload-minio | Last 7 (auto-pruned) |
| Full platform (DB + MinIO) | Weekly | backup.sh | Manual cleanup on NFS |
| Redis snapshot | Before deployments | backup-redis.sh | Manual cleanup |