# Backup and Restore Guide

This guide documents every backup and restore script in the Stonks Oracle platform, their CLI options, storage locations, retention policies, and procedures for disaster recovery.

## Overview

Stonks Oracle provides two tiers of backup tooling:

| Tier | Scripts | Scope | Storage |
|------|---------|-------|---------|
| **Local (kubectl-based)** | `backup-db.sh`, `restore-db.sh`, `backup-redis.sh` | Individual data stores, streamed to the operator's machine | `~/backups/stonks-oracle/` (local filesystem) |
| **Cluster (Kubernetes Job)** | `backup.sh`, `restore.sh` | Full platform (PostgreSQL + all MinIO buckets) | NFS share at `192.168.42.8:/volume1/Kubernetes/stonks` |

All scripts live in the `scripts/` directory and require `kubectl` access to the cluster.

---

## Local Backup Scripts

### `backup-db.sh`: PostgreSQL Database Backup

Creates a compressed `pg_dump` of the `stonks` database and optionally uploads it to MinIO.

**Usage:**

```bash
./scripts/backup-db.sh                  # backup to local file
./scripts/backup-db.sh --upload-minio   # backup + upload to MinIO
```

**CLI Arguments:**

| Argument | Required | Description |
|----------|----------|-------------|
| `--upload-minio` | No | Upload the backup file to the `stonks-backups` MinIO bucket after creating it |

**Environment Variables:**

| Variable | Default | Description |
|----------|---------|-------------|
| `BACKUP_DIR` | `~/backups/stonks-oracle` | Local directory where backup files are stored |

**What it captures:**

- Full `pg_dump` of the `stonks` database (all tables, data, sequences)
- Dump flags: `--no-owner --no-privileges --clean --if-exists`
- Output format: gzip-compressed SQL (`.sql.gz`)

**How it works:**

1. Runs `pg_dump` inside the PostgreSQL pod (`postgresql-1` in `postgresql-service` namespace) and streams the compressed output to the local machine
2. Validates the backup is non-empty and counts tables as a sanity check
3. If `--upload-minio` is specified, attempts to create the `stonks-backups` bucket (if it doesn't exist) and stages the file for upload
4. Prunes old backups, keeping only the last 7 files matching `stonks-*.sql.gz`

**Storage:**

- Local path: `~/backups/stonks-oracle/stonks-.sql.gz`
- MinIO bucket (optional): `stonks-backups`

**Retention:** Keeps the last 7 backups. Older files matching `stonks-*.sql.gz` in the backup directory are automatically deleted.

---

### `backup-redis.sh`: Redis State Backup

Triggers a Redis `BGSAVE` and copies the RDB dump file to the local machine.

**Usage:**

```bash
./scripts/backup-redis.sh
```

**CLI Arguments:** None.

**Environment Variables:**

| Variable | Default | Description |
|----------|---------|-------------|
| `BACKUP_DIR` | `~/backups/stonks-oracle` | Local directory where the RDB file is stored |
| `REDIS_PASSWORD` | `PSCh4ng3me!` | Redis authentication password |

**What it captures:**

- Redis RDB snapshot (`dump.rdb`) containing all in-memory state: deduplication markers, queue contents, rate-limit counters, cached values

**How it works:**

1. Triggers `BGSAVE` on the Redis master pod (`redis-master-0` in `redis-service` namespace)
2. Waits 5 seconds for the background save to complete, then logs the `LASTSAVE` timestamp
3. Copies the RDB file from the pod. Tries `/data/dump.rdb` first, then falls back to `/var/lib/redis/dump.rdb` and `/bitnami/redis/data/dump.rdb`
4. Prints Redis keyspace statistics for verification

**Storage:**

- Local path: `~/backups/stonks-oracle/redis-.rdb`

**Retention:** No automatic pruning. Old Redis backups accumulate and must be cleaned up manually.

---

### `restore-db.sh`: PostgreSQL Database Restore

Restores a `pg_dump` backup into the `stonks` database with full service scale-down/scale-up.

**Usage:**

```bash
./scripts/restore-db.sh
./scripts/restore-db.sh ~/backups/stonks-oracle/stonks-20260415-180000.sql.gz
```

If called without arguments, lists available backups in `~/backups/stonks-oracle/`.

**CLI Arguments:**

| Argument | Required | Description |
|----------|----------|-------------|
| `` | Yes | Path to the gzip-compressed SQL backup file to restore |

**What it restores:**

- All tables, data, sequences, and indexes in the `stonks` database
- Re-grants `ALL PRIVILEGES` to the `stonks` user on all tables and sequences after restore

**Service scale-down/scale-up procedure:**

1. **Terminates active connections**: Runs `pg_terminate_backend()` for all connections to the `stonks` database
2. **Scales down all deployments** in the `stonks-oracle` namespace to 0 replicas to prevent reconnections
3. **Waits 10 seconds** for pods to terminate
4. **Restores the backup** using `psql --single-transaction` (piped from `zcat`)
5. **Re-grants permissions** to the `stonks` user
6. **Verifies** the restore by counting tables
7. **Scales all deployments back to 1 replica**, then scales `ingestion` and `parser` to 2 replicas

**Data loss implications:**

> **WARNING:** This replaces ALL data in the `stonks` database with the backup contents. Any data written after the backup was taken is permanently lost.

The script requires interactive confirmation: you must type `yes` to proceed.

---

## Cluster Backup Scripts (Kubernetes Jobs)

### `backup.sh`: Full Platform Backup (PostgreSQL + MinIO)

Runs a Kubernetes Job that backs up both PostgreSQL and all MinIO buckets to an NFS share.

**Usage:**

```bash
bash scripts/backup.sh
```

**CLI Arguments:** None.

**What it captures:**

- **PostgreSQL**: Full `pg_dump` in custom format (`-Fc`) as `stonks.pgdump`
- **MinIO buckets** (8 buckets mirrored):
  - `stonks-raw-market`: Raw market data from Polygon.io
  - `stonks-raw-news`: Raw news articles
  - `stonks-raw-filings`: Raw SEC filings
  - `stonks-normalized`: Normalized documents
  - `stonks-llm-prompts`: LLM prompt logs
  - `stonks-llm-results`: LLM extraction results
  - `stonks-lakehouse`: Parquet fact tables for Trino
  - `stonks-audit`: Audit trail artifacts
- **Manifest**: `manifest.json` with backup name, timestamp, and bucket list

**How it works:**

1. Deletes any previous `stonks-backup` Job
2. Creates a Kubernetes Job using `postgres:18-alpine` with NFS volume mount and MinIO credentials from cluster secrets
3. Inside the Job container:
   - Runs `pg_dump` with credentials from `stonks-config` ConfigMap and `stonks-core-secrets` Secret
   - Installs the MinIO client (`mc`) and mirrors each bucket to the NFS backup directory
   - Writes a `manifest.json` and updates the `latest` symlink
4. Waits up to 600 seconds (10 minutes) for the Job to complete
5. Job auto-cleans after 300 seconds (`ttlSecondsAfterFinished`)

**Storage:**

- NFS path: `192.168.42.8:/volume1/Kubernetes/stonks//`
- Directory structure:

  ```
  stonks-backup-YYYYMMDD-HHMMSS/
  ├── stonks.pgdump              # PostgreSQL custom-format dump
  ├── manifest.json              # Backup metadata
  └── minio/
      ├── stonks-raw-market/     # Mirrored bucket contents
      ├── stonks-raw-news/
      ├── stonks-raw-filings/
      ├── stonks-normalized/
      ├── stonks-llm-prompts/
      ├── stonks-llm-results/
      ├── stonks-lakehouse/
      └── stonks-audit/
  ```

- A `latest` symlink always points to the most recent backup

**Retention:** No automatic pruning on NFS. Old backups must be cleaned up manually.

---

### `restore.sh`: Full Platform Restore (PostgreSQL + MinIO)

Runs a Kubernetes Job that restores both PostgreSQL and MinIO buckets from an NFS backup.

**Usage:**

```bash
bash scripts/restore.sh   # restore from "latest" symlink
bash scripts/restore.sh   # restore a specific backup
```

**CLI Arguments:**

| Argument | Required | Description |
|----------|----------|-------------|
| `` | No | Name of the backup directory on NFS. Defaults to `latest` (symlink to most recent backup) |

**What it restores:**

- **PostgreSQL**: Full database restore using `pg_restore --clean --if-exists --no-owner --no-acl`
- **MinIO buckets**: All 8 buckets mirrored back with `mc mirror --overwrite`

**How it works:**

1. Prints a warning and gives 5 seconds to abort (Ctrl+C)
2. Deletes any previous `stonks-restore` Job
3. Creates a Kubernetes Job that:
   - Validates the backup exists (`stonks.pgdump` file present)
   - Restores PostgreSQL using `pg_restore` with `--clean` (drops and recreates objects)
   - Installs `mc` and mirrors each bucket back from NFS to MinIO
   - Verifies the restore by querying row counts for key tables (companies, documents, intelligence, impacts, trends, recommendations)
4. Waits up to 600 seconds for the Job to complete

**Data loss implications:**

> **WARNING:** This will DROP and recreate all objects in the `stonks` database. All MinIO bucket contents are overwritten. Any data written after the backup was taken is permanently lost.

The script provides a 5-second abort window before proceeding.

**Post-restore steps:**

After the restore completes, restart all services to pick up the restored state:

```bash
kubectl rollout restart deployment -n stonks-oracle --all
```

---

## MinIO Upload Option (`--upload-minio`)

The `backup-db.sh` script supports `--upload-minio` for off-host storage of database backups. When enabled:

1. The script connects to MinIO through an ingestion pod in the `stonks-oracle` namespace
2. Creates the `stonks-backups` bucket if it doesn't already exist
3. Stages the backup file for upload

This provides a second copy of the database backup on object storage, separate from the operator's local filesystem.

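
The bucket-then-copy sequence above can also be run by hand with the MinIO client. The following is a minimal sketch, not the logic inside `backup-db.sh`: it assumes `mc` is installed on the operator's machine and that an alias (here the hypothetical default `stonks`) was already configured with `mc alias set`. The helper name `upload_backup` is illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy one local dump into the stonks-backups bucket.
# Assumes an mc alias (default "stonks") has been configured beforehand.
upload_backup() {
  local file="$1"
  local alias_name="${2:-stonks}"
  local bucket="stonks-backups"

  # Refuse to upload a missing or empty dump.
  if [ ! -s "$file" ]; then
    echo "error: $file is missing or empty" >&2
    return 1
  fi

  # Create the bucket only if it does not exist yet, then copy the file.
  mc mb --ignore-existing "${alias_name}/${bucket}" &&
    mc cp "$file" "${alias_name}/${bucket}/$(basename "$file")"
}
```

A call might look like `upload_backup ~/backups/stonks-oracle/stonks-20260415-180000.sql.gz`. The empty-file check mirrors the non-empty validation that `backup-db.sh` performs before it prunes old dumps.
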
The full cluster backup (`backup.sh`) stores backups on NFS and does not use this flag; it backs up MinIO bucket *contents* rather than uploading database dumps *to* MinIO.

---

## Full Nuke and Rebuild Procedure

When a complete platform reset is needed (corrupted state, major schema changes, fresh start), follow this procedure:

### Step 1: Tear Down Services

```bash
bash ~/sources/kube/stonks-oracle/runmelast.sh
```

This runs from `gremlin-1` and performs a Helm uninstall, cleaning up all Kubernetes resources in the `stonks-oracle` namespace. Database, MinIO, and Redis data are preserved (they run in separate namespaces).

### Step 2: Terminate Database Connections

```bash
kubectl exec -n postgresql-service postgresql-1 -c postgres -- \
  psql -U postgres -c \
  "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'stonks' AND pid <> pg_backend_pid();"
```

### Step 3: Drop the Database

```bash
kubectl exec -n postgresql-service postgresql-1 -c postgres -- \
  psql -U postgres -c "DROP DATABASE IF EXISTS stonks;"
```

### Step 4: Flush Redis

Clear all `stonks:*` keys to reset deduplication markers, queue contents, and cached state:

```bash
kubectl exec -n redis-service redis-master-0 -- \
  redis-cli -a 'PSCh4ng3me!' --scan --pattern 'stonks:*' | \
  xargs -L 100 kubectl exec -n redis-service redis-master-0 -- \
  redis-cli -a 'PSCh4ng3me!' DEL
```

### Step 5: Redeploy

```bash
bash ~/sources/kube/stonks-oracle/runmefirst.sh
```

This runs from `gremlin-1` and performs:

- Database creation and migration (all `infra/migrations/*.sql` files applied in order)
- Helm install with secrets injected via `--set` flags
- Rolling restart of all deployments

### Step 6: Re-seed the Symbol Registry

```bash
POSTGRES_HOST=postgresql-rw.postgresql-service.svc.cluster.local \
POSTGRES_PASSWORD='St0nks0racl3!' \
POSTGRES_USER=stonks \
POSTGRES_DB=stonks \
.venv/bin/python -m services.symbol_registry.seed
```

This populates the 50 tracked companies across 10 sectors and 46 competitor relationships.

---

## Recommended Backup Schedules

### Daily Database Backup (cron)

Run `backup-db.sh` daily on a machine with `kubectl` access. The built-in retention keeps the last 7 backups automatically.

```cron
# Daily database backup at 2:00 AM
0 2 * * * /path/to/stonks-oracle/scripts/backup-db.sh --upload-minio >> /var/log/stonks-backup.log 2>&1
```

### Weekly Full Backup (cron)

Run the full cluster backup weekly to capture both PostgreSQL and MinIO data on NFS:

```cron
# Weekly full backup (PostgreSQL + MinIO) on Sundays at 3:00 AM
0 3 * * 0 /path/to/stonks-oracle/scripts/backup.sh >> /var/log/stonks-full-backup.log 2>&1
```

### Redis Backup Before Deployments

Redis state is transient (queues, dedup markers, caches) and rebuilds naturally. Back up Redis before major deployments or database resets as a precaution:

```bash
./scripts/backup-redis.sh
```

### Kubernetes CronJobs

For fully automated in-cluster backups, create a CronJob based on the same Job spec used by `backup.sh`:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: stonks-backup
  namespace: stonks-oracle
spec:
  schedule: "0 2 * * *"  # Daily at 2:00 AM UTC
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 3600
      backoffLimit: 1
      template:
        spec:
          restartPolicy: Never
          volumes:
            - name: nfs-backup
              nfs:
                server: 192.168.42.8
                path: /volume1/Kubernetes/stonks
          containers:
            - name: backup
              image: postgres:18-alpine
              volumeMounts:
                - name: nfs-backup
                  mountPath: /backup
              envFrom:
                - configMapRef:
                    name: stonks-config
                - secretRef:
                    name: stonks-core-secrets
              env:
                - name: MINIO_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: stonks-core-secrets
                      key: MINIO_ACCESS_KEY
                - name: MINIO_SECRET_KEY
                  valueFrom:
                    secretKeyRef:
                      name: stonks-core-secrets
                      key: MINIO_SECRET_KEY
              command: ["sh", "-c"]
              args:
                - |
                  set -e
                  apk add --no-cache curl ca-certificates
                  STAMP="stonks-backup-$(date +%Y%m%d-%H%M%S)"
                  DIR="/backup/${STAMP}"
                  mkdir -p "${DIR}/minio"

                  # PostgreSQL backup
                  PGPASSWORD="${POSTGRES_PASSWORD}" pg_dump \
                    -h "${POSTGRES_HOST}" -p "${POSTGRES_PORT}" \
                    -U "${POSTGRES_USER}" -d "${POSTGRES_DB}" \
                    --no-owner --no-acl -Fc \
                    -f "${DIR}/stonks.pgdump"

                  # MinIO backup
                  curl -sL https://dl.min.io/client/mc/release/linux-amd64/mc -o /usr/local/bin/mc
                  chmod +x /usr/local/bin/mc
                  mc alias set backup "http://${MINIO_ENDPOINT}" "${MINIO_ACCESS_KEY}" "${MINIO_SECRET_KEY}" --api S3v4
                  for bucket in stonks-raw-market stonks-raw-news stonks-raw-filings stonks-normalized stonks-llm-prompts stonks-llm-results stonks-lakehouse stonks-audit; do
                    mc mirror "backup/${bucket}" "${DIR}/minio/${bucket}/" 2>/dev/null || true
                  done

                  ln -sfn "${STAMP}" /backup/latest
                  echo "Backup complete: ${DIR}"
```

### Recommended Schedule Summary

| What | Frequency | Script | Retention |
|------|-----------|--------|-----------|
| Database only | Daily | `backup-db.sh --upload-minio` | Last 7 (auto-pruned) |
| Full platform (DB + MinIO) | Weekly | `backup.sh` | Manual cleanup on NFS |
| Redis snapshot | Before deployments | `backup-redis.sh` | Manual cleanup |

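
For the rows marked "Manual cleanup", a keep-last-N policy like the one built into `backup-db.sh` can be approximated with a few lines of shell. This is a minimal sketch under stated assumptions, not part of `scripts/`: the helper name `prune_keep_last` is hypothetical, and it assumes file names embed a sortable timestamp (as in `stonks-20260415-180000.sql.gz`) and contain no newlines. It handles plain files only, so it fits Redis dumps in `BACKUP_DIR` but not the per-backup directories on NFS.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep the newest N files matching a glob in a directory.
# Usage: prune_keep_last <dir> <glob> <keep>
prune_keep_last() {
  local dir="$1" pattern="$2" keep="$3"

  # Sort matches newest-first by name (timestamped names sort chronologically),
  # skip the first $keep entries, and delete everything after them.
  find "$dir" -maxdepth 1 -type f -name "$pattern" \
    | LC_ALL=C sort -r \
    | tail -n "+$((keep + 1))" \
    | while IFS= read -r old; do
        echo "pruning $old"
        rm -f -- "$old"
      done
}
```

For example, `prune_keep_last "${BACKUP_DIR:-$HOME/backups/stonks-oracle}" 'redis-*.rdb' 7` would retain the seven most recent Redis snapshots, assuming they follow a timestamped naming scheme.
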