# Stonks Oracle — Operator Runbook

## Cluster Access

```bash
kubectl config use-context
# All stonks-oracle resources live in the stonks-oracle namespace
alias kso='kubectl -n stonks-oracle'
```

## Service Overview

| Service | Type | Replicas | Notes |
|---------|------|----------|-------|
| scheduler | CronJob-like worker | 1 | Polls sources on schedule |
| symbol-registry | FastAPI | 1 | Company/watchlist CRUD |
| ingestion | Queue worker | 2 | Fetches from adapters |
| parser | Queue worker | 2 | HTML→text extraction |
| extractor | Queue worker | 1 | LLM-based intelligence extraction |
| aggregation | Queue worker | 1 | Trend/signal aggregation |
| recommendation | Queue worker | 1 | Trade signal generation |
| risk | FastAPI | 1 | Risk evaluation + approval |
| broker-adapter | Queue worker | 1 | Paper/live order execution |
| lake-publisher | Queue worker | 1 | Iceberg table publication |
| query-api | FastAPI | 1 | Dashboard/analytics queries |
| trino | Analytics engine | 1 | SQL over lakehouse |
| superset | Dashboard | 1 | Visualization |
| hive-metastore | Metastore | 1 | Iceberg catalog backend |

## Common Operations

### Restart a service

```bash
kso rollout restart deployment/
```

### Check logs

```bash
kso logs deployment/ --tail=50 -f
# For previous crash:
kso logs --previous --tail=50
```

### Scale a service

```bash
kso scale deployment/ --replicas=N
```

### Redeploy with updated secrets

```bash
GHCR_TOKEN=$(cat /run/secrets/github_token)
helm upgrade --install stonks-oracle infra/helm/stonks-oracle \
  --namespace stonks-oracle \
  --set "ghcrAuth.password=$GHCR_TOKEN" \
  --set 'secrets.core.POSTGRES_PASSWORD=St0nks0racl3!' \
  --set "secrets.core.MINIO_ACCESS_KEY=AKIA6V7J3N9B5P0D2YQH" \
  --set 'secrets.core.MINIO_SECRET_KEY=8fG3!v2rJ7$wN@9mLpQ6zXbC4tKdPqW1' \
  --set 'secrets.core.REDIS_PASSWORD=PSCh4ng3me!'

# Then restart deployments to pick up secret changes:
for dep in $(kso get deployments -o name); do
  kso rollout restart "$dep"
done
```

### Run database migrations

```bash
for f in $(ls infra/migrations/*.sql | sort); do
  kubectl exec -i -n postgresql-service postgresql-1 -c postgres -- psql -U postgres -d stonks < "$f"
done
```

## Trading Mode Toggle

Current mode is set via ConfigMap `stonks-config` key `BROKER_MODE`.

```bash
# Check current mode
kso get configmap stonks-config -o jsonpath='{.data.BROKER_MODE}'

# To switch modes, update values.yaml config.BROKER_MODE and helm upgrade,
# then restart broker-adapter and risk deployments.
```

**Modes:**

- `paper`: all orders go through paper trading simulation (default, safe)
- `live`: orders are submitted to the real broker API (requires operator approval workflow)

**Never switch to live without:**

1. Confirming paper trading PnL is acceptable
2. Verifying risk limits are configured in `risk_configuration` table
3. Enabling operator approval in `operator_approvals` table

## Operator Approval for Live Trades

The risk engine requires explicit operator approval before executing live trades. Approvals are managed via the risk API:

```bash
# Check pending approvals
curl -s https://stonks-api.celestium.life/risk/approvals/pending

# Approve a recommendation
curl -X POST https://stonks-api.celestium.life/risk/approvals//approve
```

## Common Failure Modes

### CrashLoopBackOff on workers

Queue workers (aggregation, extractor, recommendation, broker-adapter, lake-publisher) exit with code 0 when the queue is empty. Kubernetes restarts them, which is normal. They'll process work when messages arrive.

### PostgreSQL auth failure

Password mismatch between `stonks-core-secrets.POSTGRES_PASSWORD` and the actual DB user password. Fix:

```bash
kubectl exec -i -n postgresql-service postgresql-1 -c postgres -- psql -U postgres -d stonks <<'EOF'
ALTER USER stonks WITH PASSWORD '';
EOF
```

Then update the Helm secret and restart.
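
To confirm the mismatch is actually resolved, it can help to read back what the cluster has stored. The sketch below assumes the Secret object is named `stonks-core-secrets` and lives in the `stonks-oracle` namespace; `secret_password` and `secret_matches` are hypothetical helper names, not part of the repo.

```bash
# Hypothetical helper: print the POSTGRES_PASSWORD value stored in the
# stonks-core-secrets Secret (assumed to be in the stonks-oracle namespace).
# Values under .data are base64-encoded, so decode before comparing.
secret_password() {
  kubectl -n stonks-oracle get secret stonks-core-secrets \
    -o jsonpath='{.data.POSTGRES_PASSWORD}' | base64 -d
}

# Hypothetical helper: succeed only when the stored value equals the
# password passed as the first argument (the one set with ALTER USER).
secret_matches() {
  [ "$(secret_password)" = "$1" ]
}
```

If `secret_matches` fails after the Helm upgrade, the secret was probably not updated; if it succeeds and pods still fail to authenticate, they likely have not been restarted yet.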

### Redis connection refused

Check Redis is running: `kubectl get pods -n redis-service`

If Redis master is down, restart it: `kubectl rollout restart -n redis-service statefulset/redis-master`

### ImagePullBackOff

GHCR credentials expired or missing. Re-run `helm upgrade` with fresh `ghcrAuth.password`.

### Superset won't start

Needs custom image with `sqlalchemy-trino` package. Stock `apache/superset:latest` doesn't include it.

## Log Access

All services output JSON logs when `JSON_LOGS=true` (default).

```bash
# Stream all logs from a service
kso logs -f deployment/ --tail=100

# Search for errors across all pods
kso logs --all-containers --prefix --tail=100 | grep -i error
```

## Ingress Endpoints

| URL | Service |
|-----|---------|
| https://stonks-api.celestium.life | Query API |
| https://stonks-registry.celestium.life | Symbol Registry |
| https://stonks-dash.celestium.life | Superset |
| https://stonks-trino.celestium.life | Trino |
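
A quick reachability check over the endpoints in the table above might look like the following. This is a minimal sketch: `probe` and `probe_all` are hypothetical helper names, and the root path of each service is not guaranteed to return 200. Treat any status other than `000` (what curl reports when it received no HTTP response) as a sign that the ingress answered, not as proof the service is healthy.

```bash
# Hypothetical helper: print the HTTP status code for a URL.
# Prints 000 when no HTTP response was received within 10 seconds.
probe() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Hypothetical helper: probe each ingress endpoint listed in the table
# and print one "<status> <url>" line per endpoint.
probe_all() {
  for url in \
    https://stonks-api.celestium.life \
    https://stonks-registry.celestium.life \
    https://stonks-dash.celestium.life \
    https://stonks-trino.celestium.life
  do
    printf '%s %s\n' "$(probe "$url")" "$url"
  done
}
```

Running `probe_all` prints four lines, one per endpoint, which makes it easy to spot a single ingress that has stopped responding.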