docs: rewrite README and runbook for current platform state

README: updated architecture diagram, three signal layers, tracked
universe, autonomous trading engine, global news interpolation,
competitive intelligence, paper trading, notification service,
updated services table, project structure, deployment, endpoints.

Runbook: updated service overview, deployment via runmefirst.sh,
secrets management (keys in kube dir not repo), backup/restore
scripts, trading engine operations, signal layer toggles, database
nuke & rebuild, monitoring, CI/CD, removed hardcoded secrets.
Celes Renata
2026-04-16 02:06:18 +00:00
parent e652a62dbc
commit 9aae57f3e1
2 changed files with 366 additions and 134 deletions
+113 -76
@@ -1,130 +1,159 @@
# Stonks Oracle # Stonks Oracle
AI-powered market intelligence and paper-trading platform. Ingests market data, company news, and regulatory filings; extracts structured intelligence with local LLMs; computes trend summaries and trade recommendations; and optionally executes paper trades — all self-hosted on Kubernetes. AI-powered market intelligence and autonomous paper-trading platform. Ingests market data, company news, and regulatory filings; extracts structured intelligence with local LLMs; aggregates signals across three layers (company, macro, competitive); and autonomously executes paper trades — all self-hosted on Kubernetes.
## What It Does ## What It Does
Stonks Oracle monitors tracked companies across multiple data sources, runs every article and filing through a local Ollama model to extract structured intelligence (sentiment, catalysts, risks, key facts), aggregates those signals into rolling trend summaries with contradiction detection, and generates explainable trade recommendations with risk controls. Stonks Oracle tracks 50 companies across 10 sectors. It monitors multiple data sources, runs every article and filing through a local Ollama model to extract structured intelligence, aggregates those signals into rolling trend summaries with contradiction detection, and generates explainable trade recommendations. An autonomous trading engine then evaluates those recommendations and executes paper trades through Alpaca without manual intervention.
Everything is auditable — raw artifacts, prompts, model outputs, and decision traces are preserved. Historical data flows into a MinIO-backed lakehouse queryable via Trino and visualized through Superset dashboards and a built-in React dashboard. Everything is auditable — raw artifacts, prompts, model outputs, decision traces, and trade execution logs are preserved. Historical data flows into a MinIO-backed lakehouse queryable via Trino and visualized through Superset dashboards and a built-in React dashboard.
## Architecture ## Architecture
``` ```
┌─────────────┐ ┌──────────┐ ┌──────────┐ ┌─────────────┐ ┌──────────────────────────────────────────┐
Scheduler │───▶│ Ingestion│───▶│ Parser │───▶│ Extractor │ Signal Aggregation
└─────────────┘ └──────────┘ └──────────┘ └──────┬──────┘ │ │
┌───────────┐ ┌──────────┐ ┌──────────┐ ┌────────────────┐
┌──────────────────────────────────────┘ │ Scheduler │─▶│Ingestion │─▶│ │ Parser │─▶│ Extractor │ │
└───────────┘ └──────────┘ │ └──────────┘ └──────┬─────────┘ │
┌─────────────┐ ┌────────────────┐ ┌──────────────┐ │ │ │
│ Aggregation │───▶│ Recommendation │───▶│ Risk Engine │ ┌─────────────┘
└─────────────┘ └────────────────┘ └──────┬───────┘ │ ▼ │
│ ┌─────────────┐ ┌────────────────┐
┌────────────────────────────────────────┘ │ │ Aggregation │───▶│ Recommendation │ │
│ └──────┬──────┘ └───────┬────────┘ │
┌──────────────┐ ┌────────────────┐ │ │ │ │
│Broker Adapter│ │ Lake Publisher │ Macro signals Competitive
└──────────────┘ └────────────────┘ │ + Competitive signals │
│ signals merged
────────────────────┘ └──────────────────────────────────────────┘
┌──────────┐ ┌──────────┐ ┌─────────── ┌───────────────────────┘
│ Trino │ │ Superset │ │ Dashboard │
────────── ────────── ─────────── ┌───────────── ┌──────────────── ┌──────────────
│ Risk Engine │───▶│ Trading Engine │───▶│Broker Adapter│
└─────────────┘ └────────────────┘ └──────────────┘
┌────────────────┐ ┌──────────┘
│ Lake Publisher │ ▼
└───────┬────────┘ Alpaca (paper)
┌──────────────┼──────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌───────────┐
│ Trino │ │ Superset │ │ Dashboard │
└──────────┘ └──────────┘ └───────────┘
``` ```
Two planes: Two planes:
- **Operational** — ingestion, parsing, extraction, aggregation, recommendations, risk evaluation, trade execution (PostgreSQL, Redis, MinIO) - **Operational** — ingestion, parsing, extraction, aggregation, recommendations, risk evaluation, autonomous trading, trade execution (PostgreSQL, Redis, MinIO)
- **Analytical** — historical fact tables, SQL queries, dashboards (MinIO/Parquet, Trino, Superset) - **Analytical** — historical fact tables, SQL queries, dashboards (MinIO/Parquet, Trino, Superset)
## Signal Layers
The aggregation engine merges signals from three independent layers via a unified `WeightedSignal` abstraction. Each layer has a runtime toggle — no restart required.
| Layer | Source | What It Does |
|-------|--------|-------------|
| **Layer 1: Company** | News, filings, market data | Document intelligence extraction → per-company impact records → trend windows |
| **Layer 2: Macro** | Global news, geopolitical events | Ollama-based event classification → exposure profile matching → per-company macro impact scores |
| **Layer 3: Competitive** | Historical platform data | Pattern mining on past catalyst outcomes → cross-company signal propagation via competitor relationships |
- Pattern-only or macro-only trend shifts are forced to informational mode (suppression safety)
- Macro weight default: 0.3, competitive weight default: 0.2
- Toggles: `macro_enabled` and `competitive_enabled` in `risk_configs`
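
A minimal sketch of how the three layers might be blended, assuming a weighted average. Only the `WeightedSignal` name, the two toggles, and the 0.3 / 0.2 default weights come from this README; the field names, the company weight of 1.0, and `merge_signals` itself are hypothetical.

```python
from dataclasses import dataclass

# Default layer weights. The macro and competitive values come from the README;
# the company weight of 1.0 is an assumption made for this sketch.
LAYER_WEIGHTS = {"company": 1.0, "macro": 0.3, "competitive": 0.2}


@dataclass(frozen=True)
class WeightedSignal:
    """Hypothetical shape of a signal; the real class may carry more fields."""
    layer: str    # "company", "macro", or "competitive"
    score: float  # directional score in [-1.0, 1.0]


def merge_signals(signals, macro_enabled=True, competitive_enabled=True):
    """Blend signals into one score and flag informational-only results."""
    enabled = {
        "company": True,
        "macro": macro_enabled,
        "competitive": competitive_enabled,
    }
    # Drop signals from disabled (or unknown) layers before blending.
    active = [s for s in signals if enabled.get(s.layer, False)]
    if not active:
        return 0.0, True
    total_weight = sum(LAYER_WEIGHTS[s.layer] for s in active)
    score = sum(LAYER_WEIGHTS[s.layer] * s.score for s in active) / total_weight
    # Suppression safety: without company-layer evidence, stay informational.
    informational_only = not any(s.layer == "company" for s in active)
    return score, informational_only
```

With this shape, disabling a layer at runtime only changes which signals enter the average, which is consistent with the "no restart required" behavior described above.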
## Tracked Universe
50 companies across 10 sectors: Technology, Consumer Cyclical, Financial Services, Healthcare, Energy, Communication Services, Industrials, Consumer Defensive, Real Estate, Utilities.
46 competitor relationships (direct_rival, same_sector, overlapping_products, supply_chain_adjacent).
Seed data: `python -m services.symbol_registry.seed`
## Features ## Features
### Autonomous Trading Engine
Continuous decision loop that polls for actionable recommendations and executes paper trades without manual intervention. Includes confidence-based position sizing, dynamic stop-loss/take-profit (ATR-based), circuit breakers (daily loss cap, single-position loss, volatility detection), reserve pool management (auto-siphon from profits), risk tier auto-adjustment (conservative/moderate/aggressive based on trailing performance), portfolio rebalancing (sector and concentration limits), gradual entry (multi-tranche orders), correlation-aware diversification, earnings calendar awareness, portfolio heat management, tax-lot tracking with wash sale detection, performance tracking (Sharpe, drawdown, win rate, profit factor), and backtesting against historical data.
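
As an illustration of the ATR-based stop-loss/take-profit idea, here is a rough sketch. It uses a simple average of true ranges rather than Wilder smoothing, and the function names, the 14-bar period, and the 2x / 3x multipliers are assumptions, not values taken from the trading engine.

```python
def average_true_range(bars, period=14):
    """Simple-average ATR over the last `period` true ranges.

    `bars` is a list of (high, low, close) tuples ordered oldest to newest.
    """
    if len(bars) < period + 1:
        raise ValueError("need at least period + 1 bars")
    true_ranges = []
    # Pair each bar with the bar before it to get the previous close.
    for (high, low, _), (_, _, prev_close) in zip(bars[1:], bars[:-1]):
        true_ranges.append(
            max(high - low, abs(high - prev_close), abs(low - prev_close))
        )
    recent = true_ranges[-period:]
    return sum(recent) / len(recent)


def stop_and_target(entry_price, atr, stop_mult=2.0, target_mult=3.0):
    """Stop-loss and take-profit for a long position (illustrative multipliers)."""
    return entry_price - stop_mult * atr, entry_price + target_mult * atr
```

Scaling both levels by ATR widens stops for volatile symbols and tightens them for quiet ones, which is the usual motivation for a dynamic stop.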
### Global News Interpolation
Macro/geopolitical event ingestion from dedicated sources. Ollama-based classification by impact type, severity, affected regions, and sectors. Company exposure profiles (geographic revenue mix, supply chain regions, commodity dependencies, market position tier) map events to per-company macro impact scores with resilience modifiers. Forward-looking trend projections combine company momentum with macro trajectories.
### Competitive Intelligence
Historical pattern mining on the platform's own data — how similar catalyst types resolved in the past for a company and its competitors. Cross-company signal propagation via competitor relationships. Major corporate decision tracking (M&A, restructuring, leadership changes) with extended lookback windows. Auto-inference of competitor relationships from sector matching and document co-mentions.
### Data Ingestion ### Data Ingestion
- Market data via Polygon.io (quotes, OHLCV bars, corporate actions) - Grouped daily market data from Polygon.io (OHLCV bars, corporate actions)
- Company news via news APIs with full article scraping - Company news via news APIs with full article scraping
- SEC filings and regulatory events - SEC filings and regulatory events
- Configurable polling intervals, rate limiting, retries, and backoff - Macro/geopolitical news from dedicated sources
- Content hash deduplication across all sources - Content hash deduplication, rate limiting, retries, raw artifact preservation in MinIO
- Raw artifact preservation in MinIO for full auditability
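
Content hash deduplication could look roughly like the sketch below. The normalization step (collapsing whitespace, lowercasing) and the `Deduplicator` class are assumptions; the runbook mentions Redis dedup markers, so the in-memory set here only stands in for that store.

```python
import hashlib


def content_hash(text):
    """SHA-256 over whitespace-normalized, lowercased text."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class Deduplicator:
    """In-memory stand-in for the dedup store (hypothetical helper)."""

    def __init__(self):
        self._seen = set()

    def is_new(self, text):
        """Return True the first time a given content hash is seen."""
        digest = content_hash(text)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True
```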
### AI-Powered Extraction ### AI-Powered Extraction
- Local Ollama models with schema-constrained JSON output - Local Ollama models with schema-constrained JSON output
- Per-document intelligence: sentiment, catalysts, impact horizon, key facts, risks, macro themes - Per-document intelligence: sentiment, catalysts, impact horizon, key facts, risks, macro themes
- Per-company impact records when a document mentions multiple companies - Per-company impact records when a document mentions multiple companies
- Schema and semantic validation with retry on invalid outputs - Schema and semantic validation with retry on invalid outputs
- Prompt, model metadata, and raw output preservation for reproducibility
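
The validate-and-retry step can be sketched as follows, assuming the model is reached through a caller-supplied function. `REQUIRED_FIELDS` is a hypothetical subset of the real schema, and `call_model` is a placeholder for whatever client the extractor actually uses.

```python
import json

# Hypothetical subset of the document intelligence schema.
REQUIRED_FIELDS = {"sentiment": str, "catalysts": list, "risks": list}


def validate_extraction(raw):
    """Parse model output and check required fields; raise ValueError on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected):
            raise ValueError(f"field {name!r} missing or not {expected.__name__}")
    return data


def extract_with_retry(call_model, document, max_attempts=3):
    """Call `call_model(document)` until its output validates or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return validate_extraction(call_model(document))
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(
        f"extraction failed after {max_attempts} attempts"
    ) from last_error
```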
### Trend Aggregation ### Trend Aggregation
- Rolling company-level trend summaries across 5 windows (intraday, 1d, 7d, 30d, 90d) - Rolling company-level trend summaries across 5 windows (intraday, 1d, 7d, 30d, 90d)
- Recency decay, source credibility weighting, and document novelty scoring - Recency decay, source credibility weighting, document novelty scoring
- Contradiction detection with explicit disagreement representation - Contradiction detection with explicit disagreement representation
- Sector and market-level rollups - Sector and market-level rollups incorporating macro event impacts
- Evidence ranking with top supporting and opposing documents - Forward-looking trend projections with driving factor explanations
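
Recency decay combined with source credibility weighting might be computed along these lines. The exponential half-life form, the 24-hour default, and both function names are assumptions for illustration; the aggregation service may use a different decay curve.

```python
def recency_weight(age_hours, half_life_hours):
    """Exponential decay: the weight halves every `half_life_hours`."""
    return 0.5 ** (age_hours / half_life_hours)


def weighted_sentiment(documents, half_life_hours=24.0):
    """Credibility- and recency-weighted mean sentiment.

    `documents` is an iterable of (sentiment, age_hours, credibility) tuples.
    """
    numerator = 0.0
    denominator = 0.0
    for sentiment, age_hours, credibility in documents:
        weight = recency_weight(age_hours, half_life_hours) * credibility
        numerator += weight * sentiment
        denominator += weight
    # An empty window yields a neutral score instead of dividing by zero.
    return numerator / denominator if denominator else 0.0
```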
### Trade Recommendations ### Paper Trading
- Explainable recommendation objects with action, thesis, confidence, and cited evidence - $100k paper capital via Alpaca integration
- Deterministic eligibility scoring separated from action mapping - Moderate risk tier default, auto-adjustable
- Position sizing based on portfolio rules
- Data quality suppression — low-confidence or stale data forces informational-only mode
- Optional LLM thesis rewriting for analyst-quality prose
### Risk Engine and Trading
- Paper trading mode and live trading mode as separate environments
- Hard blocks: max position size, daily loss cap, sector exposure limits, symbol cooldowns
- Operator approval workflow for live trading
- Idempotent order submission with duplicate prevention
- Fail-closed behavior on broker outages
- Full execution audit trail from signal to broker response - Full execution audit trail from signal to broker response
- Operator approval workflow available for live mode
### Notification Service
- AWS SNS for SMS alerts on critical events (circuit breaker triggers, risk tier changes, large trades)
- Gmail API for email alerts and daily performance summaries
- Configurable alert channels and thresholds
### Lakehouse and SQL Analytics ### Lakehouse and SQL Analytics
- Parquet fact tables on MinIO with Hive-compatible partitioning - Parquet fact tables on MinIO with Hive-compatible partitioning
- Iceberg table metadata for schema evolution - Iceberg table metadata for schema evolution
- Trino SQL engine for ad-hoc analytical queries - Trino SQL engine for ad-hoc queries
- Fact tables: market bars, documents, extractions, trade signals, orders, fills, positions, PnL, prediction vs outcome - Fact tables: market bars, documents, extractions, trade signals, orders, fills, positions, PnL, global events, macro impacts, competitive signals, trend projections
- Apache Superset for pre-built dashboards - Apache Superset for pre-built dashboards
### Web Dashboard ### Web Dashboard
- React/TypeScript SPA with Tailwind CSS - React/TypeScript SPA with Tailwind CSS
- Company, watchlist, and source management - Company, watchlist, and source management
- Document timeline with intelligence drill-down - Document timeline with intelligence drill-down
- Trend visualization with evidence chain navigation - Trend visualization with evidence chain navigation (company, macro, and competitive signals distinguished)
- Recommendation review with full provenance - Trading engine overview: risk tier, circuit breaker status, active/reserve pool, portfolio heat, P&L
- Order and position tracking with audit trails - Portfolio composition, trade history, backtesting panel
- Trading mode controls, risk configuration, approval workflow - Global events browser, macro exposure panels, trend projection visualization
- DevOps dashboards: pipeline health, ingestion throughput, model performance, source coverage - Competitor relationship management, historical pattern explorer, corporate decision timeline
- DevOps dashboards: pipeline health, ingestion throughput, model performance
- Interactive SQL explorer with Monaco Editor and chart builder - Interactive SQL explorer with Monaco Editor and chart builder
- Pre-built analytical dashboards: symbol overview, sentiment heatmap, prediction accuracy, paper trading PnL, model quality
### Observability ### Observability
- Structured JSON logging across all services - Structured JSON logging across all services
- Prometheus metrics for every pipeline stage - Prometheus metrics for every pipeline stage
- Alerting for source failures, schema failure spikes, analytical lag, and broker issues - Alerting for source failures, schema failure spikes, analytical lag, broker issues, and trading anomalies
- Dead-letter queues with replay tooling - Dead-letter queues with replay tooling
- Data retention and lifecycle controls - Data retention and lifecycle controls
### Global News Interpolation *(planned)*
- Macro/geopolitical event ingestion and Ollama-based classification
- Company exposure profiles (geographic revenue mix, supply chain, commodities, market position)
- Per-company macro impact scoring with resilience modifiers
- Macro signals blended into trend aggregation with configurable weight
- Runtime toggle to enable/disable macro signal layer
- Forward-looking trend projections combining company momentum with macro trajectories
- Dashboard pages for global events, macro exposure panels, and projection visualization
## Services ## Services
| Service | Description | | Service | Description |
|---------|-------------| |---------|-------------|
| `scheduler` | Triggers ingestion cycles based on source polling intervals | | `scheduler` | Triggers ingestion cycles based on source polling intervals |
| `symbol-registry` | Manages companies, aliases, watchlists, sources, and exposure profiles | | `symbol-registry` | Manages companies, aliases, watchlists, sources, exposure profiles, and competitor relationships |
| `ingestion` | Fetches market data, news, and filings from external APIs | | `ingestion` | Fetches market data, news, filings, and macro events from external APIs |
| `parser` | Normalizes raw HTML/text, reduces boilerplate, scores parse quality | | `parser` | Normalizes raw HTML/text, reduces boilerplate, scores parse quality |
| `extractor` | Runs Ollama extraction to produce document intelligence objects | | `extractor` | Runs Ollama extraction to produce document intelligence and global event classifications |
| `aggregation` | Computes rolling trend summaries with contradiction detection | | `aggregation` | Computes rolling trend summaries with contradiction detection and trend projections |
| `recommendation` | Generates trade recommendations from aggregated evidence | | `recommendation` | Generates trade recommendations from aggregated evidence across all signal layers |
| `risk` | Evaluates orders against portfolio risk controls | | `risk` | Evaluates orders against portfolio risk controls |
| `broker-adapter` | Interfaces with broker APIs for paper/live trading | | `trading-engine` | Autonomous decision loop: position sizing, stop-loss, circuit breakers, reserve pool, rebalancing |
| `broker-adapter` | Interfaces with Alpaca for paper/live order execution |
| `lake-publisher` | Writes analytical Parquet datasets to MinIO | | `lake-publisher` | Writes analytical Parquet datasets to MinIO |
| `query-api` | REST API for all operational and analytical queries | | `query-api` | REST API for all operational and analytical queries |
| `dashboard` | React SPA served via nginx | | `dashboard` | React SPA served via nginx |
@@ -143,6 +172,7 @@ Two planes:
- **CI/CD**: GitHub Actions → GHCR container registry - **CI/CD**: GitHub Actions → GHCR container registry
- **Broker**: Alpaca (paper trading) - **Broker**: Alpaca (paper trading)
- **Market Data**: Polygon.io - **Market Data**: Polygon.io
- **Notifications**: AWS SNS (SMS), Gmail API (email)
## Project Structure ## Project Structure
@@ -150,13 +180,14 @@ Two planes:
├── services/ ├── services/
│ ├── shared/ # Config, schemas, Redis keys, logging, audit │ ├── shared/ # Config, schemas, Redis keys, logging, audit
│ ├── scheduler/ # Job scheduling and source polling │ ├── scheduler/ # Job scheduling and source polling
│ ├── symbol_registry/ # Company and source management API │ ├── symbol_registry/ # Company, source, exposure profile, competitor management API
│ ├── ingestion/ # External API adapters and raw artifact storage │ ├── ingestion/ # External API adapters and raw artifact storage
│ ├── parser/ # HTML parsing, boilerplate reduction, quality scoring │ ├── parser/ # HTML parsing, boilerplate reduction, quality scoring
│ ├── extractor/ # Ollama extraction and schema validation │ ├── extractor/ # Ollama extraction, event classification, schema validation
│ ├── aggregation/ # Trend computation and contradiction detection │ ├── aggregation/ # Trend computation, contradiction detection, projections
│ ├── recommendation/ # Recommendation generation and suppression │ ├── recommendation/ # Recommendation generation and suppression
│ ├── risk/ # Risk evaluation and approval workflow │ ├── risk/ # Risk evaluation and approval workflow
│ ├── trading/ # Autonomous trading engine, backtester, performance tracker
│ ├── adapters/ # Broker API integration │ ├── adapters/ # Broker API integration
│ ├── lake_publisher/ # Parquet fact table publication │ ├── lake_publisher/ # Parquet fact table publication
│ └── api/ # Query API (FastAPI) │ └── api/ # Query API (FastAPI)
@@ -169,6 +200,7 @@ Two planes:
│ ├── hive/ # Hive metastore configuration │ ├── hive/ # Hive metastore configuration
│ ├── minio/ # MinIO lifecycle policies │ ├── minio/ # MinIO lifecycle policies
│ └── superset/ # Superset configuration │ └── superset/ # Superset configuration
├── scripts/ # Backup/restore scripts (backup-db.sh, restore-db.sh, backup-redis.sh)
├── dashboards/ # Superset dashboard JSON exports ├── dashboards/ # Superset dashboard JSON exports
├── tests/ # Python test suite ├── tests/ # Python test suite
└── docker/ # Dockerfiles for services and Superset └── docker/ # Dockerfiles for services and Superset
@@ -198,17 +230,21 @@ npx vitest --run
## Deployment ## Deployment
The platform runs on Kubernetes with Helm: The platform runs on Kubernetes (k3s cluster, 4 NixOS nodes). Full deployment is handled by `runmefirst.sh`, which sets up the database, runs migrations, and deploys via Helm.
```bash ```bash
# CI builds and pushes images automatically on push to main # Full deploy (from gremlin-1 where secrets are available):
# Deploy to cluster: bash ~/sources/kube/stonks-oracle/runmefirst.sh
# Quick Helm upgrade after CI builds new images:
helm upgrade --install stonks-oracle infra/helm/stonks-oracle -n stonks-oracle helm upgrade --install stonks-oracle infra/helm/stonks-oracle -n stonks-oracle
# Restart a specific service: # Restart a specific service:
kubectl rollout restart deployment/<service-name> -n stonks-oracle kubectl rollout restart deployment/<service-name> -n stonks-oracle
``` ```
Secrets are stored in `~/sources/kube/stonks-oracle/` on the deploy host — not in the repo. The deploy script reads them from disk and injects them via Helm `--set` flags. See the [runbook](docs/notes/runbook.md) for operational details.
## Live Endpoints ## Live Endpoints
| Service | URL | | Service | URL |
@@ -216,6 +252,7 @@ kubectl rollout restart deployment/<service-name> -n stonks-oracle
| Dashboard | https://stonks.celestium.life | | Dashboard | https://stonks.celestium.life |
| Query API | https://stonks-api.celestium.life | | Query API | https://stonks-api.celestium.life |
| Symbol Registry | https://stonks-registry.celestium.life | | Symbol Registry | https://stonks-registry.celestium.life |
| Trading Engine | https://stonks-trading.celestium.life |
| Superset | https://stonks-dash.celestium.life | | Superset | https://stonks-dash.celestium.life |
| Trino | https://stonks-trino.celestium.life | | Trino | https://stonks-trino.celestium.life |
+253 -58
@@ -8,25 +8,81 @@ kubectl config use-context <your-context>
alias kso='kubectl -n stonks-oracle' alias kso='kubectl -n stonks-oracle'
``` ```
4-node k3s cluster (gremlin-1 through gremlin-4). Deploy host is gremlin-1 (192.168.42.254) where secrets and the deploy script live.
## Service Overview ## Service Overview
| Service | Type | Replicas | Notes | | Service | Type | Replicas | Notes |
|---------|------|----------|-------| |---------|------|----------|-------|
| scheduler | CronJob-like worker | 1 | Polls sources on schedule | | scheduler | CronJob-like worker | 1 | Polls sources on schedule |
| symbol-registry | FastAPI | 1 | Company/watchlist CRUD | | symbol-registry | FastAPI | 1 | Company/watchlist/exposure/competitor CRUD |
| ingestion | Queue worker | 2 | Fetches from adapters | | ingestion | Queue worker | 2 | Fetches from adapters (market data, news, filings, macro) |
| parser | Queue worker | 2 | HTML→text extraction | | parser | Queue worker | 2 | HTML→text extraction |
| extractor | Queue worker | 1 | LLM-based intelligence extraction | | extractor | Queue worker | 1 | LLM-based intelligence extraction + event classification |
| aggregation | Queue worker | 1 | Trend/signal aggregation | | aggregation | Queue worker | 1 | Trend/signal aggregation across all 3 layers |
| recommendation | Queue worker | 1 | Trade signal generation | | recommendation | Queue worker | 1 | Trade signal generation |
| trading-engine | FastAPI | 1 | Autonomous decision loop, position sizing, backtesting |
| risk | FastAPI | 1 | Risk evaluation + approval | | risk | FastAPI | 1 | Risk evaluation + approval |
| broker-adapter | Queue worker | 1 | Paper/live order execution | | broker-adapter | Queue worker | 1 | Paper/live order execution via Alpaca |
| lake-publisher | Queue worker | 1 | Iceberg table publication | | lake-publisher | Queue worker | 1 | Iceberg table publication |
| query-api | FastAPI | 1 | Dashboard/analytics queries | | query-api | FastAPI | 1 | Dashboard/analytics queries |
| dashboard | nginx | 1 | React SPA on port 8080 |
| trino | Analytics engine | 1 | SQL over lakehouse | | trino | Analytics engine | 1 | SQL over lakehouse |
| superset | Dashboard | 1 | Visualization | | superset | Dashboard | 1 | Visualization |
| hive-metastore | Metastore | 1 | Iceberg catalog backend | | hive-metastore | Metastore | 1 | Iceberg catalog backend |
## Deployment
### Full Deploy
Run from gremlin-1 where secrets are available:
```bash
bash ~/sources/kube/stonks-oracle/runmefirst.sh
```
This script:
1. Pulls latest code
2. Creates namespace with Helm labels
3. Sets up PostgreSQL user and database
4. Runs all migrations in order
5. Deploys via Helm with secrets injected
6. Rolling restarts all deployments
### Quick Helm Upgrade
After CI builds new images:
```bash
helm upgrade --install stonks-oracle infra/helm/stonks-oracle -n stonks-oracle
```
### Full Teardown
Preserves PostgreSQL, Redis, and MinIO data:
```bash
bash ~/sources/kube/stonks-oracle/runmelast.sh
```
## Secrets Management
Secrets are stored on the deploy host at `~/sources/kube/stonks-oracle/`. This directory is NOT a git repo — secrets stay local.
Required secret files:
- `~/sources/kube/stonks-oracle/polygon.io.key` — Polygon.io API key
- `~/sources/kube/stonks-oracle/alpaca.key` — Alpaca API key
- `~/sources/kube/stonks-oracle/alpaca.secret` — Alpaca API secret
- `~/sources/kube/stonks-oracle/alpaca.url` — Alpaca base URL (defaults to paper API)
- `/run/secrets/github_token` — GHCR authentication token
The deploy script (`runmefirst.sh`) reads these files and injects them into Kubernetes secrets via Helm `--set` flags. Never hardcode secrets in manifests, values files, or this runbook.
To rotate a secret:
1. Update the file on gremlin-1
2. Re-run `runmefirst.sh` (or `helm upgrade` with the new `--set` values)
3. Restart affected deployments
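
Before re-running the deploy script after a rotation, a quick pre-flight check can confirm the key files are present and non-empty. This is a hypothetical helper, not part of `runmefirst.sh`; the file names are the ones listed above, and `alpaca.url` is treated as required here even though the script reportedly falls back to the paper API.

```python
from pathlib import Path

# File names as listed in the runbook. The directory is a parameter so the
# check can be pointed at any location.
REQUIRED_SECRET_FILES = ("polygon.io.key", "alpaca.key", "alpaca.secret", "alpaca.url")


def missing_secrets(secrets_dir):
    """Return the names of required files that are absent or empty."""
    base = Path(secrets_dir).expanduser()
    missing = []
    for name in REQUIRED_SECRET_FILES:
        path = base / name
        if not path.is_file() or not path.read_text().strip():
            missing.append(name)
    return missing
```

Calling `missing_secrets("~/sources/kube/stonks-oracle")` on the deploy host should return an empty list when everything is in place. The function never prints file contents, so it is safe to run in a shared terminal.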
## Common Operations ## Common Operations
### Restart a service ### Restart a service
@@ -46,20 +102,6 @@ kso logs <pod-name> --previous --tail=50
kso scale deployment/<service-name> --replicas=N kso scale deployment/<service-name> --replicas=N
``` ```
### Redeploy with updated secrets
```bash
GHCR_TOKEN=$(cat /run/secrets/github_token)
helm upgrade --install stonks-oracle infra/helm/stonks-oracle \
--namespace stonks-oracle \
--set "ghcrAuth.password=$GHCR_TOKEN" \
--set 'secrets.core.POSTGRES_PASSWORD=St0nks0racl3!' \
--set "secrets.core.MINIO_ACCESS_KEY=AKIA6V7J3N9B5P0D2YQH" \
--set 'secrets.core.MINIO_SECRET_KEY=8fG3!v2rJ7$wN@9mLpQ6zXbC4tKdPqW1' \
--set 'secrets.core.REDIS_PASSWORD=PSCh4ng3me!'
# Then restart deployments to pick up secret changes:
for dep in $(kso get deployments -o name); do kso rollout restart "$dep"; done
```
### Run database migrations ### Run database migrations
```bash ```bash
for f in $(ls infra/migrations/*.sql | sort); do for f in $(ls infra/migrations/*.sql | sort); do
@@ -67,73 +109,179 @@ for f in $(ls infra/migrations/*.sql | sort); do
done done
``` ```
## Trading Mode Toggle ## Trading Engine Operations
### Check trading engine status
```bash
curl -s https://stonks-trading.celestium.life/health
curl -s https://stonks-trading.celestium.life/ready
```
### Pause trading
```bash
# Via API — sets enabled=false in trading_engine_config
curl -X PUT https://stonks-trading.celestium.life/api/trading/config \
-H 'Content-Type: application/json' \
-d '{"enabled": false}'
```
### Resume trading
```bash
curl -X PUT https://stonks-trading.celestium.life/api/trading/config \
-H 'Content-Type: application/json' \
-d '{"enabled": true}'
```
### Check recent trading decisions
```bash
curl -s 'https://stonks-api.celestium.life/api/trading/decisions?limit=10'
```
### Run a backtest
```bash
curl -X POST https://stonks-trading.celestium.life/api/trading/backtest \
-H 'Content-Type: application/json' \
-d '{"start_date": "2025-01-01", "end_date": "2025-06-01", "initial_capital": 100000, "risk_tier": "moderate"}'
```
### Check circuit breaker status
```bash
curl -s https://stonks-api.celestium.life/api/trading/circuit-breaker
```
### Check portfolio state
```bash
curl -s https://stonks-api.celestium.life/api/trading/portfolio
```
## Broker Mode Toggle
Current mode is set via ConfigMap `stonks-config` key `BROKER_MODE`. Current mode is set via ConfigMap `stonks-config` key `BROKER_MODE`.
```bash ```bash
# Check current mode # Check current mode
kso get configmap stonks-config -o jsonpath='{.data.BROKER_MODE}' kso get configmap stonks-config -o jsonpath='{.data.BROKER_MODE}'
# To switch modes, update values.yaml config.BROKER_MODE and helm upgrade,
# then restart broker-adapter and risk deployments.
``` ```
**Modes:** **Modes:**
- `paper` — all orders go through paper trading simulation (default, safe) - `paper` — all orders go through paper trading simulation (default)
- `live` — orders are submitted to the real broker API (requires operator approval workflow) - `live` — orders submitted to real broker API (requires operator approval workflow)
**Never switch to live without:** **Never switch to live without:**
1. Confirming paper trading PnL is acceptable 1. Confirming paper trading PnL is acceptable
2. Verifying risk limits are configured in `risk_configuration` table 2. Verifying risk limits are configured
3. Enabling operator approval in `operator_approvals` table 3. Enabling operator approval in the risk engine
## Operator Approval for Live Trades ## Signal Layer Toggles
The risk engine requires explicit operator approval before executing live trades.
Approvals are managed via the risk API:
### Macro signal layer
```bash ```bash
# Check pending approvals # Check status
curl -s https://stonks-api.celestium.life/risk/approvals/pending curl -s https://stonks-api.celestium.life/api/admin/macro/status
# Approve a recommendation # Toggle
curl -X POST https://stonks-api.celestium.life/risk/approvals/<id>/approve curl -X PUT https://stonks-api.celestium.life/api/admin/macro/toggle
``` ```
## Common Failure Modes ### Competitive signal layer
### CrashLoopBackOff on workers
Queue workers (aggregation, extractor, recommendation, broker-adapter, lake-publisher) exit with code 0 when the queue is empty. Kubernetes restarts them, which is normal. They'll process work when messages arrive.
### PostgreSQL auth failure
Password mismatch between `stonks-core-secrets.POSTGRES_PASSWORD` and the actual DB user password. Fix:
```bash ```bash
kubectl exec -i -n postgresql-service postgresql-1 -c postgres -- psql -U postgres -d stonks <<'EOF' # Check status
ALTER USER stonks WITH PASSWORD '<new-password>'; curl -s https://stonks-api.celestium.life/api/admin/competitive/status
EOF
# Toggle
curl -X PUT https://stonks-api.celestium.life/api/admin/competitive/toggle
``` ```
Then update the Helm secret and restart.
### Redis connection refused ## Backup and Restore
Check Redis is running: `kubectl get pods -n redis-service`
If Redis master is down, restart it: `kubectl rollout restart -n redis-service statefulset/redis-master`
### ImagePullBackOff ### Database backup
GHCR credentials expired or missing. Re-run `helm upgrade` with fresh `ghcrAuth.password`. ```bash
# Local backup (keeps last 7)
./scripts/backup-db.sh
### Superset won't start # Backup + upload to MinIO
Needs custom image with `sqlalchemy-trino` package. Stock `apache/superset:latest` doesn't include it. ./scripts/backup-db.sh --upload-minio
```
## Log Access Backups go to `~/backups/stonks-oracle/`. Old backups are auto-pruned (keeps last 7).
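
The keep-last-7 pruning can be sketched as below. This is not the logic of `backup-db.sh` itself; it assumes backup names follow the `stonks-YYYYMMDD-HHMMSS.sql.gz` pattern seen in the restore example, so that sorting by name also sorts by time.

```python
from pathlib import Path


def prune_backups(backup_dir, keep=7, pattern="stonks-*.sql.gz"):
    """Delete all but the `keep` newest backups; return the removed file names."""
    # Timestamped names sort chronologically, so the oldest files come first.
    backups = sorted(Path(backup_dir).expanduser().glob(pattern), key=lambda p: p.name)
    stale = backups[:-keep] if keep > 0 else backups
    for path in stale:
        path.unlink()
    return [path.name for path in stale]
```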
All services output JSON logs when `JSON_LOGS=true` (default). ### Database restore
```bash
# Lists available backups if no argument given
./scripts/restore-db.sh
# Restore a specific backup (WARNING: replaces all data)
./scripts/restore-db.sh ~/backups/stonks-oracle/stonks-20250615-180000.sql.gz
```
The restore script scales down all services, restores the dump, re-grants permissions, and scales services back up.
### Redis backup
```bash
./scripts/backup-redis.sh
```
Triggers a BGSAVE and copies the RDB dump locally.
## Database Nuke & Rebuild
When a full reset is needed:
```bash ```bash
# Stream all logs from a service # 1. Tear down Helm release
kso logs -f deployment/<service> --tail=100 bash ~/sources/kube/stonks-oracle/runmelast.sh
# Search for errors across all pods # 2. Terminate connections and drop database
kubectl exec -n postgresql-service postgresql-1 -c postgres -- \
psql -U postgres -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'stonks' AND pid <> pg_backend_pid();"
kubectl exec -n postgresql-service postgresql-1 -c postgres -- \
psql -U postgres -c "DROP DATABASE IF EXISTS stonks;"
# 3. Flush Redis dedup markers
# (clear all stonks:* keys from Redis)
# 4. Full redeploy (creates DB, runs migrations, deploys)
bash ~/sources/kube/stonks-oracle/runmefirst.sh
# 5. Re-seed companies and relationships
# (run from a pod or with port-forwarded DB access)
python -m services.symbol_registry.seed
```
## Monitoring
### Check pod status
```bash
kso get pods
kso get pods -o wide # includes node placement
```
### Check ingestion health
```bash
# Recent ingestion activity
kso logs deployment/ingestion --tail=20
# Source failure alerts
kso logs deployment/scheduler --tail=20 | grep -i "failure\|alert"
```
### Check broker errors
```bash
kso logs deployment/broker-adapter --tail=30 | grep -i "error\|fail"
```
### Check global event processing
```bash
kso logs deployment/extractor --tail=20 | grep -i "macro\|global"
```
### Check trading decisions
```bash
kso logs deployment/trading-engine --tail=30
```
### Stream all errors
```bash
kso logs --all-containers --prefix --tail=100 | grep -i error kso logs --all-containers --prefix --tail=100 | grep -i error
``` ```
@@ -141,7 +289,54 @@ kso logs --all-containers --prefix --tail=100 | grep -i error
| URL | Service | | URL | Service |
|-----|---------| |-----|---------|
| https://stonks.celestium.life | Dashboard |
| https://stonks-api.celestium.life | Query API | | https://stonks-api.celestium.life | Query API |
| https://stonks-registry.celestium.life | Symbol Registry | | https://stonks-registry.celestium.life | Symbol Registry |
| https://stonks-trading.celestium.life | Trading Engine |
| https://stonks-dash.celestium.life | Superset | | https://stonks-dash.celestium.life | Superset |
| https://stonks-trino.celestium.life | Trino | | https://stonks-trino.celestium.life | Trino |
## CI/CD
Workflow: `.github/workflows/build.yml`
Push to `main` triggers: lint → pytest → frontend vitest → build all service images → push to GHCR.
### Check recent builds
```bash
gh run list -L 5
```
### Re-run a failed build
```bash
gh run rerun <run-id> --failed
```
### View failure logs
```bash
gh run view <run-id> --log-failed
```
## Common Failure Modes
### CrashLoopBackOff on workers
Queue workers (aggregation, extractor, recommendation, broker-adapter, lake-publisher) exit with code 0 when the queue is empty. Kubernetes restarts them — this is normal. They process work when messages arrive.
### PostgreSQL auth failure
Password mismatch between the Kubernetes secret and the actual DB user. Fix by re-running `runmefirst.sh`, which resets the password and redeploys.
### Redis connection refused
```bash
kubectl get pods -n redis-service
kubectl rollout restart -n redis-service statefulset/redis-master
```
### ImagePullBackOff
GHCR credentials expired or missing. Re-run `runmefirst.sh` with a fresh GitHub token at `/run/secrets/github_token`.
### Trading engine not making decisions
1. Check if trading is enabled: `curl -s https://stonks-trading.celestium.life/health`
2. Check circuit breaker status — may be tripped
3. Check if within trading window (9:45 AM to 3:45 PM ET)
4. Check if there are actionable recommendations in the queue
5. Check logs: `kso logs deployment/trading-engine --tail=50`
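
To rule out step 3 quickly, the trading-window check can be reproduced offline. This sketch assumes the window is inclusive at both ends and applies Monday through Friday; it ignores market holidays, and `in_trading_window` is a hypothetical helper rather than the engine's own function.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")
WINDOW_OPEN = time(9, 45)
WINDOW_CLOSE = time(15, 45)


def in_trading_window(moment):
    """True when a timezone-aware `moment` is on a weekday inside 9:45-15:45 ET."""
    local = moment.astimezone(EASTERN)
    if local.weekday() >= 5:  # Saturday or Sunday
        return False
    return WINDOW_OPEN <= local.time() <= WINDOW_CLOSE
```

Pass a timezone-aware value such as `datetime.now(ZoneInfo("UTC"))`; a naive datetime would be interpreted in the host's local zone.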