# Stonks Oracle — Project Context

## Overview
Stonks Oracle is a Kubernetes-native AI market intelligence and paper-trading platform.
Python monorepo with services under `services/`, infrastructure under `infra/`, lakehouse schemas under `lakehouse/`, frontend React dashboard under `frontend/`, and dashboards under `dashboards/`.

Three-layer signal aggregation engine:
1. **Company-specific signals** — document intelligence from news, filings, market data
2. **Macro signals** — global news interpolation, geopolitical event classification, exposure-based impact scoring
3. **Competitive signals** — historical pattern mining, cross-company signal propagation, competitor relationship management

## Tracked Universe
- 50 companies across 10 sectors (Technology, Consumer Cyclical, Financial Services, Healthcare, Energy, Communication Services, Industrials, Consumer Defensive, Real Estate, Utilities)
- 46 competitor relationships (direct_rival, same_sector, overlapping_products, supply_chain_adjacent)
- Seed script: `python -m services.symbol_registry.seed`

## Local Dev Environment
- Ubuntu dev machine, Python 3.12
- Virtual environment at `.venv/` — always use it for Python commands
- Node.js 24 via nvm — always load nvm before running Node/npm commands:
  `export NVM_DIR="$HOME/.nvm" && [ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh" && nvm use 24`
- For tools not in `.venv/` (like `ruff`, `gh`), install via pip or apt as needed
- Docker available locally for image builds (but let CI handle pushes)
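
For the `.venv/` rule above, a script can check at startup that it is really running from the project environment. This is a minimal sketch, not code from the repo: the helper name `in_project_venv` is hypothetical, and it only relies on the standard `sys.prefix` / `sys.base_prefix` distinction.

```python
import sys
from pathlib import Path


def in_project_venv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """Return True when the interpreter runs from a virtual environment named `.venv`."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    is_venv = prefix != base_prefix
    return is_venv and Path(prefix).name == ".venv"


if __name__ == "__main__":
    print("project venv active:", in_project_venv())
```

Running it as `.venv/bin/python check_venv.py` (file name chosen for illustration) should report that the project venv is active, while the system interpreter should not.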

## Live Endpoints
- Dashboard: `https://stonks.celestium.life`
- Query API: `https://stonks-api.celestium.life`
- Symbol Registry: `https://stonks-registry.celestium.life`
- Trading Engine: `https://stonks-trading.celestium.life`
- Superset: `https://stonks-dash.celestium.life`
- Trino: `https://stonks-trino.celestium.life`

## Infrastructure
- Kubernetes cluster: 4x NixOS nodes (gremlin-1 through gremlin-4), reachable via `kubectl`, `virtctl`, `ssh root@gremlin-{1,2,3,4}`
- NixOS configs stored at `/etc/nixos` on gremlin-1, git-pushed to other hosts
- Ingress: Traefik, domain `*.celestium.life`
- Cert-Manager: `ca-issuer` (local CA) for internal services
- Container registry: `ghcr.io/celesrenata/stonks-oracle`

## CI/CD
- GitHub Actions workflow at `.github/workflows/build.yml`
- Push to `main` triggers: lint → pytest → frontend vitest → build all service images + dashboard + superset → push to GHCR
- Images tagged as `ghcr.io/celesrenata/stonks-oracle/<service>:<sha>` and `:latest`
- Dashboard image: `frontend/Dockerfile` (multi-stage: node:24 → nginx-unprivileged on port 8080)
- Superset image: `docker/Dockerfile.superset` (apache/superset + trino + psycopg2)
- Python service images: `docker/Dockerfile` with `SERVICE_CMD` build arg
- Let CI handle image builds and pushes — do NOT manually `docker build && docker push`
- Check CI status: `gh run list -L 3`
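
The tagging scheme above can be expressed as a small helper, which is handy when checking which image references a given commit should have produced. This is an illustrative sketch: `image_tags` is a hypothetical name, the service name in the usage note is assumed, and whether CI uses the full or abbreviated commit SHA is not stated here.

```python
REGISTRY = "ghcr.io/celesrenata/stonks-oracle"  # registry path listed under Infrastructure


def image_tags(service: str, sha: str) -> list[str]:
    """Return the two references pushed per service image: `<sha>` and `latest`."""
    if not service or not sha:
        raise ValueError("service and sha must be non-empty")
    repo = f"{REGISTRY}/{service}"
    return [f"{repo}:{sha}", f"{repo}:latest"]
```

For example, `image_tags("query-api", "abc1234")` yields the SHA-pinned reference followed by the `:latest` one; pinning deployments to the SHA tag keeps rollbacks reproducible.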

## Deployment Scripts
- `~/sources/kube/stonks-oracle/runmefirst.sh` — full deploy: DB setup, migrations, Helm install, rolling restart (runs from gremlin-1 at 192.168.42.254 where secrets are available)
- `~/sources/kube/stonks-oracle/runmelast.sh` — teardown: Helm uninstall, clean resources (preserves DB/MinIO/Redis)
- After CI builds, deploy with: `helm upgrade --install stonks-oracle infra/helm/stonks-oracle -n stonks-oracle`
- Restart a single service: `kubectl rollout restart deployment/<name> -n stonks-oracle`

## Database Nuke & Rebuild
When a full reset is needed:
1. `bash ~/sources/kube/stonks-oracle/runmelast.sh` (from gremlin-1)
2. `kubectl exec -n postgresql-service postgresql-1 -c postgres -- psql -U postgres -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'stonks' AND pid <> pg_backend_pid();"`
3. `kubectl exec -n postgresql-service postgresql-1 -c postgres -- psql -U postgres -c "DROP DATABASE IF EXISTS stonks;"`
4. Flush Redis: clear all `stonks:*` keys to reset dedup markers
5. `bash ~/sources/kube/stonks-oracle/runmefirst.sh` (from gremlin-1)
6. Run seed: `POSTGRES_HOST=postgresql-rw.postgresql-service.svc.cluster.local POSTGRES_PASSWORD='St0nks0racl3!' POSTGRES_USER=stonks POSTGRES_DB=stonks .venv/bin/python -m services.symbol_registry.seed`
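
Step 4 names the goal but not a command. One possible way to clear the `stonks:*` keys is sketched below; it is an assumption about how the flush could be done, not the project's actual procedure. The function name `flush_stonks_keys` is hypothetical, and `client` is expected to behave like a redis-py `redis.Redis` instance (only `scan_iter` and `delete` are used).

```python
from typing import Any


def flush_stonks_keys(client: Any, pattern: str = "stonks:*", batch_size: int = 500) -> int:
    """Delete every key matching `pattern` and return how many keys were removed.

    `client` must provide `scan_iter(match=..., count=...)` and `delete(*keys)`,
    as redis-py's `redis.Redis` does.
    """
    deleted = 0
    batch: list[Any] = []
    # SCAN walks the keyspace incrementally, so it avoids the long blocking
    # call that a single KEYS command would cause on a large database.
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch.clear()
    if batch:
        deleted += client.delete(*batch)
    return deleted
```

With redis-py installed, the client would be created with the cluster Redis host and the password from the Helm secrets, then passed to `flush_stonks_keys`. Restricting the pattern to `stonks:*` leaves keys belonging to other applications on the shared Redis untouched.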

## API Secrets
- Stored as files in repo root (gitignored): `polygon.io.key`, `alpaca.key`, `alpaca.secret`, `alpaca.url`
- GitHub token at `/run/secrets/github_token` (on gremlin-1 only)
- Injected into K8s secrets via `runmefirst.sh` Helm `--set` flags

## Existing Cluster Services (do NOT redeploy these)
- PostgreSQL: `postgresql-rw.postgresql-service.svc.cluster.local:5432`
- Redis: `redis-master.redis-service.svc.cluster.local:6379` (password: in Helm secrets)
- MinIO: `minio.minio-service.svc.cluster.local:80` (API)
- Ollama: `ollama.ollama-service.svc.cluster.local:11434` (cluster-internal), also at `http://10.1.1.12:2701` (external), GPU: 4070 Ti Super 16GB

## Database Migrations
- Located in `infra/migrations/001_*.sql` through `028_*.sql`
- Applied automatically by `runmefirst.sh` in sorted order
- Next migration number: **029**
- Key migrations:
  - 016: Global news interpolation (global_events, macro_impact_records, exposure_profiles, trend_projections)
  - 017: Competitive intelligence (competitor_relationships, competitive_signal_records)
  - 024: Trend history time-series table
  - 026: AI agents management (ai_agents, agent_performance_log)
  - 027: Agent variants (agent_variants table for A/B testing)
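
Because migrations are applied in sorted filename order, the next free number can be derived from the directory listing instead of being tracked by hand. The sketch below assumes the `NNN_description.sql` naming pattern implied above; the helper names and the file names in the usage note are hypothetical.

```python
import re

# Three-digit prefix, underscore, description, `.sql` suffix.
_MIGRATION_RE = re.compile(r"^(\d{3})_.+\.sql$")


def sorted_migrations(filenames: list[str]) -> list[str]:
    """Return migration files in the order they would be applied (sorted by name)."""
    return sorted(name for name in filenames if _MIGRATION_RE.match(name))


def next_migration_number(filenames: list[str]) -> str:
    """Return the next free zero-padded prefix, or '001' when no migration exists."""
    numbers = [int(m.group(1)) for name in filenames if (m := _MIGRATION_RE.match(name))]
    highest = max(numbers, default=0)
    return f"{highest + 1:03d}"
```

Given a listing such as `["001_init.sql", "002_symbols.sql", "010_trends.sql"]`, `next_migration_number` returns the prefix that follows the highest existing one. Zero-padding matters because plain string sorting would otherwise place `10_` before `2_`.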

## Key Conventions
- All services use `services/shared/config.py` for configuration via env vars
- Redis queues defined in `services/shared/redis_keys.py`
- Pydantic schemas in `services/shared/schemas.py`
- Helm chart in `infra/helm/stonks-oracle/`, all in `stonks-oracle` namespace
- Lakehouse DDL in `lakehouse/schemas/`
- Frontend proxies: `/api/` → query-api:8000, `/registry/` → symbol-registry:8000, `/risk/` → risk:8000
- Network policies: default-deny with explicit allow rules per service
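
To illustrate the env-var configuration convention, here is a minimal sketch of reading the `POSTGRES_*` variables used by the seed command. It does not reproduce `services/shared/config.py`: the class `PostgresSettings` is hypothetical, and `POSTGRES_PORT` is an assumed optional variable that falls back to the port of the cluster PostgreSQL service.

```python
import os
from dataclasses import dataclass
from typing import Mapping

_REQUIRED = ("POSTGRES_HOST", "POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB")


@dataclass(frozen=True)
class PostgresSettings:
    host: str
    user: str
    password: str
    database: str
    port: int = 5432

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "PostgresSettings":
        """Build settings from POSTGRES_* variables, failing fast when one is missing."""
        missing = [name for name in _REQUIRED if name not in env]
        if missing:
            raise KeyError(f"missing environment variables: {', '.join(missing)}")
        return cls(
            host=env["POSTGRES_HOST"],
            user=env["POSTGRES_USER"],
            password=env["POSTGRES_PASSWORD"],
            database=env["POSTGRES_DB"],
            port=int(env.get("POSTGRES_PORT", "5432")),  # assumed optional override
        )
```

Accepting a mapping instead of reading `os.environ` directly keeps the loader easy to test, and failing on the full list of missing names surfaces every configuration gap in one error.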

## Signal Layers
- **Layer 1 (Company)**: document_impact_records → WeightedSignal → trend_windows
- **Layer 2 (Macro)**: global_events → macro_impact_records → WeightedSignal (toggle: `macro_enabled` in risk_configs)
- **Layer 3 (Competitive)**: pattern_matcher → signal_propagation → WeightedSignal (toggle: `competitive_enabled` in risk_configs)
- All three layers merge into the aggregation engine via the same WeightedSignal abstraction
- Each layer has an independent runtime toggle in risk_configs (no restart needed)
- Pattern-only and macro-only trend shifts are forced to informational mode (suppression safety)
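
The merge and suppression rules above can be sketched as follows. This is a simplified illustration, not the engine's implementation: the fields on `WeightedSignal`, the weighted-average formula, and the `Aggregate` result type are all assumptions. It also treats any result without a company-level signal as informational, which covers the pattern-only and macro-only cases and, by assumption, a macro-plus-pattern mix.

```python
from dataclasses import dataclass
from typing import Literal

Layer = Literal["company", "macro", "competitive"]


@dataclass(frozen=True)
class WeightedSignal:
    layer: Layer
    score: float   # signed impact; the sign convention is an assumption
    weight: float  # non-negative importance of this signal


@dataclass(frozen=True)
class Aggregate:
    score: float
    informational: bool  # True when the result must not drive actions


def aggregate(signals: list[WeightedSignal], *, macro_enabled: bool = True,
              competitive_enabled: bool = True) -> Aggregate:
    """Merge signals from all enabled layers into one weighted-average score."""
    enabled = {"company"}
    if macro_enabled:
        enabled.add("macro")
    if competitive_enabled:
        enabled.add("competitive")

    # Signals from disabled layers are dropped before any arithmetic happens.
    active = [s for s in signals if s.layer in enabled and s.weight > 0]
    total_weight = sum(s.weight for s in active)
    if total_weight == 0:
        return Aggregate(score=0.0, informational=True)

    score = sum(s.score * s.weight for s in active) / total_weight
    # Suppression safety: without any company-level signal, the shift is
    # reported for information only.
    has_company = any(s.layer == "company" for s in active)
    return Aggregate(score=score, informational=not has_company)
```

Passing the toggles as arguments mirrors the idea of reading them from `risk_configs` on each run, so flipping a toggle takes effect on the next aggregation without restarting anything.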