stonks-oracle/docs/architecture-kubernetes.md
Celes Renata f468e30af0
2026-05-02 07:32:26 +00:00

Kubernetes Architecture — Stonks Oracle

This document describes the Kubernetes deployment topology for Stonks Oracle, derived from the Helm chart at infra/helm/stonks-oracle/.

All application workloads deploy to the stonks-oracle namespace. External cluster services (PostgreSQL, Redis, MinIO, Ollama) run in their own namespaces and are referenced via cross-namespace DNS.

Deployment Diagram

graph TB
    %% ── External traffic ──────────────────────────────────────────
    internet((Internet))

    subgraph traefik ["kube-system · Traefik Ingress Controller"]
        direction LR
        ing_dash["stonks.celestium.life"]
        ing_api["stonks-api.celestium.life"]
        ing_reg["stonks-registry.celestium.life"]
        ing_trade["stonks-trading.celestium.life"]
        ing_superset["stonks-dash.celestium.life"]
        ing_trino["stonks-trino.celestium.life"]
    end

    internet --> traefik

    %% ── stonks-oracle namespace ───────────────────────────────────
    subgraph ns ["stonks-oracle namespace"]
        direction TB

        %% ── API Tier (ingress-facing) ─────────────────────────────
        subgraph api_tier ["API Tier · tier: api"]
            direction LR
            query_api["query-api<br/><i>Deployment · 1 replica</i><br/>:8000<br/><i>readiness: /docs</i>"]
            symbol_registry["symbol-registry<br/><i>Deployment · 1 replica</i><br/>:8000<br/><i>readiness: /docs · liveness: /docs</i>"]
        end

        %% ── Frontend Tier ─────────────────────────────────────────
        subgraph frontend_tier ["Frontend Tier · tier: frontend"]
            dashboard["dashboard<br/><i>Deployment · 1 replica</i><br/>:8080<br/><i>nginx-unprivileged</i><br/><i>readiness: / · liveness: /</i>"]
        end

        %% ── Trading Tier ──────────────────────────────────────────
        subgraph trading_tier ["Trading Tier · tier: trading"]
            direction LR
            trading_engine["trading-engine<br/><i>Deployment · 1 replica</i><br/>:8000<br/><i>readiness: /ready · liveness: /health</i>"]
            risk_engine["risk-engine<br/><i>Deployment · 1 replica</i><br/>:8000"]
            broker_adapter["broker-adapter<br/><i>Deployment · 1 replica</i><br/><i>queue-driven worker · pipeline-gated</i>"]
        end

        %% ── Orchestration Tier ────────────────────────────────────
        subgraph orchestration_tier ["Orchestration Tier · tier: orchestration"]
            scheduler["scheduler<br/><i>Deployment · 1 replica · pipeline-gated</i><br/><i>init: migrations → seed → backfill</i>"]
        end

        %% ── Ingestion Tier ────────────────────────────────────────
        subgraph ingestion_tier ["Ingestion Tier · tier: ingestion"]
            ingestion["ingestion<br/><i>Deployment · 1 replica · pipeline-gated</i><br/><i>queue-driven worker</i>"]
        end

        %% ── Processing Tier (pipeline workers) ────────────────────
        subgraph processing_tier ["Processing Tier · tier: processing"]
            direction LR
            parser["parser<br/><i>Deployment · 2 replicas · pipeline-gated</i>"]
            extractor["extractor<br/><i>Deployment · 1 replica · pipeline-gated</i>"]
            aggregation["aggregation<br/><i>Deployment · 4 replicas · pipeline-gated</i>"]
            recommendation["recommendation<br/><i>Deployment · 1 replica · pipeline-gated</i>"]
        end

        %% ── Analytics Tier ────────────────────────────────────────
        subgraph analytics_tier ["Analytics Tier · tier: analytics"]
            direction LR
            lake_publisher["lake-publisher<br/><i>Deployment · 1 replica · pipeline-gated</i><br/><i>queue-driven worker</i>"]
            hive_metastore["hive-metastore<br/><i>Deployment · 1 replica</i><br/>:9083<br/><i>apache/hive:4.0.0</i><br/><i>PVC: hive-metastore-data</i>"]
            trino["trino<br/><i>Deployment · 1 replica</i><br/>:8080<br/><i>trinodb/trino:latest</i><br/><i>readiness: /v1/info</i>"]
        end

        %% ── Superset (tier: dashboard in template) ────────────────
        subgraph superset_block ["Superset · tier: dashboard"]
            superset["superset<br/><i>Deployment · 1 replica</i><br/>:8088<br/><i>custom image</i><br/><i>PVC: superset-data</i><br/><i>readiness: /health</i>"]
        end

        %% ── Helm Secrets ──────────────────────────────────────────
        subgraph secrets_block ["Helm-Managed Secrets"]
            direction LR
            sec_core["stonks-core-secrets<br/><i>POSTGRES_PASSWORD</i><br/><i>MINIO_ACCESS_KEY</i><br/><i>MINIO_SECRET_KEY</i><br/><i>REDIS_PASSWORD</i>"]
            sec_broker["stonks-broker-secrets<br/><i>BROKER_API_KEY</i><br/><i>BROKER_API_SECRET</i><br/><i>BROKER_BASE_URL</i>"]
            sec_market["stonks-market-secrets<br/><i>MARKET_DATA_API_KEY</i>"]
            sec_gmail["stonks-gmail-secrets<br/><i>GMAIL_SENDER</i><br/><i>GMAIL_RECIPIENT</i><br/><i>GMAIL_APP_PASSWORD</i>"]
            sec_dashboard["stonks-dashboard-secrets<br/><i>SUPERSET_SECRET_KEY</i><br/><i>SUPERSET_ADMIN_PASSWORD</i>"]
        end

        %% ── ConfigMap ─────────────────────────────────────────────
        configmap["stonks-config<br/><i>ConfigMap</i><br/><i>All env vars from values.yaml config block</i>"]
    end

    %% ── External Cluster Services ─────────────────────────────────
    subgraph pg_ns ["postgresql-service namespace"]
        postgres[("PostgreSQL<br/>postgresql-rw:5432")]
    end

    subgraph redis_ns ["redis-service namespace"]
        redis[("Redis<br/>redis-master:6379")]
    end

    subgraph minio_ns ["minio-service namespace"]
        minio[("MinIO<br/>minio:80")]
    end

    subgraph ollama_ns ["ollama-service namespace"]
        ollama[("Ollama<br/>ollama:11434<br/><i>GPU: 4070 Ti Super 16GB</i>")]
    end

    %% ── Ingress Routes ────────────────────────────────────────────
    ing_dash -->|":8080"| dashboard
    ing_api -->|":8000"| query_api
    ing_reg -->|":8000"| symbol_registry
    ing_trade -->|":8000"| trading_engine
    ing_superset -->|":8088"| superset
    ing_trino -->|":8080"| trino

    %% ── Dashboard → Backend APIs ──────────────────────────────────
    dashboard -.->|"/api/ proxy"| query_api
    dashboard -.->|"/registry/ proxy"| symbol_registry
    dashboard -.->|"/risk/ proxy"| risk_engine

    %% ── Pipeline data flow (via Redis queues) ─────────────────────
    scheduler -->|"enqueue jobs"| redis
    ingestion -->|"stonks:queue:parsing"| redis
    parser -->|"stonks:queue:extraction"| redis
    extractor -->|"stonks:queue:aggregation"| redis
    aggregation -->|"stonks:queue:recommendation"| redis
    recommendation -->|"stonks:queue:trading_decisions"| redis
    trading_engine -->|"stonks:queue:broker_orders"| redis
    broker_adapter -->|"read orders"| redis
    lake_publisher -->|"stonks:queue:lake_publish"| redis

    %% ── External service connections ──────────────────────────────
    scheduler --> postgres
    scheduler --> redis
    ingestion --> postgres
    ingestion --> redis
    ingestion --> minio
    parser --> postgres
    parser --> redis
    extractor --> postgres
    extractor --> redis
    extractor --> ollama
    aggregation --> postgres
    aggregation --> redis
    recommendation --> postgres
    recommendation --> redis
    trading_engine --> postgres
    trading_engine --> redis
    risk_engine --> postgres
    broker_adapter --> postgres
    broker_adapter --> redis
    lake_publisher --> postgres
    lake_publisher --> minio
    query_api --> postgres
    query_api --> redis
    query_api --> minio
    symbol_registry --> postgres

    %% ── Analytics plane connections ───────────────────────────────
    lake_publisher -->|"Parquet → s3a://stonks-lakehouse"| minio
    hive_metastore -->|"s3a:// catalog"| minio
    trino -->|"thrift://hive-metastore:9083"| hive_metastore
    superset -->|"trino:8080"| trino
    query_api -->|"trino:8080"| trino
    superset --> postgres
    superset --> redis

    %% ── Trading tier external egress ──────────────────────────────
    trading_engine -->|"HTTPS :443<br/>Alpaca API"| internet
    trading_engine -->|"SMTP :587<br/>Gmail notifications"| internet
    broker_adapter -->|"HTTPS :443<br/>Alpaca API"| internet
    ingestion -->|"HTTPS :443<br/>Polygon.io / News APIs"| internet

    %% ── Secret consumption ────────────────────────────────────────
    sec_core -.-> query_api
    sec_core -.-> symbol_registry
    sec_core -.-> scheduler
    sec_core -.-> ingestion
    sec_core -.-> parser
    sec_core -.-> extractor
    sec_core -.-> aggregation
    sec_core -.-> recommendation
    sec_core -.-> trading_engine
    sec_core -.-> risk_engine
    sec_core -.-> broker_adapter
    sec_core -.-> lake_publisher
    sec_core -.-> hive_metastore
    sec_core -.-> trino
    sec_core -.-> superset

    sec_broker -.-> ingestion
    sec_broker -.-> trading_engine
    sec_broker -.-> risk_engine
    sec_broker -.-> broker_adapter

    sec_market -.-> ingestion
    sec_market -.-> query_api

    sec_gmail -.-> trading_engine

    sec_dashboard -.-> superset

    configmap -.-> query_api
    configmap -.-> symbol_registry
    configmap -.-> scheduler
    configmap -.-> ingestion
    configmap -.-> parser
    configmap -.-> extractor
    configmap -.-> aggregation
    configmap -.-> recommendation
    configmap -.-> trading_engine
    configmap -.-> risk_engine
    configmap -.-> broker_adapter
    configmap -.-> lake_publisher
    configmap -.-> superset

    %% ── Styles ────────────────────────────────────────────────────
    classDef apiSvc fill:#4a90d9,stroke:#2c5f8a,color:#fff
    classDef frontendSvc fill:#50c878,stroke:#2e7d46,color:#fff
    classDef tradingSvc fill:#e8a838,stroke:#b07d1a,color:#fff
    classDef processSvc fill:#9b59b6,stroke:#6c3483,color:#fff
    classDef orchSvc fill:#1abc9c,stroke:#148f77,color:#fff
    classDef ingestionSvc fill:#e67e22,stroke:#bf6516,color:#fff
    classDef analyticsSvc fill:#e74c3c,stroke:#a93226,color:#fff
    classDef supersetSvc fill:#c0392b,stroke:#96281b,color:#fff
    classDef extSvc fill:#95a5a6,stroke:#717d7e,color:#fff
    classDef secretSvc fill:#f5f5dc,stroke:#999,color:#333
    classDef configSvc fill:#dfe6e9,stroke:#999,color:#333

    class query_api,symbol_registry apiSvc
    class dashboard frontendSvc
    class trading_engine,risk_engine,broker_adapter tradingSvc
    class scheduler orchSvc
    class ingestion ingestionSvc
    class parser,extractor,aggregation,recommendation processSvc
    class lake_publisher,hive_metastore,trino analyticsSvc
    class superset supersetSvc
    class postgres,redis,minio,ollama extSvc
    class sec_core,sec_broker,sec_market,sec_gmail,sec_dashboard secretSvc
    class configmap configSvc

Network Policy Boundaries

The Helm chart deploys a default-deny-ingress policy that blocks all inbound traffic to pods in the stonks-oracle namespace. Each service that needs inbound connections has an explicit allow policy:

graph LR
    subgraph netpol ["Network Policies — stonks-oracle namespace"]
        direction TB

        deny["🔒 default-deny-ingress<br/><i>Blocks ALL ingress to all pods</i>"]

        subgraph allows ["Explicit Allow Rules"]
            direction TB

            np_dash["allow-dashboard-ingress<br/>dashboard :8080<br/>← kube-system (Traefik)"]

            np_api["allow-query-api-ingress<br/>query-api :8000<br/>← kube-system (Traefik)<br/>← dashboard pod"]

            np_reg["allow-symbol-registry-ingress<br/>symbol-registry :8000<br/>← kube-system (Traefik)<br/>← dashboard pod"]

            np_trade["allow-trading-engine-ingress<br/>trading-engine :8000<br/>← kube-system (Traefik)<br/>← query-api pod<br/>← dashboard pod<br/><i>Egress: PostgreSQL :5432,</i><br/><i>Redis :6379, HTTPS :443, SMTP :587</i>"]

            np_risk["allow-risk-engine-ingress<br/>risk-engine :8000<br/>← broker-adapter pod<br/>← query-api pod<br/>← dashboard pod"]

            np_superset["allow-superset-ingress<br/>superset :8088<br/>← kube-system (Traefik)"]

            np_trino["allow-trino-ingress<br/>trino :8080<br/>← superset pod<br/>← query-api pod<br/>← kube-system (Traefik)"]

            np_hive["allow-hive-metastore-ingress<br/>hive-metastore :9083<br/>← trino pod<br/>← lake-publisher pod"]

            np_broker["deny-broker-adapter-ingress<br/>broker-adapter<br/><i>No inbound traffic allowed</i>"]
        end
    end

    style deny fill:#e74c3c,stroke:#c0392b,color:#fff
    style np_broker fill:#e74c3c,stroke:#c0392b,color:#fff
    style np_dash fill:#2ecc71,stroke:#27ae60,color:#fff
    style np_api fill:#2ecc71,stroke:#27ae60,color:#fff
    style np_reg fill:#2ecc71,stroke:#27ae60,color:#fff
    style np_trade fill:#f39c12,stroke:#d68910,color:#fff
    style np_risk fill:#f39c12,stroke:#d68910,color:#fff
    style np_superset fill:#2ecc71,stroke:#27ae60,color:#fff
    style np_trino fill:#2ecc71,stroke:#27ae60,color:#fff
    style np_hive fill:#3498db,stroke:#2980b9,color:#fff
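
The allow rules in the diagram can be read as a lookup table with a default-deny fallback. The following is a minimal Python sketch of that reading, not the chart's actual NetworkPolicy templates: `ALLOW_RULES` and `ingress_allowed` are hypothetical names, and Traefik is modelled simply as the `kube-system` source.

```python
from typing import NamedTuple

TRAEFIK = "kube-system"  # namespace-level source (Traefik ingress controller)


class AllowRule(NamedTuple):
    port: int            # the only port the policy opens on the target
    sources: frozenset   # pods (or namespaces) allowed to connect


# Transcribed from the diagram above; targets absent from this table
# (pipeline workers, broker-adapter) fall through to default-deny-ingress.
ALLOW_RULES = {
    "dashboard": AllowRule(8080, frozenset({TRAEFIK})),
    "query-api": AllowRule(8000, frozenset({TRAEFIK, "dashboard"})),
    "symbol-registry": AllowRule(8000, frozenset({TRAEFIK, "dashboard"})),
    "trading-engine": AllowRule(8000, frozenset({TRAEFIK, "query-api", "dashboard"})),
    "risk-engine": AllowRule(8000, frozenset({"broker-adapter", "query-api", "dashboard"})),
    "superset": AllowRule(8088, frozenset({TRAEFIK})),
    "trino": AllowRule(8080, frozenset({"superset", "query-api", TRAEFIK})),
    "hive-metastore": AllowRule(9083, frozenset({"trino", "lake-publisher"})),
}


def ingress_allowed(source: str, target: str, port: int) -> bool:
    """Return True only if an explicit allow rule matches; otherwise deny."""
    rule = ALLOW_RULES.get(target)
    if rule is None:
        return False  # default-deny-ingress applies
    return port == rule.port and source in rule.sources


if __name__ == "__main__":
    print(ingress_allowed("superset", "trino", 8080))
    print(ingress_allowed("query-api", "broker-adapter", 8000))
```

Keeping the rules in one table makes it easy to check a proposed connection against the documented policy before changing the chart.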

Services Without Ingress Policies (Pipeline Workers)

The following services have no allow policy for inbound traffic: they are queue-driven workers that only make outbound connections (to PostgreSQL, Redis, MinIO, and Ollama, plus external APIs in the case of ingestion). The default-deny-ingress policy therefore blocks all inbound traffic to them:

| Service | Tier | Behavior |
| --- | --- | --- |
| scheduler | orchestration | Polls DB, enqueues to Redis. Runs migrations + seed + backfill as init containers |
| ingestion | ingestion | Reads from stonks:queue:ingestion, writes to DB/MinIO/Redis. Egress to Polygon.io/News APIs |
| parser | processing | Reads from stonks:queue:parsing, writes to DB/Redis |
| extractor | processing | Reads from stonks:queue:extraction, calls Ollama, writes to DB/Redis |
| aggregation | processing | Reads from stonks:queue:aggregation, writes to DB/Redis |
| recommendation | processing | Reads from stonks:queue:recommendation, writes to DB/Redis |
| lake-publisher | analytics | Reads from stonks:queue:lake_publish, writes Parquet to MinIO |
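
To make the hand-off between workers concrete, here is a small Python sketch that chains the queues each worker reads (from the table above) with the queues each worker writes (from the deployment diagram). `STAGES` and `trace` are hypothetical names used only for illustration; the real services talk to Redis rather than a dictionary.

```python
# worker -> (queue it consumes, queue it produces), as documented above
STAGES = {
    "ingestion": ("stonks:queue:ingestion", "stonks:queue:parsing"),
    "parser": ("stonks:queue:parsing", "stonks:queue:extraction"),
    "extractor": ("stonks:queue:extraction", "stonks:queue:aggregation"),
    "aggregation": ("stonks:queue:aggregation", "stonks:queue:recommendation"),
    "recommendation": ("stonks:queue:recommendation", "stonks:queue:trading_decisions"),
}


def trace(start_queue: str):
    """Follow a job from start_queue through every worker that consumes it.

    Returns (workers visited in order, the final queue with no listed consumer).
    """
    by_input = {consumed: (worker, produced)
                for worker, (consumed, produced) in STAGES.items()}
    path, queue, seen = [], start_queue, set()
    while queue in by_input and queue not in seen:  # 'seen' guards against cycles
        seen.add(queue)
        worker, produced = by_input[queue]
        path.append(worker)
        queue = produced
    return path, queue


if __name__ == "__main__":
    workers, final_queue = trace("stonks:queue:ingestion")
    print(" -> ".join(workers), "->", final_queue)
```

The trace stops at stonks:queue:trading_decisions because the table above lists no consumer for that queue.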

Service Tier Summary

| Tier | Services | Ingress? | Replicas | Pipeline-Gated? | Notes |
| --- | --- | --- | --- | --- | --- |
| api | query-api, symbol-registry | Yes (Traefik) | 1 each | No | FastAPI, readiness probes on /docs |
| frontend | dashboard | Yes (Traefik) | 1 | No | nginx-unprivileged on :8080, proxies to API services |
| trading | trading-engine, risk-engine, broker-adapter | trading-engine: Yes; risk-engine: internal only; broker-adapter: denied | 1 each | broker-adapter only | trading-engine has egress to Alpaca + Gmail |
| orchestration | scheduler | No | 1 | Yes | Runs DB migrations + seed + backfill as init containers |
| ingestion | ingestion | No | 1 | Yes | Fetches from external APIs (Polygon.io, news, filings) |
| processing | parser, extractor, aggregation, recommendation | No | 2, 1, 4, 1 | Yes | Queue-driven pipeline workers |
| analytics | lake-publisher, trino, hive-metastore | trino: Yes (Traefik); others: No | 1 each | lake-publisher only | trino + hive-metastore gated by trino.enabled / hiveMetastore.enabled |
| dashboard (Superset) | superset | Yes (Traefik) | 1 | No | Gated by superset.enabled, custom image with trino + psycopg2 drivers |

Secret Consumption Map

| Secret | Keys | Consumers |
| --- | --- | --- |
| stonks-core-secrets | POSTGRES_PASSWORD, MINIO_ACCESS_KEY, MINIO_SECRET_KEY, REDIS_PASSWORD | All 13 app services + hive-metastore (init), trino (init), superset |
| stonks-broker-secrets | BROKER_API_KEY, BROKER_API_SECRET, BROKER_BASE_URL | ingestion, trading-engine, risk-engine, broker-adapter |
| stonks-market-secrets | MARKET_DATA_API_KEY | ingestion, query-api |
| stonks-gmail-secrets | GMAIL_SENDER, GMAIL_RECIPIENT, GMAIL_APP_PASSWORD | trading-engine |
| stonks-dashboard-secrets | SUPERSET_SECRET_KEY, SUPERSET_ADMIN_PASSWORD | superset |

Pipeline Toggle

Setting pipelineEnabled: false in values.yaml scales all services with pipeline: true to 0 replicas. This affects:

  • scheduler, ingestion, parser, extractor, aggregation, recommendation, broker-adapter, lake-publisher

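API-tier services (query-api, symbol-registry), trading-tier services (trading-engine, risk-engine), analytics services (trino, hive-metastore, superset), and the dashboard are unaffected by this toggle. The analytics services are gated separately by their own flags (trino.enabled, hiveMetastore.enabled, superset.enabled).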
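
The effect of the toggle can be sketched as a pure function over the replica counts listed in this document. This is an illustrative Python model, not the Helm template itself: the `Service` dataclass and `effective_replicas` are hypothetical names, and only services whose replica counts appear above are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    replicas: int            # replica count when the service is running
    pipeline: bool = False   # True if the service is pipeline-gated


# Replica counts and pipeline flags as documented in the tier summary.
SERVICES = {
    "query-api": Service(1),
    "symbol-registry": Service(1),
    "trading-engine": Service(1),
    "risk-engine": Service(1),
    "scheduler": Service(1, pipeline=True),
    "ingestion": Service(1, pipeline=True),
    "parser": Service(2, pipeline=True),
    "extractor": Service(1, pipeline=True),
    "aggregation": Service(4, pipeline=True),
    "recommendation": Service(1, pipeline=True),
    "broker-adapter": Service(1, pipeline=True),
    "lake-publisher": Service(1, pipeline=True),
}


def effective_replicas(services: dict, pipeline_enabled: bool) -> dict:
    """Pipeline-gated services drop to 0 replicas when the toggle is off."""
    return {
        name: svc.replicas if (pipeline_enabled or not svc.pipeline) else 0
        for name, svc in services.items()
    }


if __name__ == "__main__":
    for name, count in effective_replicas(SERVICES, pipeline_enabled=False).items():
        print(f"{name}: {count}")
```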

External Cluster Services

These services run outside the stonks-oracle namespace and are referenced via cross-namespace DNS:

| Service | Namespace | DNS | Port | Notes |
| --- | --- | --- | --- | --- |
| PostgreSQL | postgresql-service | postgresql-rw.postgresql-service.svc.cluster.local | 5432 | CloudNativePG managed |
| Redis | redis-service | redis-master.redis-service.svc.cluster.local | 6379 | Password in stonks-core-secrets |
| MinIO | minio-service | minio.minio-service.svc.cluster.local | 80 | S3-compatible object store |
| Ollama | ollama-service | ollama.ollama-service.svc.cluster.local | 11434 | LLM inference, GPU: 4070 Ti Super 16GB |
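
The DNS names in the table follow the standard Kubernetes pattern `<service>.<namespace>.svc.<cluster-domain>`. A minimal Python sketch of how such names are composed is shown below; `service_fqdn` and `service_url` are hypothetical helpers, and the `cluster.local` domain is taken from the table rather than from any cluster configuration.

```python
CLUSTER_DOMAIN = "cluster.local"  # assumption: the cluster domain used in the table


def service_fqdn(service: str, namespace: str,
                 cluster_domain: str = CLUSTER_DOMAIN) -> str:
    """Build the cross-namespace DNS name for a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def service_url(scheme: str, service: str, namespace: str, port: int) -> str:
    """Build a URL for a Service in another namespace."""
    return f"{scheme}://{service_fqdn(service, namespace)}:{port}"


if __name__ == "__main__":
    print(service_fqdn("postgresql-rw", "postgresql-service"))
    print(service_url("http", "ollama", "ollama-service", 11434))
```

Pods inside the stonks-oracle namespace need the namespace-qualified form because the short name (for example `redis-master`) only resolves within the service's own namespace.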

Analytics Plane

The analytics stack runs within the stonks-oracle namespace:

  1. Lake Publisher writes Parquet fact tables to MinIO at s3a://stonks-lakehouse/warehouse. Pipeline-gated — scales to 0 when pipelineEnabled: false.
  2. Hive Metastore (Apache Hive 4.0.0) manages table metadata, backed by embedded Derby DB with a PVC (hive-metastore-data) for persistence. Connects to MinIO for S3A filesystem access. Gated by hiveMetastore.enabled.
  3. Trino queries the lakehouse via Hive Metastore (thrift://hive-metastore:9083). Exposes two catalogs: lakehouse (Hive connector) and iceberg (Iceberg connector). Both connect to MinIO for data access. Gated by trino.enabled. Readiness probe on /v1/info.
  4. Superset connects to Trino for lakehouse queries and to PostgreSQL for its metadata DB. Uses Redis for caching. Exposed externally via Traefik ingress. Gated by superset.enabled. Uses custom image (registry.celestium.life/stonks-oracle/superset:latest) with trino + psycopg2 drivers. PVC (superset-data) for persistence.

Ingress Routes

All ingress resources use the traefik IngressClass with TLS certificates issued by the ca-issuer ClusterIssuer:

| Domain | Backend Service | Port | TLS Secret |
| --- | --- | --- | --- |
| stonks.celestium.life | dashboard | 8080 | stonks-dashboard-tls |
| stonks-api.celestium.life | query-api | 8000 | stonks-api-tls |
| stonks-registry.celestium.life | symbol-registry | 8000 | stonks-registry-tls |
| stonks-trading.celestium.life | trading-engine | 8000 | stonks-trading-tls |
| stonks-dash.celestium.life | superset | 8088 | stonks-dash-tls |
| stonks-trino.celestium.life | trino | 8080 | stonks-trino-tls |
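
Each row of the route table corresponds to one Ingress resource. The Python sketch below builds a `networking.k8s.io/v1` Ingress as a plain dictionary from a row. It is an approximation under stated assumptions: the resource name (`<service>-ingress`), the root path with `Prefix` matching, and the `cert-manager.io/cluster-issuer` annotation are guesses at how the chart wires the ca-issuer ClusterIssuer, not values read from the chart.

```python
import json

# (domain, backend service, port, TLS secret) rows from the table above
ROUTES = [
    ("stonks.celestium.life", "dashboard", 8080, "stonks-dashboard-tls"),
    ("stonks-api.celestium.life", "query-api", 8000, "stonks-api-tls"),
    ("stonks-registry.celestium.life", "symbol-registry", 8000, "stonks-registry-tls"),
    ("stonks-trading.celestium.life", "trading-engine", 8000, "stonks-trading-tls"),
    ("stonks-dash.celestium.life", "superset", 8088, "stonks-dash-tls"),
    ("stonks-trino.celestium.life", "trino", 8080, "stonks-trino-tls"),
]


def ingress_manifest(host: str, service: str, port: int, tls_secret: str) -> dict:
    """Return an Ingress manifest (as a dict) for one route."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": f"{service}-ingress",  # hypothetical naming scheme
            "namespace": "stonks-oracle",
            # assumption: cert-manager issues the certificate via this annotation
            "annotations": {"cert-manager.io/cluster-issuer": "ca-issuer"},
        },
        "spec": {
            "ingressClassName": "traefik",
            "tls": [{"hosts": [host], "secretName": tls_secret}],
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",           # assumption: whole host routed to one backend
                    "pathType": "Prefix",
                    "backend": {"service": {"name": service,
                                            "port": {"number": port}}},
                }]},
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(ingress_manifest(*ROUTES[0]), indent=2))
```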

Deployment Stages

The Helm chart supports multiple deployment stages via value override files:

| Stage | Override File | Namespace | Key Differences |
| --- | --- | --- | --- |
| Production | values.yaml (base) | stonks-oracle | Full analytics stack, all services |
| Paper | values-paper.yaml | stonks-oracle | BROKER_MODE=paper, DEPLOY_STAGE=paper, separate DB (stonks_paper), Redis DB 2, paper-specific ingress hostnames |
| Beta | values-beta.yaml | stonks-oracle-beta | DEPLOY_STAGE=beta, LOG_LEVEL=DEBUG, separate DB (stonks_beta), Redis DB 1, analytics stack disabled, beta-specific ingress hostnames |
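
Helm layers override files on top of the base values by merging maps recursively, with later files winning on conflicts. The Python sketch below imitates that behaviour for the beta stage. The key layout is an assumption: the `config`, `trino.enabled`, `hiveMetastore.enabled`, `superset.enabled`, and `pipelineEnabled` names appear in this document, but the base values shown for `DEPLOY_STAGE` and `LOG_LEVEL` are placeholders, and the sketch ignores Helm's special handling of `null`.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested maps
        else:
            merged[key] = deepcopy(value)  # scalars and lists are replaced
    return merged


# Placeholder excerpt of the base values.yaml (values are illustrative only).
BASE_VALUES = {
    "pipelineEnabled": True,
    "config": {"DEPLOY_STAGE": "production", "LOG_LEVEL": "INFO"},
    "trino": {"enabled": True},
    "hiveMetastore": {"enabled": True},
    "superset": {"enabled": True},
}

# Hypothetical excerpt of values-beta.yaml based on the table above.
BETA_OVERRIDES = {
    "config": {"DEPLOY_STAGE": "beta", "LOG_LEVEL": "DEBUG"},
    "trino": {"enabled": False},
    "hiveMetastore": {"enabled": False},
    "superset": {"enabled": False},
}

if __name__ == "__main__":
    print(deep_merge(BASE_VALUES, BETA_OVERRIDES))
```

Because the merge is recursive, an override file only needs to list the keys that differ from the base; everything else is inherited.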