Docker Deployment Guide

This guide covers running the full Stonks Oracle platform locally using Docker Compose. It documents every service, environment variable, volume mount, health check, and operational command.

Prerequisites

  • Docker Engine 24+ and Docker Compose v2
  • NVIDIA GPU with drivers and NVIDIA Container Toolkit (for Ollama LLM inference)
  • At least 16 GB RAM (Ollama + Trino + all services)
  • API keys for Polygon.io and Alpaca (optional — platform runs in degraded mode without them)

Quick Start

# 1. Clone the repository
git clone <repo-url> && cd stonks-oracle

# 2. Configure API keys (create .env in the repo root)
cat > .env <<'EOF'
MARKET_DATA_API_KEY=your_polygon_key
BROKER_API_KEY=your_alpaca_key
BROKER_API_SECRET=your_alpaca_secret
BROKER_BASE_URL=https://paper-api.alpaca.markets
EOF

# 3. Start everything
docker compose up -d

# 4. Pull an LLM model into Ollama
docker compose exec ollama ollama pull qwen3.5:9b-fast

# 5. Seed the database
docker compose exec scheduler python -m services.symbol_registry.seed

# 6. Verify all services are healthy
docker compose ps

# 7. Access the dashboard
open http://localhost:3000   # macOS; on Linux use xdg-open or paste the URL into a browser

Automated Deployment

The deploy-docker.sh script automates the full deployment to a remote host via SSH, including prerequisite installation, repository sync, environment configuration, image builds, service startup, database seeding, and Ollama model pulling:

# Deploy with defaults (GPU-accelerated Docker Ollama)
bash deploy-docker.sh

# Specify a custom Ollama model
bash deploy-docker.sh --ollama-model qwen3.6

# Deploy to a different host
bash deploy-docker.sh --host user@myserver --dir /opt/stonks

| Flag | Default | Description |
| --- | --- | --- |
| --host | celes@192.168.42.254 | SSH target (USER@HOST) |
| --ollama-url | (auto: Docker container) | Ollama API URL |
| --ollama-model | qwen3.5:9b-fast | Ollama model to pull |
| --dir | ~/stonks-oracle | Remote install directory |

The script detects the target OS and package manager (apt, dnf, yum, pacman, zypper) and installs Docker, NVIDIA drivers, and the NVIDIA Container Toolkit as needed. It also handles WSL environments and firewall configuration.
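
The package-manager detection described above can be pictured with a short Python sketch. This is not the logic inside deploy-docker.sh, whose internals are not shown in this guide; the function name `detect_package_manager` and the probing order are assumptions made for illustration.

```python
import shutil
from typing import Callable, Optional

# Package managers the guide says deploy-docker.sh recognizes.
# The probing order is an assumption, not taken from the script.
CANDIDATES = ("apt", "dnf", "yum", "pacman", "zypper")


def detect_package_manager(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first candidate found on PATH, or None if none is installed."""
    for name in CANDIDATES:
        if which(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print(detect_package_manager() or "no supported package manager found")
```

The `which` parameter is injectable only so the lookup can be exercised without depending on the host's PATH.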


Service Inventory

Infrastructure Services

| Service | Image | Ports | Volumes | Purpose |
| --- | --- | --- | --- | --- |
| postgres | postgres:16-alpine | 5432:5432 | pgdata → /var/lib/postgresql/data, ./infra/migrations → /docker-entrypoint-initdb.d | Primary database; migrations auto-applied on first start |
| redis | redis:7-alpine | 6379:6379 | | Queue broker, caching, deduplication |
| minio | minio/minio:latest | 9000:9000 (API), 9001:9001 (console) | miniodata → /data | Object storage for raw artifacts and lakehouse |
| minio-init | minio/mc:latest | | | One-shot init container that creates required buckets |
| ollama | ollama/ollama:latest | 11434:11434 | ollama_models → /root/.ollama | LLM inference server for extraction and classification |
| trino | trinodb/trino:latest | 8080:8080 | ./infra/trino/catalog → /etc/trino/catalog | SQL query engine over the lakehouse |
| hive-metastore | apache/hive:4.0.0 | 9083:9083 | hive_data → /opt/hive/data, ./infra/hive/core-site.xml → /opt/hive/conf/core-site.xml, ./infra/hive/metastore-site.xml → /opt/hive/conf/metastore-site.xml | Iceberg/Hive metadata catalog for Trino |
| superset | apache/superset:latest | 8088:8088 | superset_data → /app/superset_home | BI dashboards over Trino |

Application Services

| Service | Dockerfile | SERVICE_CMD / Command | Ports | Depends On |
| --- | --- | --- | --- | --- |
| scheduler | docker/Dockerfile.scheduler | python -m services.scheduler.app | | postgres (healthy), redis (healthy) |
| symbol-registry | docker/Dockerfile | uvicorn services.symbol_registry.app:app --host 0.0.0.0 --port 8000 | 8001:8000 | postgres (healthy) |
| ingestion | docker/Dockerfile | python -m services.ingestion.worker | | postgres (healthy), redis (healthy), minio (healthy) |
| parser | docker/Dockerfile | python -m services.parser.worker | | postgres (healthy), redis (healthy) |
| extractor | docker/Dockerfile | python -m services.extractor.main | | postgres (healthy), redis (healthy), ollama (started) |
| aggregation | docker/Dockerfile | python -m services.aggregation.main | | postgres (healthy), redis (healthy) |
| recommendation | docker/Dockerfile | python -m services.recommendation.main | | postgres (healthy), redis (healthy) |
| trading-engine | docker/Dockerfile | uvicorn services.trading.app:app --host 0.0.0.0 --port 8000 | 8002:8000 | postgres (healthy), redis (healthy) |
| risk-engine | docker/Dockerfile | uvicorn services.risk.app:app --host 0.0.0.0 --port 8000 | 8003:8000 | postgres (healthy) |
| broker-adapter | docker/Dockerfile | python -m services.adapters.broker_service | | postgres (healthy), redis (healthy) |
| lake-publisher | docker/Dockerfile | python -m services.lake_publisher.jobs | | postgres (healthy), minio (healthy) |
| query-api | docker/Dockerfile | uvicorn services.api.app:app --host 0.0.0.0 --port 8000 | 8004:8000 | postgres (healthy), redis (healthy), minio (healthy) |
| dashboard | frontend/Dockerfile | nginx (built-in) | 3000:8080 | query-api (healthy) |

The risk-engine service has a Docker network alias of risk so the dashboard's nginx reverse proxy can resolve it as http://risk:8000.

Port Summary

Port Service Protocol
3000 Dashboard (React UI) HTTP
5432 PostgreSQL TCP
6379 Redis TCP
8001 Symbol Registry API HTTP
8002 Trading Engine API HTTP
8003 Risk Engine API HTTP
8004 Query API HTTP
8080 Trino HTTP
8088 Superset HTTP
9000 MinIO API HTTP
9001 MinIO Console HTTP
9083 Hive Metastore Thrift
11434 Ollama HTTP

Environment Variables

Shared Application Environment (x-app-env)

All application services inherit these variables via the x-app-env YAML anchor:

Variable Default Description
POSTGRES_HOST postgres PostgreSQL hostname (Docker service name)
POSTGRES_PORT 5432 PostgreSQL port
POSTGRES_DB stonks Database name
POSTGRES_USER stonks Database user
POSTGRES_PASSWORD stonks_dev Database password
REDIS_HOST redis Redis hostname (Docker service name)
REDIS_PORT 6379 Redis port
MINIO_ENDPOINT minio:9000 MinIO API endpoint
MINIO_ACCESS_KEY minioadmin MinIO access key
MINIO_SECRET_KEY minioadmin MinIO secret key
OLLAMA_BASE_URL http://ollama:11434 Ollama LLM server URL
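
As an illustration of how a service might consume these shared variables, the sketch below assembles PostgreSQL and Redis connection URLs from the environment, falling back to the defaults in the table. It is not the code in services/shared/config.py; the function name `connection_urls` and the URL layout are assumptions.

```python
import os
from typing import Dict, Mapping

# Defaults copied from the x-app-env table above.
DEFAULTS = {
    "POSTGRES_HOST": "postgres",
    "POSTGRES_PORT": "5432",
    "POSTGRES_DB": "stonks",
    "POSTGRES_USER": "stonks",
    "POSTGRES_PASSWORD": "stonks_dev",
    "REDIS_HOST": "redis",
    "REDIS_PORT": "6379",
}


def connection_urls(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build Postgres and Redis URLs, using documented defaults for unset keys."""
    cfg = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return {
        "postgres": (
            f"postgresql://{cfg['POSTGRES_USER']}:{cfg['POSTGRES_PASSWORD']}"
            f"@{cfg['POSTGRES_HOST']}:{cfg['POSTGRES_PORT']}/{cfg['POSTGRES_DB']}"
        ),
        # Database 0 is hard-coded here for brevity (hypothetical simplification).
        "redis": f"redis://{cfg['REDIS_HOST']}:{cfg['REDIS_PORT']}/0",
    }
```

Inside the Compose network the hostnames resolve to the service names; from the host machine, set POSTGRES_HOST and REDIS_HOST to localhost, since both ports are published.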

.env File

The .env file is loaded by ingestion, broker-adapter, and trading-engine via the env_file directive. Create it in the repository root:

# Stonks Oracle — Environment Variables
# Loaded by: ingestion, broker-adapter, trading-engine

# ── Required for live data ingestion ──
MARKET_DATA_API_KEY=

# ── Required for paper/live trading ──
BROKER_API_KEY=
BROKER_API_SECRET=
BROKER_BASE_URL=https://paper-api.alpaca.markets

# ── Trading engine settings (optional) ──
TRADING_ENABLED=true
TRADING_RISK_TIER=moderate
TRADING_MAX_OPEN_POSITIONS=15

# ── LLM model (optional) ──
OLLAMA_MODEL=qwen3.5:9b-fast

# ── Signal layers (optional) ──
MACRO_ENABLED=true
COMPETITIVE_ENABLED=true

| Variable | Required | Default | Used By | Description |
| --- | --- | --- | --- | --- |
| MARKET_DATA_API_KEY | No* | (empty) | ingestion | Polygon.io API key for market data fetching |
| BROKER_API_KEY | No* | (empty) | broker-adapter, trading-engine | Alpaca API key |
| BROKER_API_SECRET | No* | (empty) | broker-adapter, trading-engine | Alpaca API secret |
| BROKER_BASE_URL | No | https://paper-api.alpaca.markets | broker-adapter, trading-engine | Alpaca API base URL |

*Services start without these keys but run in degraded mode — ingestion cannot fetch market data and the broker adapter cannot execute trades.
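
Before starting the stack, it can help to check which capabilities will be degraded by a given .env file. The following is a minimal sketch, assuming plain KEY=VALUE lines without quoting or variable interpolation; the helper names and the capability labels are hypothetical and not part of the repository.

```python
from typing import Dict, List

# Keys each capability needs, per the table above.
REQUIRED_KEYS = {
    "market data ingestion": ["MARKET_DATA_API_KEY"],
    "broker trading": ["BROKER_API_KEY", "BROKER_API_SECRET"],
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def degraded_capabilities(env: Dict[str, str]) -> List[str]:
    """Return capabilities that have at least one missing or empty key."""
    return [
        name
        for name, keys in REQUIRED_KEYS.items()
        if any(not env.get(key) for key in keys)
    ]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        for capability in degraded_capabilities(parse_env(handle.read())):
            print(f"degraded: {capability}")
```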

Infrastructure Service Environment

PostgreSQL (postgres):

Variable Value Description
POSTGRES_DB stonks Database created on first start
POSTGRES_USER stonks Superuser for the database
POSTGRES_PASSWORD stonks_dev Password for the database user

MinIO (minio):

Variable Value Description
MINIO_ROOT_USER minioadmin MinIO admin username
MINIO_ROOT_PASSWORD minioadmin MinIO admin password

Trino (trino):

Variable Value Description
MINIO_ACCESS_KEY minioadmin Passed to Trino for MinIO catalog access
MINIO_SECRET_KEY minioadmin Passed to Trino for MinIO catalog access

Hive Metastore (hive-metastore):

Variable Value Description
SERVICE_NAME metastore Tells Hive to run in metastore-only mode
DB_DRIVER derby Embedded Derby database for metadata

Superset (superset):

Variable Value Description
SUPERSET_SECRET_KEY stonks-dev-secret-key-change-me Flask secret key (change in production)
ADMIN_USERNAME admin Initial admin username
ADMIN_PASSWORD admin Initial admin password
ADMIN_EMAIL admin@stonks.local Initial admin email

Additional Configuration Variables

All application services support additional environment variables loaded via services/shared/config.py. These can be added to individual service environment blocks or to the x-app-env anchor as needed:

Variable Default Description
REDIS_DB 0 Redis database number
REDIS_PASSWORD (none) Redis password (not needed in Docker Compose)
MINIO_SECURE false Use HTTPS for MinIO
OLLAMA_MODEL qwen3.5:9b Default LLM model for extraction
OLLAMA_TIMEOUT 120 Ollama request timeout (seconds)
OLLAMA_MAX_RETRIES 2 Max retries for Ollama requests
OLLAMA_RETRY_BASE_DELAY 1.0 Base delay between retries (seconds)
OLLAMA_RETRY_MAX_DELAY 10.0 Maximum delay between retries (seconds)
OLLAMA_RETRY_BACKOFF_MULTIPLIER 2.0 Backoff multiplier for retries
VLLM_BASE_URL http://192.168.42.254:8000 vLLM server URL (if using vLLM instead of Ollama)
VLLM_MODEL RedHatAI/Qwen3.6-35B-A3B-NVFP4 vLLM model name
VLLM_TIMEOUT 120 vLLM request timeout (seconds)
VLLM_MAX_RETRIES 2 Max retries for vLLM requests
VLLM_TEMPERATURE 0.7 vLLM sampling temperature
VLLM_MAX_TOKENS 4096 vLLM max output tokens
VLLM_API_KEY (empty) vLLM API key (if required)
TRINO_HOST localhost Trino hostname
TRINO_PORT 8080 Trino port
TRINO_CATALOG lakehouse Trino catalog name
TRINO_SCHEMA stonks Trino schema name
TRINO_ICEBERG_CATALOG iceberg Trino Iceberg catalog name
MARKET_DATA_BASE_URL https://api.polygon.io Polygon.io base URL
MARKET_DATA_PROVIDER polygon Market data provider
BROKER_MODE paper Broker mode: paper or live
BROKER_PROVIDER alpaca Broker provider
TRADING_ENABLED false Enable autonomous trading engine
TRADING_RISK_TIER moderate Risk tier: conservative, moderate, aggressive
TRADING_POLLING_INTERVAL_SECONDS 60 Recommendation polling interval
TRADING_MAX_OPEN_POSITIONS 10 Maximum concurrent open positions
TRADING_RESERVE_SIPHON_PCT 0.20 Percentage of profits siphoned to reserve pool
TRADING_STOP_LOSS_CHECK_INTERVAL_SECONDS 300 Stop-loss check interval
TRADING_FAST_STOP_LOSS_INTERVAL_SECONDS 60 Fast stop-loss check interval
TRADING_GRADUAL_ENTRY_TRANCHES 3 Number of tranches for gradual entry
TRADING_GRADUAL_ENTRY_THRESHOLD_DOLLARS 30.0 Dollar threshold for gradual entry
TRADING_ABSOLUTE_POSITION_CAP 50.0 Maximum position size (dollars)
TRADING_ACTIVE_POOL_MINIMUM 100.0 Minimum active pool balance
TRADING_EMERGENCY_DRAWDOWN_THRESHOLD_PCT 0.40 Emergency drawdown threshold
TRADING_RESERVE_HIGH_WATER_PCT 0.30 Reserve high-water mark percentage
TRADING_MICRO_TRADING_ENABLED false Enable micro-trading mode
TRADING_MICRO_TRADING_INTERVAL_SECONDS 300 Micro-trading polling interval
TRADING_MICRO_TRADING_ALLOCATION_CAP_PCT 0.03 Micro-trading allocation cap
TRADING_MICRO_TRADING_MAX_DAILY 10 Max micro-trades per day
TRADING_MICRO_TRADING_MAX_HOLD_MINUTES 120 Max micro-trade hold time
TRADING_SNS_TOPIC_ARN (empty) AWS SNS topic ARN for notifications
TRADING_SNS_PHONE_NUMBER (empty) Phone number for SNS notifications
TRADING_GMAIL_SENDER (empty) Gmail sender address for notifications
TRADING_GMAIL_RECIPIENT (empty) Gmail recipient address for notifications
MACRO_ENABLED true Enable macro signal layer
MACRO_SIGNAL_WEIGHT 0.3 Relative weight of macro vs company signals
MACRO_CONFIDENCE_THRESHOLD 0.4 Minimum confidence for macro event inclusion
MACRO_SHORT_TERM_STALENESS_HOURS 48 Hours before short-term events get accelerated decay
PROJECTION_CONFIDENCE_THRESHOLD 0.3 Minimum confidence for projections to influence recommendations
COMPETITIVE_ENABLED true Enable competitive signal layer
COMPETITIVE_SIGNAL_WEIGHT 0.2 Relative weight of competitive signals
COMPETITIVE_PATTERN_CONFIDENCE_THRESHOLD 0.3 Minimum confidence for pattern inclusion
COMPETITIVE_PROPAGATION_STRENGTH_THRESHOLD 0.2 Minimum strength for signal propagation
COMPETITIVE_ROUTINE_LOOKBACK_DAYS 180 Lookback window for routine patterns
COMPETITIVE_MAJOR_DECISION_LOOKBACK_DAYS 365 Lookback window for major decisions
COMPETITIVE_MIN_PATTERN_SAMPLES 3 Minimum samples for pattern matching
COMPETITIVE_MAJOR_DECISION_WEIGHT_MULTIPLIER 1.3 Weight multiplier for major decision patterns
COMPETITIVE_STALENESS_WINDOW_DAYS 180 Window for staleness decay on competitive signals
COMPETITIVE_STALENESS_RECENT_DAYS 90 Days within which signals are considered recent
COMPETITIVE_STALENESS_DECAY_PENALTY 0.5 Decay penalty for stale competitive signals
COMPETITIVE_PROPAGATION_FAILURE_THRESHOLD 5 Consecutive propagation failures before operator alert
ALERT_SOURCE_FAILURE_THRESHOLD 3 Consecutive source failures before alert fires
ALERT_SOURCE_FAILURE_WINDOW_HOURS 6 Lookback window for source failure alerting
ALERT_SCHEMA_FAILURE_RATE_THRESHOLD 0.3 Extraction failure rate (30%) that triggers alert
ALERT_SCHEMA_FAILURE_WINDOW_HOURS 1 Lookback window for schema failure spike
ALERT_LAKE_LAG_THRESHOLD_MINUTES 60 Minutes since last lake publish before alert
ALERT_BROKER_ERROR_THRESHOLD 3 Consecutive broker errors before alert
ALERT_BROKER_ERROR_WINDOW_HOURS 1 Lookback window for broker error alerting
ALERT_CHECK_INTERVAL_SECONDS 120 How often alerting rules are evaluated
RETENTION_RAW_MARKET_DAYS 90 Retention period for raw market data (days)
RETENTION_RAW_NEWS_DAYS 180 Retention period for raw news articles (days)
RETENTION_RAW_FILINGS_DAYS 365 Retention period for raw SEC filings (days)
RETENTION_NORMALIZED_DAYS 180 Retention period for normalized documents (days)
RETENTION_LLM_PROMPTS_DAYS 365 Retention period for LLM prompt archives (days)
RETENTION_LLM_RESULTS_DAYS 365 Retention period for LLM extraction results (days)
RETENTION_LAKEHOUSE_DAYS 730 Retention period for lakehouse Parquet files (days)
RETENTION_AUDIT_DAYS 730 Retention period for audit trail artifacts (days)
RETENTION_CLEANUP_INTERVAL_HOURS 24 How often the retention cleanup worker runs
RETENTION_BATCH_SIZE 1000 Number of objects processed per cleanup batch
LOG_LEVEL INFO Logging level
JSON_LOGS true Enable structured JSON logging
DEPLOY_STAGE (empty) Deployment stage prefix for bucket names

See services/shared/config.py for the complete list of all supported environment variables with their defaults.
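
To make the OLLAMA_RETRY_* settings concrete, the sketch below computes the wait before each retry, assuming a conventional capped exponential backoff (delay = base × multiplier^attempt, limited to the maximum). The actual client may add jitter or differ in other details; `retry_delays` is a hypothetical helper, with parameter defaults taken from the table.

```python
from typing import List


def retry_delays(
    max_retries: int = 2,       # OLLAMA_MAX_RETRIES
    base_delay: float = 1.0,    # OLLAMA_RETRY_BASE_DELAY
    max_delay: float = 10.0,    # OLLAMA_RETRY_MAX_DELAY
    multiplier: float = 2.0,    # OLLAMA_RETRY_BACKOFF_MULTIPLIER
) -> List[float]:
    """Seconds to wait before each retry, capped at max_delay."""
    return [
        min(base_delay * multiplier ** attempt, max_delay)
        for attempt in range(max_retries)
    ]


if __name__ == "__main__":
    print(retry_delays())               # defaults from the table
    print(retry_delays(max_retries=5))  # shows where the cap takes effect
```

Under these assumptions the defaults give waits of 1 s and 2 s; with five retries the last wait is capped at 10 s rather than growing to 16 s.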


LLM Provider Configuration

Stonks Oracle supports two LLM backends: Ollama (local, self-hosted) and vLLM (high-performance inference server). The active provider is configured per-agent in the ai_agents database table, but the connection details come from environment variables.

Option A: Bundled Ollama (default)

The docker-compose.yml includes an Ollama container with GPU passthrough via the NVIDIA Container Toolkit. On first start, pull a model:

docker compose exec ollama ollama pull qwen3.5:9b-fast

No additional configuration needed — services connect to http://ollama:11434 by default.

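The Ollama container requests all available NVIDIA GPUs via the deploy.resources.reservations.devices configuration. Ollama itself falls back to CPU inference (significantly slower) when it finds no GPU, but Docker generally refuses to start a container that carries an NVIDIA device reservation on a host without the NVIDIA Container Toolkit, so on a CPU-only host that reservation likely needs to be removed first.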

Option B: External Ollama

If Ollama is already running on the host (e.g. with GPU access), create a docker-compose.override.yml:

services:
  ollama:
    entrypoint: ["true"]
    restart: "no"
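    ports: []  # Compose concatenates list values across files, so this alone may not drop the inherited 11434 mapping; recent Compose releases accept "ports: !reset []"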
  extractor:
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    environment:
      OLLAMA_BASE_URL: "http://host.docker.internal:11434"
    extra_hosts:
      - "host.docker.internal:host-gateway"
  recommendation:
    environment:
      OLLAMA_BASE_URL: "http://host.docker.internal:11434"
    extra_hosts:
      - "host.docker.internal:host-gateway"

This disables the bundled Ollama container and routes services to the host's instance. Replace the port if your Ollama runs on a non-standard port. For a remote Ollama instance (not on localhost), replace host.docker.internal with the remote IP and remove the extra_hosts block.

Option C: vLLM Server

For higher throughput or quantized models (e.g. RedHatAI/Qwen3.6-35B-A3B-NVFP4), point services at a vLLM server. Add to your .env:

VLLM_BASE_URL=http://192.168.42.254:8000
VLLM_MODEL=RedHatAI/Qwen3.6-35B-A3B-NVFP4
VLLM_TIMEOUT=120
VLLM_TEMPERATURE=0.7

Then update the ai_agents table to use the vLLM provider:

UPDATE ai_agents SET model_provider = 'vllm', model_name = 'RedHatAI/Qwen3.6-35B-A3B-NVFP4' WHERE active = true;

Or use the API:

curl -X PUT http://localhost:8004/api/admin/agents/document-extractor \
  -H 'Content-Type: application/json' \
  -d '{"model_provider": "vllm", "model_name": "RedHatAI/Qwen3.6-35B-A3B-NVFP4"}'

Option D: Mixed (Ollama + vLLM)

You can run different agents on different providers. For example, use vLLM for the high-volume extractor and Ollama for the thesis rewriter:

UPDATE ai_agents SET model_provider = 'vllm', model_name = 'RedHatAI/Qwen3.6-35B-A3B-NVFP4' WHERE slug = 'document-extractor';
UPDATE ai_agents SET model_provider = 'vllm', model_name = 'RedHatAI/Qwen3.6-35B-A3B-NVFP4' WHERE slug = 'event-classifier';
UPDATE ai_agents SET model_provider = 'ollama', model_name = 'qwen3.5:9b-fast' WHERE slug = 'thesis-rewriter';

Both OLLAMA_BASE_URL and VLLM_BASE_URL must be set in the environment for mixed mode.
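
The requirement above can be expressed as a small lookup: each agent's model_provider selects which base-URL variable must be present. This is a sketch only; `base_url_for` and the mapping are hypothetical and do not mirror the platform's actual client code.

```python
import os
from typing import Mapping

# Environment variable that holds the base URL for each provider.
PROVIDER_URL_VARS = {"ollama": "OLLAMA_BASE_URL", "vllm": "VLLM_BASE_URL"}


def base_url_for(provider: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the configured base URL for a provider, or raise if it is unset."""
    try:
        var = PROVIDER_URL_VARS[provider]
    except KeyError:
        raise ValueError(f"unknown model_provider: {provider!r}") from None
    url = env.get(var, "")
    if not url:
        raise RuntimeError(f"{var} must be set to use the {provider} provider")
    return url
```

Failing fast on a missing variable surfaces a misconfigured mixed-mode deployment at startup instead of on the first request to the unset backend.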

Automated Deployment

The deploy-docker.sh script handles LLM configuration automatically. By default it uses the Docker Ollama container with GPU passthrough (NVIDIA Container Toolkit):

# Deploy with defaults (Docker Ollama, GPU-accelerated)
bash deploy-docker.sh

# Specify a custom model
bash deploy-docker.sh --ollama-model qwen3.6

# Specify a different host and directory
bash deploy-docker.sh --host user@myserver --dir /opt/stonks

If an external Ollama URL is provided via --ollama-url, the script creates a docker-compose.override.yml that disables the bundled container and routes services to the external instance.


Volume Mounts and Data Persistence

Docker Compose defines five named volumes for persistent data:

| Volume | Mounted By | Mount Path | Contents |
| --- | --- | --- | --- |
| pgdata | postgres | /var/lib/postgresql/data | PostgreSQL database files |
| miniodata | minio | /data | MinIO object storage (raw artifacts, lakehouse Parquet files) |
| ollama_models | ollama | /root/.ollama | Downloaded LLM model weights |
| hive_data | hive-metastore | /opt/hive/data | Hive metastore Derby database |
| superset_data | superset | /app/superset_home | Superset configuration and metadata |

Bind Mounts

In addition to named volumes, several services use bind mounts for configuration:

| Service | Host Path | Container Path | Mode | Purpose |
| --- | --- | --- | --- | --- |
| postgres | ./infra/migrations | /docker-entrypoint-initdb.d | rw | SQL migrations auto-applied on first start |
| trino | ./infra/trino/catalog | /etc/trino/catalog | rw | Trino catalog configuration (lakehouse, iceberg) |
| hive-metastore | ./infra/hive/core-site.xml | /opt/hive/conf/core-site.xml | ro | Hadoop core-site config for MinIO access |
| hive-metastore | ./infra/hive/metastore-site.xml | /opt/hive/conf/metastore-site.xml | ro | Hive metastore config |

Resetting Data

To destroy all persistent data and start fresh:

# Stop all containers and remove named volumes
docker compose down -v

This removes pgdata, miniodata, ollama_models, hive_data, and superset_data. The next docker compose up re-initializes PostgreSQL with migrations and re-creates the MinIO buckets (via minio-init). Ollama models are not restored automatically; pull them again with docker compose exec ollama ollama pull, as in the Quick Start.

To reset only specific volumes:

docker compose down
docker volume rm stonks-oracle_pgdata    # Reset database only
docker compose up -d

Note: Volume names are prefixed with the Compose project name, which defaults to the project directory name (e.g., stonks-oracle_pgdata). Use docker volume ls to see exact names.
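
The naming pattern can be summarized in two helper functions. This is a sketch that assumes the default project name (the directory name, here stonks-oracle) and no explicit name or container_name overrides in docker-compose.yml; setting COMPOSE_PROJECT_NAME or passing -p changes the prefix. The function names are illustrative.

```python
def volume_name(project: str, volume: str) -> str:
    """Named volumes are created as <project>_<volume>."""
    return f"{project}_{volume}"


def container_name(project: str, service: str, index: int = 1) -> str:
    """Compose v2 containers are named <project>-<service>-<replica index>."""
    return f"{project}-{service}-{index}"


if __name__ == "__main__":
    print(volume_name("stonks-oracle", "pgdata"))
    print(container_name("stonks-oracle", "query-api"))
```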


Health Checks

Every application service has a health check configured, as do the postgres, redis, and minio infrastructure services (ollama does not). Docker Compose uses these checks to enforce startup ordering via depends_on with condition: service_healthy.

Infrastructure Health Checks

| Service | Test Command | Interval | Retries |
| --- | --- | --- | --- |
| postgres | pg_isready -U stonks | 5s | 5 |
| redis | redis-cli ping | 5s | 5 |
| minio | mc ready local | 5s | 5 |

Application Health Checks — FastAPI Services

FastAPI services (symbol-registry, trading-engine, risk-engine, query-api) use HTTP health endpoints. The nginx-based dashboard is checked the same way, against its root path:

| Service | Test Command | Interval | Timeout | Retries | Start Period |
| --- | --- | --- | --- | --- | --- |
| symbol-registry | curl -f http://localhost:8000/health | 10s | 5s | 3 | 15s |
| trading-engine | curl -f http://localhost:8000/health | 10s | 5s | 3 | 15s |
| risk-engine | curl -f http://localhost:8000/health | 10s | 5s | 3 | 15s |
| query-api | curl -f http://localhost:8000/health | 10s | 5s | 3 | 15s |
| dashboard | curl -f http://localhost:8080/ | 10s | 5s | 3 | 10s |

Application Health Checks — Worker Services

Worker services (no HTTP endpoint) use process liveness checks:

| Service | Test Command | Interval | Timeout | Retries | Start Period |
| --- | --- | --- | --- | --- | --- |
| scheduler | pgrep -f 'python -m services.scheduler.app' | 10s | 5s | 3 | 15s |
| ingestion | pgrep -f 'python -m services.ingestion.worker' | 10s | 5s | 3 | 15s |
| parser | pgrep -f 'python -m services.parser.worker' | 10s | 5s | 3 | 15s |
| extractor | pgrep -f 'python -m services.extractor.main' | 10s | 5s | 3 | 15s |
| aggregation | pgrep -f 'python -m services.aggregation.main' | 10s | 5s | 3 | 15s |
| recommendation | pgrep -f 'python -m services.recommendation.main' | 10s | 5s | 3 | 15s |
| broker-adapter | pgrep -f 'python -m services.adapters.broker_service' | 10s | 5s | 3 | 15s |
| lake-publisher | pgrep -f 'python -m services.lake_publisher.jobs' | 10s | 5s | 3 | 15s |

Verifying Service Health

# Check all service statuses
docker compose ps

# Check a specific service
docker compose ps query-api

# Inspect health check details for a container
docker inspect --format='{{json .State.Health}}' stonks-oracle-query-api-1 | python -m json.tool

# Wait for all services to be healthy
docker compose up -d --wait
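
For scripted checks, the output of docker compose ps --format json can be filtered for services that are not yet healthy. The sketch below assumes the Service, State, and Health fields emitted by recent Compose v2 releases, and accepts both a single JSON array and one JSON object per line, since the layout has differed between releases; `unhealthy_services` is a hypothetical helper.

```python
import json
import subprocess
from typing import List


def unhealthy_services(ps_json: str) -> List[str]:
    """List services that are not running, or whose health status is not healthy."""
    text = ps_json.strip()
    if not text:
        return []
    if text.startswith("["):
        rows = json.loads(text)  # layout A: one JSON array
    else:
        # layout B: one JSON object per line
        rows = [json.loads(line) for line in text.splitlines() if line.strip()]
    bad = []
    for row in rows:
        state = row.get("State", "")
        health = row.get("Health", "")
        # An empty Health value means the container defines no health check.
        if state != "running" or health not in ("", "healthy"):
            bad.append(row.get("Service") or row.get("Name", "?"))
    return sorted(bad)


if __name__ == "__main__":
    result = subprocess.run(
        ["docker", "compose", "ps", "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    print(unhealthy_services(result.stdout) or "all listed services look healthy")
```

One-shot containers such as minio-init only appear when ps is run with -a, and would then be reported as not running even after a successful exit.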

Dockerfile Build Details

docker/Dockerfile — Generic Python Service Image

Used by all Python application services except the scheduler (the dashboard builds from frontend/Dockerfile). Accepts a SERVICE_CMD build argument that determines which service the container runs.

Base image: python:3.12-slim (via Harbor proxy cache in CI)

Build arguments:

| Argument | Default | Description |
| --- | --- | --- |
| SERVICE_CMD | python -m services.scheduler.app | The command executed when the container starts |
| CACHE_BUST | (none) | Optional cache-busting argument to force rebuild of source layers |

What gets copied:

  • requirements.txt → pip dependencies installed
  • services/ → all service source code
  • scripts/ → operational scripts
  • tests/ → test files (available for in-container testing)
  • conftest.py → pytest configuration

Environment variables set:

  • PYTHONDONTWRITEBYTECODE=1 — no .pyc files
  • PYTHONUNBUFFERED=1 — unbuffered stdout/stderr for log visibility
  • PYTHONPATH=/app — ensures services.* imports resolve

System packages installed: gcc, libpq-dev (PostgreSQL client library), curl (for health checks)

Security: Runs as non-root user stonks (UID 1000).

How SERVICE_CMD works: The CMD directive is sh -c "${SERVICE_CMD}", so the build argument becomes the runtime command. Each service in docker-compose.yml overrides this via the args.SERVICE_CMD build parameter:

query-api:
  build:
    context: .
    dockerfile: docker/Dockerfile
    args:
      SERVICE_CMD: "uvicorn services.api.app:app --host 0.0.0.0 --port 8000"

docker/Dockerfile.scheduler — Scheduler Image

A specialized variant of the generic Dockerfile used only by the scheduler service. Adds postgresql-client for running database migrations via psql.

Additional contents:

  • infra/migrations/ → copied to /app/infra/migrations/ for migration execution
  • postgresql-client system package installed

Command: Hardcoded CMD ["python", "-m", "services.scheduler.app"] (no SERVICE_CMD argument).

docker/Dockerfile.superset — Custom Superset Image

Extends the official Apache Superset image with additional database drivers.

Base image: apache/superset:latest (via Harbor proxy cache in CI)

Additional packages: trino[sqlalchemy], psycopg2-binary, redis

frontend/Dockerfile — Dashboard Image

Multi-stage build for the React dashboard.

Stage 1 — Build (base: node:24-alpine):

Build Argument Default Description
VITE_QUERY_API_URL "" Query API base URL (empty = use relative /api/ proxy)
VITE_SYMBOL_REGISTRY_URL "" Symbol Registry base URL (empty = use relative /registry/ proxy)
VITE_RISK_ENGINE_URL "" Risk Engine base URL (empty = use relative /risk/ proxy)

Stage 2 — Serve (base: nginxinc/nginx-unprivileged:alpine):

  • Serves the built static files on port 8080
  • Uses frontend/nginx.conf for SPA fallback and API reverse proxying
  • Proxies /api/ → query-api:8000, /registry/ → symbol-registry:8000, /risk/ → risk:8000, /trading/ → trading-engine:8000
  • SSE stream endpoint (/api/ops/pipeline/stream) has buffering disabled for real-time delivery
  • Static assets under /assets/ are cached with 1-year expiry

Building Custom Images

To build a single service image locally:

# Build the query-api image
docker compose build query-api

# Build with a custom SERVICE_CMD
docker build -t my-custom-service \
  --build-arg SERVICE_CMD="python -m services.my_service.main" \
  -f docker/Dockerfile .

# Build the dashboard with custom API URLs
docker build -t my-dashboard \
  --build-arg VITE_QUERY_API_URL="https://api.example.com" \
  -f frontend/Dockerfile frontend/

# Rebuild all images
docker compose build

# Rebuild without cache (force fresh build)
docker compose build --no-cache

Dependency Ordering

Docker Compose enforces startup order using depends_on with health check conditions. The dependency graph is:

postgres (healthy) ──┬── scheduler
                     ├── symbol-registry
                     ├── ingestion
                     ├── parser
                     ├── extractor
                     ├── aggregation
                     ├── recommendation
                     ├── trading-engine
                     ├── risk-engine
                     ├── broker-adapter
                     ├── lake-publisher
                     └── query-api

redis (healthy) ─────┬── scheduler
                     ├── ingestion
                     ├── parser
                     ├── extractor
                     ├── aggregation
                     ├── recommendation
                     ├── trading-engine
                     ├── broker-adapter
                     └── query-api

minio (healthy) ─────┬── minio-init
                     ├── ingestion
                     ├── lake-publisher
                     └── query-api

ollama (started) ────── extractor

minio ───────────────── trino
hive-metastore ─────── trino
trino ──────────────── superset (via depends_on)

query-api (healthy) ── dashboard

Services with condition: service_healthy wait until the dependency's health check passes. The extractor depends on ollama with condition: service_started (no health check — Ollama may take time to load models).
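
The graph above implies a startup order that can be derived with a topological sort. The sketch below transcribes the edges into a dictionary and groups services into waves using Python's standard graphlib module (Python 3.9+). It only models ordering; it does not distinguish healthy from started conditions, and `startup_waves` is an illustrative name.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# service -> services it depends on, transcribed from the graph above
DEPENDS_ON: Dict[str, Set[str]] = {
    "scheduler": {"postgres", "redis"},
    "symbol-registry": {"postgres"},
    "ingestion": {"postgres", "redis", "minio"},
    "parser": {"postgres", "redis"},
    "extractor": {"postgres", "redis", "ollama"},
    "aggregation": {"postgres", "redis"},
    "recommendation": {"postgres", "redis"},
    "trading-engine": {"postgres", "redis"},
    "risk-engine": {"postgres"},
    "broker-adapter": {"postgres", "redis"},
    "lake-publisher": {"postgres", "minio"},
    "query-api": {"postgres", "redis", "minio"},
    "minio-init": {"minio"},
    "trino": {"minio", "hive-metastore"},
    "superset": {"trino"},
    "dashboard": {"query-api"},
}


def startup_waves(graph: Dict[str, Set[str]] = DEPENDS_ON) -> List[List[str]]:
    """Group services into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(startup_waves(), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

With the edges as transcribed, the infrastructure services form the first wave, most application services plus minio-init and trino the second, and dashboard and superset the last.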


Operational Commands

Starting Services

# Start all services in the background
docker compose up -d

# Start all services and wait for health checks
docker compose up -d --wait

# Start only infrastructure (useful for local development)
docker compose up -d postgres redis minio minio-init ollama

# Start a specific service and its dependencies
docker compose up -d query-api

Stopping Services

# Stop all services (preserves volumes)
docker compose down

# Stop all services and remove volumes (full reset)
docker compose down -v

# Stop a specific service
docker compose stop trading-engine

Restarting Services

# Restart a specific service
docker compose restart query-api

# Restart with a fresh build
docker compose up -d --build query-api

# Force recreate a service (picks up compose file changes)
docker compose up -d --force-recreate query-api

Viewing Logs

# Follow logs for all services
docker compose logs -f

# Follow logs for a specific service
docker compose logs -f query-api

# View last 50 lines of a service's logs
docker compose logs --tail=50 ingestion

# View logs for multiple services
docker compose logs -f scheduler ingestion extractor

Scaling Replicas

# Scale a worker service to 3 replicas
docker compose up -d --scale ingestion=3

# Scale multiple services
docker compose up -d --scale ingestion=3 --scale extractor=2

# Scale back to 1
docker compose up -d --scale ingestion=1

Note: Scaling works best for worker services (ingestion, parser, extractor, aggregation, recommendation, broker-adapter, lake-publisher) that consume from Redis queues. Do not scale FastAPI services that expose host ports without adjusting port mappings, because each replica would try to bind the same host port.
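
A pre-flight check for the port constraint can be as simple as comparing the requested replica counts against the services that publish a fixed host port. The mapping below is taken from the Application Services table; `scaling_conflicts` is a hypothetical helper, not part of the repository.

```python
from typing import Dict, List

# Host ports published by application services (Application Services table).
HOST_PORTS = {
    "symbol-registry": 8001,
    "trading-engine": 8002,
    "risk-engine": 8003,
    "query-api": 8004,
    "dashboard": 3000,
}


def scaling_conflicts(scale: Dict[str, int]) -> List[str]:
    """Services asked to run more than one replica while bound to a fixed host port."""
    return sorted(
        service
        for service, replicas in scale.items()
        if replicas > 1 and service in HOST_PORTS
    )


if __name__ == "__main__":
    print(scaling_conflicts({"ingestion": 3, "query-api": 2}))
```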

Inspecting Services

# List all services and their status
docker compose ps

# List the processes running in each container (use docker stats for CPU and memory usage)
docker compose top

# Execute a command inside a running container
docker compose exec query-api python -c "from services.shared.config import load_config; print(load_config())"

# Open a psql session in the postgres container
docker compose exec postgres psql -U stonks -d stonks

# Seed the database
docker compose exec scheduler python -m services.symbol_registry.seed

Full Reset

# Nuclear option: stop everything, remove volumes, rebuild, restart
docker compose down -v
docker compose build --no-cache
docker compose up -d

This destroys all data (database, object storage, model weights, metastore, Superset config) and starts from scratch. PostgreSQL migrations are re-applied automatically. MinIO buckets are re-created by minio-init. Ollama models must be re-downloaded.


MinIO Bucket Initialization

The minio-init service runs once on startup and creates the required object storage buckets:

| Bucket | Purpose |
|--------|---------|
| stonks-raw-market | Raw market data from Polygon.io |
| stonks-raw-news | Raw news articles |
| stonks-raw-filings | Raw SEC filings |
| stonks-normalized | Normalized/parsed documents |
| stonks-llm-prompts | LLM prompt archives |
| stonks-llm-results | LLM extraction results |
| stonks-lakehouse | Parquet fact tables for Trino |
| stonks-audit | Audit trail artifacts |

Access the MinIO console at http://localhost:9001 (credentials: minioadmin / minioadmin).


Dashboard Reverse Proxy

The dashboard container runs nginx with reverse proxy rules that route API requests to backend services using Docker Compose service names:

| Path | Proxied To | Service |
|------|------------|---------|
| /api/ | http://query-api:8000 | Query API |
| /api/ops/pipeline/stream | http://query-api:8000 (SSE, no buffering) | Query API (real-time pipeline stream) |
| /registry/ | http://symbol-registry:8000/ | Symbol Registry API |
| /risk/ | http://risk:8000/ | Risk Engine (via network alias) |
| /trading/ | http://trading-engine:8000/ | Trading Engine API |

The risk-engine service has a network alias of risk in docker-compose.yml so the nginx upstream resolves correctly.

All other paths serve the React SPA with try_files fallback to index.html. Static assets under /assets/ are served with 1-year cache headers.
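To reason about which rule handles a given request, the routing above can be modeled as a longest-prefix match. This is a simplified sketch that assumes the nginx config uses plain prefix `location` blocks (the actual config may also use exact-match or regex locations); the `resolve` function and the example request paths are hypothetical.

```python
# Path prefixes and upstreams copied from the routing table above.
ROUTES = {
    "/api/ops/pipeline/stream": "http://query-api:8000",
    "/api/": "http://query-api:8000",
    "/registry/": "http://symbol-registry:8000/",
    "/risk/": "http://risk:8000/",
    "/trading/": "http://trading-engine:8000/",
}
SPA_FALLBACK = "index.html"


def resolve(path: str) -> str:
    """Return the upstream for a request path, or the SPA fallback."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not matches:
        # No proxy rule applies: the React SPA is served instead.
        return SPA_FALLBACK
    # Among prefix locations, the longest matching prefix wins.
    return ROUTES[max(matches, key=len)]


for example in ("/api/ops/pipeline/stream", "/risk/limits", "/portfolio"):
    print(example, "->", resolve(example))
```

The longest-prefix rule is what lets /api/ops/pipeline/stream carry its own SSE settings even though it also falls under /api/.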

Security headers applied: X-Frame-Options: SAMEORIGIN, X-Content-Type-Options: nosniff, Referrer-Policy: strict-origin-when-cross-origin.
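A quick way to confirm these headers after a deployment is to compare a response's headers against the expected values. The sketch below is a hypothetical check (`header_problems` is not part of the repository); it takes a plain mapping of headers so it works with whichever HTTP client fetched the dashboard page.

```python
from typing import Mapping, Optional

# Expected values, taken from the list of security headers above.
EXPECTED_HEADERS = {
    "X-Frame-Options": "SAMEORIGIN",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}


def header_problems(response_headers: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Return headers that are missing or wrong, mapped to the value received."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    actual = {name.lower(): value.strip() for name, value in response_headers.items()}
    problems = {}
    for name, expected in EXPECTED_HEADERS.items():
        received = actual.get(name.lower())
        if received != expected:
            problems[name] = received
    return problems


# Illustrative response that lacks Referrer-Policy.
print(header_problems({
    "x-frame-options": "SAMEORIGIN",
    "X-Content-Type-Options": "nosniff",
}))
```

An empty result means all three headers are present with the documented values.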


Troubleshooting

Service won't start

Check dependency health:

docker compose ps postgres redis minio

If infrastructure services are unhealthy, application services will wait indefinitely. Check infrastructure logs:

docker compose logs postgres

Database migration errors

Migrations in ./infra/migrations/ are applied by PostgreSQL's docker-entrypoint-initdb.d mechanism, which only runs on first database initialization. If you need to re-run migrations:

docker compose down -v   # Removes ALL volumes, including pgdata (destroys all data)
docker compose up -d     # Migrations re-applied on fresh init

Ollama model not available

The extractor service needs an LLM model loaded. Pull a model manually:

# If using bundled Ollama container:
docker compose exec ollama ollama pull qwen3.5:9b-fast

# If using host Ollama:
ollama pull qwen3.5:9b-fast

# If using vLLM, ensure the model is loaded on the vLLM server
curl http://your-vllm-host:8000/v1/models

Ollama port conflict (address already in use)

If Ollama is already running on the host, the bundled container will fail to bind port 11434. Use the external Ollama configuration described in the "LLM Provider Configuration" section above, or use deploy-docker.sh which handles this automatically.
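To confirm the conflict before starting the stack, it is enough to test whether something already accepts connections on the port. This is a minimal sketch using only the standard library; the helper name `port_in_use` is hypothetical, and it checks the loopback address only.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP listener accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if port_in_use(11434):
    print("Port 11434 is taken: a host Ollama is probably running.")
else:
    print("Port 11434 is free: the bundled Ollama container can bind it.")
```

The same check applies to any other host port in docker-compose.yml that fails with "address already in use".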

GPU not detected by Ollama container

Ensure the NVIDIA Container Toolkit is installed and Docker is configured:

# Verify GPU passthrough works
docker run --rm --gpus all nvidia/cuda:12.8.0-base-ubuntu24.04 nvidia-smi

# If it fails, reconfigure Docker runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Port conflicts

If a port is already in use, modify the host port mapping in docker-compose.yml:

query-api:
  ports:
    - "9004:8000"   # Changed from 8004 to 9004

Container runs out of memory

The full stack requires at least 16 GB RAM. If services are being OOM-killed:

# Check which containers are using the most memory
docker stats --no-stream

# Reduce memory usage by stopping non-essential services
docker compose stop trino hive-metastore superset
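When many containers are running, it can help to rank them by memory before deciding what to stop. The sketch below parses the output of `docker stats --no-stream --format "{{.Name}}\t{{.MemUsage}}"`; the helper names and the sample rows are illustrative assumptions, not measurements from this stack.

```python
import re

_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_mem(value: str) -> float:
    """Convert a size such as '1.5GiB' to bytes."""
    match = re.fullmatch(r"([\d.]+)\s*([KMGT]iB|B)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit]


def top_memory(stats_output: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (container, memory used) pairs, largest consumer first."""
    rows = []
    for line in stats_output.splitlines():
        if not line.strip():
            continue
        name, mem_usage = line.split("\t")
        # MemUsage looks like "used / limit"; keep only the used part.
        rows.append((name, mem_usage.split("/")[0].strip()))
    rows.sort(key=lambda row: parse_mem(row[1]), reverse=True)
    return rows[:limit]


# Illustrative sample in the tab-separated format requested above.
SAMPLE = "postgres\t310MiB / 15.6GiB\ntrino\t2.1GiB / 15.6GiB\nredis\t12.5MiB / 15.6GiB\n"
for name, used in top_memory(SAMPLE):
    print(f"{name:<12} {used}")
```

To use it against the live stack, capture the command's output with `subprocess.run(..., capture_output=True, text=True)` and pass `stdout` to `top_memory`.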