# Docker Deployment Guide
This guide covers running the full Stonks Oracle platform locally using Docker Compose. It documents every service, environment variable, volume mount, health check, and operational command.
## Prerequisites
- Docker Engine 24+ and Docker Compose v2
- At least 16 GB RAM (Ollama + Trino + all services)
- API keys for Polygon.io and Alpaca (optional — platform runs in degraded mode without them)
## Quick Start
```bash
# 1. Clone the repository
git clone <repo-url> && cd stonks-oracle
# 2. Configure API keys
cp .env.example .env # or edit the existing .env
# Fill in MARKET_DATA_API_KEY, BROKER_API_KEY, BROKER_API_SECRET
# 3. Start everything
docker compose up -d
# 4. Verify all services are healthy
docker compose ps
# 5. Access the dashboard
open http://localhost:3000   # macOS; on Linux, use xdg-open instead
```
---
## Service Inventory
### Infrastructure Services
| Service | Image | Ports | Volumes | Purpose |
|---------|-------|-------|---------|---------|
| `postgres` | `postgres:16-alpine` | `5432:5432` | `pgdata` → `/var/lib/postgresql/data`, `./infra/migrations` → `/docker-entrypoint-initdb.d` | Primary database; migrations auto-applied on first start |
| `redis` | `redis:7-alpine` | `6379:6379` | — | Queue broker, caching, deduplication |
| `minio` | `minio/minio:latest` | `9000:9000` (API), `9001:9001` (console) | `miniodata` → `/data` | Object storage for raw artifacts and lakehouse |
| `minio-init` | `minio/mc:latest` | — | — | One-shot init container that creates required buckets |
| `ollama` | `ollama/ollama:latest` | `11434:11434` | `ollama_models` → `/root/.ollama` | LLM inference server for extraction and classification |
| `trino` | `trinodb/trino:latest` | `8080:8080` | `./infra/trino/catalog` → `/etc/trino/catalog` | SQL query engine over the lakehouse |
| `hive-metastore` | `apache/hive:4.0.0` | `9083:9083` | `hive_data` → `/opt/hive/data`, `./infra/hive/core-site.xml` → `/opt/hive/conf/core-site.xml`, `./infra/hive/metastore-site.xml` → `/opt/hive/conf/metastore-site.xml` | Iceberg/Hive metadata catalog for Trino |
| `superset` | `apache/superset:latest` | `8088:8088` | `superset_data` → `/app/superset_home` | BI dashboards over Trino |
### Application Services
| Service | Dockerfile | `SERVICE_CMD` / Command | Ports | Depends On |
|---------|-----------|------------------------|-------|------------|
| `scheduler` | `docker/Dockerfile.scheduler` | `python -m services.scheduler.app` | — | postgres (healthy), redis (healthy) |
| `symbol-registry` | `docker/Dockerfile` | `uvicorn services.symbol_registry.app:app --host 0.0.0.0 --port 8000` | `8001:8000` | postgres (healthy) |
| `ingestion` | `docker/Dockerfile` | `python -m services.ingestion.worker` | — | postgres (healthy), redis (healthy), minio (healthy) |
| `parser` | `docker/Dockerfile` | `python -m services.parser.worker` | — | postgres (healthy), redis (healthy) |
| `extractor` | `docker/Dockerfile` | `python -m services.extractor.main` | — | postgres (healthy), redis (healthy), ollama (started) |
| `aggregation` | `docker/Dockerfile` | `python -m services.aggregation.main` | — | postgres (healthy), redis (healthy) |
| `recommendation` | `docker/Dockerfile` | `python -m services.recommendation.main` | — | postgres (healthy), redis (healthy) |
| `trading-engine` | `docker/Dockerfile` | `uvicorn services.trading.app:app --host 0.0.0.0 --port 8000` | `8002:8000` | postgres (healthy), redis (healthy) |
| `risk-engine` | `docker/Dockerfile` | `uvicorn services.risk.app:app --host 0.0.0.0 --port 8000` | `8003:8000` | postgres (healthy) |
| `broker-adapter` | `docker/Dockerfile` | `python -m services.adapters.broker_service` | — | postgres (healthy), redis (healthy) |
| `lake-publisher` | `docker/Dockerfile` | `python -m services.lake_publisher.jobs` | — | postgres (healthy), minio (healthy) |
| `query-api` | `docker/Dockerfile` | `uvicorn services.api.app:app --host 0.0.0.0 --port 8000` | `8004:8000` | postgres (healthy), redis (healthy), minio (healthy) |
| `dashboard` | `frontend/Dockerfile` | nginx (built-in) | `3000:8080` | query-api (healthy) |
### Port Summary
| Port | Service | Protocol |
|------|---------|----------|
| 3000 | Dashboard (React UI) | HTTP |
| 5432 | PostgreSQL | TCP |
| 6379 | Redis | TCP |
| 8001 | Symbol Registry API | HTTP |
| 8002 | Trading Engine API | HTTP |
| 8003 | Risk Engine API | HTTP |
| 8004 | Query API | HTTP |
| 8080 | Trino | HTTP |
| 8088 | Superset | HTTP |
| 9000 | MinIO API | HTTP |
| 9001 | MinIO Console | HTTP |
| 9083 | Hive Metastore | Thrift |
| 11434 | Ollama | HTTP |
---
## Environment Variables
### Shared Application Environment (`x-app-env`)
All application services inherit these variables via the `x-app-env` YAML anchor:
| Variable | Default | Description |
|----------|---------|-------------|
| `POSTGRES_HOST` | `postgres` | PostgreSQL hostname (Docker service name) |
| `POSTGRES_PORT` | `5432` | PostgreSQL port |
| `POSTGRES_DB` | `stonks` | Database name |
| `POSTGRES_USER` | `stonks` | Database user |
| `POSTGRES_PASSWORD` | `stonks_dev` | Database password |
| `REDIS_HOST` | `redis` | Redis hostname (Docker service name) |
| `REDIS_PORT` | `6379` | Redis port |
| `MINIO_ENDPOINT` | `minio:9000` | MinIO API endpoint |
| `MINIO_ACCESS_KEY` | `minioadmin` | MinIO access key |
| `MINIO_SECRET_KEY` | `minioadmin` | MinIO secret key |
| `OLLAMA_BASE_URL` | `http://ollama:11434` | Ollama LLM server URL |
### `.env` File
The `.env` file is loaded by `ingestion`, `broker-adapter`, and `trading-engine` via the `env_file` directive. Create it in the repository root:
```dotenv
# Stonks Oracle — Environment Variables
# These are loaded by ingestion, broker-adapter, and trading-engine services.
# Polygon.io market data API key (required for live data ingestion)
MARKET_DATA_API_KEY=
# Alpaca broker credentials (required for paper/live trading)
BROKER_API_KEY=
BROKER_API_SECRET=
BROKER_BASE_URL=https://paper-api.alpaca.markets
```
| Variable | Required | Default | Used By | Description |
|----------|----------|---------|---------|-------------|
| `MARKET_DATA_API_KEY` | No* | (empty) | ingestion | Polygon.io API key for market data fetching |
| `BROKER_API_KEY` | No* | (empty) | broker-adapter, trading-engine | Alpaca API key |
| `BROKER_API_SECRET` | No* | (empty) | broker-adapter, trading-engine | Alpaca API secret |
| `BROKER_BASE_URL` | No | `https://paper-api.alpaca.markets` | broker-adapter, trading-engine | Alpaca API base URL |
*Services start without these keys but run in degraded mode — ingestion cannot fetch market data and the broker adapter cannot execute trades.
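Before starting the stack, it can help to see which of these keys are still empty. The following is a minimal sketch, not part of the repository: `parse_dotenv` and `degraded_reasons` are hypothetical helper names, and the parser only handles simple `KEY=VALUE` lines like the ones shown in the `.env` template.

```python
# Hypothetical pre-flight check for the .env file described above.
# The key names and their effects are taken from the table in this section.
REQUIRED_FOR_FULL_MODE = {
    "MARKET_DATA_API_KEY": "ingestion cannot fetch market data",
    "BROKER_API_KEY": "broker adapter cannot execute trades",
    "BROKER_API_SECRET": "broker adapter cannot execute trades",
}


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def degraded_reasons(env: dict[str, str]) -> list[str]:
    """Return one message per key that is missing or empty."""
    return [
        f"{key} is empty: {effect}"
        for key, effect in REQUIRED_FOR_FULL_MODE.items()
        if not env.get(key)
    ]


# Sample content shaped like the template above (values are placeholders).
sample = """
MARKET_DATA_API_KEY=
BROKER_API_KEY=
BROKER_API_SECRET=
BROKER_BASE_URL=https://paper-api.alpaca.markets
"""
for reason in degraded_reasons(parse_dotenv(sample)):
    print(reason)
```

To check a real file, pass `Path(".env").read_text()` from `pathlib` to `parse_dotenv` instead of the sample string.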
### Infrastructure Service Environment
**PostgreSQL** (`postgres`):
| Variable | Value | Description |
|----------|-------|-------------|
| `POSTGRES_DB` | `stonks` | Database created on first start |
| `POSTGRES_USER` | `stonks` | Superuser for the database |
| `POSTGRES_PASSWORD` | `stonks_dev` | Password for the database user |
**MinIO** (`minio`):
| Variable | Value | Description |
|----------|-------|-------------|
| `MINIO_ROOT_USER` | `minioadmin` | MinIO admin username |
| `MINIO_ROOT_PASSWORD` | `minioadmin` | MinIO admin password |
**Trino** (`trino`):
| Variable | Value | Description |
|----------|-------|-------------|
| `MINIO_ACCESS_KEY` | `minioadmin` | Passed to Trino for MinIO catalog access |
| `MINIO_SECRET_KEY` | `minioadmin` | Passed to Trino for MinIO catalog access |
**Hive Metastore** (`hive-metastore`):
| Variable | Value | Description |
|----------|-------|-------------|
| `SERVICE_NAME` | `metastore` | Tells Hive to run in metastore-only mode |
| `DB_DRIVER` | `derby` | Embedded Derby database for metadata |
**Superset** (`superset`):
| Variable | Value | Description |
|----------|-------|-------------|
| `SUPERSET_SECRET_KEY` | `stonks-dev-secret-key-change-me` | Flask secret key (change in production) |
| `ADMIN_USERNAME` | `admin` | Initial admin username |
| `ADMIN_PASSWORD` | `admin` | Initial admin password |
| `ADMIN_EMAIL` | `admin@stonks.local` | Initial admin email |
### Additional Configuration Variables
All application services support additional environment variables loaded via `services/shared/config.py`. These can be added to individual service `environment` blocks or to the `x-app-env` anchor as needed:
| Variable | Default | Description |
|----------|---------|-------------|
| `REDIS_DB` | `0` | Redis database number |
| `REDIS_PASSWORD` | (none) | Redis password (not needed in Docker Compose) |
| `MINIO_SECURE` | `false` | Use HTTPS for MinIO |
| `OLLAMA_BASE_URL` | `http://ollama:11434` | Ollama LLM server URL |
| `OLLAMA_MODEL` | `qwen3.5:9b` | Default LLM model for extraction |
| `OLLAMA_TIMEOUT` | `120` | Ollama request timeout (seconds) |
| `OLLAMA_MAX_RETRIES` | `2` | Max retries for Ollama requests |
| `VLLM_BASE_URL` | (empty) | vLLM server URL (if using vLLM instead of Ollama) |
| `VLLM_MODEL` | (empty) | vLLM model name (e.g. `AxionML/Qwen3.5-9B-NVFP4`) |
| `VLLM_TIMEOUT` | `120` | vLLM request timeout (seconds) |
| `VLLM_MAX_RETRIES` | `2` | Max retries for vLLM requests |
| `VLLM_TEMPERATURE` | `0.7` | vLLM sampling temperature |
| `VLLM_API_KEY` | (empty) | vLLM API key (if required) |
| `TRINO_HOST` | `localhost` | Trino hostname |
| `TRINO_PORT` | `8080` | Trino port |
| `TRINO_CATALOG` | `lakehouse` | Trino catalog name |
| `TRINO_SCHEMA` | `stonks` | Trino schema name |
| `MARKET_DATA_BASE_URL` | `https://api.polygon.io` | Polygon.io base URL |
| `MARKET_DATA_PROVIDER` | `polygon` | Market data provider |
| `BROKER_MODE` | `paper` | Broker mode: `paper` or `live` |
| `BROKER_PROVIDER` | `alpaca` | Broker provider |
| `TRADING_ENABLED` | `false` | Enable autonomous trading engine |
| `TRADING_RISK_TIER` | `moderate` | Risk tier: `conservative`, `moderate`, `aggressive` |
| `TRADING_POLLING_INTERVAL_SECONDS` | `60` | Recommendation polling interval |
| `TRADING_MAX_OPEN_POSITIONS` | `10` | Maximum concurrent open positions |
| `MACRO_ENABLED` | `true` | Enable macro signal layer |
| `COMPETITIVE_ENABLED` | `true` | Enable competitive signal layer |
| `LOG_LEVEL` | `INFO` | Logging level |
| `JSON_LOGS` | `true` | Enable structured JSON logging |
| `DEPLOY_STAGE` | (empty) | Deployment stage prefix for bucket names |
See `services/shared/config.py` for the complete list of all supported environment variables with their defaults.
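To illustrate how defaults like those in the table are typically resolved from the environment, here is a small sketch. It is not the actual `services/shared/config.py`: `ExampleSettings`, `load_example_settings`, and the boolean parsing rules are assumptions made for illustration, and only a handful of the variables are covered.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Allowed values for TRADING_RISK_TIER, as listed in the table above.
RISK_TIERS = {"conservative", "moderate", "aggressive"}


def _as_bool(value: str) -> bool:
    """Assumed convention: 1/true/yes/on (case-insensitive) mean True."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ExampleSettings:
    redis_db: int
    ollama_timeout: int
    trading_enabled: bool
    trading_risk_tier: str
    log_level: str


def load_example_settings(env: Mapping[str, str] = os.environ) -> ExampleSettings:
    """Read a few variables, falling back to the documented defaults."""
    tier = env.get("TRADING_RISK_TIER", "moderate")
    if tier not in RISK_TIERS:
        raise ValueError(f"TRADING_RISK_TIER must be one of {sorted(RISK_TIERS)}")
    return ExampleSettings(
        redis_db=int(env.get("REDIS_DB", "0")),
        ollama_timeout=int(env.get("OLLAMA_TIMEOUT", "120")),
        trading_enabled=_as_bool(env.get("TRADING_ENABLED", "false")),
        trading_risk_tier=tier,
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


# With an empty mapping, every field takes its documented default.
print(load_example_settings({}))
```

Passing the mapping explicitly keeps the function easy to exercise without touching the real process environment.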
---
## LLM Provider Configuration
Stonks Oracle supports two LLM backends: **Ollama** (local, self-hosted) and **vLLM** (high-performance inference server). The active provider is configured per-agent in the `ai_agents` database table, but the connection details come from environment variables.
### Option A: Bundled Ollama (default)
The `docker-compose.yml` includes an Ollama container. On first start, pull a model:
```bash
docker compose exec ollama ollama pull qwen3.5:9b-fast
```
No additional configuration needed — services connect to `http://ollama:11434` by default.
### Option B: External Ollama
If Ollama is already running on the host (e.g. with GPU access), create a `docker-compose.override.yml`:
```yaml
services:
  ollama:
    entrypoint: ["true"]
    restart: "no"
    ports: []
  extractor:
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    environment:
      OLLAMA_BASE_URL: "http://host.docker.internal:11434"
    extra_hosts:
      - "host.docker.internal:host-gateway"
  recommendation:
    environment:
      OLLAMA_BASE_URL: "http://host.docker.internal:11434"
    extra_hosts:
      - "host.docker.internal:host-gateway"
```
This disables the bundled Ollama container and routes services to the host's instance. Replace the port if your Ollama runs on a non-standard port.
### Option C: vLLM Server
For higher throughput or quantized models (e.g. `AxionML/Qwen3.5-9B-NVFP4`), point services at a vLLM server. Add to your `.env`:
```dotenv
VLLM_BASE_URL=http://192.168.42.254:8000
VLLM_MODEL=AxionML/Qwen3.5-9B-NVFP4
VLLM_TIMEOUT=120
VLLM_TEMPERATURE=0.7
```
Then update the `ai_agents` table to use the vLLM provider:
```sql
UPDATE ai_agents SET model_provider = 'vllm', model_name = 'AxionML/Qwen3.5-9B-NVFP4' WHERE active = true;
```
Or use the API:
```bash
curl -X PUT http://localhost:8004/api/admin/agents/document-extractor \
-H 'Content-Type: application/json' \
-d '{"model_provider": "vllm", "model_name": "AxionML/Qwen3.5-9B-NVFP4"}'
```
### Option D: Mixed (Ollama + vLLM)
You can run different agents on different providers. For example, use vLLM for the high-volume extractor and Ollama for the thesis rewriter:
```sql
UPDATE ai_agents SET model_provider = 'vllm', model_name = 'AxionML/Qwen3.5-9B-NVFP4' WHERE slug = 'document-extractor';
UPDATE ai_agents SET model_provider = 'vllm', model_name = 'AxionML/Qwen3.5-9B-NVFP4' WHERE slug = 'event-classifier';
UPDATE ai_agents SET model_provider = 'ollama', model_name = 'qwen3.5:9b-fast' WHERE slug = 'thesis-rewriter';
```
Both `OLLAMA_BASE_URL` and `VLLM_BASE_URL` must be set in the environment for mixed mode.
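The mixed-mode requirement can be checked mechanically. The sketch below is illustrative only: `missing_provider_urls` is a hypothetical helper, and the agent-to-provider mapping is assumed to have been read from the `ai_agents` table by some other means.

```python
from typing import Mapping

# Which environment variable carries the base URL for each provider.
PROVIDER_URL_VARS = {"ollama": "OLLAMA_BASE_URL", "vllm": "VLLM_BASE_URL"}


def missing_provider_urls(
    agent_providers: Mapping[str, str], env: Mapping[str, str]
) -> list[str]:
    """Return the URL variables needed by some agent but unset or empty in env."""
    needed = set()
    for slug, provider in agent_providers.items():
        if provider not in PROVIDER_URL_VARS:
            raise ValueError(f"agent {slug!r} uses unknown provider {provider!r}")
        needed.add(PROVIDER_URL_VARS[provider])
    return sorted(var for var in needed if not env.get(var))


# Agent slugs and providers mirror the SQL example above.
agents = {
    "document-extractor": "vllm",
    "event-classifier": "vllm",
    "thesis-rewriter": "ollama",
}
env = {"OLLAMA_BASE_URL": "http://ollama:11434"}  # VLLM_BASE_URL left unset
print(missing_provider_urls(agents, env))
```

An empty result means every provider referenced by an agent has a base URL in the given mapping.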
### Automated Deployment
The `deploy-docker.sh` script handles LLM configuration automatically:
```bash
# Auto-detect host Ollama, use default model
bash deploy-docker.sh
# Specify a remote Ollama instance
bash deploy-docker.sh --ollama-url http://10.1.1.12:2701 --ollama-model qwen3.6
# Specify a different host
bash deploy-docker.sh --host user@myserver --dir /opt/stonks
```
---
## Volume Mounts and Data Persistence
Docker Compose defines five named volumes for persistent data:
| Volume | Mounted By | Mount Path | Contents |
|--------|-----------|------------|----------|
| `pgdata` | postgres | `/var/lib/postgresql/data` | PostgreSQL database files |
| `miniodata` | minio | `/data` | MinIO object storage (raw artifacts, lakehouse Parquet files) |
| `ollama_models` | ollama | `/root/.ollama` | Downloaded LLM model weights |
| `hive_data` | hive-metastore | `/opt/hive/data` | Hive metastore Derby database |
| `superset_data` | superset | `/app/superset_home` | Superset configuration and metadata |
### Bind Mounts
In addition to named volumes, several services use bind mounts for configuration:
| Service | Host Path | Container Path | Mode | Purpose |
|---------|-----------|---------------|------|---------|
| postgres | `./infra/migrations` | `/docker-entrypoint-initdb.d` | rw | SQL migrations auto-applied on first start |
| trino | `./infra/trino/catalog` | `/etc/trino/catalog` | rw | Trino catalog configuration (lakehouse, iceberg) |
| hive-metastore | `./infra/hive/core-site.xml` | `/opt/hive/conf/core-site.xml` | ro | Hadoop core-site config for MinIO access |
| hive-metastore | `./infra/hive/metastore-site.xml` | `/opt/hive/conf/metastore-site.xml` | ro | Hive metastore config |
### Resetting Data
To destroy all persistent data and start fresh:
```bash
# Stop all containers and remove named volumes
docker compose down -v
```
This removes `pgdata`, `miniodata`, `ollama_models`, `hive_data`, and `superset_data`. The next `docker compose up` re-initializes PostgreSQL with migrations and re-creates the MinIO buckets (via `minio-init`). Ollama models must be pulled again.
To reset only specific volumes:
```bash
docker compose down
docker volume rm stonks-oracle_pgdata # Reset database only
docker compose up -d
```
> **Note**: Volume names are prefixed with the project directory name (e.g., `stonks-oracle_pgdata`). Use `docker volume ls` to see exact names.
---
## Health Checks
Most services have a health check configured; `ollama` is a known exception. Docker Compose uses these checks to enforce startup ordering via `depends_on` with `condition: service_healthy`.
### Infrastructure Health Checks
| Service | Test Command | Interval | Retries |
|---------|-------------|----------|---------|
| `postgres` | `pg_isready -U stonks` | 5s | 5 |
| `redis` | `redis-cli ping` | 5s | 5 |
| `minio` | `mc ready local` | 5s | 5 |
### Application Health Checks — FastAPI Services
FastAPI services (symbol-registry, trading-engine, risk-engine, query-api) use HTTP health endpoints; the nginx-based dashboard is checked the same way against its root path:
| Service | Test Command | Interval | Timeout | Retries | Start Period |
|---------|-------------|----------|---------|---------|-------------|
| `symbol-registry` | `curl -f http://localhost:8000/health` | 10s | 5s | 3 | 15s |
| `trading-engine` | `curl -f http://localhost:8000/health` | 10s | 5s | 3 | 15s |
| `risk-engine` | `curl -f http://localhost:8000/health` | 10s | 5s | 3 | 15s |
| `query-api` | `curl -f http://localhost:8000/health` | 10s | 5s | 3 | 15s |
| `dashboard` | `curl -f http://localhost:8080/` | 10s | 5s | 3 | 10s |
### Application Health Checks — Worker Services
Worker services (no HTTP endpoint) use process liveness checks:
| Service | Test Command | Interval | Timeout | Retries | Start Period |
|---------|-------------|----------|---------|---------|-------------|
| `scheduler` | `pgrep -f 'python -m services.scheduler.app'` | 10s | 5s | 3 | 15s |
| `ingestion` | `pgrep -f 'python -m services.ingestion.worker'` | 10s | 5s | 3 | 15s |
| `parser` | `pgrep -f 'python -m services.parser.worker'` | 10s | 5s | 3 | 15s |
| `extractor` | `pgrep -f 'python -m services.extractor.main'` | 10s | 5s | 3 | 15s |
| `aggregation` | `pgrep -f 'python -m services.aggregation.main'` | 10s | 5s | 3 | 15s |
| `recommendation` | `pgrep -f 'python -m services.recommendation.main'` | 10s | 5s | 3 | 15s |
| `broker-adapter` | `pgrep -f 'python -m services.adapters.broker_service'` | 10s | 5s | 3 | 15s |
| `lake-publisher` | `pgrep -f 'python -m services.lake_publisher.jobs'` | 10s | 5s | 3 | 15s |
### Verifying Service Health
```bash
# Check all service statuses
docker compose ps
# Check a specific service
docker compose ps query-api
# Inspect health check details for a container
docker inspect --format='{{json .State.Health}}' stonks-oracle-query-api-1 | python -m json.tool
```
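For scripted checks, the output of `docker compose ps --format json` can be parsed instead of read by eye. This is a sketch under assumptions: the `Service`, `State`, and `Health` fields are those emitted by recent Compose v2 releases, some versions print a single JSON array while others print one JSON object per line, and `unhealthy_services` is a hypothetical helper name.

```python
import json


def parse_compose_ps(output: str) -> list[dict]:
    """Accept either a JSON array or one JSON object per line."""
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def unhealthy_services(output: str) -> list[str]:
    """Services that are not running, or whose health check is not passing.

    Containers without a health check report an empty Health field and are
    judged by State alone.
    """
    bad = []
    for container in parse_compose_ps(output):
        state = container.get("State", "")
        health = container.get("Health", "")
        if state != "running" or health not in ("", "healthy"):
            bad.append(container.get("Service", container.get("Name", "?")))
    return sorted(bad)


# Sample output (hand-written for illustration, not captured from a real run).
sample = "\n".join([
    '{"Service": "query-api", "State": "running", "Health": "healthy"}',
    '{"Service": "extractor", "State": "running", "Health": "starting"}',
    '{"Service": "ollama", "State": "running", "Health": ""}',
])
print(unhealthy_services(sample))
```

In practice the input would come from something like `subprocess.run(["docker", "compose", "ps", "--format", "json"], capture_output=True, text=True).stdout`.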
---
## Dockerfile Build Details
### `docker/Dockerfile` — Generic Python Service Image
Used by all Python application services except the scheduler (the dashboard has its own `frontend/Dockerfile`). Accepts a `SERVICE_CMD` build argument that determines which service the container runs.
**Base image**: `python:3.12-slim`
**Build arguments**:
| Argument | Default | Description |
|----------|---------|-------------|
| `SERVICE_CMD` | `python -m services.scheduler.app` | The command executed when the container starts |
**What gets copied**:
- `requirements.txt` → pip dependencies installed
- `services/` → all service source code
- `tests/` → test files (available for in-container testing)
- `conftest.py` → pytest configuration
**Environment variables set**:
- `PYTHONDONTWRITEBYTECODE=1` — no `.pyc` files
- `PYTHONUNBUFFERED=1` — unbuffered stdout/stderr for log visibility
- `PYTHONPATH=/app` — ensures `services.*` imports resolve
**System packages installed**: `gcc`, `libpq-dev` (PostgreSQL client library), `curl` (for health checks)
**Security**: Runs as non-root user `stonks` (UID 1000).
**How `SERVICE_CMD` works**: The `CMD` directive is `sh -c "${SERVICE_CMD}"`, so the build argument becomes the runtime command. Each service in `docker-compose.yml` overrides this via the `args.SERVICE_CMD` build parameter:
```yaml
query-api:
  build:
    context: .
    dockerfile: docker/Dockerfile
    args:
      SERVICE_CMD: "uvicorn services.api.app:app --host 0.0.0.0 --port 8000"
```
### `docker/Dockerfile.scheduler` — Scheduler Image
A specialized variant of the generic Dockerfile used only by the `scheduler` service. Adds `postgresql-client` for running database migrations via `psql`.
**Additional contents**:
- `infra/migrations/` → copied to `/app/infra/migrations/` for migration execution
- `postgresql-client` system package installed
**Command**: Hardcoded `CMD ["python", "-m", "services.scheduler.app"]` (no `SERVICE_CMD` argument).
### `docker/Dockerfile.superset` — Custom Superset Image
Extends the official Apache Superset image with additional database drivers.
**Base image**: `apache/superset:latest`
**Additional packages**: `trino[sqlalchemy]`, `psycopg2-binary`, `redis`
### `frontend/Dockerfile` — Dashboard Image
Multi-stage build for the React dashboard.
**Stage 1 — Build** (base: `node:24-alpine`):
| Build Argument | Default | Description |
|---------------|---------|-------------|
| `VITE_QUERY_API_URL` | `""` | Query API base URL (empty = use relative `/api/` proxy) |
| `VITE_SYMBOL_REGISTRY_URL` | `""` | Symbol Registry base URL (empty = use relative `/registry/` proxy) |
| `VITE_RISK_ENGINE_URL` | `""` | Risk Engine base URL (empty = use relative `/risk/` proxy) |
**Stage 2 — Serve** (base: `nginxinc/nginx-unprivileged:alpine`):
- Serves the built static files on port 8080
- Uses `frontend/nginx.conf` for SPA fallback and API reverse proxying
- Proxies `/api/` → `query-api:8000`, `/registry/` → `symbol-registry:8000`, `/risk/` → `risk:8000` (a network alias for `risk-engine`), `/trading/` → `trading-engine:8000`
### Building Custom Images
To build a single service image locally:
```bash
# Build the query-api image
docker compose build query-api
# Build with a custom SERVICE_CMD
docker build -t my-custom-service \
--build-arg SERVICE_CMD="python -m services.my_service.main" \
-f docker/Dockerfile .
# Build the dashboard with custom API URLs
docker build -t my-dashboard \
--build-arg VITE_QUERY_API_URL="https://api.example.com" \
-f frontend/Dockerfile frontend/
# Rebuild all images
docker compose build
```
---
## Dependency Ordering
Docker Compose enforces startup order using `depends_on` with health check conditions. The dependency graph is:
```
postgres (healthy) ──┬── scheduler
                     ├── symbol-registry
                     ├── ingestion
                     ├── parser
                     ├── extractor
                     ├── aggregation
                     ├── recommendation
                     ├── trading-engine
                     ├── risk-engine
                     ├── broker-adapter
                     ├── lake-publisher
                     └── query-api
redis (healthy) ─────┬── scheduler
                     ├── ingestion
                     ├── parser
                     ├── extractor
                     ├── aggregation
                     ├── recommendation
                     ├── trading-engine
                     ├── broker-adapter
                     └── query-api
minio (healthy) ─────┬── minio-init
                     ├── ingestion
                     ├── lake-publisher
                     └── query-api
ollama (started) ────── extractor
minio ───────────────── trino
hive-metastore ─────── trino
trino ──────────────── superset (via depends_on)
query-api (healthy) ── dashboard
```
Services with `condition: service_healthy` wait until the dependency's health check passes. The `extractor` depends on `ollama` with `condition: service_started` (no health check — Ollama may take time to load models).
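The graph above implies a partial order, which can be made concrete with a topological sort. The sketch below uses Python's standard `graphlib` on a subset of the edges shown; it illustrates the ordering constraint only and does not describe how Docker Compose schedules containers internally (Compose starts independent services in parallel).

```python
from graphlib import TopologicalSorter

# service -> set of services it depends on (a subset of the graph above).
DEPENDS_ON = {
    "scheduler": {"postgres", "redis"},
    "symbol-registry": {"postgres"},
    "ingestion": {"postgres", "redis", "minio"},
    "extractor": {"postgres", "redis", "ollama"},
    "query-api": {"postgres", "redis", "minio"},
    "minio-init": {"minio"},
    "trino": {"minio", "hive-metastore"},
    "superset": {"trino"},
    "dashboard": {"query-api"},
}


def startup_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


order = startup_order(DEPENDS_ON)
print(" -> ".join(order))
```

Many orders satisfy the constraints, so the exact sequence printed is not significant. What holds in every valid order is that `postgres` precedes `query-api`, which in turn precedes `dashboard`.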
---
## Operational Commands
### Starting Services
```bash
# Start all services in the background
docker compose up -d
# Start only infrastructure (useful for local development)
docker compose up -d postgres redis minio minio-init ollama
# Start a specific service and its dependencies
docker compose up -d query-api
```
### Stopping Services
```bash
# Stop all services (preserves volumes)
docker compose down
# Stop all services and remove volumes (full reset)
docker compose down -v
# Stop a specific service
docker compose stop trading-engine
```
### Restarting Services
```bash
# Restart a specific service
docker compose restart query-api
# Restart with a fresh build
docker compose up -d --build query-api
# Force recreate a service (picks up compose file changes)
docker compose up -d --force-recreate query-api
```
### Viewing Logs
```bash
# Follow logs for all services
docker compose logs -f
# Follow logs for a specific service
docker compose logs -f query-api
# View last 50 lines of a service's logs
docker compose logs --tail=50 ingestion
# View logs for multiple services
docker compose logs -f scheduler ingestion extractor
```
### Scaling Replicas
```bash
# Scale a worker service to 3 replicas
docker compose up -d --scale ingestion=3
# Scale multiple services
docker compose up -d --scale ingestion=3 --scale extractor=2
# Scale back to 1
docker compose up -d --scale ingestion=1
```
> **Note**: Scaling works best for worker services (ingestion, parser, extractor, aggregation, recommendation, broker-adapter, lake-publisher) that consume from Redis queues. Do not scale FastAPI services that expose host ports without adjusting port mappings.
### Inspecting Services
```bash
# List all services and their status
docker compose ps
# List the processes running in each container
docker compose top
# Execute a command inside a running container
docker compose exec query-api python -c "from services.shared.config import load_config; print(load_config())"
# Open a psql session in the postgres container
docker compose exec postgres psql -U stonks -d stonks
```
### Full Reset
```bash
# Nuclear option: stop everything, remove volumes, rebuild, restart
docker compose down -v
docker compose build --no-cache
docker compose up -d
```
This destroys all data (database, object storage, model weights, metastore, Superset config) and starts from scratch. PostgreSQL migrations are re-applied automatically. MinIO buckets are re-created by `minio-init`. Ollama models must be re-downloaded.
---
## MinIO Bucket Initialization
The `minio-init` service runs once on startup and creates the required object storage buckets:
| Bucket | Purpose |
|--------|---------|
| `stonks-raw-market` | Raw market data from Polygon.io |
| `stonks-raw-news` | Raw news articles |
| `stonks-raw-filings` | Raw SEC filings |
| `stonks-normalized` | Normalized/parsed documents |
| `stonks-llm-prompts` | LLM prompt archives |
| `stonks-llm-results` | LLM extraction results |
| `stonks-lakehouse` | Parquet fact tables for Trino |
| `stonks-audit` | Audit trail artifacts |
Access the MinIO console at `http://localhost:9001` (credentials: `minioadmin` / `minioadmin`).
---
## Dashboard Reverse Proxy
The dashboard container runs nginx with reverse proxy rules that route API requests to backend services using Docker Compose service names:
| Path | Proxied To | Service |
|------|-----------|---------|
| `/api/` | `http://query-api:8000` | Query API |
| `/registry/` | `http://symbol-registry:8000/` | Symbol Registry API |
| `/risk/` | `http://risk:8000/` | Risk Engine (via network alias) |
| `/trading/` | `http://trading-engine:8000/` | Trading Engine API |
The `risk-engine` service has a network alias of `risk` in `docker-compose.yml` so the nginx upstream resolves correctly.
All other paths serve the React SPA with `try_files` fallback to `index.html`.
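The routing table can be modelled as a small prefix-matching function. This is a sketch under an assumption: it treats the "Proxied To" column as the literal nginx `proxy_pass` targets, including the trailing slashes. In nginx, a `proxy_pass` target with a URI part (here the trailing `/`) replaces the matched location prefix, while a target without one forwards the request path unchanged. The actual `frontend/nginx.conf` may differ, and `upstream_url` is a hypothetical name.

```python
from typing import Optional

# (location prefix, proxy_pass target) pairs copied from the table above.
PROXY_RULES = [
    ("/api/", "http://query-api:8000"),
    ("/registry/", "http://symbol-registry:8000/"),
    ("/risk/", "http://risk:8000/"),
    ("/trading/", "http://trading-engine:8000/"),
]


def upstream_url(path: str) -> Optional[str]:
    """Return the backend URL for a request path, or None for SPA routes."""
    for prefix, target in PROXY_RULES:
        if not path.startswith(prefix):
            continue
        host_and_uri = target.split("://", 1)[1]
        if "/" in host_and_uri:
            # Target has a URI part: the matched prefix is replaced by it.
            return target + path[len(prefix):]
        # No URI part: the original path is forwarded as-is.
        return target + path
    return None  # served by the React SPA (try_files fallback)


for example in ("/api/v1/signals", "/risk/limits", "/portfolio"):
    print(example, "->", upstream_url(example))
```

Under this reading, the Query API receives paths that still begin with `/api/`, while the other three backends see the prefix stripped.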
---
## Troubleshooting
### Service won't start
Check dependency health:
```bash
docker compose ps postgres redis minio
```
If infrastructure services are unhealthy, the application services that depend on them will not start. Check infrastructure logs:
```bash
docker compose logs postgres
```
### Database migration errors
Migrations in `./infra/migrations/` are applied by PostgreSQL's `docker-entrypoint-initdb.d` mechanism, which only runs on first database initialization. If you need to re-run migrations:
```bash
docker compose down -v   # Removes ALL named volumes, including pgdata
docker compose up -d # Migrations re-applied on fresh init
```
### Ollama model not available
The extractor service needs an LLM model loaded. Pull a model manually:
```bash
# If using bundled Ollama container:
docker compose exec ollama ollama pull qwen3.5:9b-fast
# If using host Ollama:
ollama pull qwen3.5:9b-fast
# If using vLLM, ensure the model is loaded on the vLLM server
curl http://your-vllm-host:8000/v1/models
```
### Ollama port conflict (address already in use)
If Ollama is already running on the host, the bundled container will fail to bind port 11434. Use the external Ollama configuration described in the "LLM Provider Configuration" section above, or use `deploy-docker.sh` which handles this automatically.
### Port conflicts
If a port is already in use, modify the host port mapping in `docker-compose.yml`:
```yaml
query-api:
  ports:
    - "9004:8000"  # Changed from 8004 to 9004
```
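To find conflicts before starting the stack, the host ports from the Port Summary can be probed by attempting to bind each one. This is a minimal sketch: `port_is_free` and `conflicting_ports` are hypothetical helpers, and the check is only a snapshot, since another process could claim a port between the probe and `docker compose up`.

```python
import socket

# Host ports published by the stack, taken from the Port Summary table.
HOST_PORTS = {
    3000: "dashboard", 5432: "postgres", 6379: "redis",
    8001: "symbol-registry", 8002: "trading-engine", 8003: "risk-engine",
    8004: "query-api", 8080: "trino", 8088: "superset",
    9000: "minio (API)", 9001: "minio (console)",
    9083: "hive-metastore", 11434: "ollama",
}


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Try to bind the port; failure means something already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def conflicting_ports(ports: dict[int, str] = HOST_PORTS) -> dict[int, str]:
    """Return the subset of ports that are currently in use on the host."""
    return {port: name for port, name in ports.items() if not port_is_free(port)}


for port, name in conflicting_ports().items():
    print(f"port {port} ({name}) is already in use")
```

Run it while the stack is stopped; with the stack running, every published port is expected to show up as in use.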