feat: comprehensive docs, unit tests, docker-compose app services
- Add scheduler and ingestion unit tests (test_scheduler_unit.py, test_ingestion_unit.py)
- Add all 13 app services + dashboard to docker-compose.yml
- Add full documentation suite: API reference, Helm reference, Docker deployment guide, 3 architecture diagrams (K8s, Docker Compose, data pipeline), AI agent guide, backup/restore guide, observability/metrics reference, per-service docs
- Add intelligence pipeline deep-dive docs with Mermaid diagrams
- Update README with documentation index and links
- Add specs for comprehensive-quality-docs, intelligence-pipeline-deep-dive, sanitized-pipeline-docs
@@ -0,0 +1,322 @@
# Docker Compose Architecture — Stonks Oracle

This document describes the Docker Compose deployment topology for Stonks Oracle, derived from the `docker-compose.yml` file at the repository root.

All containers run on a single Docker network created by Compose. Infrastructure services (PostgreSQL, Redis, MinIO, Ollama, Trino, Hive Metastore, Superset) start first, and application services wait for their dependencies via `depends_on` with health check conditions.
## Container Topology Diagram

```mermaid
graph TB
    %% ── Host machine ──────────────────────────────────────────────
    host((Host Machine))

    %% ── .env file ─────────────────────────────────────────────────
    envfile[".env file<br/><i>MARKET_DATA_API_KEY</i><br/><i>BROKER_API_KEY</i><br/><i>BROKER_API_SECRET</i><br/><i>BROKER_BASE_URL</i>"]

    %% ── Docker Compose default network ────────────────────────────
    subgraph network ["Docker Compose Network (default)"]
        direction TB

        %% ── Infrastructure Containers ─────────────────────────────
        subgraph infra ["Infrastructure Containers"]
            direction LR
            postgres[("postgres<br/><i>postgres:16-alpine</i><br/>host :5432 → :5432")]
            redis[("redis<br/><i>redis:7-alpine</i><br/>host :6379 → :6379")]
            minio[("minio<br/><i>minio/minio:latest</i><br/>host :9000 → :9000<br/>host :9001 → :9001")]
            ollama[("ollama<br/><i>ollama/ollama:latest</i><br/>host :11434 → :11434")]
        end

        subgraph infra_init ["Infrastructure Init"]
            minio_init["minio-init<br/><i>minio/mc:latest</i><br/><i>Creates buckets on startup</i>"]
        end

        subgraph analytics ["Analytics Containers"]
            direction LR
            hive_metastore["hive-metastore<br/><i>apache/hive:4.0.0</i><br/>host :9083 → :9083"]
            trino["trino<br/><i>trinodb/trino:latest</i><br/>host :8080 → :8080"]
            superset["superset<br/><i>apache/superset:latest</i><br/>host :8088 → :8088"]
        end

        %% ── Application Containers ────────────────────────────────

        subgraph api_tier ["API Tier"]
            direction LR
            query_api["query-api<br/><i>docker/Dockerfile</i><br/><i>uvicorn services.api.app</i><br/>host :8004 → :8000"]
            symbol_registry["symbol-registry<br/><i>docker/Dockerfile</i><br/><i>uvicorn services.symbol_registry.app</i><br/>host :8001 → :8000"]
        end

        subgraph frontend_tier ["Frontend Tier"]
            dashboard["dashboard<br/><i>frontend/Dockerfile</i><br/><i>nginx on :8080</i><br/>host :3000 → :8080"]
        end

        subgraph trading_tier ["Trading Tier"]
            direction LR
            trading_engine["trading-engine<br/><i>docker/Dockerfile</i><br/><i>uvicorn services.trading.app</i><br/>host :8002 → :8000"]
            risk_engine["risk-engine<br/><i>docker/Dockerfile</i><br/><i>uvicorn services.risk.app</i><br/>host :8003 → :8000"]
            broker_adapter["broker-adapter<br/><i>docker/Dockerfile</i><br/><i>python -m services.adapters.broker_service</i><br/><i>no host port</i>"]
        end

        subgraph orchestration_tier ["Orchestration Tier"]
            scheduler["scheduler<br/><i>docker/Dockerfile.scheduler</i><br/><i>no host port</i>"]
        end

        subgraph processing_tier ["Processing Tier (pipeline workers)"]
            direction LR
            ingestion["ingestion<br/><i>docker/Dockerfile</i><br/><i>python -m services.ingestion.worker</i><br/><i>no host port</i>"]
            parser["parser<br/><i>docker/Dockerfile</i><br/><i>python -m services.parser.worker</i><br/><i>no host port</i>"]
            extractor["extractor<br/><i>docker/Dockerfile</i><br/><i>python -m services.extractor.main</i><br/><i>no host port</i>"]
            aggregation["aggregation<br/><i>docker/Dockerfile</i><br/><i>python -m services.aggregation.main</i><br/><i>no host port</i>"]
            recommendation["recommendation<br/><i>docker/Dockerfile</i><br/><i>python -m services.recommendation.main</i><br/><i>no host port</i>"]
        end

        subgraph analytics_worker ["Analytics Worker"]
            lake_publisher["lake-publisher<br/><i>docker/Dockerfile</i><br/><i>python -m services.lake_publisher.jobs</i><br/><i>no host port</i>"]
        end
    end

    %% ── Host port access ──────────────────────────────────────────
    host -->|":5432"| postgres
    host -->|":6379"| redis
    host -->|":9000 / :9001"| minio
    host -->|":11434"| ollama
    host -->|":8080"| trino
    host -->|":9083"| hive_metastore
    host -->|":8088"| superset
    host -->|":8001"| symbol_registry
    host -->|":8004"| query_api
    host -->|":8002"| trading_engine
    host -->|":8003"| risk_engine
    host -->|":3000"| dashboard

    %% ── .env injection ────────────────────────────────────────────
    envfile -.->|"env_file: .env"| ingestion
    envfile -.->|"env_file: .env"| broker_adapter
    envfile -.->|"env_file: .env"| trading_engine

    %% ── Styles ────────────────────────────────────────────────────
    classDef infraSvc fill:#95a5a6,stroke:#717d7e,color:#fff
    classDef analyticsSvc fill:#e74c3c,stroke:#a93226,color:#fff
    classDef apiSvc fill:#4a90d9,stroke:#2c5f8a,color:#fff
    classDef frontendSvc fill:#50c878,stroke:#2e7d46,color:#fff
    classDef tradingSvc fill:#e8a838,stroke:#b07d1a,color:#fff
    classDef orchSvc fill:#1abc9c,stroke:#148f77,color:#fff
    classDef processSvc fill:#9b59b6,stroke:#6c3483,color:#fff
    classDef initSvc fill:#bdc3c7,stroke:#7f8c8d,color:#333
    classDef envSvc fill:#f5f5dc,stroke:#999,color:#333

    class postgres,redis,minio,ollama infraSvc
    class hive_metastore,trino,superset,lake_publisher analyticsSvc
    class query_api,symbol_registry apiSvc
    class dashboard frontendSvc
    class trading_engine,risk_engine,broker_adapter tradingSvc
    class scheduler orchSvc
    class ingestion,parser,extractor,aggregation,recommendation processSvc
    class minio_init initSvc
    class envfile envSvc
```
## Dependency Graph

The following diagram shows `depends_on` relationships and health check conditions. Solid arrows indicate `condition: service_healthy` (the dependent waits for the health check to pass). Dashed arrows indicate `condition: service_started` (the dependent waits only for the container to start).

```mermaid
graph LR
    %% ── Infrastructure health checks ──────────────────────────────
    postgres[("postgres<br/><i>pg_isready -U stonks</i>")]
    redis[("redis<br/><i>redis-cli ping</i>")]
    minio[("minio<br/><i>mc ready local</i>")]
    ollama[("ollama<br/><i>no health check</i>")]

    %% ── Analytics dependencies ────────────────────────────────────
    hive["hive-metastore"] -.->|started| minio
    trino["trino"] -.->|started| minio
    trino -.->|started| hive
    superset["superset"] -.->|started| trino
    minio_init["minio-init"] -->|healthy| minio

    %% ── Application depends_on (healthy) ──────────────────────────
    scheduler["scheduler"] -->|healthy| postgres
    scheduler -->|healthy| redis

    symbol_registry["symbol-registry"] -->|healthy| postgres

    ingestion["ingestion"] -->|healthy| postgres
    ingestion -->|healthy| redis
    ingestion -->|healthy| minio

    parser["parser"] -->|healthy| postgres
    parser -->|healthy| redis

    extractor["extractor"] -->|healthy| postgres
    extractor -->|healthy| redis
    extractor -.->|started| ollama

    aggregation["aggregation"] -->|healthy| postgres
    aggregation -->|healthy| redis

    recommendation["recommendation"] -->|healthy| postgres
    recommendation -->|healthy| redis

    trading_engine["trading-engine"] -->|healthy| postgres
    trading_engine -->|healthy| redis

    risk_engine["risk-engine"] -->|healthy| postgres

    broker_adapter["broker-adapter"] -->|healthy| postgres
    broker_adapter -->|healthy| redis

    lake_publisher["lake-publisher"] -->|healthy| postgres
    lake_publisher -->|healthy| minio

    query_api["query-api"] -->|healthy| postgres
    query_api -->|healthy| redis
    query_api -->|healthy| minio

    dashboard["dashboard"] -->|healthy| query_api

    %% ── Styles ────────────────────────────────────────────────────
    classDef infraSvc fill:#95a5a6,stroke:#717d7e,color:#fff
    classDef appSvc fill:#4a90d9,stroke:#2c5f8a,color:#fff
    classDef analyticsSvc fill:#e74c3c,stroke:#a93226,color:#fff
    classDef initSvc fill:#bdc3c7,stroke:#7f8c8d,color:#333

    class postgres,redis,minio,ollama infraSvc
    class scheduler,symbol_registry,ingestion,parser,extractor,aggregation,recommendation,trading_engine,risk_engine,broker_adapter,lake_publisher,query_api,dashboard appSvc
    class hive,trino,superset analyticsSvc
    class minio_init initSvc
```
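As a rough illustration of how the two `depends_on` conditions look in Compose syntax, the `extractor` entry might resemble the fragment below. It is a sketch reconstructed from the diagram, not copied from `docker-compose.yml`; the `build` stanza and the exact key layout are assumptions.

```yaml
services:
  extractor:
    # Assumed build stanza; the topology diagram only names docker/Dockerfile.
    build:
      context: .
      dockerfile: docker/Dockerfile
    command: python -m services.extractor.main
    depends_on:
      postgres:
        condition: service_healthy   # wait until pg_isready passes
      redis:
        condition: service_healthy   # wait until redis-cli ping passes
      ollama:
        condition: service_started   # ollama defines no health check
```

With `service_started`, Compose only guarantees that the `ollama` container has been launched, so the extractor would still need its own retry logic if the model server is not yet accepting requests.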
## Named Volumes

Docker Compose defines five named volumes for persistent data:

```mermaid
graph LR
    pgdata["📦 pgdata"]
    miniodata["📦 miniodata"]
    ollama_models["📦 ollama_models"]
    hive_data["📦 hive_data"]
    superset_data["📦 superset_data"]

    pgdata -->|"/var/lib/postgresql/data"| postgres[("postgres")]
    miniodata -->|"/data"| minio[("minio")]
    ollama_models -->|"/root/.ollama"| ollama[("ollama")]
    hive_data -->|"/opt/hive/data"| hive["hive-metastore"]
    superset_data -->|"/app/superset_home"| superset["superset"]

    classDef volStyle fill:#f5f5dc,stroke:#999,color:#333
    classDef svcStyle fill:#95a5a6,stroke:#717d7e,color:#fff

    class pgdata,miniodata,ollama_models,hive_data,superset_data volStyle
    class postgres,minio,ollama,hive,superset svcStyle
```

| Volume | Mount Point | Container | Purpose |
|--------|-------------|-----------|---------|
| `pgdata` | `/var/lib/postgresql/data` | postgres | PostgreSQL database files |
| `miniodata` | `/data` | minio | MinIO object storage data |
| `ollama_models` | `/root/.ollama` | ollama | Downloaded LLM model weights |
| `hive_data` | `/opt/hive/data` | hive-metastore | Hive Metastore embedded Derby DB |
| `superset_data` | `/app/superset_home` | superset | Superset configuration and metadata |
### Bind Mounts

In addition to named volumes, several containers use bind mounts for configuration files:

| Host Path | Mount Point | Container | Mode |
|-----------|-------------|-----------|------|
| `./infra/migrations/` | `/docker-entrypoint-initdb.d` | postgres | rw (init scripts) |
| `./infra/trino/catalog/` | `/etc/trino/catalog` | trino | rw |
| `./infra/hive/core-site.xml` | `/opt/hive/conf/core-site.xml` | hive-metastore | ro |
| `./infra/hive/metastore-site.xml` | `/opt/hive/conf/metastore-site.xml` | hive-metastore | ro |
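The difference between a named volume and a bind mount is easiest to see side by side. The fragment below is a minimal sketch built from the two tables above; it assumes the short `SOURCE:TARGET[:MODE]` volume syntax, which may not match how the real `docker-compose.yml` is written.

```yaml
services:
  postgres:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data                   # named volume (managed by Docker)
      - ./infra/migrations/:/docker-entrypoint-initdb.d   # bind mount, read-write by default
  hive-metastore:
    image: apache/hive:4.0.0
    volumes:
      - hive_data:/opt/hive/data
      - ./infra/hive/core-site.xml:/opt/hive/conf/core-site.xml:ro            # read-only config
      - ./infra/hive/metastore-site.xml:/opt/hive/conf/metastore-site.xml:ro  # read-only config

# Named volumes must also be declared at the top level of the file.
volumes:
  pgdata:
  hive_data:
```

A source that starts with `./` is resolved as a host path, while a bare name such as `pgdata` refers to a top-level volume entry.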
## Host Port Mappings

Services accessible from the host machine:

| Host Port | Container | Container Port | Service |
|-----------|-----------|----------------|---------|
| 5432 | postgres | 5432 | PostgreSQL database |
| 6379 | redis | 6379 | Redis cache and queues |
| 9000 | minio | 9000 | MinIO S3 API |
| 9001 | minio | 9001 | MinIO web console |
| 11434 | ollama | 11434 | Ollama LLM API |
| 8080 | trino | 8080 | Trino query engine |
| 9083 | hive-metastore | 9083 | Hive Metastore thrift |
| 8088 | superset | 8088 | Superset dashboard |
| 8001 | symbol-registry | 8000 | Symbol Registry API |
| 8002 | trading-engine | 8000 | Trading Engine API |
| 8003 | risk-engine | 8000 | Risk Engine API |
| 8004 | query-api | 8000 | Query API |
| 3000 | dashboard | 8080 | React dashboard (nginx) |

Services without host port mappings (internal only): scheduler, ingestion, parser, extractor, aggregation, recommendation, broker-adapter, lake-publisher, minio-init.
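For readers less familiar with the `HOST:CONTAINER` notation, the mappings above would typically be written as shown below. This is an illustrative fragment based on the table, not an excerpt from the repository file, and it omits every other key of each service.

```yaml
services:
  query-api:
    ports:
      - "8004:8000"   # host port 8004 -> uvicorn on container port 8000
  dashboard:
    ports:
      - "3000:8080"   # host port 3000 -> nginx on container port 8080
  # Workers such as ingestion or parser simply omit `ports`,
  # so they are not published to the host at all.
```

Under this mapping, the Query API health endpoint listed in the health check table would be reachable from the host at `http://localhost:8004/health`, while other containers keep using `query-api:8000`.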
## Environment Configuration

### Shared Environment (`x-app-env` YAML anchor)

All 13 application services and the scheduler receive these environment variables via the `x-app-env` anchor:

| Variable | Value | Purpose |
|----------|-------|---------|
| `POSTGRES_HOST` | `postgres` | Docker Compose service name for PostgreSQL |
| `POSTGRES_PORT` | `5432` | PostgreSQL port |
| `POSTGRES_DB` | `stonks` | Database name |
| `POSTGRES_USER` | `stonks` | Database user |
| `POSTGRES_PASSWORD` | `stonks_dev` | Database password (dev default) |
| `REDIS_HOST` | `redis` | Docker Compose service name for Redis |
| `REDIS_PORT` | `6379` | Redis port |
| `MINIO_ENDPOINT` | `minio:9000` | Docker Compose service name for MinIO |
| `MINIO_ACCESS_KEY` | `minioadmin` | MinIO access key |
| `MINIO_SECRET_KEY` | `minioadmin` | MinIO secret key |
| `OLLAMA_BASE_URL` | `http://ollama:11434` | Docker Compose service name for Ollama |
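A common way to share a block like this is a top-level extension field combined with a YAML anchor. The sketch below shows the general pattern with the values from the table; the anchor name `app-env` and the `EXAMPLE_EXTRA_VAR` key are hypothetical, and the real file may wire the anchor differently.

```yaml
# Extension field (x- prefix) ignored by Compose itself; the anchor name is an assumption.
x-app-env: &app-env
  POSTGRES_HOST: postgres
  POSTGRES_PORT: "5432"
  POSTGRES_DB: stonks
  POSTGRES_USER: stonks
  POSTGRES_PASSWORD: stonks_dev
  REDIS_HOST: redis
  REDIS_PORT: "6379"
  MINIO_ENDPOINT: minio:9000
  MINIO_ACCESS_KEY: minioadmin
  MINIO_SECRET_KEY: minioadmin
  OLLAMA_BASE_URL: http://ollama:11434

services:
  parser:
    environment: *app-env          # reuse the shared mapping unchanged
  extractor:
    environment:
      <<: *app-env                 # merge the shared mapping, then extend it
      EXAMPLE_EXTRA_VAR: "value"   # hypothetical service-specific variable
```

Quoting the port numbers keeps them as strings, which is what environment variables are once they reach the container.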
### `.env` File (API Keys)

Three services load additional secrets from the `.env` file in the repository root via `env_file: .env`:

| Variable | Required By | Purpose |
|----------|-------------|---------|
| `MARKET_DATA_API_KEY` | ingestion | Polygon.io market data API key |
| `BROKER_API_KEY` | broker-adapter, trading-engine | Alpaca broker API key |
| `BROKER_API_SECRET` | broker-adapter, trading-engine | Alpaca broker API secret |
| `BROKER_BASE_URL` | broker-adapter, trading-engine | Alpaca API base URL (default: `https://paper-api.alpaca.markets`) |
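The fragment below sketches how the three services might reference the file, with the expected `.env` contents shown as comments. The placeholder values are illustrative only, and the surrounding service definitions are omitted.

```yaml
# Expected .env contents in the repository root (placeholder values):
#   MARKET_DATA_API_KEY=<your-polygon-key>
#   BROKER_API_KEY=<your-alpaca-key>
#   BROKER_API_SECRET=<your-alpaca-secret>
#   BROKER_BASE_URL=https://paper-api.alpaca.markets

services:
  ingestion:
    env_file: .env   # needs MARKET_DATA_API_KEY
  broker-adapter:
    env_file: .env   # needs the three BROKER_* variables
  trading-engine:
    env_file: .env   # needs the three BROKER_* variables
```

Note that `env_file` passes every variable in the file to each service that references it, so `ingestion` would also see the broker credentials unless the secrets are split across separate files.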
## Health Check Summary

| Container | Health Check Command | Interval | Timeout | Retries | Start Period |
|-----------|---------------------|----------|---------|---------|--------------|
| postgres | `pg_isready -U stonks` | 5s | — | 5 | — |
| redis | `redis-cli ping` | 5s | — | 5 | — |
| minio | `mc ready local` | 5s | — | 5 | — |
| symbol-registry | `curl -f http://localhost:8000/health` | 10s | 5s | 3 | 15s |
| query-api | `curl -f http://localhost:8000/health` | 10s | 5s | 3 | 15s |
| trading-engine | `curl -f http://localhost:8000/health` | 10s | 5s | 3 | 15s |
| risk-engine | `curl -f http://localhost:8000/health` | 10s | 5s | 3 | 15s |
| dashboard | `curl -f http://localhost:8080/` | 10s | 5s | 3 | 10s |
| scheduler | `pgrep -f 'python -m services.scheduler.app'` | 10s | 5s | 3 | 15s |
| ingestion | `pgrep -f 'python -m services.ingestion.worker'` | 10s | 5s | 3 | 15s |
| parser | `pgrep -f 'python -m services.parser.worker'` | 10s | 5s | 3 | 15s |
| extractor | `pgrep -f 'python -m services.extractor.main'` | 10s | 5s | 3 | 15s |
| aggregation | `pgrep -f 'python -m services.aggregation.main'` | 10s | 5s | 3 | 15s |
| recommendation | `pgrep -f 'python -m services.recommendation.main'` | 10s | 5s | 3 | 15s |
| broker-adapter | `pgrep -f 'python -m services.adapters.broker_service'` | 10s | 5s | 3 | 15s |
| lake-publisher | `pgrep -f 'python -m services.lake_publisher.jobs'` | 10s | 5s | 3 | 15s |

Infrastructure services (ollama, trino, hive-metastore, superset) do not define health checks in docker-compose.yml. Application services that depend on ollama use `condition: service_started` instead of `condition: service_healthy`.
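To make the table columns concrete, two of the rows could be expressed as `healthcheck` blocks along these lines. The timings come from the table; the choice between the `CMD` and `CMD-SHELL` forms is an assumption, since the table lists only the command text.

```yaml
services:
  postgres:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U stonks"]
      interval: 5s
      retries: 5        # timeout and start_period left at Compose defaults
  symbol-registry:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s # failures during startup do not count against retries
```

The `pgrep`-based checks used by the workers only confirm that the process exists, so they are a weaker signal than the HTTP checks on the API services.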
## Internal Network Connectivity

All containers share the default Docker Compose network. Services reference each other by their Compose service name as the hostname:

| Hostname | Resolved To | Used By |
|----------|-------------|---------|
| `postgres` | PostgreSQL container | All 13 app services, superset |
| `redis` | Redis container | scheduler, ingestion, parser, extractor, aggregation, recommendation, trading-engine, broker-adapter, query-api |
| `minio` | MinIO container | ingestion, lake-publisher, query-api (via `minio:9000`) |
| `ollama` | Ollama container | extractor (via `http://ollama:11434`) |
| `hive-metastore` | Hive Metastore container | trino (`thrift://hive-metastore:9083`) |
| `trino` | Trino container | superset (`trino:8080`) |
| `query-api` | Query API container | dashboard (nginx proxy upstream) |