Celes Renata c85c0068a2 fix: clean up utcnow deprecation warnings, fix 12 failing tests, add CI/CD pipeline manifests

- Replace all datetime.utcnow() with datetime.now(tz=timezone.utc) across 8 files
- Fix 12 failing tests to match current implementation behavior
- Fix pytest_plugins in non-top-level conftest (moved to root conftest.py)
- Auto-fix 189 lint issues (import sorting, unused imports)
- Add CI/CD pipeline infrastructure (ARC, ArgoCD, Kargo manifests)
- Add values-beta.yaml and values-paper.yaml for staged deployments
- Update GitHub Actions workflow to use self-hosted-gremlin runners
- Add integration-test job to CI pipeline

Result: 1596 passed, 0 failed, 0 warnings

2026-04-18 03:59:28 +00:00


Stonks Oracle — Local Development Setup (Windows + Docker Desktop)

This guide walks you through setting up Stonks Oracle on a Windows machine using Docker Desktop. By the end you will have the full platform running locally: PostgreSQL, Redis, MinIO, Ollama, Trino, and all application services.

Prerequisites

Windows 10/11 with WSL 2 enabled
Docker Desktop for Windows (with WSL 2 backend)
Git (Git for Windows or via WSL)
Python 3.12 (for running services outside Docker during development)
Node.js 24 (for frontend development)

Install Docker Desktop

Download from https://www.docker.com/products/docker-desktop/
During install, ensure "Use WSL 2 instead of Hyper-V" is checked
After install, open Docker Desktop → Settings → Resources → WSL Integration → enable for your distro
Allocate at least 8 GB RAM and 4 CPUs in Settings → Resources (Ollama needs room)

Install Python 3.12

Download from https://www.python.org/downloads/ and check "Add Python to PATH" during install. Or use winget install Python.Python.3.12 from PowerShell.

Install Node.js 24

Download from https://nodejs.org/ (LTS or Current, 24.x). Or use winget install OpenJS.NodeJS.

1. Register for API Accounts

You need two API accounts. Both have free tiers that work for development.

Polygon.io (Market Data)

Go to https://polygon.io/
Sign up for a free account
Navigate to Dashboard → API Keys
Copy your API key — this becomes MARKET_DATA_API_KEY

The free tier gives you delayed data and limited API calls. Paid tiers ($29+/mo) give real-time data and higher rate limits.

Alpaca (Paper Trading)

Go to https://alpaca.markets/
Sign up for a free account
Navigate to the Paper Trading dashboard (not live)
Go to API Keys → Generate New Key
Copy both the API Key ID and Secret Key — these become BROKER_API_KEY and BROKER_API_SECRET
Your paper trading base URL is https://paper-api.alpaca.markets

Alpaca paper trading is completely free with no time limit.
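
Before wiring the keys into the services, it can help to confirm that Alpaca accepts them. The sketch below builds a request to Alpaca's account endpoint (GET /v2/account), authenticated with the APCA-API-KEY-ID and APCA-API-SECRET-KEY headers. The helper names build_account_request and fetch_account_status are illustrative and not part of this repository.

```python
import json
import urllib.request

PAPER_BASE_URL = "https://paper-api.alpaca.markets"


def build_account_request(key_id: str, secret: str,
                          base_url: str = PAPER_BASE_URL) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET /v2/account request."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v2/account",
        headers={
            "APCA-API-KEY-ID": key_id,
            "APCA-API-SECRET-KEY": secret,
        },
    )


def fetch_account_status(request: urllib.request.Request) -> str:
    """Send the request and return the account's 'status' field."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["status"]
```

Calling fetch_account_status(build_account_request(key_id, secret)) with valid paper keys should return the account status; invalid keys make urlopen raise urllib.error.HTTPError.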

2. Clone the Repository

git clone https://github.com/celesrenata/stonks-oracle.git
cd stonks-oracle

3. Create Your Environment File

Create a .env file in the project root with your API keys:

# Polygon.io
MARKET_DATA_API_KEY=your_polygon_api_key_here

# Alpaca Paper Trading
BROKER_API_KEY=your_alpaca_key_id_here
BROKER_API_SECRET=your_alpaca_secret_key_here
BROKER_BASE_URL=https://paper-api.alpaca.markets
BROKER_MODE=paper

This file is gitignored. Keep it safe.
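
A quick sanity check on the file can catch a missing key or a leftover placeholder before any service starts. This is a minimal sketch, assuming the simple KEY=value layout shown above; parse_env, missing_keys, and check_env_file are hypothetical helpers, and the "_here" suffix test only matches the placeholder values from this guide.

```python
from pathlib import Path

REQUIRED_KEYS = ("MARKET_DATA_API_KEY", "BROKER_API_KEY", "BROKER_API_SECRET")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or values[key].endswith("_here")
    ]


def check_env_file(path: str = ".env") -> list[str]:
    """Read an env file from disk and report which required keys need attention."""
    return missing_keys(parse_env(Path(path).read_text(encoding="utf-8")))
```

An empty list from check_env_file() means all three required keys are present and no longer hold the template values.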

4. Start Infrastructure Services

The docker-compose.yml in the project root defines all infrastructure services. Start them:

docker compose up -d

This starts:

Service	Port	Purpose
PostgreSQL 16	5432	Primary database
Redis 7	6379	Job queues and caching
MinIO	9000 (API), 9001 (Console)	Object storage for artifacts
Ollama	11434	Local LLM inference
Trino	8080	SQL query engine for lakehouse
Hive Metastore	9083	Metadata catalog for Trino
Superset	8088	Analytics dashboards

The minio-init sidecar automatically creates the required storage buckets.

Verify everything is running

docker compose ps

All services should show running (healthy). Give it 30-60 seconds for health checks to pass.
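
If a service stays unhealthy, probing the published ports from the host can narrow down which one is unreachable. The sketch below uses only the standard library and the ports from the table above; is_port_open and report are illustrative names. An open port only shows that something is listening, not that the service passed its health check.

```python
import socket

# Host ports taken from the service table in this guide.
SERVICE_PORTS = {
    "PostgreSQL": 5432,
    "Redis": 6379,
    "MinIO API": 9000,
    "MinIO Console": 9001,
    "Ollama": 11434,
    "Trino": 8080,
    "Hive Metastore": 9083,
    "Superset": 8088,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}
```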

Access the UIs

MinIO Console: http://localhost:9001 — login: minioadmin / minioadmin
Superset: http://localhost:8088 — login: admin / admin
Trino: http://localhost:8080

5. Pull the Ollama Model

Stonks Oracle uses the qwen3.5:9b model for document extraction and event classification. Pull it:

docker exec -it stonks-oracle-ollama-1 ollama pull qwen3.5:9b

This downloads ~5 GB. If you have a GPU and want faster inference, make sure Docker Desktop has GPU passthrough enabled (Settings → Resources → GPU). Ollama will auto-detect CUDA GPUs.

To verify the model is available:

docker exec stonks-oracle-ollama-1 ollama list
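
The same check can be made over HTTP, since Ollama lists installed models at GET /api/tags. This is a minimal sketch assuming the default port mapping from this guide; model_names, has_model, and fetch_tags are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_BASE_URL = "http://localhost:11434"


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str) -> bool:
    """Return True if the wanted model tag appears in the payload."""
    return wanted in model_names(tags_payload)


def fetch_tags(base_url: str = OLLAMA_BASE_URL) -> dict:
    """Call GET /api/tags on a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as response:
        return json.load(response)
```

With Ollama running, has_model(fetch_tags(), "qwen3.5:9b") reports whether the pull has completed.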

6. Set Up the Python Environment

python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt

Verify the database migrations ran

PostgreSQL auto-runs the migration SQL files from infra/migrations/ on first start (they are mounted into /docker-entrypoint-initdb.d). First confirm that the database accepts connections:

python -c "import asyncio, asyncpg; loop = asyncio.new_event_loop(); conn = loop.run_until_complete(asyncpg.connect('postgresql://stonks:stonks_dev@localhost:5432/stonks')); loop.run_until_complete(conn.close()); loop.close(); print('Connected!')"

Then list the tables with the psql client that ships inside the PostgreSQL container (no local psql install is needed):

docker exec stonks-oracle-postgres-1 psql -U stonks -d stonks -c "\dt"

You should see tables like companies, documents, trend_windows, recommendations, orders, etc.

Seed the company universe

python -m services.symbol_registry.seed

This populates 50 companies across 10 sectors and 46 competitor relationships.

7. Run the Application Services

You can run services directly with Python (for development) or build Docker images.

Option A: Run directly with Python (recommended for development)

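Open a separate terminal window for each service. Each needs the virtualenv activated and its environment variables set. The commands below use Command Prompt syntax; in PowerShell, replace set NAME=value with $env:NAME = "value".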

# Terminal 1 — Scheduler (triggers ingestion on a cadence)
.venv\Scripts\activate
set MARKET_DATA_API_KEY=your_polygon_key
python -m services.scheduler.app

# Terminal 2 — Ingestion (fetches articles, filings, market data)
.venv\Scripts\activate
set MARKET_DATA_API_KEY=your_polygon_key
python -m services.ingestion.worker

# Terminal 3 — Parser (normalizes raw documents)
.venv\Scripts\activate
python -m services.parser.worker

# Terminal 4 — Extractor (LLM-based intelligence extraction)
.venv\Scripts\activate
python -m services.extractor.main

# Terminal 5 — Aggregation (merges signals into trend summaries)
.venv\Scripts\activate
python -m services.aggregation.main

# Terminal 6 — Recommendation (generates trade recommendations)
.venv\Scripts\activate
python -m services.recommendation.main

# Terminal 7 — Query API (REST API for the dashboard)
.venv\Scripts\activate
uvicorn services.api.app:app --host 0.0.0.0 --port 8000

# Terminal 8 — Symbol Registry (company CRUD API)
.venv\Scripts\activate
uvicorn services.symbol_registry.app:app --host 0.0.0.0 --port 8001

# Terminal 9 — Risk Engine
.venv\Scripts\activate
uvicorn services.risk.app:app --host 0.0.0.0 --port 8002

# Terminal 10 — Trading Engine (autonomous paper trading)
.venv\Scripts\activate
set BROKER_API_KEY=your_alpaca_key
set BROKER_API_SECRET=your_alpaca_secret
set BROKER_BASE_URL=https://paper-api.alpaca.markets
uvicorn services.trading.app:app --host 0.0.0.0 --port 8003

# Terminal 11 — Broker Adapter (executes trades via Alpaca)
.venv\Scripts\activate
set BROKER_API_KEY=your_alpaca_key
set BROKER_API_SECRET=your_alpaca_secret
set BROKER_BASE_URL=https://paper-api.alpaca.markets
python -m services.adapters.broker_service

Not all services are required for basic development. The minimum set is:

scheduler + ingestion + parser + extractor — to get data flowing
aggregation + recommendation — to generate signals
query-api — to serve the dashboard

Add the trading services when you want to test paper trading.

Option B: Build and run as Docker containers

# Build the Python service image
docker build -t stonks-oracle/services -f docker/Dockerfile .

# Build the frontend
docker build -t stonks-oracle/dashboard -f frontend/Dockerfile frontend/

# Run a service (example: scheduler). ^ continues a line in Command Prompt; use a backtick in PowerShell.
docker run --rm --network host ^
  -e MARKET_DATA_API_KEY=your_polygon_key ^
  -e POSTGRES_HOST=localhost ^
  -e REDIS_HOST=localhost ^
  -e MINIO_ENDPOINT=localhost:9000 ^
  -e OLLAMA_BASE_URL=http://localhost:11434 ^
  -e SERVICE_CMD="python -m services.scheduler.app" ^
  stonks-oracle/services

8. Run the Frontend Dashboard

cd frontend
npm install
npm run dev

The dashboard starts at http://localhost:5173. It proxies API requests to the backend services via Vite's dev server.

For production-like testing, build and serve with nginx:

cd frontend
docker build -t stonks-oracle/dashboard .
docker run --rm -p 8080:8080 --network host stonks-oracle/dashboard

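Then visit http://localhost:8080. Trino is also published on port 8080 by the compose stack, so stop the Trino container first if the two collide.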

9. Run Tests

Python tests

.venv\Scripts\activate
pip install ruff pytest pytest-asyncio hypothesis
ruff check services/
python -m pytest tests/ -x --tb=short -q

Frontend tests

cd frontend
npx vitest --run

10. How the Pipeline Works

Once services are running, the data flows automatically:

Scheduler (every 15s)
  → enqueues ingestion jobs for due sources
    → Ingestion fetches articles/filings/market data from Polygon
      → Parser normalizes raw text
        → Extractor calls Ollama to extract structured intelligence
          → Aggregation merges signals into trend summaries
            → Recommendation generates buy/sell/watch signals
              → Trading Engine evaluates and executes paper trades
                → Broker Adapter submits orders to Alpaca

Monitor the pipeline via Redis queues:

docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:ingestion
docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:parsing
docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:extraction
docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:aggregation
docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:recommendation
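
To poll all five queues in one step, the commands above can be wrapped in a short script. This is a sketch under the assumption that the container name and key names match the commands shown; llen_command, parse_llen, queue_depths, and busiest are hypothetical helpers.

```python
import subprocess

REDIS_CONTAINER = "stonks-oracle-redis-1"
STAGES = ("ingestion", "parsing", "extraction", "aggregation", "recommendation")


def llen_command(stage: str) -> list[str]:
    """Build the docker exec command that reads one queue's length."""
    return ["docker", "exec", REDIS_CONTAINER,
            "redis-cli", "llen", f"stonks:queue:{stage}"]


def parse_llen(output: str) -> int:
    """Convert redis-cli LLEN output such as '(integer) 3' or '3' to an int."""
    return int(output.strip().removeprefix("(integer)").strip())


def queue_depths() -> dict[str, int]:
    """Run LLEN for every stage and return {stage: depth}."""
    depths: dict[str, int] = {}
    for stage in STAGES:
        result = subprocess.run(
            llen_command(stage), capture_output=True, text=True, check=True
        )
        depths[stage] = parse_llen(result.stdout)
    return depths


def busiest(depths: dict[str, int]) -> str:
    """Return the stage with the longest queue, a likely bottleneck."""
    return max(depths, key=lambda stage: depths[stage])
```

A queue that keeps growing while the next one stays at zero usually means the worker for that stage is stopped or slow; on a CPU-only machine that is most often extraction.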

Environment Variable Reference

All services read configuration from environment variables with sensible defaults for local development. You only need to set the ones that differ from defaults.

Required (no useful default)

Variable	Description
`MARKET_DATA_API_KEY`	Polygon.io API key
`BROKER_API_KEY`	Alpaca API key ID
`BROKER_API_SECRET`	Alpaca API secret

Infrastructure (defaults work with docker-compose)

Variable	Default	Description
`POSTGRES_HOST`	`localhost`	PostgreSQL host
`POSTGRES_PORT`	`5432`	PostgreSQL port
`POSTGRES_DB`	`stonks`	Database name
`POSTGRES_USER`	`stonks`	Database user
`POSTGRES_PASSWORD`	`stonks_dev`	Database password
`REDIS_HOST`	`localhost`	Redis host
`REDIS_PORT`	`6379`	Redis port
`REDIS_PASSWORD`	(none)	Redis password (not set in dev)
`MINIO_ENDPOINT`	`localhost:9000`	MinIO API endpoint
`MINIO_ACCESS_KEY`	`minioadmin`	MinIO access key
`MINIO_SECRET_KEY`	`minioadmin`	MinIO secret key
`OLLAMA_BASE_URL`	`http://localhost:11434`	Ollama API URL
`OLLAMA_MODEL`	`qwen3.5:9b`	LLM model name

Trading

Variable	Default	Description
`BROKER_MODE`	`paper`	Trading mode (`paper` or `live`)
`BROKER_PROVIDER`	`alpaca`	Broker provider
`BROKER_BASE_URL`	(none)	Alpaca API URL (set to `https://paper-api.alpaca.markets`)

11. Integration Tests

The integration test pipeline validates all API endpoints against a live Kubernetes sandbox with realistic seed data. It deploys ephemeral infrastructure (PostgreSQL, Redis, MinIO), seeds deterministic test data, deploys all API services, and runs the full test suite with profiling.

Prerequisites

kubectl configured with access to a Kubernetes cluster
Docker images built and pushed to GHCR (or use :latest)
envsubst available (usually part of the gettext package)
GHCR_TOKEN environment variable set for image pulls (optional if images are public)

Running the Full Pipeline

# Run with latest images
bash infra/inttest/run_pipeline.sh

# Run with a specific image tag
bash infra/inttest/run_pipeline.sh --image-tag abc123

# Keep the sandbox running for debugging
bash infra/inttest/run_pipeline.sh --skip-teardown

# Custom namespace and results file
bash infra/inttest/run_pipeline.sh --namespace my-test --results-file results.json

CLI Options

Option	Default	Description
`--image-tag TAG`	`latest`	Docker image tag to deploy
`--namespace NAME`	`stonks-inttest-<timestamp>`	Kubernetes namespace name
`--skip-teardown`	`false`	Leave namespace running after tests
`--results-file PATH`	`inttest-results.json`	Path for JSON results output

Exit Codes

Code	Meaning
0	All tests passed
1	One or more test failures
2	Infrastructure setup failure

JSON Result Contract

The pipeline produces a JSON results file (inttest-results.json by default) with this structure:

{
  "run_id": "stonks-inttest-1705312800",
  "image_tag": "abc123",
  "started_at": "2025-01-15T12:00:00Z",
  "completed_at": "2025-01-15T12:07:30Z",
  "exit_code": 0,
  "stages": {
    "infra_deploy": {"duration_s": 45, "status": "ok"},
    "seed_data": {"duration_s": 8, "status": "ok"},
    "service_deploy": {"duration_s": 32, "status": "ok"},
    "integration_tests": {"duration_s": 28, "status": "ok"},
    "teardown": {"duration_s": 5, "status": "ok"}
  },
  "tests": {"total": 41, "passed": 41, "failed": 0, "errors": 0},
  "profiling": {
    "endpoints": {"/api/companies": {"p50_ms": 12, "p95_ms": 25, "p99_ms": 45}},
    "slow_endpoints": []
  }
}
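
A consumer of this file, such as a later CI stage, might reduce it to a pass/fail decision plus a few headline numbers. The sketch below works on the structure shown above; summarize and load_results are illustrative names, and it assumes any stage status other than "ok" marks a failure.

```python
import json


def summarize(results: dict) -> dict:
    """Reduce a results document to the fields most useful in a CI gate."""
    stages = results["stages"]
    failed_stages = [
        name for name, stage in stages.items() if stage["status"] != "ok"
    ]
    tests = results["tests"]
    return {
        "passed": results["exit_code"] == 0 and not failed_stages,
        "failed_stages": failed_stages,
        "stage_seconds": sum(stage["duration_s"] for stage in stages.values()),
        "pass_rate": tests["passed"] / tests["total"] if tests["total"] else 0.0,
        "slow_endpoints": results["profiling"]["slow_endpoints"],
    }


def load_results(path: str = "inttest-results.json") -> dict:
    """Load the JSON results file written by the pipeline."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, summarize(load_results()) can be printed after a run, or its "passed" field used to decide whether to promote an image.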

Running Tests Locally (Development)

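For faster iteration during development, you can run individual test files against local services. These commands use Linux/WSL paths; in a native Windows shell, use .venv\Scripts\python in place of .venv/bin/python: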

# Start local services first (query-api on 8000, registry on 8001, etc.)
# Then run specific test files:
.venv/bin/python -m pytest tests/integration/test_query_api.py -v --tb=short
.venv/bin/python -m pytest tests/integration/test_registry_api.py -v --tb=short
.venv/bin/python -m pytest tests/integration/test_frontend_data_deps.py -v --tb=short

# Run with profiling output:
.venv/bin/python -m pytest tests/integration/ -v --profiling-output=profiling.json

Set the service URLs via environment variables (export works in bash or WSL; use set in Command Prompt):

export QUERY_API_URL=http://localhost:8000
export REGISTRY_API_URL=http://localhost:8001
export RISK_API_URL=http://localhost:8002
export TRADING_API_URL=http://localhost:8003

Future: CI/CD Pipeline

This integration test runner is designed as a standalone foundation. A future CI/CD pipeline spec will consume it as one stage in a larger pipeline that includes:

Self-hosted builds on gremlin nodes (no GitHub Actions compute costs)
Staged promotion: beta → paper → live
Market-hours promotion blockers (9:30–16:00 ET)
Break-glass emergency deploy to production
Per-stage enable/disable toggles
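
As an illustration of the market-hours blocker in the list above, a check along these lines could gate promotions. This is purely a sketch of a planned feature: promotion_blocked is a hypothetical name, skipping weekends is an assumption beyond the stated 9:30–16:00 ET window, and market holidays are ignored. On Windows, zoneinfo typically needs the tzdata package installed.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")
MARKET_OPEN = time(9, 30)
MARKET_CLOSE = time(16, 0)


def promotion_blocked(now: datetime) -> bool:
    """Return True when `now` falls inside 9:30-16:00 ET on a weekday."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    local = now.astimezone(EASTERN)
    if local.weekday() >= 5:  # Saturday or Sunday (assumption: no blocking)
        return False
    # The close is treated as exclusive: 16:00:00 ET is no longer blocked.
    return MARKET_OPEN <= local.time() < MARKET_CLOSE
```

Passing datetime.now(tz=timezone.utc) keeps the caller timezone-aware, and the conversion to Eastern time handles daylight saving automatically.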

Troubleshooting

"Connection refused" to PostgreSQL/Redis/MinIO

Make sure Docker Desktop is running and docker compose ps shows all services healthy. On Windows, localhost should work since Docker Desktop maps ports to the host.

Ollama model not found

Run docker exec stonks-oracle-ollama-1 ollama pull qwen3.5:9b and wait for the download to complete. Check available models with docker exec stonks-oracle-ollama-1 ollama list.

Ollama is slow (no GPU)

Without a GPU, Ollama runs on CPU and extraction takes 2-5 minutes per document. If you have an NVIDIA GPU, ensure Docker Desktop has GPU support enabled and the NVIDIA Container Toolkit is installed. See Ollama Docker GPU docs.

Migrations didn't run

If the database is empty, the migrations may not have run on first start. You can apply them manually from Command Prompt (PowerShell does not support < input redirection):

# Pipe each migration into psql inside the container, in numeric order
docker exec -i stonks-oracle-postgres-1 psql -U stonks -d stonks < infra/migrations/001_initial_schema.sql
docker exec -i stonks-oracle-postgres-1 psql -U stonks -d stonks < infra/migrations/002_documents_and_intelligence.sql
# ... repeat for all 30 migration files

Or run them all at once from PowerShell:

Get-ChildItem infra\migrations\*.sql | Sort-Object Name | ForEach-Object {
    Write-Host "Applying $($_.Name)..."
    Get-Content $_.FullName | docker exec -i stonks-oracle-postgres-1 psql -U stonks -d stonks
}
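
A shell-independent alternative is to drive the same docker exec call from Python. This is a minimal sketch assuming the container name and credentials used throughout this guide; ordered_migrations, apply_migration, and apply_all are hypothetical helpers. Unlike the loop above, it passes ON_ERROR_STOP=1 to psql so that a failing migration stops the run.

```python
import subprocess
from pathlib import Path

POSTGRES_CONTAINER = "stonks-oracle-postgres-1"


def ordered_migrations(directory: str = "infra/migrations") -> list[Path]:
    """Return *.sql files sorted by name, so 001_ runs before 002_."""
    return sorted(Path(directory).glob("*.sql"), key=lambda path: path.name)


def apply_migration(path: Path) -> None:
    """Pipe one SQL file into psql inside the PostgreSQL container."""
    subprocess.run(
        ["docker", "exec", "-i", POSTGRES_CONTAINER,
         "psql", "-U", "stonks", "-d", "stonks", "-v", "ON_ERROR_STOP=1"],
        input=path.read_bytes(),
        check=True,  # raise CalledProcessError if psql exits non-zero
    )


def apply_all(directory: str = "infra/migrations") -> None:
    """Apply every migration in name order, stopping at the first failure."""
    for path in ordered_migrations(directory):
        print(f"Applying {path.name}...")
        apply_migration(path)
```

Run apply_all() from the project root with the infrastructure containers up.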

Frontend can't reach the API

When running the frontend with npm run dev, Vite proxies /api/ requests. Make sure the Query API is running on port 8000. If using different ports, set the Vite env vars:

set VITE_QUERY_API_URL=http://localhost:8000
set VITE_SYMBOL_REGISTRY_URL=http://localhost:8001
set VITE_RISK_ENGINE_URL=http://localhost:8002
npm run dev

WSL 2 memory issues

Docker Desktop on WSL 2 can consume a lot of memory. Create or edit %USERPROFILE%\.wslconfig:

[wsl2]
memory=8GB
processors=4

Then restart WSL by running wsl --shutdown from PowerShell.

Stopping Everything

# Stop infrastructure
docker compose down

# Stop infrastructure AND delete all data (fresh start)
docker compose down -v

The -v flag removes the named volumes (database data, MinIO objects, Ollama models). Omit it to preserve data between restarts.
