From 6136a767dacc68dbcc3c07febd04e3029402f508 Mon Sep 17 00:00:00 2001
From: Celes Renata
Date: Fri, 17 Apr 2026 23:47:51 +0000
Subject: [PATCH] docs: add Windows Docker Desktop local dev setup guide

---
 docs/LOCAL_DEV_SETUP.md | 445 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 445 insertions(+)
 create mode 100644 docs/LOCAL_DEV_SETUP.md

diff --git a/docs/LOCAL_DEV_SETUP.md b/docs/LOCAL_DEV_SETUP.md
new file mode 100644
index 0000000..cb7ea46
--- /dev/null
+++ b/docs/LOCAL_DEV_SETUP.md
@@ -0,0 +1,445 @@
+# Stonks Oracle: Local Development Setup (Windows + Docker Desktop)
+
+This guide walks you through setting up Stonks Oracle on a Windows machine using Docker Desktop. By the end you will have the full platform running locally: PostgreSQL, Redis, MinIO, Ollama, Trino, and all application services.
+
+## Prerequisites
+
+- **Windows 10/11** with WSL 2 enabled
+- **Docker Desktop** for Windows (with WSL 2 backend)
+- **Git** (Git for Windows or via WSL)
+- **Python 3.12** (for running services outside Docker during development)
+- **Node.js 24** (for frontend development)
+
+### Install Docker Desktop
+
+1. Download from [https://www.docker.com/products/docker-desktop/](https://www.docker.com/products/docker-desktop/)
+2. During install, ensure "Use WSL 2 instead of Hyper-V" is checked
+3. After install, open Docker Desktop → Settings → Resources → WSL Integration → enable for your distro
+4. Allocate at least **8 GB RAM** and **4 CPUs** in Settings → Resources (Ollama needs room)
+
+### Install Python 3.12
+
+Download from [https://www.python.org/downloads/](https://www.python.org/downloads/) and check "Add Python to PATH" during install. Or use `winget install Python.Python.3.12` from PowerShell.
+
+### Install Node.js 24
+
+Download from [https://nodejs.org/](https://nodejs.org/) (LTS or Current, 24.x). Or use `winget install OpenJS.NodeJS`.
+
+---
+
+## 1. Register for API Accounts
+
+You need two API accounts.
+Both have free tiers that work for development.
+
+### Polygon.io (Market Data)
+
+1. Go to [https://polygon.io/](https://polygon.io/)
+2. Sign up for a free account
+3. Navigate to Dashboard → API Keys
+4. Copy your API key; this becomes `MARKET_DATA_API_KEY`
+
+The free tier gives you delayed data and limited API calls. Paid tiers ($29+/mo) give real-time data and higher rate limits.
+
+### Alpaca (Paper Trading)
+
+1. Go to [https://alpaca.markets/](https://alpaca.markets/)
+2. Sign up for a free account
+3. Navigate to the **Paper Trading** dashboard (not live)
+4. Go to API Keys → Generate New Key
+5. Copy both the **API Key ID** and **Secret Key**; these become `BROKER_API_KEY` and `BROKER_API_SECRET`
+6. Your paper trading base URL is `https://paper-api.alpaca.markets`
+
+Alpaca paper trading is completely free with no time limit.
+
+---
+
+## 2. Clone the Repository
+
+```powershell
+git clone https://github.com/celesrenata/stonks-oracle.git
+cd stonks-oracle
+```
+
+---
+
+## 3. Create Your Environment File
+
+Create a `.env` file in the project root with your API keys:
+
+```ini
+# Polygon.io
+MARKET_DATA_API_KEY=your_polygon_api_key_here
+
+# Alpaca Paper Trading
+BROKER_API_KEY=your_alpaca_key_id_here
+BROKER_API_SECRET=your_alpaca_secret_key_here
+BROKER_BASE_URL=https://paper-api.alpaca.markets
+BROKER_MODE=paper
+```
+
+This file is gitignored. Keep it safe.
+
+---
+
+## 4. Start Infrastructure Services
+
+The `docker-compose.yml` in the project root defines all infrastructure services.
+Start them:
+
+```powershell
+docker compose up -d
+```
+
+This starts:
+
+| Service | Port | Purpose |
+|---------|------|---------|
+| PostgreSQL 16 | 5432 | Primary database |
+| Redis 7 | 6379 | Job queues and caching |
+| MinIO | 9000 (API), 9001 (Console) | Object storage for artifacts |
+| Ollama | 11434 | Local LLM inference |
+| Trino | 8080 | SQL query engine for lakehouse |
+| Hive Metastore | 9083 | Metadata catalog for Trino |
+| Superset | 8088 | Analytics dashboards |
+
+The `minio-init` sidecar automatically creates the required storage buckets.
+
+### Verify everything is running
+
+```powershell
+docker compose ps
+```
+
+All services should show `running` (healthy). Give it 30-60 seconds for health checks to pass.
+
+### Access the UIs
+
+- **MinIO Console**: [http://localhost:9001](http://localhost:9001) (login: `minioadmin` / `minioadmin`)
+- **Superset**: [http://localhost:8088](http://localhost:8088) (login: `admin` / `admin`)
+- **Trino**: [http://localhost:8080](http://localhost:8080)
+
+---
+
+## 5. Pull the Ollama Model
+
+Stonks Oracle uses the `qwen3.5:9b` model for document extraction and event classification. Pull it:
+
+```powershell
+docker exec -it stonks-oracle-ollama-1 ollama pull qwen3.5:9b
+```
+
+This downloads ~5 GB. If you have a GPU and want faster inference, make sure Docker Desktop has GPU passthrough enabled (Settings → Resources → GPU). Ollama will auto-detect CUDA GPUs.
+
+To verify the model is available:
+
+```powershell
+docker exec stonks-oracle-ollama-1 ollama list
+```
+
+---
+
+## 6. Set Up the Python Environment
+
+```powershell
+python -m venv .venv
+.venv\Scripts\activate
+pip install -r requirements.txt
+```
+
+### Verify the database migrations ran
+
+PostgreSQL auto-runs the migration SQL files from `infra/migrations/` on first start (they are mounted into `/docker-entrypoint-initdb.d`).
+First confirm that the database accepts connections:
+
+```powershell
+python -c "import asyncio, asyncpg; asyncio.run(asyncpg.connect('postgresql://stonks:stonks_dev@localhost:5432/stonks')); print('Connected!')"
+```
+
+Then list the tables by running `psql` inside the PostgreSQL container:
+
+```powershell
+docker exec stonks-oracle-postgres-1 psql -U stonks -d stonks -c "\dt" | Select-Object -First 20
+```
+
+You should see tables like `companies`, `documents`, `trend_windows`, `recommendations`, `orders`, etc.
+
+### Seed the company universe
+
+```powershell
+python -m services.symbol_registry.seed
+```
+
+This populates 50 companies across 10 sectors and 46 competitor relationships.
+
+---
+
+## 7. Run the Application Services
+
+You can run services directly with Python (for development) or build Docker images.
+
+### Option A: Run directly with Python (recommended for development)
+
+Open a separate terminal window for each service. Each needs the virtualenv activated and the relevant environment variables set (PowerShell `$env:` syntax shown):
+
+```powershell
+# Terminal 1: Scheduler (triggers ingestion on a cadence)
+.venv\Scripts\activate
+$env:MARKET_DATA_API_KEY = "your_polygon_key"
+python -m services.scheduler.app
+
+# Terminal 2: Ingestion (fetches articles, filings, market data)
+.venv\Scripts\activate
+$env:MARKET_DATA_API_KEY = "your_polygon_key"
+python -m services.ingestion.worker
+
+# Terminal 3: Parser (normalizes raw documents)
+.venv\Scripts\activate
+python -m services.parser.worker
+
+# Terminal 4: Extractor (LLM-based intelligence extraction)
+.venv\Scripts\activate
+python -m services.extractor.main
+
+# Terminal 5: Aggregation (merges signals into trend summaries)
+.venv\Scripts\activate
+python -m services.aggregation.main
+
+# Terminal 6: Recommendation (generates trade recommendations)
+.venv\Scripts\activate
+python -m services.recommendation.main
+
+# Terminal 7: Query API (REST API for the dashboard)
+.venv\Scripts\activate
+uvicorn services.api.app:app --host 0.0.0.0 --port 8000
+
+# Terminal 8: Symbol Registry (company CRUD API)
+.venv\Scripts\activate
+uvicorn services.symbol_registry.app:app --host 0.0.0.0 --port 8001
+
+# Terminal 9: Risk Engine
+.venv\Scripts\activate
+uvicorn services.risk.app:app --host 0.0.0.0 --port 8002
+
+# Terminal 10: Trading Engine (autonomous paper trading)
+.venv\Scripts\activate
+$env:BROKER_API_KEY = "your_alpaca_key"
+$env:BROKER_API_SECRET = "your_alpaca_secret"
+$env:BROKER_BASE_URL = "https://paper-api.alpaca.markets"
+uvicorn services.trading.app:app --host 0.0.0.0 --port 8003
+
+# Terminal 11: Broker Adapter (executes trades via Alpaca)
+.venv\Scripts\activate
+$env:BROKER_API_KEY = "your_alpaca_key"
+$env:BROKER_API_SECRET = "your_alpaca_secret"
+$env:BROKER_BASE_URL = "https://paper-api.alpaca.markets"
+python -m services.adapters.broker_service
+```
+
+Not all services are required for basic development. The minimum set is:
+
+- **scheduler** + **ingestion** + **parser** + **extractor**: to get data flowing
+- **aggregation** + **recommendation**: to generate signals
+- **query-api**: to serve the dashboard
+
+Add the trading services when you want to test paper trading.
+
+### Option B: Build and run as Docker containers
+
+```powershell
+# Build the Python service image
+docker build -t stonks-oracle/services -f docker/Dockerfile .
+
+# Build the frontend
+docker build -t stonks-oracle/dashboard -f frontend/Dockerfile frontend/
+
+# Run a service (example: scheduler); the backtick is PowerShell's line continuation
+docker run --rm --network host `
+  -e MARKET_DATA_API_KEY=your_polygon_key `
+  -e POSTGRES_HOST=localhost `
+  -e REDIS_HOST=localhost `
+  -e MINIO_ENDPOINT=localhost:9000 `
+  -e OLLAMA_BASE_URL=http://localhost:11434 `
+  -e SERVICE_CMD="python -m services.scheduler.app" `
+  stonks-oracle/services
+```
+
+---
+
+## 8. Run the Frontend Dashboard
+
+```powershell
+cd frontend
+npm install
+npm run dev
+```
+
+The dashboard starts at [http://localhost:5173](http://localhost:5173). It proxies API requests to the backend services via Vite's dev server.
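Because the dev server only forwards requests, the dashboard looks broken whenever a backend is down. A quick TCP check can confirm that the backends are listening before you open the browser. The sketch below is not part of the repository: it assumes the ports used by the `uvicorn` commands in step 7 (8000, 8001, 8002), and the names `BACKENDS`, `is_listening`, and `check_backends` are hypothetical.

```python
import socket

# Assumed ports, copied from the uvicorn commands in step 7; adjust if you changed them.
BACKENDS = {"query-api": 8000, "symbol-registry": 8001, "risk-engine": 8002}


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_backends(host="127.0.0.1"):
    """Map each assumed backend name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in BACKENDS.items()}


if __name__ == "__main__":
    for name, up in check_backends().items():
        print(f"{name:16} {'up' if up else 'DOWN'}")
```

A port that accepts connections only shows that something is listening there; it does not prove the service is healthy.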
+
+For production-like testing, build and serve with nginx:
+
+```powershell
+cd frontend
+docker build -t stonks-oracle/dashboard .
+docker run --rm -p 8080:8080 --network host stonks-oracle/dashboard
+```
+
+Then visit [http://localhost:8080](http://localhost:8080). Trino from step 4 also uses port 8080, so stop the Trino container first to avoid a port conflict.
+
+---
+
+## 9. Run Tests
+
+### Python tests
+
+```powershell
+.venv\Scripts\activate
+pip install ruff pytest pytest-asyncio hypothesis
+ruff check services/
+python -m pytest tests/ -x --tb=short -q
+```
+
+### Frontend tests
+
+```powershell
+cd frontend
+npx vitest --run
+```
+
+---
+
+## 10. How the Pipeline Works
+
+Once services are running, the data flows automatically:
+
+```
+Scheduler (every 15s)
+  → enqueues ingestion jobs for due sources
+  → Ingestion fetches articles/filings/market data from Polygon
+  → Parser normalizes raw text
+  → Extractor calls Ollama to extract structured intelligence
+  → Aggregation merges signals into trend summaries
+  → Recommendation generates buy/sell/watch signals
+  → Trading Engine evaluates and executes paper trades
+  → Broker Adapter submits orders to Alpaca
+```
+
+Monitor the pipeline by checking the length of each Redis queue:
+
+```powershell
+docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:ingestion
+docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:parsing
+docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:extraction
+docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:aggregation
+docker exec stonks-oracle-redis-1 redis-cli llen stonks:queue:recommendation
+```
+
+---
+
+## Environment Variable Reference
+
+All services read configuration from environment variables with sensible defaults for local development. You only need to set the ones that differ from defaults.
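As an illustration of this defaults-with-overrides pattern, here is a minimal sketch. It is not the project's actual configuration code, which may be organized differently. The names `InfraSettings`, `load_settings`, and `postgres_dsn` are hypothetical, and the default values are copied from the Infrastructure table in this reference.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class InfraSettings:
    """Hypothetical settings holder; not the project's real config class."""
    postgres_host: str
    postgres_port: int
    postgres_db: str
    postgres_user: str
    postgres_password: str
    redis_host: str
    redis_port: int
    ollama_base_url: str


def load_settings(env=None):
    """Read settings from `env` (os.environ by default), falling back to the documented defaults."""
    env = os.environ if env is None else env
    return InfraSettings(
        postgres_host=env.get("POSTGRES_HOST", "localhost"),
        postgres_port=int(env.get("POSTGRES_PORT", "5432")),
        postgres_db=env.get("POSTGRES_DB", "stonks"),
        postgres_user=env.get("POSTGRES_USER", "stonks"),
        postgres_password=env.get("POSTGRES_PASSWORD", "stonks_dev"),
        redis_host=env.get("REDIS_HOST", "localhost"),
        redis_port=int(env.get("REDIS_PORT", "6379")),
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
    )


def postgres_dsn(s):
    """Build a PostgreSQL connection URL from the settings."""
    return (
        f"postgresql://{s.postgres_user}:{s.postgres_password}"
        f"@{s.postgres_host}:{s.postgres_port}/{s.postgres_db}"
    )
```

With no variables set, `postgres_dsn(load_settings())` yields the same URL used in the connection check in step 6. Setting, for example, `POSTGRES_PORT` changes only that one field.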
+
+### Required (no useful default)
+
+| Variable | Description |
+|----------|-------------|
+| `MARKET_DATA_API_KEY` | Polygon.io API key |
+| `BROKER_API_KEY` | Alpaca API key ID |
+| `BROKER_API_SECRET` | Alpaca API secret |
+
+### Infrastructure (defaults work with docker-compose)
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `POSTGRES_HOST` | `localhost` | PostgreSQL host |
+| `POSTGRES_PORT` | `5432` | PostgreSQL port |
+| `POSTGRES_DB` | `stonks` | Database name |
+| `POSTGRES_USER` | `stonks` | Database user |
+| `POSTGRES_PASSWORD` | `stonks_dev` | Database password |
+| `REDIS_HOST` | `localhost` | Redis host |
+| `REDIS_PORT` | `6379` | Redis port |
+| `REDIS_PASSWORD` | *(none)* | Redis password (not set in dev) |
+| `MINIO_ENDPOINT` | `localhost:9000` | MinIO API endpoint |
+| `MINIO_ACCESS_KEY` | `minioadmin` | MinIO access key |
+| `MINIO_SECRET_KEY` | `minioadmin` | MinIO secret key |
+| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama API URL |
+| `OLLAMA_MODEL` | `qwen3.5:9b` | LLM model name |
+
+### Trading
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `BROKER_MODE` | `paper` | Trading mode (`paper` or `live`) |
+| `BROKER_PROVIDER` | `alpaca` | Broker provider |
+| `BROKER_BASE_URL` | *(none)* | Alpaca API URL (set to `https://paper-api.alpaca.markets`) |
+
+---
+
+## Troubleshooting
+
+### "Connection refused" to PostgreSQL/Redis/MinIO
+
+Make sure Docker Desktop is running and `docker compose ps` shows all services healthy. On Windows, `localhost` should work since Docker Desktop maps ports to the host.
+
+### Ollama model not found
+
+Run `docker exec stonks-oracle-ollama-1 ollama pull qwen3.5:9b` and wait for the download to complete. Check available models with `docker exec stonks-oracle-ollama-1 ollama list`.
+
+### Ollama is slow (no GPU)
+
+Without a GPU, Ollama runs on CPU and extraction takes 2-5 minutes per document.
+If you have an NVIDIA GPU, ensure Docker Desktop has GPU support enabled and the NVIDIA Container Toolkit is installed. See [Ollama Docker GPU docs](https://github.com/ollama/ollama/blob/main/docs/docker.md).
+
+### Migrations didn't run
+
+If the database is empty, the migrations may not have run on first start. You can apply them manually:
+
+```powershell
+# Pipe each migration file into psql, in order (PowerShell has no "<" input redirection)
+Get-Content infra/migrations/001_initial_schema.sql | docker exec -i stonks-oracle-postgres-1 psql -U stonks -d stonks
+Get-Content infra/migrations/002_documents_and_intelligence.sql | docker exec -i stonks-oracle-postgres-1 psql -U stonks -d stonks
+# ... repeat for all 30 migration files
+```
+
+Or run them all at once:
+
+```powershell
+Get-ChildItem infra\migrations\*.sql | Sort-Object Name | ForEach-Object {
+    Write-Host "Applying $($_.Name)..."
+    Get-Content $_.FullName | docker exec -i stonks-oracle-postgres-1 psql -U stonks -d stonks
+}
+```
+
+### Frontend can't reach the API
+
+When running the frontend with `npm run dev`, Vite proxies `/api/` requests. Make sure the Query API is running on port 8000. If using different ports, set the Vite env vars:
+
+```powershell
+$env:VITE_QUERY_API_URL = "http://localhost:8000"
+$env:VITE_SYMBOL_REGISTRY_URL = "http://localhost:8001"
+$env:VITE_RISK_ENGINE_URL = "http://localhost:8002"
+npm run dev
+```
+
+### WSL 2 memory issues
+
+Docker Desktop on WSL 2 can consume a lot of memory. To cap it, create or edit `%USERPROFILE%\.wslconfig`:
+
+```ini
+[wsl2]
+memory=8GB
+processors=4
+```
+
+Then restart WSL by running `wsl --shutdown` from PowerShell.
+
+---
+
+## Stopping Everything
+
+```powershell
+# Stop infrastructure
+docker compose down
+
+# Stop infrastructure AND delete all data (fresh start)
+docker compose down -v
+```
+
+The `-v` flag removes the named volumes (database data, MinIO objects, Ollama models). Omit it to preserve data between restarts.