# Development Process: Test-Develop-Debug

## Local Environment

- Ubuntu dev machine, Python 3.12, virtualenv at `.venv/`
- Always use `.venv/bin/python` or activate with `source .venv/bin/activate` before running Python commands
- Node.js 24 via nvm. Always load nvm before running Node/npm/npx commands: `export NVM_DIR="$HOME/.nvm" && [ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh" && nvm use 24`
- For tools not available in `.venv/` (ruff, gh, etc.), install via pip or apt as needed
- Frontend tests: load nvm first, then `cd frontend && npx vitest --run`
- Python lint and tests: `.venv/bin/ruff check services/`, then `.venv/bin/python -m pytest tests/ -x --tb=short -q`

## Workflow

1. Write or update tests for the target behavior
2. Implement the minimal code to pass
3. Debug failures, fix, re-run
4. Commit and push; CI builds images automatically
5. Deploy: `helm upgrade --install stonks-oracle infra/helm/stonks-oracle -n stonks-oracle`
6. Restart changed services: `kubectl rollout restart deployment/ -n stonks-oracle`

## Testing

- Python: `pytest` with `pytest-asyncio` for async code, tests in `tests/`
- Property-based tests: Hypothesis with `@settings(max_examples=100)`, files prefixed `test_pbt_*`
- Frontend: Vitest + MSW (Mock Service Worker) for deterministic API mocking, tests in `frontend/src/test/`
- Run Python tests: `.venv/bin/python -m pytest tests/ -x --tb=short -q`
- Run frontend tests: `cd frontend && npx vitest --run`
- Lint Python: `.venv/bin/ruff check services/`
- **Always run `.venv/bin/ruff check services/` before committing Python changes**; CI will reject the push if ruff fails
- Ruff auto-fix: `.venv/bin/ruff check --fix services/` (fixes import sorting and other auto-fixable issues)
- Ruff is pinned to `ruff==0.15.10` in `requirements.txt`; CI uses the same version
- Ruff config: `ruff.toml` with `known-first-party = ["services"]` for consistent import sorting
- Pre-existing test failures (not regressions): `test_extractor_prompts.py`,
  `test_extractor_schemas.py`, `test_filings_adapter.py`, `test_ollama_client.py`

## CI/CD: GitHub Actions

- Workflow: `.github/workflows/build.yml`
- Triggers on pushes to `main` and on PRs
- Jobs:
  - `lint-and-test`: ruff lint + pytest + frontend vitest (Node 24)
  - `build-services`: matrix build of all Python services → GHCR
  - `build-dashboard`: frontend/Dockerfile → GHCR (TypeScript strict mode, which catches unused imports)
  - `build-superset`: docker/Dockerfile.superset → GHCR
- CI handles all image builds and pushes; do NOT manually docker push
- Check CI: `gh run list -L 3`
- Re-run failed: `gh run rerun --failed`
- View failure logs: `gh run view --log-failed`

## Deploy

- Full deploy/redeploy: `bash ~/sources/kube/stonks-oracle/runmefirst.sh` (from gremlin-1)
- Full teardown: `bash ~/sources/kube/stonks-oracle/runmelast.sh` (from gremlin-1)
- Quick Helm upgrade: `helm upgrade --install stonks-oracle infra/helm/stonks-oracle -n stonks-oracle`
- Restart single service: `kubectl rollout restart deployment/ -n stonks-oracle`
- Restart multiple: `kubectl rollout restart deployment/aggregation deployment/query-api deployment/dashboard -n stonks-oracle`
- Check pods: `kubectl get pods -n stonks-oracle`
- Check logs: `kubectl logs deployment/ -n stonks-oracle --tail=30`

## Manually Triggering Ingestion

The scheduler runs on a cadence. To trigger ingestion immediately, enqueue a job for every active source:

```python
# From a scheduler pod:
kubectl exec -n stonks-oracle -- python -c "
import redis, json, asyncio, asyncpg

async def enqueue():
    pool = await asyncpg.create_pool(dsn='postgresql://stonks:@postgresql-rw.postgresql-service.svc.cluster.local:5432/stonks')
    r = redis.from_url('redis://:@redis-master.redis-service.svc.cluster.local:6379/0')
    # One row per active source that belongs to an active company
    rows = await pool.fetch('SELECT s.id AS source_id, s.source_type, s.config, c.id AS company_id, c.ticker FROM sources s JOIN companies c ON c.id = s.company_id WHERE s.active = TRUE AND c.active = TRUE')
    for row in rows:
        # asyncpg returns json/jsonb columns as str unless a codec is registered
        cfg = row['config']
        if isinstance(cfg, str):
            cfg = json.loads(cfg)
        if not isinstance(cfg, dict):
            cfg = {}
        # UUID columns are converted to str so the payload is JSON-serializable
        r.rpush('stonks:queue:ingestion', json.dumps({'source_id': str(row['source_id']), 'source_type': row['source_type'], 'ticker': row['ticker'], 'company_id': str(row['company_id']), 'config': cfg}))
    await pool.close()

asyncio.run(enqueue())
"
```

Ingestion jobs MUST include `source_id`, `source_type`, `ticker`, `company_id`, and `config`.

## Git Conventions

- Commit after each completed phase task
- Commit message format: prefix with `feat:`, `fix:`, or `phase N:`
- Push to `main` triggers CI

## Code Style

- Python 3.12, type hints everywhere
- Pydantic for data validation
- FastAPI for HTTP services
- asyncio + asyncpg/aioredis for async I/O
- Minimal dependencies, prefer stdlib where possible
- Frontend: React 19, TypeScript strict mode, Tailwind CSS, TanStack Router/Query, Recharts for charts
- UUID fields from asyncpg must be converted to str via `_row_dict()` helpers
- asyncpg interval parameters must be Python `timedelta` objects, not SQL strings
- Recharts v3 callbacks: do NOT add explicit type annotations on `formatter`/`tickFormatter`; let TypeScript infer them

## Common Pitfalls

- asyncpg expects `timedelta` for `$N::interval` params, not strings like `'7 days'`
- asyncpg expects UUID objects/strings for `$N::uuid` params; synthetic IDs like `pattern:AAPL:earnings:7d` will fail
- The `competitor_relationships` table uses UUID company IDs, so queries must join through `companies` to match by ticker
- The dashboard Docker build uses TypeScript strict mode, so unused imports that pass local diagnostics will fail in CI
- Ingestion jobs require `source_id` from the `sources` table; don't just pass `ticker`

## Documentation

- Do NOT create large summary/success markdown files after each step
- Keep notes short, concise, and organized under `docs/notes/`
- If a note isn't useful for future reference, don't write it
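
The ingestion-job and UUID rules from the Code Style and Common Pitfalls sections can be sketched as a small pre-enqueue check. This is only an illustration: `row_dict` and `build_ingestion_job` are hypothetical names, not the project's actual `_row_dict()` helpers, and the sketch assumes rows behave like mappings with an `.items()` method.

```python
import json
import uuid
from typing import Any, Mapping

# Field names taken from the "Ingestion jobs MUST include ..." rule.
REQUIRED_JOB_FIELDS = ("source_id", "source_type", "ticker", "company_id", "config")


def row_dict(row: Mapping[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for the project's `_row_dict()` helpers.

    Copies a row-like mapping and converts uuid.UUID values to str so the
    result can be passed to json.dumps.
    """
    return {k: str(v) if isinstance(v, uuid.UUID) else v for k, v in row.items()}


def build_ingestion_job(row: Mapping[str, Any]) -> str:
    """Serialize one ingestion job, rejecting incomplete or non-UUID payloads."""
    job = row_dict(row)
    missing = [f for f in REQUIRED_JOB_FIELDS if f not in job]
    if missing:
        raise ValueError(f"ingestion job is missing fields: {missing}")
    for field in ("source_id", "company_id"):
        # uuid.UUID() raises ValueError for synthetic IDs such as
        # "pattern:AAPL:earnings:7d", which would also fail as $N::uuid params.
        uuid.UUID(job[field])
    return json.dumps(job)
```

Calling `build_ingestion_job` on a row that carries only `ticker` raises `ValueError` before anything reaches the queue, which is the failure mode the last pitfall warns about.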