# Development Process — Test-Develop-Debug

## Local Environment

- Ubuntu dev machine, Python 3.12, virtualenv at `.venv/`
- Always use `.venv/bin/python` or activate with `source .venv/bin/activate` before running Python commands
- Node.js 24 via nvm; always load nvm before running Node/npm/npx commands:
  `export NVM_DIR="$HOME/.nvm" && [ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh" && nvm use 24`
- For tools not available in `.venv/` (ruff, gh, etc.), install via pip or apt as needed
- Frontend tests: load nvm first, then `cd frontend && npx vitest --run`
- Python tests: `.venv/bin/ruff check services/` then `.venv/bin/python -m pytest tests/ -x --tb=short -q`

## Workflow

1. Write or update tests for the target behavior
2. Implement the minimal code to pass
3. Debug failures, fix, re-run
4. Commit and push; CI builds images automatically
5. Deploy: `helm upgrade --install stonks-oracle infra/helm/stonks-oracle -n stonks-oracle`
6. Restart changed services: `kubectl rollout restart deployment/<name> -n stonks-oracle`

## Testing

- Python: `pytest` with `pytest-asyncio` for async code, tests in `tests/`
- Property-based tests: Hypothesis with `@settings(max_examples=100)`, files prefixed `test_pbt_*`
- Frontend: Vitest + MSW (Mock Service Worker) for deterministic API mocking, tests in `frontend/src/test/`
- Run Python tests: `.venv/bin/python -m pytest tests/ -x --tb=short -q`
- Run frontend tests: `cd frontend && npx vitest --run`
- Lint Python: `.venv/bin/ruff check services/`
- **Always run `.venv/bin/ruff check services/` before committing Python changes**; CI will reject the push if ruff fails
- Ruff auto-fix: `.venv/bin/ruff check --fix services/` (fixes import sorting and other auto-fixable issues)
- Ruff is pinned to `ruff==0.15.10` in `requirements.txt`; CI uses the same version
- Ruff config: `ruff.toml` with `known-first-party = ["services"]` for consistent import sorting
- Pre-existing test failures (not regressions): `test_extractor_prompts.py`, `test_extractor_schemas.py`, `test_filings_adapter.py`, `test_ollama_client.py`

## CI/CD — Woodpecker CI (Gitea) → GitHub promotion

- Woodpecker pipelines in `.woodpecker/`, triggered by push to `main` on Gitea
- Push to Gitea: `git push gitea main`
- Gitea remote: `http://admin:<password>@10.1.1.12:30300/admin/stonks-oracle.git`
- Pipeline stages: lint → pytest → frontend vitest → build all service images + dashboard + superset → push to Harbor
- ArgoCD watches Gitea `main` and auto-syncs beta/paper/live stages
- **Do NOT push directly to GitHub**; GitHub is the promotion target after CI passes
- Once Woodpecker builds and tests pass, code is promoted to GitHub (`git push origin main`)
- CI handles all image builds and pushes; do NOT manually docker push
- Check Woodpecker CI status from the Gitea web UI or Woodpecker dashboard

## Deploy

- Full deploy/redeploy: `bash ~/sources/kube/stonks-oracle/runmefirst.sh` (from gremlin-1)
- Full teardown: `bash ~/sources/kube/stonks-oracle/runmelast.sh` (from gremlin-1)
- Quick Helm upgrade: `helm upgrade --install stonks-oracle infra/helm/stonks-oracle -n stonks-oracle`
- Restart single service: `kubectl rollout restart deployment/<name> -n stonks-oracle`
- Restart multiple: `kubectl rollout restart deployment/aggregation deployment/query-api deployment/dashboard -n stonks-oracle`
- Check pods: `kubectl get pods -n stonks-oracle`
- Check logs: `kubectl logs deployment/<name> -n stonks-oracle --tail=30`

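When several services change at once, it can help to generate the restart command instead of typing it by hand. The following is a minimal sketch, not part of the repository: `rollout_restart_cmd` is a hypothetical helper name, and it only builds the argument list for the `kubectl rollout restart` form shown in this section.

```python
import shlex

NAMESPACE = "stonks-oracle"  # namespace used by the commands in this guide


def rollout_restart_cmd(services: list[str], namespace: str = NAMESPACE) -> list[str]:
    """Build the argv for restarting one or more deployments (hypothetical helper)."""
    if not services:
        raise ValueError("at least one service name is required")
    targets = [f"deployment/{name}" for name in services]
    return ["kubectl", "rollout", "restart", *targets, "-n", namespace]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it.
    print(shlex.join(rollout_restart_cmd(["aggregation", "query-api", "dashboard"])))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine that has `kubectl` configured for the cluster.
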
## Manually Triggering Ingestion

The scheduler runs on a cadence, but to trigger ingestion immediately:

```bash
# Run from a machine with kubectl access; the script executes inside a scheduler pod:
kubectl exec -n stonks-oracle <scheduler-pod> -- python -c "
import redis, json, asyncio, asyncpg

async def enqueue():
    pool = await asyncpg.create_pool(dsn='postgresql://stonks:<password>@postgresql-rw.postgresql-service.svc.cluster.local:5432/stonks')
    r = redis.from_url('redis://:<password>@redis-master.redis-service.svc.cluster.local:6379/0')
    rows = await pool.fetch('SELECT s.id AS source_id, s.source_type, s.config, c.id AS company_id, c.ticker FROM sources s JOIN companies c ON c.id = s.company_id WHERE s.active = TRUE AND c.active = TRUE')
    for row in rows:
        cfg = row['config']
        # asyncpg returns json/jsonb columns as str unless a type codec is registered, so decode first
        if isinstance(cfg, str):
            cfg = json.loads(cfg)
        if not isinstance(cfg, dict):
            cfg = {}
        r.rpush('stonks:queue:ingestion', json.dumps({'source_id': str(row['source_id']), 'source_type': row['source_type'], 'ticker': row['ticker'], 'company_id': str(row['company_id']), 'config': cfg}))
    await pool.close()

asyncio.run(enqueue())
"
```

Ingestion jobs MUST include `source_id`, `source_type`, `ticker`, `company_id`, and `config`.

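The required-field rule can be enforced before anything is pushed onto the queue. This is a sketch under assumptions: `build_ingestion_job` and `REQUIRED_JOB_FIELDS` are hypothetical names, and the payload shape simply mirrors the one used in the manual trigger script.

```python
import json
from typing import Any, Mapping

# Field names taken from the ingestion job requirement in this guide.
REQUIRED_JOB_FIELDS = ("source_id", "source_type", "ticker", "company_id", "config")


def build_ingestion_job(row: Mapping[str, Any]) -> str:
    """Serialize a job payload, refusing rows that lack a required field."""
    missing = [name for name in REQUIRED_JOB_FIELDS if row.get(name) is None]
    if missing:
        raise ValueError(f"ingestion job missing fields: {', '.join(missing)}")
    if not isinstance(row["config"], dict):
        raise TypeError("config must be a dict")
    return json.dumps({
        "source_id": str(row["source_id"]),    # UUID objects are not JSON serializable
        "source_type": row["source_type"],
        "ticker": row["ticker"],
        "company_id": str(row["company_id"]),
        "config": row["config"],
    })
```

A payload that carries only `ticker` raises `ValueError` here, which surfaces the mistake at enqueue time instead of inside the ingestion worker.
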
## Git Conventions

- Commit after each completed phase task
- Commit message format: prefix with `feat:`, `fix:`, or `phase N:`
- Always push to Gitea: `git push gitea main`
- Do NOT push to GitHub (`origin`) directly; GitHub is the promotion target after CI passes
- ArgoCD syncs from Gitea automatically

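The prefix convention is easy to check mechanically, for example from a local commit hook. The sketch below is an illustration only: `has_valid_prefix` is a hypothetical name, and it assumes that `N` is a decimal phase number and that the colon is followed by a space and a non-empty summary.

```python
import re

# Assumed shape: "feat: ...", "fix: ...", or "phase <number>: ..."
COMMIT_PREFIX = re.compile(r"^(feat|fix|phase \d+): \S")


def has_valid_prefix(message: str) -> bool:
    """Return True when the first line starts with an accepted prefix."""
    first_line = message.splitlines()[0] if message else ""
    return COMMIT_PREFIX.match(first_line) is not None
```

For instance, `has_valid_prefix("phase 3: wire scheduler")` is accepted, while a bare `"update stuff"` is not.
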
## Code Style

- Python 3.12, type hints everywhere
- Pydantic for data validation
- FastAPI for HTTP services
- asyncio + asyncpg/aioredis for async I/O
- Minimal dependencies, prefer stdlib where possible
- Frontend: React 19, TypeScript strict mode, Tailwind CSS, TanStack Router/Query, Recharts for charts
- UUID fields from asyncpg must be converted to str via `_row_dict()` helpers
- asyncpg interval parameters must be Python `timedelta` objects, not SQL strings
- Recharts v3 callbacks: do NOT add explicit type annotations on `formatter`/`tickFormatter`; let TypeScript infer

## Common Pitfalls

- asyncpg expects `timedelta` for `$N::interval` params, not strings like `'7 days'`
- asyncpg expects UUID objects/strings for `$N::uuid` params; synthetic IDs like `pattern:AAPL:earnings:7d` will fail
- The `competitor_relationships` table uses UUID company IDs; queries must join through `companies` to match by ticker
- The dashboard Docker build uses TypeScript strict mode; unused imports that pass local diagnostics will fail in CI
- Ingestion jobs require `source_id` from the `sources` table; don't just pass `ticker`

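The first two pitfalls can be guarded against before a query is issued. This is a minimal sketch with hypothetical helper names (`parse_window`, `is_uuid`); the short window notation (`7d`, `12h`, `30m`) is an assumption made for illustration, not a format defined by the project.

```python
import re
import uuid
from datetime import timedelta

_WINDOW = re.compile(r"^(\d+)([dhm])$")
_UNITS = {"d": "days", "h": "hours", "m": "minutes"}


def parse_window(text: str) -> timedelta:
    """Turn a short window such as '7d' into a timedelta for $N::interval params."""
    match = _WINDOW.match(text)
    if match is None:
        raise ValueError(f"unsupported window: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def is_uuid(value: str) -> bool:
    """Return True only for strings that parse as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


# Usage idea (not executed here): pass parse_window("7d") as the argument bound to
# a $N::interval placeholder, and check is_uuid(candidate_id) before binding a
# value to a $N::uuid placeholder.
```

A synthetic ID such as `pattern:AAPL:earnings:7d` fails the `is_uuid` check, so it can be routed to a ticker-based lookup instead of a `$N::uuid` parameter.
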
## Documentation

- Do NOT create large summary/success markdown files after each step
- Keep notes short, concise, and organized under `docs/notes/`
- If a note isn't useful for future reference, don't write it