spec: integration test pipeline — requirements, design, and tasks
@@ -0,0 +1,219 @@
# Integration Test Pipeline: Design

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│  Kubernetes Job: inttest-runner                                 │
│                                                                 │
│  Stage 1: Create namespace stonks-inttest-{id}                  │
│  Stage 2: Deploy infra (postgres, redis, minio)                 │
│  Stage 3: Run migrations + seed data                            │
│  Stage 4: Deploy services (query-api, registry, risk, trading)  │
│  Stage 5: Wait for readiness                                    │
│  Stage 6: Run API integration tests (pytest)                    │
│  Stage 7: Run frontend render tests (vitest + fetch)            │
│  Stage 8: Collect profiling data                                │
│  Stage 9: Teardown namespace                                    │
└─────────────────────────────────────────────────────────────────┘
```

## Implementation Approach

### Option A: Kubernetes Job with Python test runner (chosen)

A single Kubernetes Job that:
1. Creates ephemeral infra via `kubectl apply` (postgres, redis, minio manifests)
2. Runs a Python seed script directly against the DB
3. Deploys service pods pointing at the ephemeral infra
4. Runs pytest-based integration tests against the live services
5. Collects timing metrics
6. Tears down everything

### Why not Helm?

The sandbox doesn't need Helm's complexity. Plain manifests are simpler to write, faster to apply, and easier to debug. Because the infra is ephemeral, there are no upgrades or rollbacks for Helm to manage.

## Components

### 1. Sandbox Manifests (`infra/inttest/`)

```
infra/inttest/
├── namespace.yaml    # Namespace template
├── postgres.yaml     # PostgreSQL 16 StatefulSet (no PV)
├── redis.yaml        # Redis 7 Deployment
├── minio.yaml        # MinIO Deployment + bucket init Job
├── services.yaml     # All 4 API services (query-api, registry, risk, trading)
└── runner.yaml       # The test runner Job itself
```

Each manifest uses a `${NAMESPACE}` placeholder that is substituted at runtime (the runner script pipes each file through `envsubst`).

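For readers who want to see the substitution mechanics, or to render manifests from Python rather than the shell, the same `${NAMESPACE}` expansion can be sketched with the standard library's `string.Template`. The manifest fragment and the `render_manifest` helper below are illustrative assumptions, not files or functions from the repository.

```python
from string import Template

# Hypothetical manifest fragment; the real files live under infra/inttest/.
MANIFEST = """\
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: ${NAMESPACE}
"""


def render_manifest(text: str, namespace: str) -> str:
    """Replace ${NAMESPACE} and fail loudly on any other placeholder."""
    # substitute() raises KeyError for placeholders it was not given,
    # whereas envsubst silently expands unset variables to empty strings.
    return Template(text).substitute(NAMESPACE=namespace)


rendered = render_manifest(MANIFEST, "stonks-inttest-1700000000")
print(rendered)
```

Failing on unknown placeholders is a deliberate choice in this sketch: a manifest that reaches `kubectl apply` with an empty namespace field is harder to diagnose than an exception at render time.
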
### 2. Seed Script (`tests/integration/seed_sandbox.py`)

Pure SQL and MinIO operations, with no external API calls. The script inserts:
- Companies, sources, aliases, competitor relationships
- Documents with intelligence records
- Trend windows with evidence and projections
- Recommendations with evidence citations
- Orders, positions, trading decisions
- Global events with macro impacts
- AI agents with variants and performance logs
- Trading engine config, portfolio snapshots
- MinIO objects (normalized text files)

All UUIDs are deterministic (hardcoded), so test assertions are reproducible across runs.

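One way to keep the hardcoded IDs in sync between the seed script and the tests is to define them once as module-level constants and import them from both sides. The sketch below assumes that layout; `SEED_COMPANY_IDS` and `SEED_TREND_ID` are the names the tests use, but every UUID value, the `MSFT` entry, and the `company_rows` helper are placeholders invented for illustration.

```python
import uuid

# Placeholder values for illustration only; the real IDs are whatever the
# seed script hardcodes.
SEED_COMPANY_IDS = {
    "AAPL": uuid.UUID("00000000-0000-4000-8000-000000000001"),
    "MSFT": uuid.UUID("00000000-0000-4000-8000-000000000002"),
}
SEED_TREND_ID = uuid.UUID("00000000-0000-4000-8000-0000000000a1")


def company_rows():
    """Build (id, ticker) tuples, sorted by ticker, for a parameterized INSERT."""
    return [(str(cid), ticker) for ticker, cid in sorted(SEED_COMPANY_IDS.items())]


# Two calls produce identical rows, which is what makes assertions repeatable.
rows_first = company_rows()
rows_second = company_rows()
print(rows_first == rows_second)
```

Because nothing here depends on `uuid.uuid4()` or the clock, a test can assert on an exact ID without first querying the database for it.
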
### 3. Integration Tests (`tests/integration/test_api_endpoints.py`)

pytest-based tests that call every API endpoint the frontend depends on:

```python
import pytest

# Assumed: `client` is an async HTTP client fixture (for example an
# httpx.AsyncClient bound to the sandbox base URL) defined in conftest.py,
# and the seed script exports the hardcoded SEED_TREND_ID.
from tests.integration.seed_sandbox import SEED_TREND_ID


@pytest.mark.asyncio  # requires the pytest-asyncio plugin
class TestQueryAPI:
    async def test_companies_list(self, client):
        resp = await client.get("/api/companies")
        assert resp.status_code == 200
        data = resp.json()
        assert len(data) >= 5
        assert all("ticker" in c for c in data)

    async def test_trend_detail(self, client):
        resp = await client.get(f"/api/trends/{SEED_TREND_ID}")
        assert resp.status_code == 200
        data = resp.json()
        assert data["trend_direction"] in ("bullish", "bearish", "mixed")
        assert 0 <= data["confidence"] <= 1

# ... 40+ test functions
```

### 4. Frontend Render Tests (`tests/integration/test_frontend_renders.py`)

Uses the live sandbox APIs (not MSW mocks) to verify each page's data dependencies:

```python
import pytest

# Assumed: `client` is an async HTTP client fixture defined in conftest.py, and
# the seed script exports SEED_COMPANY_IDS (ticker -> hardcoded UUID).
from tests.integration.seed_sandbox import SEED_COMPANY_IDS


@pytest.mark.asyncio  # requires the pytest-asyncio plugin
class TestFrontendDataDeps:
    """Verify every API call each frontend page makes returns valid data."""

    async def test_home_page_deps(self, client):
        # Home page calls: companies, pipeline health, ingestion summary, recommendations
        # (the recommendations path is not listed in this excerpt)
        for endpoint in [
            "/api/companies",
            "/api/ops/pipeline/health",
            "/api/ops/ingestion/summary",
        ]:
            resp = await client.get(endpoint)
            assert resp.status_code == 200, endpoint  # name the failing endpoint

    async def test_company_detail_deps(self, client):
        # CompanyDetail calls 12 different endpoints; four are shown here
        company_id = SEED_COMPANY_IDS["AAPL"]
        for endpoint in [
            f"/api/companies/{company_id}",
            f"/api/companies/{company_id}/sources",
            "/api/companies/AAPL/macro-impacts",  # this path uses the ticker, not the UUID
            f"/api/companies/{company_id}/competitors",
        ]:
            resp = await client.get(endpoint)
            assert resp.status_code == 200, endpoint
```

### 5. Profiler (`tests/integration/profiler.py`)

Wraps each test with timing:
- Records wall-clock time per API call
- Computes P50/P95/P99 across all calls
- Outputs a summary table at the end
- Flags any endpoint slower than 500 ms as "slow"

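The profiler's behavior can be sketched in a few dozen lines. This is a minimal illustration, not the contents of `profiler.py`: the `Profiler` class, its method names, and the nearest-rank percentile definition are all assumptions, and an endpoint is treated as "slow" if any single call to it exceeded the threshold.

```python
import math
import time
from collections import defaultdict
from contextlib import contextmanager

SLOW_THRESHOLD_MS = 500.0  # from the spec: slower than this is "slow"


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not values:
        raise ValueError("no samples recorded")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


class Profiler:
    def __init__(self):
        self.samples = defaultdict(list)  # endpoint -> durations in ms

    def record(self, endpoint, duration_ms):
        self.samples[endpoint].append(duration_ms)

    @contextmanager
    def timed(self, endpoint):
        """Record wall-clock time for the enclosed call, even if it raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(endpoint, (time.perf_counter() - start) * 1000.0)

    def summary(self):
        """Percentiles across all calls, plus the endpoints flagged as slow."""
        all_ms = [ms for durations in self.samples.values() for ms in durations]
        slow = sorted(
            endpoint
            for endpoint, durations in self.samples.items()
            if max(durations) > SLOW_THRESHOLD_MS
        )
        return {
            "p50": percentile(all_ms, 50),
            "p95": percentile(all_ms, 95),
            "p99": percentile(all_ms, 99),
            "slow": slow,
        }


profiler = Profiler()
with profiler.timed("/api/companies"):
    time.sleep(0.01)  # stand-in for `await client.get("/api/companies")`
print(profiler.summary())
```

The `timed` context manager also works around an `await` inside an async test, since it only reads the clock on entry and exit.
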
### 6. Runner Script (`tests/integration/run_pipeline.sh`)

Orchestrates the full pipeline:

```bash
#!/bin/bash
set -euo pipefail

# Exported so envsubst can substitute ${NAMESPACE} in the manifests.
export NAMESPACE="stonks-inttest-$(date +%s)"
PROFILING_OUTPUT="inttest-results-${NAMESPACE}.json"

# Stages 8-9 run from an EXIT trap so they happen even when an earlier stage fails.
cleanup() {
  # Stage 8: Collect results (test-runner is a Pod, so address it as pod/...)
  kubectl logs pod/test-runner -n "$NAMESPACE" > "$PROFILING_OUTPUT" || true
  # Stage 9: Teardown
  kubectl delete namespace "$NAMESPACE" --wait=false --ignore-not-found
}
trap cleanup EXIT

# Stage 1: Create namespace
kubectl create namespace "$NAMESPACE"

# Stage 2: Deploy infra
envsubst < infra/inttest/postgres.yaml | kubectl apply -n "$NAMESPACE" -f -
envsubst < infra/inttest/redis.yaml | kubectl apply -n "$NAMESPACE" -f -
envsubst < infra/inttest/minio.yaml | kubectl apply -n "$NAMESPACE" -f -
kubectl wait --for=condition=ready pod -l app=postgres -n "$NAMESPACE" --timeout=120s
kubectl wait --for=condition=ready pod -l app=redis -n "$NAMESPACE" --timeout=60s
kubectl wait --for=condition=ready pod -l app=minio -n "$NAMESPACE" --timeout=60s

# Stage 3: Run migrations + seed
# The "..." below stands for further --env flags elided in this spec.
kubectl run seed-runner --image=ghcr.io/celesrenata/stonks-oracle/query-api:latest \
  -n "$NAMESPACE" --restart=Never --env="POSTGRES_HOST=postgres" ... \
  -- python -c "import asyncio; from tests.integration.seed_sandbox import seed; asyncio.run(seed())"
# `kubectl run --restart=Never` creates a Pod, not a Job, so wait on the pod phase
# (the jsonpath form of `kubectl wait` needs kubectl 1.23 or newer).
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/seed-runner -n "$NAMESPACE" --timeout=120s

# Stage 4: Deploy services
envsubst < infra/inttest/services.yaml | kubectl apply -n "$NAMESPACE" -f -

# Stage 5: Wait for readiness
kubectl wait --for=condition=ready pod -l tier=api -n "$NAMESPACE" --timeout=120s

# Stages 6-7: Run API and frontend data-dependency tests (pytest collects both files)
kubectl run test-runner --image=ghcr.io/celesrenata/stonks-oracle/query-api:latest \
  -n "$NAMESPACE" --restart=Never \
  -- python -m pytest tests/integration/ -v --tb=short
# Block until pytest exits; a failed run surfaces here as a timeout.
# The 600s value is a placeholder, not a figure from the spec.
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/test-runner -n "$NAMESPACE" --timeout=600s
```

## Profiling Strategy

### What to measure
1. **Seed insertion time**: how long it takes to populate all tables
2. **Service startup time**: time from pod creation to readiness
3. **API response times**: per-endpoint P50/P95/P99
4. **Memory usage**: `kubectl top pods` snapshot during tests (requires metrics-server in the cluster)

### Performance targets
| Metric | Target | Action if exceeded |
|--------|--------|--------------------|
| Seed insertion | < 30s | Batch INSERT optimization |
| Service startup | < 30s each | Reduce import time, lazy loading |
| API P95 | < 200ms | Query optimization, indexes |
| API P99 | < 500ms | Connection pooling, caching |
| Total pipeline | < 10 min | Parallelize stages |

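The targets table lends itself to an automated gate at the end of a run. The following is a rough sketch under stated assumptions: the metric keys (`seed_insertion_s`, `api_p95_ms`, and so on) and the `check_targets` function are hypothetical names, while the limits and actions are copied from the table.

```python
# Limits and follow-up actions from the performance targets table.
# Metric keys are hypothetical; times are in the unit named by the suffix.
TARGETS = {
    "seed_insertion_s": (30.0, "Batch INSERT optimization"),
    "service_startup_s": (30.0, "Reduce import time, lazy loading"),
    "api_p95_ms": (200.0, "Query optimization, indexes"),
    "api_p99_ms": (500.0, "Connection pooling, caching"),
    "total_pipeline_s": (600.0, "Parallelize stages"),
}


def check_targets(measured):
    """Return (metric, value, limit, action) for every target that was missed."""
    failures = []
    for name, (limit, action) in TARGETS.items():
        value = measured.get(name)
        # Targets are strict "<" limits, so hitting the limit exactly is a miss.
        if value is not None and value >= limit:
            failures.append((name, value, limit, action))
    return failures


measured = {"seed_insertion_s": 12.0, "api_p95_ms": 250.0, "api_p99_ms": 480.0}
for name, value, limit, action in check_targets(measured):
    print(f"{name}: {value} exceeds target {limit} -> {action}")
```

Metrics missing from `measured` are skipped rather than failed, so a partial run can still report on what it did collect.
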
### Optimization opportunities to discover
- Slow SQL queries (missing indexes, N+1 patterns)
- Heavy service startup (import chains)
- Inefficient aggregation math
- Unnecessary serialization overhead
- Connection pool sizing

## Data Flow

```
Seed Script
  ├── PostgreSQL: companies, documents, trends, recommendations, orders, ...
  ├── MinIO: normalized text files, audit artifacts
  └── Redis: (empty; no queue state needed for API tests)

Integration Tests
  ├── Query API       ← PostgreSQL (read-only queries)
  ├── Symbol Registry ← PostgreSQL (CRUD operations)
  ├── Risk Engine     ← PostgreSQL (evaluation + approvals)
  └── Trading Engine  ← PostgreSQL + Redis (status, decisions, backtest)
```

## Namespace Lifecycle

```
CREATE namespace
  → Deploy postgres, redis, minio
  → Wait for healthy
  → Run migrations (init container)
  → Run seed script
  → Deploy services
  → Wait for ready
  → Run tests
  → Collect results
  → DELETE namespace (always, even on failure)
```

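The "always, even on failure" guarantee maps naturally onto a `try`/`finally` block. The sketch below shows the idea as a Python context manager; `ephemeral_namespace` and the injectable `run` parameter are hypothetical names, and the `kubectl` arguments mirror those used elsewhere in this design.

```python
import subprocess
from contextlib import contextmanager


def kubectl(*args):
    """Run a kubectl command and raise if it exits non-zero."""
    subprocess.run(["kubectl", *args], check=True)


@contextmanager
def ephemeral_namespace(name, run=kubectl):
    """Create a namespace and delete it on exit, whether or not the body raised."""
    run("create", "namespace", name)
    try:
        yield name
    finally:
        # --wait=false returns immediately; deletion continues in the cluster.
        run("delete", "namespace", name, "--wait=false")


# Demonstration with a fake runner, so no cluster is needed.
calls = []
try:
    with ephemeral_namespace("stonks-inttest-demo", run=lambda *a: calls.append(a)):
        raise RuntimeError("simulated test failure")
except RuntimeError:
    pass
print(calls)
```

If namespace creation itself fails, the exception is raised before the `try` block is entered, so no delete is attempted for a namespace that never existed.
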