Integration Test Pipeline — Design

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Kubernetes Job: inttest-runner                │
│                                                                 │
│  Stage 1: Create namespace stonks-inttest-{id}                  │
│  Stage 2: Deploy infra (postgres, redis, minio)                 │
│  Stage 3: Run migrations + seed data                            │
│  Stage 4: Deploy services (query-api, registry, risk, trading)  │
│  Stage 5: Wait for readiness                                    │
│  Stage 6: Run API integration tests (pytest)                    │
│  Stage 7: Run frontend render tests (vitest + fetch)            │
│  Stage 8: Collect profiling data                                │
│  Stage 9: Teardown namespace                                    │
└─────────────────────────────────────────────────────────────────┘

Implementation Approach

Option A: Kubernetes Job with Python test runner (chosen)

A single Kubernetes Job that:

  1. Creates ephemeral infra via kubectl apply (postgres, redis, minio manifests)
  2. Runs a Python seed script directly against the DB
  3. Deploys service pods pointing at the ephemeral infra
  4. Runs pytest-based integration tests against the live services
  5. Collects timing metrics
  6. Tears down everything

Why not Helm?

The sandbox doesn't need Helm's complexity. Plain manifests are simpler, faster, and easier to debug. The infra is ephemeral — no upgrades, no rollbacks.

Components

1. Sandbox Manifests (infra/inttest/)

infra/inttest/
├── namespace.yaml          # Namespace template
├── postgres.yaml           # PostgreSQL 16 StatefulSet (no PV)
├── redis.yaml              # Redis 7 Deployment
├── minio.yaml              # MinIO Deployment + bucket init Job
├── services.yaml           # All 4 API services (query-api, registry, risk, trading)
└── runner.yaml             # The test runner Job itself

Each manifest uses a ${NAMESPACE} placeholder, which the runner script substitutes with envsubst at runtime.
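The substitution step can also be sketched in Python with the standard library's `string.Template`, which understands the same `${NAME}` syntax. This is a minimal illustration and not the pipeline's own code (the runner script uses `envsubst`); `render_manifest` and the sample manifest are hypothetical.

```python
from string import Template


def render_manifest(text: str, namespace: str) -> str:
    """Replace ${NAMESPACE} in a manifest; raises KeyError on any other placeholder."""
    return Template(text).substitute(NAMESPACE=namespace)


# Hypothetical manifest fragment, for illustration only.
SAMPLE = """\
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: ${NAMESPACE}
"""

if __name__ == "__main__":
    print(render_manifest(SAMPLE, "stonks-inttest-1700000000"))
```

Unlike `envsubst`, `Template.substitute` fails loudly on an unknown placeholder, which may help catch typos in manifests. A manifest that contains a literal `$` would need it escaped as `$$`.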

2. Seed Script (tests/integration/seed_sandbox.py)

Pure SQL + MinIO operations. No external API calls. Inserts:

  • Companies, sources, aliases, competitor relationships
  • Documents with intelligence records
  • Trend windows with evidence and projections
  • Recommendations with evidence citations
  • Orders, positions, trading decisions
  • Global events with macro impacts
  • AI agents with variants and performance logs
  • Trading engine config, portfolio snapshots
  • MinIO objects (normalized text files)

All UUIDs are deterministic (hardcoded) for reproducible assertions.
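For comparison, one hypothetical alternative to hand-typed constants is to derive each ID with `uuid.uuid5`, which returns the same UUID for the same namespace and name on every run. The sketch below is an assumption for illustration only (the design hardcodes the values); `SEED_NAMESPACE`, `seed_id`, and the ticker list are made up.

```python
import uuid

# Hypothetical fixed namespace; any constant UUID works as long as it never changes.
SEED_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "inttest.stonks-oracle.example")


def seed_id(kind: str, key: str) -> uuid.UUID:
    """Derive a stable UUID from an entity kind and a natural key."""
    return uuid.uuid5(SEED_NAMESPACE, f"{kind}:{key}")


# Illustrative tickers only; the real seed data set is defined by the seed script.
SEED_COMPANY_IDS = {ticker: seed_id("company", ticker) for ticker in ("AAPL", "MSFT")}
```

Either approach gives tests a fixed ID to assert against; derived IDs simply avoid maintaining a long list of literals.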

3. Integration Tests (tests/integration/test_api_endpoints.py)

pytest-based tests that call every API endpoint the frontend depends on:

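import pytest

# Assumption: the seed script exposes its hardcoded IDs as module constants.
from tests.integration.seed_sandbox import SEED_TREND_ID

pytestmark = pytest.mark.asyncio  # async tests; requires the pytest-asyncio plugin


class TestQueryAPI:
    # `client` is assumed to be an httpx.AsyncClient fixture (e.g. from conftest.py)
    async def test_companies_list(self, client):
        resp = await client.get("/api/companies")
        assert resp.status_code == 200
        data = resp.json()
        assert len(data) >= 5
        assert all("ticker" in c for c in data)

    async def test_trend_detail(self, client):
        resp = await client.get(f"/api/trends/{SEED_TREND_ID}")
        assert resp.status_code == 200
        data = resp.json()
        assert data["trend_direction"] in ("bullish", "bearish", "mixed")
        assert 0 <= data["confidence"] <= 1
    # ... 40+ test functions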

4. Frontend Render Tests (tests/integration/test_frontend_renders.py)

Uses the live sandbox APIs (not MSW mocks) to verify each page's data dependencies:

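import pytest

# Assumption: the seed script exposes its hardcoded IDs as module constants.
from tests.integration.seed_sandbox import SEED_COMPANY_IDS

pytestmark = pytest.mark.asyncio  # async tests; requires the pytest-asyncio plugin


class TestFrontendDataDeps:
    """Verify every API call each frontend page makes returns valid data."""

    # `client` is assumed to be an httpx.AsyncClient fixture (e.g. from conftest.py)
    async def test_home_page_deps(self, client):
        # Home page calls: companies, pipeline health, ingestion summary, recommendations
        # (the recommendations endpoint is not shown here)
        for endpoint in ["/api/companies", "/api/ops/pipeline/health", "/api/ops/ingestion/summary"]:
            resp = await client.get(endpoint)
            assert resp.status_code == 200

    async def test_company_detail_deps(self, client):
        # CompanyDetail calls 12 different endpoints (4 shown)
        company_id = SEED_COMPANY_IDS["AAPL"]
        for endpoint in [
            f"/api/companies/{company_id}",
            f"/api/companies/{company_id}/sources",
            "/api/companies/AAPL/macro-impacts",  # keyed by ticker, not company ID
            f"/api/companies/{company_id}/competitors",
        ]:
            resp = await client.get(endpoint)
            assert resp.status_code == 200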

5. Profiler (tests/integration/profiler.py)

Wraps each test with timing:

  • Records wall-clock time per API call
  • Computes P50/P95/P99 across all calls
  • Outputs a summary table at the end
  • Flags any endpoint > 500ms as "slow"
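The bullets above could be implemented roughly as follows. This is a sketch under stated assumptions, not the contents of profiler.py: the `Profiler` class, the nearest-rank percentile method, and the rule that an endpoint is "slow" when its slowest call exceeds 500 ms are all illustrative choices.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

SLOW_THRESHOLD_MS = 500.0  # threshold taken from the design; how it is applied is assumed


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("no samples recorded")
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


class Profiler:
    def __init__(self):
        self.samples = defaultdict(list)  # endpoint -> wall-clock durations in ms

    @contextmanager
    def timed(self, endpoint):
        """Record the wall-clock time of the enclosed block, even if it raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.samples[endpoint].append((time.perf_counter() - start) * 1000.0)

    def summary(self):
        """P50/P95/P99 across all calls, plus the endpoints flagged as slow."""
        all_calls = [ms for calls in self.samples.values() for ms in calls]
        return {
            "p50": percentile(all_calls, 50),
            "p95": percentile(all_calls, 95),
            "p99": percentile(all_calls, 99),
            "slow": sorted(
                ep for ep, calls in self.samples.items() if max(calls) > SLOW_THRESHOLD_MS
            ),
        }
```

A test would wrap each request, for example `with profiler.timed("/api/companies"): await client.get("/api/companies")`, and print `profiler.summary()` at the end of the session.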

6. Runner Script (tests/integration/run_pipeline.sh)

Orchestrates the full pipeline:

#!/bin/bash
set -euo pipefail

# Exported so that envsubst can see it when rendering the manifests.
export NAMESPACE="stonks-inttest-$(date +%s)"
PROFILING_OUTPUT="inttest-results-${NAMESPACE}.json"

# Stage 7 (registered first): always tear down, even when a stage fails.
trap 'kubectl delete namespace "$NAMESPACE" --wait=false' EXIT

# Stage 1: Create namespace
kubectl create namespace "$NAMESPACE"

# Stage 2: Deploy infra
envsubst < infra/inttest/postgres.yaml | kubectl apply -n "$NAMESPACE" -f -
envsubst < infra/inttest/redis.yaml | kubectl apply -n "$NAMESPACE" -f -
envsubst < infra/inttest/minio.yaml | kubectl apply -n "$NAMESPACE" -f -
kubectl wait --for=condition=ready pod -l app=postgres -n "$NAMESPACE" --timeout=120s
kubectl wait --for=condition=ready pod -l app=redis -n "$NAMESPACE" --timeout=60s
kubectl wait --for=condition=ready pod -l app=minio -n "$NAMESPACE" --timeout=60s

# Stage 3: Run migrations + seed
# kubectl run with --restart=Never creates a bare Pod, not a Job, so wait on the Pod phase.
kubectl run seed-runner --image=ghcr.io/celesrenata/stonks-oracle/query-api:latest \
  -n "$NAMESPACE" --restart=Never --env="POSTGRES_HOST=postgres" ... \
  -- python -c "import asyncio; from tests.integration.seed_sandbox import seed; asyncio.run(seed())"
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/seed-runner -n "$NAMESPACE" --timeout=120s

# Stage 4: Deploy services
envsubst < infra/inttest/services.yaml | kubectl apply -n "$NAMESPACE" -f -
kubectl wait --for=condition=ready pod -l tier=api -n "$NAMESPACE" --timeout=120s

# Stage 5: Run integration tests
# --attach blocks until the container exits and, with --restart=Never, returns its exit code.
TEST_STATUS=0
kubectl run test-runner --image=ghcr.io/celesrenata/stonks-oracle/query-api:latest \
  -n "$NAMESPACE" --restart=Never --attach \
  -- python -m pytest tests/integration/ -v --tb=short || TEST_STATUS=$?

# Stage 6: Collect results (also on test failure)
kubectl logs pod/test-runner -n "$NAMESPACE" > "$PROFILING_OUTPUT"

# Stage 7: Teardown runs in the EXIT trap; propagate the test result.
exit "$TEST_STATUS"

Profiling Strategy

What to measure

  1. Seed insertion time: how long to populate all tables
  2. Service startup time: time from pod creation to readiness
  3. API response times: per-endpoint P50/P95/P99
  4. Memory usage: kubectl top pods snapshot during tests
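Service startup time, for instance, can be measured by polling a readiness check and recording how long it takes to pass. The helper below is a hypothetical sketch; `time_until_ready` and its parameters are not part of the design, and the readiness check itself (an HTTP health probe, a `kubectl` call) is left to the caller.

```python
import time


def time_until_ready(is_ready, timeout_s=120.0, interval_s=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True.

    Returns the elapsed seconds, or None if timeout_s passes first.
    clock and sleep are injectable so the helper can be tested without waiting.
    """
    start = clock()
    while True:
        if is_ready():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)
```

Using a monotonic clock keeps the measurement unaffected by wall-clock adjustments on the node running the tests.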

Performance targets

| Metric | Target | Action if exceeded |
|---|---|---|
| Seed insertion | < 30s | Batch INSERT optimization |
| Service startup | < 30s each | Reduce import time, lazy loading |
| API P95 | < 200ms | Query optimization, indexes |
| API P99 | < 500ms | Connection pooling, caching |
| Total pipeline | < 10 min | Parallelize stages |
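The performance targets listed above lend themselves to an automated check at the end of a run. The following is a minimal sketch; the metric key names and the `exceeded_targets` function are assumptions, and only the numeric limits come from this document.

```python
# Limits from the performance targets; the key names are hypothetical.
TARGETS = {
    "seed_insertion_s": 30.0,
    "service_startup_s": 30.0,
    "api_p95_ms": 200.0,
    "api_p99_ms": 500.0,
    "total_pipeline_s": 600.0,  # 10 minutes
}


def exceeded_targets(measured):
    """Return {metric: (measured, limit)} for every metric not strictly below its limit."""
    return {
        name: (value, TARGETS[name])
        for name, value in measured.items()
        if name in TARGETS and value >= TARGETS[name]
    }
```

Because each target is written as a strict "less than", a value equal to the limit is reported as exceeded.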

Optimization opportunities to discover

  • Slow SQL queries (missing indexes, N+1 patterns)
  • Heavy service startup (import chains)
  • Inefficient aggregation math
  • Unnecessary serialization overhead
  • Connection pool sizing

Data Flow

Seed Script
  ├── PostgreSQL: companies, documents, trends, recommendations, orders, ...
  ├── MinIO: normalized text files, audit artifacts
  └── Redis: (empty — no queue state needed for API tests)

Integration Tests
  ├── Query API ← PostgreSQL (read-only queries)
  ├── Symbol Registry ← PostgreSQL (CRUD operations)
  ├── Risk Engine ← PostgreSQL (evaluation + approvals)
  └── Trading Engine ← PostgreSQL + Redis (status, decisions, backtest)

Namespace Lifecycle

CREATE namespace
  → Deploy postgres, redis, minio
    → Wait for healthy
      → Run migrations (init container)
        → Run seed script
          → Deploy services
            → Wait for ready
              → Run tests
                → Collect results
                  → DELETE namespace (always, even on failure)
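The "always, even on failure" guarantee maps naturally onto a `try`/`finally` block. Below is a minimal Python sketch of that idea; `ephemeral_namespace` and the `kubectl` wrapper are hypothetical helpers, not part of the runner script.

```python
import subprocess
from contextlib import contextmanager


def kubectl(*args):
    """Run kubectl and raise CalledProcessError on a non-zero exit."""
    subprocess.run(["kubectl", *args], check=True)


@contextmanager
def ephemeral_namespace(name, run=kubectl):
    """Create a namespace, yield its name, and delete it no matter how the block exits."""
    run("create", "namespace", name)
    try:
        yield name
    finally:
        # Reached on success, on a failed stage, and on KeyboardInterrupt alike.
        run("delete", "namespace", name, "--wait=false")
```

The stages between creation and deletion would run inside `with ephemeral_namespace("stonks-inttest-..."):`. The `run` parameter exists only so the lifecycle can be exercised without a cluster.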