Integration Test Pipeline — Requirements
Overview
An end-to-end integration test pipeline that runs in Kubernetes. It spins up isolated infrastructure (PostgreSQL, Redis, MinIO), seeds realistic data, deploys all application services, and validates every frontend page's data dependencies against live API responses. It also profiles each run to support performance optimization.
Functional Requirements
FR-1: Integration Test Stages
This spec covers the integration test foundation — sandbox infra, seed data, test suites, profiling, and a standalone runner script. A separate CI/CD pipeline spec will consume this foundation to provide build, staged promotion (beta → paper → live), market-hours gating, and break-glass deployment.
Stages owned by this spec:
- Deploy Sandbox — ephemeral namespace with own PostgreSQL, Redis, MinIO (no Ollama — too heavy for CI)
- Seed Data — populate DB and S3 with enough data to exercise every frontend component
- Integration Tests — HTTP-level validation of every API endpoint the frontend depends on
- Frontend Data Deps — verify every page's API dependencies return valid data
- Profiling — measure and report timing for each stage and each API endpoint
- Teardown — delete the ephemeral namespace and all resources
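The stage ordering above can be sketched as a small runner that times each stage, stops at the first failure, and still runs teardown. This is a minimal illustration, not the actual runner: `run_stages`, the stage names, and the callables are hypothetical placeholders.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# A stage is a (name, callable) pair. Names and callables are hypothetical.
Stage = Tuple[str, Callable[[], None]]


def run_stages(
    stages: List[Stage], teardown: Callable[[], None]
) -> Tuple[Dict[str, float], Optional[str]]:
    """Run stages in order, stop at the first failure, always run teardown.

    Returns per-stage wall-clock seconds and the name of the failed stage
    (None if every stage succeeded).
    """
    timings: Dict[str, float] = {}
    failed_stage: Optional[str] = None
    try:
        for name, fn in stages:
            start = time.perf_counter()
            try:
                fn()
            except Exception:
                failed_stage = name
                break
            finally:
                # Record the duration whether the stage passed or failed.
                timings[name] = time.perf_counter() - start
    finally:
        # Teardown runs on success and on failure alike.
        start = time.perf_counter()
        teardown()
        timings["teardown"] = time.perf_counter() - start
    return timings, failed_stage
```

Returning the failed stage name rather than re-raising keeps the timing data available for the final report even when a run fails.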
Stages deferred to the CI/CD pipeline spec:
- Lint, unit tests, Docker image builds (self-hosted on gremlin nodes)
- Staged promotion: beta → paper → live namespaces
- Market-hours promotion blockers (no deploys during 9:30–16:00 ET unless break-glass)
- Break-glass emergency production deploy
- Per-stage enable/disable toggles
FR-2: Sandbox Infrastructure
- PostgreSQL 16 (ephemeral, no persistent volume)
- Redis 7 (ephemeral)
- MinIO (ephemeral, with bucket initialization)
- All application services (query-api, symbol-registry, risk, trading-engine) running against sandbox infra
- No Ollama — LLM-dependent services (extractor, recommendation thesis rewriter) are excluded from integration tests
- No Trino/Hive/Superset — analytical stack excluded (not needed for frontend validation)
FR-3: Seed Data Coverage
The seed data must exercise every frontend page. Minimum:
- 5 companies with sources, aliases, competitor relationships
- 10 documents (mix of news, filings, macro_event) with intelligence extraction records
- 5 trend windows with projections and evidence
- 5 recommendations with evidence citations
- 3 orders (filled, pending, cancelled) with events and audit trail
- 2 positions with P&L
- 2 global events with macro impact records across multiple companies
- 2 competitive signal records
- 2 historical pattern records
- 1 trading engine config + 1 trading decision
- 1 portfolio snapshot
- 3 AI agents with 1 variant each + performance log entries
- 1 risk config with macro_enabled and competitive_enabled
- MinIO buckets with at least 1 object in stonks-normalized
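One way to enforce these minimums is a post-seed check that compares row counts against the list above. The sketch below assumes the counts have already been queried into a dictionary. The table names are hypothetical and may not match the real schema. Nested requirements (sources, aliases, variants, the MinIO object) would need their own checks.

```python
from typing import Dict, List

# Minimum row counts taken from the FR-3 list. Keys are hypothetical table
# names chosen for illustration only.
SEED_MINIMUMS: Dict[str, int] = {
    "companies": 5,
    "documents": 10,
    "trend_windows": 5,
    "recommendations": 5,
    "orders": 3,
    "positions": 2,
    "global_events": 2,
    "competitive_signals": 2,
    "historical_patterns": 2,
    "trading_engine_configs": 1,
    "trading_decisions": 1,
    "portfolio_snapshots": 1,
    "ai_agents": 3,
    "risk_configs": 1,
}


def seed_shortfalls(counts: Dict[str, int]) -> List[str]:
    """Return one message per table that has fewer rows than required."""
    return [
        f"{table}: expected >= {minimum}, found {counts.get(table, 0)}"
        for table, minimum in SEED_MINIMUMS.items()
        if counts.get(table, 0) < minimum
    ]
```

An empty return value means the seed step met every minimum listed here.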
FR-4: API Endpoint Validation
Test every endpoint the frontend calls:
- Query API (17 endpoints): companies, documents, trends, recommendations, orders, positions, macro events, pipeline health, ingestion summary, coverage gaps, agents, variants
- Symbol Registry (8 endpoints): companies CRUD, sources, aliases, competitors, exposure profiles
- Risk Engine (4 endpoints): evaluate, approvals pending/review, health
- Trading Engine (12 endpoints): status, config, decisions, metrics, backtest, notifications, override
FR-5: Frontend Page Validation
For each of the 17 frontend pages, verify:
- The page renders without JavaScript errors
- All API calls return 200 with non-empty data
- Key data fields are present (e.g., company has ticker, trend has direction)
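The API-side checks above (status 200, non-empty data, key fields present) could be expressed as a single helper applied to each response. This is a sketch under assumptions: `validate_response` is a hypothetical name, and the body is assumed to be already-decoded JSON (a dict or a list of dicts). The JavaScript-error check needs a browser and is not covered here.

```python
from typing import Any, Iterable, List


def validate_response(
    status: int, body: Any, required_fields: Iterable[str]
) -> List[str]:
    """Return a list of problems; an empty list means the response passed."""
    fields = list(required_fields)
    if status != 200:
        return [f"expected HTTP 200, got {status}"]
    if not body:
        # Covers None, an empty list, and an empty object.
        return ["response body is empty"]
    records = body if isinstance(body, list) else [body]
    errors: List[str] = []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            errors.append(f"record {index}: expected an object")
            continue
        for field in fields:
            # Treat absent, null, and empty-string values as missing.
            if record.get(field) in (None, ""):
                errors.append(f"record {index}: missing field '{field}'")
    return errors
```

For example, a companies response would be checked with `required_fields=["ticker"]` and a trends response with `["direction"]`, matching the examples in the list.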
FR-6: Profiling & Reporting
- Wall-clock time for each pipeline stage
- P50/P95/P99 response times for each API endpoint
- Total seed data insertion time
- Memory usage of each service pod
- Final pass/fail summary with details on any failures
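The P50/P95/P99 figures can be computed from the raw per-request latencies collected for each endpoint. The sketch below uses the nearest-rank method, which is one common percentile definition. The spec does not say which definition to use, so treat that choice as an assumption.

```python
from typing import Dict, List


def percentile(samples: List[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def latency_summary(samples_ms: List[float]) -> Dict[str, float]:
    """Summarize one endpoint's latencies as P50, P95, and P99."""
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
    }
```

With only a handful of requests per endpoint, P95 and P99 collapse to the slowest sample, so the number of requests per endpoint determines how meaningful the tail figures are.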
Non-Functional Requirements
NFR-1: Isolation
Each pipeline run uses a unique Kubernetes namespace (stonks-inttest-{run-id}) that is fully cleaned up on completion (success or failure).
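Kubernetes namespace names must be valid DNS-1123 labels: lowercase alphanumerics and hyphens, at most 63 characters, starting and ending with an alphanumeric. A run ID taken from a CI system may not satisfy that, so a sketch of deriving the namespace name might look like the following. The helper name `namespace_for_run` is hypothetical.

```python
import re

NAMESPACE_PREFIX = "stonks-inttest-"
MAX_LABEL_LENGTH = 63  # DNS-1123 label limit for namespace names


def namespace_for_run(run_id: str) -> str:
    """Build a valid namespace name of the form stonks-inttest-{run-id}."""
    # Lowercase, then collapse every run of invalid characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", run_id.lower()).strip("-")
    if not slug:
        raise ValueError("run_id contains no usable characters")
    # Truncate to the label limit and make sure the name ends alphanumerically.
    return (NAMESPACE_PREFIX + slug)[:MAX_LABEL_LENGTH].rstrip("-")
```

Truncation can make two long run IDs collide, so short run IDs (for example a build number) are the safer input.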
NFR-2: Speed
Targets: the full pipeline completes in under 10 minutes, seed data insertion in under 30 seconds, and API validation in under 60 seconds.
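These targets could be checked automatically against the measured timings at the end of a run. In the sketch below the budget keys (`total`, `seed_data`, `api_validation`) are hypothetical names. The limits are the ones stated above, converted to seconds.

```python
from typing import Dict

# Limits from NFR-2, in seconds. Keys are hypothetical timing names.
BUDGET_SECONDS: Dict[str, float] = {
    "total": 600.0,
    "seed_data": 30.0,
    "api_validation": 60.0,
}


def over_budget(timings: Dict[str, float]) -> Dict[str, float]:
    """Return the measured timings that are not strictly under their budget."""
    return {
        name: timings[name]
        for name, limit in BUDGET_SECONDS.items()
        if name in timings and timings[name] >= limit
    }
```

Whether a budget miss should fail the run or only be reported is a policy decision this spec leaves open.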
NFR-3: Reproducibility
Seed data is deterministic (fixed UUIDs, timestamps). No external API calls (Polygon, Alpaca). All data is synthetic.
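Fixed UUIDs and timestamps can be derived rather than hard-coded one by one. The sketch below uses name-based UUIDs (`uuid.uuid5`) and offsets from a fixed base time, so the same inputs yield the same values on every run. The namespace string, the base date, and the helper names are assumptions made for illustration.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Hypothetical namespace and base time; any fixed values work as long as
# they never change between runs.
SEED_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "inttest.seed.example")
SEED_BASE_TIME = datetime(2024, 1, 1, tzinfo=timezone.utc)


def seed_id(kind: str, key: str) -> uuid.UUID:
    """Deterministic UUID for a seed record, e.g. seed_id("company", "ACME")."""
    return uuid.uuid5(SEED_NAMESPACE, f"{kind}:{key}")


def seed_time(offset_minutes: int) -> datetime:
    """Deterministic, timezone-aware timestamp relative to the fixed base."""
    return SEED_BASE_TIME + timedelta(minutes=offset_minutes)
```

Deriving IDs from a natural key also makes cross-references easy: a recommendation row can compute its company's ID without looking it up.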
NFR-4: Pipeline Integration Contract
The runner script is a standalone tool that can be invoked by any CI/CD system. It exposes:
- CLI interface: bash infra/inttest/run_pipeline.sh [--image-tag TAG] [--namespace NAME] [--skip-teardown]
- Exit codes: 0 = all tests passed, 1 = test failures, 2 = infra setup failure
- JSON result file: inttest-results.json with test counts, pass/fail, per-endpoint latency, stage timings
- stdout/stderr: human-readable progress and summary
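The exit-code and result-file parts of this contract can be sketched as follows. The exit codes are the ones stated above. The JSON field names are assumptions, because the spec lists what the file contains but not its schema.

```python
import json
from typing import Any, Dict

EXIT_OK = 0             # all tests passed
EXIT_TEST_FAILURES = 1  # at least one test failed
EXIT_INFRA_FAILURE = 2  # sandbox setup failed before tests could run


def exit_code(infra_ok: bool, failed: int) -> int:
    """Map a run's outcome onto the contract's exit codes."""
    if not infra_ok:
        return EXIT_INFRA_FAILURE
    return EXIT_TEST_FAILURES if failed > 0 else EXIT_OK


def build_results(
    passed: int,
    failed: int,
    endpoint_latency_ms: Dict[str, Dict[str, float]],
    stage_seconds: Dict[str, float],
) -> Dict[str, Any]:
    """Assemble the result document. Field names here are hypothetical."""
    return {
        "tests": {"total": passed + failed, "passed": passed, "failed": failed},
        "passed": failed == 0,
        "endpoint_latency_ms": endpoint_latency_ms,
        "stage_seconds": stage_seconds,
    }


def write_results(path: str, results: Dict[str, Any]) -> None:
    """Write the result document as stable, human-diffable JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(results, handle, indent=2, sort_keys=True)
```

Checking the infra flag first means a setup failure reports 2 even if no tests ran, which lets a calling pipeline distinguish a broken sandbox from a broken build.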
A future CI/CD pipeline spec will invoke this script as a stage, passing in the image tag from a self-hosted build step. That spec will handle:
- Self-hosted build runners on gremlin nodes (no GitHub Actions compute)
- Staged promotion (beta → paper → live) with per-stage enable/disable
- Market-hours promotion blockers (9:30–16:00 ET)
- Break-glass emergency deploy to production