Celes Renata c85c0068a2 fix: clean up utcnow deprecation warnings, fix 12 failing tests, add CI/CD pipeline manifests

- Replace all datetime.utcnow() with datetime.now(tz=timezone.utc) across 8 files
- Fix 12 failing tests to match current implementation behavior
- Fix pytest_plugins in non-top-level conftest (moved to root conftest.py)
- Auto-fix 189 lint issues (import sorting, unused imports)
- Add CI/CD pipeline infrastructure (ARC, ArgoCD, Kargo manifests)
- Add values-beta.yaml and values-paper.yaml for staged deployments
- Update GitHub Actions workflow to use self-hosted-gremlin runners
- Add integration-test job to CI pipeline

Result: 1596 passed, 0 failed, 0 warnings

2026-04-18 03:59:28 +00:00

Integration Test Pipeline — Requirements

Overview

This spec defines an end-to-end integration test pipeline that runs in Kubernetes. The pipeline spins up isolated infrastructure (PostgreSQL, Redis, MinIO), seeds realistic synthetic data, deploys the application services under test, and validates every frontend page's data dependencies against live API responses. It also profiles each run to support performance optimization.

Functional Requirements

FR-1: Integration Test Stages

This spec covers the integration test foundation — sandbox infra, seed data, test suites, profiling, and a standalone runner script. A separate CI/CD pipeline spec will consume this foundation to provide build, staged promotion (beta → paper → live), market-hours gating, and break-glass deployment.

Stages owned by this spec:

Deploy Sandbox: ephemeral namespace with its own PostgreSQL, Redis, and MinIO (no Ollama; too heavy for CI)
Seed Data: populate the DB and S3 with enough data to exercise every frontend component
Integration Tests: HTTP-level validation of every API endpoint the frontend depends on
Frontend Data Deps: verify that every page's API dependencies return valid data
Profiling: measure and report timing for each stage and each API endpoint
Teardown: delete the ephemeral namespace and all of its resources
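
The stage ordering above, together with the requirement that the namespace is cleaned up on success or failure, can be sketched as a small runner. This is a minimal illustration, not the actual runner script: `run_stages` and its arguments are hypothetical names, stages are plain callables, and the `--skip-teardown` option is not modelled.

```python
import time
from typing import Callable

Stage = tuple[str, Callable[[], None]]


def run_stages(stages: list[Stage],
               teardown: Callable[[], None]) -> tuple[bool, dict[str, float]]:
    """Run stages in order, stop at the first failure, always run teardown.

    Returns (all_passed, wall-clock seconds per stage that actually ran).
    """
    timings: dict[str, float] = {}
    ok = True
    try:
        for name, stage in stages:
            start = time.perf_counter()
            try:
                stage()
            except Exception as exc:  # any stage error fails the run
                ok = False
                print(f"stage {name!r} failed: {exc}")
                break
            finally:
                # Record timing for failed stages too, for the profiling report.
                timings[name] = time.perf_counter() - start
    finally:
        # Teardown runs whether or not an earlier stage failed.
        start = time.perf_counter()
        teardown()
        timings["teardown"] = time.perf_counter() - start
    return ok, timings
```

Later stages are skipped after a failure because they depend on the earlier ones (tests cannot run without seed data), but teardown sits in the outer `finally` so the namespace is still removed.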

Stages deferred to the CI/CD pipeline spec:

Lint, unit tests, Docker image builds (self-hosted on gremlin nodes)
Staged promotion: beta → paper → live namespaces
Market-hours promotion blockers (no deploys during 9:30–16:00 ET unless break-glass)
Break-glass emergency production deploy
Per-stage enable/disable toggles

FR-2: Sandbox Infrastructure

PostgreSQL 16 (ephemeral, no persistent volume)
Redis 7 (ephemeral)
MinIO (ephemeral, with bucket initialization)
All in-scope application services (query-api, symbol-registry, risk, trading-engine) running against sandbox infra
No Ollama — LLM-dependent services (extractor, recommendation thesis rewriter) are excluded from integration tests
No Trino/Hive/Superset — analytical stack excluded (not needed for frontend validation)

FR-3: Seed Data Coverage

The seed data must exercise every frontend page. Minimum:

5 companies with sources, aliases, competitor relationships
10 documents (mix of news, filings, macro_event) with intelligence extraction records
5 trend windows with projections and evidence
5 recommendations with evidence citations
3 orders (filled, pending, cancelled) with events and audit trail
2 positions with P&L
2 global events with macro impact records across multiple companies
2 competitive signal records
2 historical pattern records
1 trading engine config + 1 trading decision
1 portfolio snapshot
3 AI agents with 1 variant each + performance log entries
1 risk config with macro_enabled and competitive_enabled
MinIO buckets with at least 1 object in stonks-normalized
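
One way to keep the seed script honest against these minimums is a post-seed count check. The sketch below is an assumption about how that could look: the dictionary keys are hypothetical labels (not actual table names), only a subset of the minimums is shown, and the actual counts would come from whatever queries the seed step runs.

```python
# Minimum row counts taken from the list above; key names are illustrative only.
MIN_SEED_COUNTS = {
    "companies": 5,
    "documents": 10,
    "trend_windows": 5,
    "recommendations": 5,
    "orders": 3,
    "positions": 2,
    "global_events": 2,
    "competitive_signals": 2,
    "historical_patterns": 2,
    "ai_agents": 3,
}


def seed_shortfalls(actual: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {name: (actual, required)} for every entity below its minimum.

    Entities missing from `actual` are treated as having zero rows.
    """
    return {
        name: (actual.get(name, 0), required)
        for name, required in MIN_SEED_COUNTS.items()
        if actual.get(name, 0) < required
    }
```

An empty result means every listed minimum is met; anything else can be reported as a Seed Data stage failure before the tests run.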

FR-4: API Endpoint Validation

Test every endpoint the frontend calls:

Query API (17 endpoints): companies, documents, trends, recommendations, orders, positions, macro events, pipeline health, ingestion summary, coverage gaps, agents, variants
Symbol Registry (8 endpoints): companies CRUD, sources, aliases, competitors, exposure profiles
Risk Engine (4 endpoints): evaluate, approvals pending/review, health
Trading Engine (12 endpoints): status, config, decisions, metrics, backtest, notifications, override
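
A possible shape for the per-service endpoint sweep is sketched below. The paths in `EXAMPLE_ENDPOINTS` are placeholders invented for illustration (this spec names endpoint groups, not routes), and `fetch` is a hypothetical callable that performs the HTTP GET and returns the status code, so the loop can be exercised without a live cluster.

```python
from typing import Callable

# Placeholder routes for illustration; the real route list is not given in this spec.
EXAMPLE_ENDPOINTS: dict[str, list[str]] = {
    "query-api": ["/companies", "/documents"],
    "risk": ["/health"],
}


def validate_endpoints(fetch: Callable[[str, str], int],
                       endpoints: dict[str, list[str]]) -> list[str]:
    """Call every (service, path) pair and collect those not returning HTTP 200."""
    failures: list[str] = []
    for service, paths in endpoints.items():
        for path in paths:
            status = fetch(service, path)
            if status != 200:
                failures.append(f"{service}{path}: HTTP {status}")
    return failures
```

Collecting all failures instead of stopping at the first one gives the final summary a complete picture of which endpoints are broken.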

FR-5: Frontend Page Validation

For each of the 17 frontend pages, verify:

The page renders without JavaScript errors
All API calls return 200 with non-empty data
Key data fields are present (e.g., company has ticker, trend has direction)
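
The second and third checks can be expressed as a pure function over an already-decoded response. This sketch assumes each API call yields a list of JSON objects, which may not hold for every endpoint; `check_payload` and its parameters are hypothetical names.

```python
def check_payload(status: int, items: list[dict],
                  required: tuple[str, ...]) -> list[str]:
    """Return a list of problems; an empty list means the response is acceptable."""
    problems: list[str] = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    if not items:
        problems.append("response contained no data")
    for index, item in enumerate(items):
        # A field counts as missing when absent, null, or an empty string.
        missing = [field for field in required if item.get(field) in (None, "")]
        if missing:
            problems.append(f"item {index} missing {missing}")
    return problems
```

For example, a companies response would be checked with `required=("ticker",)` and a trends response with `required=("direction",)`, matching the key fields named above.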

FR-6: Profiling & Reporting

Wall-clock time for each pipeline stage
P50/P95/P99 response times for each API endpoint
Total seed data insertion time
Memory usage of each service pod
Final pass/fail summary with details on any failures
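
For the per-endpoint latency figures, one simple option is the nearest-rank percentile over the collected samples. The spec does not say which percentile definition to use, so this is an assumption; interpolating definitions will give slightly different numbers on small sample sets.

```python
def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of data at or below it."""
    if not samples:
        raise ValueError("no samples collected")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_summary(samples_ms: list[float]) -> dict[str, float]:
    """P50/P95/P99 for one endpoint, in the same unit as the input samples."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```

With only a handful of requests per endpoint, P95 and P99 both collapse to the slowest sample, so the number of samples per endpoint is worth recording alongside the percentiles.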

Non-Functional Requirements

NFR-1: Isolation

Each pipeline run uses a unique Kubernetes namespace (stonks-inttest-{run-id}) that is fully cleaned up on completion (success or failure).
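
Kubernetes namespace names must be valid DNS labels: at most 63 characters, lowercase alphanumerics or `-`, starting and ending with an alphanumeric. A run id taken from a CI system may not satisfy that, so a sketch of deriving the name could look like the following. The sanitising rules here are an assumption, not part of the spec.

```python
import re

_DNS_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")
_PREFIX = "stonks-inttest-"


def inttest_namespace(run_id: str) -> str:
    """Build stonks-inttest-{run-id} as a valid Kubernetes namespace name."""
    # Lowercase and collapse any run of disallowed characters into one hyphen.
    slug = re.sub(r"[^a-z0-9-]+", "-", run_id.lower()).strip("-")
    if not slug:
        raise ValueError(f"run id {run_id!r} has no usable characters")
    # Enforce the 63-character label limit without leaving a trailing hyphen.
    name = (_PREFIX + slug)[:63].rstrip("-")
    if not _DNS_LABEL.match(name):
        raise ValueError(f"cannot derive a valid namespace from {run_id!r}")
    return name
```

Truncation can make two long run ids collide, so short ids (a build number or a short commit hash) are the safer input.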

NFR-2: Speed

Targets: the full pipeline completes in under 10 minutes, seed data insertion in under 30 seconds, and API validation in under 60 seconds.
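
These targets can be checked mechanically against measured timings. In the sketch below the budget keys are hypothetical labels, and whether a breach fails the run or only produces a warning is left open, since the spec describes them as targets.

```python
# Budgets in seconds, from the targets above; key names are illustrative.
BUDGETS_SECONDS = {
    "pipeline_total": 600.0,
    "seed_data": 30.0,
    "api_validation": 60.0,
}


def over_budget(timings: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Return {name: (measured, budget)} for each measurement not under its budget."""
    return {
        name: (timings[name], budget)
        for name, budget in BUDGETS_SECONDS.items()
        if name in timings and timings[name] >= budget
    }
```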

NFR-3: Reproducibility

Seed data is deterministic (fixed UUIDs and timestamps). The pipeline makes no external API calls (e.g., Polygon, Alpaca); all data is synthetic.
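
One way to get fixed UUIDs and timestamps without hard-coding each value is to derive them from stable inputs. The sketch below uses name-based UUIDs (`uuid.uuid5`) and offsets from a fixed epoch; the namespace string and the epoch date are arbitrary choices made for illustration, not values from the project.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Arbitrary, illustrative constants; any fixed values would work.
SEED_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "inttest.seed.example")
SEED_EPOCH = datetime(2026, 1, 1, tzinfo=timezone.utc)


def seed_id(kind: str, key: str) -> uuid.UUID:
    """Same (kind, key) always yields the same UUID, on any machine."""
    return uuid.uuid5(SEED_NAMESPACE, f"{kind}:{key}")


def seed_time(offset_minutes: int) -> datetime:
    """Timestamps are offsets from a fixed epoch, never the current time."""
    return SEED_EPOCH + timedelta(minutes=offset_minutes)
```

Because nothing depends on the wall clock or a random source, two runs insert byte-identical rows, which keeps test assertions on ids and ordering stable.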

NFR-4: Pipeline Integration Contract

The runner script is a standalone tool that can be invoked by any CI/CD system. It exposes:

CLI interface: bash infra/inttest/run_pipeline.sh [--image-tag TAG] [--namespace NAME] [--skip-teardown]
Exit codes: 0 = all tests passed, 1 = test failures, 2 = infra setup failure
JSON result file: inttest-results.json with test counts, pass/fail, per-endpoint latency, stage timings
stdout/stderr: human-readable progress and summary
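
From the caller's side, the contract above amounts to building the command line and mapping the exit code back to an outcome. The helper names below are hypothetical; the script path, flags, and exit codes are the ones listed above.

```python
from typing import Optional

RUNNER = "infra/inttest/run_pipeline.sh"
EXIT_MEANINGS = {
    0: "all tests passed",
    1: "test failures",
    2: "infra setup failure",
}


def build_command(image_tag: Optional[str] = None,
                  namespace: Optional[str] = None,
                  skip_teardown: bool = False) -> list[str]:
    """Assemble the runner invocation; every flag is optional."""
    cmd = ["bash", RUNNER]
    if image_tag:
        cmd += ["--image-tag", image_tag]
    if namespace:
        cmd += ["--namespace", namespace]
    if skip_teardown:
        cmd.append("--skip-teardown")
    return cmd


def describe_exit(code: int) -> str:
    """Translate the runner's exit code into the documented outcome."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")
```

A calling pipeline could pass the result of `build_command(...)` to `subprocess.run` and feed the resulting `returncode` to `describe_exit`, treating code 2 as an infrastructure problem to retry and code 1 as a genuine test failure.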

A future CI/CD pipeline spec will invoke this script as a stage, passing in the image tag from a self-hosted build step. That spec will handle:

Self-hosted build runners on gremlin nodes (no GitHub Actions compute)
Staged promotion (beta → paper → live) with per-stage enable/disable
Market-hours promotion blockers (9:30–16:00 ET)
Break-glass emergency deploy to production
