feat: trading feedback engine — periodic performance reports with AI summarization

- Migration 038: trading_reports table + report-summarizer agent seed
- 6 reporting modules: models, collector, sections, validator, summarizer, generator
- API endpoints: GET /api/reports (paginated, filterable), GET /api/reports/{id}
- Frontend hooks: useReports, useReport with TanStack Query
- Scheduler: daily (after 16:30 ET) and weekly (Saturday) report triggers
- Redis queue consumer for async report generation with retry/dedup
- 5 property-based tests (chunking, serialization, validation, accuracy, deltas)
- 109 unit/integration tests across all modules
- 6 frontend hook tests with MSW mocks
2026-05-01 22:13:09 +00:00
parent 376fcb4bb4
commit bc077bfcc8
28 changed files with 6771 additions and 1 deletions
@@ -0,0 +1,117 @@
+# Requirements Document
+
+## Introduction
+
+The Trading Feedback Engine generates periodic performance reports from the Stonks Oracle trading system. Reports cover trading P&L, recommendation accuracy, position performance, risk metrics, and model quality trends. An AI agent (registered in the `ai_agents` table) summarizes sections of the report by processing data in small chunks that fit within the 8k-token context window. Reports are validated against live data from the prediction outcomes and model metric snapshots tables, stored in the database for retrieval, and exposed via API endpoints.
+
+## Glossary
+
+- **Feedback_Engine**: The backend service that orchestrates report generation, data collection, AI summarization, and report storage.
+- **Report_Summarizer_Agent**: The AI agent registered in the `ai_agents` table that generates natural-language summaries for report sections. Uses the existing `AgentConfigResolver` and `llm_factory` infrastructure.
+- **Report**: A structured JSON document containing trading performance metrics, AI-generated summaries, and validation data for a specific period (daily or weekly).
+- **Report_Section**: A self-contained portion of a report (e.g., P&L summary, recommendation accuracy, position performance) that can be independently generated and summarized.
+- **Chunk**: A subset of data rows small enough to fit within the 8k-token context window when serialized, allowing the Report_Summarizer_Agent to process it in a single LLM call.
+- **Portfolio_Snapshot**: A daily record in the `portfolio_snapshots` table containing portfolio value, pool balances, returns, win/loss counts, Sharpe ratio, max drawdown, and risk tier.
+- **Prediction_Outcome**: A record in the `prediction_outcomes` table containing realized returns, direction correctness, and excess returns vs benchmarks for a prediction at a specific horizon.
+- **Model_Metric_Snapshot**: A record in the `model_metric_snapshots` table containing aggregate model quality metrics (win rate, IC, ECE, Brier score) for a lookback/horizon combination.
+- **Trading_Decision**: A record in the `trading_decisions` table capturing the act/skip decision, skip reason, position sizing, risk tier, circuit breaker status, and decision trace for a recommendation evaluation.
+- **Validation_Data**: Live data from `prediction_outcomes`, `model_metric_snapshots`, and `signal_evidence_links` used to cross-check report claims against actual measured performance.
+- **Query_API**: The existing FastAPI service (`services/api/app.py`) that serves HTTP endpoints for the dashboard and external consumers.
+
+## Requirements
+
+### Requirement 1: Report Data Collection
+
+**User Story:** As a trader, I want the feedback engine to collect all relevant trading data for a reporting period, so that reports reflect the complete picture of trading activity.
+
+#### Acceptance Criteria
+
+1. WHEN a report generation is triggered for a date range, THE Feedback_Engine SHALL query trading_decisions, orders, positions, portfolio_snapshots, recommendations, prediction_outcomes, and model_metric_snapshots for that period.
+2. WHEN collecting trading decision data, THE Feedback_Engine SHALL include the decision type, skip reason, ticker, computed position size, risk tier, circuit breaker status, and correlation check result for each Trading_Decision.
+3. WHEN collecting portfolio data, THE Feedback_Engine SHALL retrieve the most recent Portfolio_Snapshot within the reporting period and compute period-over-period changes in portfolio value, active pool, reserve pool, and cumulative return.
+4. WHEN collecting recommendation accuracy data, THE Feedback_Engine SHALL join recommendations with Prediction_Outcomes to compute win rate, directional accuracy, and average excess return vs SPY for the period.
+5. IF no trading_decisions exist for the requested period, THEN THE Feedback_Engine SHALL generate a report with zero-activity sections and a note indicating no trading occurred.
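The accuracy aggregation in criterion 4 could look roughly like the sketch below. It is illustrative only: `OutcomeRow`, `accuracy_metrics`, and the `excess_return_vs_spy` field name are hypothetical, and treating `profitable` as the win-rate signal is an assumption rather than something the requirements state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutcomeRow:
    """One recommendation joined with its prediction outcome (hypothetical shape)."""
    ticker: str
    direction_correct: bool
    profitable: bool
    excess_return_vs_spy: float  # assumed field name; a fraction, so 0.01 means 1%


def accuracy_metrics(rows: list[OutcomeRow]) -> dict[str, Optional[float]]:
    """Compute win rate, directional accuracy, and average excess return vs SPY.

    Every metric is None when there are no rows, so a zero-activity period is
    reported explicitly instead of as a misleading 0.0.
    """
    n = len(rows)
    if n == 0:
        return {"win_rate": None, "directional_accuracy": None, "avg_excess_return": None}
    return {
        # Assumption: a "win" is an outcome flagged as profitable.
        "win_rate": sum(r.profitable for r in rows) / n,
        "directional_accuracy": sum(r.direction_correct for r in rows) / n,
        "avg_excess_return": sum(r.excess_return_vs_spy for r in rows) / n,
    }
```

Returning `None` for an empty period keeps the zero-activity case from criterion 5 distinguishable from a period in which every trade lost.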
+
+### Requirement 2: Chunked AI Summarization
+
+**User Story:** As a trader, I want AI-generated summaries in my reports, so that I can quickly understand performance trends without reading raw numbers.
+
+#### Acceptance Criteria
+
+1. THE Report_Summarizer_Agent SHALL be registered in the `ai_agents` table with slug `report-summarizer`, model `qwen3.5:9b-fast`, and source `system`.
+2. WHEN generating a summary for a Report_Section, THE Feedback_Engine SHALL serialize the section data into Chunks of no more than 6,000 characters each to stay within the 8k-token context window.
+3. WHEN a Report_Section contains data that exceeds a single Chunk, THE Feedback_Engine SHALL split the data into multiple Chunks, summarize each Chunk independently, and then produce a final merged summary from the individual Chunk summaries.
+4. WHEN invoking the Report_Summarizer_Agent, THE Feedback_Engine SHALL use the existing `AgentConfigResolver` and `llm_factory` infrastructure to resolve model configuration and build the LLM client.
+5. WHEN invoking the Report_Summarizer_Agent, THE Feedback_Engine SHALL log each invocation to the `agent_performance_log` table with agent_id, success status, duration_ms, and token estimates.
+6. IF the Report_Summarizer_Agent fails after max_retries, THEN THE Feedback_Engine SHALL fall back to a deterministic text summary built from the raw metrics and continue report generation.
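A minimal sketch of the chunk-then-merge flow from criteria 2 and 3 follows. The names `chunk_rows` and `summarize_section` are hypothetical, rows are assumed to be plain dictionaries, and the `summarize` callable stands in for the real agent invocation, which is not shown here.

```python
import json
from typing import Any, Callable

MAX_CHUNK_CHARS = 6000  # character limit from acceptance criterion 2


def chunk_rows(rows: list[dict[str, Any]], max_chars: int = MAX_CHUNK_CHARS) -> list[str]:
    """Greedily pack JSON-serialized rows into newline-joined chunks of at most max_chars."""
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for row in rows:
        line = json.dumps(row, sort_keys=True, default=str)
        if len(line) > max_chars:
            raise ValueError("a single row exceeds the chunk limit")
        # A newline separator is only needed when the chunk already has content.
        added = len(line) + (1 if current else 0)
        if current_len + added > max_chars:
            chunks.append("\n".join(current))
            current, current_len = [line], len(line)
        else:
            current.append(line)
            current_len += added
    if current:
        chunks.append("\n".join(current))
    return chunks


def summarize_section(rows: list[dict[str, Any]], summarize: Callable[[str], str]) -> str:
    """Summarize each chunk independently, then merge the partial summaries."""
    chunks = chunk_rows(rows)
    if not chunks:
        return "No data for this period."
    partials = [summarize(chunk) for chunk in chunks]
    if len(partials) == 1:
        return partials[0]
    # Final pass: one more call over the concatenated chunk summaries.
    return summarize("\n".join(partials))
```

One caveat with this sketch: for a very large section the concatenated partial summaries could themselves exceed the limit, in which case the merge step would need to be applied recursively.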
+
+### Requirement 3: Report Structure and Content
+
+**User Story:** As a trader, I want reports to cover P&L, recommendation accuracy, position performance, risk metrics, and model quality, so that I have a comprehensive view of system performance.
+
+#### Acceptance Criteria
+
+1. THE Report SHALL contain a P&L section with realized P&L, unrealized P&L, daily return, cumulative return, win count, loss count, win rate, profit factor, and Sharpe ratio for the reporting period.
+2. THE Report SHALL contain a recommendation accuracy section with total recommendations evaluated, act/skip breakdown, win rate of acted-upon recommendations, and average confidence of acted vs skipped recommendations.
+3. THE Report SHALL contain a position performance section listing each position held during the period with ticker, entry price, current or exit price, unrealized or realized P&L, P&L percentage, and hold duration.
+4. THE Report SHALL contain a risk metrics section with current risk tier, portfolio heat, max drawdown, current drawdown percentage, reserve pool balance, and a count of circuit breaker events during the period.
+5. THE Report SHALL contain a model quality section with the latest Model_Metric_Snapshot values for win rate, directional accuracy, information coefficient, calibration error (ECE), and Brier score across the 7d, 30d, and 90d lookback windows.
+6. THE Report SHALL contain an AI-generated executive summary that synthesizes the key findings from all sections into a concise narrative of no more than 300 words.
+
+### Requirement 4: Report Validation Against Live Data
+
+**User Story:** As a trader, I want report metrics to be cross-checked against live validation data, so that I can trust the accuracy of the reported numbers.
+
+#### Acceptance Criteria
+
+1. WHEN generating the recommendation accuracy section, THE Feedback_Engine SHALL cross-reference reported win rates with the `direction_correct` and `profitable` fields from Prediction_Outcomes for the same tickers and period.
+2. WHEN generating the model quality section, THE Feedback_Engine SHALL compare the reported metrics against the most recent Model_Metric_Snapshot records and flag discrepancies greater than 5% between computed and snapshot values.
+3. WHEN a validation discrepancy is detected, THE Feedback_Engine SHALL include a `validation_warnings` array in the report section with the field name, computed value, snapshot value, and percentage difference.
+4. THE Report SHALL include a `validation_status` field set to `passed` when no discrepancies exceed 5%, or `warnings` when one or more discrepancies are detected.
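The discrepancy check could be sketched as below. This assumes the 5% threshold is a relative difference measured against the snapshot value, which the criteria do not spell out. The function names and the warning keys are hypothetical.

```python
from typing import Any, Optional

THRESHOLD_PCT = 5.0  # discrepancy threshold from criteria 2 and 4


def pct_difference(computed: float, snapshot: float) -> Optional[float]:
    """Relative difference in percent of the snapshot value.

    Returns None when the snapshot is zero but the computed value is not,
    because the relative difference is undefined in that case.
    """
    if snapshot == 0:
        return 0.0 if computed == 0 else None
    return abs(computed - snapshot) / abs(snapshot) * 100.0


def validate_section(
    computed: dict[str, float],
    snapshot: dict[str, float],
    threshold_pct: float = THRESHOLD_PCT,
) -> list[dict[str, Any]]:
    """Build the validation_warnings entries for fields present in both inputs."""
    warnings: list[dict[str, Any]] = []
    for field, snap_value in snapshot.items():
        if field not in computed:
            continue
        diff = pct_difference(computed[field], snap_value)
        # An undefined difference is flagged as well, so it is never silently passed.
        if diff is None or diff > threshold_pct:
            warnings.append({
                "field": field,
                "computed_value": computed[field],
                "snapshot_value": snap_value,
                "pct_difference": diff,
            })
    return warnings


def validation_status(warnings: list[dict[str, Any]]) -> str:
    """Map the warnings list to the report-level status string."""
    return "warnings" if warnings else "passed"
```

Using `None` instead of infinity for the undefined case keeps the warning entry serializable as standard JSON.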
+
+### Requirement 5: Report Storage and Retrieval
+
+**User Story:** As a trader, I want reports stored in the database and accessible via API, so that I can review historical performance at any time.
+
+#### Acceptance Criteria
+
+1. THE Feedback_Engine SHALL store each generated Report as a row in a `trading_reports` table with columns for id (UUID), report_type (daily/weekly), period_start (DATE), period_end (DATE), report_data (JSONB), validation_status (VARCHAR), generated_at (TIMESTAMPTZ), and created_at (TIMESTAMPTZ).
+2. THE Feedback_Engine SHALL enforce a unique constraint on (report_type, period_start, period_end) to prevent duplicate reports for the same period.
+3. WHEN a report for an existing period is regenerated, THE Feedback_Engine SHALL update the existing row with the new report_data, validation_status, and generated_at timestamp.
+4. THE Query_API SHALL expose a `GET /api/reports` endpoint that returns a paginated list of reports with id, report_type, period_start, period_end, validation_status, and generated_at.
+5. THE Query_API SHALL expose a `GET /api/reports/{report_id}` endpoint that returns the full report including report_data JSONB.
+6. THE Query_API SHALL support filtering reports by report_type and date range via query parameters on the `GET /api/reports` endpoint.
+
+### Requirement 6: Periodic Report Generation
+
+**User Story:** As a trader, I want reports generated automatically on a daily and weekly schedule, so that I always have up-to-date performance feedback.
+
+#### Acceptance Criteria
+
+1. THE Feedback_Engine SHALL generate a daily report after market close (after 16:30 ET) covering the current trading day.
+2. THE Feedback_Engine SHALL generate a weekly report on Saturday covering the Monday-through-Friday trading week.
+3. WHEN a scheduled report generation is triggered, THE Feedback_Engine SHALL enqueue a report generation job on a Redis queue for asynchronous processing.
+4. IF a report generation job fails, THEN THE Feedback_Engine SHALL retry the job up to 3 times with exponential backoff before marking the job as failed.
+5. WHILE a report generation job is in progress for a given period, THE Feedback_Engine SHALL reject duplicate job submissions for the same report_type and period.
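The scheduling rules above can be illustrated with a few small helpers. This is a sketch under stated assumptions: the helper names, the key format, and the 2-second base delay are all hypothetical, since the requirements fix only the retry count and the Saturday trigger.

```python
from datetime import date, timedelta


def weekly_period(run_date: date) -> tuple[date, date]:
    """Return (Monday, Friday) of the trading week preceding a Saturday run."""
    if run_date.weekday() != 5:  # Monday is 0, so Saturday is 5
        raise ValueError("weekly reports are triggered on Saturday")
    return run_date - timedelta(days=5), run_date - timedelta(days=1)


def dedup_key(report_type: str, period_start: date, period_end: date) -> str:
    """Build a key identifying a job by report type and period (hypothetical format)."""
    return f"report-job:{report_type}:{period_start.isoformat()}:{period_end.isoformat()}"


def backoff_delays(max_retries: int = 3, base_seconds: float = 2.0) -> list[float]:
    """Exponential backoff: the delay doubles before each successive retry."""
    return [base_seconds * 2 ** attempt for attempt in range(max_retries)]
```

For example, a run on Saturday 2026-05-02 would cover 2026-04-27 through 2026-05-01. One way to enforce criterion 5 would be to set the dedup key in Redis with the `NX` option and an expiry while a job is in progress, although the commit does not say which mechanism is used.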
+
+### Requirement 7: Agent Registration and Editability
+
+**User Story:** As a trader, I want the report summarizer agent registered in the ai_agents table, so that I can edit its prompts, model, and parameters through the existing agent management API.
+
+#### Acceptance Criteria
+
+1. THE Feedback_Engine SHALL register the Report_Summarizer_Agent in the `ai_agents` table via a database migration with slug `report-summarizer`, source `system`, model_provider `ollama`, and model_name `qwen3.5:9b-fast`.
+2. THE Report_Summarizer_Agent system prompt SHALL instruct the model to produce concise financial performance summaries, avoid fabricating data not present in the input, and keep each summary under 200 words.
+3. THE Report_Summarizer_Agent SHALL support variant creation and activation through the existing agent variants system, allowing A/B testing of different summarization prompts.
+4. WHEN the Report_Summarizer_Agent configuration is updated via the agent management API, THE Feedback_Engine SHALL pick up the new configuration within 60 seconds via the `AgentConfigResolver` TTL cache.
+
+### Requirement 8: Report Serialization Round-Trip
+
+**User Story:** As a developer, I want report data to survive serialization and deserialization without data loss, so that stored reports are always faithful to the generated content.
+
+#### Acceptance Criteria
+
+1. THE Feedback_Engine SHALL serialize Report objects to JSON for storage in the `report_data` JSONB column.
+2. THE Feedback_Engine SHALL deserialize stored JSON back into Report objects for API responses.
+3. FOR ALL valid Report objects, serializing to JSON then deserializing back SHALL produce an equivalent Report object (round-trip property).
+4. THE Feedback_Engine SHALL use ISO 8601 format for all datetime fields in serialized reports.
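The round-trip property can be illustrated with a simplified model. The `Report` dataclass below is a hypothetical stand-in with far fewer fields than the real report, and it assumes `sections` holds only JSON-native values (no tuples, NaN, or custom objects).

```python
import json
from dataclasses import asdict, dataclass
from datetime import date, datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Report:
    """Simplified stand-in for the real report model."""
    report_type: str
    period_start: date
    period_end: date
    generated_at: datetime
    validation_status: str
    sections: dict[str, Any]

    def to_json(self) -> str:
        payload = asdict(self)
        # Dates and datetimes are written as ISO 8601 strings (criterion 4).
        payload["period_start"] = self.period_start.isoformat()
        payload["period_end"] = self.period_end.isoformat()
        payload["generated_at"] = self.generated_at.isoformat()
        return json.dumps(payload, sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "Report":
        data = json.loads(raw)
        return cls(
            report_type=data["report_type"],
            period_start=date.fromisoformat(data["period_start"]),
            period_end=date.fromisoformat(data["period_end"]),
            generated_at=datetime.fromisoformat(data["generated_at"]),
            validation_status=data["validation_status"],
            sections=data["sections"],
        )


report = Report(
    report_type="daily",
    period_start=date(2026, 5, 1),
    period_end=date(2026, 5, 1),
    generated_at=datetime(2026, 5, 1, 21, 0, tzinfo=timezone.utc),
    validation_status="passed",
    sections={"pnl": {"realized": 12.5, "win_count": 3}},
)
# Round-trip property for a single example.
assert Report.from_json(report.to_json()) == report
```

This checks one example only. A property-based test, as the commit message mentions, would generate many `Report` values and assert the same equality for each.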