- Migration 038: trading_reports table + report-summarizer agent seed
- 6 reporting modules: models, collector, sections, validator, summarizer, generator
- API endpoints: GET /api/reports (paginated, filterable), GET /api/reports/{id}
- Frontend hooks: useReports, useReport with TanStack Query
- Scheduler: daily (after 16:30 ET) and weekly (Saturday) report triggers
- Redis queue consumer for async report generation with retry/dedup
- 5 property-based tests (chunking, serialization, validation, accuracy, deltas)
- 109 unit/integration tests across all modules
- 6 frontend hook tests with MSW mocks
Requirements Document
Introduction
The Trading Feedback Engine generates periodic performance reports from the Stonks Oracle trading system. Reports cover trading P&L, recommendation accuracy, position performance, risk metrics, and model quality trends. An AI agent (registered in the ai_agents table) summarizes sections of the report by processing data in small chunks that fit within the 8k-token context window. Reports are validated against live data from the prediction outcomes and model metric snapshots tables, stored in the database for retrieval, and exposed via API endpoints.
Glossary
- Feedback_Engine: The backend service that orchestrates report generation, data collection, AI summarization, and report storage.
- Report_Summarizer_Agent: The AI agent registered in the `ai_agents` table that generates natural-language summaries for report sections. Uses the existing `AgentConfigResolver` and `llm_factory` infrastructure.
- Report: A structured JSON document containing trading performance metrics, AI-generated summaries, and validation data for a specific period (daily or weekly).
- Report_Section: A self-contained portion of a report (e.g., P&L summary, recommendation accuracy, position performance) that can be independently generated and summarized.
- Chunk: A subset of data rows small enough to fit within the 8k-token context window when serialized, allowing the Report_Summarizer_Agent to process it in a single LLM call.
- Portfolio_Snapshot: A daily record in the `portfolio_snapshots` table containing portfolio value, pool balances, returns, win/loss counts, Sharpe ratio, max drawdown, and risk tier.
- Prediction_Outcome: A record in the `prediction_outcomes` table containing realized returns, direction correctness, and excess returns vs benchmarks for a prediction at a specific horizon.
- Model_Metric_Snapshot: A record in the `model_metric_snapshots` table containing aggregate model quality metrics (win rate, IC, ECE, Brier score) for a lookback/horizon combination.
- Trading_Decision: A record in the `trading_decisions` table capturing the act/skip decision, skip reason, position sizing, risk tier, circuit breaker status, and decision trace for a recommendation evaluation.
- Validation_Data: Live data from `prediction_outcomes`, `model_metric_snapshots`, and `signal_evidence_links` used to cross-check report claims against actual measured performance.
- Query_API: The existing FastAPI service (`services/api/app.py`) that serves HTTP endpoints for the dashboard and external consumers.
Requirements
Requirement 1: Report Data Collection
User Story: As a trader, I want the feedback engine to collect all relevant trading data for a reporting period, so that reports reflect the complete picture of trading activity.
Acceptance Criteria
- WHEN a report generation is triggered for a date range, THE Feedback_Engine SHALL query trading_decisions, orders, positions, portfolio_snapshots, recommendations, prediction_outcomes, and model_metric_snapshots for that period.
- WHEN collecting trading decision data, THE Feedback_Engine SHALL include the decision type, skip reason, ticker, computed position size, risk tier, circuit breaker status, and correlation check result for each Trading_Decision.
- WHEN collecting portfolio data, THE Feedback_Engine SHALL retrieve the most recent Portfolio_Snapshot within the reporting period and compute period-over-period changes in portfolio value, active pool, reserve pool, and cumulative return.
- WHEN collecting recommendation accuracy data, THE Feedback_Engine SHALL join recommendations with Prediction_Outcomes to compute win rate, directional accuracy, and average excess return vs SPY for the period.
- IF no trading_decisions exist for the requested period, THEN THE Feedback_Engine SHALL generate a report with zero-activity sections and a note indicating no trading occurred.
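A non-normative sketch of the recommendation accuracy aggregation described above. The `Outcome` shape and the `excess_return_vs_spy` field name are assumptions made for illustration, as is the choice to count a "win" via the `profitable` flag; the actual join and column names live in the collector module.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Outcome:
    """One recommendation joined with its prediction outcome (hypothetical shape)."""
    ticker: str
    direction_correct: bool
    profitable: bool
    excess_return_vs_spy: float  # assumed field name


def accuracy_metrics(outcomes: Sequence[Outcome]) -> dict:
    """Aggregate accuracy metrics; all rates are None when nothing was evaluated."""
    n = len(outcomes)
    if n == 0:
        # Zero-activity case: report the count and leave the rates undefined.
        return {
            "evaluated": 0,
            "win_rate": None,
            "directional_accuracy": None,
            "avg_excess_return_vs_spy": None,
        }
    return {
        "evaluated": n,
        "win_rate": sum(o.profitable for o in outcomes) / n,
        "directional_accuracy": sum(o.direction_correct for o in outcomes) / n,
        "avg_excess_return_vs_spy": sum(o.excess_return_vs_spy for o in outcomes) / n,
    }
```

Returning `None` rather than `0.0` for an empty period keeps "no data" distinguishable from "every recommendation lost".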
Requirement 2: Chunked AI Summarization
User Story: As a trader, I want AI-generated summaries in my reports, so that I can quickly understand performance trends without reading raw numbers.
Acceptance Criteria
- THE Report_Summarizer_Agent SHALL be registered in the `ai_agents` table with slug `report-summarizer`, model `qwen3.5:9b-fast`, and source `system`.
- WHEN generating a summary for a Report_Section, THE Feedback_Engine SHALL serialize the section data into Chunks of no more than 6,000 characters each to stay within the 8k-token context window.
- WHEN a Report_Section contains data that exceeds a single Chunk, THE Feedback_Engine SHALL split the data into multiple Chunks, summarize each Chunk independently, and then produce a final merged summary from the individual Chunk summaries.
- WHEN invoking the Report_Summarizer_Agent, THE Feedback_Engine SHALL use the existing `AgentConfigResolver` and `llm_factory` infrastructure to resolve model configuration and build the LLM client.
- WHEN invoking the Report_Summarizer_Agent, THE Feedback_Engine SHALL log each invocation to the `agent_performance_log` table with agent_id, success status, duration_ms, and token estimates.
- IF the Report_Summarizer_Agent fails after max_retries, THEN THE Feedback_Engine SHALL fall back to a deterministic text summary built from the raw metrics and continue report generation.
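The chunk-then-merge flow above can be sketched as follows. This is a minimal illustration, not the summarizer module itself: `chunk_rows`, `summarize_section`, and the `summarize`/`fallback` callables are hypothetical names, the LLM call is stubbed out as a plain function, and retries are assumed to happen inside `summarize`.

```python
import json
from typing import Callable, Iterable, List

MAX_CHUNK_CHARS = 6000  # limit stated in Requirement 2


def chunk_rows(rows: Iterable[dict], max_chars: int = MAX_CHUNK_CHARS) -> List[str]:
    """Greedily pack JSON-serialized rows into newline-joined chunks <= max_chars."""
    chunks: List[str] = []
    current = ""
    for row in rows:
        line = json.dumps(row, sort_keys=True, default=str)
        # A single oversized row is hard-split so no chunk exceeds the limit.
        pieces = [line[i:i + max_chars] for i in range(0, len(line), max_chars)]
        for piece in pieces:
            candidate = piece if not current else current + "\n" + piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


def summarize_section(
    rows: Iterable[dict],
    summarize: Callable[[str], str],
    fallback: Callable[[list], str],
) -> str:
    """Summarize each chunk, merge the partial summaries, or fall back on failure."""
    rows = list(rows)
    try:
        partials = [summarize(chunk) for chunk in chunk_rows(rows)]
        if not partials:
            return fallback(rows)
        if len(partials) == 1:
            return partials[0]
        # Final pass: one more call that merges the per-chunk summaries.
        return summarize("\n".join(partials))
    except Exception:
        # Deterministic summary built from raw metrics keeps generation going.
        return fallback(rows)
```

One caveat of this sketch: the merge input is not itself re-chunked, so a section with very many chunks could produce a merge prompt longer than 6,000 characters.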
Requirement 3: Report Structure and Content
User Story: As a trader, I want reports to cover P&L, recommendation accuracy, position performance, risk metrics, and model quality, so that I have a comprehensive view of system performance.
Acceptance Criteria
- THE Report SHALL contain a P&L section with realized P&L, unrealized P&L, daily return, cumulative return, win count, loss count, win rate, profit factor, and Sharpe ratio for the reporting period.
- THE Report SHALL contain a recommendation accuracy section with total recommendations evaluated, act/skip breakdown, win rate of acted-upon recommendations, and average confidence of acted vs skipped recommendations.
- THE Report SHALL contain a position performance section listing each position held during the period with ticker, entry price, current or exit price, unrealized or realized P&L, P&L percentage, and hold duration.
- THE Report SHALL contain a risk metrics section with current risk tier, portfolio heat, max drawdown, current drawdown percentage, reserve pool balance, and a count of circuit breaker events during the period.
- THE Report SHALL contain a model quality section with the latest Model_Metric_Snapshot values for win rate, directional accuracy, information coefficient, calibration error (ECE), and Brier score across the 7d, 30d, and 90d lookback windows.
- THE Report SHALL contain an AI-generated executive summary that synthesizes the key findings from all sections into a concise narrative of no more than 300 words.
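As a worked illustration of three of the P&L section fields, the sketch below derives win count, loss count, win rate, and profit factor from a list of realized trade P&Ls. The function name `pnl_stats` is hypothetical, and excluding breakeven trades from the win rate denominator is an assumption; profit factor is the usual gross profit divided by gross loss.

```python
from typing import Optional, Sequence


def pnl_stats(realized_pnls: Sequence[float]) -> dict:
    """Derive win/loss counts, win rate, and profit factor from closed-trade P&Ls."""
    wins = [p for p in realized_pnls if p > 0]
    losses = [p for p in realized_pnls if p < 0]
    decided = len(wins) + len(losses)  # breakeven trades are ignored (assumption)
    gross_profit = sum(wins)
    gross_loss = -sum(losses)
    win_rate: Optional[float] = len(wins) / decided if decided else None
    # Profit factor is undefined when there are no losing trades.
    profit_factor: Optional[float] = gross_profit / gross_loss if gross_loss else None
    return {
        "realized_pnl": sum(realized_pnls),
        "win_count": len(wins),
        "loss_count": len(losses),
        "win_rate": win_rate,
        "profit_factor": profit_factor,
    }
```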
Requirement 4: Report Validation Against Live Data
User Story: As a trader, I want report metrics to be cross-checked against live validation data, so that I can trust the accuracy of the reported numbers.
Acceptance Criteria
- WHEN generating the recommendation accuracy section, THE Feedback_Engine SHALL cross-reference reported win rates with the `direction_correct` and `profitable` fields from Prediction_Outcomes for the same tickers and period.
- WHEN generating the model quality section, THE Feedback_Engine SHALL compare the reported metrics against the most recent Model_Metric_Snapshot records and flag discrepancies greater than 5% between computed and snapshot values.
- WHEN a validation discrepancy is detected, THE Feedback_Engine SHALL include a `validation_warnings` array in the report section with the field name, computed value, snapshot value, and percentage difference.
- THE Report SHALL include a `validation_status` field set to `passed` when no discrepancies exceed 5%, or `warnings` when one or more discrepancies are detected.
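The discrepancy check can be sketched as below. Two points are assumptions rather than stated requirements: the 5% threshold is read as a relative difference against the snapshot value, and a nonzero computed value against a zero snapshot is always flagged with an undefined percentage. The function and key names are illustrative only.

```python
from typing import Dict, List, Optional

THRESHOLD_PCT = 5.0  # discrepancy threshold from Requirement 4


def pct_diff(computed: float, snapshot: float) -> Optional[float]:
    """Relative difference in percent; None when undefined (snapshot is zero)."""
    if snapshot == 0:
        return 0.0 if computed == 0 else None
    return abs(computed - snapshot) / abs(snapshot) * 100.0


def validate_section(computed: Dict[str, float], snapshot: Dict[str, float]) -> List[dict]:
    """Return validation warnings for fields present in both inputs."""
    warnings: List[dict] = []
    for name in sorted(computed.keys() & snapshot.keys()):
        diff = pct_diff(computed[name], snapshot[name])
        if diff is None or diff > THRESHOLD_PCT:
            warnings.append({
                "field": name,
                "computed_value": computed[name],
                "snapshot_value": snapshot[name],
                "percentage_difference": diff,
            })
    return warnings


def validation_status(all_warnings: List[dict]) -> str:
    """Map the collected warnings to the report-level status string."""
    return "warnings" if all_warnings else "passed"
```

For example, a computed win rate of 0.58 against a snapshot of 0.55 differs by about 5.45% and is flagged, while a Brier score of 0.245 against 0.25 differs by 2% and passes.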
Requirement 5: Report Storage and Retrieval
User Story: As a trader, I want reports stored in the database and accessible via API, so that I can review historical performance at any time.
Acceptance Criteria
- THE Feedback_Engine SHALL store each generated Report as a row in a `trading_reports` table with columns for id (UUID), report_type (daily/weekly), period_start (DATE), period_end (DATE), report_data (JSONB), validation_status (VARCHAR), generated_at (TIMESTAMPTZ), and created_at (TIMESTAMPTZ).
- THE Feedback_Engine SHALL enforce a unique constraint on (report_type, period_start, period_end) to prevent duplicate reports for the same period.
- WHEN a report for an existing period is regenerated, THE Feedback_Engine SHALL update the existing row with the new report_data, validation_status, and generated_at timestamp.
- THE Query_API SHALL expose a `GET /api/reports` endpoint that returns a paginated list of reports with id, report_type, period_start, period_end, validation_status, and generated_at.
- THE Query_API SHALL expose a `GET /api/reports/{report_id}` endpoint that returns the full report including report_data JSONB.
- THE Query_API SHALL support filtering reports by report_type and date range via query parameters on the `GET /api/reports` endpoint.
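The unique constraint and regenerate-as-update behaviour map naturally onto an upsert. The sketch below uses Python's built-in `sqlite3` purely as a runnable stand-in for the real database, so the column types are simplified to TEXT; the `ON CONFLICT ... DO UPDATE` clause shown is also valid PostgreSQL syntax. `save_report` and `list_reports` are hypothetical helper names, not the actual generator or API code.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# Simplified stand-in schema: the real table uses UUID, DATE, JSONB, TIMESTAMPTZ.
DDL = """
CREATE TABLE trading_reports (
    id TEXT PRIMARY KEY,
    report_type TEXT NOT NULL,
    period_start TEXT NOT NULL,
    period_end TEXT NOT NULL,
    report_data TEXT NOT NULL,
    validation_status TEXT NOT NULL,
    generated_at TEXT NOT NULL,
    created_at TEXT NOT NULL,
    UNIQUE (report_type, period_start, period_end)
)
"""

UPSERT = """
INSERT INTO trading_reports
    (id, report_type, period_start, period_end,
     report_data, validation_status, generated_at, created_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT (report_type, period_start, period_end) DO UPDATE SET
    report_data = excluded.report_data,
    validation_status = excluded.validation_status,
    generated_at = excluded.generated_at
"""


def save_report(conn, report_type, period_start, period_end, report_data, status):
    """Insert a report, or update the existing row for the same period."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(UPSERT, (str(uuid.uuid4()), report_type, period_start, period_end,
                          json.dumps(report_data), status, now, now))
    conn.commit()


def list_reports(conn, report_type=None, limit=20, offset=0):
    """Paginated listing without report_data, optionally filtered by type."""
    sql = ("SELECT id, report_type, period_start, period_end, "
           "validation_status, generated_at FROM trading_reports")
    params = []
    if report_type is not None:
        sql += " WHERE report_type = ?"
        params.append(report_type)
    sql += " ORDER BY period_end DESC LIMIT ? OFFSET ?"
    params += [limit, offset]
    return conn.execute(sql, params).fetchall()
```

On regeneration the row keeps its original id and created_at, since only the three columns named in the update clause change.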
Requirement 6: Periodic Report Generation
User Story: As a trader, I want reports generated automatically on a daily and weekly schedule, so that I always have up-to-date performance feedback.
Acceptance Criteria
- THE Feedback_Engine SHALL generate a daily report after market close (after 16:30 ET) covering the current trading day.
- THE Feedback_Engine SHALL generate a weekly report on Saturday covering the Monday-through-Friday trading week.
- WHEN a scheduled report generation is triggered, THE Feedback_Engine SHALL enqueue a report generation job on a Redis queue for asynchronous processing.
- IF a report generation job fails, THEN THE Feedback_Engine SHALL retry the job up to 3 times with exponential backoff before marking the job as failed.
- WHILE a report generation job is in progress for a given period, THE Feedback_Engine SHALL reject duplicate job submissions for the same report_type and period.
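The period, retry, and dedup rules above can be sketched without a Redis dependency. Everything here is illustrative: the key format, the 2-second backoff base, and the in-memory guard are assumptions, and "retry up to 3 times" is read as one initial attempt plus at most three retries. With Redis, the guard would typically be a key set only if absent (the `NX` option of `SET`) with an expiry.

```python
from datetime import date, timedelta
from typing import Callable, List, Set, Tuple

MAX_RETRIES = 3  # from Requirement 6


def weekly_period(saturday: date) -> Tuple[date, date]:
    """Monday-through-Friday range for a weekly report triggered on a Saturday."""
    if saturday.weekday() != 5:
        raise ValueError("weekly reports are triggered on Saturday")
    return saturday - timedelta(days=5), saturday - timedelta(days=1)


def job_key(report_type: str, start: date, end: date) -> str:
    """Dedup key for one (report_type, period); the format is hypothetical."""
    return f"report-job:{report_type}:{start.isoformat()}:{end.isoformat()}"


def backoff_delays(base_seconds: float = 2.0, retries: int = MAX_RETRIES) -> List[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ..."""
    return [base_seconds * 2 ** i for i in range(retries)]


class InMemoryJobGuard:
    """Rejects a second submission while a job with the same key is running."""

    def __init__(self) -> None:
        self._in_progress: Set[str] = set()

    def try_start(self, key: str) -> bool:
        if key in self._in_progress:
            return False
        self._in_progress.add(key)
        return True

    def finish(self, key: str) -> None:
        self._in_progress.discard(key)


def run_with_retry(job: Callable[[], None], sleep: Callable[[float], None]) -> bool:
    """Run job once, then retry with backoff; False means the job is marked failed."""
    delays = backoff_delays()
    for attempt in range(len(delays) + 1):
        try:
            job()
            return True
        except Exception:
            if attempt == len(delays):
                return False
            sleep(delays[attempt])
    return False
```

Passing `sleep` in as a parameter keeps the retry loop testable without real waiting; production code would pass `time.sleep` or schedule a delayed re-enqueue instead.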
Requirement 7: Agent Registration and Editability
User Story: As a trader, I want the report summarizer agent registered in the ai_agents table, so that I can edit its prompts, model, and parameters through the existing agent management API.
Acceptance Criteria
- THE Feedback_Engine SHALL register the Report_Summarizer_Agent in the `ai_agents` table via a database migration with slug `report-summarizer`, source `system`, model_provider `ollama`, and model_name `qwen3.5:9b-fast`.
- THE Report_Summarizer_Agent system prompt SHALL instruct the model to produce concise financial performance summaries, avoid fabricating data not present in the input, and keep each summary under 200 words.
- THE Report_Summarizer_Agent SHALL support variant creation and activation through the existing agent variants system, allowing A/B testing of different summarization prompts.
- WHEN the Report_Summarizer_Agent configuration is updated via the agent management API, THE Feedback_Engine SHALL pick up the new configuration within 60 seconds via the `AgentConfigResolver` TTL cache.
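To illustrate why a TTL cache bounds the staleness of agent configuration at 60 seconds, here is a generic sketch. It is not the `AgentConfigResolver` implementation, whose internals are not described in this document; `TTLCache`, its `loader` callable, and the injectable `clock` are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Caches loader results per key and reloads them once they are ttl_seconds old."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: serve the cached configuration
        value = self._loader(key)  # expired or missing: reload from the source
        self._entries[key] = (now, value)
        return value
```

Because an entry is never served once it is 60 seconds old, an edit made through the agent management API becomes visible on the first lookup after the cached copy expires.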
Requirement 8: Report Serialization Round-Trip
User Story: As a developer, I want report data to survive serialization and deserialization without data loss, so that stored reports are always faithful to the generated content.
Acceptance Criteria
- THE Feedback_Engine SHALL serialize Report objects to JSON for storage in the `report_data` JSONB column.
- THE Feedback_Engine SHALL deserialize stored JSON back into Report objects for API responses.
- FOR ALL valid Report objects, serializing to JSON then deserializing back SHALL produce an equivalent Report object (round-trip property).
- THE Feedback_Engine SHALL use ISO 8601 format for all datetime fields in serialized reports.
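A minimal sketch of the round-trip property, assuming a simplified `Report` dataclass whose fields are hypothetical and much smaller than the real model. Dates and datetimes are written with `isoformat()`, which emits ISO 8601, and parsed back with `fromisoformat()`.

```python
import json
from dataclasses import dataclass, field
from datetime import date, datetime


@dataclass
class Report:
    """Simplified stand-in for the real Report model (illustrative fields only)."""
    report_type: str
    period_start: date
    period_end: date
    generated_at: datetime
    validation_status: str
    sections: dict = field(default_factory=dict)

    def to_json(self) -> str:
        return json.dumps({
            "report_type": self.report_type,
            "period_start": self.period_start.isoformat(),
            "period_end": self.period_end.isoformat(),
            "generated_at": self.generated_at.isoformat(),
            "validation_status": self.validation_status,
            "sections": self.sections,
        }, sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "Report":
        data = json.loads(raw)
        return cls(
            report_type=data["report_type"],
            period_start=date.fromisoformat(data["period_start"]),
            period_end=date.fromisoformat(data["period_end"]),
            generated_at=datetime.fromisoformat(data["generated_at"]),
            validation_status=data["validation_status"],
            sections=data["sections"],
        )
```

The equivalence only holds when `sections` contains JSON-native values: tuples come back as lists, non-string dict keys come back as strings, and NaN is not valid JSON, so a property-based test generator needs to be restricted accordingly.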