stonks-oracle/.kiro/specs/trading-feedback-engine/requirements.md
Celes Renata bc077bfcc8
feat: trading feedback engine — periodic performance reports with AI summarization
- Migration 038: trading_reports table + report-summarizer agent seed
- 6 reporting modules: models, collector, sections, validator, summarizer, generator
- API endpoints: GET /api/reports (paginated, filterable), GET /api/reports/{id}
- Frontend hooks: useReports, useReport with TanStack Query
- Scheduler: daily (after 16:30 ET) and weekly (Saturday) report triggers
- Redis queue consumer for async report generation with retry/dedup
- 5 property-based tests (chunking, serialization, validation, accuracy, deltas)
- 109 unit/integration tests across all modules
- 6 frontend hook tests with MSW mocks
2026-05-01 22:13:09 +00:00

Requirements Document

Introduction

The Trading Feedback Engine generates periodic performance reports from the Stonks Oracle trading system. Reports cover trading P&L, recommendation accuracy, position performance, risk metrics, and model quality trends. An AI agent (registered in the ai_agents table) summarizes sections of the report by processing data in small chunks that fit within the 8k-token context window. Reports are validated against live data from the prediction_outcomes and model_metric_snapshots tables, stored in the database for retrieval, and exposed via API endpoints.

Glossary

  • Feedback_Engine: The backend service that orchestrates report generation, data collection, AI summarization, and report storage.
  • Report_Summarizer_Agent: The AI agent registered in the ai_agents table that generates natural-language summaries for report sections. Uses the existing AgentConfigResolver and llm_factory infrastructure.
  • Report: A structured JSON document containing trading performance metrics, AI-generated summaries, and validation data for a specific period (daily or weekly).
  • Report_Section: A self-contained portion of a report (e.g., P&L summary, recommendation accuracy, position performance) that can be independently generated and summarized.
  • Chunk: A subset of data rows small enough to fit within the 8k-token context window when serialized, allowing the Report_Summarizer_Agent to process it in a single LLM call.
  • Portfolio_Snapshot: A daily record in the portfolio_snapshots table containing portfolio value, pool balances, returns, win/loss counts, Sharpe ratio, max drawdown, and risk tier.
  • Prediction_Outcome: A record in the prediction_outcomes table containing realized returns, direction correctness, and excess returns vs benchmarks for a prediction at a specific horizon.
  • Model_Metric_Snapshot: A record in the model_metric_snapshots table containing aggregate model quality metrics (win rate, IC, ECE, Brier score) for a lookback/horizon combination.
  • Trading_Decision: A record in the trading_decisions table capturing the act/skip decision, skip reason, position sizing, risk tier, circuit breaker status, and decision trace for a recommendation evaluation.
  • Validation_Data: Live data from prediction_outcomes, model_metric_snapshots, and signal_evidence_links used to cross-check report claims against actual measured performance.
  • Query_API: The existing FastAPI service (services/api/app.py) that serves HTTP endpoints for the dashboard and external consumers.

Requirements

Requirement 1: Report Data Collection

User Story: As a trader, I want the feedback engine to collect all relevant trading data for a reporting period, so that reports reflect the complete picture of trading activity.

Acceptance Criteria

  1. WHEN a report generation is triggered for a date range, THE Feedback_Engine SHALL query trading_decisions, orders, positions, portfolio_snapshots, recommendations, prediction_outcomes, and model_metric_snapshots for that period.
  2. WHEN collecting trading decision data, THE Feedback_Engine SHALL include the decision type, skip reason, ticker, computed position size, risk tier, circuit breaker status, and correlation check result for each Trading_Decision.
  3. WHEN collecting portfolio data, THE Feedback_Engine SHALL retrieve the most recent Portfolio_Snapshot within the reporting period and compute period-over-period changes in portfolio value, active pool, reserve pool, and cumulative return.
  4. WHEN collecting recommendation accuracy data, THE Feedback_Engine SHALL join recommendations with Prediction_Outcomes to compute win rate, directional accuracy, and average excess return vs SPY for the period.
  5. IF no trading_decisions exist for the requested period, THEN THE Feedback_Engine SHALL generate a report with zero-activity sections and a note indicating no trading occurred.
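
The aggregation in criterion 4 and the zero-activity case in criterion 5 could be sketched as below. This is only an illustration: the OutcomeRow shape and the excess_return_vs_spy field name are assumptions, and "win rate" is taken here to mean the share of rows with profitable set, which the requirements do not state explicitly. The direction_correct and profitable names come from Requirement 4.

```python
from dataclasses import dataclass


@dataclass
class OutcomeRow:
    # Hypothetical shape of one row from recommendations joined to prediction_outcomes.
    ticker: str
    direction_correct: bool
    profitable: bool
    excess_return_vs_spy: float  # assumed field name


def accuracy_metrics(rows: list[OutcomeRow]) -> dict:
    """Aggregate win rate, directional accuracy, and mean excess return vs SPY."""
    n = len(rows)
    if n == 0:
        # Zero-activity period: return explicit nulls plus a note instead of dividing by zero.
        return {
            "count": 0,
            "win_rate": None,
            "directional_accuracy": None,
            "avg_excess_return_vs_spy": None,
            "note": "No trading occurred during this period.",
        }
    return {
        "count": n,
        "win_rate": sum(r.profitable for r in rows) / n,
        "directional_accuracy": sum(r.direction_correct for r in rows) / n,
        "avg_excess_return_vs_spy": sum(r.excess_return_vs_spy for r in rows) / n,
    }
```

Returning None rather than 0.0 for an empty period keeps "no data" distinguishable from "zero percent win rate" in downstream sections.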

Requirement 2: Chunked AI Summarization

User Story: As a trader, I want AI-generated summaries in my reports, so that I can quickly understand performance trends without reading raw numbers.

Acceptance Criteria

  1. THE Report_Summarizer_Agent SHALL be registered in the ai_agents table with slug report-summarizer, model qwen3.5:9b-fast, and source system.
  2. WHEN generating a summary for a Report_Section, THE Feedback_Engine SHALL serialize the section data into Chunks of no more than 6,000 characters each to stay within the 8k-token context window.
  3. WHEN a Report_Section contains data that exceeds a single Chunk, THE Feedback_Engine SHALL split the data into multiple Chunks, summarize each Chunk independently, and then produce a final merged summary from the individual Chunk summaries.
  4. WHEN invoking the Report_Summarizer_Agent, THE Feedback_Engine SHALL use the existing AgentConfigResolver and llm_factory infrastructure to resolve model configuration and build the LLM client.
  5. WHEN invoking the Report_Summarizer_Agent, THE Feedback_Engine SHALL log each invocation to the agent_performance_log table with agent_id, success status, duration_ms, and token estimates.
  6. IF the Report_Summarizer_Agent fails after max_retries, THEN THE Feedback_Engine SHALL fall back to a deterministic text summary built from the raw metrics and continue report generation.
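
A minimal sketch of criteria 2, 3, and 6, assuming rows are serialized as one JSON object per line. The names chunk_rows and summarize_section are hypothetical, and the summarize and fallback callables stand in for the real agent call and the deterministic summary builder. Retry handling (max_retries) is assumed to live inside summarize and is not modeled here.

```python
import json
from typing import Callable

MAX_CHUNK_CHARS = 6_000  # character limit from the acceptance criteria


def chunk_rows(rows: list[dict], max_chars: int = MAX_CHUNK_CHARS) -> list[str]:
    """Greedily pack JSON-serialized rows into newline-joined chunks."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0  # length of "\n".join(current)
    for row in rows:
        line = json.dumps(row, sort_keys=True, default=str)
        if len(line) > max_chars:
            # Assumption: a single oversized row is truncated rather than dropped.
            line = line[:max_chars]
        added = len(line) + (1 if current else 0)  # +1 for the newline separator
        if current and size + added > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
            added = len(line)
        current.append(line)
        size += added
    if current:
        chunks.append("\n".join(current))
    return chunks


def summarize_section(
    rows: list[dict],
    summarize: Callable[[str], str],
    fallback: Callable[[list[dict]], str],
) -> str:
    """Summarize each chunk, merge the partial summaries, fall back on failure."""
    chunks = chunk_rows(rows)
    if not chunks:
        return fallback(rows)
    try:
        partials = [summarize(chunk) for chunk in chunks]
        if len(partials) == 1:
            return partials[0]
        # Merge step: one more call over the concatenated partial summaries.
        return summarize("\n".join(partials))
    except Exception:
        # Broad catch for the sketch only; real code would narrow this.
        return fallback(rows)
```

The merge step assumes the concatenated partial summaries themselves fit in one chunk. A section with very many chunks would need the merge applied recursively.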

Requirement 3: Report Structure and Content

User Story: As a trader, I want reports to cover P&L, recommendation accuracy, position performance, risk metrics, and model quality, so that I have a comprehensive view of system performance.

Acceptance Criteria

  1. THE Report SHALL contain a P&L section with realized P&L, unrealized P&L, daily return, cumulative return, win count, loss count, win rate, profit factor, and Sharpe ratio for the reporting period.
  2. THE Report SHALL contain a recommendation accuracy section with total recommendations evaluated, act/skip breakdown, win rate of acted-upon recommendations, and average confidence of acted vs skipped recommendations.
  3. THE Report SHALL contain a position performance section listing each position held during the period with ticker, entry price, current or exit price, unrealized or realized P&L, P&L percentage, and hold duration.
  4. THE Report SHALL contain a risk metrics section with current risk tier, portfolio heat, max drawdown, current drawdown percentage, reserve pool balance, and a count of circuit breaker events during the period.
  5. THE Report SHALL contain a model quality section with the latest Model_Metric_Snapshot values for win rate, directional accuracy, information coefficient, calibration error (ECE), and Brier score across the 7d, 30d, and 90d lookback windows.
  6. THE Report SHALL contain an AI-generated executive summary that synthesizes the key findings from all sections into a concise narrative of no more than 300 words.
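
For the P&L section in criterion 1, the count-based fields can be derived from a list of realized per-trade P&L values. The sketch below assumes the common definition of profit factor (gross profit divided by gross loss) and counts only trades with nonzero P&L toward the win rate. The requirements do not pin down either choice, and the function name pnl_summary is hypothetical.

```python
from typing import Optional


def pnl_summary(trade_pnls: list[float]) -> dict[str, Optional[float]]:
    """Derive realized P&L, win/loss counts, win rate, and profit factor."""
    wins = [p for p in trade_pnls if p > 0]
    losses = [p for p in trade_pnls if p < 0]
    gross_profit = sum(wins)
    gross_loss = -sum(losses)  # expressed as a positive number
    decided = len(wins) + len(losses)  # break-even trades are excluded (assumption)
    return {
        "realized_pnl": sum(trade_pnls),
        "win_count": len(wins),
        "loss_count": len(losses),
        "win_rate": len(wins) / decided if decided else None,
        # Undefined when there are no losing trades, so report None.
        "profit_factor": gross_profit / gross_loss if gross_loss > 0 else None,
    }
```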

Requirement 4: Report Validation Against Live Data

User Story: As a trader, I want report metrics to be cross-checked against live validation data, so that I can trust the accuracy of the reported numbers.

Acceptance Criteria

  1. WHEN generating the recommendation accuracy section, THE Feedback_Engine SHALL cross-reference reported win rates with the direction_correct and profitable fields from Prediction_Outcomes for the same tickers and period.
  2. WHEN generating the model quality section, THE Feedback_Engine SHALL compare the reported metrics against the most recent Model_Metric_Snapshot records and flag discrepancies greater than 5% between computed and snapshot values.
  3. WHEN a validation discrepancy is detected, THE Feedback_Engine SHALL include a validation_warnings array in the report section with the field name, computed value, snapshot value, and percentage difference.
  4. THE Report SHALL include a validation_status field set to passed when no discrepancies exceed 5%, or warnings when one or more discrepancies exceed 5%.
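
The discrepancy check in criteria 2 through 4 might look like the following sketch. It assumes the 5% threshold is a relative difference measured against the snapshot value. The function name validate_section and the warning key names (computed_value, snapshot_value, pct_difference) are hypothetical.

```python
def validate_section(
    computed: dict[str, float],
    snapshot: dict[str, float],
    threshold_pct: float = 5.0,
) -> dict:
    """Compare computed metrics with snapshot values and collect warnings."""
    warnings = []
    for field, snap in snapshot.items():
        if field not in computed:
            continue  # nothing to compare for this metric
        comp = computed[field]
        if snap == 0:
            # Assumption: against a zero baseline any nonzero value is flagged,
            # and the percentage is left undefined.
            exceeded, pct = comp != 0, None
        else:
            pct = abs(comp - snap) / abs(snap) * 100.0
            exceeded = pct > threshold_pct
        if exceeded:
            warnings.append(
                {
                    "field": field,
                    "computed_value": comp,
                    "snapshot_value": snap,
                    "pct_difference": pct,
                }
            )
    return {
        "validation_status": "warnings" if warnings else "passed",
        "validation_warnings": warnings,
    }
```

For example, a computed win rate of 0.60 against a snapshot value of 0.50 is a relative difference of about 20% and produces a warning, while 0.51 against 0.50 (about 2%) passes.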

Requirement 5: Report Storage and Retrieval

User Story: As a trader, I want reports stored in the database and accessible via API, so that I can review historical performance at any time.

Acceptance Criteria

  1. THE Feedback_Engine SHALL store each generated Report as a row in a trading_reports table with columns for id (UUID), report_type (daily/weekly), period_start (DATE), period_end (DATE), report_data (JSONB), validation_status (VARCHAR), generated_at (TIMESTAMPTZ), and created_at (TIMESTAMPTZ).
  2. THE Feedback_Engine SHALL enforce a unique constraint on (report_type, period_start, period_end) to prevent duplicate reports for the same period.
  3. WHEN a report for an existing period is regenerated, THE Feedback_Engine SHALL update the existing row with the new report_data, validation_status, and generated_at timestamp.
  4. THE Query_API SHALL expose a GET /api/reports endpoint that returns a paginated list of reports with id, report_type, period_start, period_end, validation_status, and generated_at.
  5. THE Query_API SHALL expose a GET /api/reports/{report_id} endpoint that returns the full report including report_data JSONB.
  6. THE Query_API SHALL support filtering reports by report_type and date range via query parameters on the GET /api/reports endpoint.
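
Criteria 2 and 3 together describe an upsert keyed on (report_type, period_start, period_end). The sketch below demonstrates the idea with the standard-library sqlite3 module as a stand-in for the real database. It assumes SQLite 3.24 or newer for ON CONFLICT support, stores JSONB and timestamp columns as TEXT, and uses a hypothetical save_report helper. The actual migration and column types may differ.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# Stand-in schema: TEXT columns replace UUID, DATE, JSONB, and TIMESTAMPTZ.
SCHEMA = """
CREATE TABLE trading_reports (
    id TEXT PRIMARY KEY,
    report_type TEXT NOT NULL,
    period_start TEXT NOT NULL,
    period_end TEXT NOT NULL,
    report_data TEXT NOT NULL,
    validation_status TEXT NOT NULL,
    generated_at TEXT NOT NULL,
    created_at TEXT NOT NULL,
    UNIQUE (report_type, period_start, period_end)
)
"""

# On conflict, only the three regenerated fields change; id and created_at are kept.
UPSERT = """
INSERT INTO trading_reports
    (id, report_type, period_start, period_end,
     report_data, validation_status, generated_at, created_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT (report_type, period_start, period_end) DO UPDATE SET
    report_data = excluded.report_data,
    validation_status = excluded.validation_status,
    generated_at = excluded.generated_at
"""


def save_report(
    conn: sqlite3.Connection,
    report_type: str,
    period_start: str,
    period_end: str,
    report_data: dict,
    validation_status: str,
) -> None:
    """Insert a report, or update the existing row for the same period."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        UPSERT,
        (
            str(uuid.uuid4()),
            report_type,
            period_start,
            period_end,
            json.dumps(report_data),
            validation_status,
            now,
            now,
        ),
    )
    conn.commit()
```

Regenerating a report for an existing period therefore keeps a single row with its original id, which means links to GET /api/reports/{report_id} stay valid.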

Requirement 6: Periodic Report Generation

User Story: As a trader, I want reports generated automatically on a daily and weekly schedule, so that I always have up-to-date performance feedback.

Acceptance Criteria

  1. THE Feedback_Engine SHALL generate a daily report after market close (after 16:30 ET) covering the current trading day.
  2. THE Feedback_Engine SHALL generate a weekly report on Saturday covering the Monday-through-Friday trading week.
  3. WHEN a scheduled report generation is triggered, THE Feedback_Engine SHALL enqueue a report generation job on a Redis queue for asynchronous processing.
  4. IF a report generation job fails, THEN THE Feedback_Engine SHALL retry the job up to 3 times with exponential backoff before marking the job as failed.
  5. WHILE a report generation job is in progress for a given period, THE Feedback_Engine SHALL reject duplicate job submissions for the same report_type and period.
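
The scheduling arithmetic behind criteria 2, 4, and 5 can be sketched without any queue dependency. The helper names, the key format, and the 2-second base delay below are all assumptions. The requirements only fix the Monday-through-Friday window, the limit of 3 retries, and the use of exponential backoff.

```python
from datetime import date, timedelta


def weekly_period(run_date: date) -> tuple[date, date]:
    """Return (Monday, Friday) of the trading week for a Saturday run."""
    if run_date.weekday() != 5:  # Monday is 0, Saturday is 5
        raise ValueError("weekly reports are expected to run on Saturday")
    return run_date - timedelta(days=5), run_date - timedelta(days=1)


def dedup_key(report_type: str, start: date, end: date) -> str:
    """Build a key identifying one job per report type and period (hypothetical format)."""
    return f"report-job:{report_type}:{start.isoformat()}:{end.isoformat()}"


def backoff_delays(max_retries: int = 3, base_seconds: float = 2.0) -> list[float]:
    """Delay before each retry, doubling every attempt."""
    return [base_seconds * 2**attempt for attempt in range(max_retries)]
```

A key such as the one from dedup_key could be set atomically in Redis when a job starts and cleared when it finishes, so a second submission for the same period is rejected while the first is in flight. That mechanism is one possible design, not something the requirements specify.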

Requirement 7: Agent Registration and Editability

User Story: As a trader, I want the report summarizer agent registered in the ai_agents table, so that I can edit its prompts, model, and parameters through the existing agent management API.

Acceptance Criteria

  1. THE Feedback_Engine SHALL register the Report_Summarizer_Agent in the ai_agents table via a database migration with slug report-summarizer, source system, model_provider ollama, and model_name qwen3.5:9b-fast.
  2. THE Report_Summarizer_Agent system prompt SHALL instruct the model to produce concise financial performance summaries, avoid fabricating data not present in the input, and keep each summary under 200 words.
  3. THE Report_Summarizer_Agent SHALL support variant creation and activation through the existing agent variants system, allowing A/B testing of different summarization prompts.
  4. WHEN the Report_Summarizer_Agent configuration is updated via the agent management API, THE Feedback_Engine SHALL pick up the new configuration within 60 seconds via the AgentConfigResolver TTL cache.
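
Criterion 4 relies on a time-to-live cache: a configuration is reused until it is 60 seconds old, then reloaded. The class below is a generic illustration of that behavior and is not the actual AgentConfigResolver, whose interface is not described in this document. The loader callable and the injectable clock are assumptions made so the expiry logic is easy to test.

```python
import time
from typing import Callable, Generic, TypeVar

V = TypeVar("V")


class TTLCache(Generic[V]):
    """Cache loader results per key and reload them once they reach the TTL."""

    def __init__(
        self,
        loader: Callable[[str], V],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, V]] = {}

    def get(self, key: str) -> V:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh
        value = self._loader(key)  # missing or expired: reload
        self._entries[key] = (now, value)
        return value
```

With this scheme an update made through the agent management API becomes visible no later than one TTL after the previous load, which matches the 60-second bound in the criterion.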

Requirement 8: Report Serialization Round-Trip

User Story: As a developer, I want report data to survive serialization and deserialization without data loss, so that stored reports are always faithful to the generated content.

Acceptance Criteria

  1. THE Feedback_Engine SHALL serialize Report objects to JSON for storage in the report_data JSONB column.
  2. THE Feedback_Engine SHALL deserialize stored JSON back into Report objects for API responses.
  3. FOR ALL valid Report objects, serializing to JSON then deserializing back SHALL produce an equivalent Report object (round-trip property).
  4. THE Feedback_Engine SHALL use ISO 8601 format for all datetime fields in serialized reports.
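
The round-trip property in criterion 3 can be illustrated with a small dataclass. The Report fields below are a simplified, hypothetical subset of the real model, and the sketch assumes section data contains only JSON-native values (no tuples, NaN, or Decimal), since those would not survive a JSON round trip unchanged.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Report:
    # Hypothetical, reduced model for illustration.
    report_type: str
    period_start: date
    period_end: date
    generated_at: datetime  # expected to be timezone-aware
    validation_status: str
    sections: dict


def report_to_json(report: Report) -> str:
    """Serialize a Report, writing dates and datetimes in ISO 8601."""
    payload = asdict(report)
    payload["period_start"] = report.period_start.isoformat()
    payload["period_end"] = report.period_end.isoformat()
    payload["generated_at"] = report.generated_at.isoformat()
    return json.dumps(payload, sort_keys=True)


def report_from_json(text: str) -> Report:
    """Rebuild a Report from the JSON produced by report_to_json."""
    raw = json.loads(text)
    return Report(
        report_type=raw["report_type"],
        period_start=date.fromisoformat(raw["period_start"]),
        period_end=date.fromisoformat(raw["period_end"]),
        generated_at=datetime.fromisoformat(raw["generated_at"]),
        validation_status=raw["validation_status"],
        sections=raw["sections"],
    )
```

A property-based test would generate arbitrary valid Report values and assert that report_from_json(report_to_json(r)) == r for each of them.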