stonks-oracle/.kiro/specs/intelligence-pipeline-deep-dive/tasks.md
Celes Renata 88ad1e8d99 feat: comprehensive docs, unit tests, docker-compose app services
- Add scheduler and ingestion unit tests (test_scheduler_unit.py, test_ingestion_unit.py)
- Add all 13 app services + dashboard to docker-compose.yml
- Add full documentation suite: API reference, Helm reference, Docker deployment guide,
  3 architecture diagrams (K8s, Docker Compose, data pipeline), AI agent guide,
  backup/restore guide, observability/metrics reference, per-service docs
- Add intelligence pipeline deep-dive docs with Mermaid diagrams
- Update README with documentation index and links
- Add specs for comprehensive-quality-docs, intelligence-pipeline-deep-dive,
  sanitized-pipeline-docs
2026-04-22 02:56:41 +00:00

# Tasks — Intelligence Pipeline Deep Dive
## Task 1: Create directory structure and index file
- [x] 1.1 Create `docs/intelligence-pipeline-deep-dive/` directory and `docs/intelligence-pipeline-deep-dive/diagrams/` subdirectory
- [x] 1.2 Create `docs/intelligence-pipeline-deep-dive/index.md` with table of contents linking to all 6 pages and all diagram files, plus references to existing docs (`docs/services.md`, `docs/ai-agents.md`, `docs/architecture-data-pipeline.md`, `docs/llm-to-trade-pipeline.md`)
## Task 2: Create Mermaid diagram files
- [x] 2.1 Create `docs/intelligence-pipeline-deep-dive/diagrams/ingestion-to-extraction-flow.md` — flowchart from Scheduler through Ingestion, Parser, to Extractor with all queues (`stonks:queue:ingestion`, `stonks:queue:parsing`, `stonks:queue:extraction`, `stonks:queue:macro_classification`), storage (MinIO buckets, PostgreSQL tables), and service module paths
- [x] 2.2 Create `docs/intelligence-pipeline-deep-dive/diagrams/three-layer-signal-merging.md` — flowchart showing Company signals (`document_impact_records`), Macro signals (`macro_impact_records`), and Competitive signals (`competitive_signal_records`) each producing `WeightedSignal` objects that merge into the Aggregation engine (`services/aggregation/worker.py`)
- [x] 2.3 Create `docs/intelligence-pipeline-deep-dive/diagrams/weighted-signal-computation.md` — diagram showing the composite weight formula components: confidence gate, recency decay, source credibility, novelty bonus, and market context multiplier
- [x] 2.4 Create `docs/intelligence-pipeline-deep-dive/diagrams/trend-accumulation-escalation.md` — diagram showing how consecutive signals accumulate across time windows to escalate from neutral → watch → hold → buy/sell decisions
- [x] 2.5 Create `docs/intelligence-pipeline-deep-dive/diagrams/recommendation-generation-flow.md` — flowchart from TrendSummary through data quality suppression, eligibility evaluation, thesis generation, risk classification, to recommendation persistence
- [x] 2.6 Create `docs/intelligence-pipeline-deep-dive/diagrams/trading-engine-decision-loop.md` — flowchart showing the pre-trade check sequence (circuit breaker → trading window → confidence gate → dedup → declining positions → max positions), position sizing, and order submission to `stonks:queue:broker_orders`
## Task 3: Write Page 1 — Data Ingestion and Preparation
- [x] 3.1 Write `docs/intelligence-pipeline-deep-dive/01-data-ingestion-and-preparation.md` covering: four input data categories (Polygon news, SEC EDGAR filings, Polygon market data, macro news APIs), Scheduler cadence polling (market_api: 300s, news_api: 300s, filings_api: 3600s, macro_news: 600s) with rate limiting and backoff, Ingestion worker adapter dispatch (`PolygonMarketAdapter`, `PolygonNewsAdapter`, `SECEdgarAdapter`, `MacroNewsAdapter`), content deduplication via Redis (`stonks:dedupe:*` with 24h TTL), raw artifact storage in MinIO (`stonks-raw-market`, `stonks-raw-news`, `stonks-raw-filings`), Parser role (HTML normalization, quality scoring, company mention detection, routing `macro_event` docs to `stonks:queue:macro_classification`). Written in narrative prose with links to diagrams and transition to Page 2.
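
The content deduplication step in 3.1 can be sketched as follows. This is a minimal illustration, assuming the key suffix is a SHA-256 digest of the raw content and using an in-memory dictionary as a stand-in for Redis. `dedupe_key` and `InMemoryDedupe` are hypothetical names; only the `stonks:dedupe:` prefix and the 24h TTL come from the task description.

```python
import hashlib
import time

DEDUPE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, as stated in task 3.1


def dedupe_key(content: bytes) -> str:
    """Build a key in the stonks:dedupe:* namespace (SHA-256 suffix is an assumption)."""
    return "stonks:dedupe:" + hashlib.sha256(content).hexdigest()


class InMemoryDedupe:
    """Hypothetical stand-in for a Redis-backed 'set if absent, with expiry' check."""

    def __init__(self, clock=time.monotonic):
        self._expiry = {}  # key -> time at which the key expires
        self._clock = clock

    def first_seen(self, content: bytes) -> bool:
        """Return True if content was not seen within the TTL, and record it."""
        now = self._clock()
        key = dedupe_key(content)
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False  # duplicate inside the TTL window
        self._expiry[key] = now + DEDUPE_TTL_SECONDS
        return True
```

In a Redis-backed version the same check would likely map to a single `SET` with the `NX` and `EX` options, so that the existence test and the TTL write happen atomically.
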
## Task 4: Write Page 2 — AI Agent Processing and Structured Extraction
- [x] 4.1 Write `docs/intelligence-pipeline-deep-dive/02-ai-agent-processing-and-extraction.md` covering: Document Intelligence Extractor agent (`document-extractor` slug, `services/extractor/main.py`, `services/extractor/client.py`, system prompt, `build_extraction_prompt()` in `services/extractor/prompts.py`), `ExtractionResult` JSON schema with all fields, Global Event Classifier agent (`event-classifier` slug, `services/extractor/event_classifier.py`, `GlobalEvent` schema, anti-hallucination rules), JSON repair pipeline (direct parse → fence stripping → `json-repair` fallback), structural + semantic validation in `services/extractor/schemas.py`, `AgentConfigResolver` mechanism (`services/shared/agent_config.py`, `ai_agents`/`agent_variants` tables, 60s TTL cache), persistence to `document_intelligence` and `document_impact_records`, aggregation job enqueue. Written in narrative prose with links to diagrams and transition to Page 3.
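
The three-stage JSON repair pipeline named in 4.1 (direct parse, then fence stripping, then a repair fallback) could look roughly like the sketch below. It is not the code in `services/extractor/`; `strip_fences` and `parse_llm_json` are hypothetical names, and the repair stage is passed in as a callable so the example runs with the standard library alone.

```python
import json
from typing import Any, Callable, Optional

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example fence-safe


def strip_fences(text: str) -> str:
    """Remove a surrounding Markdown code fence and an optional language tag."""
    stripped = text.strip()
    if (len(stripped) >= 2 * len(FENCE)
            and stripped.startswith(FENCE) and stripped.endswith(FENCE)):
        body = stripped[len(FENCE):-len(FENCE)]
        first_line, sep, rest = body.partition("\n")
        # Drop the opening fence line when it is empty or a tag such as "json".
        if sep and (first_line.strip() == "" or first_line.strip().isalnum()):
            body = rest
        return body.strip()
    return text


def parse_llm_json(text: str,
                   repair: Optional[Callable[[str], str]] = None) -> Any:
    """Try a direct parse, then a fence-stripped parse, then the repair fallback."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    unfenced = strip_fences(text)
    try:
        return json.loads(unfenced)
    except json.JSONDecodeError:
        if repair is None:
            raise
        return json.loads(repair(unfenced))
```

The `repair` argument is where a library such as `json-repair` would presumably be plugged in; keeping it optional means a malformed payload still raises rather than being silently accepted when no fallback is configured.
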
## Task 5: Write Page 3 — Signal Scoring and the WeightedSignal Abstraction
- [x] 5.1 Write `docs/intelligence-pipeline-deep-dive/03-signal-scoring-and-weighted-signals.md` covering: `WeightedSignal` dataclass (`services/aggregation/scoring.py`), composite weight formula (`combined = gate × recency × credibility × (1 + novelty_bonus) × market_context_multiplier`), each component in detail (confidence gate threshold 0.2, recency decay half-lives per window, source credibility clamped [0.1, 1.0], novelty bonus up to 25%, market context volatility boost up to 30% and volume surge boost 15%), sentiment mapping via `sentiment_to_numeric()`, weighted sentiment average computation, three signal layers (Company, Macro weight 0.3, Competitive weight 0.2), runtime toggle via `risk_configs` table. Written in narrative prose with links to diagrams and transition to Page 4.
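
As a worked illustration of the composite weight formula in 5.1, the sketch below multiplies the five components together. Several details are assumptions rather than facts from `services/aggregation/scoring.py`: the gate is treated as a binary cutoff at 0.2, recency is modelled as exponential half-life decay, and the market context multiplier is taken as one plus the two boosts. `SignalInputs`, `combined_weight`, and `weighted_sentiment` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class SignalInputs:
    confidence: float        # extraction confidence, 0..1
    age_hours: float         # time since the source document was published
    half_life_hours: float   # recency half-life for the window being scored
    credibility: float       # source credibility before clamping
    novelty: float           # 0..1, scaled to a bonus of at most 25%
    volatility_boost: float  # 0..0.30
    volume_surge: bool       # adds a flat 15% boost when True


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def combined_weight(s: SignalInputs) -> float:
    """combined = gate * recency * credibility * (1 + novelty_bonus) * market_context."""
    gate = 1.0 if s.confidence >= 0.2 else 0.0          # assumed binary gate
    recency = 0.5 ** (s.age_hours / s.half_life_hours)  # assumed half-life decay
    credibility = clamp(s.credibility, 0.1, 1.0)
    novelty_bonus = 0.25 * clamp(s.novelty, 0.0, 1.0)
    market_context = (1.0 + clamp(s.volatility_boost, 0.0, 0.30)
                      + (0.15 if s.volume_surge else 0.0))  # assumed additive boosts
    return gate * recency * credibility * (1.0 + novelty_bonus) * market_context


def weighted_sentiment(pairs: Iterable[Tuple[float, float]]) -> float:
    """Weighted average of (numeric_sentiment, weight) pairs; 0.0 with no weight."""
    pairs = list(pairs)
    total = sum(weight for _, weight in pairs)
    if total <= 0:
        return 0.0
    return sum(sentiment * weight for sentiment, weight in pairs) / total
```

Under these assumptions, a fully credible signal exactly one half-life old with no bonuses gets a weight of 0.5, and any signal below the 0.2 confidence threshold contributes nothing to the weighted average.
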
## Task 6: Write Page 4 — Trend Aggregation and Accumulating Signals
- [x] 6.1 Write `docs/intelligence-pipeline-deep-dive/04-trend-aggregation-and-accumulating-signals.md` covering: Aggregation engine computing TrendSummary across 5 windows (intraday, 1d, 7d, 30d, 90d), trend direction rules (bullish ≥ 0.15, bearish ≤ -0.15, mixed, neutral), contradiction detection (`services/aggregation/contradiction.py`, minority_weight/total_weight), evidence ranking (`rank_evidence()` composite scoring), confidence computation (unique source count caps at 15, log₂ scaling saturates at 7 sources), how consecutive same-direction signals accumulate to escalate decisions (neutral → watch → hold → buy/sell), trend projections (`services/aggregation/projection.py`, macro decay, momentum, divergence detection), persistence to `trend_windows`, `trend_history`, `trend_evidence`, `trend_projections`. Written in narrative prose with links to diagrams and transition to Page 5.
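
The direction thresholds, contradiction ratio, and source-count confidence listed in 6.1 can be illustrated with a small sketch. The ±0.15 thresholds, the `minority_weight/total_weight` ratio, the cap of 15 sources, and saturation at 7 sources come from the task text. The `mixed_threshold` value, the precedence of "mixed" over the other labels, and the exact `log2(1 + n) / 3` form are assumptions chosen to be consistent with those numbers. All function names are hypothetical.

```python
import math


def contradiction_ratio(bullish_weight: float, bearish_weight: float) -> float:
    """minority_weight / total_weight, or 0.0 when there is no weight at all."""
    total = bullish_weight + bearish_weight
    if total <= 0:
        return 0.0
    return min(bullish_weight, bearish_weight) / total


def trend_direction(score: float, contradiction: float,
                    mixed_threshold: float = 0.4) -> str:
    """Map a weighted sentiment score to a direction label.

    mixed_threshold is a hypothetical parameter, not a value from the source.
    """
    if contradiction >= mixed_threshold:
        return "mixed"
    if score >= 0.15:
        return "bullish"
    if score <= -0.15:
        return "bearish"
    return "neutral"


def source_confidence(unique_sources: int) -> float:
    """Log2 scaling that reaches 1.0 at 7 sources; the count is capped at 15."""
    n = max(0, min(unique_sources, 15))
    return min(1.0, math.log2(1 + n) / 3.0)
```

With this form a single source yields a confidence of one third and seven or more sources yield 1.0, which matches the stated saturation point.
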
## Task 7: Write Page 5 — Recommendation Generation and Signal-to-Action Translation
- [x] 7.1 Write `docs/intelligence-pipeline-deep-dive/05-recommendation-generation.md` covering: data quality suppression (`services/recommendation/suppression.py`, 6 checks: extraction confidence < 0.40, staleness > 168h, source diversity < 1, failure rate > 50%, valid docs < 2, quality score < 0.30, plus macro-only and pattern-only safety), eligibility evaluation (`services/recommendation/eligibility.py`, gate checks, action mapping BUY/SELL/HOLD/WATCH, mode escalation informational/paper_eligible/live_eligible), position sizing (base 1% + confidence × strength up to 10%, contradiction and evidence penalties), thesis generation (deterministic + optional LLM rewrite via `thesis-rewriter` agent), risk classification (low/moderate/high/very_high), persistence to `recommendations`, `recommendation_evidence`, `risk_evaluations`. Written in narrative prose with links to diagrams and transition to Page 6.
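
The six data quality checks in 7.1 reduce to a list of threshold comparisons, sketched below. The thresholds are taken from the task text; the `QualityInputs` field names and the reason strings are hypothetical, and the macro-only and pattern-only safety rules are left out because the task does not state their conditions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QualityInputs:
    extraction_confidence: float  # 0..1
    staleness_hours: float        # age of the newest supporting evidence
    source_diversity: int         # number of distinct sources
    failure_rate: float           # fraction of failed extractions, 0..1
    valid_docs: int               # documents that passed validation
    quality_score: float          # 0..1


def suppression_reasons(q: QualityInputs) -> List[str]:
    """Return the name of every failed check; an empty list means not suppressed."""
    checks = [
        (q.extraction_confidence < 0.40, "low_extraction_confidence"),
        (q.staleness_hours > 168, "stale_data"),
        (q.source_diversity < 1, "insufficient_source_diversity"),
        (q.failure_rate > 0.50, "high_failure_rate"),
        (q.valid_docs < 2, "too_few_valid_docs"),
        (q.quality_score < 0.30, "low_quality_score"),
    ]
    return [reason for failed, reason in checks if failed]
```

Returning all failed checks rather than stopping at the first one is a design choice for the sketch: it makes the suppression decision easier to audit when several conditions fail together.
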
## Task 8: Write Page 6 — Trading Engine Decisions and Execution
- [x] 8.1 Write `docs/intelligence-pipeline-deep-dive/06-trading-decisions-and-execution.md` covering: Trading engine decision loop (`services/trading/engine.py`, 5 concurrent tasks: decision loop 60s, stop-loss monitor, performance loop, risk tier scheduler, rebalance scheduler), pre-trade check sequence (circuit breaker → trading window → confidence gate → dedup → declining positions → max positions), position sizing (`services/trading/position_sizer.py`, confidence scaling, risk tier adjustment, correlation diversification, sector exposure, earnings proximity, absolute cap), circuit breaker (`services/trading/circuit_breaker.py`, daily_loss, single_position, volatility triggers, cooldown, Redis state), reserve pool (`services/trading/reserve_pool.py`, profit siphoning 20%, high-water mark 30%, emergency liquidation), risk tier auto-adjustment (`services/trading/risk_tier_controller.py`, Sharpe/drawdown/win-rate evaluation, conservative/moderate/aggressive tiers), order submission flow (TradingDecision → `stonks:queue:broker_orders` → broker adapter → Alpaca). Written in narrative prose with links to diagrams.
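
The pre-trade check sequence in 8.1 is an ordered, short-circuiting chain, which the sketch below models as a list of named predicates. Only the order of the checks comes from the task text. `TradeContext`, its fields, and the meaning given to each check (for example, treating "declining positions" as a flag on the target position) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class TradeContext:
    circuit_breaker_open: bool
    in_trading_window: bool
    confidence: float
    min_confidence: float
    already_ordered: bool     # a matching order was submitted recently (dedup)
    position_declining: bool  # the target position is currently declining
    open_positions: int
    max_positions: int


# Order follows the sequence in 8.1; each predicate returns True when the check passes.
PRE_TRADE_CHECKS: List[Tuple[str, Callable[[TradeContext], bool]]] = [
    ("circuit_breaker", lambda c: not c.circuit_breaker_open),
    ("trading_window", lambda c: c.in_trading_window),
    ("confidence_gate", lambda c: c.confidence >= c.min_confidence),
    ("dedup", lambda c: not c.already_ordered),
    ("declining_positions", lambda c: not c.position_declining),
    ("max_positions", lambda c: c.open_positions < c.max_positions),
]


def first_failed_check(ctx: TradeContext) -> Optional[str]:
    """Return the name of the first failing check, or None if the trade may proceed."""
    for name, passes in PRE_TRADE_CHECKS:
        if not passes(ctx):
            return name
    return None
```

Running the circuit breaker first means a halted engine rejects every candidate before any of the later checks are evaluated.
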
## Task 9: Update index and verify cross-references
- [x] 9.1 Update `docs/intelligence-pipeline-deep-dive/index.md` to ensure all page links and diagram links are correct and all files exist
- [x] 9.2 Verify all inter-page links within narrative pages resolve correctly and all diagram references point to existing files