Implementation Plan: Global News Interpolation Layer

Overview

This plan implements a macro-level global news interpolation layer that ingests global/geopolitical news events, classifies them via Ollama, maps them to companies via exposure profiles, and feeds macro impact scores into the existing aggregation engine. The implementation extends existing services (extractor, aggregation, symbol registry, recommendation, API, lake publisher, dashboard) rather than creating new deployments. Tasks are ordered so each step builds on the previous, with property-based tests validating core scoring logic early.

Tasks

  • 1. Database migration and shared schemas

    • 1.1 Create PostgreSQL migration infra/migrations/016_global_news_interpolation.sql

      • Add global_events table with event_types, severity, affected_regions, affected_sectors, affected_commodities, summary, key_facts, estimated_duration, confidence, source_document_id FK, model metadata, created_at
      • Add macro_impact_records table with event_id FK, company_id FK, ticker, macro_impact_score, impact_direction, contributing_factors, confidence, computed_at
      • Add exposure_profiles table with company_id FK, geographic_revenue_mix, supply_chain_regions, key_input_commodities, regulatory_jurisdictions, market_position_tier, export_dependency_pct, source, confidence, version, active, created_at, updated_at
      • Add trend_projections table with trend_window_id FK, projected_direction, projected_strength, projected_confidence, projection_horizon, driving_factors, macro_contribution_pct, diverges_from_current, computed_at
      • Add indexes on macro_impact_records(event_id), macro_impact_records(company_id, computed_at), macro_impact_records(ticker, computed_at), exposure_profiles(company_id, active), global_events(created_at), trend_projections(trend_window_id)
      • Requirements: 7.1, 7.2, 3.1, 12.5
    • 1.2 Add new Pydantic schemas and enums to services/shared/schemas.py

      • Add ImpactType, SeverityLevel, MarketPositionTier, EstimatedDuration enums
      • Add MACRO_EVENT = "macro_event" to DocumentType enum
      • Add GlobalEventSchema, MacroImpactRecordSchema, ExposureProfileSchema, TrendProjectionSchema Pydantic models
      • Requirements: 2.2, 4.5, 3.1, 12.1
    • 1.3 Add macro-related Redis queue name to services/shared/redis_keys.py

      • Add QUEUE_MACRO_CLASSIFICATION = "macro_classification" for event classification jobs
      • Requirements: 1.1
    • 1.4 Add macro configuration fields to services/shared/config.py

      • Add macro_signal_weight, macro_enabled, macro_confidence_threshold, macro_short_term_staleness_hours, projection_confidence_threshold fields to a new MacroConfig dataclass
      • Add macro: MacroConfig to AppConfig with env var loading in load_config()
      • Requirements: 5.6, 10.1, 10.2, 12.9
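
As a rough illustration of task 1.4, the configuration dataclass and its environment loading might look like the sketch below. The environment variable names, the helper `load_macro_config`, and the defaults for `macro_signal_weight` and `macro_enabled` are assumptions made for this example; only the 0.4, 48-hour, and 0.3 defaults come from later tasks in this plan.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MacroConfig:
    macro_signal_weight: float = 0.5  # hypothetical default, not given in the plan
    macro_enabled: bool = True  # hypothetical default, not given in the plan
    macro_confidence_threshold: float = 0.4  # default stated in task 11.5
    macro_short_term_staleness_hours: int = 48  # 48 hours per task 11.5
    projection_confidence_threshold: float = 0.3  # default stated in task 9.1


def _parse_bool(raw: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_macro_config(env: Optional[Mapping[str, str]] = None) -> MacroConfig:
    """Build a MacroConfig from env vars (names are hypothetical), else defaults."""
    env = os.environ if env is None else env
    d = MacroConfig()
    enabled = _parse_bool(env["MACRO_ENABLED"]) if "MACRO_ENABLED" in env else d.macro_enabled
    return MacroConfig(
        macro_signal_weight=float(env.get("MACRO_SIGNAL_WEIGHT", d.macro_signal_weight)),
        macro_enabled=enabled,
        macro_confidence_threshold=float(
            env.get("MACRO_CONFIDENCE_THRESHOLD", d.macro_confidence_threshold)
        ),
        macro_short_term_staleness_hours=int(
            env.get("MACRO_SHORT_TERM_STALENESS_HOURS", d.macro_short_term_staleness_hours)
        ),
        projection_confidence_threshold=float(
            env.get("PROJECTION_CONFIDENCE_THRESHOLD", d.projection_confidence_threshold)
        ),
    )
```

Passing the environment as a mapping keeps the loader testable without mutating `os.environ`; the real `load_config()` may follow a different convention.
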
  • 2. Checkpoint — Ensure migration and schemas are consistent

    • Ensure all tests pass; ask the user if questions arise.
  • 3. Event classifier module

    • 3.1 Implement services/extractor/event_classifier.py

      • Implement GlobalEvent dataclass matching the design specification
      • Implement get_event_json_schema() returning the Ollama structured output schema for event classification
      • Implement build_event_classification_prompt(text: str) -> str with anti-hallucination instructions for macro event extraction
      • Implement classify_global_event(normalized_text, document_id, ollama_client) -> GlobalEvent using the existing OllamaClient with retry logic
      • Persist classification prompt, schema, model metadata, and raw output to MinIO under stonks-llm-prompts/ and stonks-llm-results/
      • Persist the GlobalEvent record to the global_events PostgreSQL table
      • Requirements: 2.1, 2.2, 2.3, 2.4, 2.5
    • 3.2 Write property test for GlobalEvent schema completeness

      • Property 2: Macro pipeline output schema completeness
      • Validates: Requirements 2.2, 4.5
    • 3.3 Write property test for multiple impact types preserved

      • Property 3: Multiple impact types preserved
      • Validates: Requirements 2.4
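
To make task 3.1 more concrete, here is one possible shape for `get_event_json_schema()`, built from the `global_events` columns listed in task 1.1. The enum values for severity, duration, and impact types are placeholders (only `short_term` appears in this plan), and `missing_required_fields` is a hypothetical helper added to show how schema completeness (Property 2) could be checked.

```python
from typing import Any, Dict, List

# Placeholder enum values: the design document defines the real ones.
SEVERITY_LEVELS = ["low", "moderate", "high", "critical"]
ESTIMATED_DURATIONS = ["short_term", "medium_term", "long_term"]

REQUIRED_FIELDS = [
    "event_types", "severity", "affected_regions", "affected_sectors",
    "affected_commodities", "summary", "key_facts", "estimated_duration",
    "confidence",
]


def get_event_json_schema() -> Dict[str, Any]:
    """Return a JSON Schema describing one classified global event."""
    string_list = {"type": "array", "items": {"type": "string"}}
    return {
        "type": "object",
        "properties": {
            # minItems=1 so at least one impact type is always reported;
            # several may be present at once (Requirement 2.4).
            "event_types": {**string_list, "minItems": 1},
            "severity": {"type": "string", "enum": SEVERITY_LEVELS},
            "affected_regions": string_list,
            "affected_sectors": string_list,
            "affected_commodities": string_list,
            "summary": {"type": "string"},
            "key_facts": string_list,
            "estimated_duration": {"type": "string", "enum": ESTIMATED_DURATIONS},
            "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        },
        "required": list(REQUIRED_FIELDS),
        "additionalProperties": False,
    }


def missing_required_fields(payload: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list required fields absent from a model response."""
    return [name for name in get_event_json_schema()["required"] if name not in payload]
```

A schema like this would be passed to the model as the structured-output format, and the completeness helper could back the property test in task 3.2.
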
  • 4. Exposure profile management

    • 4.1 Implement services/symbol_registry/exposure.py

      • Implement ExposureProfile Pydantic model for API request/response
      • Implement GET /companies/{company_id}/exposure endpoint returning the current active profile
      • Implement PUT /companies/{company_id}/exposure endpoint that archives the previous version (sets active=FALSE) and inserts a new version with incremented version number
      • Implement GET /companies/{company_id}/exposure/history endpoint returning all profile versions ordered by version descending
      • Register routes on the Symbol Registry FastAPI app
      • Requirements: 3.1, 3.3, 3.4
    • 4.2 Write property test for exposure profile version history

      • Property 6: Exposure profile version history
      • Validates: Requirements 3.3
    • 4.3 Write property test for default exposure profile derivation

      • Property 5: Default exposure profile derivation
      • Validates: Requirements 3.2
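
The archive-then-insert behavior described for the PUT endpoint in task 4.1 can be sketched with an in-memory store. `ProfileVersion` and `InMemoryExposureStore` are hypothetical stand-ins for the `exposure_profiles` table and are meant only to show the versioning invariant that Property 6 would test.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ProfileVersion:
    company_id: int
    version: int
    active: bool
    data: dict


class InMemoryExposureStore:
    """Hypothetical stand-in for the exposure_profiles table."""

    def __init__(self) -> None:
        self._rows: Dict[int, List[ProfileVersion]] = {}

    def put(self, company_id: int, data: dict) -> ProfileVersion:
        rows = self._rows.setdefault(company_id, [])
        next_version = max((r.version for r in rows), default=0) + 1
        # Archive every previous version (active=FALSE) before inserting.
        rows[:] = [replace(r, active=False) for r in rows]
        new_row = ProfileVersion(company_id, next_version, True, dict(data))
        rows.append(new_row)
        return new_row

    def get_active(self, company_id: int) -> Optional[ProfileVersion]:
        return next((r for r in self._rows.get(company_id, []) if r.active), None)

    def history(self, company_id: int) -> List[ProfileVersion]:
        # All versions, newest first, as the history endpoint returns them.
        return sorted(self._rows.get(company_id, []), key=lambda r: r.version, reverse=True)
```

In PostgreSQL the archive and insert steps would presumably run inside one transaction so that exactly one active row exists per company at any time.
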
  • 5. Interpolation engine — core scoring logic

    • 5.1 Implement services/aggregation/interpolation.py

      • Implement MacroImpactRecord dataclass matching the design specification
      • Implement compute_geographic_overlap(event_regions, revenue_mix) -> float using revenue percentage weighting
      • Implement compute_supply_chain_overlap(event_regions, supply_regions) -> float using set intersection ratio
      • Implement compute_commodity_overlap(event_commodities, company_commodities) -> float using set intersection ratio
      • Implement apply_resilience_modifier(raw_score, tier, event_is_international) -> float with tier multipliers: global_leader=0.7, multinational=0.85, regional=1.0, domestic=1.2
      • Implement compute_macro_impact(event: GlobalEvent, profile: ExposureProfile) -> MacroImpactRecord using the scoring formula severity_weight * (0.35*geo + 0.25*supply + 0.25*commodity + 0.15*sector), then applying the resilience modifier to the result
      • Implement build_default_profile(sector, industry, market_cap_bucket) -> ExposureProfile for companies without manual profiles
      • Handle zero-overlap case: return score 0.0 and skip further processing
      • Handle mixed direction: when both positive and negative factors exist, set direction to 'mixed' and preserve both factor lists
      • Persist MacroImpactRecord objects to the macro_impact_records PostgreSQL table
      • Requirements: 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 3.2
    • 5.2 Write property test for macro impact score bounds and zero-overlap invariant

      • Property 7: Macro impact score bounds and zero-overlap invariant
      • Validates: Requirements 4.1, 4.4
    • 5.3 Write property test for scoring monotonicity

      • Property 8: Scoring monotonicity
      • Validates: Requirements 4.2
    • 5.4 Write property test for resilience modifier tier ordering

      • Property 9: Resilience modifier tier ordering
      • Validates: Requirements 4.3
    • 5.5 Write property test for mixed direction dual-effect events

      • Property 10: Mixed direction for dual-effect events
      • Validates: Requirements 4.6
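
The scoring steps in task 5.1 can be sketched as follows. The weights (0.35, 0.25, 0.25, 0.15) and tier multipliers come from the plan; the severity weights, the choice of denominator for the intersection ratio, the clamp to [0, 1], and applying the tier multiplier only to international events are all assumptions made for this example. `intersection_ratio` and `score_macro_impact` are illustrative names, not the planned signatures.

```python
from typing import Dict, Iterable

TIER_MULTIPLIERS = {"global_leader": 0.7, "multinational": 0.85, "regional": 1.0, "domestic": 1.2}
# Hypothetical severity weights; the plan does not list them.
SEVERITY_WEIGHTS = {"low": 0.25, "moderate": 0.5, "high": 0.75, "critical": 1.0}


def compute_geographic_overlap(event_regions: Iterable[str], revenue_mix: Dict[str, float]) -> float:
    """Share of revenue (fractions summing to at most 1) earned in affected regions."""
    affected = set(event_regions)
    return min(1.0, sum(share for region, share in revenue_mix.items() if region in affected))


def intersection_ratio(event_items: Iterable[str], company_items: Iterable[str]) -> float:
    """Assumed ratio: fraction of the company's items that the event touches."""
    company = set(company_items)
    if not company:
        return 0.0
    return len(set(event_items) & company) / len(company)


def apply_resilience_modifier(raw_score: float, tier: str, event_is_international: bool) -> float:
    # Assumption: the tier multiplier only applies to international events.
    multiplier = TIER_MULTIPLIERS[tier] if event_is_international else 1.0
    return max(0.0, min(1.0, raw_score * multiplier))  # clamp is an assumption


def score_macro_impact(severity: str, geo: float, supply: float, commodity: float,
                       sector: float, tier: str, event_is_international: bool) -> float:
    overlap = 0.35 * geo + 0.25 * supply + 0.25 * commodity + 0.15 * sector
    if overlap == 0.0:
        return 0.0  # zero-overlap case: no impact, skip further processing
    return apply_resilience_modifier(SEVERITY_WEIGHTS[severity] * overlap, tier, event_is_international)
```

Under these assumptions a `domestic` company always scores at least as high as a `global_leader` for the same event, which is the ordering Property 9 is meant to check.
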
  • 6. Checkpoint — Ensure core scoring logic and property tests pass

    • Ensure all tests pass; ask the user if questions arise.
  • 7. Aggregation engine integration

    • 7.1 Extend services/aggregation/worker.py to incorporate macro signals

      • Add macro_signal_weight and macro_enabled fields to AggregationConfig
      • In aggregate_company_window, check macro toggle state from risk_configs table
      • Fetch macro_impact_records for the ticker within the aggregation window
      • Convert each MacroImpactRecord to a WeightedSignal using: document_id=event.source_document_id, sentiment_value mapped from impact_direction, impact_score=macro_impact_score * macro_signal_weight, recency decay from event publication time, confidence gating from macro record confidence
      • Merge macro signals with company-specific signals before computing trend direction, strength, confidence, and contradiction score
      • Include contributing GlobalEvent source_document_ids in evidence references
      • When macro layer is disabled or no macro data exists, produce identical output to company-only aggregation
      • Requirements: 5.1, 5.2, 5.3, 5.4, 5.5, 5.6
    • 7.2 Write property test for macro signals influencing trend output

      • Property 11: Macro signals influence trend output
      • Validates: Requirements 5.1
    • 7.3 Write property test for macro-company contradiction detection

      • Property 12: Macro-company contradiction detection
      • Validates: Requirements 5.3
    • 7.4 Write property test for macro evidence traceability

      • Property 13: Macro evidence traceability
      • Validates: Requirements 5.4
    • 7.5 Write property test for no degradation without macro data and disabled-layer equivalence

      • Property 14: No degradation without macro data and disabled-layer equivalence
      • Validates: Requirements 5.5, 11.2
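
The conversion and merge steps in task 7.1 might be sketched like this. The `WeightedSignal` fields shown here, the direction-to-sentiment mapping, and the half-life form of recency decay are assumptions; the existing aggregation worker defines the real versions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Sequence

# Hypothetical mapping from impact_direction to a signed sentiment value.
DIRECTION_TO_SENTIMENT = {"positive": 1.0, "negative": -1.0, "mixed": 0.0}


@dataclass(frozen=True)
class WeightedSignal:  # hypothetical stand-in for the existing class
    document_id: str
    sentiment_value: float
    impact_score: float
    recency_weight: float
    confidence: float


def macro_record_to_signal(source_document_id: str, impact_direction: str,
                           macro_impact_score: float, confidence: float,
                           published_at: datetime, now: datetime,
                           macro_signal_weight: float,
                           half_life_hours: float = 72.0) -> WeightedSignal:
    """Convert one macro impact record into a signal the aggregator can merge."""
    age_hours = max(0.0, (now - published_at).total_seconds() / 3600.0)
    return WeightedSignal(
        document_id=source_document_id,
        sentiment_value=DIRECTION_TO_SENTIMENT[impact_direction],
        impact_score=macro_impact_score * macro_signal_weight,
        recency_weight=0.5 ** (age_hours / half_life_hours),  # assumed decay form
        confidence=confidence,
    )


def merge_signals(company_signals: Sequence[WeightedSignal],
                  macro_signals: Sequence[WeightedSignal],
                  macro_enabled: bool) -> List[WeightedSignal]:
    """With the layer disabled or no macro data, output equals company-only input."""
    if not macro_enabled or not macro_signals:
        return list(company_signals)
    return list(company_signals) + list(macro_signals)
```

Keeping the disabled path as a plain pass-through is one simple way to satisfy the equivalence required by Property 14.
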
  • 8. Sector and market rollup enhancement

    • 8.1 Extend sector and market rollup logic in services/aggregation/worker.py

      • When computing sector-level rollups, incorporate macro impact signals affecting the sector weighted by constituent company exposure
      • When computing market-level rollups, aggregate macro signals across all sectors reflecting breadth and severity
      • When a GlobalEvent disproportionately affects one sector (>60% of total macro impact), surface that sector in material_risks or dominant_catalysts of the market-level rollup
      • Requirements: 6.1, 6.2, 6.3
    • 8.2 Write property test for sector and market rollup macro incorporation

      • Property 15: Sector and market rollup macro incorporation
      • Validates: Requirements 6.1, 6.2, 6.3
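
The 60% dominance rule in task 8.1 reduces to a share-of-total check. In this sketch `dominant_sector` is a hypothetical helper, and the inputs are assumed to be non-negative impact magnitudes already summed per sector.

```python
from typing import Dict, Optional


def dominant_sector(impact_by_sector: Dict[str, float], threshold: float = 0.6) -> Optional[str]:
    """Return the sector carrying more than `threshold` of total macro impact, if any."""
    total = sum(impact_by_sector.values())
    if total <= 0:
        return None  # no macro impact at all, nothing to surface
    sector, value = max(impact_by_sector.items(), key=lambda item: item[1])
    return sector if value / total > threshold else None
```

For example, `dominant_sector({"energy": 7.0, "tech": 2.0, "finance": 1.0})` returns `"energy"` because that sector holds 70% of the total, and the market-level rollup would then surface it in `material_risks` or `dominant_catalysts`.
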
  • 9. Trend projection module

    • 9.1 Implement services/aggregation/projection.py

      • Implement TrendProjection dataclass matching the design specification
      • Implement projection logic: compute trend momentum (rate of change in strength across recent windows), project macro signal decay based on estimated_duration and severity, factor in upcoming catalysts from document intelligence, combine into projected direction/strength/confidence
      • Flag divergence when projected direction differs from current trend direction, include divergence reason in driving_factors
      • When macro layer is disabled, compute projections using only company-specific momentum with reduced confidence
      • Mark projections whose projected_confidence falls below the configured threshold (default 0.3) as low_confidence
      • Persist TrendProjection to the trend_projections PostgreSQL table alongside the trend_window record
      • Call projection computation from aggregate_company_window after trend summary is assembled
      • Requirements: 12.1, 12.2, 12.3, 12.4, 12.5, 12.9
    • 9.2 Write property test for trend projection always produced

      • Property 20: Trend projection always produced
      • Validates: Requirements 12.1
    • 9.3 Write property test for projection divergence flagging

      • Property 21: Projection divergence flagging
      • Validates: Requirements 12.3
    • 9.4 Write property test for macro-disabled projections have reduced confidence

      • Property 22: Macro-disabled projections have reduced confidence
      • Validates: Requirements 12.4
    • 9.5 Write property test for low-confidence projection exclusion

      • Property 23: Low-confidence projection exclusion
      • Validates: Requirements 12.9
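
A heavily simplified sketch of the projection logic in task 9.1 follows. It covers only momentum, divergence flagging, the reduced confidence when the macro layer is disabled, and the low-confidence mark; macro signal decay and upcoming catalysts are omitted. The signed-strength convention, the direction labels, the 0.8 penalty, and the names `ProjectionSketch` and `project_trend` are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class ProjectionSketch:  # hypothetical, smaller than the planned TrendProjection
    projected_direction: str
    projected_strength: float
    projected_confidence: float
    diverges_from_current: bool
    low_confidence: bool
    driving_factors: List[str] = field(default_factory=list)


def trend_momentum(strengths: Sequence[float]) -> float:
    """Average rate of change in signed strength across recent windows."""
    if len(strengths) < 2:
        return 0.0
    return (strengths[-1] - strengths[0]) / (len(strengths) - 1)


def _direction(signed_strength: float) -> str:
    if signed_strength > 0:
        return "positive"
    if signed_strength < 0:
        return "negative"
    return "neutral"


def project_trend(current_direction: str, strengths: Sequence[float],
                  base_confidence: float, macro_enabled: bool,
                  threshold: float = 0.3, disabled_penalty: float = 0.8) -> ProjectionSketch:
    momentum = trend_momentum(strengths)
    latest = strengths[-1] if strengths else 0.0
    projected = max(-1.0, min(1.0, latest + momentum))  # extrapolate one window
    direction = _direction(projected)
    # Company-only projections carry less confidence (Requirement 12.4).
    confidence = base_confidence * (1.0 if macro_enabled else disabled_penalty)
    factors = [f"momentum={momentum:+.3f}"]
    diverges = direction != current_direction
    if diverges:
        factors.append(f"diverges: current={current_direction}, projected={direction}")
    return ProjectionSketch(direction, abs(projected), confidence, diverges,
                            confidence < threshold, factors)
```

Because a projection is returned for every input, including an empty history, this shape is compatible with Property 20 (a projection is always produced).
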
  • 10. Checkpoint — Ensure aggregation integration and projections work correctly

    • Ensure all tests pass; ask the user if questions arise.
  • 11. Macro signal suppression and safety

    • 11.1 Implement exposure profile auto-inference in services/extractor/exposure_inference.py

      • Implement infer_exposure_profile(document_intelligences, sector, industry, market_cap_bucket) -> ExposureProfile
      • Scan recent filing extractions for geographic revenue breakdowns, supplier mentions, and commodity references
      • Produce profile with source='inferred' and a confidence score reflecting data quality
      • Fall back to the sector-based default profile when filing data is insufficient
      • Requirements: 9.1, 9.2, 9.3
    • 11.2 Write property test for inferred exposure profile correctness

      • Property 16: Inferred exposure profile correctness
      • Validates: Requirements 9.1, 9.2
    • 11.3 Extend services/recommendation/suppression.py with macro-only suppression

      • Add MACRO_ONLY_SIGNAL = "macro_only_signal" to SuppressionReason enum
      • Implement evaluate_macro_only_suppression(summary, macro_signal_count, company_signal_count) -> bool
      • When macro signals are the sole basis for a trend direction change, force recommendation to mode='informational' and append macro-only caveat to thesis
      • Requirements: 10.3
    • 11.4 Write property test for macro-only recommendation suppression

      • Property 19: Macro-only recommendation suppression
      • Validates: Requirements 10.3
    • 11.5 Implement low-confidence event exclusion and accelerated decay in interpolation engine

      • In services/aggregation/interpolation.py, skip events with confidence below configurable threshold (default 0.4) and log exclusion reason
      • Apply an accelerated decay factor to short_term events older than 48 hours, so that their effective weight is strictly less than under standard recency decay
      • Requirements: 10.1, 10.2
    • 11.6 Write property test for low-confidence event exclusion

      • Property 17: Low-confidence event exclusion
      • Validates: Requirements 10.1
    • 11.7 Write property test for accelerated decay for stale short-term events

      • Property 18: Accelerated decay for stale short-term events
      • Validates: Requirements 10.2
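
The two safety rules in task 11.5 can be sketched as below. The 0.4 threshold and 48-hour staleness limit come from the plan; the half-life form of standard recency decay, the 0.5 acceleration factor, the dict-based event shape, and all function names are assumptions for this example.

```python
import logging
from typing import Dict, List

logger = logging.getLogger("interpolation")


def standard_recency_weight(age_hours: float, half_life_hours: float = 72.0) -> float:
    """Assumed form of the existing recency decay: exponential with a half-life."""
    return 0.5 ** (age_hours / half_life_hours)


def macro_event_weight(age_hours: float, estimated_duration: str,
                       staleness_hours: float = 48.0, acceleration: float = 0.5) -> float:
    """Recency weight with extra decay for stale short_term events."""
    weight = standard_recency_weight(age_hours)
    if estimated_duration == "short_term" and age_hours > staleness_hours:
        # Any factor in (0, 1) keeps the result strictly below standard decay.
        weight *= acceleration
    return weight


def filter_confident_events(events: List[Dict], threshold: float = 0.4) -> List[Dict]:
    """Drop events below the confidence threshold and log why."""
    kept = []
    for event in events:
        if event["confidence"] < threshold:
            logger.info("excluding event %s: confidence %.2f below threshold %.2f",
                        event.get("id"), event["confidence"], threshold)
            continue
        kept.append(event)
    return kept
```

Multiplying by a constant factor below 1 is the simplest way to guarantee the strict inequality that Property 18 checks; a steeper half-life would work as well.
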
  • 12. Macro signal layer toggle and API endpoints

    • 12.1 Implement macro toggle and status endpoints in services/api/app.py

      • Add GET /api/admin/macro/status returning current enabled/disabled state from risk_configs table
      • Add PUT /api/admin/macro/toggle to switch macro layer on/off, persisting to risk_configs and recording an audit event with previous state, new state, and operator
      • Toggle state is read from PostgreSQL at the start of each aggregation cycle (no caching)
      • Requirements: 11.1, 11.5, 11.7
    • 12.2 Implement macro event and impact query endpoints in services/api/app.py

      • Add GET /api/macro/events — list recent global events with filtering by severity, region, sector, date range
      • Add GET /api/macro/events/{event_id} — event detail with list of affected companies and their macro impact scores
      • Add GET /api/macro/impacts/{ticker} — macro impacts for a specific company
      • Add GET /api/trends/{trend_id}/projection — trend projection for a specific trend window
      • Include projection data in existing GET /api/trends list response
      • Requirements: 8.1, 8.2, 12.10
    • 12.3 Ensure macro ingestion continues when layer is disabled

      • When macro layer is disabled, ingestion and classification continue (historical data preserved), but interpolation and aggregation integration are skipped
      • When re-enabled, resume computing macro impact scores using most recent classifications including events ingested while disabled
      • Requirements: 11.2, 11.3, 11.4
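
The toggle semantics in tasks 12.1 and 12.3 can be illustrated without a web framework. `MacroToggleStore` is a hypothetical in-memory stand-in for the `risk_configs` row and the audit table, and `run_cycle` is a placeholder for one aggregation cycle.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class MacroToggleStore:
    """Hypothetical stand-in for risk_configs plus an audit log."""
    enabled: bool = True
    audit_log: List[Dict] = field(default_factory=list)

    def status(self) -> Dict[str, bool]:
        return {"enabled": self.enabled}

    def toggle(self, enabled: bool, operator: str) -> Dict:
        # Record previous state, new state, and operator, as task 12.1 requires.
        event = {
            "previous_state": self.enabled,
            "new_state": enabled,
            "operator": operator,
            "recorded_at": datetime.now(timezone.utc).isoformat(),
        }
        self.enabled = enabled
        self.audit_log.append(event)
        return event


def run_cycle(store: MacroToggleStore) -> str:
    """Read the toggle fresh at the start of each cycle (no caching)."""
    if store.status()["enabled"]:
        return "company+macro"
    return "company-only"  # ingestion and classification still run elsewhere
```

Reading the state inside `run_cycle` rather than at worker startup is what lets a toggle change take effect on the very next aggregation cycle.
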
  • 13. Checkpoint — Ensure API endpoints and toggle logic work correctly

    • Ensure all tests pass; ask the user if questions arise.
  • 14. Lake publisher extensions

    • 14.1 Add macro fact publishers to the lake publisher service

      • Implement publish_global_event_fact writing partitioned Parquet datasets to stonks-lakehouse/warehouse/global_events/dt={date}/
      • Implement publish_macro_impact_fact writing partitioned Parquet datasets to stonks-lakehouse/warehouse/macro_impacts/dt={date}/ticker={ticker}/
      • Implement publish_trend_projection_fact writing partitioned Parquet datasets to stonks-lakehouse/warehouse/trend_projections/dt={date}/ticker={ticker}/
      • Register new fact types in the lake publisher's job processing loop
      • Requirements: 7.3, 12.6
    • 14.2 Write property test for macro data persistence round-trip

      • Property 4: Macro data persistence round-trip
      • Validates: Requirements 3.1, 7.1, 7.2, 12.5
    • 14.3 Write property test for content hash stability and uniqueness

      • Property 1: Content hash stability and uniqueness
      • Validates: Requirements 1.2
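
The partition layouts named in task 14.1 can be produced by one small path builder. `partition_prefix` is a hypothetical helper, and rendering `{date}` as an ISO `YYYY-MM-DD` string is an assumption about the existing lake publisher's convention.

```python
from datetime import date
from typing import Optional

WAREHOUSE_ROOT = "stonks-lakehouse/warehouse"


def partition_prefix(dataset: str, day: date, ticker: Optional[str] = None) -> str:
    """Build the partition prefix for a fact dataset, optionally per ticker."""
    prefix = f"{WAREHOUSE_ROOT}/{dataset}/dt={day.isoformat()}/"
    if ticker is not None:
        prefix += f"ticker={ticker}/"
    return prefix
```

Global events are partitioned by date only, while macro impacts and trend projections add the `ticker=` level, so the same helper serves all three publishers.
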
  • 15. Macro ingestion pipeline wiring

    • 15.1 Wire macro source ingestion into the scheduler and ingestion worker

      • Configure the scheduler to trigger macro news source fetches on the polling interval
      • Ingestion worker stores raw payloads in MinIO under stonks-raw-news/macro/ prefix
      • Metadata records use document_type='macro_event' in PostgreSQL
      • Content hash deduplication consistent with existing behavior
      • Source failure handling with retry policy consistent with existing sources
      • Requirements: 1.1, 1.2, 1.3, 1.4
    • 15.2 Wire event classification into the extractor worker

      • After parsing, route macro_event documents to event_classifier.classify_global_event() instead of standard document extraction
      • After classification, trigger interpolation for all tracked companies via aggregation queue
      • Requirements: 2.1, 2.2, 2.3
    • 15.3 Wire interpolation into the aggregation pipeline

      • After event classification, load exposure profiles for all tracked companies (manual, inferred, or default)
      • Compute MacroImpactRecord for each company with non-zero overlap
      • Persist records and trigger aggregation for affected tickers
      • Handle sustained macro ingestion failures: alert operators and continue with company-only signals
      • Requirements: 4.1, 4.5, 10.4
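
The deduplication and routing steps in tasks 15.1 and 15.2 might look like the sketch below. Hashing the raw payload with SHA-256 is an assumption about the "existing behavior" the plan refers to, and `HashDeduplicator` and `route_document` are hypothetical names; only the `macro_event` document type comes from the plan.

```python
import hashlib
from typing import Set


def content_hash(raw_payload: bytes) -> str:
    """Stable hash of a payload: same bytes in, same hex digest out."""
    return hashlib.sha256(raw_payload).hexdigest()


class HashDeduplicator:
    """Hypothetical in-memory dedup; the real check would query PostgreSQL."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def is_new(self, raw_payload: bytes) -> bool:
        digest = content_hash(raw_payload)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True


def route_document(document_type: str) -> str:
    """Send macro_event documents to the event classifier, all others to extraction."""
    return "event_classifier" if document_type == "macro_event" else "document_extractor"
```

Stability (equal payloads hash equally) and uniqueness (distinct payloads hash differently, up to collision probability) are the two halves of Property 1.
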
  • 16. Checkpoint — Ensure full backend pipeline works end-to-end

    • Ensure all tests pass; ask the user if questions arise.
  • 17. Dashboard — Global Events page and macro exposure panel

    • 17.1 Create Global Events list page at frontend/src/pages/GlobalEvents.tsx

      • Filterable list of recent global events with columns: summary, impact types, severity badge, affected regions, affected sectors, event date
      • Add API hooks for GET /api/macro/events in frontend/src/api/hooks.ts
      • Add route /macro/events in frontend/src/routes.tsx
      • Add navigation entry in sidebar in frontend/src/components/AppLayout.tsx
      • Requirements: 8.1
    • 17.2 Create Global Event detail page at frontend/src/pages/GlobalEventDetail.tsx

      • Display full classification detail: all affected companies with Macro_Impact_Scores, impact directions, contributing factors
      • Add API hook for GET /api/macro/events/{event_id}
      • Add route /macro/events/:id in frontend/src/routes.tsx
      • Requirements: 8.2
    • 17.3 Add macro exposure panel to Company Detail page

      • On frontend/src/pages/CompanyDetail.tsx, add a new tab/panel showing the company's Exposure_Profile and active GlobalEvents affecting the company with their Macro_Impact_Scores
      • Add API hook for GET /api/macro/impacts/{ticker}
      • Requirements: 8.3
    • 17.4 Add macro evidence indicators to Trend and Recommendation detail pages

      • On frontend/src/pages/TrendDetail.tsx, visually distinguish macro-sourced evidence from company-specific evidence in the evidence chain
      • On frontend/src/pages/RecommendationDetail.tsx, display macro signals that contributed with links back to originating GlobalEvents
      • Requirements: 8.4, 8.5
    • 17.5 Add trend projection display to Trend detail page

      • On frontend/src/pages/TrendDetail.tsx, display projected direction/strength alongside current trend with visual indicator and expandable driving factors panel
      • Add API hook for GET /api/trends/{trend_id}/projection
      • Requirements: 12.7
    • 17.6 Add macro toggle to Trading Controls page

      • On frontend/src/pages/Trading.tsx, add macro signal layer enable/disable switch with confirmation dialog
      • Add API hooks for GET /api/admin/macro/status and PUT /api/admin/macro/toggle
      • Requirements: 11.5, 11.6
  • 18. Checkpoint — Ensure frontend pages render and integrate with API

    • Ensure all tests pass; ask the user if questions arise.
  • 19. Integration wiring and final validation

    • 19.1 Add recommendation engine integration for trend projections

      • Incorporate trend projection into recommendation thesis and time_horizon fields, citing projected direction and key driving factors
      • Exclude low-confidence projections from influencing recommendation eligibility
      • Requirements: 12.8, 12.9
    • 19.2 Write integration tests for macro pipeline end-to-end

      • Test macro article ingestion → parsing → classification → interpolation → aggregation flow
      • Test lake publisher writes correct Parquet partitions for global events and macro impacts
      • Test macro toggle state change propagates to next aggregation cycle
      • Requirements: 1.1, 2.1, 4.1, 5.1, 7.3, 11.1
    • 19.3 Write unit tests for API endpoints and dashboard components

      • Test macro event list/detail endpoints return correct data
      • Test macro toggle endpoint persists state and records audit event
      • Test trend projection endpoint returns projection data
      • Add MSW handlers for macro endpoints in frontend/src/test/mocks/handlers.ts
      • Test GlobalEvents page and macro exposure panel render correctly
      • Requirements: 8.1, 8.2, 11.5, 12.10
  • 20. Final checkpoint — Ensure all tests pass

    • Ensure all tests pass; ask the user if questions arise.

Notes

  • Tasks marked with * are optional and can be skipped for a faster MVP
  • Each task references specific requirements for traceability
  • Checkpoints ensure incremental validation after each major phase
  • Property tests validate the 23 correctness properties from the design using Hypothesis
  • The design uses Python throughout — no language selection needed
  • No new Kubernetes deployments required; all modules extend existing services
  • Next migration number is 016