Requirements Document — Global News Interpolation Layer

Introduction

This feature adds a macro-level global news interpolation layer to the Stonks Oracle platform. The existing system ingests company-specific news, filings, and market data to produce per-company trend summaries and trade recommendations. This extension introduces a parallel signal path that ingests global and geopolitical news events such as tariffs, wars, sanctions, central bank rate decisions, commodity shocks, natural disasters, regulatory changes, and pandemics. It classifies each event by impact type and severity, maps it to affected business sectors and individual companies based on their exposure profiles, and feeds the resulting macro intelligence into the aggregation engine as an additional weighted signal layer alongside the existing company-specific document intelligence.

The interpolation layer accounts for the fact that the same global event affects businesses differently depending on their business class, what they produce or market, their geographic revenue exposure, their supply chain dependencies, and their scale of operations (domestic-only vs. multinational vs. emerging-market-dependent).

Glossary

  • Global_Event: A macro-level news event with potential cross-sector or cross-geography market impact (e.g., a tariff announcement, armed conflict, central bank rate decision, commodity supply disruption, natural disaster, or regulatory change).
  • Event_Classifier: The Ollama-based extraction service that classifies a Global_Event by impact type, severity, affected regions, and affected sectors.
  • Exposure_Profile: A per-company record describing geographic revenue mix, supply chain dependencies, key input commodities, regulatory jurisdictions, and market position tier that determines how a Global_Event maps to that company.
  • Macro_Impact_Score: A computed score in [0, 1] representing the estimated magnitude of a Global_Event's effect on a specific company, derived from the event's severity and the company's Exposure_Profile overlap.
  • Interpolation_Engine: The component that combines Global_Event classifications with company Exposure_Profiles to produce per-company Macro_Impact_Scores and feed them into the existing Aggregation_Engine.
  • Aggregation_Engine: The existing trend aggregation system (services/aggregation/) that computes rolling trend summaries from document intelligence signals.
  • Impact_Type: The category of economic effect a Global_Event produces (e.g., supply_disruption, demand_shift, cost_increase, regulatory_pressure, currency_impact, commodity_shock, trade_barrier, geopolitical_risk).
  • Severity_Level: A classification of a Global_Event's magnitude: low, moderate, high, or critical.
  • Market_Position_Tier: A company's scale classification affecting its resilience to macro shocks: global_leader, multinational, regional, or domestic.
  • Macro_Source: A news source configured specifically for global/macro event ingestion, distinct from company-specific news sources.

Requirements

Requirement 1: Global Event Ingestion

User Story: As an analyst, I want the platform to ingest global and geopolitical news from macro-focused sources, so that macro events are captured alongside company-specific intelligence.

Acceptance Criteria

  1. WHEN the Scheduler triggers a macro news ingestion cycle, THE Ingestion_Engine SHALL fetch articles from configured Macro_Sources and persist raw response payloads to MinIO under the stonks-raw-news bucket with a macro/ prefix path segment.
  2. WHEN a macro news article is ingested, THE Ingestion_Engine SHALL generate a stable content hash and use it to prevent duplicate processing, consistent with existing deduplication behavior.
  3. WHEN a macro news article is ingested, THE Ingestion_Engine SHALL persist a metadata record in PostgreSQL with source, URL, title, publication time, retrieval time, language, and content hash, using document_type macro_event.
  4. IF a macro news source is unreachable or returns an error, THEN THE Ingestion_Engine SHALL record the failure reason, retry policy state, and next eligible retry time, consistent with existing source failure handling.
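
The deduplication and storage-path criteria above could be sketched as follows. This is a minimal illustration, not the platform's existing deduplication logic: the normalization rule (collapse whitespace, lowercase), the `content_hash` and `macro_object_key` helpers, and the key layout under the `macro/` prefix are all assumptions made for the example.

```python
import hashlib
import re


def content_hash(title: str, body: str) -> str:
    """Return a stable SHA-256 hex digest for an article.

    Assumption: whitespace is collapsed and text is lowercased so that
    trivial formatting differences do not produce different hashes.
    """
    normalized = re.sub(r"\s+", " ", f"{title}\n{body}").strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def macro_object_key(source: str, digest: str) -> str:
    """Build a hypothetical object key inside the stonks-raw-news bucket."""
    return f"macro/{source}/{digest}.json"


class SeenHashes:
    """In-memory stand-in for the persistent hash lookup (hypothetical)."""

    def __init__(self) -> None:
        self._seen = set()

    def is_new(self, digest: str) -> bool:
        # Returns True the first time a digest is offered, False afterwards.
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True
```

In a real deployment the `SeenHashes` lookup would be backed by the PostgreSQL metadata record rather than process memory, so that duplicates are detected across ingestion cycles.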

Requirement 2: Global Event Classification

User Story: As an analyst, I want each global news article classified by impact type, severity, affected regions, and affected sectors, so that the platform understands what kind of macro shock each event represents.

Acceptance Criteria

  1. WHEN a macro news article passes parsing, THE Event_Classifier SHALL send the normalized text to a local Ollama model using structured JSON output with an explicit schema.
  2. WHEN the Event_Classifier processes a macro article, THE Event_Classifier SHALL produce a Global_Event intelligence object containing at minimum: event_id, event_type (one or more Impact_Types), severity (a Severity_Level), affected_regions (list of ISO country or region codes), affected_sectors (list of GICS sector identifiers or equivalent), affected_commodities (list when applicable), summary, key_facts, estimated_duration (short_term, medium_term, long_term), confidence score, and model metadata.
  3. WHEN the Ollama model returns an invalid or incomplete classification, THE Event_Classifier SHALL retry extraction according to policy and preserve both the failed output and validation errors.
  4. WHEN a Global_Event affects multiple Impact_Types simultaneously, THE Event_Classifier SHALL represent all applicable types rather than collapsing to a single category.
  5. THE Event_Classifier SHALL persist the classification prompt, schema, model metadata, and raw model output to MinIO for audit and reproducibility.
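
A validation step for the classification object in criteria 2 and 3 might look like the sketch below. It deliberately avoids any Ollama client calls and only checks a parsed JSON object. The `validate_classification` function is hypothetical; the Impact_Type set is copied from the glossary examples and may not be exhaustive, and the [0, 1] range for the confidence score is an assumption.

```python
from typing import Any, Dict, List

# Values taken from the glossary; the Impact_Type list there is given as
# examples, so a real schema may allow more.
IMPACT_TYPES = {
    "supply_disruption", "demand_shift", "cost_increase", "regulatory_pressure",
    "currency_impact", "commodity_shock", "trade_barrier", "geopolitical_risk",
}
SEVERITY_LEVELS = {"low", "moderate", "high", "critical"}
DURATIONS = {"short_term", "medium_term", "long_term"}
REQUIRED_FIELDS = (
    "event_id", "event_type", "severity", "affected_regions", "affected_sectors",
    "summary", "key_facts", "estimated_duration", "confidence",
)


def validate_classification(obj: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors: List[str] = []
    for name in REQUIRED_FIELDS:
        if name not in obj:
            errors.append(f"missing field: {name}")

    event_types = obj.get("event_type")
    if not isinstance(event_types, list) or not event_types:
        errors.append("event_type must be a non-empty list")
    else:
        # Multiple Impact_Types are allowed; only unknown values are rejected.
        unknown = sorted(
            str(t) for t in event_types
            if not isinstance(t, str) or t not in IMPACT_TYPES
        )
        if unknown:
            errors.append(f"unknown impact types: {unknown}")

    if obj.get("severity") not in SEVERITY_LEVELS:
        errors.append("severity must be one of low, moderate, high, critical")
    if obj.get("estimated_duration") not in DURATIONS:
        errors.append("estimated_duration must be short_term, medium_term, or long_term")

    confidence = obj.get("confidence")
    # Assumption: confidence is a number in [0, 1].
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        errors.append("confidence must be a number in [0, 1]")
    return errors
```

When the returned list is non-empty, the caller would store it next to the failed model output before retrying, which satisfies the requirement to preserve both.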

Requirement 3: Company Exposure Profiles

User Story: As an operator, I want to define each tracked company's geographic exposure, supply chain dependencies, and market position, so that the platform can determine how global events affect each company differently.

Acceptance Criteria

  1. WHEN an operator creates or updates a company's Exposure_Profile, THE Symbol_Registry SHALL persist the profile containing: geographic_revenue_mix (a map of region codes to revenue percentage), supply_chain_regions (list of regions where key suppliers operate), key_input_commodities (list of commodities the company depends on), regulatory_jurisdictions (list of jurisdictions with material regulatory exposure), market_position_tier (one of global_leader, multinational, regional, domestic), and export_dependency_pct (percentage of revenue from exports).
  2. WHEN no Exposure_Profile exists for a tracked company, THE Interpolation_Engine SHALL use a default profile derived from the company's sector and industry fields, with market_position_tier inferred from market_cap_bucket.
  3. WHEN an operator updates an Exposure_Profile, THE Symbol_Registry SHALL record the previous profile version for audit trail purposes.
  4. THE Symbol_Registry SHALL expose Exposure_Profile CRUD operations through its existing REST API.
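
The profile fields in criterion 1 and the default-profile fallback in criterion 2 can be illustrated with a small data model. This is a sketch under stated assumptions: the `market_cap_bucket` values (`mega`, `large`, `mid`, `small`), their mapping to tiers, the sector-to-commodity defaults, the `source` value `default`, and the validation rules are all hypothetical and not defined by this document.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MARKET_POSITION_TIERS = ("global_leader", "multinational", "regional", "domestic")

# Hypothetical mappings used only for this example.
TIER_BY_MARKET_CAP_BUCKET = {
    "mega": "global_leader",
    "large": "multinational",
    "mid": "regional",
    "small": "domestic",
}
SECTOR_DEFAULT_COMMODITIES = {
    "Energy": ["crude_oil", "natural_gas"],
    "Materials": ["iron_ore", "copper"],
}


@dataclass
class ExposureProfile:
    geographic_revenue_mix: Dict[str, float] = field(default_factory=dict)
    supply_chain_regions: List[str] = field(default_factory=list)
    key_input_commodities: List[str] = field(default_factory=list)
    regulatory_jurisdictions: List[str] = field(default_factory=list)
    market_position_tier: str = "domestic"
    export_dependency_pct: float = 0.0
    source: str = "manual"

    def validate(self) -> List[str]:
        """Return validation errors; an empty list means the profile is usable."""
        errors: List[str] = []
        if self.market_position_tier not in MARKET_POSITION_TIERS:
            errors.append("unknown market_position_tier")
        shares = list(self.geographic_revenue_mix.values())
        # Assumption: shares are percentages that cannot exceed 100 in total.
        if any(s < 0 or s > 100 for s in shares) or sum(shares) > 100.0 + 1e-6:
            errors.append("geographic_revenue_mix must be percentages summing to at most 100")
        if not 0.0 <= self.export_dependency_pct <= 100.0:
            errors.append("export_dependency_pct must be in [0, 100]")
        return errors


def default_profile(sector: str, market_cap_bucket: str) -> ExposureProfile:
    """Build a sector-based default profile when none has been configured."""
    return ExposureProfile(
        key_input_commodities=list(SECTOR_DEFAULT_COMMODITIES.get(sector, [])),
        market_position_tier=TIER_BY_MARKET_CAP_BUCKET.get(market_cap_bucket, "domestic"),
        source="default",
    )
```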

Requirement 4: Macro-to-Company Impact Mapping

User Story: As a strategist, I want the platform to compute how each global event specifically impacts each tracked company based on their exposure profile, so that macro intelligence is company-specific rather than generic.

Acceptance Criteria

  1. WHEN a Global_Event classification is produced, THE Interpolation_Engine SHALL compute a Macro_Impact_Score for each tracked company by evaluating the overlap between the event's affected_regions, affected_sectors, and affected_commodities against the company's Exposure_Profile.
  2. WHEN computing a Macro_Impact_Score, THE Interpolation_Engine SHALL weight the score by the event's Severity_Level, the degree of geographic overlap (using geographic_revenue_mix percentages), the supply chain exposure (using supply_chain_regions), and the commodity dependency overlap.
  3. WHEN computing a Macro_Impact_Score, THE Interpolation_Engine SHALL apply a resilience modifier based on the company's Market_Position_Tier, where global_leader companies receive a dampening factor and domestic companies receive an amplification factor for international events.
  4. WHEN a Global_Event has zero overlap with a company's Exposure_Profile, THE Interpolation_Engine SHALL assign a Macro_Impact_Score of 0.0 and skip further processing for that company-event pair.
  5. WHEN a Macro_Impact_Score is computed, THE Interpolation_Engine SHALL produce a macro impact record containing: event_id, company_id, ticker, macro_impact_score, impact_direction (positive, negative, or mixed), contributing_factors (list of which profile dimensions matched), and confidence score.
  6. WHEN the same Global_Event produces both positive and negative effects on a company, THE Interpolation_Engine SHALL represent the net direction as mixed and preserve both the positive and negative contributing factors separately.
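
One possible shape for the scoring in criteria 1 through 4 is sketched below. The numeric severity weights, tier modifiers, and per-dimension weights are placeholder assumptions; this document does not specify them. The example also simplifies criterion 3 by applying the tier modifier to every event rather than only to international ones, and it clamps the result so an amplified score stays within the [0, 1] range given in the glossary.

```python
from typing import Dict, List, Tuple

# All numeric values below are illustrative assumptions.
SEVERITY_WEIGHT = {"low": 0.25, "moderate": 0.5, "high": 0.75, "critical": 1.0}
TIER_MODIFIER = {
    "global_leader": 0.8,   # dampening
    "multinational": 0.9,
    "regional": 1.0,
    "domestic": 1.2,        # amplification
}
DIMENSION_WEIGHT = {"geographic": 0.4, "supply_chain": 0.2, "commodity": 0.2, "sector": 0.2}


def _share(matches: set, pool: list) -> float:
    """Fraction of the company's pool that the event touches."""
    return len(matches & set(pool)) / len(pool) if pool else 0.0


def macro_impact_score(event: Dict, company: Dict) -> Tuple[float, List[str]]:
    """Return (score in [0, 1], contributing_factors) for one company-event pair."""
    regions = set(event["affected_regions"])
    overlap = {
        # Revenue percentage earned in affected regions, scaled to [0, 1].
        "geographic": sum(
            pct for region, pct in company["geographic_revenue_mix"].items()
            if region in regions
        ) / 100.0,
        "supply_chain": _share(regions, company["supply_chain_regions"]),
        "commodity": _share(
            set(event.get("affected_commodities", [])),
            company["key_input_commodities"],
        ),
        "sector": 1.0 if company["sector"] in event["affected_sectors"] else 0.0,
    }
    factors = [name for name, value in overlap.items() if value > 0.0]
    if not factors:
        # Zero overlap: score 0.0 and no further processing for this pair.
        return 0.0, []
    exposure = sum(DIMENSION_WEIGHT[name] * value for name, value in overlap.items())
    raw = SEVERITY_WEIGHT[event["severity"]] * exposure
    raw *= TIER_MODIFIER[company["market_position_tier"]]
    return round(min(1.0, max(0.0, raw)), 4), factors
```

The returned factor list corresponds to the contributing_factors field of the macro impact record. Impact direction is not modeled here; deciding positive, negative, or mixed would need per-factor sign information that this sketch does not attempt.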

Requirement 5: Aggregation Engine Integration

User Story: As a strategist, I want macro impact signals to be blended into existing company trend summaries alongside company-specific document intelligence, so that recommendations reflect both micro and macro conditions.

Acceptance Criteria

  1. WHEN the Aggregation_Engine computes a company trend summary, THE Aggregation_Engine SHALL include macro impact records as additional weighted signals alongside existing document intelligence signals.
  2. WHEN weighting macro impact signals, THE Aggregation_Engine SHALL apply recency decay, event severity weighting, and confidence gating consistent with existing signal scoring, using the Global_Event's publication time for recency and the Macro_Impact_Score as the impact score.
  3. WHEN macro signals and company-specific signals disagree in direction, THE Aggregation_Engine SHALL represent the disagreement explicitly in the contradiction_score and disagreement_details fields, consistent with existing contradiction detection behavior.
  4. WHEN a trend summary includes macro signal contributions, THE Aggregation_Engine SHALL include the contributing Global_Event IDs in the evidence references so that the macro signal chain is traceable from recommendation back to source event.
  5. WHEN no macro impact records exist for a company in the aggregation window, THE Aggregation_Engine SHALL produce the trend summary using only company-specific signals, with no degradation of existing behavior.
  6. THE Aggregation_Engine SHALL expose a configurable weight parameter (macro_signal_weight) that controls the relative influence of macro signals versus company-specific signals in the combined trend, defaulting to 0.3.
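
To make the role of macro_signal_weight concrete, the sketch below blends a company-specific trend score with macro signals. It is not the existing signal scoring: the signed score range of [-1, 1], the exponential decay with a 72-hour half-life, the confidence gate of 0.4, and the `signed_impact` field (Macro_Impact_Score with a sign for direction) are assumptions chosen for illustration. Only the default of 0.3 comes from criterion 6.

```python
from typing import Dict, List


def recency_weight(age_hours: float, half_life_hours: float = 72.0) -> float:
    """Exponential decay: the weight halves every half_life_hours (assumed)."""
    return 0.5 ** (max(age_hours, 0.0) / half_life_hours)


def blend_trend(
    company_score: float,
    macro_signals: List[Dict[str, float]],
    macro_signal_weight: float = 0.3,
    min_confidence: float = 0.4,
) -> float:
    """Blend a company-specific score in [-1, 1] with macro signals.

    Each macro signal is a dict with signed_impact, confidence, and age_hours.
    """
    usable = [s for s in macro_signals if s["confidence"] >= min_confidence]
    if not usable:
        # No macro records in the window: existing behavior is unchanged.
        return company_score
    weights = [recency_weight(s["age_hours"]) * s["confidence"] for s in usable]
    total = sum(weights)
    if total <= 0.0:
        return company_score
    macro_score = sum(w * s["signed_impact"] for w, s in zip(weights, usable)) / total
    # Fade the macro layer as its freshest, most confident signal ages.
    effective_weight = macro_signal_weight * max(weights)
    return (1.0 - effective_weight) * company_score + effective_weight * macro_score
```

With a single fresh, fully confident negative signal, a company score of 0.5 is pulled toward zero; with no usable macro signals the company score passes through untouched, which matches criterion 5.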

Requirement 6: Sector and Market Rollup Enhancement

User Story: As an analyst, I want sector-level and market-level trend rollups to reflect macro event impacts, so that I can see how global events are shifting entire sectors.

Acceptance Criteria

  1. WHEN the Aggregation_Engine computes a sector-level rollup, THE Aggregation_Engine SHALL incorporate macro impact signals that affect the sector, weighted by the number and exposure of constituent companies impacted.
  2. WHEN the Aggregation_Engine computes a market-level rollup, THE Aggregation_Engine SHALL incorporate macro impact signals aggregated across all sectors, reflecting the breadth and severity of active global events.
  3. WHEN a Global_Event disproportionately affects one sector, THE Aggregation_Engine SHALL surface that sector as a material_risk or dominant_catalyst in the market-level rollup.
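
A sector rollup along these lines could be computed as in the following sketch. Averaging over all constituents (including unaffected ones) is one assumed way to reflect both the number and the exposure of impacted companies, and the 2x ratio used to call a sector disproportionately affected is a placeholder, since this document does not define the threshold.

```python
from typing import Dict, List


def sector_rollup(
    impact_by_ticker: Dict[str, float],
    sector_by_ticker: Dict[str, str],
) -> Dict[str, float]:
    """Average Macro_Impact_Score per sector over all tracked constituents.

    Companies without an impact record count as 0.0, so a sector with few
    affected constituents scores lower than one that is broadly affected.
    """
    totals: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for ticker, sector in sector_by_ticker.items():
        counts[sector] = counts.get(sector, 0) + 1
        totals[sector] = totals.get(sector, 0.0) + impact_by_ticker.get(ticker, 0.0)
    return {sector: totals[sector] / counts[sector] for sector in counts}


def disproportionate_sectors(rollup: Dict[str, float], ratio: float = 2.0) -> List[str]:
    """Sectors whose impact is at least `ratio` times the mean of the others."""
    flagged: List[str] = []
    for sector, value in rollup.items():
        others = [v for name, v in rollup.items() if name != sector]
        if not others or value <= 0.0:
            continue
        baseline = sum(others) / len(others)
        if value >= ratio * baseline:
            flagged.append(sector)
    return sorted(flagged)
```

Sectors returned by `disproportionate_sectors` are the candidates that the market-level rollup would surface as a material_risk or dominant_catalyst.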

Requirement 7: Global Event Storage and Queryability

User Story: As a data engineer, I want global event classifications and macro impact records stored in both the operational database and the analytical lake, so that I can query macro intelligence alongside company data.

Acceptance Criteria

  1. WHEN a Global_Event classification is produced, THE System SHALL persist the classification record in PostgreSQL with fields for event_id, event_types, severity, affected_regions, affected_sectors, affected_commodities, summary, estimated_duration, confidence, source_document_id, and model metadata.
  2. WHEN a macro impact record is computed, THE System SHALL persist it in PostgreSQL with fields for event_id, company_id, ticker, macro_impact_score, impact_direction, contributing_factors, confidence, and computed_at timestamp.
  3. WHEN the Lake_Publisher runs, THE Lake_Publisher SHALL publish global event facts and macro impact facts as partitioned Parquet datasets to MinIO under the stonks-lakehouse bucket.
  4. WHEN analytical queries join macro impact data with company trends, THE System SHALL support SQL joins between global_events, macro_impacts, trend_windows, and recommendations tables through Trino.

Requirement 8: Dashboard Visibility

User Story: As an analyst, I want to see active global events, their severity, and which companies they impact through the web dashboard, so that I can understand the macro context behind trend shifts.

Acceptance Criteria

  1. WHEN an analyst navigates to a new Global Events section, THE Dashboard SHALL display a filterable list of recent Global_Events with columns for event summary, impact types, severity badge, affected regions, affected sectors, and event date.
  2. WHEN an analyst clicks a Global_Event, THE Dashboard SHALL display the full classification detail including all affected companies with their Macro_Impact_Scores, impact directions, and contributing factors.
  3. WHEN an analyst views a company detail page, THE Dashboard SHALL display a macro exposure panel showing the company's Exposure_Profile and a list of active Global_Events affecting that company with their Macro_Impact_Scores.
  4. WHEN an analyst views a trend summary, THE Dashboard SHALL visually distinguish macro-sourced evidence from company-specific evidence in the evidence chain.
  5. WHEN an analyst views a recommendation, THE Dashboard SHALL display any macro signals that contributed to the recommendation with links back to the originating Global_Events.

Requirement 9: Exposure Profile Auto-Inference

User Story: As an operator, I want the platform to automatically infer a baseline exposure profile from company filings and public data when I haven't manually configured one, so that macro interpolation works out of the box for newly tracked companies.

Acceptance Criteria

  1. WHEN a company is tracked and has no manually configured Exposure_Profile, THE Event_Classifier SHALL attempt to infer a baseline profile from the company's most recent filing extractions, using geographic revenue breakdowns, supplier mentions, and commodity references found in the document intelligence.
  2. WHEN the Event_Classifier infers an Exposure_Profile, THE Event_Classifier SHALL mark the profile as source inferred with a confidence score, distinguishing it from operator-configured profiles marked as source manual.
  3. IF the Event_Classifier cannot infer a meaningful profile due to insufficient filing data, THEN THE Interpolation_Engine SHALL fall back to the sector-based default profile described in Requirement 3.2.
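
The precedence implied by this requirement and Requirement 3.2 (manual profile first, then inferred, then sector default) can be written as a small resolver. The `resolve_profile` function and the rule for what counts as a meaningful inferred profile (at least one non-empty exposure dimension) are assumptions for illustration; the `default` source label is likewise hypothetical.

```python
from typing import Dict, Optional, Tuple

# Dimensions that an inferred profile is expected to populate from filings.
INFERRED_DIMENSIONS = (
    "geographic_revenue_mix",
    "supply_chain_regions",
    "key_input_commodities",
)


def is_meaningful(profile: Optional[Dict]) -> bool:
    """Assumed rule: at least one exposure dimension must be non-empty."""
    if not profile:
        return False
    return any(profile.get(name) for name in INFERRED_DIMENSIONS)


def resolve_profile(
    manual: Optional[Dict],
    inferred: Optional[Dict],
    sector_default: Dict,
) -> Tuple[Dict, str]:
    """Pick the profile to use and report its source."""
    if manual is not None:
        return manual, "manual"
    if is_meaningful(inferred):
        return inferred, "inferred"
    return sector_default, "default"
```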

Requirement 10: Macro Signal Suppression and Safety

User Story: As a risk owner, I want macro signals to be subject to quality controls so that low-confidence or stale global event classifications do not drive automated trading decisions.

Acceptance Criteria

  1. WHEN a Global_Event classification has a confidence score below a configurable threshold (default 0.4), THE Interpolation_Engine SHALL exclude the event from macro impact computation and log the exclusion reason.
  2. WHEN a Global_Event's estimated_duration is short_term and the event is older than 48 hours, THE Interpolation_Engine SHALL apply an accelerated decay factor to the event's macro impact signals.
  3. WHEN macro signals are the sole basis for a trend direction change (no supporting company-specific signals), THE Recommendation_Engine SHALL mark the recommendation as informational only and append a macro-only caveat to the thesis.
  4. IF the macro ingestion pipeline experiences sustained failures exceeding a configurable threshold, THEN THE System SHALL alert operators and continue producing recommendations using only company-specific signals.
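
Criteria 1 and 2 combine a confidence gate with faster decay for stale short-term events. A sketch of that logic follows. The 0.4 threshold and the 48-hour cutoff come from the criteria above; the exponential decay form, both half-life values, and the `macro_weight_factor` function name are assumptions.

```python
from typing import Optional, Tuple


def macro_weight_factor(
    confidence: float,
    estimated_duration: str,
    age_hours: float,
    min_confidence: float = 0.4,
    short_term_cutoff_hours: float = 48.0,
    half_life_hours: float = 72.0,              # assumed normal decay
    accelerated_half_life_hours: float = 12.0,  # assumed accelerated decay
) -> Tuple[float, Optional[str]]:
    """Return (weight factor in [0, 1], reason) for one Global_Event.

    The reason is a short string suitable for logging, or None when the
    event is handled with normal decay.
    """
    if confidence < min_confidence:
        # Excluded from macro impact computation entirely.
        return 0.0, "confidence_below_threshold"
    if estimated_duration == "short_term" and age_hours > short_term_cutoff_hours:
        # Normal decay up to the cutoff, accelerated decay afterwards, so the
        # weight stays continuous at the 48-hour boundary.
        at_cutoff = 0.5 ** (short_term_cutoff_hours / half_life_hours)
        overtime = age_hours - short_term_cutoff_hours
        return at_cutoff * 0.5 ** (overtime / accelerated_half_life_hours), "accelerated_decay"
    return 0.5 ** (max(age_hours, 0.0) / half_life_hours), None
```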

Requirement 11: Macro Signal Layer Toggle

User Story: As an operator, I want to enable or disable the macro signal interpolation layer at runtime without redeploying services, so that I can control whether global news influences trend summaries and recommendations.

Acceptance Criteria

  1. WHEN an operator toggles the macro signal layer via the Trading Controls page or the API, THE System SHALL persist the setting in the risk_configs table and apply it immediately to subsequent aggregation and recommendation cycles without requiring a service restart.
  2. WHEN the macro signal layer is disabled, THE Aggregation_Engine SHALL skip all macro impact signals and produce trend summaries using only company-specific document intelligence, with no change to existing behavior.
  3. WHEN the macro signal layer is disabled, THE Ingestion_Engine SHALL continue ingesting and classifying macro news articles so that historical macro data is preserved, but THE Interpolation_Engine SHALL skip macro-to-company impact computation.
  4. WHEN the macro signal layer is re-enabled after being disabled, THE Interpolation_Engine SHALL resume computing macro impact scores using the most recent Global_Event classifications, including events ingested while the layer was disabled.
  5. THE Query API SHALL expose a GET /api/admin/macro/status endpoint returning the current enabled/disabled state and a PUT /api/admin/macro/toggle endpoint to switch it.
  6. THE Dashboard Trading Controls page SHALL display the macro signal layer toggle alongside the existing trading mode controls, with a confirmation dialog for state changes.
  7. WHEN the macro signal layer state changes, THE System SHALL record an audit event with the previous state, new state, and the operator who made the change.
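
The toggle semantics, including the audit event in criterion 7, are sketched below with an in-memory stand-in. The `MacroLayerToggle` class, the `AuditEvent` fields beyond those named in criterion 7, and the initial enabled state are assumptions; a real implementation would read and write the risk_configs table instead of holding state in the process.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class AuditEvent:
    previous_state: bool
    new_state: bool
    operator: str
    changed_at: datetime


class MacroLayerToggle:
    """In-memory stand-in for the persisted macro signal layer setting."""

    def __init__(self, enabled: bool = True) -> None:
        self._enabled = enabled
        self.audit_log: List[AuditEvent] = []

    @property
    def enabled(self) -> bool:
        return self._enabled

    def set_enabled(self, enabled: bool, operator: str) -> bool:
        """Change the state; returns False when nothing changed."""
        if enabled == self._enabled:
            return False
        self.audit_log.append(
            AuditEvent(self._enabled, enabled, operator, datetime.now(timezone.utc))
        )
        self._enabled = enabled
        return True


def signals_for_aggregation(toggle: MacroLayerToggle, company_signals: list, macro_signals: list) -> list:
    """Macro signals are skipped entirely while the layer is disabled."""
    if toggle.enabled:
        return list(company_signals) + list(macro_signals)
    return list(company_signals)
```

Because the aggregation step consults the toggle on each cycle, a state change takes effect on the next cycle without a restart, and macro records ingested while the layer was off are still available once it is re-enabled.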

Requirement 12: Trend Projections

User Story: As a strategist, I want the platform to generate forward-looking trend projections that combine historical company-specific signals with active macro event trajectories, so that I can anticipate where a company's trend is heading rather than only seeing where it is now.

Acceptance Criteria

  1. WHEN the Aggregation_Engine produces a trend summary for a company, THE Aggregation_Engine SHALL also compute a trend projection containing a projected_direction (bullish, bearish, mixed, neutral), projected_strength, projected_confidence, projection_horizon (1d, 7d, 30d), and a list of driving_factors explaining what is expected to push the trend in that direction.
  2. WHEN computing a trend projection, THE Aggregation_Engine SHALL consider: the current trend trajectory and momentum (rate of change in strength over recent windows), active Global_Events with estimated_duration extending beyond the current window, the severity and decay profile of active macro signals, upcoming known catalysts from document intelligence (earnings dates, regulatory deadlines, product launches), and the historical pattern of how similar macro event types have resolved for companies with similar Exposure_Profiles.
  3. WHEN a trend projection diverges from the current trend direction (e.g., current trend is bullish but projection is bearish), THE Aggregation_Engine SHALL flag the projection as a potential reversal signal and include the divergence reason in the driving_factors.
  4. WHEN the macro signal layer is disabled, THE Aggregation_Engine SHALL still compute trend projections using only company-specific signal momentum and known upcoming catalysts, with reduced projection confidence.
  5. WHEN a trend projection is produced, THE System SHALL persist it in PostgreSQL alongside the trend_window record with fields for projected_direction, projected_strength, projected_confidence, projection_horizon, driving_factors, macro_contribution_pct (the percentage of the projection driven by macro signals rather than company-specific signals), and computed_at timestamp.
  6. WHEN the Lake_Publisher runs, THE Lake_Publisher SHALL publish trend projection facts as a partitioned Parquet dataset to MinIO for analytical queries and backtesting.
  7. WHEN an analyst views a trend summary on the Dashboard, THE Dashboard SHALL display the trend projection alongside the current trend with a visual indicator showing the projected direction and strength, and an expandable panel listing the driving factors.
  8. WHEN a recommendation is generated, THE Recommendation_Engine SHALL incorporate the trend projection into the thesis and time_horizon fields, citing the projected direction and key driving factors.
  9. WHEN a trend projection's confidence falls below a configurable threshold (default 0.3), THE System SHALL mark the projection as low_confidence and exclude it from influencing recommendation eligibility, while still displaying it as informational on the dashboard.
  10. THE System SHALL expose a GET /api/trends/{trend_id}/projection endpoint returning the projection for a specific trend window, and include projection data in the existing GET /api/trends list response.
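
A simplified projection covering criteria 1, 3, and 9 might look like the sketch below. It assumes trend strength is tracked as a signed value in [-1, 1] (positive for bullish), uses a linear momentum extrapolation, and treats macro influence as a single additive `macro_pressure` term. All of these are illustrative assumptions, as is the neutral band of 0.1. The sketch never emits the mixed direction and ignores upcoming catalysts and historical analogues from criterion 2.

```python
from typing import Dict, List


def direction_of(strength: float, neutral_band: float = 0.1) -> str:
    """Map a signed strength to a direction label (band width is assumed)."""
    if strength > neutral_band:
        return "bullish"
    if strength < -neutral_band:
        return "bearish"
    return "neutral"


def project_trend(
    recent_strengths: List[float],
    macro_pressure: float,
    projected_confidence: float,
    steps_ahead: int = 1,
    low_confidence_threshold: float = 0.3,
) -> Dict[str, object]:
    """Extrapolate the trend from momentum plus active macro pressure."""
    if not recent_strengths:
        raise ValueError("at least one trend window is required")
    current = recent_strengths[-1]
    # Momentum: average change in strength per window over the recent history.
    if len(recent_strengths) >= 2:
        momentum = (current - recent_strengths[0]) / (len(recent_strengths) - 1)
    else:
        momentum = 0.0
    projected = max(-1.0, min(1.0, current + momentum * steps_ahead + macro_pressure))

    current_direction = direction_of(current)
    projected_direction = direction_of(projected)
    reversal = {current_direction, projected_direction} == {"bullish", "bearish"}

    driving_factors: List[str] = []
    if momentum != 0.0:
        driving_factors.append("momentum")
    if macro_pressure != 0.0:
        driving_factors.append("macro_pressure")
    if reversal:
        driving_factors.append(
            f"divergence: current {current_direction}, projected {projected_direction}"
        )
    return {
        "projected_direction": projected_direction,
        "projected_strength": round(abs(projected), 4),
        "projected_confidence": projected_confidence,
        "potential_reversal": reversal,
        "low_confidence": projected_confidence < low_confidence_threshold,
        "driving_factors": driving_factors,
    }
```

When the macro signal layer is disabled, the same function can be called with `macro_pressure=0.0` and a reduced `projected_confidence`, which corresponds to criterion 4.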