feat: competitive intelligence & historical pattern matching layer

Celes Renata
2026-04-14 19:42:48 +00:00
parent b478022ba3
commit f7a11d14ea
203 changed files with 20155 additions and 97 deletions
@@ -0,0 +1 @@
{"specId": "3e745894-9abc-49ff-97cc-c921f436bb32", "workflowType": "requirements-first", "specType": "feature"}
@@ -0,0 +1,619 @@
# Global News Interpolation Layer — Design
## Overview
This design extends the Stonks Oracle platform with a macro-level global news interpolation layer. The layer introduces a parallel signal path that ingests global/geopolitical news events, classifies them by impact type and severity using Ollama, maps them to individual companies via exposure profiles, and feeds the resulting macro impact scores into the existing aggregation engine as weighted signals alongside company-specific document intelligence.
The design integrates with the existing service architecture — no new Kubernetes deployments are required. The event classifier reuses the extractor service's Ollama client, the interpolation engine runs within the aggregation worker, and exposure profiles are managed through the symbol registry API. A runtime toggle allows operators to enable/disable the macro signal layer without redeployment.
### Design Rationale
- **Reuse over new services**: The macro pipeline reuses existing ingestion, parsing, extraction, aggregation, and lake publisher infrastructure. New logic is added as modules within existing services rather than standalone deployments.
- **Exposure-driven specificity**: Rather than applying a blanket macro sentiment to all companies, the system computes company-specific impact scores based on geographic revenue mix, supply chain exposure, and commodity dependencies.
- **Safety-first**: Macro signals are subject to confidence gating, staleness decay, and a dedicated runtime toggle. Macro-only trend shifts are forced to informational mode.
- **Auditability**: Every macro impact score is traceable from the originating global event through the classification, exposure profile overlap, and final weighted contribution to the trend summary.
## Architecture
The macro interpolation layer adds four logical components that run within existing services:
```mermaid
flowchart TD
subgraph Ingestion["Ingestion Service (existing)"]
MS[Macro Source Adapter]
end
subgraph Parser["Parser Service (existing)"]
MP[Macro Article Parser]
end
subgraph Extractor["Extractor Service (existing)"]
EC[Event Classifier Module]
end
subgraph SymReg["Symbol Registry (existing)"]
EP[Exposure Profile CRUD]
end
subgraph Aggregation["Aggregation Service (existing)"]
IE[Interpolation Engine]
AE[Aggregation Engine]
TP[Trend Projections]
end
subgraph Recommendation["Recommendation Service (existing)"]
RE[Macro-Aware Recommendations]
end
subgraph LakePublisher["Lake Publisher (existing)"]
LP[Macro Fact Publisher]
end
subgraph QueryAPI["Query API (existing)"]
MA[Macro API Endpoints]
MT[Macro Toggle Endpoint]
end
subgraph Dashboard["Dashboard (existing)"]
GEP[Global Events Page]
MEP[Macro Exposure Panel]
end
MS -->|raw macro articles| MP
MP -->|normalized text| EC
EC -->|Global_Event classification| IE
EP -->|Exposure_Profiles| IE
IE -->|macro impact signals| AE
AE -->|trend summaries + projections| TP
TP --> RE
EC -->|event facts| LP
IE -->|impact facts| LP
MT -->|toggle state| AE
MA --> GEP
MA --> MEP
```
### Data Flow
1. **Ingestion**: Scheduler triggers macro source fetches. The existing ingestion worker fetches from configured macro news sources and stores raw payloads in MinIO under `stonks-raw-news/macro/`. Metadata records use `document_type = 'macro_event'`.
2. **Parsing**: The existing parser normalizes macro articles identically to company-specific articles. No parser changes needed — the parser is document-type agnostic.
3. **Classification**: A new `event_classifier` module in the extractor service uses a dedicated Ollama prompt and JSON schema to produce `GlobalEvent` classification objects. The module reuses the existing `OllamaClient` for inference and retry logic.
4. **Interpolation**: A new `interpolation` module in the aggregation service loads company exposure profiles, computes overlap scores against each classified event, and produces `MacroImpactRecord` objects. These are stored in PostgreSQL and fed into the aggregation engine as additional weighted signals.
5. **Aggregation**: The existing `aggregate_company_window` function is extended to fetch macro impact records alongside document impact records. Macro signals use the same `WeightedSignal` abstraction with recency decay, confidence gating, and contradiction detection.
6. **Trend Projections**: A new projection module computes forward-looking trend estimates by combining current trend momentum with active macro event trajectories and known upcoming catalysts.
7. **Recommendation**: The recommendation engine incorporates macro signals through the trend summary (no direct changes needed). A new check forces macro-only trend shifts to informational mode.
8. **Lake Publication**: New `publish_global_event_fact` and `publish_macro_impact_fact` functions in the lake publisher write partitioned Parquet datasets for analytical queries.
## Components and Interfaces
### Event Classifier Module
**Location**: `services/extractor/event_classifier.py`
Responsible for classifying macro news articles into structured `GlobalEvent` objects using Ollama.
```python
@dataclass
class GlobalEvent:
    event_id: str                    # UUID
    event_types: list[str]           # Impact_Type values
    severity: str                    # Severity_Level: low|moderate|high|critical
    affected_regions: list[str]      # ISO 3166-1 alpha-2 codes or region names
    affected_sectors: list[str]      # GICS sector identifiers
    affected_commodities: list[str]  # commodity identifiers when applicable
    summary: str
    key_facts: list[str]
    estimated_duration: str          # short_term|medium_term|long_term
    confidence: float                # [0, 1]
    source_document_id: str          # FK to documents table
    model_metadata: ModelMetadata
```
**Interface**:
- `classify_global_event(normalized_text: str, document_id: str, ollama_client: OllamaClient) -> GlobalEvent`
- `build_event_classification_prompt(text: str) -> str`
- `get_event_json_schema() -> dict`
**Ollama Integration**: Uses the existing `OllamaClient` with a dedicated prompt template (`event-classification-v1`) and JSON schema. Retries follow the same policy as document extraction.
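As a rough illustration of what `get_event_json_schema()` might return, the sketch below builds a JSON Schema dictionary from the `GlobalEvent` fields and enum values listed in this design. It assumes the model produces only the classification fields and that `event_id`, `source_document_id`, and `model_metadata` are attached by the service afterwards; that split, the `_string_list` helper, and the `minItems`/`additionalProperties` constraints are assumptions, not part of the design.

```python
IMPACT_TYPES = [
    "supply_disruption", "demand_shift", "cost_increase", "regulatory_pressure",
    "currency_impact", "commodity_shock", "trade_barrier", "geopolitical_risk",
]
SEVERITY_LEVELS = ["low", "moderate", "high", "critical"]
DURATIONS = ["short_term", "medium_term", "long_term"]


def _string_list() -> dict:
    """Hypothetical helper: a fresh schema fragment for a list of strings."""
    return {"type": "array", "items": {"type": "string"}}


def get_event_json_schema() -> dict:
    """Sketch of a JSON Schema for the model-produced part of a GlobalEvent."""
    return {
        "type": "object",
        "properties": {
            # At least one impact type; several may be returned (see Property 3).
            "event_types": {
                "type": "array",
                "minItems": 1,
                "items": {"type": "string", "enum": IMPACT_TYPES},
            },
            "severity": {"type": "string", "enum": SEVERITY_LEVELS},
            "affected_regions": _string_list(),
            "affected_sectors": _string_list(),
            "affected_commodities": _string_list(),
            "summary": {"type": "string"},
            "key_facts": _string_list(),
            "estimated_duration": {"type": "string", "enum": DURATIONS},
            "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        },
        # Assumption: commodities and key facts are optional in the model output.
        "required": [
            "event_types", "severity", "affected_regions", "affected_sectors",
            "summary", "estimated_duration", "confidence",
        ],
        "additionalProperties": False,
    }
```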
### Exposure Profile Management
**Location**: `services/symbol_registry/exposure.py`
New endpoints on the Symbol Registry API for managing company exposure profiles.
```python
class ExposureProfile(BaseModel):
    company_id: str
    geographic_revenue_mix: dict[str, float]  # region_code -> pct (0-1)
    supply_chain_regions: list[str]           # region codes
    key_input_commodities: list[str]          # commodity identifiers
    regulatory_jurisdictions: list[str]       # jurisdiction codes
    market_position_tier: str                 # global_leader|multinational|regional|domestic
    export_dependency_pct: float              # 0-1
    source: str                               # "manual" | "inferred"
    confidence: float                         # [0, 1], relevant for inferred profiles
    version: int                              # auto-incremented on update
```
**API Endpoints** (on Symbol Registry):
- `GET /companies/{company_id}/exposure` — get current profile
- `PUT /companies/{company_id}/exposure` — create/update profile (archives previous version)
- `GET /companies/{company_id}/exposure/history` — list profile versions
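The versioning behavior behind the `PUT` and `history` endpoints can be sketched with an in-memory store. This is only an illustration of "archive the previous version, bump the version number"; `ProfileSketch` and `InMemoryExposureStore` are hypothetical names, and the real implementation would persist to the `exposure_profiles` table instead.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ProfileSketch:
    """Hypothetical, trimmed-down stand-in for ExposureProfile."""
    company_id: str
    market_position_tier: str
    version: int = 1


class InMemoryExposureStore:
    """Keeps one active profile per company plus every version ever written."""

    def __init__(self) -> None:
        self._active: Dict[str, ProfileSketch] = {}
        self._history: Dict[str, List[ProfileSketch]] = {}

    def put(self, company_id: str, market_position_tier: str) -> ProfileSketch:
        previous = self._active.get(company_id)
        # Version numbers increase monotonically, starting at 1.
        next_version = previous.version + 1 if previous else 1
        profile = ProfileSketch(company_id, market_position_tier, next_version)
        # Each write adds exactly one history record (compare Property 6).
        self._history.setdefault(company_id, []).append(profile)
        self._active[company_id] = profile
        return profile

    def get(self, company_id: str) -> Optional[ProfileSketch]:
        return self._active.get(company_id)

    def history(self, company_id: str) -> List[ProfileSketch]:
        return list(self._history.get(company_id, []))
```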
### Interpolation Engine
**Location**: `services/aggregation/interpolation.py`
Computes per-company macro impact scores by evaluating overlap between global event classifications and company exposure profiles.
```python
@dataclass
class MacroImpactRecord:
    event_id: str
    company_id: str
    ticker: str
    macro_impact_score: float        # [0, 1]
    impact_direction: str            # positive|negative|mixed
    contributing_factors: list[str]  # which profile dimensions matched
    confidence: float                # [0, 1]
    computed_at: datetime
```
**Core Functions**:
- `compute_macro_impact(event: GlobalEvent, profile: ExposureProfile) -> MacroImpactRecord`
- `compute_geographic_overlap(event_regions: list[str], revenue_mix: dict[str, float]) -> float`
- `compute_supply_chain_overlap(event_regions: list[str], supply_regions: list[str]) -> float`
- `compute_commodity_overlap(event_commodities: list[str], company_commodities: list[str]) -> float`
- `apply_resilience_modifier(raw_score: float, tier: str, event_is_international: bool) -> float`
- `build_default_profile(sector: str, industry: str, market_cap_bucket: str) -> ExposureProfile`
**Scoring Formula**:
```
raw_score = severity_weight * (
geo_weight * geographic_overlap +
supply_weight * supply_chain_overlap +
commodity_weight * commodity_overlap +
sector_weight * sector_match
)
final_score = apply_resilience_modifier(raw_score, market_position_tier, event_is_international)
```
Where:
- `severity_weight`: critical=1.0, high=0.75, moderate=0.5, low=0.25
- `geo_weight=0.35, supply_weight=0.25, commodity_weight=0.25, sector_weight=0.15`
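- Resilience modifiers for international events: global_leader=0.7, multinational=0.85, regional=1.0, domestic=1.2. Because the domestic modifier exceeds 1.0, the final score needs to be clamped to [0, 1] to satisfy Property 7.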
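The formula and weights above can be exercised with a small worked sketch. The overlap definitions here are assumptions (revenue share in affected regions, and the fraction of a company's supply regions or commodities hit by the event), as are the `fraction_matched` and `compute_macro_impact_score` helper names and the clamp to [0, 1]; only the weights and modifier values come from this design.

```python
from typing import Dict, Sequence

# Weights and modifiers as listed in the design.
SEVERITY_WEIGHT = {"critical": 1.0, "high": 0.75, "moderate": 0.5, "low": 0.25}
GEO_WEIGHT, SUPPLY_WEIGHT, COMMODITY_WEIGHT, SECTOR_WEIGHT = 0.35, 0.25, 0.25, 0.15
RESILIENCE = {"global_leader": 0.7, "multinational": 0.85, "regional": 1.0, "domestic": 1.2}


def compute_geographic_overlap(event_regions: Sequence[str], revenue_mix: Dict[str, float]) -> float:
    """Assumed definition: share of revenue earned in the affected regions."""
    affected = set(event_regions)
    return min(1.0, sum(pct for region, pct in revenue_mix.items() if region in affected))


def fraction_matched(event_items: Sequence[str], company_items: Sequence[str]) -> float:
    """Assumed definition: fraction of the company's items hit by the event."""
    company = set(company_items)
    if not company:
        return 0.0
    return len(set(event_items) & company) / len(company)


def apply_resilience_modifier(raw_score: float, tier: str, event_is_international: bool) -> float:
    # Assumption: the tier modifier applies only to international events,
    # and the result is clamped so the score stays in [0, 1].
    modifier = RESILIENCE[tier] if event_is_international else 1.0
    return max(0.0, min(1.0, raw_score * modifier))


def compute_macro_impact_score(severity: str, geo: float, supply: float, commodity: float,
                               sector_match: float, tier: str, event_is_international: bool) -> float:
    raw_score = SEVERITY_WEIGHT[severity] * (
        GEO_WEIGHT * geo
        + SUPPLY_WEIGHT * supply
        + COMMODITY_WEIGHT * commodity
        + SECTOR_WEIGHT * sector_match
    )
    return apply_resilience_modifier(raw_score, tier, event_is_international)


# Worked example: a high-severity event in CN hitting a multinational.
geo = compute_geographic_overlap(["CN"], {"US": 0.6, "CN": 0.4})  # 0.4
supply = fraction_matched(["CN"], ["CN", "VN"])                    # 0.5
commodity = fraction_matched([], ["lithium"])                      # 0.0
example_score = compute_macro_impact_score("high", geo, supply, commodity, 1.0, "multinational", True)
```

In this example the raw score is 0.75 × (0.14 + 0.125 + 0 + 0.15) = 0.31125, and the multinational modifier of 0.85 brings the final score to roughly 0.2646.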
### Aggregation Engine Extensions
**Location**: Modified `services/aggregation/worker.py`
The existing `aggregate_company_window` function is extended to:
1. Check the macro signal layer toggle (from `risk_configs` table)
2. Fetch macro impact records for the ticker within the window
3. Convert macro impact records to `WeightedSignal` objects using the same scoring pipeline
4. Merge macro signals with company-specific signals before computing the trend summary
5. Apply `macro_signal_weight` (default 0.3) to control relative influence
**New config fields on `AggregationConfig`**:
```python
macro_signal_weight: float = 0.3 # relative weight of macro vs company signals
macro_enabled: bool = True # runtime toggle state
```
**Macro signal conversion**: Each `MacroImpactRecord` is converted to a `WeightedSignal` using:
- `document_id` = event's `source_document_id` (for evidence tracing)
- `sentiment_value` = mapped from `impact_direction` (positive=+1, negative=-1, mixed=0)
- `impact_score` = `macro_impact_score * macro_signal_weight`
- Recency decay uses the global event's publication time
- Confidence gating uses the macro impact record's confidence
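A minimal sketch of this conversion follows. The real `WeightedSignal` class is not shown in this document, so `SignalSketch` is a hypothetical stand-in carrying only the fields named above, and `macro_record_to_signal` is an illustrative helper name.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Direction-to-sentiment mapping as stated in the design.
DIRECTION_TO_SENTIMENT = {"positive": 1.0, "negative": -1.0, "mixed": 0.0}


@dataclass
class SignalSketch:
    """Hypothetical stand-in for the existing WeightedSignal abstraction."""
    document_id: str
    sentiment_value: float
    impact_score: float
    confidence: float
    published_at: datetime  # used for recency decay


def macro_record_to_signal(source_document_id: str, impact_direction: str,
                           macro_impact_score: float, confidence: float,
                           event_published_at: datetime,
                           macro_signal_weight: float = 0.3) -> SignalSketch:
    return SignalSketch(
        document_id=source_document_id,  # keeps the evidence trail to the event
        sentiment_value=DIRECTION_TO_SENTIMENT[impact_direction],
        impact_score=macro_impact_score * macro_signal_weight,
        confidence=confidence,           # gating uses the record's confidence
        published_at=event_published_at, # decay uses the event's publication time
    )


signal = macro_record_to_signal(
    "doc-123", "negative", 0.5, 0.8,
    datetime(2026, 4, 14, tzinfo=timezone.utc),
)
```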
### Trend Projection Module
**Location**: `services/aggregation/projection.py`
Computes forward-looking trend projections alongside current trend summaries.
```python
@dataclass
class TrendProjection:
    projected_direction: str       # bullish|bearish|mixed|neutral
    projected_strength: float      # [0, 1]
    projected_confidence: float    # [0, 1]
    projection_horizon: str        # 1d|7d|30d
    driving_factors: list[str]     # human-readable explanations
    macro_contribution_pct: float  # % of projection driven by macro signals
    diverges_from_current: bool    # True if projection != current direction
    computed_at: datetime
```
**Inputs**:
- Current trend summary (direction, strength, momentum)
- Active global events with `estimated_duration` extending beyond the current window
- Upcoming known catalysts from document intelligence (earnings dates, regulatory deadlines)
- Historical resolution patterns for similar event types (optional, v2)
**Projection Logic**:
1. Compute trend momentum as rate of change in strength across recent windows
2. Project macro signal decay based on event `estimated_duration` and severity
3. Factor in upcoming catalysts that may shift direction
4. Combine momentum + macro trajectory + catalyst outlook into projected direction/strength
5. Flag divergence when projected direction differs from current direction
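Steps 1 and 5 can be sketched in a few lines. The design does not specify how momentum is computed or combined, so the average per-window change and the linear extrapolation in `project_strength` below are assumptions chosen for simplicity; macro decay and catalysts (steps 2 to 4) are left out.

```python
from typing import Sequence


def compute_momentum(strengths: Sequence[float]) -> float:
    """Assumed definition: average change in trend strength per window."""
    if len(strengths) < 2:
        return 0.0
    return (strengths[-1] - strengths[0]) / (len(strengths) - 1)


def project_strength(current_strength: float, momentum: float, windows_ahead: int) -> float:
    """Assumption: linear extrapolation, clamped to the [0, 1] strength range."""
    return max(0.0, min(1.0, current_strength + momentum * windows_ahead))


def diverges_from_current(current_direction: str, projected_direction: str) -> bool:
    """Step 5: flag a projection whose direction differs from the current trend."""
    return current_direction != projected_direction
```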
### Macro Signal Suppression
**Location**: Extended `services/recommendation/suppression.py`
New suppression check: when macro signals are the sole basis for a trend direction change (no supporting company-specific signals agree), the recommendation is forced to informational mode with a macro-only caveat.
**New function**:
- `evaluate_macro_only_suppression(summary: TrendSummary, macro_signal_count: int, company_signal_count: int) -> bool`
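One possible reading of this check is sketched below. It assumes `company_signal_count` counts only company-specific signals that agree with the trend direction, and that a neutral trend never triggers suppression; both are assumptions. `TrendSummarySketch` is a hypothetical stand-in for the real `TrendSummary`.

```python
from dataclasses import dataclass


@dataclass
class TrendSummarySketch:
    """Hypothetical stand-in exposing only the field this check reads."""
    trend_direction: str  # bullish|bearish|mixed|neutral


def evaluate_macro_only_suppression(summary: TrendSummarySketch,
                                    macro_signal_count: int,
                                    company_signal_count: int) -> bool:
    """Return True when the recommendation should be forced to informational mode."""
    # Assumption: a neutral trend has no direction for macro signals to drive.
    if summary.trend_direction == "neutral":
        return False
    # Macro signals exist and no company-specific signal supports the direction.
    return macro_signal_count > 0 and company_signal_count == 0
```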
### Exposure Profile Auto-Inference
**Location**: `services/extractor/exposure_inference.py`
Infers baseline exposure profiles from company filing extractions when no manual profile exists.
**Interface**:
- `infer_exposure_profile(document_intelligences: list[DocumentIntelligence], sector: str, industry: str, market_cap_bucket: str) -> ExposureProfile`
Scans recent filing extractions for geographic revenue breakdowns, supplier mentions, and commodity references. Produces an `ExposureProfile` with `source='inferred'` and a confidence score reflecting data quality.
### Query API Extensions
**Location**: Extended `services/api/`
New endpoints:
- `GET /api/macro/events` — list recent global events with filtering
- `GET /api/macro/events/{event_id}` — event detail with affected companies
- `GET /api/macro/impacts/{ticker}` — macro impacts for a company
- `GET /api/admin/macro/status` — macro layer enabled/disabled state
- `PUT /api/admin/macro/toggle` — toggle macro layer on/off
- `GET /api/trends/{trend_id}/projection` — trend projection for a specific window
### Dashboard Extensions
**Location**: Extended `frontend/src/`
New pages/panels:
- **Global Events page** (`/macro/events`): filterable list of global events with severity badges, region/sector tags, and drill-down to affected companies
- **Macro Exposure panel** on Company Detail page: shows exposure profile and active macro impacts
- **Macro evidence indicators** on Trend and Recommendation detail pages: visually distinguishes macro-sourced evidence
- **Trend projection display** on Trend detail page: projected direction/strength with driving factors
- **Macro toggle** on Trading Controls page: enable/disable switch with confirmation dialog
## Data Models
### New PostgreSQL Tables
#### `global_events`
```sql
CREATE TABLE global_events (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
event_types TEXT[] NOT NULL,
severity VARCHAR(20) NOT NULL,
affected_regions TEXT[] NOT NULL DEFAULT '{}',
affected_sectors TEXT[] NOT NULL DEFAULT '{}',
affected_commodities TEXT[] NOT NULL DEFAULT '{}',
summary TEXT NOT NULL,
key_facts JSONB NOT NULL DEFAULT '[]',
estimated_duration VARCHAR(20) NOT NULL,
confidence FLOAT NOT NULL,
source_document_id UUID REFERENCES documents(id),
model_provider VARCHAR(100),
model_name VARCHAR(200),
prompt_version VARCHAR(100),
schema_version VARCHAR(20),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
#### `macro_impact_records`
```sql
CREATE TABLE macro_impact_records (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
event_id UUID NOT NULL REFERENCES global_events(id),
company_id UUID NOT NULL REFERENCES companies(id),
ticker VARCHAR(20) NOT NULL,
macro_impact_score FLOAT NOT NULL,
impact_direction VARCHAR(20) NOT NULL,
contributing_factors JSONB NOT NULL DEFAULT '[]',
confidence FLOAT NOT NULL,
computed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
#### `exposure_profiles`
```sql
CREATE TABLE exposure_profiles (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
company_id UUID NOT NULL REFERENCES companies(id),
geographic_revenue_mix JSONB NOT NULL DEFAULT '{}',
supply_chain_regions TEXT[] NOT NULL DEFAULT '{}',
key_input_commodities TEXT[] NOT NULL DEFAULT '{}',
regulatory_jurisdictions TEXT[] NOT NULL DEFAULT '{}',
market_position_tier VARCHAR(30) NOT NULL DEFAULT 'regional',
export_dependency_pct FLOAT NOT NULL DEFAULT 0.0,
source VARCHAR(20) NOT NULL DEFAULT 'manual',
confidence FLOAT NOT NULL DEFAULT 1.0,
version INTEGER NOT NULL DEFAULT 1,
active BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
#### `trend_projections`
```sql
CREATE TABLE trend_projections (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
trend_window_id UUID NOT NULL REFERENCES trend_windows(id),
projected_direction VARCHAR(20) NOT NULL,
projected_strength FLOAT NOT NULL,
projected_confidence FLOAT NOT NULL,
projection_horizon VARCHAR(10) NOT NULL,
driving_factors JSONB NOT NULL DEFAULT '[]',
macro_contribution_pct FLOAT NOT NULL DEFAULT 0.0,
diverges_from_current BOOLEAN NOT NULL DEFAULT FALSE,
computed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
### New Pydantic Schemas
Added to `services/shared/schemas.py`:
```python
class ImpactType(str, Enum):
    SUPPLY_DISRUPTION = "supply_disruption"
    DEMAND_SHIFT = "demand_shift"
    COST_INCREASE = "cost_increase"
    REGULATORY_PRESSURE = "regulatory_pressure"
    CURRENCY_IMPACT = "currency_impact"
    COMMODITY_SHOCK = "commodity_shock"
    TRADE_BARRIER = "trade_barrier"
    GEOPOLITICAL_RISK = "geopolitical_risk"


class SeverityLevel(str, Enum):
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"
    CRITICAL = "critical"


class MarketPositionTier(str, Enum):
    GLOBAL_LEADER = "global_leader"
    MULTINATIONAL = "multinational"
    REGIONAL = "regional"
    DOMESTIC = "domestic"


class EstimatedDuration(str, Enum):
    SHORT_TERM = "short_term"
    MEDIUM_TERM = "medium_term"
    LONG_TERM = "long_term"
```
### Analytical Lake Datasets
New fact tables published to MinIO under `stonks-lakehouse/`:
- `lake.global_events` — partitioned by `dt`, columns: event_id, event_types, severity, affected_regions, affected_sectors, affected_commodities, summary, estimated_duration, confidence, source_document_id, created_at
- `lake.macro_impacts` — partitioned by `dt` and `ticker`, columns: event_id, company_id, ticker, macro_impact_score, impact_direction, contributing_factors, confidence, computed_at
- `lake.trend_projections` — partitioned by `dt` and `ticker`, columns: trend_window_id, ticker, projected_direction, projected_strength, projected_confidence, projection_horizon, driving_factors, macro_contribution_pct, diverges_from_current, computed_at
## Correctness Properties
*A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*
### Property 1: Content hash stability and uniqueness
*For any* macro news article content, computing the content hash twice on identical content SHALL produce the same hash, and computing the hash on distinct content SHALL produce different hashes.
**Validates: Requirements 1.2**
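A content hash with this stability property could look like the sketch below. The existing deduplication code is not shown in this document, so the choice of SHA-256 over the raw UTF-8 bytes is an assumption; the distinctness half of the property then holds up to the collision resistance of the hash function.

```python
import hashlib


def content_hash(text: str) -> str:
    """Deterministic hex digest of the article content (assumed: SHA-256, UTF-8)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```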
### Property 2: Macro pipeline output schema completeness
*For any* valid Ollama classification response, the resulting GlobalEvent object SHALL contain all required fields (event_id, event_types, severity, affected_regions, affected_sectors, summary, estimated_duration, confidence, source_document_id, model_metadata). Similarly, *for any* valid macro impact computation, the resulting MacroImpactRecord SHALL contain all required fields (event_id, company_id, ticker, macro_impact_score, impact_direction, contributing_factors, confidence).
**Validates: Requirements 2.2, 4.5**
### Property 3: Multiple impact types preserved
*For any* global event classification where the source article implies N distinct impact types, the resulting GlobalEvent's event_types list SHALL contain all N types without collapsing to a single category.
**Validates: Requirements 2.4**
### Property 4: Macro data persistence round-trip
*For any* valid GlobalEvent, MacroImpactRecord, ExposureProfile, or TrendProjection object, persisting it to PostgreSQL and reading it back SHALL produce an equivalent object with all fields preserved.
**Validates: Requirements 3.1, 7.1, 7.2, 12.5**
### Property 5: Default exposure profile derivation
*For any* company with a valid sector, industry, and market_cap_bucket but no manually configured ExposureProfile, the default profile SHALL have a market_position_tier consistent with the market_cap_bucket mapping (large_cap → global_leader, mid_cap → multinational, small_cap → regional, micro_cap → domestic) and SHALL have non-empty geographic_revenue_mix derived from the sector.
**Validates: Requirements 3.2**
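The bucket-to-tier mapping in Property 5 can be written as a lookup table. The fallback to `regional` for an unrecognized bucket is an assumption, chosen to match the `exposure_profiles` column default; `default_market_position_tier` is an illustrative name.

```python
# Mapping taken from Property 5.
TIER_BY_MARKET_CAP = {
    "large_cap": "global_leader",
    "mid_cap": "multinational",
    "small_cap": "regional",
    "micro_cap": "domestic",
}


def default_market_position_tier(market_cap_bucket: str) -> str:
    """Return the default tier; unknown buckets fall back to 'regional' (assumed)."""
    return TIER_BY_MARKET_CAP.get(market_cap_bucket, "regional")
```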
### Property 6: Exposure profile version history
*For any* sequence of N updates to a company's ExposureProfile, the version history SHALL contain exactly N records, each preserving the complete profile state at the time of that update, with monotonically increasing version numbers.
**Validates: Requirements 3.3**
### Property 7: Macro impact score bounds and zero-overlap invariant
*For any* GlobalEvent and ExposureProfile pair, the computed Macro_Impact_Score SHALL be in [0, 1]. Furthermore, *for any* pair where the event's affected_regions and affected_commodities have zero intersection with the profile's geographic_revenue_mix keys, supply_chain_regions, and key_input_commodities, and the company's sector is not among the event's affected_sectors, the score SHALL be exactly 0.0.
**Validates: Requirements 4.1, 4.4**
### Property 8: Scoring monotonicity
*For any* GlobalEvent and ExposureProfile pair, increasing the event's severity level (low → moderate → high → critical) while holding all other inputs constant SHALL produce a Macro_Impact_Score that is greater than or equal to the previous score. Similarly, increasing the geographic overlap percentage SHALL produce a score greater than or equal to the previous score.
**Validates: Requirements 4.2**
### Property 9: Resilience modifier tier ordering
*For any* positive raw impact score and an international event, applying the resilience modifier with market_position_tier=global_leader SHALL produce a final score less than or equal to multinational, which SHALL be less than or equal to regional, which SHALL be less than or equal to domestic.
**Validates: Requirements 4.3**
### Property 10: Mixed direction for dual-effect events
*For any* GlobalEvent and ExposureProfile pair where the computation identifies both positive and negative contributing factors, the resulting impact_direction SHALL be 'mixed' and both positive and negative factors SHALL be preserved separately in contributing_factors.
**Validates: Requirements 4.6**
### Property 11: Macro signals influence trend output
*For any* company with both company-specific signals and non-zero macro impact signals, the trend summary computed with macro signals included SHALL differ from the trend summary computed with only company-specific signals (in at least one of: trend_strength, confidence, or evidence references).
**Validates: Requirements 5.1**
### Property 12: Macro-company contradiction detection
*For any* set of signals where macro impact signals have a negative direction and company-specific signals have a positive sentiment (or vice versa), the resulting trend summary's contradiction_score SHALL be greater than zero and disagreement_details SHALL contain at least one entry.
**Validates: Requirements 5.3**
### Property 13: Macro evidence traceability
*For any* trend summary that includes macro signal contributions, the top_supporting_evidence or top_opposing_evidence lists SHALL contain the source_document_id of at least one contributing GlobalEvent.
**Validates: Requirements 5.4**
### Property 14: No degradation without macro data and disabled-layer equivalence
*For any* company with no macro impact records in the aggregation window, the trend summary produced with the macro layer enabled SHALL be identical to the trend summary produced with the macro layer disabled. Furthermore, *for any* aggregation run with the macro layer disabled, the output SHALL be identical to company-only aggregation regardless of existing macro data.
**Validates: Requirements 5.5, 11.2**
### Property 15: Sector and market rollup macro incorporation
*For any* sector containing companies with non-zero macro impact scores, the sector-level rollup SHALL reflect those macro signals in its trend_strength or confidence. Furthermore, *for any* GlobalEvent that disproportionately affects a single sector (>60% of total macro impact concentrated in one sector), that sector SHALL appear in the market-level rollup's material_risks or dominant_catalysts.
**Validates: Requirements 6.1, 6.2, 6.3**
### Property 16: Inferred exposure profile correctness
*For any* set of filing extractions containing geographic revenue breakdowns or commodity references, the inferred ExposureProfile SHALL have source='inferred', confidence in [0, 1], and geographic_revenue_mix entries that correspond to regions mentioned in the filings.
**Validates: Requirements 9.1, 9.2**
### Property 17: Low-confidence event exclusion
*For any* GlobalEvent classification with confidence below the configurable threshold (default 0.4), the Interpolation_Engine SHALL produce zero MacroImpactRecords for that event.
**Validates: Requirements 10.1**
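The gate itself is a simple filter, sketched here under the assumption that an event exactly at the threshold passes ("below the threshold" is excluded). `EventSketch` and `select_events_for_interpolation` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class EventSketch:
    """Hypothetical stand-in carrying only the fields the gate needs."""
    event_id: str
    confidence: float


def select_events_for_interpolation(events: Sequence[EventSketch],
                                    threshold: float = 0.4) -> List[EventSketch]:
    """Drop events whose classification confidence is below the threshold."""
    return [event for event in events if event.confidence >= threshold]
```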
### Property 18: Accelerated decay for stale short-term events
*For any* GlobalEvent with estimated_duration='short_term' and age exceeding 48 hours, the effective signal weight SHALL be strictly less than the weight computed using standard recency decay for the same age.
**Validates: Requirements 10.2**
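One way to satisfy this property is to apply an extra decay factor to the hours beyond the 48-hour mark. The standard recency decay is not defined in this document, so the exponential half-life form, the 72-hour half-life, and the `acceleration` parameter below are all assumptions for illustration.

```python
def standard_decay(age_hours: float, half_life_hours: float = 72.0) -> float:
    """Assumed standard recency decay: weight halves every half_life_hours."""
    return 0.5 ** (age_hours / half_life_hours)


def macro_decay(age_hours: float, estimated_duration: str,
                half_life_hours: float = 72.0,
                stale_after_hours: float = 48.0,
                acceleration: float = 2.0) -> float:
    """Standard decay, with faster decay for stale short-term events."""
    weight = standard_decay(age_hours, half_life_hours)
    if estimated_duration == "short_term" and age_hours > stale_after_hours:
        # Hours past the staleness mark decay `acceleration` times as fast,
        # so the result is strictly below the standard weight.
        excess = age_hours - stale_after_hours
        weight *= 0.5 ** (excess * (acceleration - 1.0) / half_life_hours)
    return weight
```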
### Property 19: Macro-only recommendation suppression
*For any* trend summary where the trend direction is driven solely by macro signals (no company-specific signals support the direction), the resulting recommendation SHALL have mode='informational' and the thesis SHALL contain a macro-only caveat.
**Validates: Requirements 10.3**
### Property 20: Trend projection always produced
*For any* trend summary produced by the Aggregation_Engine, a corresponding TrendProjection SHALL also be produced with valid projected_direction, projected_strength in [0, 1], projected_confidence in [0, 1], and a non-empty driving_factors list.
**Validates: Requirements 12.1**
### Property 21: Projection divergence flagging
*For any* TrendProjection where projected_direction differs from the current trend summary's trend_direction, the diverges_from_current field SHALL be True and driving_factors SHALL contain at least one entry explaining the divergence.
**Validates: Requirements 12.3**
### Property 22: Macro-disabled projections have reduced confidence
*For any* identical set of company signals and macro signals, the TrendProjection computed with the macro layer disabled SHALL have projected_confidence less than or equal to the projection computed with the macro layer enabled.
**Validates: Requirements 12.4**
### Property 23: Low-confidence projection exclusion
*For any* TrendProjection with projected_confidence below the configurable threshold (default 0.3), the projection SHALL be marked as low_confidence and SHALL NOT influence recommendation eligibility.
**Validates: Requirements 12.9**
## Error Handling
### Macro Ingestion Failures
- Source fetch failures follow existing retry/backoff logic from the ingestion service
- Sustained macro source failures (configurable threshold, default 3 consecutive) trigger operator alerts via the existing alerting framework
- The aggregation engine continues producing trends using company-specific signals only when macro ingestion is degraded
### Event Classification Failures
- Invalid Ollama responses trigger retries per the existing extraction retry policy (max 2 retries with exponential backoff)
- Failed classifications are preserved in MinIO with validation errors for debugging
- Failed events do not produce macro impact records — they are silently excluded from interpolation
### Exposure Profile Fallbacks
- Missing manual profiles fall back to sector-based defaults
- Failed auto-inference falls back to sector-based defaults
- Default profiles use conservative assumptions (regional tier, even geographic distribution within sector norms)
### Interpolation Engine Failures
- Database errors during macro impact computation are logged and the event is skipped for that company
- The aggregation engine treats missing macro data as "no macro signal" — never blocks trend computation
### Projection Failures
- If projection computation fails (e.g., insufficient historical data), the trend summary is still persisted without a projection
- Low-confidence projections are marked but still displayed as informational
### Runtime Toggle Safety
- Toggle state is read from PostgreSQL at the start of each aggregation cycle — no caching that could become stale
- Toggle changes are audit-logged with operator identity, previous state, and new state
- Disabling the macro layer does not delete any data — ingestion and classification continue, only interpolation and aggregation integration are skipped
## Testing Strategy
### Property-Based Testing
This feature is well-suited for property-based testing. The core interpolation logic (impact scoring, overlap computation, resilience modifiers, signal weighting) consists of pure functions with clear input/output behavior and a large input space. The scoring formula has universal properties (monotonicity, bounds, zero-overlap invariant) that should hold across all valid inputs.
**Library**: [Hypothesis](https://hypothesis.readthedocs.io/) for Python property-based testing.
**Configuration**: Minimum 100 iterations per property test.
**Tag format**: `Feature: global-news-interpolation, Property {number}: {property_text}`
Each correctness property above maps to one property-based test. Generators will produce:
- Random `GlobalEvent` objects with valid enum values and realistic field ranges
- Random `ExposureProfile` objects with valid geographic mixes (summing to ~1.0), commodity lists, and tier values
- Random `WeightedSignal` lists mixing macro and company-specific signals
- Random `TrendSummary` objects for projection testing
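To show the shape of such a test without extra dependencies, the sketch below checks Property 9 (resilience tier ordering) using the standard library's `random` module as a stand-in for a Hypothesis strategy. The `apply_resilience_modifier` body, including the clamp, is an assumed implementation; in the actual suite the loop would be replaced by a Hypothesis-driven test.

```python
import random

# Modifier values from the scoring section; tiers ordered from most to least resilient.
RESILIENCE = {"global_leader": 0.7, "multinational": 0.85, "regional": 1.0, "domestic": 1.2}
TIER_ORDER = ["global_leader", "multinational", "regional", "domestic"]


def apply_resilience_modifier(raw_score: float, tier: str, event_is_international: bool) -> float:
    """Assumed implementation: scale by tier for international events, clamp to [0, 1]."""
    modifier = RESILIENCE[tier] if event_is_international else 1.0
    return max(0.0, min(1.0, raw_score * modifier))


def check_tier_ordering(iterations: int = 100, seed: int = 0) -> bool:
    """Property 9: final scores never decrease from global_leader to domestic."""
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    for _ in range(iterations):
        raw_score = rng.uniform(0.0, 1.0)
        scores = [apply_resilience_modifier(raw_score, tier, True) for tier in TIER_ORDER]
        if any(lower > higher for lower, higher in zip(scores, scores[1:])):
            return False
    return True
```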
### Unit Tests
Unit tests cover specific examples, edge cases, and integration points:
- Event classification prompt construction and schema validation
- Exposure profile API CRUD operations
- Default profile generation for each sector/market_cap combination
- Macro toggle API endpoints (status, toggle, audit logging)
- Recommendation thesis text includes macro signal references when present
- Dashboard component rendering for Global Events page, macro exposure panel, and projection display
### Integration Tests
Integration tests verify end-to-end data flow:
- Macro article ingestion → parsing → classification → interpolation → aggregation pipeline
- Lake publisher writes correct Parquet partitions for global events and macro impacts
- Trino queries joining global_events, macro_impacts, and trend_windows return expected results
- Macro toggle state change propagates to next aggregation cycle
@@ -0,0 +1,167 @@
# Requirements Document — Global News Interpolation Layer
## Introduction
This feature adds a macro-level global news interpolation layer to the Stonks Oracle platform. The existing system ingests company-specific news, filings, and market data to produce per-company trend summaries and trade recommendations. This extension introduces a parallel signal path that ingests global and geopolitical news events — tariffs, wars, sanctions, central bank rate decisions, commodity shocks, natural disasters, regulatory changes, pandemics, and similar macro events — classifies them by impact type and severity, maps them to affected business sectors and individual companies based on exposure profiles, and feeds the resulting macro intelligence into the aggregation engine as an additional weighted signal layer alongside existing company-specific document intelligence.
The interpolation layer accounts for the fact that the same global event affects different businesses differently depending on their business class, what they produce or market, their geographic revenue exposure, supply chain dependencies, and their position on the world scale (domestic-only vs. multinational vs. emerging-market-dependent).
## Glossary
- **Global_Event**: A macro-level news event with potential cross-sector or cross-geography market impact (e.g., a tariff announcement, armed conflict, central bank rate decision, commodity supply disruption, natural disaster, or regulatory change).
- **Event_Classifier**: The Ollama-based extraction service that classifies a Global_Event by impact type, severity, affected regions, and affected sectors.
- **Exposure_Profile**: A per-company record describing geographic revenue mix, supply chain dependencies, key input commodities, regulatory jurisdictions, and market position tier that determines how a Global_Event maps to that company.
- **Macro_Impact_Score**: A computed score in [0, 1] representing the estimated magnitude of a Global_Event's effect on a specific company, derived from the event's severity and the company's Exposure_Profile overlap.
- **Interpolation_Engine**: The component that combines Global_Event classifications with company Exposure_Profiles to produce per-company Macro_Impact_Scores and feed them into the existing Aggregation_Engine.
- **Aggregation_Engine**: The existing trend aggregation system (services/aggregation/) that computes rolling trend summaries from document intelligence signals.
- **Impact_Type**: The category of economic effect a Global_Event produces (e.g., supply_disruption, demand_shift, cost_increase, regulatory_pressure, currency_impact, commodity_shock, trade_barrier, geopolitical_risk).
- **Severity_Level**: A classification of a Global_Event's magnitude: low, moderate, high, or critical.
- **Market_Position_Tier**: A company's scale classification affecting its resilience to macro shocks: global_leader, multinational, regional, or domestic.
- **Macro_Source**: A news source configured specifically for global/macro event ingestion, distinct from company-specific news sources.
## Requirements
### Requirement 1: Global Event Ingestion
**User Story:** As an analyst, I want the platform to ingest global and geopolitical news from macro-focused sources, so that macro events are captured alongside company-specific intelligence.
#### Acceptance Criteria
1. WHEN the Scheduler triggers a macro news ingestion cycle, THE Ingestion_Engine SHALL fetch articles from configured Macro_Sources and persist raw response payloads to MinIO under the `stonks-raw-news` bucket with a `macro/` prefix path segment.
2. WHEN a macro news article is ingested, THE Ingestion_Engine SHALL generate a stable content hash and use it to prevent duplicate processing, consistent with existing deduplication behavior.
3. WHEN a macro news article is ingested, THE Ingestion_Engine SHALL persist a metadata record in PostgreSQL with source, URL, title, publication time, retrieval time, language, and content hash, using document_type `macro_event`.
4. IF a macro news source is unreachable or returns an error, THEN THE Ingestion_Engine SHALL record the failure reason, retry policy state, and next eligible retry time, consistent with existing source failure handling.
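The stable content hash in criterion 2 could be sketched as follows. This is a minimal illustration, assuming SHA-256 over a whitespace- and case-normalized title and body; the names `content_hash` and `is_duplicate` are hypothetical, and the in-memory `seen` set stands in for whatever lookup the existing deduplication path performs.

```python
import hashlib
import unicodedata


def content_hash(title: str, body: str) -> str:
    """Return a hex SHA-256 digest that ignores whitespace and case noise."""
    def normalize(text: str) -> str:
        # NFKC folds equivalent Unicode forms; split/join collapses whitespace runs.
        return " ".join(unicodedata.normalize("NFKC", text).split()).lower()

    payload = normalize(title) + "\n" + normalize(body)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_duplicate(digest: str, seen: set[str]) -> bool:
    """Record the digest and report whether it had been seen before."""
    if digest in seen:
        return True
    seen.add(digest)
    return False
```

Normalizing before hashing means a re-fetched article with different line wrapping maps to the same digest, which is the property the deduplication check depends on.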
### Requirement 2: Global Event Classification
**User Story:** As an analyst, I want each global news article classified by impact type, severity, affected regions, and affected sectors, so that the platform understands what kind of macro shock each event represents.
#### Acceptance Criteria
1. WHEN a macro news article passes parsing, THE Event_Classifier SHALL send the normalized text to a local Ollama model using structured JSON output with an explicit schema.
2. WHEN the Event_Classifier processes a macro article, THE Event_Classifier SHALL produce a Global_Event intelligence object containing at minimum: event_id, event_type (one or more Impact_Types), severity (a Severity_Level), affected_regions (list of ISO country or region codes), affected_sectors (list of GICS sector identifiers or equivalent), affected_commodities (list when applicable), summary, key_facts, estimated_duration (short_term, medium_term, long_term), confidence score, and model metadata.
3. WHEN the Ollama model returns an invalid or incomplete classification, THE Event_Classifier SHALL retry extraction according to policy and preserve both the failed output and validation errors.
4. WHEN a Global_Event affects multiple Impact_Types simultaneously, THE Event_Classifier SHALL represent all applicable types rather than collapsing to a single category.
5. THE Event_Classifier SHALL persist the classification prompt, schema, model metadata, and raw model output to MinIO for audit and reproducibility.
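A sketch of the validation step that decides whether a model response is usable or should be retried (criterion 3). It assumes the response has already been parsed into a `dict`, treats the Impact_Type values from the glossary as the full allowed set, and checks only a subset of the fields in criterion 2; `validate_classification` is a hypothetical name, not the extractor's actual API.

```python
IMPACT_TYPES = {
    "supply_disruption", "demand_shift", "cost_increase", "regulatory_pressure",
    "currency_impact", "commodity_shock", "trade_barrier", "geopolitical_risk",
}
SEVERITY_LEVELS = {"low", "moderate", "high", "critical"}
DURATIONS = {"short_term", "medium_term", "long_term"}
REQUIRED_FIELDS = (
    "event_id", "event_type", "severity", "affected_regions", "affected_sectors",
    "summary", "key_facts", "estimated_duration", "confidence",
)


def _allowed(value, allowed: set) -> bool:
    # Guard against unhashable values (lists, dicts) coming back from the model.
    return isinstance(value, str) and value in allowed


def validate_classification(raw: dict) -> list[str]:
    """Return validation errors; an empty list means the output is usable."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in raw]

    if "event_type" in raw:
        event_types = raw["event_type"]
        if not isinstance(event_types, list) or not event_types:
            errors.append("event_type must be a non-empty list")
        else:
            # Every applicable type is kept (criterion 4); only unknown labels fail.
            unknown = [t for t in event_types if not _allowed(t, IMPACT_TYPES)]
            if unknown:
                errors.append(f"unknown impact types: {unknown}")

    if "severity" in raw and not _allowed(raw["severity"], SEVERITY_LEVELS):
        errors.append(f"invalid severity: {raw['severity']!r}")
    if "estimated_duration" in raw and not _allowed(raw["estimated_duration"], DURATIONS):
        errors.append(f"invalid estimated_duration: {raw['estimated_duration']!r}")

    if "confidence" in raw:
        conf = raw["confidence"]
        is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
        if not is_number or not 0.0 <= conf <= 1.0:
            errors.append("confidence must be a number in [0, 1]")
    return errors
```

Returning the errors as data, rather than raising, makes it straightforward to persist them next to the failed output before a retry.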
### Requirement 3: Company Exposure Profiles
**User Story:** As an operator, I want to define each tracked company's geographic exposure, supply chain dependencies, and market position, so that the platform can determine how global events affect each company differently.
#### Acceptance Criteria
1. WHEN an operator creates or updates a company's Exposure_Profile, THE Symbol_Registry SHALL persist the profile containing: geographic_revenue_mix (a map of region codes to revenue percentage), supply_chain_regions (list of regions where key suppliers operate), key_input_commodities (list of commodities the company depends on), regulatory_jurisdictions (list of jurisdictions with material regulatory exposure), market_position_tier (one of global_leader, multinational, regional, domestic), and export_dependency_pct (percentage of revenue from exports).
2. WHEN no Exposure_Profile exists for a tracked company, THE Interpolation_Engine SHALL use a default profile derived from the company's sector and industry fields, with market_position_tier inferred from market_cap_bucket.
3. WHEN an operator updates an Exposure_Profile, THE Symbol_Registry SHALL record the previous profile version for audit trail purposes.
4. THE Symbol_Registry SHALL expose Exposure_Profile CRUD operations through its existing REST API.
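The audit-trail behavior in criterion 3 can be illustrated with an in-memory version history. This is a sketch only: the list stands in for the PostgreSQL table, the `ExposureProfile` class carries just a few of the fields from criterion 1, and `save_profile` is a hypothetical helper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ExposureProfile:
    company_id: str
    geographic_revenue_mix: dict[str, float]  # region code -> revenue percentage
    market_position_tier: str
    version: int = 1
    active: bool = True


def save_profile(history: list[ExposureProfile], new: ExposureProfile) -> list[ExposureProfile]:
    """Archive every earlier version and append `new` as the only active one."""
    archived = [replace(p, active=False) for p in history]
    next_version = max((p.version for p in history), default=0) + 1
    return archived + [replace(new, version=next_version, active=True)]
```

Because earlier versions are deactivated rather than overwritten, the previous profile remains available for audit after each update.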
### Requirement 4: Macro-to-Company Impact Mapping
**User Story:** As a strategist, I want the platform to compute how each global event specifically impacts each tracked company based on their exposure profile, so that macro intelligence is company-specific rather than generic.
#### Acceptance Criteria
1. WHEN a Global_Event classification is produced, THE Interpolation_Engine SHALL compute a Macro_Impact_Score for each tracked company by evaluating the overlap between the event's affected_regions, affected_sectors, and affected_commodities against the company's Exposure_Profile.
2. WHEN computing a Macro_Impact_Score, THE Interpolation_Engine SHALL weight the score by the event's Severity_Level, the degree of geographic overlap (using geographic_revenue_mix percentages), the supply chain exposure (using supply_chain_regions), and the commodity dependency overlap.
3. WHEN computing a Macro_Impact_Score, THE Interpolation_Engine SHALL apply a resilience modifier based on the company's Market_Position_Tier, where global_leader companies receive a dampening factor and domestic companies receive an amplification factor for international events.
4. WHEN a Global_Event has zero overlap with a company's Exposure_Profile, THE Interpolation_Engine SHALL assign a Macro_Impact_Score of 0.0 and skip further processing for that company-event pair.
5. WHEN a Macro_Impact_Score is computed, THE Interpolation_Engine SHALL produce a macro impact record containing: event_id, company_id, ticker, macro_impact_score, impact_direction (positive, negative, or mixed), contributing_factors (list of which profile dimensions matched), and confidence score.
6. WHEN the same Global_Event produces both positive and negative effects on a company, THE Interpolation_Engine SHALL represent the net direction as mixed and preserve both the positive and negative contributing factors separately.
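The scoring described in criteria 1 through 4 might look like the sketch below. The component weights (0.35, 0.25, 0.25, 0.15) and tier multipliers come from the implementation plan in this commit; the severity weights, the choice of the company's own set as the overlap denominator, and the final clamp to [0, 1] are assumptions made for illustration.

```python
# Assumed severity weights; the spec names the levels but not their values.
SEVERITY_WEIGHT = {"low": 0.25, "moderate": 0.5, "high": 0.75, "critical": 1.0}
# Tier multipliers as listed in the implementation plan.
TIER_MULTIPLIER = {"global_leader": 0.7, "multinational": 0.85, "regional": 1.0, "domestic": 1.2}


def geographic_overlap(event_regions: set[str], revenue_mix: dict[str, float]) -> float:
    """Fraction of revenue (0..1) earned in affected regions; mix values are percentages."""
    exposed = sum(pct for region, pct in revenue_mix.items() if region in event_regions)
    return min(1.0, exposed / 100.0)


def set_overlap(event_items: set[str], company_items: set[str]) -> float:
    """Share of the company's items that the event touches (assumed denominator)."""
    if not company_items:
        return 0.0
    return len(event_items & company_items) / len(company_items)


def macro_impact_score(severity: str, geo: float, supply: float, commodity: float,
                       sector: float, tier: str, event_is_international: bool = True) -> float:
    """Blend the overlap components, scale by severity, then apply the tier modifier."""
    blended = 0.35 * geo + 0.25 * supply + 0.25 * commodity + 0.15 * sector
    if blended == 0.0:
        return 0.0  # zero overlap: no further processing for this pair
    raw = SEVERITY_WEIGHT[severity] * blended
    if event_is_international:
        raw *= TIER_MULTIPLIER[tier]
    # Clamp because the domestic multiplier (1.2) can push a critical event past 1.
    return max(0.0, min(1.0, raw))
```

For example, a company earning 40% of revenue in an affected region has a geographic overlap of 0.4, and for the same event a `domestic` company scores higher than a `global_leader` with identical overlaps.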
### Requirement 5: Aggregation Engine Integration
**User Story:** As a strategist, I want macro impact signals to be blended into existing company trend summaries alongside company-specific document intelligence, so that recommendations reflect both micro and macro conditions.
#### Acceptance Criteria
1. WHEN the Aggregation_Engine computes a company trend summary, THE Aggregation_Engine SHALL include macro impact records as additional weighted signals alongside existing document intelligence signals.
2. WHEN weighting macro impact signals, THE Aggregation_Engine SHALL apply recency decay, event severity weighting, and confidence gating consistent with existing signal scoring, using the Global_Event's publication time for recency and the Macro_Impact_Score as the impact score.
3. WHEN macro signals and company-specific signals disagree in direction, THE Aggregation_Engine SHALL represent the disagreement explicitly in the contradiction_score and disagreement_details fields, consistent with existing contradiction detection behavior.
4. WHEN a trend summary includes macro signal contributions, THE Aggregation_Engine SHALL include the contributing Global_Event IDs in the evidence references so that the macro signal chain is traceable from recommendation back to source event.
5. WHEN no macro impact records exist for a company in the aggregation window, THE Aggregation_Engine SHALL produce the trend summary using only company-specific signals, with no degradation of existing behavior.
6. THE Aggregation_Engine SHALL expose a configurable weight parameter (macro_signal_weight) that controls the relative influence of macro signals versus company-specific signals in the combined trend, defaulting to 0.3.
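To show how `macro_signal_weight` shifts a combined trend, here is a sketch that converts a macro impact record into a weighted signal and computes a net sentiment. The `WeightedSignal` fields shown, the direction-to-sentiment mapping, and the `net_sentiment` formula are illustrative assumptions, not the existing worker's definitions; `weight` stands for the product of recency decay and confidence computed elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedSignal:
    document_id: str
    sentiment_value: float  # -1.0 (negative) .. 1.0 (positive)
    impact_score: float     # 0.0 .. 1.0
    weight: float           # recency decay * confidence, computed upstream


# Assumed mapping; "mixed" contributes weight but no net direction.
DIRECTION_SENTIMENT = {"positive": 1.0, "negative": -1.0, "mixed": 0.0}


def macro_to_signal(source_document_id: str, impact_direction: str,
                    macro_impact_score: float, weight: float,
                    macro_signal_weight: float = 0.3) -> WeightedSignal:
    """Scale the macro score by macro_signal_weight so macro stays a secondary input."""
    return WeightedSignal(
        document_id=source_document_id,
        sentiment_value=DIRECTION_SENTIMENT[impact_direction],
        impact_score=macro_impact_score * macro_signal_weight,
        weight=weight,
    )


def net_sentiment(signals: list[WeightedSignal]) -> float:
    """Impact- and weight-averaged sentiment in [-1, 1]; 0.0 when there is no signal."""
    total = sum(s.impact_score * s.weight for s in signals)
    if total == 0.0:
        return 0.0
    return sum(s.sentiment_value * s.impact_score * s.weight for s in signals) / total
```

With an empty macro list the result equals the company-only value, which matches criterion 5; a negative macro record pulls a bullish company signal down without flipping it at the default weight.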
### Requirement 6: Sector and Market Rollup Enhancement
**User Story:** As an analyst, I want sector-level and market-level trend rollups to reflect macro event impacts, so that I can see how global events are shifting entire sectors.
#### Acceptance Criteria
1. WHEN the Aggregation_Engine computes a sector-level rollup, THE Aggregation_Engine SHALL incorporate macro impact signals that affect the sector, weighted by the number and exposure of constituent companies impacted.
2. WHEN the Aggregation_Engine computes a market-level rollup, THE Aggregation_Engine SHALL incorporate macro impact signals aggregated across all sectors, reflecting the breadth and severity of active global events.
3. WHEN a Global_Event disproportionately affects one sector, THE Aggregation_Engine SHALL surface that sector as a material_risk or dominant_catalyst in the market-level rollup.
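Criterion 3 needs a concrete test for "disproportionately". The implementation plan in this commit uses a share above 60% of total macro impact, which the following sketch applies; `dominant_sector` and the shape of its input (total macro impact per sector for one event) are hypothetical.

```python
from typing import Optional


def dominant_sector(impact_by_sector: dict[str, float], threshold: float = 0.6) -> Optional[str]:
    """Return the sector carrying more than `threshold` of total impact, if any."""
    total = sum(impact_by_sector.values())
    if total <= 0:
        return None
    sector, impact = max(impact_by_sector.items(), key=lambda item: item[1])
    return sector if impact / total > threshold else None
```

A returned sector would then be surfaced in the market-level rollup's material risks or dominant catalysts; `None` means the event is spread broadly enough that no single sector is called out.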
### Requirement 7: Global Event Storage and Queryability
**User Story:** As a data engineer, I want global event classifications and macro impact records stored in both the operational database and the analytical lake, so that I can query macro intelligence alongside company data.
#### Acceptance Criteria
1. WHEN a Global_Event classification is produced, THE System SHALL persist the classification record in PostgreSQL with fields for event_id, event_types, severity, affected_regions, affected_sectors, affected_commodities, summary, estimated_duration, confidence, source_document_id, and model metadata.
2. WHEN a macro impact record is computed, THE System SHALL persist it in PostgreSQL with fields for event_id, company_id, ticker, macro_impact_score, impact_direction, contributing_factors, confidence, and computed_at timestamp.
3. WHEN the Lake_Publisher runs, THE Lake_Publisher SHALL publish global event facts and macro impact facts as partitioned Parquet datasets to MinIO under the `stonks-lakehouse` bucket.
4. WHEN analytical queries join macro impact data with company trends, THE System SHALL support SQL joins between global_events, macro_impacts, trend_windows, and recommendations tables through Trino.
### Requirement 8: Dashboard Visibility
**User Story:** As an analyst, I want to see active global events, their severity, and which companies they impact through the web dashboard, so that I can understand the macro context behind trend shifts.
#### Acceptance Criteria
1. WHEN an analyst navigates to a new Global Events section, THE Dashboard SHALL display a filterable list of recent Global_Events with columns for event summary, impact types, severity badge, affected regions, affected sectors, and event date.
2. WHEN an analyst clicks a Global_Event, THE Dashboard SHALL display the full classification detail including all affected companies with their Macro_Impact_Scores, impact directions, and contributing factors.
3. WHEN an analyst views a company detail page, THE Dashboard SHALL display a macro exposure panel showing the company's Exposure_Profile and a list of active Global_Events affecting that company with their Macro_Impact_Scores.
4. WHEN an analyst views a trend summary, THE Dashboard SHALL visually distinguish macro-sourced evidence from company-specific evidence in the evidence chain.
5. WHEN an analyst views a recommendation, THE Dashboard SHALL display any macro signals that contributed to the recommendation with links back to the originating Global_Events.
### Requirement 9: Exposure Profile Auto-Inference
**User Story:** As an operator, I want the platform to automatically infer a baseline exposure profile from company filings and public data when I haven't manually configured one, so that macro interpolation works out of the box for newly tracked companies.
#### Acceptance Criteria
1. WHEN a company is tracked and has no manually configured Exposure_Profile, THE Event_Classifier SHALL attempt to infer a baseline profile from the company's most recent filing extractions, using geographic revenue breakdowns, supplier mentions, and commodity references found in the document intelligence.
2. WHEN the Event_Classifier infers an Exposure_Profile, THE Event_Classifier SHALL mark the profile as source `inferred` with a confidence score, distinguishing it from operator-configured profiles marked as source `manual`.
3. IF the Event_Classifier cannot infer a meaningful profile due to insufficient filing data, THEN THE Interpolation_Engine SHALL fall back to the sector-based default profile described in Requirement 3.2.
### Requirement 10: Macro Signal Suppression and Safety
**User Story:** As a risk owner, I want macro signals to be subject to quality controls so that low-confidence or stale global event classifications do not drive automated trading decisions.
#### Acceptance Criteria
1. WHEN a Global_Event classification has a confidence score below a configurable threshold (default 0.4), THE Interpolation_Engine SHALL exclude the event from macro impact computation and log the exclusion reason.
2. WHEN a Global_Event's estimated_duration is short_term and the event is older than 48 hours, THE Interpolation_Engine SHALL apply an accelerated decay factor to the event's macro impact signals.
3. WHEN macro signals are the sole basis for a trend direction change (no supporting company-specific signals), THE Recommendation_Engine SHALL mark the recommendation as informational only and append a macro-only caveat to the thesis.
4. IF the macro ingestion pipeline experiences sustained failures exceeding a configurable threshold, THEN THE System SHALL alert operators and continue producing recommendations using only company-specific signals.
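Criteria 1 and 2 can be sketched as a single weighting function. The thresholds (0.4 and 48 hours) are the defaults stated above; the exponential half-life form of the standard decay, the 72-hour half-life, and the 0.5 accelerated factor are assumptions, and `macro_event_weight` is a hypothetical name.

```python
import logging
from typing import Optional

logger = logging.getLogger("interpolation")


def standard_decay(age_hours: float, half_life_hours: float = 72.0) -> float:
    """Assumed recency decay: the weight halves every `half_life_hours`."""
    return 0.5 ** (max(0.0, age_hours) / half_life_hours)


def macro_event_weight(event_id: str, confidence: float, estimated_duration: str,
                       age_hours: float, confidence_threshold: float = 0.4,
                       short_term_staleness_hours: float = 48.0,
                       accelerated_factor: float = 0.5) -> Optional[float]:
    """Return the decay weight for an event, or None if it is excluded."""
    if confidence < confidence_threshold:
        logger.info("excluding event %s: confidence %.2f below threshold %.2f",
                    event_id, confidence, confidence_threshold)
        return None
    weight = standard_decay(age_hours)
    if estimated_duration == "short_term" and age_hours > short_term_staleness_hours:
        # Stale short-term events fade faster than the standard curve.
        weight *= accelerated_factor
    return weight
```

Multiplying by a factor below 1 keeps the accelerated weight strictly below the standard decay for the same age, which is the property the implementation plan asks the tests to verify.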
### Requirement 11: Macro Signal Layer Toggle
**User Story:** As an operator, I want to enable or disable the macro signal interpolation layer at runtime without redeploying services, so that I can control whether global news influences trend summaries and recommendations.
#### Acceptance Criteria
1. WHEN an operator toggles the macro signal layer via the Trading Controls page or the API, THE System SHALL persist the setting in the risk_configs table and apply it immediately to subsequent aggregation and recommendation cycles without requiring a service restart.
2. WHEN the macro signal layer is disabled, THE Aggregation_Engine SHALL skip all macro impact signals and produce trend summaries using only company-specific document intelligence, with no change to existing behavior.
3. WHEN the macro signal layer is disabled, THE Ingestion_Engine SHALL continue ingesting and classifying macro news articles so that historical macro data is preserved, but THE Interpolation_Engine SHALL skip macro-to-company impact computation.
4. WHEN the macro signal layer is re-enabled after being disabled, THE Interpolation_Engine SHALL resume computing macro impact scores using the most recent Global_Event classifications, including events ingested while the layer was disabled.
5. THE Query API SHALL expose a `GET /api/admin/macro/status` endpoint returning the current enabled/disabled state and a `PUT /api/admin/macro/toggle` endpoint to switch it.
6. THE Dashboard Trading Controls page SHALL display the macro signal layer toggle alongside the existing trading mode controls, with a confirmation dialog for state changes.
7. WHEN the macro signal layer state changes, THE System SHALL record an audit event with the previous state, new state, and the operator who made the change.
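The toggle semantics in criteria 1, 2, and 7 can be illustrated without any service code. In this sketch the `MacroToggle` object stands in for the `risk_configs` row, the list stands in for the audit store, and the default of `enabled=True` is an assumption; all names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MacroToggle:
    enabled: bool = True  # assumed default
    audit_log: list[dict] = field(default_factory=list)

    def set_enabled(self, new_state: bool, operator: str) -> bool:
        """Apply the new state; return True only if the state actually changed."""
        if new_state == self.enabled:
            return False  # no state change, so no audit event
        self.audit_log.append({
            "previous_state": self.enabled,
            "new_state": new_state,
            "operator": operator,
            "changed_at": datetime.now(timezone.utc).isoformat(),
        })
        self.enabled = new_state
        return True


def signals_for_cycle(toggle: MacroToggle, company_signals: list, macro_signals: list) -> list:
    """Read the toggle at the start of each cycle so a change needs no restart."""
    if toggle.enabled:
        return company_signals + macro_signals
    return list(company_signals)
```

Reading the state once per cycle, rather than caching it at startup, is what lets a toggle change take effect on the next aggregation run.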
### Requirement 12: Trend Projections
**User Story:** As a strategist, I want the platform to generate forward-looking trend projections that combine historical company-specific signals with active macro event trajectories, so that I can anticipate where a company's trend is heading rather than only seeing where it is now.
#### Acceptance Criteria
1. WHEN the Aggregation_Engine produces a trend summary for a company, THE Aggregation_Engine SHALL also compute a trend projection containing a projected_direction (bullish, bearish, mixed, neutral), projected_strength, projected_confidence, projection_horizon (1d, 7d, 30d), and a list of driving_factors explaining what is expected to push the trend in that direction.
2. WHEN computing a trend projection, THE Aggregation_Engine SHALL consider: the current trend trajectory and momentum (rate of change in strength over recent windows), active Global_Events with estimated_duration extending beyond the current window, the severity and decay profile of active macro signals, upcoming known catalysts from document intelligence (earnings dates, regulatory deadlines, product launches), and the historical pattern of how similar macro event types have resolved for companies with similar Exposure_Profiles.
3. WHEN a trend projection diverges from the current trend direction (e.g., current trend is bullish but projection is bearish), THE Aggregation_Engine SHALL flag the projection as a potential reversal signal and include the divergence reason in the driving_factors.
4. WHEN the macro signal layer is disabled, THE Aggregation_Engine SHALL still compute trend projections using only company-specific signal momentum and known upcoming catalysts, with reduced projection confidence.
5. WHEN a trend projection is produced, THE System SHALL persist it in PostgreSQL alongside the trend_window record with fields for projected_direction, projected_strength, projected_confidence, projection_horizon, driving_factors, macro_contribution_pct (percentage of projection driven by macro signals vs company-specific), and computed_at timestamp.
6. WHEN the Lake_Publisher runs, THE Lake_Publisher SHALL publish trend projection facts as a partitioned Parquet dataset to MinIO for analytical queries and backtesting.
7. WHEN an analyst views a trend summary on the Dashboard, THE Dashboard SHALL display the trend projection alongside the current trend with a visual indicator showing the projected direction and strength, and an expandable panel listing the driving factors.
8. WHEN a recommendation is generated, THE Recommendation_Engine SHALL incorporate the trend projection into the thesis and time_horizon fields, citing the projected direction and key driving factors.
9. WHEN a trend projection's confidence falls below a configurable threshold (default 0.3), THE System SHALL mark the projection as low_confidence and exclude it from influencing recommendation eligibility, while still displaying it as informational on the dashboard.
10. THE System SHALL expose a `GET /api/trends/{trend_id}/projection` endpoint returning the projection for a specific trend window, and include projection data in the existing `GET /api/trends` list response.
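The divergence flag (criterion 3) and the low-confidence marker (criterion 9) are simple enough to sketch directly. The `TrendProjection` fields mirror the names used above; the `momentum` formula (average change in strength per window) and the helper names are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TrendProjection:
    projected_direction: str     # bullish, bearish, mixed, neutral
    projected_strength: float
    projected_confidence: float
    projection_horizon: str      # 1d, 7d, 30d
    driving_factors: list[str] = field(default_factory=list)
    diverges_from_current: bool = False
    low_confidence: bool = False


def momentum(strengths: list[float]) -> float:
    """Average change in strength per window (oldest first); 0.0 with under two windows."""
    if len(strengths) < 2:
        return 0.0
    return (strengths[-1] - strengths[0]) / (len(strengths) - 1)


def finalize_projection(current_direction: str, projection: TrendProjection,
                        confidence_threshold: float = 0.3) -> TrendProjection:
    """Flag a potential reversal and mark projections below the confidence threshold."""
    if projection.projected_direction != current_direction:
        projection.diverges_from_current = True
        projection.driving_factors.append(
            f"potential reversal: current trend is {current_direction}, "
            f"projection is {projection.projected_direction}"
        )
    projection.low_confidence = projection.projected_confidence < confidence_threshold
    return projection
```

A projection marked `low_confidence` would still be stored and displayed, but downstream eligibility checks would ignore it.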
@@ -0,0 +1,338 @@
# Implementation Plan: Global News Interpolation Layer
## Overview
This plan implements a macro-level global news interpolation layer that ingests global/geopolitical news events, classifies them via Ollama, maps them to companies via exposure profiles, and feeds macro impact scores into the existing aggregation engine. The implementation extends existing services (extractor, aggregation, symbol registry, recommendation, API, lake publisher, dashboard) rather than creating new deployments. Tasks are ordered so each step builds on the previous, with property-based tests validating core scoring logic early.
## Tasks
- [x] 1. Database migration and shared schemas
- [x] 1.1 Create PostgreSQL migration `infra/migrations/016_global_news_interpolation.sql`
- Add `global_events` table with event_types, severity, affected_regions, affected_sectors, affected_commodities, summary, key_facts, estimated_duration, confidence, source_document_id FK, model metadata, created_at
- Add `macro_impact_records` table with event_id FK, company_id FK, ticker, macro_impact_score, impact_direction, contributing_factors, confidence, computed_at
- Add `exposure_profiles` table with company_id FK, geographic_revenue_mix, supply_chain_regions, key_input_commodities, regulatory_jurisdictions, market_position_tier, export_dependency_pct, source, confidence, version, active, created_at, updated_at
- Add `trend_projections` table with trend_window_id FK, projected_direction, projected_strength, projected_confidence, projection_horizon, driving_factors, macro_contribution_pct, diverges_from_current, computed_at
- Add indexes on `macro_impact_records(event_id)`, `macro_impact_records(company_id, computed_at)`, `macro_impact_records(ticker, computed_at)`, `exposure_profiles(company_id, active)`, `global_events(created_at)`, `trend_projections(trend_window_id)`
- _Requirements: 7.1, 7.2, 3.1, 12.5_
- [x] 1.2 Add new Pydantic schemas and enums to `services/shared/schemas.py`
- Add `ImpactType`, `SeverityLevel`, `MarketPositionTier`, `EstimatedDuration` enums
- Add `MACRO_EVENT = "macro_event"` to `DocumentType` enum
- Add `GlobalEventSchema`, `MacroImpactRecordSchema`, `ExposureProfileSchema`, `TrendProjectionSchema` Pydantic models
- _Requirements: 2.2, 4.5, 3.1, 12.1_
- [x] 1.3 Add macro-related Redis queue name to `services/shared/redis_keys.py`
- Add `QUEUE_MACRO_CLASSIFICATION = "macro_classification"` for event classification jobs
- _Requirements: 1.1_
- [x] 1.4 Add macro configuration fields to `services/shared/config.py`
- Add `macro_signal_weight`, `macro_enabled`, `macro_confidence_threshold`, `macro_short_term_staleness_hours`, `projection_confidence_threshold` fields to a new `MacroConfig` dataclass
- Add `macro: MacroConfig` to `AppConfig` with env var loading in `load_config()`
- _Requirements: 5.6, 10.1, 10.2, 12.9_
- [x] 2. Checkpoint — Ensure migration and schemas are consistent
- Ensure all tests pass; ask the user if questions arise.
- [x] 3. Event classifier module
- [x] 3.1 Implement `services/extractor/event_classifier.py`
- Implement `GlobalEvent` dataclass matching the design specification
- Implement `get_event_json_schema()` returning the Ollama structured output schema for event classification
- Implement `build_event_classification_prompt(text: str) -> str` with anti-hallucination instructions for macro event extraction
- Implement `classify_global_event(normalized_text, document_id, ollama_client) -> GlobalEvent` using the existing `OllamaClient` with retry logic
- Persist classification prompt, schema, model metadata, and raw output to MinIO under `stonks-llm-prompts/` and `stonks-llm-results/`
- Persist the `GlobalEvent` record to the `global_events` PostgreSQL table
- _Requirements: 2.1, 2.2, 2.3, 2.4, 2.5_
- [x] 3.2 Write property test for GlobalEvent schema completeness
- **Property 2: Macro pipeline output schema completeness**
- **Validates: Requirements 2.2, 4.5**
- [x] 3.3 Write property test for multiple impact types preserved
- **Property 3: Multiple impact types preserved**
- **Validates: Requirements 2.4**
- [x] 4. Exposure profile management
- [x] 4.1 Implement `services/symbol_registry/exposure.py`
- Implement `ExposureProfile` Pydantic model for API request/response
- Implement `GET /companies/{company_id}/exposure` endpoint returning the current active profile
- Implement `PUT /companies/{company_id}/exposure` endpoint that archives the previous version (sets `active=FALSE`) and inserts a new version with incremented version number
- Implement `GET /companies/{company_id}/exposure/history` endpoint returning all profile versions ordered by version descending
- Register routes on the Symbol Registry FastAPI app
- _Requirements: 3.1, 3.3, 3.4_
- [x] 4.2 Write property test for exposure profile version history
- **Property 6: Exposure profile version history**
- **Validates: Requirements 3.3**
- [x] 4.3 Write property test for default exposure profile derivation
- **Property 5: Default exposure profile derivation**
- **Validates: Requirements 3.2**
- [x] 5. Interpolation engine — core scoring logic
- [x] 5.1 Implement `services/aggregation/interpolation.py`
- Implement `MacroImpactRecord` dataclass matching the design specification
- Implement `compute_geographic_overlap(event_regions, revenue_mix) -> float` using revenue percentage weighting
- Implement `compute_supply_chain_overlap(event_regions, supply_regions) -> float` using set intersection ratio
- Implement `compute_commodity_overlap(event_commodities, company_commodities) -> float` using set intersection ratio
- Implement `apply_resilience_modifier(raw_score, tier, event_is_international) -> float` with tier multipliers: global_leader=0.7, multinational=0.85, regional=1.0, domestic=1.2
- Implement `compute_macro_impact(event: GlobalEvent, profile: ExposureProfile) -> MacroImpactRecord` using the scoring formula: `severity_weight * (0.35*geo + 0.25*supply + 0.25*commodity + 0.15*sector)` then resilience modifier
- Implement `build_default_profile(sector, industry, market_cap_bucket) -> ExposureProfile` for companies without manual profiles
- Handle zero-overlap case: return score 0.0 and skip further processing
- Handle mixed direction: when both positive and negative factors exist, set direction to 'mixed' and preserve both factor lists
- Persist `MacroImpactRecord` objects to the `macro_impact_records` PostgreSQL table
- _Requirements: 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 3.2_
- [x] 5.2 Write property test for macro impact score bounds and zero-overlap invariant
- **Property 7: Macro impact score bounds and zero-overlap invariant**
- **Validates: Requirements 4.1, 4.4**
- [x] 5.3 Write property test for scoring monotonicity
- **Property 8: Scoring monotonicity**
- **Validates: Requirements 4.2**
- [x] 5.4 Write property test for resilience modifier tier ordering
- **Property 9: Resilience modifier tier ordering**
- **Validates: Requirements 4.3**
- [x] 5.5 Write property test for mixed direction dual-effect events
- **Property 10: Mixed direction for dual-effect events**
- **Validates: Requirements 4.6**
- [x] 6. Checkpoint — Ensure core scoring logic and property tests pass
- Ensure all tests pass; ask the user if questions arise.
- [x] 7. Aggregation engine integration
- [x] 7.1 Extend `services/aggregation/worker.py` to incorporate macro signals
- Add `macro_signal_weight` and `macro_enabled` fields to `AggregationConfig`
- In `aggregate_company_window`, check macro toggle state from `risk_configs` table
- Fetch `macro_impact_records` for the ticker within the aggregation window
- Convert each `MacroImpactRecord` to a `WeightedSignal` using: `document_id=event.source_document_id`, `sentiment_value` mapped from `impact_direction`, `impact_score=macro_impact_score * macro_signal_weight`, recency decay from event publication time, confidence gating from macro record confidence
- Merge macro signals with company-specific signals before computing trend direction, strength, confidence, and contradiction score
- Include contributing `GlobalEvent` source_document_ids in evidence references
- When macro layer is disabled or no macro data exists, produce identical output to company-only aggregation
- _Requirements: 5.1, 5.2, 5.3, 5.4, 5.5, 5.6_
- [x] 7.2 Write property test for macro signals influencing trend output
- **Property 11: Macro signals influence trend output**
- **Validates: Requirements 5.1**
- [x] 7.3 Write property test for macro-company contradiction detection
- **Property 12: Macro-company contradiction detection**
- **Validates: Requirements 5.3**
- [x] 7.4 Write property test for macro evidence traceability
- **Property 13: Macro evidence traceability**
- **Validates: Requirements 5.4**
- [x] 7.5 Write property test for no degradation without macro data and disabled-layer equivalence
- **Property 14: No degradation without macro data and disabled-layer equivalence**
- **Validates: Requirements 5.5, 11.2**
- [x] 8. Sector and market rollup enhancement
- [x] 8.1 Extend sector and market rollup logic in `services/aggregation/worker.py`
- When computing sector-level rollups, incorporate macro impact signals affecting the sector weighted by constituent company exposure
- When computing market-level rollups, aggregate macro signals across all sectors reflecting breadth and severity
- When a GlobalEvent disproportionately affects one sector (>60% of total macro impact), surface that sector in `material_risks` or `dominant_catalysts` of the market-level rollup
- _Requirements: 6.1, 6.2, 6.3_
- [x] 8.2 Write property test for sector and market rollup macro incorporation
- **Property 15: Sector and market rollup macro incorporation**
- **Validates: Requirements 6.1, 6.2, 6.3**
- [x] 9. Trend projection module
- [x] 9.1 Implement `services/aggregation/projection.py`
- Implement `TrendProjection` dataclass matching the design specification
- Implement projection logic: compute trend momentum (rate of change in strength across recent windows), project macro signal decay based on `estimated_duration` and severity, factor in upcoming catalysts from document intelligence, combine into projected direction/strength/confidence
- Flag divergence when the projected direction differs from the current trend direction, and include the divergence reason in `driving_factors`
- When macro layer is disabled, compute projections using only company-specific momentum with reduced confidence
- Mark projections with `projected_confidence` below threshold (default 0.3) as `low_confidence`
- Persist `TrendProjection` to the `trend_projections` PostgreSQL table alongside the trend_window record
- Call projection computation from `aggregate_company_window` after trend summary is assembled
- _Requirements: 12.1, 12.2, 12.3, 12.4, 12.5, 12.9_
- [x] 9.2 Write property test for trend projection always produced
- **Property 20: Trend projection always produced**
- **Validates: Requirements 12.1**
- [x] 9.3 Write property test for projection divergence flagging
- **Property 21: Projection divergence flagging**
- **Validates: Requirements 12.3**
- [x] 9.4 Write property test for macro-disabled projections have reduced confidence
- **Property 22: Macro-disabled projections have reduced confidence**
- **Validates: Requirements 12.4**
- [x] 9.5 Write property test for low-confidence projection exclusion
- **Property 23: Low-confidence projection exclusion**
- **Validates: Requirements 12.9**
- [x] 10. Checkpoint — Ensure aggregation integration and projections work correctly
- Ensure all tests pass; ask the user if questions arise.
- [x] 11. Macro signal suppression and safety
- [x] 11.1 Implement exposure profile auto-inference in `services/extractor/exposure_inference.py`
- Implement `infer_exposure_profile(document_intelligences, sector, industry, market_cap_bucket) -> ExposureProfile`
- Scan recent filing extractions for geographic revenue breakdowns, supplier mentions, and commodity references
- Produce profile with `source='inferred'` and a confidence score reflecting data quality
- Fall back to sector-based default profile when insufficient filing data
- _Requirements: 9.1, 9.2, 9.3_
- [x] 11.2 Write property test for inferred exposure profile correctness
- **Property 16: Inferred exposure profile correctness**
- **Validates: Requirements 9.1, 9.2**
- [x] 11.3 Extend `services/recommendation/suppression.py` with macro-only suppression
- Add `MACRO_ONLY_SIGNAL = "macro_only_signal"` to `SuppressionReason` enum
- Implement `evaluate_macro_only_suppression(summary, macro_signal_count, company_signal_count) -> bool`
- When macro signals are the sole basis for a trend direction change, force recommendation to `mode='informational'` and append macro-only caveat to thesis
- _Requirements: 10.3_
- [x] 11.4 Write property test for macro-only recommendation suppression
- **Property 19: Macro-only recommendation suppression**
- **Validates: Requirements 10.3**
- [x] 11.5 Implement low-confidence event exclusion and accelerated decay in interpolation engine
- In `services/aggregation/interpolation.py`, skip events whose confidence is below a configurable threshold (default 0.4) and log the exclusion reason
- Apply an accelerated decay factor to `short_term` events older than 48 hours, so that their effective weight is strictly less than the weight standard recency decay alone would give
- _Requirements: 10.1, 10.2_
- [x] 11.6 Write property test for low-confidence event exclusion
- **Property 17: Low-confidence event exclusion**
- **Validates: Requirements 10.1**
- [x] 11.7 Write property test for accelerated decay for stale short-term events
- **Property 18: Accelerated decay for stale short-term events**
- **Validates: Requirements 10.2**
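
The gating and decay rules in 11.5 can be sketched as below. This is a minimal illustration, not the contents of `interpolation.py`: only the 0.4 default threshold and the 48-hour cutoff come from the task list. The exponential form of the decay, both half-life constants, and the `ClassifiedEvent` type are assumptions made for the example.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("interpolation_sketch")

MIN_CONFIDENCE = 0.4           # default threshold from task 11.5
STALE_AFTER_HOURS = 48.0       # cutoff from task 11.5
HALF_LIFE_HOURS = 72.0         # assumed: half-life of standard recency decay
STALE_HALF_LIFE_HOURS = 24.0   # assumed: extra half-life applied past the cutoff


@dataclass
class ClassifiedEvent:
    """Hypothetical stand-in for a classified global event."""
    event_id: str
    confidence: float
    time_horizon: str  # e.g. "short_term"
    age_hours: float


def standard_decay(age_hours: float) -> float:
    """Assumed standard recency decay: weight halves every HALF_LIFE_HOURS."""
    return 0.5 ** (age_hours / HALF_LIFE_HOURS)


def effective_weight(event: ClassifiedEvent) -> Optional[float]:
    """Return the event's weight, or None if it is excluded for low confidence."""
    if event.confidence < MIN_CONFIDENCE:
        logger.info(
            "excluding event %s: confidence %.2f below threshold %.2f",
            event.event_id, event.confidence, MIN_CONFIDENCE,
        )
        return None
    weight = standard_decay(event.age_hours)
    if event.time_horizon == "short_term" and event.age_hours > STALE_AFTER_HOURS:
        # The extra factor is < 1 for any age past the cutoff, so the result
        # is strictly below the standard decay weight.
        excess = event.age_hours - STALE_AFTER_HOURS
        weight *= 0.5 ** (excess / STALE_HALF_LIFE_HOURS)
    return weight
```

Multiplying the standard weight by a factor that is strictly below 1 past the cutoff is one simple way to guarantee the inequality that Property 18 checks; any other decay shape with the same guarantee would satisfy the requirement equally well.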
- [x] 12. Macro signal layer toggle and API endpoints
- [x] 12.1 Implement macro toggle and status endpoints in `services/api/app.py`
- Add `GET /api/admin/macro/status` returning current enabled/disabled state from `risk_configs` table
- Add `PUT /api/admin/macro/toggle` to switch macro layer on/off, persisting to `risk_configs` and recording an audit event with previous state, new state, and operator
- Toggle state is read from PostgreSQL at the start of each aggregation cycle (no caching)
- _Requirements: 11.1, 11.5, 11.7_
- [x] 12.2 Implement macro event and impact query endpoints in `services/api/app.py`
- Add `GET /api/macro/events` — list recent global events with filtering by severity, region, sector, date range
- Add `GET /api/macro/events/{event_id}` — event detail with list of affected companies and their macro impact scores
- Add `GET /api/macro/impacts/{ticker}` — macro impacts for a specific company
- Add `GET /api/trends/{trend_id}/projection` — trend projection for a specific trend window
- Include projection data in existing `GET /api/trends` list response
- _Requirements: 8.1, 8.2, 12.10_
- [x] 12.3 Ensure macro ingestion continues when layer is disabled
- When macro layer is disabled, ingestion and classification continue (historical data preserved), but interpolation and aggregation integration are skipped
- When re-enabled, resume computing macro impact scores using most recent classifications including events ingested while disabled
- _Requirements: 11.2, 11.3, 11.4_
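
The toggle semantics in 12.1 and 12.3 can be illustrated with a small sketch. The function and variable names here are hypothetical, and an in-memory dict stands in for the `risk_configs` row; the point is only that the flag is re-read on every cycle and that classifications stored while the layer is disabled are picked up once it is re-enabled.

```python
from typing import Callable, List


def collect_signals(
    read_macro_enabled: Callable[[], bool],
    company_signals: List[str],
    load_macro_signals: Callable[[], List[str]],
) -> List[str]:
    """Gather the signals for one aggregation cycle (illustrative only)."""
    signals = list(company_signals)
    # The flag is read on every call, mirroring "no caching" in task 12.1.
    if read_macro_enabled():
        signals.extend(load_macro_signals())
    return signals


# In-memory stand-ins for the persisted toggle and stored classifications.
state = {"macro_enabled": False}
classified_events: List[str] = []

# Ingestion and classification keep running while the layer is disabled.
classified_events.append("macro:event-1")

disabled_run = collect_signals(
    lambda: state["macro_enabled"], ["company:doc-1"], lambda: list(classified_events)
)

# After re-enabling, the next cycle sees the event classified while disabled.
state["macro_enabled"] = True
enabled_run = collect_signals(
    lambda: state["macro_enabled"], ["company:doc-1"], lambda: list(classified_events)
)
```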
- [x] 13. Checkpoint — Ensure API endpoints and toggle logic work correctly
- Ensure all tests pass, ask the user if questions arise.
- [x] 14. Lake publisher extensions
- [x] 14.1 Add macro fact publishers to the lake publisher service
- Implement `publish_global_event_fact` writing partitioned Parquet datasets to `stonks-lakehouse/warehouse/global_events/dt={date}/`
- Implement `publish_macro_impact_fact` writing partitioned Parquet datasets to `stonks-lakehouse/warehouse/macro_impacts/dt={date}/ticker={ticker}/`
- Implement `publish_trend_projection_fact` writing partitioned Parquet datasets to `stonks-lakehouse/warehouse/trend_projections/dt={date}/ticker={ticker}/`
- Register new fact types in the lake publisher's job processing loop
- _Requirements: 7.3, 12.6_
- [x] 14.2 Write property test for macro data persistence round-trip
- **Property 4: Macro data persistence round-trip**
- **Validates: Requirements 3.1, 7.1, 7.2, 12.5**
- [x] 14.3 Write property test for content hash stability and uniqueness
- **Property 1: Content hash stability and uniqueness**
- **Validates: Requirements 1.2**
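
The partition layouts listed in 14.1 can be captured in a single path helper, sketched below. The helper name `partition_prefix` and the ISO `YYYY-MM-DD` rendering of `{date}` are assumptions; the warehouse root, dataset names, and partition keys are taken from the task descriptions.

```python
from datetime import date
from typing import Optional

WAREHOUSE_ROOT = "stonks-lakehouse/warehouse"
DATE_ONLY_DATASETS = {"global_events"}
TICKER_DATASETS = {"macro_impacts", "trend_projections"}


def partition_prefix(dataset: str, dt: date, ticker: Optional[str] = None) -> str:
    """Build the object-store prefix for one partition (hypothetical helper)."""
    # Assumed date format: ISO 8601 calendar date, e.g. 2026-04-14.
    base = f"{WAREHOUSE_ROOT}/{dataset}/dt={dt.isoformat()}/"
    if dataset in DATE_ONLY_DATASETS:
        if ticker is not None:
            raise ValueError(f"{dataset} is not partitioned by ticker")
        return base
    if dataset in TICKER_DATASETS:
        if not ticker:
            raise ValueError(f"{dataset} requires a ticker partition")
        return f"{base}ticker={ticker}/"
    raise ValueError(f"unknown dataset: {dataset}")
```

For example, `partition_prefix("macro_impacts", date(2026, 4, 14), "AAPL")` yields `stonks-lakehouse/warehouse/macro_impacts/dt=2026-04-14/ticker=AAPL/`. Keeping path construction in one place lets the round-trip property test and the publishers agree on the layout.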
- [x] 15. Macro ingestion pipeline wiring
- [x] 15.1 Wire macro source ingestion into the scheduler and ingestion worker
- Configure the scheduler to trigger macro news source fetches on the configured polling interval
- Have the ingestion worker store raw payloads in MinIO under the `stonks-raw-news/macro/` prefix
- Write metadata records to PostgreSQL with `document_type='macro_event'`
- Deduplicate by content hash, consistent with existing behavior
- Handle source failures with the same retry policy used for existing sources
- _Requirements: 1.1, 1.2, 1.3, 1.4_
- [x] 15.2 Wire event classification into the extractor worker
- After parsing, route `macro_event` documents to `event_classifier.classify_global_event()` instead of standard document extraction
- After classification, trigger interpolation for all tracked companies via aggregation queue
- _Requirements: 2.1, 2.2, 2.3_
- [x] 15.3 Wire interpolation into the aggregation pipeline
- After event classification, load exposure profiles for all tracked companies (manual, inferred, or default)
- Compute `MacroImpactRecord` for each company with non-zero overlap
- Persist records and trigger aggregation for affected tickers
- Handle sustained macro ingestion failures: alert operators and continue with company-only signals
- _Requirements: 4.1, 4.5, 10.4_
- [x] 16. Checkpoint — Ensure full backend pipeline works end-to-end
- Ensure all tests pass, ask the user if questions arise.
- [x] 17. Dashboard — Global Events page and macro exposure panel
- [x] 17.1 Create Global Events list page at `frontend/src/pages/GlobalEvents.tsx`
- Filterable list of recent global events with columns: summary, impact types, severity badge, affected regions, affected sectors, event date
- Add API hooks for `GET /api/macro/events` in `frontend/src/api/hooks.ts`
- Add route `/macro/events` in `frontend/src/routes.tsx`
- Add navigation entry in sidebar in `frontend/src/components/AppLayout.tsx`
- _Requirements: 8.1_
- [x] 17.2 Create Global Event detail page at `frontend/src/pages/GlobalEventDetail.tsx`
- Display full classification detail: all affected companies with Macro_Impact_Scores, impact directions, contributing factors
- Add API hook for `GET /api/macro/events/{event_id}`
- Add route `/macro/events/:id` in `frontend/src/routes.tsx`
- _Requirements: 8.2_
- [x] 17.3 Add macro exposure panel to Company Detail page
- On `frontend/src/pages/CompanyDetail.tsx`, add a new tab/panel showing the company's Exposure_Profile and active GlobalEvents affecting the company with their Macro_Impact_Scores
- Add API hook for `GET /api/macro/impacts/{ticker}`
- _Requirements: 8.3_
- [x] 17.4 Add macro evidence indicators to Trend and Recommendation detail pages
- On `frontend/src/pages/TrendDetail.tsx`, visually distinguish macro-sourced evidence from company-specific evidence in the evidence chain
- On `frontend/src/pages/RecommendationDetail.tsx`, display macro signals that contributed with links back to originating GlobalEvents
- _Requirements: 8.4, 8.5_
- [x] 17.5 Add trend projection display to Trend detail page
- On `frontend/src/pages/TrendDetail.tsx`, display projected direction/strength alongside current trend with visual indicator and expandable driving factors panel
- Add API hook for `GET /api/trends/{trend_id}/projection`
- _Requirements: 12.7_
- [x] 17.6 Add macro toggle to Trading Controls page
- On `frontend/src/pages/Trading.tsx`, add macro signal layer enable/disable switch with confirmation dialog
- Add API hooks for `GET /api/admin/macro/status` and `PUT /api/admin/macro/toggle`
- _Requirements: 11.5, 11.6_
- [x] 18. Checkpoint — Ensure frontend pages render and integrate with API
- Ensure all tests pass, ask the user if questions arise.
- [x] 19. Integration wiring and final validation
- [x] 19.1 Add recommendation engine integration for trend projections
- Incorporate trend projection into recommendation thesis and time_horizon fields, citing projected direction and key driving factors
- Exclude low-confidence projections from influencing recommendation eligibility
- _Requirements: 12.8, 12.9_
- [x] 19.2 Write integration tests for macro pipeline end-to-end
- Test macro article ingestion → parsing → classification → interpolation → aggregation flow
- Test lake publisher writes correct Parquet partitions for global events and macro impacts
- Test macro toggle state change propagates to next aggregation cycle
- _Requirements: 1.1, 2.1, 4.1, 5.1, 7.3, 11.1_
- [x] 19.3 Write unit tests for API endpoints and dashboard components
- Test macro event list/detail endpoints return correct data
- Test macro toggle endpoint persists state and records audit event
- Test trend projection endpoint returns projection data
- Add MSW handlers for macro endpoints in `frontend/src/test/mocks/handlers.ts`
- Test GlobalEvents page and macro exposure panel render correctly
- _Requirements: 8.1, 8.2, 11.5, 12.10_
- [x] 20. Final checkpoint — Ensure all tests pass
- Ensure all tests pass, ask the user if questions arise.
## Notes
- Tasks marked with `*` are optional and can be skipped for a faster MVP
- Each task references specific requirements for traceability
- Checkpoints ensure incremental validation after each major phase
- Property tests validate the 23 correctness properties from the design using Hypothesis
- Backend modules use Python throughout, so no language selection is needed; the dashboard tasks extend the existing TypeScript frontend
- No new Kubernetes deployments required; all modules extend existing services
- Next migration number is 016