Celes Renata f7a11d14ea feat: competitive intelligence & historical pattern matching layer

2026-04-14 19:42:48 +00:00

Global News Interpolation Layer — Design

Overview

This design extends the Stonks Oracle platform with a macro-level global news interpolation layer. The layer introduces a parallel signal path that ingests global/geopolitical news events, classifies them by impact type and severity using Ollama, maps them to individual companies via exposure profiles, and feeds the resulting macro impact scores into the existing aggregation engine as weighted signals alongside company-specific document intelligence.

The design integrates with the existing service architecture — no new Kubernetes deployments are required. The event classifier reuses the extractor service's Ollama client, the interpolation engine runs within the aggregation worker, and exposure profiles are managed through the symbol registry API. A runtime toggle allows operators to enable/disable the macro signal layer without redeployment.

Design Rationale

Reuse over new services: The macro pipeline reuses existing ingestion, parsing, extraction, aggregation, and lake publisher infrastructure. New logic is added as modules within existing services rather than standalone deployments.
Exposure-driven specificity: Rather than applying a blanket macro sentiment to all companies, the system computes company-specific impact scores based on geographic revenue mix, supply chain exposure, and commodity dependencies.
Safety-first: Macro signals are subject to confidence gating, staleness decay, and a dedicated runtime toggle. Macro-only trend shifts are forced to informational mode.
Auditability: Every macro impact score is traceable from the originating global event through the classification, exposure profile overlap, and final weighted contribution to the trend summary.

Architecture

The macro interpolation layer adds new logical components that run within existing services:

flowchart TD
    subgraph Ingestion["Ingestion Service (existing)"]
        MS[Macro Source Adapter]
    end

    subgraph Parser["Parser Service (existing)"]
        MP[Macro Article Parser]
    end

    subgraph Extractor["Extractor Service (existing)"]
        EC[Event Classifier Module]
    end

    subgraph SymReg["Symbol Registry (existing)"]
        EP[Exposure Profile CRUD]
    end

    subgraph Aggregation["Aggregation Service (existing)"]
        IE[Interpolation Engine]
        AE[Aggregation Engine]
        TP[Trend Projections]
    end

    subgraph Recommendation["Recommendation Service (existing)"]
        RE[Macro-Aware Recommendations]
    end

    subgraph LakePublisher["Lake Publisher (existing)"]
        LP[Macro Fact Publisher]
    end

    subgraph QueryAPI["Query API (existing)"]
        MA[Macro API Endpoints]
        MT[Macro Toggle Endpoint]
    end

    subgraph Dashboard["Dashboard (existing)"]
        GEP[Global Events Page]
        MEP[Macro Exposure Panel]
    end

    MS -->|raw macro articles| MP
    MP -->|normalized text| EC
    EC -->|Global_Event classification| IE
    EP -->|Exposure_Profiles| IE
    IE -->|macro impact signals| AE
    AE -->|trend summaries + projections| TP
    TP --> RE
    EC -->|event facts| LP
    IE -->|impact facts| LP
    MT -->|toggle state| AE
    MA --> GEP
    MA --> MEP

Data Flow

Ingestion: Scheduler triggers macro source fetches. The existing ingestion worker fetches from configured macro news sources and stores raw payloads in MinIO under stonks-raw-news/macro/. Metadata records use document_type = 'macro_event'.
Parsing: The existing parser normalizes macro articles identically to company-specific articles. No parser changes needed — the parser is document-type agnostic.
Classification: A new event_classifier module in the extractor service uses a dedicated Ollama prompt and JSON schema to produce GlobalEvent classification objects. The module reuses the existing OllamaClient for inference and retry logic.
Interpolation: A new interpolation module in the aggregation service loads company exposure profiles, computes overlap scores against each classified event, and produces MacroImpactRecord objects. These are stored in PostgreSQL and fed into the aggregation engine as additional weighted signals.
Aggregation: The existing aggregate_company_window function is extended to fetch macro impact records alongside document impact records. Macro signals use the same WeightedSignal abstraction with recency decay, confidence gating, and contradiction detection.
Trend Projections: A new projection module computes forward-looking trend estimates by combining current trend momentum with active macro event trajectories and known upcoming catalysts.
Recommendation: The recommendation engine picks up macro signals through the trend summary, so its core logic needs no changes. The only addition is a suppression check that forces macro-only trend shifts to informational mode.
Lake Publication: New publish_global_event_fact and publish_macro_impact_fact functions in the lake publisher write partitioned Parquet datasets for analytical queries.

Components and Interfaces

Event Classifier Module

Location: services/extractor/event_classifier.py

This module classifies macro news articles into structured GlobalEvent objects using Ollama.

@dataclass
class GlobalEvent:
    event_id: str                    # UUID
    event_types: list[str]           # Impact_Type values
    severity: str                    # Severity_Level: low|moderate|high|critical
    affected_regions: list[str]      # ISO 3166-1 alpha-2 codes or region names
    affected_sectors: list[str]      # GICS sector identifiers
    affected_commodities: list[str]  # commodity identifiers when applicable
    summary: str
    key_facts: list[str]
    estimated_duration: str          # short_term|medium_term|long_term
    confidence: float                # [0, 1]
    source_document_id: str          # FK to documents table
    model_metadata: ModelMetadata

Interface:

classify_global_event(normalized_text: str, document_id: str, ollama_client: OllamaClient) -> GlobalEvent
build_event_classification_prompt(text: str) -> str
get_event_json_schema() -> dict

Ollama Integration: Uses the existing OllamaClient with a dedicated prompt template (event-classification-v1) and JSON schema. Retries follow the same policy as document extraction.
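
To illustrate the validation step that sits between the raw Ollama response and a GlobalEvent, here is a minimal sketch that checks a response against the enum values and required fields listed in this design. The function name validate_classification, the error messages, and the list-of-errors return shape are assumptions made for illustration; they are not part of the extractor service.

```python
import json

# Allowed values copied from the ImpactType, SeverityLevel, and
# EstimatedDuration enums defined in this design.
IMPACT_TYPES = {
    "supply_disruption", "demand_shift", "cost_increase", "regulatory_pressure",
    "currency_impact", "commodity_shock", "trade_barrier", "geopolitical_risk",
}
SEVERITIES = {"low", "moderate", "high", "critical"}
DURATIONS = {"short_term", "medium_term", "long_term"}
REQUIRED_FIELDS = (
    "event_types", "severity", "affected_regions", "affected_sectors",
    "summary", "estimated_duration", "confidence",
)


def validate_classification(raw: str) -> list[str]:
    """Return validation errors for a raw model response (empty list = valid)."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]

    # Stop early on missing fields so the checks below can index safely.
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in payload]
    if errors:
        return errors

    event_types = payload["event_types"]
    if not isinstance(event_types, list) or not event_types:
        errors.append("event_types must be a non-empty list")
    elif any(not isinstance(t, str) or t not in IMPACT_TYPES for t in event_types):
        errors.append("event_types contains an unknown impact type")

    severity = payload["severity"]
    if not isinstance(severity, str) or severity not in SEVERITIES:
        errors.append("severity is not a known severity level")

    duration = payload["estimated_duration"]
    if not isinstance(duration, str) or duration not in DURATIONS:
        errors.append("estimated_duration is not a known duration")

    confidence = payload["confidence"]
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        errors.append("confidence must be a number in [0, 1]")
    return errors
```

A non-empty result would be the trigger for the retry policy; the real module may instead rely on the JSON schema returned by get_event_json_schema().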

Exposure Profile Management

Location: services/symbol_registry/exposure.py

This module adds endpoints to the Symbol Registry API for managing company exposure profiles.

class ExposureProfile(BaseModel):
    company_id: str
    geographic_revenue_mix: dict[str, float]   # region_code -> pct (0-1)
    supply_chain_regions: list[str]            # region codes
    key_input_commodities: list[str]           # commodity identifiers
    regulatory_jurisdictions: list[str]        # jurisdiction codes
    market_position_tier: str                  # global_leader|multinational|regional|domestic
    export_dependency_pct: float               # 0-1
    source: str                                # "manual" | "inferred"
    confidence: float                          # [0, 1], relevant for inferred profiles
    version: int                               # auto-incremented on update

API Endpoints (on Symbol Registry):

GET /companies/{company_id}/exposure — get current profile
PUT /companies/{company_id}/exposure — create/update profile (archives previous version)
GET /companies/{company_id}/exposure/history — list profile versions
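
The archive-on-update behaviour of the PUT endpoint can be sketched with an in-memory store. This is an illustration only: InMemoryExposureStore and ProfileVersion are hypothetical names, the profile is reduced to a single field, and the real implementation would use the exposure_profiles table with its version and active columns.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ProfileVersion:
    """Hypothetical, trimmed-down profile row (one field instead of the full profile)."""
    company_id: str
    market_position_tier: str
    version: int
    active: bool = True


class InMemoryExposureStore:
    """Keeps every version per company; only the newest one is active."""

    def __init__(self) -> None:
        self._rows: dict[str, list[ProfileVersion]] = {}

    def put(self, company_id: str, market_position_tier: str) -> ProfileVersion:
        history = self._rows.setdefault(company_id, [])
        if history:
            # Archive the previous version instead of overwriting it.
            history[-1] = replace(history[-1], active=False)
        new_version = ProfileVersion(company_id, market_position_tier, version=len(history) + 1)
        history.append(new_version)
        return new_version

    def get(self, company_id: str) -> Optional[ProfileVersion]:
        history = self._rows.get(company_id, [])
        return history[-1] if history else None

    def history(self, company_id: str) -> list[ProfileVersion]:
        return list(self._rows.get(company_id, []))
```

With this shape, N calls to put leave exactly N history records with versions 1 through N, which is the behaviour the version-history property expects.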

Interpolation Engine

Location: services/aggregation/interpolation.py

Computes per-company macro impact scores by evaluating overlap between global event classifications and company exposure profiles.

@dataclass
class MacroImpactRecord:
    event_id: str
    company_id: str
    ticker: str
    macro_impact_score: float        # [0, 1]
    impact_direction: str            # positive|negative|mixed
    contributing_factors: list[str]  # which profile dimensions matched
    confidence: float                # [0, 1]
    computed_at: datetime

Core Functions:

compute_macro_impact(event: GlobalEvent, profile: ExposureProfile) -> MacroImpactRecord
compute_geographic_overlap(event_regions: list[str], revenue_mix: dict[str, float]) -> float
compute_supply_chain_overlap(event_regions: list[str], supply_regions: list[str]) -> float
compute_commodity_overlap(event_commodities: list[str], company_commodities: list[str]) -> float
apply_resilience_modifier(raw_score: float, tier: str, event_is_international: bool) -> float
build_default_profile(sector: str, industry: str, market_cap_bucket: str) -> ExposureProfile

Scoring Formula:

raw_score = severity_weight * (
    geo_weight * geographic_overlap +
    supply_weight * supply_chain_overlap +
    commodity_weight * commodity_overlap +
    sector_weight * sector_match
)
final_score = min(1.0, apply_resilience_modifier(raw_score, market_position_tier, event_is_international))  # capped so the score stays in [0, 1]

Where:

severity_weight: critical=1.0, high=0.75, moderate=0.5, low=0.25
geo_weight=0.35, supply_weight=0.25, commodity_weight=0.25, sector_weight=0.15
Resilience modifiers: global_leader=0.7, multinational=0.85, regional=1.0, domestic=1.2 (for international events)
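
A minimal runnable sketch of the formula and weights above. Several details are assumptions, since this design does not pin them down: supply-chain and commodity overlap are taken as the fraction of the company's items hit by the event, non-international events get a modifier of 1.0, the final score is clamped to [0, 1], and sector_match is passed in as a precomputed 0-to-1 value. The helper names _set_overlap and score are hypothetical.

```python
SEVERITY_WEIGHT = {"critical": 1.0, "high": 0.75, "moderate": 0.5, "low": 0.25}
GEO_WEIGHT, SUPPLY_WEIGHT, COMMODITY_WEIGHT, SECTOR_WEIGHT = 0.35, 0.25, 0.25, 0.15
RESILIENCE = {"global_leader": 0.7, "multinational": 0.85, "regional": 1.0, "domestic": 1.2}


def compute_geographic_overlap(event_regions: list[str], revenue_mix: dict[str, float]) -> float:
    """Share of revenue earned in affected regions, capped at 1.0."""
    affected = set(event_regions)
    return min(1.0, sum(pct for region, pct in revenue_mix.items() if region in affected))


def _set_overlap(event_items: list[str], company_items: list[str]) -> float:
    """Fraction of the company's items hit by the event (assumed definition)."""
    company = set(company_items)
    if not company:
        return 0.0
    return len(company & set(event_items)) / len(company)


def apply_resilience_modifier(raw_score: float, tier: str, event_is_international: bool) -> float:
    # Assumption: the tier modifiers only apply to international events.
    modifier = RESILIENCE[tier] if event_is_international else 1.0
    # Clamp so a domestic-tier modifier of 1.2 cannot push the score above 1.
    return min(1.0, max(0.0, raw_score * modifier))


def score(severity: str, geo: float, supply: float, commodity: float,
          sector_match: float, tier: str, event_is_international: bool) -> float:
    """Weighted overlap sum scaled by severity, then adjusted for resilience."""
    raw = SEVERITY_WEIGHT[severity] * (
        GEO_WEIGHT * geo
        + SUPPLY_WEIGHT * supply
        + COMMODITY_WEIGHT * commodity
        + SECTOR_WEIGHT * sector_match
    )
    return apply_resilience_modifier(raw, tier, event_is_international)


geo = compute_geographic_overlap(["CN"], {"CN": 0.4, "US": 0.6})
supply = _set_overlap(["CN"], ["CN", "VN"])
commodity = _set_overlap([], ["copper"])
final = score("high", geo, supply, commodity, 1.0, "multinational", True)
```

In this example a high-severity event touching 40% of revenue and half of the supply regions yields a final score of about 0.26 for a multinational company.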

Aggregation Engine Extensions

Location: Modified services/aggregation/worker.py

The existing aggregate_company_window function is extended to:

Check the macro signal layer toggle (from risk_configs table)
Fetch macro impact records for the ticker within the window
Convert macro impact records to WeightedSignal objects using the same scoring pipeline
Merge macro signals with company-specific signals before computing the trend summary
Apply macro_signal_weight (default 0.3) to control relative influence

New config fields on AggregationConfig:

macro_signal_weight: float = 0.3  # relative weight of macro vs company signals
macro_enabled: bool = True         # runtime toggle state

Macro signal conversion: Each MacroImpactRecord is converted to a WeightedSignal using:

document_id = event's source_document_id (for evidence tracing)
sentiment_value = mapped from impact_direction (positive=+1, negative=-1, mixed=0)
impact_score = macro_impact_score * macro_signal_weight
Recency decay uses the global event's publication time
Confidence gating uses the macro impact record's confidence
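
The conversion rules above can be sketched as follows. MacroSignal is a hypothetical stand-in for the existing WeightedSignal, whose real fields are not shown in this document, and to_weighted_signal is an illustrative name.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Direction-to-sentiment mapping as listed in the conversion rules.
SENTIMENT_BY_DIRECTION = {"positive": 1.0, "negative": -1.0, "mixed": 0.0}


@dataclass
class MacroSignal:
    """Hypothetical stand-in for WeightedSignal."""
    document_id: str         # event's source_document_id, kept for evidence tracing
    sentiment_value: float   # +1, -1, or 0
    impact_score: float      # macro_impact_score scaled by macro_signal_weight
    confidence: float        # used for confidence gating
    published_at: datetime   # global event's publication time, used for recency decay


def to_weighted_signal(macro_impact_score: float, impact_direction: str, confidence: float,
                       source_document_id: str, event_published_at: datetime,
                       macro_signal_weight: float = 0.3) -> MacroSignal:
    return MacroSignal(
        document_id=source_document_id,
        sentiment_value=SENTIMENT_BY_DIRECTION[impact_direction],
        impact_score=macro_impact_score * macro_signal_weight,
        confidence=confidence,
        published_at=event_published_at,
    )


signal = to_weighted_signal(0.5, "negative", 0.9, "doc-1",
                            datetime(2026, 4, 1, tzinfo=timezone.utc))
```

Because the weight is applied at conversion time, lowering macro_signal_weight shrinks every macro signal's influence without touching the stored MacroImpactRecord rows.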

Trend Projection Module

Location: services/aggregation/projection.py

Computes forward-looking trend projections alongside current trend summaries.

@dataclass
class TrendProjection:
    projected_direction: str         # bullish|bearish|mixed|neutral
    projected_strength: float        # [0, 1]
    projected_confidence: float      # [0, 1]
    projection_horizon: str          # 1d|7d|30d
    driving_factors: list[str]       # human-readable explanations
    macro_contribution_pct: float    # % of projection driven by macro signals
    diverges_from_current: bool      # True if projection != current direction
    computed_at: datetime

Inputs:

Current trend summary (direction, strength, momentum)
Active global events with estimated_duration extending beyond the current window
Upcoming known catalysts from document intelligence (earnings dates, regulatory deadlines)
Historical resolution patterns for similar event types (optional, v2)

Projection Logic:

Compute trend momentum as rate of change in strength across recent windows
Project macro signal decay based on event estimated_duration and severity
Factor in upcoming catalysts that may shift direction
Combine momentum + macro trajectory + catalyst outlook into projected direction/strength
Flag divergence when projected direction differs from current direction
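
The first and last steps of the projection logic are simple enough to sketch directly; the combination step in between is not specified here, so it is left out. The function names and the wording of the divergence explanation are assumptions.

```python
def trend_momentum(strengths: list[float]) -> float:
    """Average change in trend strength per window; input is ordered oldest first."""
    if len(strengths) < 2:
        return 0.0
    # The mean of consecutive differences reduces to (last - first) / (n - 1).
    return (strengths[-1] - strengths[0]) / (len(strengths) - 1)


def flag_divergence(projected_direction: str, current_direction: str,
                    driving_factors: list[str]) -> tuple[bool, list[str]]:
    """Return (diverges_from_current, driving_factors), adding an explanation on divergence."""
    diverges = projected_direction != current_direction
    factors = list(driving_factors)  # copy so the caller's list is not mutated
    if diverges:
        factors.append(
            f"projection {projected_direction} diverges from current {current_direction}"
        )
    return diverges, factors
```

Appending the explanation inside flag_divergence keeps the flag and its justification in one place, so a diverging projection never ships with an empty driving_factors list.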

Macro Signal Suppression

Location: Extended services/recommendation/suppression.py

New suppression check: when macro signals are the sole basis for a trend direction change (no supporting company-specific signals agree), the recommendation is forced to informational mode with a macro-only caveat.

New function:

evaluate_macro_only_suppression(summary: TrendSummary, macro_signal_count: int, company_signal_count: int) -> bool
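
One possible reading of this check, as a sketch. It assumes that company_signal_count counts only the company-specific signals that agree with the trend direction, and that a neutral direction never needs suppression. The TrendSummary class below is a hypothetical one-field stand-in for the real type.

```python
from dataclasses import dataclass


@dataclass
class TrendSummary:
    """Hypothetical minimal stand-in; the real TrendSummary has more fields."""
    trend_direction: str  # bullish|bearish|mixed|neutral


def evaluate_macro_only_suppression(summary: TrendSummary, macro_signal_count: int,
                                    company_signal_count: int) -> bool:
    """Return True when the recommendation should be forced to informational mode."""
    if summary.trend_direction == "neutral":
        # Assumption: a neutral trend carries no directional call to suppress.
        return False
    # Macro-only: at least one macro signal and no agreeing company signal.
    return macro_signal_count > 0 and company_signal_count == 0
```

A True result means the caller sets the recommendation mode to informational and adds the macro-only caveat to the thesis.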

Exposure Profile Auto-Inference

Location: services/extractor/exposure_inference.py

Infers baseline exposure profiles from company filing extractions when no manual profile exists.

Interface:

infer_exposure_profile(document_intelligences: list[DocumentIntelligence], sector: str, industry: str, market_cap_bucket: str) -> ExposureProfile

Scans recent filing extractions for geographic revenue breakdowns, supplier mentions, and commodity references. Produces an ExposureProfile with source='inferred' and a confidence score reflecting data quality.

Query API Extensions

Location: Extended services/api/

New endpoints:

GET /api/macro/events — list recent global events with filtering
GET /api/macro/events/{event_id} — event detail with affected companies
GET /api/macro/impacts/{ticker} — macro impacts for a company
GET /api/admin/macro/status — macro layer enabled/disabled state
PUT /api/admin/macro/toggle — toggle macro layer on/off
GET /api/trends/{trend_id}/projection — trend projection for a specific window

Dashboard Extensions

Location: Extended frontend/src/

New pages/panels:

Global Events page (/macro/events): filterable list of global events with severity badges, region/sector tags, and drill-down to affected companies
Macro Exposure panel on Company Detail page: shows exposure profile and active macro impacts
Macro evidence indicators on Trend and Recommendation detail pages: visually distinguishes macro-sourced evidence
Trend projection display on Trend detail page: projected direction/strength with driving factors
Macro toggle on Trading Controls page: enable/disable switch with confirmation dialog

Data Models

New PostgreSQL Tables

`global_events`

CREATE TABLE global_events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    event_types TEXT[] NOT NULL,
    severity VARCHAR(20) NOT NULL,
    affected_regions TEXT[] NOT NULL DEFAULT '{}',
    affected_sectors TEXT[] NOT NULL DEFAULT '{}',
    affected_commodities TEXT[] NOT NULL DEFAULT '{}',
    summary TEXT NOT NULL,
    key_facts JSONB NOT NULL DEFAULT '[]',
    estimated_duration VARCHAR(20) NOT NULL,
    confidence FLOAT NOT NULL,
    source_document_id UUID REFERENCES documents(id),
    model_provider VARCHAR(100),
    model_name VARCHAR(200),
    prompt_version VARCHAR(100),
    schema_version VARCHAR(20),
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

`macro_impact_records`

CREATE TABLE macro_impact_records (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    event_id UUID NOT NULL REFERENCES global_events(id),
    company_id UUID NOT NULL REFERENCES companies(id),
    ticker VARCHAR(20) NOT NULL,
    macro_impact_score FLOAT NOT NULL,
    impact_direction VARCHAR(20) NOT NULL,
    contributing_factors JSONB NOT NULL DEFAULT '[]',
    confidence FLOAT NOT NULL,
    computed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

`exposure_profiles`

CREATE TABLE exposure_profiles (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    company_id UUID NOT NULL REFERENCES companies(id),
    geographic_revenue_mix JSONB NOT NULL DEFAULT '{}',
    supply_chain_regions TEXT[] NOT NULL DEFAULT '{}',
    key_input_commodities TEXT[] NOT NULL DEFAULT '{}',
    regulatory_jurisdictions TEXT[] NOT NULL DEFAULT '{}',
    market_position_tier VARCHAR(30) NOT NULL DEFAULT 'regional',
    export_dependency_pct FLOAT NOT NULL DEFAULT 0.0,
    source VARCHAR(20) NOT NULL DEFAULT 'manual',
    confidence FLOAT NOT NULL DEFAULT 1.0,
    version INTEGER NOT NULL DEFAULT 1,
    active BOOLEAN NOT NULL DEFAULT TRUE,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

`trend_projections`

CREATE TABLE trend_projections (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    trend_window_id UUID NOT NULL REFERENCES trend_windows(id),
    projected_direction VARCHAR(20) NOT NULL,
    projected_strength FLOAT NOT NULL,
    projected_confidence FLOAT NOT NULL,
    projection_horizon VARCHAR(10) NOT NULL,
    driving_factors JSONB NOT NULL DEFAULT '[]',
    macro_contribution_pct FLOAT NOT NULL DEFAULT 0.0,
    diverges_from_current BOOLEAN NOT NULL DEFAULT FALSE,
    computed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

New Pydantic Schemas

Added to services/shared/schemas.py:

class ImpactType(str, Enum):
    SUPPLY_DISRUPTION = "supply_disruption"
    DEMAND_SHIFT = "demand_shift"
    COST_INCREASE = "cost_increase"
    REGULATORY_PRESSURE = "regulatory_pressure"
    CURRENCY_IMPACT = "currency_impact"
    COMMODITY_SHOCK = "commodity_shock"
    TRADE_BARRIER = "trade_barrier"
    GEOPOLITICAL_RISK = "geopolitical_risk"

class SeverityLevel(str, Enum):
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"
    CRITICAL = "critical"

class MarketPositionTier(str, Enum):
    GLOBAL_LEADER = "global_leader"
    MULTINATIONAL = "multinational"
    REGIONAL = "regional"
    DOMESTIC = "domestic"

class EstimatedDuration(str, Enum):
    SHORT_TERM = "short_term"
    MEDIUM_TERM = "medium_term"
    LONG_TERM = "long_term"

Analytical Lake Datasets

New fact tables published to MinIO under stonks-lakehouse/:

lake.global_events — partitioned by dt, columns: event_id, event_types, severity, affected_regions, affected_sectors, affected_commodities, summary, estimated_duration, confidence, source_document_id, created_at
lake.macro_impacts — partitioned by dt and ticker, columns: event_id, company_id, ticker, macro_impact_score, impact_direction, contributing_factors, confidence, computed_at
lake.trend_projections — partitioned by dt and ticker, columns: trend_window_id, ticker, projected_direction, projected_strength, projected_confidence, projection_horizon, driving_factors, macro_contribution_pct, diverges_from_current, computed_at

Correctness Properties

A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.

Property 1: Content hash stability and uniqueness

For any macro news article content, computing the content hash twice on identical content SHALL produce the same hash, and computing the hash on distinct content SHALL produce different hashes.

Validates: Requirements 1.2
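
A minimal sketch of a content hash with these properties, assuming SHA-256 over UTF-8 text; this document does not name the hash function the ingestion service actually uses. Uniqueness holds up to hash collisions, which are not expected in practice for SHA-256.

```python
import hashlib


def content_hash(content: str) -> str:
    """Deterministic hex digest of the article content (SHA-256 is an assumption)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


first = content_hash("Port closure disrupts shipping lanes")
second = content_hash("Port closure disrupts shipping lanes")
other = content_hash("Port closure disrupts shipping lanes.")
```

Here first and second are equal, while other differs because of the trailing period.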

Property 2: Macro pipeline output schema completeness

For any valid Ollama classification response, the resulting GlobalEvent object SHALL contain all required fields (event_id, event_types, severity, affected_regions, affected_sectors, summary, estimated_duration, confidence, source_document_id, model_metadata). Similarly, for any valid macro impact computation, the resulting MacroImpactRecord SHALL contain all required fields (event_id, company_id, ticker, macro_impact_score, impact_direction, contributing_factors, confidence).

Validates: Requirements 2.2, 4.5

Property 3: Multiple impact types preserved

For any global event classification where the source article implies N distinct impact types, the resulting GlobalEvent's event_types list SHALL contain all N types without collapsing to a single category.

Validates: Requirements 2.4

Property 4: Macro data persistence round-trip

For any valid GlobalEvent, MacroImpactRecord, ExposureProfile, or TrendProjection object, persisting it to PostgreSQL and reading it back SHALL produce an equivalent object with all fields preserved.

Validates: Requirements 3.1, 7.1, 7.2, 12.5

Property 5: Default exposure profile derivation

For any company with a valid sector, industry, and market_cap_bucket but no manually configured ExposureProfile, the default profile SHALL have a market_position_tier consistent with the market_cap_bucket mapping (large_cap → global_leader, mid_cap → multinational, small_cap → regional, micro_cap → domestic) and SHALL have non-empty geographic_revenue_mix derived from the sector.

Validates: Requirements 3.2

Property 6: Exposure profile version history

For any sequence of N updates to a company's ExposureProfile, the version history SHALL contain exactly N records, each preserving the complete profile state at the time of that update, with monotonically increasing version numbers.

Validates: Requirements 3.3

Property 7: Macro impact score bounds and zero-overlap invariant

For any GlobalEvent and ExposureProfile pair, the computed Macro_Impact_Score SHALL be in [0, 1]. Furthermore, for any pair where the event's affected_regions, affected_sectors, and affected_commodities have zero intersection with the profile's geographic_revenue_mix keys, supply_chain_regions, and key_input_commodities, the score SHALL be exactly 0.0.

Validates: Requirements 4.1, 4.4

Property 8: Scoring monotonicity

For any GlobalEvent and ExposureProfile pair, increasing the event's severity level (low → moderate → high → critical) while holding all other inputs constant SHALL produce a Macro_Impact_Score that is greater than or equal to the previous score. Similarly, increasing the geographic overlap percentage SHALL produce a score greater than or equal to the previous score.

Validates: Requirements 4.2

Property 9: Resilience modifier tier ordering

For any positive raw impact score and an international event, applying the resilience modifier with market_position_tier=global_leader SHALL produce a final score less than or equal to multinational, which SHALL be less than or equal to regional, which SHALL be less than or equal to domestic.

Validates: Requirements 4.3

Property 10: Mixed direction for dual-effect events

For any GlobalEvent and ExposureProfile pair where the computation identifies both positive and negative contributing factors, the resulting impact_direction SHALL be 'mixed' and both positive and negative factors SHALL be preserved separately in contributing_factors.

Validates: Requirements 4.6

Property 11: Macro signals influence trend output

For any company with both company-specific signals and non-zero macro impact signals, the trend summary computed with macro signals included SHALL differ from the trend summary computed with only company-specific signals (in at least one of: trend_strength, confidence, or evidence references).

Validates: Requirements 5.1

Property 12: Macro-company contradiction detection

For any set of signals where macro impact signals have a negative direction and company-specific signals have a positive sentiment (or vice versa), the resulting trend summary's contradiction_score SHALL be greater than zero and disagreement_details SHALL contain at least one entry.

Validates: Requirements 5.3

Property 13: Macro evidence traceability

For any trend summary that includes macro signal contributions, the top_supporting_evidence or top_opposing_evidence lists SHALL contain the source_document_id of at least one contributing GlobalEvent.

Validates: Requirements 5.4

Property 14: No degradation without macro data and disabled-layer equivalence

For any company with no macro impact records in the aggregation window, the trend summary produced with the macro layer enabled SHALL be identical to the trend summary produced with the macro layer disabled. Furthermore, for any aggregation run with the macro layer disabled, the output SHALL be identical to company-only aggregation regardless of existing macro data.

Validates: Requirements 5.5, 11.2

Property 15: Sector and market rollup macro incorporation

For any sector containing companies with non-zero macro impact scores, the sector-level rollup SHALL reflect those macro signals in its trend_strength or confidence. Furthermore, for any GlobalEvent that disproportionately affects a single sector (>60% of total macro impact concentrated in one sector), that sector SHALL appear in the market-level rollup's material_risks or dominant_catalysts.

Validates: Requirements 6.1, 6.2, 6.3

Property 16: Inferred exposure profile correctness

For any set of filing extractions containing geographic revenue breakdowns or commodity references, the inferred ExposureProfile SHALL have source='inferred', confidence in [0, 1], and geographic_revenue_mix entries that correspond to regions mentioned in the filings.

Validates: Requirements 9.1, 9.2

Property 17: Low-confidence event exclusion

For any GlobalEvent classification with confidence below the configurable threshold (default 0.4), the Interpolation_Engine SHALL produce zero MacroImpactRecords for that event.

Validates: Requirements 10.1

Property 18: Accelerated decay for stale short-term events

For any GlobalEvent with estimated_duration='short_term' and age exceeding 48 hours, the effective signal weight SHALL be strictly less than the weight computed using standard recency decay for the same age.

Validates: Requirements 10.2
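
One way to satisfy this property, as a sketch. It assumes the standard recency decay is exponential with a fixed half-life; the 72-hour and 24-hour half-lives are illustrative values, not taken from the aggregation service. Only the 48-hour threshold comes from the property itself.

```python
def standard_decay(age_hours: float, half_life_hours: float = 72.0) -> float:
    """Exponential recency decay (assumed form): weight halves every half-life."""
    return 0.5 ** (age_hours / half_life_hours)


def macro_decay(age_hours: float, estimated_duration: str,
                half_life_hours: float = 72.0,
                stale_after_hours: float = 48.0,
                stale_half_life_hours: float = 24.0) -> float:
    """Standard decay, with an extra penalty for stale short-term events."""
    weight = standard_decay(age_hours, half_life_hours)
    if estimated_duration == "short_term" and age_hours > stale_after_hours:
        # The extra factor is below 1 for any age past the threshold, so the
        # result is strictly less than the standard weight.
        weight *= 0.5 ** ((age_hours - stale_after_hours) / stale_half_life_hours)
    return weight
```

Multiplying by a second decay factor, rather than switching to a shorter half-life outright, keeps the weight continuous at the 48-hour boundary.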

Property 19: Macro-only recommendation suppression

For any trend summary where the trend direction is driven solely by macro signals (no company-specific signals support the direction), the resulting recommendation SHALL have mode='informational' and the thesis SHALL contain a macro-only caveat.

Validates: Requirements 10.3

Property 20: Trend projection always produced

For any trend summary produced by the Aggregation_Engine, a corresponding TrendProjection SHALL also be produced with valid projected_direction, projected_strength in [0, 1], projected_confidence in [0, 1], and a non-empty driving_factors list.

Validates: Requirements 12.1

Property 21: Projection divergence flagging

For any TrendProjection where projected_direction differs from the current trend summary's trend_direction, the diverges_from_current field SHALL be True and driving_factors SHALL contain at least one entry explaining the divergence.

Validates: Requirements 12.3

Property 22: Macro-disabled projections have reduced confidence

For any identical set of company signals and macro signals, the TrendProjection computed with the macro layer disabled SHALL have projected_confidence less than or equal to the projection computed with the macro layer enabled.

Validates: Requirements 12.4

Property 23: Low-confidence projection exclusion

For any TrendProjection with projected_confidence below the configurable threshold (default 0.3), the projection SHALL be marked as low_confidence and SHALL NOT influence recommendation eligibility.

Validates: Requirements 12.9

Error Handling

Macro Ingestion Failures

Source fetch failures follow existing retry/backoff logic from the ingestion service
Sustained macro source failures (configurable threshold, default 3 consecutive) trigger operator alerts via the existing alerting framework
The aggregation engine continues producing trends using company-specific signals only when macro ingestion is degraded

Event Classification Failures

Invalid Ollama responses trigger retries per existing extraction retry policy (max 2 retries with exponential backoff)
Failed classifications are preserved in MinIO with validation errors for debugging
Failed events do not produce macro impact records — they are silently excluded from interpolation

Exposure Profile Fallbacks

Missing manual profiles fall back to sector-based defaults
Failed auto-inference falls back to sector-based defaults
Default profiles use conservative assumptions (the regional tier when no market_cap_bucket mapping applies, even geographic distribution within sector norms)

Interpolation Engine Failures

Database errors during macro impact computation are logged and the event is skipped for that company
The aggregation engine treats missing macro data as "no macro signal" — never blocks trend computation

Projection Failures

If projection computation fails (e.g., insufficient historical data), the trend summary is still persisted without a projection
Low-confidence projections are marked but still displayed as informational

Runtime Toggle Safety

Toggle state is read from PostgreSQL at the start of each aggregation cycle — no caching that could become stale
Toggle changes are audit-logged with operator identity, previous state, and new state
Disabling the macro layer does not delete any data — ingestion and classification continue, only interpolation and aggregation integration are skipped

Testing Strategy

Property-Based Testing

This feature is well-suited for property-based testing. The core interpolation logic (impact scoring, overlap computation, resilience modifiers, signal weighting) consists of pure functions with clear input/output behavior and a large input space. The scoring formula has universal properties (monotonicity, bounds, zero-overlap invariant) that should hold across all valid inputs.

Library: Hypothesis for Python property-based testing.

Configuration: Minimum 100 iterations per property test.

Tag format: Feature: global-news-interpolation, Property {number}: {property_text}

Each correctness property above maps to one property-based test. Generators will produce:

Random GlobalEvent objects with valid enum values and realistic field ranges
Random ExposureProfile objects with valid geographic mixes (summing to ~1.0), commodity lists, and tier values
Random WeightedSignal lists mixing macro and company-specific signals
Random TrendSummary objects for projection testing
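
To show the shape of such a test, the sketch below checks the resilience tier ordering (Property 9) over 100 random raw scores. It deliberately uses the standard library's random module as a dependency-free stand-in; the actual tests would use Hypothesis strategies in place of the loop. The modifier implementation is the same assumed one as elsewhere in this sketch series (clamped to [0, 1], modifier 1.0 for non-international events), and check_tier_ordering is a hypothetical name.

```python
import random

RESILIENCE = {"global_leader": 0.7, "multinational": 0.85, "regional": 1.0, "domestic": 1.2}
TIER_ORDER = ["global_leader", "multinational", "regional", "domestic"]


def apply_resilience_modifier(raw_score: float, tier: str, event_is_international: bool) -> float:
    # Assumed behaviour: modifiers apply to international events only; result is clamped.
    modifier = RESILIENCE[tier] if event_is_international else 1.0
    return min(1.0, max(0.0, raw_score * modifier))


def check_tier_ordering(iterations: int = 100, seed: int = 0) -> bool:
    """Return True if scores are non-decreasing from global_leader to domestic."""
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    for _ in range(iterations):
        raw = rng.uniform(0.0, 1.0)
        scores = [apply_resilience_modifier(raw, tier, True) for tier in TIER_ORDER]
        if any(earlier > later for earlier, later in zip(scores, scores[1:])):
            return False
    return True
```

Hypothesis would add shrinking of failing inputs and edge-case generation, which a plain random loop does not provide.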

Unit Tests

Unit tests cover specific examples, edge cases, and integration points:

Event classification prompt construction and schema validation
Exposure profile API CRUD operations
Default profile generation for each sector/market_cap combination
Macro toggle API endpoints (status, toggle, audit logging)
Recommendation thesis text includes macro signal references when present
Dashboard component rendering for Global Events page, macro exposure panel, and projection display

Integration Tests

Integration tests verify end-to-end data flow:

Macro article ingestion → parsing → classification → interpolation → aggregation pipeline
Lake publisher writes correct Parquet partitions for global events and macro impacts
Trino queries joining global_events, macro_impacts, and trend_windows return expected results
Macro toggle state change propagates to next aggregation cycle
