Requirements Document — Competitive Intelligence & Historical Pattern Matching Layer
Introduction
This feature adds a third signal layer to the Stonks Oracle aggregation engine: competitive intelligence and historical pattern matching. The existing platform produces per-company trend summaries from two signal sources — company-specific document intelligence (layer 1) and global macro news interpolation (layer 2). This extension introduces a third parallel signal path that mines the existing document_intelligence, document_impact_records, and trend_windows tables to identify historical patterns — how similar catalyst types for the same company or its competitors resolved in the past — and uses those patterns to reinforce or weaken current trend signals.
The core insight is that competitive dynamics tend to follow repeatable patterns: when a company receives a bullish product catalyst, its direct competitors often experience a measurable bearish reaction within a short window. By mining the platform's own historical data for these patterns, the system can propagate signals across competitor relationships and weight current trends based on how similar situations resolved historically.
This layer does not ingest new external data. It mines existing data already in PostgreSQL — sentiment, catalyst types, impact scores from document_impact_records, and historical direction/strength outcomes from trend_windows — to produce pattern-based signals that feed into the aggregation engine alongside the other two layers.
Glossary
- Competitor_Relationship: A directional or bidirectional link between two tracked companies indicating they compete in the same market segment. Relationships have a strength score in [0, 1] and a relationship_type (direct_rival, same_sector, overlapping_products, supply_chain_adjacent).
- Competitor_Registry: The component within the Symbol_Registry that manages Competitor_Relationships, supporting both operator-defined and auto-inferred relationships.
- Historical_Pattern: A statistical summary derived from past document_impact_records and trend_windows data, describing how a specific catalyst_type for a specific company (or its competitors) historically correlated with trend outcomes within a given time horizon.
- Pattern_Matcher: The component that queries historical data to find past instances of similar catalyst types for a company or its competitors, computes outcome statistics, and produces Historical_Pattern objects.
- Pattern_Signal: A weighted signal derived from a Historical_Pattern that feeds into the Aggregation_Engine, representing the historical tendency for a given catalyst type to produce a specific trend outcome.
- Competitive_Signal: A Pattern_Signal that propagates from one company's news event to a competitor, based on historical evidence of how similar events affected the competitor in the past.
- Signal_Propagation_Engine: The component that evaluates incoming document intelligence for a company, identifies its competitors via the Competitor_Registry, queries the Pattern_Matcher for historical precedents, and produces Competitive_Signals for affected competitors.
- Aggregation_Engine: The existing trend aggregation system (services/aggregation/) that computes rolling trend summaries from document intelligence signals, macro signals, and now pattern-based signals.
- Pattern_Confidence: A score in [0, 1] reflecting how statistically reliable a Historical_Pattern is, based on sample size, consistency of outcomes, and recency of the historical data.
- Competitive_Layer_Toggle: A runtime switch allowing operators to enable or disable the competitive/historical pattern signal layer without redeployment, analogous to the macro layer toggle.
Requirements
Requirement 1: Competitor Relationship Management
User Story: As an operator, I want to define which companies are competitors of each other, so that the platform can propagate signals across competitive relationships.
Acceptance Criteria
- WHEN an operator creates a Competitor_Relationship between two companies, THE Competitor_Registry SHALL persist the relationship containing: company_a_id, company_b_id, relationship_type (one of direct_rival, same_sector, overlapping_products, supply_chain_adjacent), strength (a float in [0, 1] representing how closely the companies compete), bidirectional flag (whether the relationship applies in both directions), and source (manual or inferred).
- WHEN an operator queries competitors for a given company, THE Competitor_Registry SHALL return all Competitor_Relationships where the company appears as either company_a or company_b, ordered by strength descending.
- WHEN an operator deletes a Competitor_Relationship, THE Competitor_Registry SHALL soft-delete the relationship by marking it inactive rather than removing the row, preserving audit history.
- THE Competitor_Registry SHALL expose Competitor_Relationship CRUD operations through the Symbol_Registry REST API.
- WHEN a Competitor_Relationship is created or updated, THE Competitor_Registry SHALL record an audit event with the previous state, new state, and the operator who made the change.
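The record shape and the query and soft-delete behavior described above can be sketched in a few lines of Python. This is an in-memory illustration only, not the registry's actual implementation: the class and function names are hypothetical, integer company IDs are an assumption, and filtering out inactive relationships in the competitor query is an assumption the criteria do not state explicitly.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

RELATIONSHIP_TYPES = {"direct_rival", "same_sector", "overlapping_products", "supply_chain_adjacent"}
SOURCES = {"manual", "inferred"}


@dataclass
class CompetitorRelationship:
    company_a_id: int  # assumption: integer primary keys
    company_b_id: int
    relationship_type: str
    strength: float
    bidirectional: bool = True
    source: str = "manual"
    active: bool = True  # cleared on soft delete, row is kept
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        # Enforce the enumerations and the [0, 1] strength range from the criteria.
        if self.relationship_type not in RELATIONSHIP_TYPES:
            raise ValueError(f"unknown relationship_type: {self.relationship_type}")
        if self.source not in SOURCES:
            raise ValueError(f"unknown source: {self.source}")
        if not 0.0 <= self.strength <= 1.0:
            raise ValueError("strength must be in [0, 1]")


def competitors_of(company_id: int, relationships: List[CompetitorRelationship]) -> List[CompetitorRelationship]:
    """Return relationships where the company is company_a or company_b, strongest first."""
    # Assumption: soft-deleted (inactive) relationships are hidden from this query.
    matches = [r for r in relationships
               if r.active and company_id in (r.company_a_id, r.company_b_id)]
    return sorted(matches, key=lambda r: r.strength, reverse=True)


def soft_delete(relationship: CompetitorRelationship) -> None:
    """Mark the relationship inactive instead of removing it."""
    relationship.active = False
```

A real implementation would also write the audit event on create, update, and delete; that part is omitted here.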
Requirement 2: Competitor Auto-Inference
User Story: As an operator, I want the platform to automatically suggest competitor relationships based on sector, industry, and document co-mentions, so that I do not have to manually define every relationship.
Acceptance Criteria
- WHEN an operator triggers competitor auto-inference for a company, THE Competitor_Registry SHALL identify candidate competitors by matching companies that share the same sector and industry fields in the companies table.
- WHEN the Competitor_Registry identifies sector-based candidates, THE Competitor_Registry SHALL further rank candidates by counting co-mentions in the document_company_mentions table — companies frequently mentioned in the same documents receive higher strength scores.
- WHEN the Competitor_Registry produces auto-inferred relationships, THE Competitor_Registry SHALL mark each relationship with source inferred and a strength score derived from the sector match and co-mention frequency, distinguishing them from operator-defined relationships marked as source manual.
- WHEN auto-inferred relationships already exist for a company, THE Competitor_Registry SHALL refresh them on re-inference rather than creating duplicates, updating strength scores based on the latest co-mention data.
- THE Competitor_Registry SHALL expose an inference endpoint at POST /companies/{company_id}/competitors/infer that triggers auto-inference and returns the resulting candidate relationships.
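The criteria above say that strength is derived from the sector match and co-mention frequency but do not fix a formula. One possible scoring rule, shown purely as an illustration, gives every sector-and-industry match a base score and scales the remainder by the candidate's co-mention count relative to the most co-mentioned candidate. The function name, the 0.3 base, and the tickers in the usage note are all hypothetical.

```python
from typing import Dict


def infer_strengths(co_mentions: Dict[str, int], base: float = 0.3) -> Dict[str, float]:
    """Score sector-matched candidates by relative co-mention frequency.

    co_mentions maps a candidate ticker (already matched on sector and
    industry) to the number of documents that mention it together with the
    company under inference. The base value is a hypothetical default.
    """
    if not co_mentions:
        return {}
    top = max(co_mentions.values())
    scores = {}
    for ticker, count in co_mentions.items():
        share = count / top if top > 0 else 0.0
        # Sector match alone earns `base`; co-mentions fill the rest up to 1.0.
        scores[ticker] = round(base + (1.0 - base) * share, 4)
    # Strongest candidates first, matching the registry's ordering convention.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```

For example, `infer_strengths({"AAA": 10, "BBB": 5, "CCC": 0})` scores the three candidates at 1.0, 0.65, and 0.3. Because the result is keyed by ticker, re-running inference overwrites earlier scores instead of creating duplicates.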
Requirement 3: Historical Pattern Mining
User Story: As a strategist, I want the platform to mine its historical data to find how similar catalyst types resolved in the past for a given company, so that current signals can be weighted by historical precedent.
Acceptance Criteria
- WHEN the Pattern_Matcher receives a query for a company and catalyst_type, THE Pattern_Matcher SHALL search the document_impact_records table for past instances where the same company received the same catalyst_type, and join against trend_windows to determine the trend direction and strength that followed within configurable time horizons (default: 1d, 7d, 30d).
- WHEN the Pattern_Matcher finds historical instances, THE Pattern_Matcher SHALL compute a Historical_Pattern containing: company ticker, catalyst_type, time_horizon, sample_count (number of historical instances found), bullish_pct (percentage of instances that resolved bullish), bearish_pct (percentage that resolved bearish), avg_strength (average trend strength of the outcomes), avg_time_to_resolution (average days until the trend direction stabilized), and pattern_confidence (a score reflecting statistical reliability).
- WHEN computing pattern_confidence, THE Pattern_Matcher SHALL weight the score by sample_count (more samples increase confidence, with diminishing returns above 20 samples), outcome_consistency (how uniform the historical outcomes are — 90% bullish is more confident than 55% bullish), and data_recency (patterns from the last 90 days receive higher weight than patterns from 180+ days ago).
- WHEN the Pattern_Matcher finds fewer than 3 historical instances for a company-catalyst pair, THE Pattern_Matcher SHALL cap the pattern_confidence below 0.3 and flag the pattern as insufficient_data.
- WHEN the Pattern_Matcher queries historical data, THE Pattern_Matcher SHALL only consider document_impact_records linked to document_intelligence with validation_status valid and documents with status not equal to rejected.
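The pattern_confidence criteria name three inputs (sample count, outcome consistency, data recency) without prescribing how to combine them. The sketch below multiplies three factors in [0, 1]. The logarithmic saturation at 20 samples, the linear recency ramp between 90 and 180 days, the 0.5 floor for old data, and the 0.29 cap for thin patterns are all assumptions chosen to satisfy the stated thresholds, not values taken from the platform.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Outcome:
    direction: str    # "bullish" or "bearish", as read from trend_windows
    strength: float   # trend strength of the outcome
    age_days: float   # days since the historical instance occurred


def recency_weight(age_days: float) -> float:
    """Full weight up to 90 days, ramping down to 0.5 at 180 days and beyond."""
    if age_days <= 90:
        return 1.0
    if age_days >= 180:
        return 0.5
    return 1.0 - 0.5 * (age_days - 90) / 90


def pattern_confidence(outcomes: List[Outcome]) -> Tuple[float, bool]:
    """Return (pattern_confidence, insufficient_data) for one company-catalyst pair."""
    n = len(outcomes)
    if n == 0:
        return 0.0, True
    # Diminishing returns: the factor grows with log(n) and saturates at 20 samples.
    sample_factor = min(1.0, math.log1p(n) / math.log1p(20))
    bullish = sum(1 for o in outcomes if o.direction == "bullish")
    bearish = sum(1 for o in outcomes if o.direction == "bearish")
    # Consistency maps a 50/50 split to 0 and a unanimous outcome to 1.
    dominant_share = max(bullish, bearish) / n
    consistency = max(0.0, 2.0 * dominant_share - 1.0)
    recency = sum(recency_weight(o.age_days) for o in outcomes) / n
    score = sample_factor * consistency * recency
    insufficient = n < 3
    if insufficient:
        score = min(score, 0.29)  # keep thin patterns below the 0.3 threshold
    return score, insufficient
```

With this rule, twenty recent unanimous outcomes score 1.0, the same outcomes aged 200 days score 0.5, and an 11-to-9 split scores close to 0.1.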
Requirement 4: Competitive Signal Propagation
User Story: As a strategist, I want the platform to evaluate how news about one company historically affected its competitors, so that competitor news can inform a company's trend assessment.
Acceptance Criteria
- WHEN new document intelligence is produced for a company, THE Signal_Propagation_Engine SHALL identify the company's active competitors via the Competitor_Registry and query the Pattern_Matcher for historical instances where the same catalyst_type hitting the source company correlated with trend outcomes for each competitor.
- WHEN the Pattern_Matcher finds historical cross-company patterns, THE Pattern_Matcher SHALL compute a Historical_Pattern for the competitor containing: source_ticker (the company that received the original catalyst), target_ticker (the competitor), catalyst_type, time_horizon, sample_count, bullish_pct, bearish_pct, avg_strength, and pattern_confidence.
- WHEN the Signal_Propagation_Engine produces a Competitive_Signal for a competitor, THE Signal_Propagation_Engine SHALL weight the signal by the Competitor_Relationship strength, the Historical_Pattern's pattern_confidence, and the source document's impact_score.
- WHEN a Competitive_Signal is produced, THE Signal_Propagation_Engine SHALL persist a competitive_signal_record containing: source_document_id, source_ticker, target_ticker, catalyst_type, pattern_confidence, signal_direction (bullish or bearish based on historical pattern), signal_strength, relationship_strength, and computed_at timestamp.
- WHEN the Competitor_Relationship strength is below a configurable threshold (default 0.2), THE Signal_Propagation_Engine SHALL skip signal propagation for that competitor pair and log the skip reason.
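The weighting and skip rules above could look roughly like the following. The criteria list the three weighting inputs but not how they combine; a plain product is assumed here. The type and function names are hypothetical, and resolving a tie between bullish_pct and bearish_pct as bullish is also an assumption.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)


@dataclass
class HistoricalPattern:
    source_ticker: str
    target_ticker: str
    catalyst_type: str
    bullish_pct: float
    bearish_pct: float
    pattern_confidence: float


@dataclass
class CompetitiveSignal:
    source_ticker: str
    target_ticker: str
    catalyst_type: str
    pattern_confidence: float
    signal_direction: str
    signal_strength: float
    relationship_strength: float


def propagate(pattern: HistoricalPattern, relationship_strength: float,
              impact_score: float,
              min_relationship_strength: float = 0.2) -> Optional[CompetitiveSignal]:
    """Build a Competitive_Signal for the target, or None when the pair is skipped."""
    if relationship_strength < min_relationship_strength:
        logger.info("skip %s -> %s: relationship strength %.2f below threshold %.2f",
                    pattern.source_ticker, pattern.target_ticker,
                    relationship_strength, min_relationship_strength)
        return None
    # Direction follows the historically dominant outcome (ties read as bullish).
    direction = "bullish" if pattern.bullish_pct >= pattern.bearish_pct else "bearish"
    # Assumed combination rule: product of the three weighting inputs.
    strength = relationship_strength * pattern.pattern_confidence * impact_score
    return CompetitiveSignal(pattern.source_ticker, pattern.target_ticker,
                             pattern.catalyst_type, pattern.pattern_confidence,
                             direction, strength, relationship_strength)
```

The persisted competitive_signal_record would additionally carry source_document_id and a computed_at timestamp, which are left out to keep the sketch short.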
Requirement 5: Pattern-Based Trend Reinforcement
User Story: As a strategist, I want historical patterns to strengthen or weaken current trend signals, so that the aggregation engine accounts for how similar situations resolved in the past.
Acceptance Criteria
- WHEN the Aggregation_Engine computes a company trend summary, THE Aggregation_Engine SHALL include pattern-based signals (both self-company historical patterns and competitive signals) as additional weighted signals alongside existing document intelligence and macro signals.
- WHEN weighting pattern-based signals, THE Aggregation_Engine SHALL apply the pattern_confidence as a confidence gate, the Historical_Pattern's avg_strength as the impact_score, and recency decay based on the source document's publication time, consistent with existing signal scoring.
- WHEN a Historical_Pattern indicates a direction that contradicts the current company-specific signals, THE Aggregation_Engine SHALL represent the disagreement in the contradiction_score and disagreement_details fields, consistent with existing contradiction detection behavior.
- WHEN a trend summary includes pattern-based signal contributions, THE Aggregation_Engine SHALL include the source document IDs in the evidence references so that the pattern signal chain is traceable.
- WHEN no historical patterns or competitive signals exist for a company in the aggregation window, THE Aggregation_Engine SHALL produce the trend summary using only company-specific and macro signals, with no degradation of existing behavior.
- THE Aggregation_Engine SHALL expose a configurable weight parameter (competitive_signal_weight) that controls the relative influence of pattern-based signals versus other signal layers, defaulting to 0.2.
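To make the role of competitive_signal_weight concrete, the sketch below blends a signed score from the existing layers with a signed score from the pattern layer. The existing scoring formula is not described in this document, so the per-signal weight (confidence times impact_score times recency) and the linear blend are assumptions, as are all names other than competitive_signal_weight.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Signal:
    direction: int       # +1 bullish, -1 bearish
    confidence: float    # pattern_confidence for pattern-based signals
    impact_score: float  # avg_strength for pattern-based signals
    recency: float       # recency decay factor in (0, 1]


def signed_score(signals: List[Signal]) -> float:
    """Weighted mean direction in [-1, 1]; 0.0 when there is no usable weight."""
    weights = [s.confidence * s.impact_score * s.recency for s in signals]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(s.direction * w for s, w in zip(signals, weights)) / total


def blended_score(base_signals: List[Signal], pattern_signals: List[Signal],
                  competitive_signal_weight: float = 0.2) -> float:
    """Blend company-specific and macro signals with pattern-based signals."""
    base = signed_score(base_signals)
    if not pattern_signals:
        # No patterns in the window: existing behavior is left untouched.
        return base
    w = competitive_signal_weight
    return (1.0 - w) * base + w * signed_score(pattern_signals)
```

With fully bullish base signals and fully bearish pattern signals, the default weight of 0.2 pulls the score from 1.0 down to about 0.6, and the disagreement would also surface through the contradiction fields.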
Requirement 6: Competitive Layer Toggle
User Story: As an operator, I want to enable or disable the competitive intelligence and historical pattern layer at runtime without redeploying services, so that I can control whether historical patterns and competitor signals influence trend summaries.
Acceptance Criteria
- WHEN an operator toggles the competitive signal layer via the Trading Controls page or the API, THE System SHALL persist the setting in the risk_configs table and apply it immediately to subsequent aggregation cycles without requiring a service restart.
- WHEN the competitive signal layer is disabled, THE Aggregation_Engine SHALL skip all pattern-based and competitive signals and produce trend summaries using only company-specific document intelligence and macro signals (if enabled).
- WHEN the competitive signal layer is disabled, THE Pattern_Matcher SHALL continue to be queryable for historical patterns (so that the data remains available for manual analysis), but THE Signal_Propagation_Engine SHALL skip automatic competitive signal computation during aggregation.
- WHEN the competitive signal layer is re-enabled after being disabled, THE Signal_Propagation_Engine SHALL resume computing pattern-based and competitive signals using the latest historical data, including any document intelligence ingested while the layer was disabled.
- THE Query API SHALL expose a GET /api/admin/competitive/status endpoint returning the current enabled/disabled state and a PUT /api/admin/competitive/toggle endpoint to switch it.
- THE Dashboard Trading Controls page SHALL display the competitive signal layer toggle alongside the existing trading mode and macro layer controls, with a confirmation dialog for state changes.
- WHEN the competitive signal layer state changes, THE System SHALL record an audit event with the previous state, new state, and the operator who made the change.
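The toggle and its audit trail can be illustrated with a small in-memory stand-in. In the platform the state lives in the risk_configs table; the class names, the setting key, and the enabled-by-default choice below are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class AuditEvent:
    setting: str
    previous_state: bool
    new_state: bool
    operator: str
    recorded_at: datetime


@dataclass
class CompetitiveLayerToggle:
    enabled: bool = True  # assumption: the layer starts enabled
    audit_log: List[AuditEvent] = field(default_factory=list)

    def set_enabled(self, new_state: bool, operator: str) -> bool:
        """Apply the new state and audit it; return True only if the state changed."""
        if new_state == self.enabled:
            return False  # no state change, so no audit event
        self.audit_log.append(AuditEvent(
            setting="competitive_signal_layer",
            previous_state=self.enabled,
            new_state=new_state,
            operator=operator,
            recorded_at=datetime.now(timezone.utc),
        ))
        self.enabled = new_state
        return True
```

An aggregation cycle would read `enabled` at the start of each run, which is what lets a change take effect without a service restart.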
Requirement 7: Competitive Intelligence Storage
User Story: As a data engineer, I want competitor relationships, historical patterns, and competitive signals stored in both the operational database and the analytical lake, so that I can query competitive intelligence alongside other platform data.
Acceptance Criteria
- WHEN a Competitor_Relationship is created, THE System SHALL persist it in PostgreSQL with fields for company_a_id, company_b_id, relationship_type, strength, bidirectional, source, active status, and timestamps.
- WHEN a competitive_signal_record is produced, THE System SHALL persist it in PostgreSQL with fields for source_document_id, source_ticker, target_ticker, catalyst_type, pattern_confidence, signal_direction, signal_strength, relationship_strength, and computed_at timestamp.
- WHEN the Lake_Publisher runs, THE Lake_Publisher SHALL publish competitor relationship facts and competitive signal facts as partitioned Parquet datasets to MinIO under the stonks-lakehouse bucket.
- WHEN analytical queries join competitive signal data with company trends, THE System SHALL support SQL joins between competitor_relationships, competitive_signals, trend_windows, and document_impact_records tables through Trino.
Requirement 8: Dashboard Visibility
User Story: As an analyst, I want to see competitor relationships, historical patterns, and competitive signals through the web dashboard, so that I can understand the competitive context behind trend assessments.
Acceptance Criteria
- WHEN an analyst views a company detail page, THE Dashboard SHALL display a competitors panel showing the company's active Competitor_Relationships with each competitor's ticker, relationship_type, strength score, and source (manual or inferred).
- WHEN an analyst views a company detail page, THE Dashboard SHALL display a historical patterns panel showing recent Historical_Patterns for the company, including catalyst_type, historical outcome distribution (bullish_pct, bearish_pct), sample_count, and pattern_confidence.
- WHEN an analyst views a trend summary, THE Dashboard SHALL visually distinguish pattern-based and competitive signal evidence from company-specific and macro evidence in the evidence chain.
- WHEN an analyst clicks a competitive signal in the evidence chain, THE Dashboard SHALL display the full signal detail including the source company, source document, catalyst_type, historical pattern statistics, and the Competitor_Relationship that linked the two companies.
- WHEN an analyst views a company detail page, THE Dashboard SHALL display an incoming competitive signals panel showing recent Competitive_Signals targeting this company from competitor news, with source ticker, catalyst_type, signal_direction, and signal_strength.
Requirement 9: Pattern Signal Suppression and Safety
User Story: As a risk owner, I want pattern-based and competitive signals to be subject to quality controls, so that low-confidence historical patterns do not drive automated trading decisions.
Acceptance Criteria
- WHEN a Historical_Pattern has a pattern_confidence below a configurable threshold (default 0.3), THE Signal_Propagation_Engine SHALL exclude the pattern from competitive signal computation and log the exclusion reason.
- WHEN a Historical_Pattern is based on historical data older than a configurable staleness window (default: data older than 180 days, with no instances in the last 90 days), THE Pattern_Matcher SHALL apply a decay penalty to the pattern_confidence.
- WHEN pattern-based signals are the sole basis for a trend direction change (no supporting company-specific or macro signals), THE Recommendation_Engine SHALL mark the recommendation as informational only and append a pattern-only caveat to the thesis.
- IF the competitive signal computation encounters sustained errors exceeding a configurable threshold, THEN THE System SHALL alert operators and continue producing recommendations using only company-specific and macro signals.
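The safety controls above amount to three small checks, sketched here. The staleness test follows one possible reading of the criterion (stale when no instance falls within the last 90 days and at least one instance is older than 180 days). The 0.5 penalty factor, the function names, and the "actionable" label are hypothetical.

```python
from typing import List


def apply_staleness_penalty(confidence: float, instance_ages_days: List[float],
                            staleness_days: float = 180, recent_days: float = 90,
                            penalty: float = 0.5) -> float:
    """Decay pattern_confidence when the supporting history is stale."""
    if not instance_ages_days:
        return confidence
    no_recent = min(instance_ages_days) > recent_days
    has_old = max(instance_ages_days) > staleness_days
    return confidence * penalty if (no_recent and has_old) else confidence


def passes_confidence_gate(confidence: float, min_confidence: float = 0.3) -> bool:
    """Patterns below the threshold are excluded from competitive signal computation."""
    return confidence >= min_confidence


def recommendation_mode(has_company_signals: bool, has_macro_signals: bool,
                        has_pattern_signals: bool) -> str:
    """Return 'informational' when pattern signals are the sole basis for a change."""
    if has_pattern_signals and not (has_company_signals or has_macro_signals):
        return "informational"
    return "actionable"
```

Applying the penalty before the gate means a pattern at 0.5 confidence with only old instances drops to 0.25 and is excluded.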
Requirement 10: Historical Pattern Query API
User Story: As an analyst, I want to query historical patterns on demand for any company and catalyst type, so that I can manually investigate how similar situations resolved in the past.
Acceptance Criteria
- THE Query API SHALL expose a GET /api/patterns/{ticker} endpoint returning all available Historical_Patterns for a company, filterable by catalyst_type and time_horizon.
- THE Query API SHALL expose a GET /api/patterns/{ticker}/competitors endpoint returning cross-company Historical_Patterns showing how the specified company's catalysts historically affected its competitors.
- WHEN the pattern query endpoints return results, THE Query API SHALL include the underlying sample_count, outcome distribution, pattern_confidence, and the date range of the historical data used.
- THE Query API SHALL expose a GET /api/patterns/{ticker}/competitive-signals endpoint returning recent Competitive_Signals targeting the specified company, with source details and pattern statistics.
Requirement 11: Corporate Decision History Tracking
User Story: As a strategist, I want the platform to identify and track major corporate decisions (acquisitions, divestitures, leadership changes, strategic pivots, major partnerships, stock buybacks, dividend changes, restructurings) from the existing document intelligence, so that historical pattern mining can weight these high-impact events distinctly from routine news.
Acceptance Criteria
- WHEN the Pattern_Matcher mines historical data, THE Pattern_Matcher SHALL classify document_impact_records into two tiers: major_corporate_decision (catalyst types including m_and_a, legal, restructuring, leadership_change, strategic_pivot, buyback, dividend_change) and routine_signal (all other catalyst types), and compute separate Historical_Patterns for each tier.
- WHEN a major_corporate_decision pattern is found, THE Pattern_Matcher SHALL apply a higher base weight to the pattern_confidence calculation compared to routine_signal patterns, reflecting that major decisions have more predictable and durable market impact.
- WHEN the Pattern_Matcher computes a Historical_Pattern for a major_corporate_decision, THE Pattern_Matcher SHALL extend the default lookback window to 365 days (compared to 180 days for routine signals), since major corporate decisions are rarer but their outcomes are more structurally significant.
- WHEN an analyst views a company detail page, THE Dashboard SHALL display a corporate decision timeline showing major_corporate_decision events extracted from the company's document intelligence history, with the catalyst type, date, summary, and the trend outcome that followed.
- WHEN the Pattern_Matcher evaluates competitive signal propagation for a major_corporate_decision catalyst, THE Pattern_Matcher SHALL search for historical instances where similar major decisions by competitors produced measurable trend shifts for the target company, using the extended 365-day lookback window.
- THE Query API SHALL expose a GET /api/patterns/{ticker}/decisions endpoint returning the company's major corporate decision history with associated trend outcomes and pattern statistics.
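The two-tier classification and the tier-specific lookback windows can be sketched as follows. The catalyst set is copied from the first acceptance criterion of this requirement, which introduces it with "including", so the real set may be larger. The function names are hypothetical.

```python
from datetime import date, timedelta

# Catalyst types named in the acceptance criteria as major corporate decisions.
MAJOR_DECISION_CATALYSTS = frozenset({
    "m_and_a", "legal", "restructuring", "leadership_change",
    "strategic_pivot", "buyback", "dividend_change",
})


def classify_tier(catalyst_type: str) -> str:
    """Map a catalyst_type to major_corporate_decision or routine_signal."""
    if catalyst_type in MAJOR_DECISION_CATALYSTS:
        return "major_corporate_decision"
    return "routine_signal"


def lookback_days(catalyst_type: str) -> int:
    """365 days for major corporate decisions, 180 days for routine signals."""
    return 365 if classify_tier(catalyst_type) == "major_corporate_decision" else 180


def within_lookback(event_date: date, as_of: date, catalyst_type: str) -> bool:
    """True when the event falls inside the tier's lookback window ending at as_of."""
    age = as_of - event_date
    return timedelta(0) <= age <= timedelta(days=lookback_days(catalyst_type))
```

An acquisition recorded 274 days before the query date stays in scope under the 365-day window, while a routine catalyst of the same age falls outside the 180-day window.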