feat: signal math upgrade — probabilistic, regime-aware scoring pipeline

Implement full probabilistic signal processing pipeline gated behind
probabilistic_scoring_enabled feature flag in risk_configs:

- Bayesian log-likelihood accumulator with Beta posterior and entropy
- Regime detector (trend-following, panic, mean-reversion, uncertainty)
- Source accuracy tracker with per-source historical prediction accuracy
- Sigmoid confidence gate replacing binary gate
- Information gain surprise weighting for rare events
- Adaptive recency decay with event-specific half-lives
- Regime multiplier replacing market context multiplier
- Weighted disagreement entropy for contradiction detection
- Multiplicative macro exposure with conditional integration
- Graph-distance attenuated competitive signal propagation
- Exponentially weighted momentum with volatility scaling
- Expected value recommendation gate

All changes backward-compatible: flag=false preserves exact current behavior.
New outputs stored in existing JSONB columns (no schema changes except
source_accuracy table via migration 034).

Tests: 26 property-based tests (14 correctness properties), 99 unit tests,
1789 total tests passing with zero regressions.
Celes Renata
2026-04-29 11:41:48 +00:00
parent 8c3c1aab43
commit 4e010bc048
24 changed files with 6058 additions and 60 deletions
@@ -0,0 +1 @@
{"specId": "b595d834-7e72-4fab-87a9-65c92115a069", "workflowType": "requirements-first", "specType": "feature"}
@@ -0,0 +1,732 @@
# Design Document — Signal Math Upgrade
## Overview
This design upgrades the Stonks Oracle signal processing pipeline from deterministic heuristic formulas to a probabilistic, regime-aware, and adaptive mathematical framework. The upgrade spans all pipeline stages — signal scoring, trend assembly, macro impact, competitive signals, trend projection, and recommendation generation — while preserving the existing `WeightedSignal` abstraction, three-layer architecture, database schema, and dataclass interfaces.
The core transformation replaces:
- **Binary confidence gate** → smooth sigmoid transition
- **Weighted sentiment average** → Bayesian log-likelihood accumulation with Beta posterior
- **Fixed recency decay** → adaptive event-specific half-lives
- **Linear macro exposure** → multiplicative compounding exposure
- **Additive macro integration** → conditional multiplicative modifiers
- **Simple contradiction ratio** → weighted disagreement entropy
- **Heuristic trend confidence** → Bayesian posterior variance
- **Threshold-based direction** → entropy-based mixed signal detection
- **Simple momentum** → exponentially weighted momentum with volatility scaling
- **Confidence/strength gates** → expected value recommendation gate
- **Fixed relationship transfer** → graph-distance attenuated competitive signals
All changes are gated behind a `probabilistic_scoring_enabled` feature flag in `risk_configs`, allowing incremental rollout with instant rollback. New outputs (P_bull, α, β, entropy, regime, EV) are stored in existing JSONB columns; the only schema change is the new `source_accuracy` table, which is added via a migration.
### Design Rationale
Markets are fundamentally probabilistic and regime-dependent. The current pipeline collapses rich evidence into binary sentiment labels and fixed-weight averages, losing uncertainty structure. A Bayesian framework preserves the full posterior distribution, enabling the system to distinguish between "strongly bullish" and "weakly bullish with high uncertainty" — a distinction that directly impacts position sizing and risk management.
The regime detector adapts scoring thresholds to market conditions (panic vs. trending vs. mean-reverting), and the expected value gate ensures recommendations only proceed when the risk-adjusted outcome is positive. Together, these changes transform the pipeline from a sentiment aggregator into a probabilistic forecasting engine.
---
## Architecture
### High-Level Pipeline Flow
The upgraded pipeline maintains the existing three-layer architecture but introduces new computation stages within each layer. The feature flag controls which computation path is taken at each stage.
```mermaid
flowchart TD
subgraph "Layer 1: Company Signals"
A[Document Intelligence Records] --> B[Signal Scorer]
B --> |"probabilistic=false"| C1[Binary Gate + Fixed Decay]
B --> |"probabilistic=true"| C2[Sigmoid Gate + Adaptive Decay<br/>+ Info Gain + Source Accuracy]
C1 --> D[WeightedSignal list]
C2 --> D
end
subgraph "Layer 2: Macro Signals"
E[Global Events] --> F[Macro Scorer]
F --> |"probabilistic=false"| G1[Linear Weighted Sum]
F --> |"probabilistic=true"| G2[Multiplicative Exposure]
G1 --> H[Macro WeightedSignals]
G2 --> H
end
subgraph "Layer 3: Competitive Signals"
I[Pattern Matcher] --> J[Signal Propagation]
J --> |"probabilistic=false"| K1[Flat Transfer Strength]
J --> |"probabilistic=true"| K2[Graph-Distance Attenuation]
K1 --> L[Competitive WeightedSignals]
K2 --> L
end
subgraph "Regime Detection (new)"
M[Market Data] --> N[Regime Detector]
N --> O{Regime Classification}
O --> P[trend-following / panic / mean-reversion / uncertainty]
end
subgraph "Trend Assembly"
D --> Q[Merge Signals]
H --> |"probabilistic=false"| Q
H --> |"probabilistic=true"| R[Conditional Macro Modifier]
R --> Q
L --> Q
Q --> S[Trend Assembler]
S --> |"probabilistic=false"| T1[Heuristic Confidence + Threshold Direction]
S --> |"probabilistic=true"| T2[Bayesian Posterior + Entropy Direction<br/>+ Regime-Adjusted Thresholds]
P --> T2
T1 --> U[TrendSummary]
T2 --> U
end
subgraph "Projection"
U --> V[Projection Engine]
V --> |"probabilistic=false"| W1[Simple Momentum]
V --> |"probabilistic=true"| W2[EW Momentum + Vol Scaling]
W1 --> X[TrendProjection]
W2 --> X
end
subgraph "Recommendation"
U --> Y[Recommendation Engine]
X --> Y
Y --> |"probabilistic=false"| Z1[Confidence + Strength Gates]
Y --> |"probabilistic=true"| Z2[EV Gate + Existing Gates]
Z1 --> AA[Recommendation]
Z2 --> AA
end
```
### Feature Flag Control Flow
The feature flag `probabilistic_scoring_enabled` is read from the `risk_configs` table's `config` JSONB column at the start of each aggregation cycle. It propagates through all pipeline stages via the existing `AggregationConfig` dataclass.
```mermaid
sequenceDiagram
participant W as Worker (aggregate_company)
participant DB as PostgreSQL (risk_configs)
participant S as Signal Scorer
participant T as Trend Assembler
participant R as Recommendation Engine
W->>DB: SELECT config FROM risk_configs WHERE active=TRUE
DB-->>W: {"macro_enabled": true, "competitive_enabled": true, "probabilistic_scoring_enabled": false}
W->>W: Log pipeline mode (heuristic or probabilistic)
W->>S: compute_signal_weight(..., probabilistic=flag)
S-->>W: WeightedSignal (with or without Bayesian fields)
W->>T: assemble_trend_summary(..., probabilistic=flag)
T-->>W: TrendSummary (with or without entropy/regime)
W->>R: evaluate_eligibility(..., probabilistic=flag)
R-->>W: Recommendation (with or without EV gate)
```
---
## Components and Interfaces
### New Modules
| Module | File | Responsibility |
|--------|------|----------------|
| Bayesian Accumulator | `services/aggregation/bayesian.py` | Log-likelihood accumulation, Beta posterior, P_bull, Bayesian confidence |
| Regime Detector | `services/aggregation/regime.py` | EMA computation, volatility ratio, regime classification, threshold adjustment |
| Adaptive Decay | integrated into `scoring.py` | Event-specific half-life computation from impact, surprise, market reaction |
| Information Gain | integrated into `scoring.py` | Surprise weighting from event type base rates |
| Source Accuracy | `services/aggregation/source_accuracy.py` | Historical prediction accuracy tracking per source |
| Entropy Detector | integrated into `bayesian.py` | Shannon entropy for mixed signal detection |
| EV Gate | integrated into `eligibility.py` | Expected value computation for recommendation eligibility |
### Modified Modules
| Module | File | Changes |
|--------|------|---------|
| Signal Scorer | `services/aggregation/scoring.py` | Sigmoid gate, info gain factor, adaptive decay, regime multiplier, source accuracy factor |
| Trend Assembler | `services/aggregation/worker.py` | Bayesian confidence, entropy-based direction, regime-adjusted thresholds, entropy-based contradiction |
| Contradiction | `services/aggregation/contradiction.py` | Weighted disagreement entropy replacing minority/majority ratio |
| Macro Scorer | `services/aggregation/interpolation.py` | Multiplicative exposure formula, conditional integration mode |
| Competitive Scorer | `services/aggregation/signal_propagation.py` | Graph-distance attenuation with historical correlation |
| Projection Engine | `services/aggregation/projection.py` | Exponentially weighted momentum, volatility scaling |
| Recommendation | `services/recommendation/eligibility.py` | EV gate, P_bull-based position sizing adjustments |
| Config | `services/shared/config.py` | New probabilistic config parameters |
| Schemas | `services/shared/schemas.py` | Optional new fields on TrendSummary, Recommendation |
### Component Interface Details
#### 1. Bayesian Accumulator (`services/aggregation/bayesian.py`)
```python
@dataclass(frozen=True)
class BayesianPosterior:
    """Bayesian posterior state from signal accumulation."""

    p_bull: float               # σ(L_t), bullish probability [0, 1]
    alpha: float                # Beta distribution α parameter (≥ 1.0)
    beta: float                 # Beta distribution β parameter (≥ 1.0)
    log_likelihood: float       # Raw log-likelihood accumulation L_t
    bayesian_confidence: float  # 1 - 4αβ/(α+β)², [0, 1]
    entropy: float              # Shannon entropy H, [0, 1]
    signal_count: int           # Number of signals processed


# Uninformative prior (no evidence)
PRIOR = BayesianPosterior(
    p_bull=0.5, alpha=1.0, beta=1.0,
    log_likelihood=0.0, bayesian_confidence=0.0,
    entropy=1.0, signal_count=0,
)


def compute_bayesian_posterior(
    signals: list[WeightedSignal],
) -> BayesianPosterior:
    """Accumulate weighted signals into a Bayesian posterior.

    Computes:
    - Log-likelihood: L_t = Σ(w_i · s_i)
    - Bullish probability: P_bull = σ(L_t)
    - Beta posterior: α = 1 + W_bull, β = 1 + W_bear
    - Bayesian confidence: C = 1 - 4αβ/(α+β)²
    - Shannon entropy: H = -p·log₂(p) - (1-p)·log₂(1-p)
    """
    ...


def compute_entropy(p_bull: float) -> float:
    """Shannon entropy H = -p·log₂(p) - (1-p)·log₂(1-p).

    Returns value in [0, 1]. Maximum at p=0.5, zero at p=0 or p=1.
    Handles edge cases p=0 and p=1 by returning 0.0.
    """
    ...
```
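As a rough illustration of the formulas listed in the `compute_bayesian_posterior` docstring, the sketch below accumulates plain `(weight, sentiment)` pairs rather than `WeightedSignal` objects. The design does not spell out how W_bull and W_bear are computed, so summing w·|s| over positive and negative signals is an assumption here, and `PosteriorSketch`, `posterior_from_pairs`, `binary_entropy`, and `stable_sigmoid` are hypothetical names, not part of the codebase.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PosteriorSketch:
    """Hypothetical stand-in for BayesianPosterior (illustration only)."""
    p_bull: float
    alpha: float
    beta: float
    log_likelihood: float
    bayesian_confidence: float
    entropy: float


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits; returns 0.0 at the p=0 and p=1 edge cases."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def stable_sigmoid(x: float) -> float:
    """Logistic function written to avoid overflow for large |x|."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def posterior_from_pairs(pairs: list[tuple[float, float]]) -> PosteriorSketch:
    """Accumulate (combined_weight, sentiment_value) pairs into a posterior."""
    log_likelihood = sum(w * s for w, s in pairs)
    # Assumption: evidence mass per side is the sum of w * |s| on that side.
    w_bull = sum(w * s for w, s in pairs if s > 0)
    w_bear = sum(w * -s for w, s in pairs if s < 0)
    alpha = 1.0 + w_bull
    beta = 1.0 + w_bear
    p_bull = stable_sigmoid(log_likelihood)
    confidence = 1.0 - 4.0 * alpha * beta / (alpha + beta) ** 2
    return PosteriorSketch(
        p_bull, alpha, beta, log_likelihood, confidence, binary_entropy(p_bull)
    )
```

With an empty list this reproduces the uninformative prior (P_bull = 0.5, α = β = 1, C = 0, H = 1), and a perfectly balanced pair of opposing signals leaves the confidence at zero.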
#### 2. Regime Detector (`services/aggregation/regime.py`)
```python
class MarketRegime(str, Enum):
    TREND_FOLLOWING = "trend_following"
    PANIC = "panic"
    MEAN_REVERSION = "mean_reversion"
    UNCERTAINTY = "uncertainty"


@dataclass(frozen=True)
class RegimeClassification:
    """Result of regime detection for a ticker."""

    regime: MarketRegime
    trend_indicator: float      # R = sign(EMA_20 - EMA_100)
    volatility_ratio: float     # V_r = σ_20 / σ_100
    bullish_threshold: float    # Adjusted ±threshold for direction
    bearish_threshold: float
    contradiction_penalty_multiplier: float  # 0.4 default, 0.6 for uncertainty


@dataclass(frozen=True)
class RegimeConfig:
    ema_short_period: int = 20
    ema_long_period: int = 100
    vol_short_period: int = 20
    vol_long_period: int = 100
    panic_vol_ratio: float = 1.5
    trend_vol_ratio: float = 1.2
    mean_reversion_vol_ratio: float = 1.0
    default_threshold: float = 0.15
    panic_threshold: float = 0.10
    mean_reversion_threshold: float = 0.20
    uncertainty_contradiction_multiplier: float = 0.6


def classify_regime(
    closing_prices: list[float],
    returns: list[float],
    config: RegimeConfig = RegimeConfig(),
) -> RegimeClassification:
    """Classify market regime from price and return history.

    Requires at least 100 days of price history for EMA_100.
    Falls back to UNCERTAINTY when data is insufficient.
    """
    ...


def compute_ema(values: list[float], period: int) -> float:
    """Compute exponential moving average over the last `period` values."""
    ...
```
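The interface above leaves the EMA recurrence unspecified. A minimal sketch, assuming the common smoothing factor 2 / (period + 1) and seeding with the first value of the window, might look like the following. The helper names `ema` and `trend_indicator` are hypothetical, and returning `None` for short histories simply stands in for the documented fallback to the uncertainty regime.

```python
from typing import Optional


def ema(values: list[float], period: int) -> float:
    """EMA seeded with the first value; smoothing factor 2 / (period + 1)."""
    if period < 1 or not values:
        raise ValueError("need a positive period and at least one value")
    smoothing = 2.0 / (period + 1)
    result = values[0]
    for value in values[1:]:
        result = smoothing * value + (1.0 - smoothing) * result
    return result


def trend_indicator(
    closing_prices: list[float], short: int = 20, long: int = 100
) -> Optional[int]:
    """R = sign(EMA_short - EMA_long), or None when history is too short."""
    if len(closing_prices) < long:
        return None  # caller would fall back to the uncertainty regime
    diff = ema(closing_prices[-short:], short) - ema(closing_prices[-long:], long)
    return (diff > 0) - (diff < 0)
```

On a steadily rising price series the short EMA sits above the long EMA, so the indicator is +1; on a steadily falling series it is -1.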
#### 3. Source Accuracy Tracker (`services/aggregation/source_accuracy.py`)
```python
@dataclass
class SourceAccuracy:
    """Per-source historical prediction accuracy."""

    source_id: str
    accuracy_ratio: float   # [0, 1] fraction of correct directional calls
    sample_count: int       # Number of signals with known outcomes
    last_updated: datetime

    @property
    def accuracy_factor(self) -> float:
        """Multiplicative factor for credibility weight.

        Returns 1.0 (neutral) when sample_count < 10.
        Otherwise scales linearly from 0.5 (0% accuracy) to 1.5 (100% accuracy).
        """
        if self.sample_count < 10:
            return 1.0
        return 0.5 + self.accuracy_ratio


async def fetch_source_accuracy(
    pool: asyncpg.Pool,
    source_ids: list[str],
) -> dict[str, SourceAccuracy]:
    """Fetch accuracy metrics for a batch of sources."""
    ...


async def update_source_accuracy(
    pool: asyncpg.Pool,
    source_id: str,
    realized_outcomes: list[tuple[str, float]],  # (predicted_direction, actual_7d_return)
) -> None:
    """Update accuracy metrics for a source based on realized price data."""
    ...
```
#### 4. Extended ScoringConfig
New fields added to the existing `ScoringConfig` dataclass in `scoring.py`:
```python
@dataclass(frozen=True)
class ScoringConfig:
    # ... existing fields preserved ...

    # Probabilistic scoring toggle (mirrors feature flag for local use)
    probabilistic: bool = False

    # Sigmoid gate parameters
    sigmoid_steepness: float = 5.0   # k in σ(k·(x - midpoint))
    sigmoid_midpoint: float = 0.5    # midpoint of sigmoid transition

    # Information gain parameters
    info_gain_lambda: float = 0.3    # scaling parameter λ
    info_gain_max: float = 3.0       # maximum clamp for info gain factor
    default_base_rate: float = 0.1   # fallback when event type rate unknown

    # Adaptive decay parameters (β scaling factors)
    adaptive_decay_impact_scale: float = 1.0    # max β_impact
    adaptive_decay_surprise_scale: float = 1.0  # max β_surprise at r=3.0
    adaptive_decay_market_scale: float = 0.5    # max β_market_reaction

    # Regime multiplier parameters
    regime_return_weight: float = 0.15   # coefficient for |z_r|
    regime_volume_weight: float = 0.10   # coefficient for |z_v|
    regime_multiplier_max: float = 2.5   # M_regime ceiling
```
#### 5. Extended WeightedSignal
The existing `WeightedSignal` dataclass gains optional fields:
```python
@dataclass
class WeightedSignal:
    """A document intelligence reference paired with its computed weight."""

    document_id: str
    weight: SignalWeight
    sentiment_value: float
    impact_score: float

    # New optional fields for probabilistic mode
    info_gain_factor: float = 1.0            # r = 1 + λ·(-log₂ P(event_type))
    source_accuracy_factor: float = 1.0      # [0.5, 1.5] from historical accuracy
    adaptive_half_life: float | None = None  # τ_i when adaptive decay is active
```
#### 6. Extended SignalWeight
```python
@dataclass
class SignalWeight:
    """Breakdown of a document's aggregation weight."""

    recency: float
    credibility: float
    novelty_bonus: float
    confidence_gate: float
    market_ctx_multiplier: float
    combined: float

    # New optional fields for probabilistic mode
    sigmoid_gate: float | None = None       # Smooth gate value [0, 1]
    info_gain_factor: float = 1.0           # Surprise multiplier
    source_accuracy_factor: float = 1.0     # Historical accuracy multiplier
    regime_multiplier: float | None = None  # M_regime replacing M_context
```
#### 7. Extended TrendSummary
New optional fields on the existing Pydantic model:
```python
class TrendSummary(BaseModel):
    # ... all existing fields preserved ...

    # New optional fields for probabilistic mode
    p_bull: float | None = None               # Bayesian bullish probability
    alpha: float | None = None                # Beta posterior α
    beta_param: float | None = None           # Beta posterior β (named to avoid shadowing)
    bayesian_confidence: float | None = None  # 1 - 4αβ/(α+β)²
    entropy: float | None = None              # Shannon entropy H
    regime: str | None = None                 # Market regime classification
    pipeline_mode: str = "heuristic"          # "heuristic" or "probabilistic"
```
#### 8. Extended Recommendation
```python
class Recommendation(BaseModel):
    # ... all existing fields preserved ...

    # New optional fields for probabilistic mode
    expected_value: float | None = None  # EV = P_bull·R_up - P_bear·R_down
    p_bull: float | None = None          # Bayesian bullish probability used
    pipeline_mode: str = "heuristic"     # "heuristic" or "probabilistic"
```
---
## Data Models
### Database Storage Strategy
All new mathematical outputs are stored in existing JSONB columns. The only migration required is the one that creates the `source_accuracy` table.
#### trend_windows table
The `market_context` JSONB column (currently stores volatility/volume data) is extended to include probabilistic outputs:
```json
{
  "volatility": 1.23,
  "volume_change_pct": 45.2,
  "price_change_pct": -2.1,
  "probabilistic": {
    "p_bull": 0.72,
    "alpha": 8.3,
    "beta": 3.1,
    "log_likelihood": 0.94,
    "bayesian_confidence": 0.21,
    "entropy": 0.86,
    "regime": "trend_following",
    "regime_volatility_ratio": 0.85,
    "pipeline_mode": "probabilistic",
    "contradiction_entropy": 0.31,
    "macro_modifier": 1.15
  }
}
```
#### recommendations table
The existing `invalidation_conditions` JSONB column stores recommendation-level data, but the `recommendations` table has no dedicated metadata JSONB column. The new EV and probabilistic fields are therefore written in two places as part of the existing decision trace flow: a summary is added to the thesis text, and the structured values are stored in the `risk_checks` JSONB column of the `recommendation_evaluations` table:
```json
{
  "ev": 0.0211,
  "p_bull": 0.72,
  "r_up": 0.034,
  "r_down": 0.012,
  "pipeline_mode": "probabilistic",
  "ev_threshold": 0.005
}
```
#### risk_configs table
The `config` JSONB column gains the new feature flag:
```json
{
  "macro_enabled": true,
  "competitive_enabled": true,
  "probabilistic_scoring_enabled": false
}
```
#### source_accuracy table (new — Requirement 4)
This is the one new database table required. It is created via migration 034:
```sql
CREATE TABLE IF NOT EXISTS source_accuracy (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    source_id VARCHAR(200) NOT NULL,
    accuracy_ratio FLOAT NOT NULL DEFAULT 0.5,
    sample_count INTEGER NOT NULL DEFAULT 0,
    last_updated TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE(source_id)
);

CREATE INDEX idx_source_accuracy_source ON source_accuracy(source_id);
```
Note: This is the only schema addition. All other new outputs use existing JSONB columns.
### Event Type Base Rates
Information gain computation requires empirical base rates for event types. These are stored as a configuration constant (not in the database) and can be tuned over time:
```python
EVENT_TYPE_BASE_RATES: dict[str, float] = {
    "earnings": 0.25,          # Quarterly, common
    "product_launch": 0.10,    # Moderately rare
    "regulatory": 0.08,        # Somewhat rare
    "legal": 0.05,             # Rare
    "m_and_a": 0.03,           # Very rare
    "management_change": 0.06,
    "partnership": 0.12,
    "market_expansion": 0.09,
    "restructuring": 0.04,
    "dividend": 0.15,
}

DEFAULT_BASE_RATE = 0.1  # For unknown event types
```
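To show how these base rates feed the surprise weighting, here is a small sketch of the factor r = 1 + λ·(-log₂ P(event_type)) using the documented defaults (λ = 0.3, clamp at 3.0, fallback rate 0.1). The function name `info_gain_factor` and the two-entry `SAMPLE_RATES` table are hypothetical, and flooring the result at 1.0 is an assumption drawn from the documented range r ∈ [1.0, 3.0].

```python
import math

# Two rates copied from EVENT_TYPE_BASE_RATES, for illustration only.
SAMPLE_RATES = {"earnings": 0.25, "m_and_a": 0.03}


def info_gain_factor(
    event_type: str,
    base_rates: dict[str, float],
    lam: float = 0.3,
    max_factor: float = 3.0,
    default_rate: float = 0.1,
) -> float:
    """Surprise multiplier: rarer event types get a larger factor."""
    rate = base_rates.get(event_type, default_rate)
    if not 0.0 < rate <= 1.0:
        rate = default_rate  # guard against log2(0) and invalid rates
    factor = 1.0 + lam * -math.log2(rate)
    return min(max(factor, 1.0), max_factor)
```

Under these defaults an earnings event (2 bits of surprise) gets a factor of about 1.6, an M&A event about 2.5, and any extremely rare event is capped at 3.0.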
### Configuration Hierarchy
```
risk_configs.config (DB, runtime)
└── probabilistic_scoring_enabled: bool
    └── AggregationConfig.probabilistic: bool (in-memory)
        └── ScoringConfig.probabilistic: bool (per-cycle)
            ├── scoring.py: sigmoid vs binary gate
            ├── scoring.py: adaptive vs fixed decay
            ├── scoring.py: info gain factor
            ├── scoring.py: regime multiplier vs market context
            ├── worker.py: Bayesian vs heuristic confidence
            ├── worker.py: entropy vs threshold direction
            ├── contradiction.py: entropy vs ratio
            ├── interpolation.py: multiplicative vs linear
            ├── signal_propagation.py: graph-distance vs flat
            ├── projection.py: EW momentum vs simple
            └── eligibility.py: EV gate vs threshold-only
```
---
## Correctness Properties
*A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*
The following properties were derived from the acceptance criteria through systematic prework analysis. Each property is universally quantified and maps to specific requirements. Redundant properties were consolidated during reflection (e.g., requirements 17.1–17.7 duplicate properties already stated in requirements 1–15).
### Property 1: Sigmoid Gate Monotonicity
*For any* two extraction confidence values x₁, x₂ ∈ [0.0, 1.0] where x₁ ≤ x₂, the sigmoid gate σ(5·(x₁ - 0.5)) SHALL be less than or equal to σ(5·(x₂ - 0.5)). Higher confidence always produces equal or higher gate values.
**Validates: Requirements 2.6, 17.1**
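A minimal sketch of the gate in Property 1, using the documented defaults k = 5 and midpoint 0.5; the function name `sigmoid_gate` is hypothetical.

```python
import math


def sigmoid_gate(confidence: float, steepness: float = 5.0, midpoint: float = 0.5) -> float:
    """Smooth confidence gate σ(k·(x - midpoint)) for x in [0, 1]."""
    z = steepness * (confidence - midpoint)
    return 1.0 / (1.0 + math.exp(-z))


# Spot-check monotonicity on an evenly spaced grid of confidence values.
grid = [i / 100 for i in range(101)]
gates = [sigmoid_gate(x) for x in grid]
is_monotone = all(a <= b for a, b in zip(gates, gates[1:]))
```

With k = 5 the transition is fairly gentle: a confidence of 0.2 still passes roughly 18% of the weight and 0.8 passes roughly 82%, so a sharper cutoff would need a larger steepness.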
### Property 2: Beta Posterior Evidence Accumulation
*For any* sequence of weighted signal sets where each successive set contains one additional signal, the sum α + β of the Beta posterior parameters SHALL increase monotonically. Evidence always accumulates — adding a signal never reduces the total evidence mass.
**Validates: Requirements 1.3, 17.2**
### Property 3: Bayesian Confidence Symmetry and Divergence
*For any* Beta posterior with parameters α, β ≥ 1.0, the Bayesian confidence C = 1 - 4αβ/(α+β)² SHALL equal 0.0 when α = β (maximum uncertainty) and SHALL increase monotonically as the ratio max(α/β, β/α) increases. Confidence reflects evidence concentration, not evidence volume.
**Validates: Requirements 1.4, 17.3**
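The claim that confidence tracks concentration rather than volume follows from rewriting the formula: 1 - 4αβ/(α+β)² equals ((α - β)/(α + β))², which depends only on the ratio α/β. The sketch below checks that identity numerically; both function names are hypothetical.

```python
def bayesian_confidence(alpha: float, beta: float) -> float:
    """C = 1 - 4αβ/(α+β)², as stated in Property 3."""
    return 1.0 - 4.0 * alpha * beta / (alpha + beta) ** 2


def concentration_form(alpha: float, beta: float) -> float:
    """Algebraically equivalent form: ((α - β) / (α + β))²."""
    return ((alpha - beta) / (alpha + beta)) ** 2


# Scaling both parameters by 10 leaves the confidence unchanged (up to rounding),
# while increasing the ratio α/β raises it.
same_ratio = (bayesian_confidence(2.0, 1.0), bayesian_confidence(20.0, 10.0))
higher_ratio = bayesian_confidence(3.0, 1.0)
```

One consequence worth keeping in mind when reading stored values: α = 8.3 and β = 3.1 give a confidence of about 0.21, not a value near 1, because the ratio is under 3:1.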
### Property 4: Bayesian Posterior Round-Trip Consistency
*For any* set of weighted signals with uniform weights, computing the Beta posterior and extracting the mean P_bull = α/(α+β) SHALL produce a value within 0.05 of σ(L_t) where L_t is the log-likelihood accumulation. The two probabilistic representations are consistent.
**Validates: Requirements 1.7, 17.7**
### Property 5: Adaptive Decay Lower Bound
*For any* valid combination of impact_score ∈ [0, 1], information gain factor r ∈ [1.0, 3.0], and market context multiplier ∈ [1.0, 1.45], the adaptive half-life τ_i SHALL be greater than or equal to the base half-life τ_base. Adaptive decay is always slower or equal to fixed decay, never faster.
**Validates: Requirements 5.7, 17.4**
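Property 5 does not restate the half-life formula. One form that satisfies the lower bound, and that reproduces the "all max → 6×τ_base" case listed in the example-based unit tests, is a product of (1 + β) terms. The sketch below assumes that product form and a linear mapping of each input onto its β; both are guesses for illustration, and `adaptive_half_life` is a hypothetical name.

```python
def _unit(x: float) -> float:
    """Clamp to [0, 1]."""
    return min(max(x, 0.0), 1.0)


def adaptive_half_life(
    tau_base: float,
    impact_score: float,       # expected in [0, 1]
    info_gain: float,          # r, expected in [1.0, 3.0]
    market_multiplier: float,  # expected in [1.0, 1.45]
    impact_scale: float = 1.0,
    surprise_scale: float = 1.0,
    market_scale: float = 0.5,
) -> float:
    """Assumed form: τ_i = τ_base · (1+β_impact) · (1+β_surprise) · (1+β_market)."""
    beta_impact = impact_scale * _unit(impact_score)
    beta_surprise = surprise_scale * _unit((info_gain - 1.0) / 2.0)
    beta_market = market_scale * _unit((market_multiplier - 1.0) / 0.45)
    return tau_base * (1.0 + beta_impact) * (1.0 + beta_surprise) * (1.0 + beta_market)
```

Because every β is non-negative, each factor is at least 1.0 and the result can never drop below τ_base, which is exactly the bound the property asks for.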
### Property 6: Information Gain Monotonicity
*For any* two event type base rates p₁, p₂ ∈ (0, 1] where p₁ < p₂, the information gain factor r(p₁) SHALL be greater than or equal to r(p₂). Rarer events always receive higher surprise weight.
**Validates: Requirements 3.5**
### Property 7: Multiplicative Macro Exposure Monotonicity
*For any* overlap configuration (O_geo, O_supply, O_commodity, O_sector) and any dimension k where O_k = 0, setting O_k to any positive value SHALL increase the total macro impact score. Multi-dimensional exposure always compounds — it never reduces impact.
**Validates: Requirements 10.7, 17.5**
### Property 8: Shannon Entropy Range and Maximum
*For any* bullish probability P_bull ∈ (0, 1), the Shannon entropy H = -P_bull·log₂(P_bull) - (1-P_bull)·log₂(1-P_bull) SHALL be in the range (0, 1], with the maximum value of 1.0 occurring at P_bull = 0.5.
**Validates: Requirements 9.7**
### Property 9: Contradiction Entropy Monotonicity
*For any* set of weighted signals containing both positive and negative sentiment signals, the contradiction entropy score SHALL increase monotonically as the weight distribution f_pos approaches 0.5 (equal split). More balanced disagreement always produces higher contradiction.
**Validates: Requirements 15.7**
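A possible reading of the weighted disagreement entropy is the binary entropy of f_pos, the share of non-neutral weight carried by positive signals. The sketch below assumes that definition and takes `(weight, sentiment)` pairs as input; `contradiction_entropy` is a hypothetical name.

```python
import math


def contradiction_entropy(pairs: list[tuple[float, float]]) -> float:
    """Binary entropy of the positive weight share; 0.0 if one side is absent."""
    w_pos = sum(w for w, s in pairs if s > 0)
    w_neg = sum(w for w, s in pairs if s < 0)
    if w_pos <= 0.0 or w_neg <= 0.0:
        return 0.0  # covers empty input and all-neutral or one-sided signals
    f_pos = w_pos / (w_pos + w_neg)
    return -f_pos * math.log2(f_pos) - (1.0 - f_pos) * math.log2(1.0 - f_pos)
```

An even split yields the maximum score of 1.0, while a 90/10 split scores lower than a 70/30 split, matching the monotonicity the property describes.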
### Property 10: Exponentially Weighted Momentum Direction
*For any* sequence of monotonically increasing signed trend strengths (each ΔS_{t-k} > 0), the exponentially weighted momentum M_t SHALL be positive. Consistently strengthening bullish trends always produce positive momentum.
**Validates: Requirements 13.6, 17.6**
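To make Property 10 concrete, the sketch below computes a normalized exponentially weighted average of strength changes and then scales it by volatility. The decay factor of 0.7 and the choice to divide by σ_20 are assumptions; the 0.01 volatility floor, the [-2.0, 2.0] clamp, and the fallback for fewer than two cycles come from elsewhere in this design. `ew_momentum` is a hypothetical name.

```python
from typing import Optional


def ew_momentum(
    strengths: list[float],   # signed trend strengths, oldest first
    sigma_20: float,
    decay: float = 0.7,       # assumed decay factor, not from the design
) -> Optional[float]:
    """Volatility-scaled EW momentum, or None when there is too little history."""
    if len(strengths) < 2:
        return None  # caller would use the heuristic fallback
    deltas = [b - a for a, b in zip(strengths, strengths[1:])]
    weights = [decay ** k for k in range(len(deltas))]  # k = 0 is the newest delta
    weighted = sum(w * d for w, d in zip(weights, reversed(deltas)))
    momentum = weighted / sum(weights)
    adjusted = momentum / max(sigma_20, 0.01)  # floor avoids division by zero
    return min(max(adjusted, -2.0), 2.0)
```

Since every weight is positive, a sequence whose deltas are all positive always produces positive momentum, regardless of the decay factor chosen.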
### Property 11: Competitive Signal Distance Attenuation
*For any* source-target company pair with fixed source signal strength S_source and historical correlation ρ_historical, the transfer strength S_transfer SHALL decrease monotonically with increasing graph distance d_network. Closer competitors always receive stronger signal transfer.
**Validates: Requirements 12.7**
### Property 12: Expected Value Directional Consistency
*For any* Bayesian bullish probability P_bull > 0.5 and non-negative estimated returns where R_up > R_down, the expected value EV = P_bull · R_up - (1 - P_bull) · R_down SHALL be positive. When the model is bullish and upside exceeds downside, EV is always positive.
**Validates: Requirements 17.8**
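The EV formula and the 0.005 threshold used elsewhere in this design can be sketched as follows; `expected_value` and `passes_ev_gate` are hypothetical names.

```python
def expected_value(p_bull: float, r_up: float, r_down: float) -> float:
    """EV = P_bull · R_up - (1 - P_bull) · R_down."""
    return p_bull * r_up - (1.0 - p_bull) * r_down


def passes_ev_gate(p_bull: float, r_up: float, r_down: float, threshold: float = 0.005) -> bool:
    """True when EV strictly exceeds the threshold; otherwise informational only."""
    return expected_value(p_bull, r_up, r_down) > threshold


# Worked example: P_bull = 0.72, R_up = 3.4%, R_down = 1.2%.
ev = expected_value(0.72, 0.034, 0.012)  # 0.02448 - 0.00336
```

For these inputs the EV is about 0.0211, comfortably above the 0.005 threshold, whereas a coin-flip probability with symmetric returns yields an EV of zero and stays informational.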
### Property 13: Bayesian Confidence Monotonic with Agreeing Signals
*For any* set of weighted signals where all signals agree on direction (all positive or all negative), adding one more agreeing signal SHALL increase the Bayesian confidence C. More agreeing evidence always increases confidence.
**Validates: Requirements 8.6**
### Property 14: Numerical Stability Across All Formulas
*For any* valid input combination to any formula in the probabilistic pipeline (sigmoid gate, Beta posterior, Bayesian confidence, adaptive decay, regime multiplier, Shannon entropy, multiplicative exposure, EW momentum, expected value), the output SHALL be a finite float (not NaN, not infinity) within the documented range for that formula. This includes regime multiplier M_regime ∈ [1.0, 2.5], entropy H ∈ [0, 1], P_bull ∈ [0, 1], confidence ∈ [0, 1], and M_adj ∈ [-2.0, 2.0].
**Validates: Requirements 17.9, 6.4**
---
## Error Handling
### Numerical Edge Cases
| Scenario | Handling |
|----------|----------|
| P_bull = 0.0 or 1.0 (entropy undefined) | Return H = 0.0 (no uncertainty at extremes) |
| σ_20 = 0.0 (zero volatility for momentum scaling) | Use floor max(σ_20, 0.01) per Req 13.4 |
| σ_20 = 0.0 or σ_100 = 0.0 (volatility ratio) | Default to uncertainty regime |
| log₂(0) in entropy computation | Guard with `if p <= 0 or p >= 1: return 0.0` |
| log₂(0) in information gain (base_rate = 0) | Base rates must be > 0; use default 0.1 for unknown |
| Division by zero in z-score (σ = 0) | Use M_regime = 1.0 when σ = 0 |
| Empty signal list | Return uninformative prior (P_bull=0.5, α=1, β=1, C=0) |
| All neutral signals (no positive or negative) | Contradiction = 0.0, direction = neutral |
| Extremely large weights (overflow risk) | Python floats handle up to ~1.8e308; clamp combined weight if needed |
| NaN from upstream data | Validate inputs; skip signals with NaN weight or sentiment |
### Feature Flag Failure Modes
| Failure | Behavior |
|---------|----------|
| `risk_configs` table unreachable | Default to `probabilistic_scoring_enabled = false` (heuristic mode) |
| `config` JSONB missing the key | Default to `false` |
| Invalid value type for flag | Default to `false`, log warning |
| Flag changes mid-cycle | Flag is read once at cycle start; change takes effect next cycle |
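
The fail-safe defaults in this table could be implemented along the following lines. This is a sketch only: `resolve_probabilistic_flag` is a hypothetical helper, and passing `None` stands in for the case where the `risk_configs` row could not be read.

```python
import logging
from typing import Any, Mapping, Optional

logger = logging.getLogger(__name__)

FLAG_KEY = "probabilistic_scoring_enabled"


def resolve_probabilistic_flag(config: Optional[Mapping[str, Any]]) -> bool:
    """Return the flag value, falling back to heuristic mode on any doubt."""
    if config is None:
        return False  # table unreachable or no active config row
    if FLAG_KEY not in config:
        return False  # key missing from the JSONB document
    value = config[FLAG_KEY]
    if isinstance(value, bool):
        return value
    # Strings such as "true" or integers such as 1 are rejected, not coerced.
    logger.warning("Invalid %s value %r; defaulting to false", FLAG_KEY, value)
    return False
```

Calling this once at the start of an aggregation cycle and passing the result down, rather than re-reading the table per stage, keeps a mid-cycle flag change from splitting one cycle across two modes.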
### Source Accuracy Failures
| Failure | Behavior |
|---------|----------|
| `source_accuracy` table unreachable | Use neutral factor 1.0 for all sources |
| Accuracy update fails | Log error, continue with stale accuracy data |
| Corrupted accuracy data (ratio > 1.0 or < 0.0) | Clamp to [0.0, 1.0] |
### Regime Detection Failures
| Failure | Behavior |
|---------|----------|
| Market data unavailable | Default to uncertainty regime with default thresholds |
| Insufficient price history (< 100 days) | Default to uncertainty regime |
| Price data contains gaps | Use available data; EMA computation handles gaps gracefully |
---
## Testing Strategy
### Dual Testing Approach
The signal math upgrade requires both property-based tests (for mathematical correctness) and example-based unit tests (for specific behaviors and integration points). Property-based testing is highly appropriate here because the feature consists primarily of pure mathematical functions with clear input/output behavior, universal properties that hold across wide input spaces, and well-defined range invariants.
### Property-Based Testing
**Library:** Hypothesis (already in use per `.hypothesis/` directory and project conventions)
**Configuration:**
- Minimum 100 iterations per property: `@settings(max_examples=100)`
- File naming: `test_pbt_signal_math.py` (or split by module)
- Tag format: `# Feature: signal-math-upgrade, Property N: <title>`
**Property tests to implement (one test per correctness property):**
| Property | Test File | Key Generators |
|----------|-----------|----------------|
| 1: Sigmoid monotonicity | `test_pbt_signal_math.py` | `st.floats(0.0, 1.0)` pairs |
| 2: Evidence accumulation | `test_pbt_signal_math.py` | `st.lists(weighted_signal_strategy)` |
| 3: Confidence symmetry/divergence | `test_pbt_signal_math.py` | `st.floats(1.0, 100.0)` for α, β |
| 4: Posterior round-trip | `test_pbt_signal_math.py` | `st.lists(uniform_weight_signal_strategy)` |
| 5: Adaptive decay lower bound | `test_pbt_signal_math.py` | `st.floats` for impact, surprise, market |
| 6: Info gain monotonicity | `test_pbt_signal_math.py` | `st.floats(0.001, 1.0)` pairs |
| 7: Macro exposure monotonicity | `test_pbt_signal_math.py` | `st.floats(0.0, 1.0)` for overlaps |
| 8: Entropy range/maximum | `test_pbt_signal_math.py` | `st.floats(0.001, 0.999)` for P_bull |
| 9: Contradiction monotonicity | `test_pbt_signal_math.py` | Signal sets with varying weight splits |
| 10: EW momentum direction | `test_pbt_signal_math.py` | `st.lists(st.floats)` monotonic sequences |
| 11: Distance attenuation | `test_pbt_signal_math.py` | `st.integers(1, 3)` for distance |
| 12: EV directional consistency | `test_pbt_signal_math.py` | `st.floats(0.5, 1.0)` for P_bull |
| 13: Confidence with agreeing signals | `test_pbt_signal_math.py` | Growing lists of same-direction signals |
| 14: Numerical stability | `test_pbt_signal_math.py` | Broad `st.floats` for all formula inputs |
### Example-Based Unit Tests
**File:** `test_signal_math_unit.py`
| Test Area | Examples |
|-----------|----------|
| Sigmoid gate specific values | x=0.5→0.5, x=0.2→≈0.18, x=0.8→≈0.82 (k=5, midpoint 0.5) |
| Uninformative prior | Empty signals → P_bull=0.5, α=1, β=1, C=0 |
| Default base rate | Unknown event type → base_rate=0.1 |
| Info gain clamp | Very rare event → factor ≤ 3.0 |
| Source accuracy threshold | sample_count < 10 → factor=1.0 |
| Adaptive decay edge cases | All zeros → τ_base, all max → 6×τ_base |
| Regime classification | Specific (R, V_r) → expected regime |
| Regime thresholds | panic→0.10, mean_reversion→0.20, etc. |
| Entropy direction mapping | H>0.9→mixed, P_bull>0.65→bullish, etc. |
| Zero overlap → zero impact | All overlaps zero → S_macro=0 |
| Max overlap value | All overlaps 1.0 → ≈severity×0.689 |
| Macro fallback behaviors | Only macro → additive, only company → no modifier |
| Graph distance cutoff | d>3 → no propagation |
| Momentum fallback | <2 cycles → heuristic fallback |
| EV threshold behavior | EV>0.005→proceed, EV≤0.005→informational |
| Feature flag behaviors | flag=false→heuristic, flag=true→probabilistic |
| Heuristic equivalence | flag=false produces identical outputs to current system |
### Integration Tests
| Test Area | Scope |
|-----------|-------|
| Source accuracy persistence | Write/read from source_accuracy table |
| Regime persistence | Store/retrieve regime in JSONB |
| EV persistence | Store/retrieve EV in recommendation_evaluations |
| Feature flag reading | Read probabilistic_scoring_enabled from risk_configs |
| End-to-end pipeline | Full aggregation cycle with probabilistic=true |
### Test Organization
```
tests/
├── test_pbt_signal_math.py # All 14 property-based tests
├── test_signal_math_unit.py # Example-based unit tests
├── test_bayesian.py # Bayesian accumulator unit tests
├── test_regime.py # Regime detector unit tests
├── test_source_accuracy.py # Source accuracy tracker tests
└── test_signal_math_integration.py # Integration tests (DB required)
```
@@ -0,0 +1,293 @@
# Requirements Document — Signal Math Upgrade
## Introduction
The Stonks Oracle platform uses a three-layer signal aggregation engine (company-specific, macro, competitive) to produce market intelligence and drive paper-trading decisions. The current mathematical models are structurally too deterministic and too linear for a market system that is fundamentally probabilistic, regime-dependent, and nonlinear. The pipeline behaves as weighted sentiment aggregation with heuristics rather than a probabilistic forecasting engine.
This feature upgrades the signal processing mathematics across all pipeline stages — from signal scoring through trend assembly, macro impact, competitive signals, trend projection, and recommendation generation — to replace heuristic formulas with probabilistic, regime-aware, and adaptive alternatives. The goal is to transform prediction quality while preserving the existing `WeightedSignal` abstraction, three-layer architecture, and database schema compatibility.
## Glossary
- **Aggregation_Engine**: The core pipeline in `services/aggregation/worker.py` that merges signals from all three layers and computes `TrendSummary` objects across five time windows.
- **Signal_Scorer**: The scoring module in `services/aggregation/scoring.py` that transforms raw intelligence records into `WeightedSignal` objects with composite aggregation weights.
- **Trend_Assembler**: The component in `services/aggregation/worker.py` that derives trend direction, strength, confidence, and contradiction from merged weighted signals.
- **Macro_Scorer**: The macro impact scoring module in `services/aggregation/interpolation.py` that computes per-company impact from global events using overlap-based exposure profiles.
- **Competitive_Scorer**: The competitive signal modules in `services/aggregation/pattern_matcher.py` and `services/aggregation/signal_propagation.py` that mine historical patterns and propagate cross-company signals.
- **Projection_Engine**: The trend projection module in `services/aggregation/projection.py` that computes forward-looking trend estimates from momentum and macro decay.
- **Recommendation_Engine**: The recommendation pipeline in `services/recommendation/` that translates trend assessments into actionable buy/sell/hold/watch decisions with position sizing.
- **WeightedSignal**: The core data abstraction pairing a document reference with a composite aggregation weight, sentiment value, and impact score.
- **Beta_Distribution**: A probability distribution on [0, 1] parameterized by α and β, used to model the posterior probability of bullish vs bearish sentiment.
- **Regime_Detector**: A new component that classifies the current market regime (trend-following, panic, mean-reversion, uncertainty) from price and volume statistics.
- **Sigmoid_Function**: The logistic function σ(x) = 1/(1+e^(-x)) used to convert log-likelihood accumulations into probabilities.
- **Adaptive_Decay**: A recency decay mechanism where the half-life varies per signal based on event impact, surprise, and market reaction rather than using a fixed constant per window.
- **Information_Gain**: A measure of how surprising an event is relative to its base rate, computed as -log₂ P(event_type), used to weight novel signals more heavily.
- **Entropy**: Shannon entropy H = -p·log(p) - (1-p)·log(1-p), used to detect mixed sentiment states where the probability distribution is spread rather than concentrated.
- **EMA**: Exponential Moving Average, a weighted moving average giving more weight to recent observations, used for trend and volatility regime detection.
---
## Requirements
### Requirement 1: Probabilistic Sentiment Accumulation via Bayesian Evidence
**User Story:** As a quantitative analyst, I want the signal scoring layer to accumulate sentiment evidence probabilistically using Bayesian methods, so that the system captures uncertainty structure instead of collapsing sentiment into binary ±1 labels.
#### Acceptance Criteria
1. WHEN a set of weighted signals is provided for a ticker and window, THE Signal_Scorer SHALL compute a log-likelihood accumulation L_t = Σ(w_i · s_i) where w_i is the combined signal weight and s_i is the sentiment value.
2. WHEN the log-likelihood L_t has been computed, THE Signal_Scorer SHALL convert the accumulation to a bullish probability using the Sigmoid_Function: P_bull = σ(L_t) = 1/(1+e^(-L_t)).
3. WHEN weighted signals are provided, THE Signal_Scorer SHALL maintain a Beta_Distribution posterior with parameters α_t = α_0 + W_bull and β_t = β_0 + W_bear, where W_bull is the sum of combined weights for positive signals and W_bear is the sum for negative signals, and α_0 = β_0 = 1.0 as uninformative priors.
4. THE Signal_Scorer SHALL compute Bayesian confidence from the Beta_Distribution posterior parameters as C = 1 - 4αβ/(α+β)² (equivalently ((α-β)/(α+β))²), where C ranges from 0.0 (maximum uncertainty at α=β) to approaching 1.0 (strong evidence concentration).
5. WHEN no signals exist for a ticker and window, THE Signal_Scorer SHALL return P_bull = 0.5, α = 1.0, β = 1.0, and C = 0.0, representing the uninformative prior state.
6. THE Signal_Scorer SHALL preserve the existing `WeightedSignal` dataclass interface, adding the Bayesian posterior fields (P_bull, α, β, Bayesian confidence) as additional output alongside the existing weighted sentiment average.
7. FOR ALL valid sets of weighted signals, computing the Beta posterior then extracting P_bull SHALL produce a value within 0.05 of σ(L_t) when signal weights are uniform (round-trip consistency between the two probabilistic representations).
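
The accumulation in criteria 1 to 5 can be sketched as follows. This is a minimal illustration, not the project's implementation: `PosteriorSketch`, `stable_sigmoid`, and `accumulate_evidence` are hypothetical names, and signals are assumed to arrive as plain `(combined_weight, sentiment)` pairs rather than `WeightedSignal` objects.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class PosteriorSketch:
    """Hypothetical result container; not the project's actual dataclass."""
    p_bull: float       # sigma(L_t)
    alpha: float
    beta: float
    confidence: float   # C = 1 - 4*alpha*beta / (alpha + beta)^2


def stable_sigmoid(x: float) -> float:
    """Logistic function that avoids overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def accumulate_evidence(signals: Iterable[Tuple[float, float]],
                        alpha0: float = 1.0,
                        beta0: float = 1.0) -> PosteriorSketch:
    """signals: (combined_weight, sentiment) pairs, sentiment in [-1, 1]."""
    log_likelihood = 0.0
    alpha, beta = alpha0, beta0
    for weight, sentiment in signals:
        log_likelihood += weight * sentiment   # L_t = sum(w_i * s_i)
        if sentiment > 0:
            alpha += weight                    # contributes to W_bull
        elif sentiment < 0:
            beta += weight                     # contributes to W_bear
    total = alpha + beta
    # max() guards against a tiny negative value from floating-point rounding.
    confidence = max(0.0, 1.0 - 4.0 * alpha * beta / (total * total))
    return PosteriorSketch(stable_sigmoid(log_likelihood), alpha, beta, confidence)
```

With no signals the sketch returns the uninformative prior state of criterion 5; a single positive signal of weight 2.0 gives α = 3, β = 1 and C = 0.25.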
---
### Requirement 2: Sigmoid Confidence Gate Replacing Binary Gate
**User Story:** As a quantitative analyst, I want the binary confidence gate replaced with a smooth sigmoid transition, so that marginally confident signals contribute proportionally rather than being completely discarded or fully included.
#### Acceptance Criteria
1. WHEN a document signal has extraction confidence x, THE Signal_Scorer SHALL compute a soft gate value p = σ(5·(x - 0.5)) = 1/(1+e^(-5·(x-0.5))) instead of the current binary 0/1 gate.
2. WHEN extraction confidence is 0.5, THE Signal_Scorer SHALL produce a gate value of 0.5 (the sigmoid midpoint).
3. WHEN extraction confidence is below 0.2, THE Signal_Scorer SHALL produce a gate value below 0.05, preserving near-zero weight for very low confidence signals.
4. WHEN extraction confidence is above 0.8, THE Signal_Scorer SHALL produce a gate value above 0.95, preserving near-full weight for high confidence signals.
5. THE Signal_Scorer SHALL use the sigmoid gate value as a multiplicative factor in the combined weight formula in place of the current binary G_conf.
6. FOR ALL extraction confidence values in [0.0, 1.0], THE Signal_Scorer SHALL produce gate values that are monotonically increasing (higher confidence always produces equal or higher gate values).
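
A minimal sketch of the soft gate, assuming the `sigmoid_steepness` and `sigmoid_midpoint` configuration fields map onto the two keyword arguments below (the function name is hypothetical):

```python
import math


def sigmoid_gate(confidence: float,
                 steepness: float = 5.0,
                 midpoint: float = 0.5) -> float:
    """Soft confidence gate: sigma(steepness * (confidence - midpoint))."""
    return 1.0 / (1.0 + math.exp(-steepness * (confidence - midpoint)))


if __name__ == "__main__":
    for k in (5.0, 10.0):
        row = ", ".join(f"x={x:.1f}: {sigmoid_gate(x, steepness=k):.3f}"
                        for x in (0.2, 0.5, 0.8))
        print(f"steepness={k}: {row}")
```

Evaluating the formula directly is a useful check on criteria 3 and 4: with steepness 5 the gate is about 0.18 at x = 0.2 and about 0.82 at x = 0.8, while the 0.05 and 0.95 bounds are only met once the steepness reaches roughly ln(19)/0.3 ≈ 9.8. Either the steepness or the bounds may therefore need revisiting.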
---
### Requirement 3: Information Gain Surprise Weighting
**User Story:** As a quantitative analyst, I want signals weighted by their information gain (surprise factor), so that rare and unexpected events receive proportionally higher influence than routine signals.
#### Acceptance Criteria
1. WHEN a signal has a known event type (e.g., earnings, product_launch, regulatory, legal, m_and_a), THE Signal_Scorer SHALL compute an information gain factor r = 1 + λ·(-log₂ P(event_type)), where P(event_type) is the empirical base rate of that event type and λ is a configurable scaling parameter with default 0.3.
2. WHEN the event type base rate is not available, THE Signal_Scorer SHALL use a default base rate of 0.1 (treating the event as moderately rare).
3. THE Signal_Scorer SHALL multiply the information gain factor r into the combined weight formula as an additional multiplicative component.
4. THE Signal_Scorer SHALL clamp the information gain factor to a maximum of 3.0 to prevent extremely rare events from dominating the aggregation.
5. FOR ALL event types with base rate in (0, 1], THE Signal_Scorer SHALL produce information gain factors that are monotonically non-increasing with increasing base rate (rarer events receive equal or higher surprise weight; equality occurs only once the 3.0 clamp is reached).
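
The surprise factor can be sketched in a few lines. The function name is hypothetical; the keyword arguments are assumed to correspond to the `info_gain_lambda`, `info_gain_max`, and `default_base_rate` configuration fields.

```python
import math
from typing import Optional


def info_gain_factor(base_rate: Optional[float] = None,
                     lam: float = 0.3,
                     max_factor: float = 3.0,
                     default_base_rate: float = 0.1) -> float:
    """r = 1 + lam * (-log2 P(event_type)), clamped to max_factor."""
    p = default_base_rate if base_rate is None else base_rate
    if not 0.0 < p <= 1.0:
        raise ValueError("base rate must lie in (0, 1]")
    return min(1.0 - lam * math.log2(p), max_factor)
```

With the defaults, an unknown event type (base rate 0.1) yields r ≈ 2.0, a base rate of 0.5 yields 1.3, and the clamp takes effect for base rates below roughly 1%.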
---
### Requirement 4: Historical Source Accuracy Tracking
**User Story:** As a quantitative analyst, I want source credibility to incorporate historical prediction accuracy, so that sources with a track record of correct directional calls receive higher weight.
#### Acceptance Criteria
1. THE Signal_Scorer SHALL maintain a per-source accuracy metric computed as the fraction of past signals from that source where the predicted direction matched the subsequent 7-day price movement direction.
2. WHEN a source has at least 10 historical signals with known outcomes, THE Signal_Scorer SHALL incorporate the source accuracy as a multiplicative factor on the credibility weight, scaled linearly from 0.5 (0% accuracy) to 1.5 (100% accuracy).
3. WHEN a source has fewer than 10 historical signals, THE Signal_Scorer SHALL use a neutral accuracy factor of 1.0 (no adjustment).
4. THE Signal_Scorer SHALL update source accuracy metrics asynchronously after each aggregation cycle, using realized price data from the market data tables.
5. THE Signal_Scorer SHALL store source accuracy metrics in a database table with columns for source identifier, accuracy ratio, sample count, and last updated timestamp.
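
The linear scaling in criteria 2 and 3 reduces to a one-line mapping. A minimal sketch, with a hypothetical function name and the accuracy passed in as a fraction in [0, 1]:

```python
def source_accuracy_factor(accuracy: float,
                           sample_count: int,
                           min_samples: int = 10) -> float:
    """Map accuracy in [0, 1] linearly onto [0.5, 1.5]; neutral below min_samples."""
    if sample_count < min_samples:
        return 1.0
    clamped = min(max(accuracy, 0.0), 1.0)
    return 0.5 + clamped
```

A source that is right half the time receives the same factor (1.0) as a source with too little history, so only a track record better or worse than a coin flip moves the credibility weight.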
---
### Requirement 5: Adaptive Recency Decay with Event-Specific Half-Lives
**User Story:** As a quantitative analyst, I want recency decay half-lives to adapt based on event characteristics, so that high-impact events persist longer in the aggregation while routine signals decay faster.
#### Acceptance Criteria
1. WHEN computing recency decay for a signal, THE Signal_Scorer SHALL use an adaptive half-life τ_i = τ_base · (1 + β_impact) · (1 + β_surprise) · (1 + β_market_reaction), where τ_base is the current fixed half-life for the window.
2. THE Signal_Scorer SHALL compute β_impact from the signal's impact score, scaled linearly from 0.0 (impact_score = 0) to 1.0 (impact_score = 1.0).
3. THE Signal_Scorer SHALL compute β_surprise from the information gain factor (Requirement 3), scaled linearly from 0.0 (r = 1.0, no surprise) to 1.0 (r = 3.0, maximum surprise).
4. THE Signal_Scorer SHALL compute β_market_reaction from the market context multiplier, scaled linearly from 0.0 (multiplier = 1.0, no market reaction) to 0.5 (multiplier = 1.45, maximum market reaction).
5. WHEN all three β factors are at their maximum, THE Signal_Scorer SHALL produce an adaptive half-life of at most 6× the base half-life (τ_base · 2.0 · 2.0 · 1.5 = 6.0 · τ_base).
6. WHEN all three β factors are zero (routine, unsurprising signal in calm market), THE Signal_Scorer SHALL produce the same half-life as the current fixed system (τ_base).
7. FOR ALL combinations of impact, surprise, and market reaction values, THE Signal_Scorer SHALL produce adaptive half-lives that are greater than or equal to τ_base (adaptive decay is always slower or equal to the base decay, never faster).
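
A sketch of the half-life computation under the scalings stated above. The function and helper names are hypothetical, and out-of-range inputs are assumed to be clamped rather than rejected:

```python
def _unit_clamp(value: float) -> float:
    return min(max(value, 0.0), 1.0)


def adaptive_half_life(tau_base: float,
                       impact_score: float,
                       info_gain: float,
                       market_multiplier: float) -> float:
    """tau_i = tau_base * (1 + b_impact) * (1 + b_surprise) * (1 + b_market)."""
    b_impact = _unit_clamp(impact_score)                            # 0.0 .. 1.0
    b_surprise = _unit_clamp((info_gain - 1.0) / 2.0)               # r in 1.0 .. 3.0
    b_market = 0.5 * _unit_clamp((market_multiplier - 1.0) / 0.45)  # 0.0 .. 0.5
    return tau_base * (1.0 + b_impact) * (1.0 + b_surprise) * (1.0 + b_market)
```

Because every β term is clamped at zero from below, the result can never drop under `tau_base`, which is the lower-bound property of criterion 7; the upper bound is 2.0 · 2.0 · 1.5 = 6 times the base.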
---
### Requirement 6: Volatility-Adjusted Normalization (Regime-Aware Scoring)
**User Story:** As a quantitative analyst, I want signal weights normalized by current market volatility and volume conditions, so that the same signal magnitude is interpreted differently in calm vs volatile markets.
#### Acceptance Criteria
1. WHEN market data is available for a ticker, THE Signal_Scorer SHALL compute a return z-score z_r = (r_t - μ_20) / σ_20, where r_t is the current return, μ_20 is the 20-day mean return, and σ_20 is the 20-day return standard deviation.
2. WHEN market data is available for a ticker, THE Signal_Scorer SHALL compute a volume z-score z_v = (log(V_t) - μ_V) / σ_V, where V_t is the current volume, μ_V is the 20-day mean of log-volume, and σ_V is the 20-day standard deviation of log-volume.
3. THE Signal_Scorer SHALL compute a regime multiplier M_regime = 1 + 0.15·|z_r| + 0.10·|z_v|, which amplifies signal weights during abnormal market conditions.
4. THE Signal_Scorer SHALL clamp M_regime to the range [1.0, 2.5] to prevent extreme z-scores from producing runaway weight amplification.
5. WHEN market data is not available for a ticker, THE Signal_Scorer SHALL use M_regime = 1.0 (no regime adjustment).
6. THE Signal_Scorer SHALL replace the current market context multiplier (M_context) with M_regime in the combined weight formula.
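
The regime multiplier can be sketched with the standard library alone. Several details below are assumptions not fixed by the criteria: the 20-day window includes the current observation, the sample standard deviation is used, and a zero spread is treated as z = 0. The function names are hypothetical; the keyword arguments mirror the `regime_return_weight`, `regime_volume_weight`, and `regime_multiplier_max` fields.

```python
import math
import statistics
from typing import Sequence


def _zscore(current: float, window: Sequence[float]) -> float:
    """z-score of `current` against `window`; 0.0 when the spread is undefined."""
    if len(window) < 2:
        return 0.0
    spread = statistics.stdev(window)   # sample std dev: an assumption
    if spread == 0.0:
        return 0.0
    return (current - statistics.fmean(window)) / spread


def regime_multiplier(returns: Sequence[float],
                      volumes: Sequence[float],
                      return_weight: float = 0.15,
                      volume_weight: float = 0.10,
                      multiplier_max: float = 2.5,
                      lookback: int = 20) -> float:
    """M_regime = 1 + 0.15*|z_r| + 0.10*|z_v|, clamped to [1.0, multiplier_max]."""
    if not returns or not volumes:
        return 1.0                      # no market data: no regime adjustment
    z_r = _zscore(returns[-1], returns[-lookback:])
    z_v = 0.0
    if volumes[-1] > 0:
        log_volumes = [math.log(v) for v in volumes[-lookback:] if v > 0]
        z_v = _zscore(log_volumes[-1], log_volumes)
    raw = 1.0 + return_weight * abs(z_r) + volume_weight * abs(z_v)
    return min(max(raw, 1.0), multiplier_max)
```

Both series are expected oldest first, so the last element is the current observation.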
---
### Requirement 7: Regime Detection and Classification
**User Story:** As a quantitative analyst, I want the system to detect and classify the current market regime for each ticker, so that scoring thresholds and behavior adapt to whether the market is trending, panicking, mean-reverting, or uncertain.
#### Acceptance Criteria
1. WHEN market data is available, THE Regime_Detector SHALL compute a trend indicator R = sign(EMA_20 - EMA_100), where EMA_20 and EMA_100 are exponential moving averages of closing prices over 20 and 100 days respectively.
2. WHEN market data is available, THE Regime_Detector SHALL compute a volatility ratio V_r = σ_20 / σ_100, where σ_20 and σ_100 are the 20-day and 100-day return standard deviations.
3. THE Regime_Detector SHALL classify the market regime into one of four categories based on R and V_r: trend-following (R ≠ 0 AND V_r < 1.2), panic (V_r > 1.5), mean-reversion (R = 0 AND V_r < 1.0), uncertainty (all other cases).
4. WHEN the regime is classified as panic, THE Aggregation_Engine SHALL reduce the bullish/bearish threshold from ±0.15 to ±0.10 (making the system more sensitive to directional signals during high-volatility periods).
5. WHEN the regime is classified as mean-reversion, THE Aggregation_Engine SHALL increase the bullish/bearish threshold from ±0.15 to ±0.20 (requiring stronger evidence for directional calls in range-bound markets).
6. WHEN the regime is classified as trend-following, THE Aggregation_Engine SHALL use the default thresholds of ±0.15.
7. WHEN the regime is classified as uncertainty, THE Aggregation_Engine SHALL use the default thresholds of ±0.15 and increase the contradiction penalty multiplier from 0.4 to 0.6.
8. THE Regime_Detector SHALL persist the current regime classification per ticker to the database for auditability and dashboard display.
9. WHEN market data is insufficient to compute EMA_100 (fewer than 100 days of price history), THE Regime_Detector SHALL default to the uncertainty regime.
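
The classification rules and the per-regime adjustments can be expressed as a small lookup. This sketch assumes the trend indicator R and the volatility ratio V_r have already been computed; the enum, function, and table names are hypothetical, and `None` stands in for "fewer than 100 days of history".

```python
from enum import Enum
from typing import Optional


class Regime(str, Enum):
    TREND_FOLLOWING = "trend_following"
    PANIC = "panic"
    MEAN_REVERSION = "mean_reversion"
    UNCERTAINTY = "uncertainty"


def classify_regime(trend_sign: int, vol_ratio: Optional[float]) -> Regime:
    """trend_sign is R = sign(EMA_20 - EMA_100); vol_ratio is V_r = sigma_20 / sigma_100."""
    if vol_ratio is None:               # insufficient history
        return Regime.UNCERTAINTY
    if vol_ratio > 1.5:
        return Regime.PANIC
    if trend_sign != 0 and vol_ratio < 1.2:
        return Regime.TREND_FOLLOWING
    if trend_sign == 0 and vol_ratio < 1.0:
        return Regime.MEAN_REVERSION
    return Regime.UNCERTAINTY


# Bullish/bearish threshold magnitude per regime (criteria 4 to 7).
DIRECTION_THRESHOLD = {
    Regime.PANIC: 0.10,
    Regime.MEAN_REVERSION: 0.20,
    Regime.TREND_FOLLOWING: 0.15,
    Regime.UNCERTAINTY: 0.15,
}


def contradiction_penalty_multiplier(regime: Regime) -> float:
    """0.6 in the uncertainty regime, otherwise the default 0.4."""
    return 0.6 if regime is Regime.UNCERTAINTY else 0.4
```

Since R is the sign of a difference of two floating-point averages, it is exactly zero only when the two EMAs coincide, so an implementation may want a tolerance band around zero; the criteria do not specify one.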
---
### Requirement 8: Bayesian Posterior Confidence Replacing Heuristic Confidence
**User Story:** As a quantitative analyst, I want trend confidence derived from the Bayesian posterior distribution rather than the current heuristic weighted formula, so that confidence reflects actual evidence concentration rather than an ad-hoc combination of factors.
#### Acceptance Criteria
1. WHEN computing trend confidence, THE Trend_Assembler SHALL use the Bayesian confidence C = 1 - 4αβ/(α+β)² from the Beta_Distribution posterior (Requirement 1) as the primary confidence component with weight 0.5.
2. THE Trend_Assembler SHALL retain the source count factor (min(N_unique/15, 0.8)) as a secondary confidence component with weight 0.25, rewarding evidence breadth.
3. THE Trend_Assembler SHALL retain the contradiction penalty (contradiction_score × 0.4) as a confidence reduction.
4. THE Trend_Assembler SHALL compute the combined confidence as: confidence = 0.5 × C_bayesian + 0.25 × F_count + 0.25 × C_avg_credibility - P_contradiction, clamped to [0.0, 1.0].
5. THE Trend_Assembler SHALL preserve the existing confidence thresholds for recommendation eligibility (0.35 minimum, 0.50 paper, 0.70 live) without modification.
6. FOR ALL signal sets where all signals agree on direction, THE Trend_Assembler SHALL produce Bayesian confidence that increases monotonically with the number of agreeing signals.
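
The combined confidence of criterion 4 can be sketched as below. The function name and argument names are hypothetical; `penalty_multiplier` defaults to the 0.4 of criterion 3.

```python
def trend_confidence(alpha: float,
                     beta: float,
                     unique_sources: int,
                     avg_credibility: float,
                     contradiction_score: float,
                     penalty_multiplier: float = 0.4) -> float:
    """0.5*C_bayesian + 0.25*F_count + 0.25*C_avg_credibility - P_contradiction."""
    total = alpha + beta
    c_bayesian = max(0.0, 1.0 - 4.0 * alpha * beta / (total * total))
    f_count = min(unique_sources / 15.0, 0.8)
    penalty = contradiction_score * penalty_multiplier
    raw = 0.5 * c_bayesian + 0.25 * f_count + 0.25 * avg_credibility - penalty
    return min(max(raw, 0.0), 1.0)
```

For example, α = 9, β = 1 gives C_bayesian = 0.64; with 15 unique sources, average credibility 0.8, and no contradiction the combined confidence is 0.32 + 0.20 + 0.20 = 0.72.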
---
### Requirement 9: Entropy-Based Mixed Signal Detection
**User Story:** As a quantitative analyst, I want mixed trend detection based on Shannon entropy rather than simple contradiction thresholds, so that the system can distinguish between genuine uncertainty (high entropy) and weak signal (low total weight).
#### Acceptance Criteria
1. WHEN the bullish probability P_bull has been computed from the Bayesian posterior, THE Trend_Assembler SHALL compute Shannon entropy H = -P_bull·log₂(P_bull) - (1-P_bull)·log₂(1-P_bull).
2. WHEN H > 0.9 (entropy close to maximum of 1.0, indicating near-equal probability of bullish and bearish), THE Trend_Assembler SHALL classify the trend direction as mixed, regardless of the weighted sentiment average.
3. WHEN H ≤ 0.9 AND P_bull > 0.65, THE Trend_Assembler SHALL classify the trend direction as bullish.
4. WHEN H ≤ 0.9 AND P_bull < 0.35, THE Trend_Assembler SHALL classify the trend direction as bearish.
5. WHEN H ≤ 0.9 AND 0.35 ≤ P_bull ≤ 0.65, THE Trend_Assembler SHALL classify the trend direction as neutral.
6. THE Trend_Assembler SHALL persist the entropy value H alongside the trend summary for auditability.
7. FOR ALL P_bull values in (0, 1), THE Trend_Assembler SHALL compute entropy values in (0, 1], with maximum entropy of 1.0 occurring at P_bull = 0.5.
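
A sketch of the entropy-based direction mapping, following criteria 1 to 5 literally. The function names and the string labels are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits; defined as 0.0 at the endpoints p = 0 and p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def classify_direction(p_bull: float,
                       mixed_entropy: float = 0.9,
                       bullish_above: float = 0.65,
                       bearish_below: float = 0.35) -> str:
    if binary_entropy(p_bull) > mixed_entropy:
        return "mixed"
    if p_bull > bullish_above:
        return "bullish"
    if p_bull < bearish_below:
        return "bearish"
    return "neutral"
```

One consequence worth checking against intent: H(0.65) ≈ 0.934, and H only drops to 0.9 near P_bull ≈ 0.68 (or 0.32). With the thresholds as written, every P_bull in [0.35, 0.65] is therefore classified as mixed before the neutral rule can apply, and values between 0.65 and roughly 0.68 are mixed rather than bullish.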
---
### Requirement 10: Multiplicative Macro Exposure Scoring
**User Story:** As a quantitative analyst, I want macro impact computed using multiplicative exposure rather than linear weighted sums, so that a company exposed across multiple dimensions receives compounding impact rather than simple addition.
#### Acceptance Criteria
1. WHEN computing macro impact for a company, THE Macro_Scorer SHALL use the multiplicative exposure formula S_macro = severity · (1 - Π_k(1 - w_k · O_k)), where O_k are the overlap components (geographic, supply chain, commodity, sector) and w_k are their respective weights.
2. THE Macro_Scorer SHALL use the following overlap weights: w_geo = 0.35, w_supply = 0.25, w_commodity = 0.25, w_sector = 0.15 (matching the current linear weight distribution).
3. WHEN a company has zero overlap across all dimensions, THE Macro_Scorer SHALL produce S_macro = 0.0 (no impact).
4. WHEN a company has maximum overlap across all dimensions (all O_k = 1.0), THE Macro_Scorer SHALL produce S_macro = severity · (1 - (1-0.35)·(1-0.25)·(1-0.25)·(1-0.15)) = severity · (1 - 0.3108), which is approximately severity · 0.689.
5. THE Macro_Scorer SHALL preserve the existing severity weight mapping (critical=1.0, high=0.75, moderate=0.5, low=0.25).
6. THE Macro_Scorer SHALL preserve the existing resilience modifier (R_tier) applied after the multiplicative exposure computation.
7. FOR ALL overlap configurations, THE Macro_Scorer SHALL produce impact scores where adding a non-zero overlap in any dimension increases the total impact (monotonicity property).
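
The multiplicative exposure formula can be sketched directly from criteria 1 and 2. The dictionary keys and function name are hypothetical, and the resilience modifier of criterion 6 is deliberately left out.

```python
from typing import Mapping

# Overlap weights from criterion 2; the key names are illustrative.
OVERLAP_WEIGHTS = {"geo": 0.35, "supply": 0.25, "commodity": 0.25, "sector": 0.15}


def macro_exposure(severity: float, overlaps: Mapping[str, float]) -> float:
    """S_macro = severity * (1 - prod_k(1 - w_k * O_k)); missing overlaps count as 0."""
    unaffected = 1.0
    for name, weight in OVERLAP_WEIGHTS.items():
        overlap = min(max(overlaps.get(name, 0.0), 0.0), 1.0)
        unaffected *= 1.0 - weight * overlap
    return severity * (1.0 - unaffected)
```

With all four overlaps at 1.0 the product is 0.65 · 0.75 · 0.75 · 0.85 = 0.31078125, so the maximum exposure is severity · 0.68921875. Each factor (1 - w_k · O_k) lies in (0, 1], which is why raising any single overlap can only increase the result.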
---
### Requirement 11: Conditional Macro Signal Integration
**User Story:** As a quantitative analyst, I want macro signals treated as conditional modifiers on company signals rather than additive contributions, so that macro context amplifies or dampens existing company-level evidence rather than independently shifting the trend.
#### Acceptance Criteria
1. WHEN both company signals and macro signals exist for a ticker, THE Aggregation_Engine SHALL apply macro impact as a multiplicative modifier on the company signal strength: S_adjusted = S_company · (1 + M_macro · sign_alignment), where M_macro is the normalized macro impact and sign_alignment is +1 when macro and company signals agree in direction, -1 when they disagree.
2. THE Aggregation_Engine SHALL clamp the macro modifier (1 + M_macro · sign_alignment) to the range [0.5, 1.5] to prevent macro signals from inverting or excessively amplifying company signals.
3. WHEN only macro signals exist (no company signals), THE Aggregation_Engine SHALL fall back to the current additive behavior with the existing macro weight of 0.3, preserving the macro-only suppression safety mechanism.
4. WHEN only company signals exist (macro layer disabled or no macro events), THE Aggregation_Engine SHALL use company signals without modification (modifier = 1.0).
5. THE Aggregation_Engine SHALL log the macro modifier value applied to each ticker for auditability.
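
A sketch of the conditional modifier in criteria 1, 2, and 4. The function names are hypothetical, directions are passed as signs (+1, -1, or 0), and treating a zero sign on either side as "no modification" is an assumption that goes slightly beyond the criteria. The macro-only fallback of criterion 3 is not covered.

```python
def macro_modifier(m_macro: float, company_sign: int, macro_sign: int) -> float:
    """Return 1 + M_macro * sign_alignment, clamped to [0.5, 1.5]."""
    if company_sign == 0 or macro_sign == 0:
        return 1.0                      # assumption: no direction, no modifier
    alignment = 1.0 if company_sign == macro_sign else -1.0
    return min(max(1.0 + m_macro * alignment, 0.5), 1.5)


def adjusted_strength(s_company: float, m_macro: float,
                      company_sign: int, macro_sign: int) -> float:
    """S_adjusted = S_company * modifier."""
    return s_company * macro_modifier(m_macro, company_sign, macro_sign)
```

Because the modifier is bounded below by 0.5, a disagreeing macro signal can at most halve the company signal and can never flip its sign.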
---
### Requirement 12: Graph-Distance Competitive Signal Attenuation
**User Story:** As a quantitative analyst, I want competitive signal transfer attenuated by network graph distance and historical correlation, so that signals propagate more strongly to closely related competitors and decay for distant relationships.
#### Acceptance Criteria
1. WHEN propagating a signal from a source company to a target company, THE Competitive_Scorer SHALL compute transfer strength as S_transfer = S_source · ρ_historical · e^(-d_network), where S_source is the source signal strength, ρ_historical is the historical price correlation between the two companies, and d_network is the graph distance in the competitor relationship network.
2. THE Competitive_Scorer SHALL compute graph distance d_network as the shortest path length in the competitor relationship graph, where direct competitors have distance 1, competitors-of-competitors have distance 2, and so on.
3. WHEN the graph distance exceeds 3, THE Competitive_Scorer SHALL not propagate the signal (attenuation is already e^(-3) ≈ 0.05 at distance 3 and e^(-4) ≈ 0.018 at distance 4, below meaningful contribution).
4. THE Competitive_Scorer SHALL compute ρ_historical as the 90-day rolling Pearson correlation of daily returns between the source and target companies.
5. WHEN historical correlation data is insufficient (fewer than 30 trading days of overlapping data), THE Competitive_Scorer SHALL use a default correlation of 0.3 for same-sector companies and 0.1 for cross-sector companies.
6. THE Competitive_Scorer SHALL preserve the existing relationship strength threshold (R_relationship ≥ 0.2) as a pre-filter before applying the graph-distance attenuation.
7. FOR ALL source-target pairs, THE Competitive_Scorer SHALL produce transfer strengths that decrease monotonically with increasing graph distance (closer competitors always receive stronger signal transfer).
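
The attenuation rule can be sketched as follows, assuming the graph distance and correlation have already been computed elsewhere. Function names are hypothetical, and the relationship-strength pre-filter of criterion 6 is omitted.

```python
import math


def default_correlation(same_sector: bool) -> float:
    """Fallback when fewer than 30 overlapping trading days are available."""
    return 0.3 if same_sector else 0.1


def transfer_strength(s_source: float,
                      rho_historical: float,
                      distance: int,
                      max_distance: int = 3) -> float:
    """S_transfer = S_source * rho * exp(-d); zero beyond the distance cutoff."""
    if distance < 1 or distance > max_distance:
        return 0.0
    return s_source * rho_historical * math.exp(-distance)
```

For a source strength of 1.0 and ρ = 0.5, the transfer is about 0.184 at distance 1, 0.068 at distance 2, 0.025 at distance 3, and zero afterwards.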
---
### Requirement 13: Exponentially Weighted Momentum
**User Story:** As a quantitative analyst, I want trend momentum computed using exponentially weighted historical changes rather than a simple current-minus-previous difference, so that the momentum estimate is smoother and less sensitive to single-cycle noise.
#### Acceptance Criteria
1. WHEN computing trend momentum, THE Projection_Engine SHALL use an exponentially weighted sum M_t = Σ_{k=0}^{K-1} λ^k · ΔS_{t-k}, where ΔS_{t-k} is the signed strength change at lag k, λ = 0.7 is the decay factor, and K is the number of available historical cycles (up to 10).
2. THE Projection_Engine SHALL normalize the momentum by dividing by the geometric series sum Σ λ^k to produce a value in [-1, 1].
3. WHEN fewer than 2 historical cycles are available, THE Projection_Engine SHALL fall back to the current heuristic (momentum = direction_sign × strength × 0.5).
4. THE Projection_Engine SHALL compute volatility-scaled momentum M_adj = M_t / max(σ_20, 0.01), where σ_20 is the 20-day return standard deviation, to normalize momentum relative to the ticker's typical price movement.
5. THE Projection_Engine SHALL clamp M_adj to [-2.0, 2.0] to prevent division by very small σ_20 from producing extreme values.
6. FOR ALL sequences of strictly increasing signed strengths, THE Projection_Engine SHALL produce positive momentum values (correctly detecting strengthening bullish trends).
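
A sketch of the normalized, exponentially weighted momentum and its volatility scaling. The function names are hypothetical; strengths are assumed to be supplied oldest first, and `None` signals that the caller should use the heuristic fallback of criterion 3.

```python
from typing import Optional, Sequence


def ew_momentum(strengths: Sequence[float],
                lam: float = 0.7,
                max_cycles: int = 10) -> Optional[float]:
    """Weighted mean of signed strength changes, most recent change weighted 1."""
    deltas = [b - a for a, b in zip(strengths, strengths[1:])][-max_cycles:]
    if not deltas:
        return None                      # fewer than 2 cycles: caller falls back
    numerator = 0.0
    denominator = 0.0
    for lag, delta in enumerate(reversed(deltas)):
        weight = lam ** lag
        numerator += weight * delta
        denominator += weight            # geometric series sum
    return numerator / denominator


def volatility_scaled(momentum: float, sigma_20: float) -> float:
    """M_adj = M_t / max(sigma_20, 0.01), clamped to [-2.0, 2.0]."""
    return min(max(momentum / max(sigma_20, 0.01), -2.0), 2.0)
```

For strengths `[0.1, 0.2, 0.4]` the changes are 0.1 and 0.2, giving (0.2 + 0.7 · 0.1) / (1 + 0.7) ≈ 0.159. Every weight is positive, so strictly increasing strengths always yield a positive value.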
---
### Requirement 14: Expected Value Recommendation Gate
**User Story:** As a quantitative analyst, I want recommendation eligibility based on expected value rather than simple confidence and strength thresholds, so that the system only recommends trades with positive risk-adjusted expected outcomes.
#### Acceptance Criteria
1. WHEN evaluating recommendation eligibility, THE Recommendation_Engine SHALL compute expected value EV = P_bull · R_up - P_bear · R_down, where P_bull is the Bayesian bullish probability, P_bear = 1 - P_bull, R_up is the estimated upside return, and R_down is the estimated downside return.
2. THE Recommendation_Engine SHALL estimate R_up and R_down from the trend strength and the ticker's 20-day historical volatility: R_up = strength · σ_20 · √(horizon_days) and R_down = (1 - strength) · σ_20 · √(horizon_days), where horizon_days corresponds to the trend window duration.
3. WHEN EV is positive and exceeds a configurable threshold (default 0.005, representing 0.5% expected return), THE Recommendation_Engine SHALL allow the recommendation to proceed through the existing eligibility gates.
4. WHEN EV is negative or does not exceed the threshold, THE Recommendation_Engine SHALL force the recommendation to informational mode regardless of confidence and strength.
5. THE Recommendation_Engine SHALL persist the computed EV alongside the recommendation for auditability.
6. THE Recommendation_Engine SHALL preserve all existing eligibility gates (confidence ≥ 0.35, strength ≥ 0.10, contradiction ≤ 0.60, evidence ≥ 2, direction ≠ neutral) as additional requirements beyond the EV gate.
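
The EV computation and gate can be sketched as below; function names are hypothetical and the existing eligibility gates of criterion 6 are not shown.

```python
import math


def expected_value(p_bull: float, strength: float,
                   sigma_20: float, horizon_days: float) -> float:
    """EV = P_bull * R_up - P_bear * R_down with volatility-scaled returns."""
    scale = sigma_20 * math.sqrt(horizon_days)
    r_up = strength * scale
    r_down = (1.0 - strength) * scale
    return p_bull * r_up - (1.0 - p_bull) * r_down


def passes_ev_gate(ev: float, threshold: float = 0.005) -> bool:
    """True only when EV strictly exceeds the threshold."""
    return ev > threshold
```

Substituting the definitions of R_up and R_down shows that the expression simplifies to EV = σ_20 · √(horizon_days) · (P_bull + strength - 1), so under these estimates EV is positive exactly when P_bull + strength > 1. For P_bull = 0.8, strength = 0.7, σ_20 = 0.02 and a 7-day horizon, EV ≈ 0.0265.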
---
### Requirement 15: Contradiction Handling via Weighted Disagreement Entropy
**User Story:** As a quantitative analyst, I want contradiction detection to use weighted disagreement entropy rather than a simple minority/majority ratio, so that the system better distinguishes between a few strong dissenting signals and many weak ones.
#### Acceptance Criteria
1. WHEN computing contradiction, THE Trend_Assembler SHALL compute weighted disagreement entropy using the effective weight distribution across positive and negative signal groups.
2. THE Trend_Assembler SHALL compute the positive weight fraction f_pos = W_positive / (W_positive + W_negative) and negative weight fraction f_neg = W_negative / (W_positive + W_negative), where W_positive and W_negative are the sums of effective weights (combined_weight × impact_score) for each sentiment group.
3. THE Trend_Assembler SHALL compute contradiction entropy as H_contradiction = -f_pos·log₂(f_pos) - f_neg·log₂(f_neg), normalized to [0, 1] (maximum at f_pos = f_neg = 0.5).
4. THE Trend_Assembler SHALL weight the contradiction entropy by the total evidence mass: contradiction_score = H_contradiction · min(1.0, (W_positive + W_negative) / W_threshold), where W_threshold is a configurable parameter (default 5.0) representing the evidence mass at which contradiction becomes fully significant.
5. WHEN only positive or only negative signals exist (no disagreement), THE Trend_Assembler SHALL produce a contradiction score of 0.0.
6. THE Trend_Assembler SHALL preserve the existing `ContradictionResult` interface, populating the overall score with the entropy-based value and retaining the `DisagreementDetail` objects for catalyst-level analysis.
7. FOR ALL signal sets with both positive and negative signals, THE Trend_Assembler SHALL produce contradiction scores that increase monotonically as the weight distribution approaches equal split (f_pos → 0.5).
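
A sketch of the contradiction score from criteria 2 to 5, taking the two effective-weight sums as inputs. The function name is hypothetical.

```python
import math


def contradiction_score(w_positive: float,
                        w_negative: float,
                        w_threshold: float = 5.0) -> float:
    """Binary entropy of the weight split, scaled by total evidence mass."""
    if w_positive <= 0.0 or w_negative <= 0.0:
        return 0.0                       # no disagreement
    total = w_positive + w_negative
    f_pos = w_positive / total
    f_neg = w_negative / total
    entropy = -f_pos * math.log2(f_pos) - f_neg * math.log2(f_neg)
    return entropy * min(1.0, total / w_threshold)
```

An even split with little evidence (1.0 vs 1.0) scores 1.0 · 2/5 = 0.4, while the same split at or above the evidence threshold (2.5 vs 2.5) scores the full 1.0.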
---
### Requirement 16: Backward Compatibility and Migration
**User Story:** As a platform operator, I want the mathematical upgrades to be backward-compatible with the existing database schema and deployable incrementally, so that the upgrade does not require downtime or data migration.
#### Acceptance Criteria
1. THE Aggregation_Engine SHALL preserve the existing `WeightedSignal`, `SignalWeight`, `TrendSummary`, and `Recommendation` dataclass interfaces, adding new fields as optional attributes with default values.
2. THE Aggregation_Engine SHALL store new mathematical outputs (P_bull, α, β, entropy, regime, EV) in the existing JSONB metadata fields of `trend_windows` and `recommendations` tables rather than requiring new columns.
3. THE Aggregation_Engine SHALL support a feature flag `probabilistic_scoring_enabled` in `risk_configs` that toggles between the current heuristic pipeline and the new probabilistic pipeline, defaulting to `false` for safe rollout.
4. WHEN `probabilistic_scoring_enabled` is false, THE Aggregation_Engine SHALL produce identical outputs to the current system (no behavioral change).
5. WHEN `probabilistic_scoring_enabled` is true, THE Aggregation_Engine SHALL use the new Bayesian, regime-aware, and adaptive formulas for all pipeline stages.
6. IF the feature flag cannot be read from the database, THEN THE Aggregation_Engine SHALL default to the current heuristic pipeline (fail-safe behavior).
7. THE Aggregation_Engine SHALL log which pipeline mode (heuristic or probabilistic) is active at the start of each aggregation cycle.
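
The fail-safe and logging behavior of criteria 6 and 7 can be sketched independently of the database layer. Here `read_flag` is a hypothetical callable standing in for whatever reads `probabilistic_scoring_enabled` from `risk_configs`; the logger name is likewise illustrative.

```python
import logging
from typing import Callable

logger = logging.getLogger("aggregation.sketch")


def resolve_pipeline_mode(read_flag: Callable[[], object]) -> str:
    """Return 'probabilistic' only when the flag is read successfully and is True."""
    try:
        enabled = read_flag() is True
    except Exception:
        # Fail safe: any read error selects the existing heuristic pipeline.
        logger.warning("could not read probabilistic_scoring_enabled; "
                       "using heuristic pipeline", exc_info=True)
        enabled = False
    mode = "probabilistic" if enabled else "heuristic"
    logger.info("aggregation cycle starting in %s mode", mode)
    return mode
```

Requiring an exact `True` means a missing row or a `NULL` value also resolves to the heuristic pipeline.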
---
### Requirement 17: Property-Based Testing for Mathematical Correctness
**User Story:** As a developer, I want comprehensive property-based tests validating the mathematical correctness of all new formulas, so that edge cases and numerical stability issues are caught before deployment.
#### Acceptance Criteria
1. THE test suite SHALL include property-based tests using Hypothesis for the sigmoid confidence gate verifying monotonicity (higher confidence input always produces higher or equal gate output) across all float inputs in [0.0, 1.0].
2. THE test suite SHALL include property-based tests for the Beta_Distribution posterior verifying that α + β increases monotonically with the number of signals processed (evidence always accumulates).
3. THE test suite SHALL include property-based tests for the Bayesian confidence formula verifying that confidence is 0.0 when α = β (maximum uncertainty) and approaches 1.0 as the ratio α/β or β/α increases.
4. THE test suite SHALL include property-based tests for the adaptive decay verifying that the adaptive half-life is always greater than or equal to the base half-life for all valid input combinations.
5. THE test suite SHALL include property-based tests for the multiplicative macro exposure verifying monotonicity (adding non-zero overlap in any dimension increases total impact).
6. THE test suite SHALL include property-based tests for the exponentially weighted momentum verifying that strictly increasing strength sequences produce positive momentum.
7. THE test suite SHALL include a round-trip property test verifying that computing the Beta posterior from signals, extracting P_bull, then reconstructing approximate signal weights produces values consistent with the original inputs.
8. THE test suite SHALL include property-based tests for the expected value computation verifying that EV is positive when P_bull > 0.5 and R_up > R_down (basic directional consistency).
9. THE test suite SHALL include property-based tests for numerical stability verifying that no formula produces NaN, infinity, or values outside documented ranges for any valid input combination.
10. THE test suite SHALL use `@settings(max_examples=100)` and follow the project convention of `test_pbt_*` file naming.
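Criterion 3 can be checked by hand against the confidence formula that the implementation plan gives for task 3.1, `C = 1 - 4αβ/(α+β)²`. The short derivation below is a worked check added for illustration, not part of the original requirements:

$$
C \;=\; 1 - \frac{4\alpha\beta}{(\alpha+\beta)^2}
\;=\; \frac{(\alpha+\beta)^2 - 4\alpha\beta}{(\alpha+\beta)^2}
\;=\; \left(\frac{\alpha-\beta}{\alpha+\beta}\right)^{2}
$$

Written this way, `C = 0` exactly when α = β. Setting `r = α/β` gives `C = ((r - 1)/(r + 1))²`, which tends to 1 as `r → ∞` or `r → 0`, and `C` always stays within [0, 1) for positive α and β.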
@@ -0,0 +1,349 @@
# Implementation Plan: Signal Math Upgrade
## Overview
Upgrade the Stonks Oracle signal processing pipeline from deterministic heuristic formulas to a probabilistic, regime-aware, and adaptive mathematical framework. Implementation proceeds in layers: foundations (config, schemas, new modules), then each pipeline stage (scoring → trend assembly → macro → competitive → projection → recommendation), then integration wiring, and finally testing. All changes are gated behind the `probabilistic_scoring_enabled` feature flag.
## Tasks
- [ ] 1. Foundation: Configuration and schema extensions
- [x] 1.1 Extend `ScoringConfig` with probabilistic parameters in `services/aggregation/scoring.py`
- Add `probabilistic: bool = False` toggle field
- Add sigmoid gate parameters: `sigmoid_steepness`, `sigmoid_midpoint`
- Add information gain parameters: `info_gain_lambda`, `info_gain_max`, `default_base_rate`
- Add adaptive decay parameters: `adaptive_decay_impact_scale`, `adaptive_decay_surprise_scale`, `adaptive_decay_market_scale`
- Add regime multiplier parameters: `regime_return_weight`, `regime_volume_weight`, `regime_multiplier_max`
- All new fields must have defaults matching the design document values
- _Requirements: 2.5, 3.1, 5.1, 6.3, 16.1_
- [x] 1.2 Extend `SignalWeight` and `WeightedSignal` dataclasses in `services/aggregation/scoring.py`
- Add optional fields to `SignalWeight`: `sigmoid_gate`, `info_gain_factor`, `source_accuracy_factor`, `regime_multiplier`
- Add optional fields to `WeightedSignal`: `info_gain_factor`, `source_accuracy_factor`, `adaptive_half_life`
- All new fields must have defaults (None or 1.0) for backward compatibility
- _Requirements: 16.1, 2.5, 3.3, 4.2_
- [x] 1.3 Extend `TrendSummary` Pydantic model in `services/shared/schemas.py`
- Add optional fields: `p_bull`, `alpha`, `beta_param`, `bayesian_confidence`, `entropy`, `regime`, `pipeline_mode`
- `pipeline_mode` defaults to `"heuristic"`; all others default to `None`
- _Requirements: 16.1, 1.6, 9.6_
- [x] 1.4 Extend `Recommendation` model in `services/shared/schemas.py` (or `services/recommendation/eligibility.py`)
- Add optional fields: `expected_value`, `p_bull`, `pipeline_mode`
- `pipeline_mode` defaults to `"heuristic"`; all others default to `None`
- _Requirements: 16.1, 14.5_
- [x] 1.5 Add `probabilistic_scoring_enabled` feature flag support in `services/shared/config.py`
- Read `probabilistic_scoring_enabled` from `risk_configs.config` JSONB
- Default to `False` when key is missing, value is invalid, or DB is unreachable
- Propagate flag through `AggregationConfig` dataclass
- Log which pipeline mode is active at cycle start
- _Requirements: 16.3, 16.4, 16.5, 16.6, 16.7_
- [x] 1.6 Create database migration `infra/migrations/034_source_accuracy.sql`
- Create `source_accuracy` table with columns: `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`, `source_id VARCHAR(200) NOT NULL`, `accuracy_ratio FLOAT NOT NULL DEFAULT 0.5`, `sample_count INTEGER NOT NULL DEFAULT 0`, `last_updated TIMESTAMPTZ`, `created_at TIMESTAMPTZ`
- Add `UNIQUE(source_id)` constraint and `idx_source_accuracy_source` index
- _Requirements: 4.5_
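The fail-closed behaviour described in task 1.5 could look roughly like the sketch below. The function names `resolve_probabilistic_flag` and `pipeline_mode` are hypothetical, and the sketch assumes the caller passes the already-fetched `risk_configs.config` value (or `None` when the database could not be reached); it is not the project's actual `config.py`.

```python
import json
from collections.abc import Mapping
from typing import Any

FLAG_KEY = "probabilistic_scoring_enabled"


def resolve_probabilistic_flag(raw_config: Any) -> bool:
    """Return True only for an explicit JSON boolean true; anything else is False.

    `raw_config` stands in for the `risk_configs.config` JSONB value, either
    decoded to a mapping or still a JSON string. None means "DB unreachable".
    """
    if raw_config is None:
        return False
    if isinstance(raw_config, str):
        try:
            raw_config = json.loads(raw_config)
        except json.JSONDecodeError:
            return False  # corrupted payload: stay on the heuristic pipeline
    if not isinstance(raw_config, Mapping):
        return False
    # Strings such as "yes" or numbers such as 1 count as invalid values.
    return raw_config.get(FLAG_KEY) is True


def pipeline_mode(enabled: bool) -> str:
    """Label used for logging and for the `pipeline_mode` schema field."""
    return "probabilistic" if enabled else "heuristic"
```

Treating every non-boolean value as `False` keeps the heuristic pipeline as the default whenever the configuration is ambiguous.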
- [x] 2. Checkpoint — Verify foundation compiles and existing tests pass
- Ensure all tests pass; ask the user if questions arise.
- [ ] 3. New module: Bayesian Accumulator (`services/aggregation/bayesian.py`)
- [x] 3.1 Implement `BayesianPosterior` dataclass and `compute_bayesian_posterior` function
- Create frozen dataclass with fields: `p_bull`, `alpha`, `beta`, `log_likelihood`, `bayesian_confidence`, `entropy`, `signal_count`
- Define `PRIOR` class-level constant for uninformative prior (p_bull=0.5, α=1.0, β=1.0, C=0.0, H=1.0)
- Implement log-likelihood accumulation: `L_t = Σ(w_i · s_i)` using `weight.combined * sentiment_value`
- Compute `P_bull = σ(L_t)` via sigmoid function
- Compute Beta posterior: `α = 1 + W_bull`, `β = 1 + W_bear` from positive/negative weight sums
- Compute Bayesian confidence: `C = 1 - 4αβ/(α+β)²`
- Compute Shannon entropy via `compute_entropy`
- Return `PRIOR` for empty signal lists
- Skip signals with NaN weight or sentiment
- _Requirements: 1.1, 1.2, 1.3, 1.4, 1.5, 1.6_
- [x] 3.2 Implement `compute_entropy` function
- Shannon entropy: `H = -p·log₂(p) - (1-p)·log₂(1-p)`
- Return 0.0 for p ≤ 0 or p ≥ 1 (edge cases)
- Return value in [0, 1] with maximum 1.0 at p=0.5
- _Requirements: 9.1, 9.7_
- [x] 3.3 Write property test for sigmoid gate monotonicity
- **Property 1: Sigmoid Gate Monotonicity**
- **Validates: Requirements 2.6, 17.1**
- [x] 3.4 Write property test for Beta posterior evidence accumulation
- **Property 2: Beta Posterior Evidence Accumulation**
- **Validates: Requirements 1.3, 17.2**
- [x] 3.5 Write property test for Bayesian confidence symmetry and divergence
- **Property 3: Bayesian Confidence Symmetry and Divergence**
- **Validates: Requirements 1.4, 17.3**
- [x] 3.6 Write property test for Bayesian posterior round-trip consistency
- **Property 4: Bayesian Posterior Round-Trip Consistency**
- **Validates: Requirements 1.7, 17.7**
- [x] 3.7 Write property test for Shannon entropy range and maximum
- **Property 8: Shannon Entropy Range and Maximum**
- **Validates: Requirements 9.7**
- [x] 3.8 Write property test for Bayesian confidence monotonic with agreeing signals
- **Property 13: Bayesian Confidence Monotonic with Agreeing Signals**
- **Validates: Requirements 8.6**
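The formulas in tasks 3.1 and 3.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of `bayesian.py`: signals are passed as `(combined_weight, sentiment_value)` pairs, weights are non-negative, `W_bull` and `W_bear` are taken to be the sums of `|w·s|` for positive and negative sentiment respectively, and `PRIOR` is a module-level constant here rather than a class-level one.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class BayesianPosterior:
    p_bull: float
    alpha: float
    beta: float
    log_likelihood: float
    bayesian_confidence: float
    entropy: float
    signal_count: int


# Uninformative prior: P_bull = 0.5, alpha = beta = 1, C = 0, H = 1.
PRIOR = BayesianPosterior(0.5, 1.0, 1.0, 0.0, 0.0, 1.0, 0)


def compute_entropy(p: float) -> float:
    """Binary Shannon entropy in bits; 0.0 at or beyond the edges."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def _sigmoid(x: float) -> float:
    """Logistic function written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def compute_bayesian_posterior(
    signals: Iterable[Tuple[float, float]],
) -> BayesianPosterior:
    """`signals` holds (combined_weight, sentiment_value) pairs (assumed shape)."""
    log_l = w_bull = w_bear = 0.0
    count = 0
    for weight, sentiment in signals:
        if math.isnan(weight) or math.isnan(sentiment):
            continue  # skip unusable signals
        log_l += weight * sentiment  # L_t = sum(w_i * s_i)
        # Assumption: evidence mass is |w * s|, credited by sentiment sign.
        if sentiment > 0:
            w_bull += weight * sentiment
        elif sentiment < 0:
            w_bear += weight * -sentiment
        count += 1
    if count == 0:
        return PRIOR  # empty input, or every signal was skipped
    alpha, beta = 1.0 + w_bull, 1.0 + w_bear
    confidence = 1.0 - 4.0 * alpha * beta / (alpha + beta) ** 2
    p_bull = _sigmoid(log_l)
    return BayesianPosterior(
        p_bull, alpha, beta, log_l, confidence, compute_entropy(p_bull), count
    )
```

With two agreeing signals of weight 1.0 and sentiment 1.0, this gives α = 3, β = 1 and C = 1 − 12/16 = 0.25.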
- [ ] 4. New module: Regime Detector (`services/aggregation/regime.py`)
- [x] 4.1 Implement `MarketRegime` enum, `RegimeClassification` and `RegimeConfig` dataclasses
- `MarketRegime`: `TREND_FOLLOWING`, `PANIC`, `MEAN_REVERSION`, `UNCERTAINTY`
- `RegimeClassification`: `regime`, `trend_indicator`, `volatility_ratio`, `bullish_threshold`, `bearish_threshold`, `contradiction_penalty_multiplier`
- `RegimeConfig`: all configurable parameters with defaults from design
- _Requirements: 7.3_
- [x] 4.2 Implement `compute_ema` and `classify_regime` functions
- `compute_ema`: exponential moving average over last N values
- `classify_regime`: compute trend indicator `R = sign(EMA_20 - EMA_100)` and volatility ratio `V_r = σ_20 / σ_100`
- Classification rules: trend-following (R≠0 AND V_r<1.2), panic (V_r>1.5), mean-reversion (R=0 AND V_r<1.0), uncertainty (all other)
- Adjust thresholds per regime: panic→±0.10, mean-reversion→±0.20, trend-following→±0.15, uncertainty→±0.15 with contradiction multiplier 0.6
- Default to uncertainty when data is insufficient (<100 days) or σ values are zero
- _Requirements: 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.9_
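The classification rules in task 4.2 could be sketched as below. Several details are assumptions made for the example: the EMA uses the conventional smoothing factor `2/(N+1)` and is seeded with the first value of its window, σ is the population standard deviation of returns, and regimes other than uncertainty use a contradiction multiplier of 1.0 (the plan only specifies 0.6 for uncertainty).

```python
import statistics
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class MarketRegime(Enum):
    TREND_FOLLOWING = "trend_following"
    PANIC = "panic"
    MEAN_REVERSION = "mean_reversion"
    UNCERTAINTY = "uncertainty"


@dataclass(frozen=True)
class RegimeClassification:
    regime: MarketRegime
    trend_indicator: int
    volatility_ratio: float
    bullish_threshold: float
    bearish_threshold: float
    contradiction_penalty_multiplier: float


# regime -> (threshold magnitude, contradiction multiplier).
# The 1.0 multipliers are an assumption; only 0.6 comes from the plan.
_ADJUSTMENTS = {
    MarketRegime.TREND_FOLLOWING: (0.15, 1.0),
    MarketRegime.PANIC: (0.10, 1.0),
    MarketRegime.MEAN_REVERSION: (0.20, 1.0),
    MarketRegime.UNCERTAINTY: (0.15, 0.6),
}


def compute_ema(values: Sequence[float], n: int) -> float:
    """EMA over the last n values, seeded with the first value of the window."""
    window = values[-n:]
    k = 2.0 / (n + 1)  # conventional smoothing factor (assumed)
    ema = window[0]
    for v in window[1:]:
        ema = k * v + (1.0 - k) * ema
    return ema


def _result(regime: MarketRegime, trend: int, v_ratio: float) -> RegimeClassification:
    threshold, multiplier = _ADJUSTMENTS[regime]
    return RegimeClassification(regime, trend, v_ratio, threshold, -threshold, multiplier)


def classify_regime(
    closes: Sequence[float], returns: Sequence[float], min_days: int = 100
) -> RegimeClassification:
    if len(closes) < min_days or len(returns) < min_days:
        return _result(MarketRegime.UNCERTAINTY, 0, 0.0)  # insufficient data
    sigma_20 = statistics.pstdev(returns[-20:])
    sigma_100 = statistics.pstdev(returns[-100:])
    if sigma_20 == 0.0 or sigma_100 == 0.0:
        return _result(MarketRegime.UNCERTAINTY, 0, 0.0)  # degenerate volatility
    diff = compute_ema(closes, 20) - compute_ema(closes, 100)
    trend = (diff > 0) - (diff < 0)  # sign(EMA_20 - EMA_100)
    v_ratio = sigma_20 / sigma_100
    if trend != 0 and v_ratio < 1.2:
        regime = MarketRegime.TREND_FOLLOWING
    elif v_ratio > 1.5:
        regime = MarketRegime.PANIC
    elif trend == 0 and v_ratio < 1.0:
        regime = MarketRegime.MEAN_REVERSION
    else:
        regime = MarketRegime.UNCERTAINTY
    return _result(regime, trend, v_ratio)
```

Because the three named rules cover disjoint regions of `(R, V_r)`, the order of the `if` branches does not change the result.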
- [ ] 5. New module: Source Accuracy Tracker (`services/aggregation/source_accuracy.py`)
- [x] 5.1 Implement `SourceAccuracy` dataclass and database functions
- `SourceAccuracy` dataclass with `source_id`, `accuracy_ratio`, `sample_count`, `last_updated`
- `accuracy_factor` property: return 1.0 when sample_count < 10, else `0.5 + accuracy_ratio`
- `fetch_source_accuracy`: batch fetch from `source_accuracy` table via asyncpg
- `update_source_accuracy`: update accuracy metrics from realized price outcomes
- Handle DB unreachable: return neutral factor 1.0 for all sources
- Clamp corrupted accuracy_ratio to [0.0, 1.0]
- _Requirements: 4.1, 4.2, 4.3, 4.4, 4.5_
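The `accuracy_factor` rule and the clamping in task 5.1 can be illustrated with a small dataclass. This sketch leaves out the asyncpg functions entirely; the `MIN_SAMPLES` name and the choice to map a NaN ratio to 0.5 are assumptions made for the example.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

MIN_SAMPLES = 10  # below this sample count the factor stays neutral


@dataclass
class SourceAccuracy:
    source_id: str
    accuracy_ratio: float = 0.5
    sample_count: int = 0
    last_updated: Optional[datetime] = None

    def __post_init__(self) -> None:
        # Clamp corrupted ratios into [0.0, 1.0]; NaN falls back to 0.5 (assumption).
        if math.isnan(self.accuracy_ratio):
            self.accuracy_ratio = 0.5
        self.accuracy_ratio = min(1.0, max(0.0, self.accuracy_ratio))

    @property
    def accuracy_factor(self) -> float:
        """1.0 until enough samples exist, then 0.5 + accuracy_ratio (range 0.5 to 1.5)."""
        if self.sample_count < MIN_SAMPLES:
            return 1.0
        return 0.5 + self.accuracy_ratio
```

A source with a 0.8 accuracy ratio therefore contributes a factor of 1.3 once it has at least 10 samples, and 1.0 before that.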
- [x] 6. Checkpoint — Verify new modules compile and unit tests pass
- Ensure all tests pass, ask the user if questions arise.

- [ ] 7. Signal Scorer upgrades (`services/aggregation/scoring.py`)
- [x] 7.1 Implement sigmoid confidence gate
- Add `sigmoid_gate(x, steepness, midpoint)` function: `σ(k·(x - midpoint))`
- When `probabilistic=True`, replace binary gate with sigmoid gate in `compute_signal_weight`
- When `probabilistic=False`, preserve existing binary gate behavior
- _Requirements: 2.1, 2.2, 2.3, 2.4, 2.5_
- [x] 7.2 Implement information gain surprise weighting
- Add `EVENT_TYPE_BASE_RATES` constant dict and `DEFAULT_BASE_RATE = 0.1`
- Add `compute_info_gain(event_type, lambda_param, max_gain, default_base_rate)` function: `r = 1 + λ·(-log₂ P(event_type))`, clamped to `max_gain` (3.0 by default)
- Integrate as multiplicative factor in combined weight when `probabilistic=True`
- _Requirements: 3.1, 3.2, 3.3, 3.4, 3.5_
- [x] 7.3 Implement adaptive recency decay
- Add `compute_adaptive_half_life(base_half_life, impact_score, info_gain_factor, market_multiplier, config)` function
- Compute `β_impact`, `β_surprise`, `β_market_reaction` scaling factors per design
- `τ_i = τ_base · (1 + β_impact) · (1 + β_surprise) · (1 + β_market_reaction)`
- When `probabilistic=True`, use adaptive half-life in `recency_weight`; otherwise use fixed
- _Requirements: 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7_
- [x] 7.4 Implement regime multiplier replacing market context multiplier
- Add `compute_regime_multiplier(returns, volumes, config)` function
- Compute z-scores for return and volume, then `M_regime = 1 + 0.15·|z_r| + 0.10·|z_v|`
- Clamp to [1.0, 2.5]; default to 1.0 when data unavailable or σ=0
- When `probabilistic=True`, use `M_regime` instead of `M_context` in combined weight
- _Requirements: 6.1, 6.2, 6.3, 6.4, 6.5_
- [x] 7.5 Integrate source accuracy factor into `compute_signal_weight`
- Accept optional `source_accuracy_factor` parameter
- When `probabilistic=True`, multiply into combined weight formula
- When `probabilistic=False`, ignore (factor = 1.0)
- _Requirements: 4.2, 4.3_
- [x] 7.6 Update `compute_signal_weight` to branch on `probabilistic` flag
- When `probabilistic=True`: use sigmoid gate × recency (adaptive) × credibility × (1 + novelty) × info_gain × source_accuracy × regime_multiplier
- When `probabilistic=False`: preserve exact current formula (binary gate × recency × credibility × (1 + novelty) × market_context)
- Populate all new optional fields on `SignalWeight` and `WeightedSignal`
- _Requirements: 16.4, 16.5_
- [x] 7.7 Write property test for information gain monotonicity
- **Property 6: Information Gain Monotonicity**
- **Validates: Requirements 3.5**
- [x] 7.8 Write property test for adaptive decay lower bound
- **Property 5: Adaptive Decay Lower Bound**
- **Validates: Requirements 5.7, 17.4**
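Three of the scoring helpers from tasks 7.1, 7.2 and 7.4 could be sketched as below. The base-rate table, the default `steepness=10.0`, `midpoint=0.5` and `lambda_param=0.2` are placeholder values chosen for illustration, and the z-score is assumed to compare the latest observation against the preceding history; the design document may define these differently.

```python
import math
import statistics
from typing import Mapping, Optional, Sequence

# Hypothetical base rates for illustration only.
EVENT_TYPE_BASE_RATES: Mapping[str, float] = {"earnings": 0.25, "bankruptcy": 0.001}
DEFAULT_BASE_RATE = 0.1


def sigmoid_gate(x: float, steepness: float = 10.0, midpoint: float = 0.5) -> float:
    """sigma(k * (x - midpoint)); the default k and midpoint are assumed values."""
    z = steepness * (x - midpoint)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def compute_info_gain(
    event_type: str,
    lambda_param: float = 0.2,
    max_gain: float = 3.0,
    default_base_rate: float = DEFAULT_BASE_RATE,
) -> float:
    """r = 1 + lambda * (-log2 P(event_type)), clamped to max_gain."""
    p = EVENT_TYPE_BASE_RATES.get(event_type, default_base_rate)
    if p <= 0.0:
        return max_gain  # guard log2(0): treat as maximally surprising
    p = min(p, 1.0)
    return min(max_gain, 1.0 + lambda_param * -math.log2(p))


def _z_score(values: Sequence[float]) -> Optional[float]:
    """z-score of the latest value against the preceding history, or None."""
    if len(values) < 3:
        return None
    history, latest = values[:-1], values[-1]
    sigma = statistics.pstdev(history)
    if sigma == 0.0:
        return None
    return (latest - statistics.fmean(history)) / sigma


def compute_regime_multiplier(
    returns: Sequence[float],
    volumes: Sequence[float],
    return_weight: float = 0.15,
    volume_weight: float = 0.10,
    max_multiplier: float = 2.5,
) -> float:
    """M_regime = 1 + 0.15*|z_r| + 0.10*|z_v|, clamped to [1.0, max_multiplier]."""
    z_r, z_v = _z_score(returns), _z_score(volumes)
    if z_r is None or z_v is None:
        return 1.0  # data unavailable or sigma = 0
    raw = 1.0 + return_weight * abs(z_r) + volume_weight * abs(z_v)
    return min(max_multiplier, max(1.0, raw))
```

With these placeholder defaults, an unknown event type (base rate 0.1) yields a factor of about 1.66, and the gate passes 0.5 at the midpoint.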
- [ ] 8. Contradiction upgrade (`services/aggregation/contradiction.py`)
- [x] 8.1 Implement weighted disagreement entropy contradiction
- Compute `f_pos = W_positive / (W_positive + W_negative)` and `f_neg = 1 - f_pos`
- Compute `H_contradiction = -f_pos·log₂(f_pos) - f_neg·log₂(f_neg)`
- Weight by evidence mass: `contradiction_score = H_contradiction · min(1.0, (W_pos + W_neg) / W_threshold)`
- Return 0.0 when only one direction exists
- Preserve existing `ContradictionResult` interface
- When `probabilistic=False`, preserve existing minority/majority ratio behavior
- _Requirements: 15.1, 15.2, 15.3, 15.4, 15.5, 15.6, 15.7_
- [x] 8.2 Write property test for contradiction entropy monotonicity
- **Property 9: Contradiction Entropy Monotonicity**
- **Validates: Requirements 15.7**
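The entropy-based contradiction score from task 8.1 can be written compactly. In this sketch the function name `entropy_contradiction` is hypothetical, `w_threshold` is a required argument because its default is not stated here, and the `ContradictionResult` wrapper is omitted.

```python
import math


def _plogp(f: float) -> float:
    """-f * log2(f), defined as 0.0 when f is not positive."""
    return -f * math.log2(f) if f > 0.0 else 0.0


def entropy_contradiction(w_positive: float, w_negative: float, w_threshold: float) -> float:
    """H_contradiction scaled by evidence mass; inputs are non-negative weight sums."""
    if w_threshold <= 0.0:
        raise ValueError("w_threshold must be positive")
    if w_positive <= 0.0 or w_negative <= 0.0:
        return 0.0  # only one direction present: no contradiction
    total = w_positive + w_negative
    h = _plogp(w_positive / total) + _plogp(w_negative / total)
    return h * min(1.0, total / w_threshold)
```

An even split with enough evidence scores 1.0, while a 3:1 split with half the threshold mass scores about 0.41 (entropy 0.811 times 0.5).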
- [ ] 9. Trend Assembly upgrades (`services/aggregation/worker.py`)
- [x] 9.1 Integrate Bayesian posterior into trend assembly
- When `probabilistic=True`, call `compute_bayesian_posterior` on merged signals
- Use Bayesian confidence formula for trend confidence: `0.5 × C_bayesian + 0.25 × F_count + 0.25 × C_avg_credibility - P_contradiction`
- Use entropy-based direction: H>0.9→mixed, P_bull>0.65→bullish, P_bull<0.35→bearish, else neutral
- Apply regime-adjusted thresholds from `RegimeClassification`
- Populate new `TrendSummary` fields: `p_bull`, `alpha`, `beta_param`, `bayesian_confidence`, `entropy`, `regime`, `pipeline_mode`
- Store probabilistic outputs in `market_context` JSONB under `"probabilistic"` key
- When `probabilistic=False`, preserve exact current heuristic behavior
- _Requirements: 1.1, 1.2, 8.1, 8.2, 8.3, 8.4, 8.5, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 7.8, 16.4, 16.5_
- [x] 9.2 Wire regime detection into the aggregation cycle
- Call `classify_regime` with closing prices and returns for each ticker
- Pass `RegimeClassification` to trend assembly for threshold adjustment
- Default to uncertainty regime when market data is unavailable
- Persist regime classification in JSONB for auditability
- _Requirements: 7.1, 7.2, 7.3, 7.8, 7.9_
- [ ] 10. Macro scoring upgrade (`services/aggregation/interpolation.py`)
- [x] 10.1 Implement multiplicative macro exposure formula
- When `probabilistic=True`, compute `S_macro = severity · (1 - Π_k(1 - w_k · O_k))` instead of linear weighted sum
- Preserve overlap weights: w_geo=0.35, w_supply=0.25, w_commodity=0.25, w_sector=0.15
- Preserve severity mapping and resilience modifier
- When `probabilistic=False`, preserve exact current linear formula
- _Requirements: 10.1, 10.2, 10.3, 10.4, 10.5, 10.6_
- [x] 10.2 Implement conditional macro signal integration
- When `probabilistic=True` and both company and macro signals exist, apply macro as multiplicative modifier: `S_adjusted = S_company · clamp(1 + M_macro · sign_alignment, 0.5, 1.5)`
- When only macro signals exist, fall back to additive behavior with weight 0.3
- When only company signals exist, use modifier = 1.0
- Log macro modifier value per ticker
- When `probabilistic=False`, preserve current additive merge behavior
- _Requirements: 11.1, 11.2, 11.3, 11.4, 11.5_
- [x] 10.3 Write property test for multiplicative macro exposure monotonicity
- **Property 7: Multiplicative Macro Exposure Monotonicity**
- **Validates: Requirements 10.7, 17.5**
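The two macro formulas in tasks 10.1 and 10.2 could be sketched as follows. The dimension keys (`"geo"`, `"supply"`, `"commodity"`, `"sector"`) and both function names are hypothetical; the severity mapping and resilience modifier are left out, and `sign_alignment` is assumed to be +1 when macro and company signals agree in direction and -1 when they oppose.

```python
from typing import Mapping

# Overlap weights from task 10.1; the key names are illustrative.
OVERLAP_WEIGHTS = {"geo": 0.35, "supply": 0.25, "commodity": 0.25, "sector": 0.15}


def multiplicative_macro_exposure(severity: float, overlaps: Mapping[str, float]) -> float:
    """S_macro = severity * (1 - prod_k(1 - w_k * O_k)), with each O_k in [0, 1]."""
    remaining = 1.0
    for name, weight in OVERLAP_WEIGHTS.items():
        overlap = min(1.0, max(0.0, overlaps.get(name, 0.0)))
        remaining *= 1.0 - weight * overlap
    return severity * (1.0 - remaining)


def apply_macro_modifier(s_company: float, m_macro: float, sign_alignment: int) -> float:
    """S_adjusted = S_company * clamp(1 + M_macro * sign_alignment, 0.5, 1.5)."""
    modifier = min(1.5, max(0.5, 1.0 + m_macro * sign_alignment))
    return s_company * modifier
```

With every overlap at 1.0 the product term is 0.65 · 0.75 · 0.75 · 0.85 ≈ 0.311, so the exposure is about 0.689 times the severity, and it is exactly zero when no dimension overlaps.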
- [ ] 11. Competitive signal upgrade (`services/aggregation/signal_propagation.py`)
- [x] 11.1 Implement graph-distance attenuation for competitive signals
- When `probabilistic=True`, compute `S_transfer = S_source · ρ_historical · e^(-d_network)` instead of flat transfer
- Compute graph distance as shortest path in competitor relationship graph (cap at 3)
- Use 90-day rolling Pearson correlation for `ρ_historical`; default to 0.3 (same-sector) or 0.1 (cross-sector) when insufficient data (<30 days)
- Preserve existing relationship strength threshold (R ≥ 0.2) as pre-filter
- When `probabilistic=False`, preserve exact current flat transfer behavior
- _Requirements: 12.1, 12.2, 12.3, 12.4, 12.5, 12.6, 12.7_
- [x] 11.2 Write property test for competitive signal distance attenuation
- **Property 11: Competitive Signal Distance Attenuation**
- **Validates: Requirements 12.7**
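The distance attenuation in task 11.1 could be sketched with a breadth-first search over the competitor graph. The adjacency-dict representation and both function names are assumptions for this example, and the cap is interpreted as a cutoff (no propagation beyond distance 3), matching the unit test listed under task 17.4.

```python
import math
from collections import deque
from typing import Dict, Optional, Set

MAX_DISTANCE = 3  # competitors farther away than this receive no signal


def graph_distance(graph: Dict[str, Set[str]], source: str, target: str) -> Optional[int]:
    """Shortest hop count via BFS, or None when target is farther than MAX_DISTANCE."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == MAX_DISTANCE:
            continue  # do not expand past the cutoff
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def transfer_strength(s_source: float, rho_historical: float, distance: Optional[int]) -> float:
    """S_transfer = S_source * rho_historical * e^(-d_network); 0.0 beyond the cutoff."""
    if distance is None:
        return 0.0
    return s_source * rho_historical * math.exp(-distance)
```

Each extra hop multiplies the transferred strength by `1/e` (about 0.37), so a direct competitor with ρ = 0.5 receives roughly 18% of the source strength.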
- [ ] 12. Projection upgrade (`services/aggregation/projection.py`)
- [x] 12.1 Implement exponentially weighted momentum
- When `probabilistic=True`, compute `M_t = Σ_{k=0}^{K-1} λ^k · ΔS_{t-k}` with λ=0.7, K up to 10
- Normalize by geometric series sum to produce value in [-1, 1]
- Fall back to current heuristic when fewer than 2 historical cycles available
- Compute volatility-scaled momentum: `M_adj = M_t / max(σ_20, 0.01)`, clamped to [-2.0, 2.0]
- When `probabilistic=False`, preserve exact current simple momentum behavior
- _Requirements: 13.1, 13.2, 13.3, 13.4, 13.5, 13.6_
- [x] 12.2 Write property test for exponentially weighted momentum direction
- **Property 10: Exponentially Weighted Momentum Direction**
- **Validates: Requirements 13.6, 17.6**
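The momentum formula in task 12.1 can be sketched as below. Assumptions in this example: the strength history is ordered oldest to newest, `ΔS_{t-k}` is the difference between consecutive strengths, normalization divides by the sum of the weights actually used, and returning `None` signals the caller to fall back to the heuristic. Function names are hypothetical.

```python
from typing import Optional, Sequence


def exponentially_weighted_momentum(
    strengths: Sequence[float], decay: float = 0.7, max_lags: int = 10
) -> Optional[float]:
    """M_t = sum_k decay^k * dS_{t-k}, normalized by the geometric weight sum."""
    if len(strengths) < 2:
        return None  # too little history: caller uses the heuristic instead
    deltas = [b - a for a, b in zip(strengths[:-1], strengths[1:])]
    recent = deltas[::-1][:max_lags]  # index k = 0 is the newest delta
    weights = [decay ** k for k in range(len(recent))]
    return sum(w * d for w, d in zip(weights, recent)) / sum(weights)


def volatility_scaled_momentum(momentum: float, sigma_20: float) -> float:
    """M_adj = M_t / max(sigma_20, 0.01), clamped to [-2.0, 2.0]."""
    scaled = momentum / max(sigma_20, 0.01)
    return min(2.0, max(-2.0, scaled))
```

Because the result is a weighted average of the deltas, a strictly increasing strength sequence always yields positive momentum, which is the property exercised in task 12.2.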
- [ ] 13. Recommendation upgrade (`services/recommendation/eligibility.py`)
- [x] 13.1 Implement expected value recommendation gate
- When `probabilistic=True`, compute `EV = P_bull · R_up - P_bear · R_down`
- Estimate `R_up = strength · σ_20 · √(horizon_days)` and `R_down = (1 - strength) · σ_20 · √(horizon_days)`
- When EV > threshold (default 0.005), allow recommendation through existing gates
- When EV ≤ threshold, force recommendation to informational mode
- Persist EV in `risk_checks` JSONB of `recommendation_evaluations`
- Populate `expected_value`, `p_bull`, `pipeline_mode` on Recommendation model
- Preserve all existing eligibility gates as additional requirements
- When `probabilistic=False`, skip EV gate entirely
- _Requirements: 14.1, 14.2, 14.3, 14.4, 14.5, 14.6_
- [x] 13.2 Write property test for expected value directional consistency
- **Property 12: Expected Value Directional Consistency**
- **Validates: Requirements 17.8**
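The expected value gate from task 13.1 could be sketched as follows. The function names are hypothetical, and `P_bear = 1 - P_bull` is an assumption, since the plan does not define `P_bear` explicitly.

```python
import math


def expected_value(p_bull: float, strength: float, sigma_20: float, horizon_days: float) -> float:
    """EV = P_bull * R_up - P_bear * R_down with volatility-scaled return estimates."""
    scale = sigma_20 * math.sqrt(horizon_days)
    r_up = strength * scale
    r_down = (1.0 - strength) * scale
    p_bear = 1.0 - p_bull  # assumption: the two outcomes are complementary
    return p_bull * r_up - p_bear * r_down


def passes_ev_gate(ev: float, threshold: float = 0.005) -> bool:
    """True lets the recommendation continue to the existing eligibility gates."""
    return ev > threshold
```

For example, `P_bull = 0.7`, `strength = 0.6`, `σ_20 = 0.02` and a 5-day horizon give an EV of roughly 0.0134, which clears the default 0.005 threshold; a 50/50 posterior with `strength = 0.5` gives an EV of exactly 0 and is forced to informational mode.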
- [x] 14. Checkpoint — Verify all pipeline stages compile and existing tests still pass
- Ensure all tests pass; ask the user if questions arise.
- [ ] 15. Integration wiring and feature flag plumbing
- [x] 15.1 Wire feature flag through the aggregation worker entry point
- Read `probabilistic_scoring_enabled` from `risk_configs` at cycle start in `services/aggregation/worker.py`
- Pass flag to `ScoringConfig`, trend assembly, contradiction, macro, competitive, and projection stages
- Log pipeline mode at cycle start
- Ensure flag is read once per cycle (mid-cycle changes take effect next cycle)
- _Requirements: 16.3, 16.6, 16.7_
- [x] 15.2 Wire source accuracy fetch into the scoring pipeline
- At cycle start, batch-fetch source accuracy for all source IDs in the current signal set
- Pass `source_accuracy_factor` to `compute_signal_weight` for each signal
- Handle DB errors gracefully (default to 1.0)
- _Requirements: 4.1, 4.2, 4.3_
- [x] 15.3 Wire regime detection into the aggregation cycle
- Fetch closing prices and returns for each ticker from market data
- Call `classify_regime` and pass result to trend assembly and scoring stages
- Handle missing market data (default to uncertainty regime)
- _Requirements: 7.1, 7.8, 7.9_
- [x] 15.4 Store probabilistic outputs in existing JSONB columns
- Store Bayesian fields in `trend_windows.market_context` JSONB under `"probabilistic"` key
- Store EV fields in `recommendation_evaluations.risk_checks` JSONB
- Store regime classification in trend window JSONB
- _Requirements: 16.2_
- [ ] 16. Numerical stability and edge case hardening
- [x] 16.1 Add input validation and edge case guards across all new functions
- Guard `log₂(0)` in entropy and information gain computations
- Floor `max(σ_20, 0.01)` for momentum volatility scaling
- Default to uncertainty regime when σ values are zero
- Return `M_regime = 1.0` when z-score σ = 0
- Skip signals with NaN weight or sentiment
- Clamp all outputs to documented ranges
- _Requirements: 17.9, 6.4_
- [x] 16.2 Write property test for numerical stability across all formulas
- **Property 14: Numerical Stability Across All Formulas**
- **Validates: Requirements 17.9, 6.4**
- [ ] 17. Unit tests for all new and modified modules
- [x] 17.1 Write unit tests for Bayesian accumulator (`tests/test_bayesian.py`)
- Test uninformative prior (empty signals → P_bull=0.5, α=1, β=1, C=0)
- Test specific sigmoid gate values (x=0.5→0.5, x=0.2→<0.05, x=0.8→>0.95)
- Test entropy direction mapping (H>0.9→mixed, P_bull>0.65→bullish, etc.)
- _Requirements: 1.1, 1.2, 1.3, 1.4, 1.5_
- [x] 17.2 Write unit tests for regime detector (`tests/test_regime.py`)
- Test specific (R, V_r) → expected regime classification
- Test threshold adjustments per regime (panic→0.10, mean_reversion→0.20)
- Test insufficient data fallback to uncertainty
- _Requirements: 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.9_
- [x] 17.3 Write unit tests for source accuracy tracker (`tests/test_source_accuracy.py`)
- Test accuracy_factor property: sample_count < 10 → 1.0, else 0.5 + ratio
- Test corrupted data clamping
- _Requirements: 4.1, 4.2, 4.3_
- [x] 17.4 Write unit tests for signal scoring upgrades (`tests/test_signal_math_unit.py`)
- Test info gain clamp (very rare event → factor ≤ 3.0)
- Test default base rate (unknown event type → 0.1)
- Test adaptive decay edge cases (all zeros → τ_base, all max → 6×τ_base)
- Test zero overlap → zero macro impact
- Test max overlap → ≈severity×0.689 (all four overlaps at 1.0 with the task 10.1 weights: 1 − 0.65·0.75·0.75·0.85)
- Test macro fallback behaviors (only macro → additive, only company → no modifier)
- Test graph distance cutoff (d>3 → no propagation)
- Test momentum fallback (<2 cycles → heuristic)
- Test EV threshold behavior (EV>0.005→proceed, EV≤0.005→informational)
- Test feature flag behaviors (flag=false→heuristic, flag=true→probabilistic)
- _Requirements: 3.1, 3.4, 5.5, 5.6, 10.3, 10.4, 11.3, 13.3, 14.3, 14.4, 16.4, 16.5_
- [x] 18. Final checkpoint — Ensure all tests pass
- Ensure all tests pass; ask the user if questions arise.
## Notes
- Tasks marked with `*` are optional and can be skipped for a faster MVP
- Each task references specific requirements for traceability
- Checkpoints ensure incremental validation after each major phase
- Property tests validate the 14 universal correctness properties from the design document
- Unit tests validate specific examples, edge cases, and integration points
- The design uses Python throughout — no language selection needed
- Migration number is 034 (existing migrations go up to 033)
- All new dataclass fields use optional defaults for backward compatibility
- Feature flag `probabilistic_scoring_enabled` gates every behavioral change