docs: update equations.md with probabilistic pipeline formulas

Add sections 1B, 2B, 3B, 4B, 5B, 7B covering all new probabilistic formulas: sigmoid gate, info gain, adaptive decay, regime multiplier, source accuracy, Bayesian posterior, entropy direction, weighted disagreement entropy, multiplicative macro exposure, conditional macro integration, graph-distance attenuation, EW momentum, and EV gate. Updated constants summary with all new parameters.
2026-04-29 15:12:47 +00:00
parent 7eecd71a0d
commit ac29e62033
1 changed file with 264 additions and 0 deletions
@@ -74,6 +74,93 @@ S_avg = Σ(W_combined_i × impact_i × sentiment_i) / Σ(W_combined_i × impact_

 ---

+## 1B. Probabilistic Signal Scoring (Feature-Flagged)
+
+**Source:** `services/aggregation/scoring.py`
+**Active when:** `probabilistic_scoring_enabled = true` in `risk_configs.config` JSONB
+
+When the probabilistic pipeline is enabled, the combined weight formula changes:
+
+### 1B.1 Combined Signal Weight (Probabilistic)
+
+```
+W_combined = G_sigmoid × W_recency(adaptive) × W_credibility × (1 + B_novelty) × R_info × F_accuracy × M_regime
+```
+
+| Component | Symbol | Formula | Range |
+|---|---|---|---|
+| Sigmoid gate | G_sigmoid | σ(k·(x − midpoint)) = 1/(1+e^(−5·(x−0.5))) | (0, 1) |
+| Adaptive recency | W_recency | 2^(−t_age / τ_adaptive) | [0.01, 1.0] |
+| Credibility | W_credibility | same as heuristic | [0.1, 1.0] |
+| Novelty bonus | B_novelty | same as heuristic | [0, 0.25] |
+| Information gain | R_info | 1 + λ·(−log₂ P(event_type)) | [1.0, 3.0] |
+| Source accuracy | F_accuracy | 0.5 + accuracy_ratio (if samples ≥ 10, else 1.0) | [0.5, 1.5] |
+| Regime multiplier | M_regime | 1 + 0.15·|z_r| + 0.10·|z_v| | [1.0, 2.5] |
+
+### 1B.2 Sigmoid Confidence Gate
+
+Replaces the binary 0/1 gate with a smooth transition:
+
+```
+G_sigmoid = σ(k·(x − m)) = 1 / (1 + e^(−k·(x−m)))
+```
+
+Default: k = 5.0, m = 0.5. At x=0.5 → 0.5; at x=0.2 → ~0.18; at x=0.8 → ~0.82.
+
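The gate in §1B.2 can be checked numerically with a few lines of Python. This is a minimal sketch, not the code in `scoring.py`; the function name `sigmoid_gate` is hypothetical, and only the defaults k = 5.0 and m = 0.5 come from the text.

```python
import math


def sigmoid_gate(x: float, k: float = 5.0, m: float = 0.5) -> float:
    """Smooth confidence gate: 1 / (1 + e^(-k * (x - m)))."""
    return 1.0 / (1.0 + math.exp(-k * (x - m)))


# Reproduce the three reference points quoted in the section.
for x in (0.2, 0.5, 0.8):
    print(f"x={x:.1f} -> G_sigmoid={sigmoid_gate(x):.2f}")
```

With these defaults the gate never reaches 0 or 1 for x in [0, 1], so a low-confidence signal is down-weighted rather than dropped.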
+### 1B.3 Information Gain (Surprise Weighting)
+
+```
+R_info = min(1 + λ·(−log₂ P(event_type)), 3.0)
+```
+
+| Event Type | P(event_type) | R_info (λ=0.3) |
+|---|---|---|
+| earnings | 0.25 | 1.60 |
+| dividend | 0.15 | 1.82 |
+| product_launch | 0.10 | 2.00 |
+| regulatory | 0.08 | 2.09 |
+| management_change | 0.06 | 2.22 |
+| legal | 0.05 | 2.30 |
+| restructuring | 0.04 | 2.39 |
+| m_and_a | 0.03 | 2.52 |
+| unknown | 0.10 (default) | 2.00 |
+
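The surprise weighting in §1B.3 can be sketched as a lookup followed by a capped log transform. The dictionary and function names below are hypothetical; the base rates, λ = 0.3, the cap of 3.0, and the 0.10 default are taken from the table and formula in that section.

```python
import math

# Base rates copied from the event-type table; unlisted types use the default.
BASE_RATES = {
    "earnings": 0.25,
    "dividend": 0.15,
    "product_launch": 0.10,
    "regulatory": 0.08,
    "management_change": 0.06,
    "legal": 0.05,
    "restructuring": 0.04,
    "m_and_a": 0.03,
}
DEFAULT_BASE_RATE = 0.10


def info_gain(event_type: str, lam: float = 0.3, cap: float = 3.0) -> float:
    """R_info = min(1 + lam * (-log2 P(event_type)), cap)."""
    p = BASE_RATES.get(event_type, DEFAULT_BASE_RATE)
    surprise_bits = -math.log2(p)  # rarer event types carry more bits
    return min(1.0 + lam * surprise_bits, cap)


print(f"{info_gain('earnings'):.2f}")  # 1.60
```

An event with probability 0.25 carries exactly 2 bits of surprise, which gives 1 + 0.3 × 2 = 1.60.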
+### 1B.4 Adaptive Recency Decay
+
+```
+τ_adaptive = τ_base × (1 + β_impact) × (1 + β_surprise) × (1 + β_market)
+```
+
+| Factor | Formula | Range |
+|---|---|---|
+| β_impact | impact_score × 1.0 | [0, 1.0] |
+| β_surprise | (R_info − 1) / 2 × 1.0 | [0, 1.0] |
+| β_market | min((M_regime − 1) / 0.45, 1) × 0.5 | [0, 0.5] |
+
+Maximum adaptive half-life: 6× base, reached when all factors are at their maximum: (1 + 1.0) × (1 + 1.0) × (1 + 0.5) = 6.
+Minimum: τ_base, so adaptive decay is never faster than the fixed half-life.
+
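A rough sketch of the adaptive half-life in §1B.4 follows. Two things here are assumptions rather than statements from the source: each β is clamped to the range given in the factor table, and the recency weight is floored at 0.01 to match the range listed in §1B.1. The function names and the 24-unit base half-life are illustrative only.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(x, hi))


def adaptive_half_life(tau_base: float, impact_score: float,
                       r_info: float, m_regime: float) -> float:
    """tau_adaptive = tau_base * (1 + b_impact) * (1 + b_surprise) * (1 + b_market)."""
    # Assumption: each beta is clamped to its documented range.
    beta_impact = clamp(impact_score * 1.0, 0.0, 1.0)
    beta_surprise = clamp((r_info - 1.0) / 2.0 * 1.0, 0.0, 1.0)
    beta_market = clamp((m_regime - 1.0) / 0.45 * 0.5, 0.0, 0.5)
    return tau_base * (1 + beta_impact) * (1 + beta_surprise) * (1 + beta_market)


def recency_weight(t_age: float, tau: float) -> float:
    """W_recency = 2^(-t_age / tau), floored at 0.01 (assumed floor)."""
    return max(2.0 ** (-t_age / tau), 0.01)


tau = adaptive_half_life(24.0, impact_score=1.0, r_info=3.0, m_regime=2.5)
print(tau)                        # 144.0, i.e. 6x the base
print(recency_weight(24.0, 24.0))  # 0.5 after one half-life
```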
+### 1B.5 Regime Multiplier
+
+```
+z_r = (r_t − μ_20) / σ_20          (return z-score)
+z_v = (ln(V_t) − μ_V) / σ_V       (log-volume z-score)
+M_regime = clamp(1 + 0.15·|z_r| + 0.10·|z_v|, 1.0, 2.5)
+```
+
+Defaults to 1.0 when market data is unavailable or σ = 0.
+
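The regime multiplier in §1B.5 reduces to two z-scores and a clamp. In this sketch a zero standard deviation or a missing z-score both fall back to 1.0; treating either input as sufficient for the fallback is an assumption, and the helper names are hypothetical.

```python
from typing import Optional


def z_score(value: float, mean: float, std: float) -> Optional[float]:
    """Return None when the z-score is undefined (std == 0)."""
    if std == 0:
        return None
    return (value - mean) / std


def regime_multiplier(z_r: Optional[float], z_v: Optional[float]) -> float:
    """M_regime = clamp(1 + 0.15*|z_r| + 0.10*|z_v|, 1.0, 2.5)."""
    if z_r is None or z_v is None:
        return 1.0  # market data unavailable or sigma == 0
    raw = 1.0 + 0.15 * abs(z_r) + 0.10 * abs(z_v)
    return max(1.0, min(raw, 2.5))


# A 3-sigma return move with a 2-sigma log-volume move.
z_r = z_score(0.03, 0.0, 0.01)
z_v = z_score(14.0, 13.0, 0.5)
print(round(regime_multiplier(z_r, z_v), 2))  # 1.65
```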
+### 1B.6 Source Accuracy Factor
+
+```
+F_accuracy = 0.5 + clamp(accuracy_ratio, 0, 1)    if sample_count ≥ 10
+F_accuracy = 1.0                                    if sample_count < 10
+```
+
+Stored in `source_accuracy` table, updated asynchronously from realized 7-day price outcomes.
+
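The piecewise rule in §1B.6 is short enough to state directly. The sketch below is illustrative and does not touch the `source_accuracy` table; the function name and arguments are hypothetical.

```python
def source_accuracy_factor(accuracy_ratio: float, sample_count: int,
                           min_samples: int = 10) -> float:
    """F_accuracy = 0.5 + clamp(accuracy_ratio, 0, 1), or 1.0 with too few samples."""
    if sample_count < min_samples:
        return 1.0  # neutral factor until enough outcomes are recorded
    return 0.5 + max(0.0, min(accuracy_ratio, 1.0))


print(source_accuracy_factor(0.7, sample_count=25))  # about 1.2
print(source_accuracy_factor(0.7, sample_count=5))   # 1.0
```

A source that is right half the time (ratio 0.5) lands on 1.0, the same value used for sources with too little history, so only sources that beat or trail a coin flip move the weight.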
+---
+
 ## 2. Trend Summary Assembly

 **Source:** `services/aggregation/worker.py`
@@ -125,6 +212,79 @@ confidence = clamp(0.3 × F_count + 0.3 × C_avg + 0.4 × A_agreement − P_cont

 ---

+## 2B. Probabilistic Trend Assembly (Feature-Flagged)
+
+**Source:** `services/aggregation/worker.py`, `services/aggregation/bayesian.py`
+**Active when:** `probabilistic_scoring_enabled = true`
+
+### 2B.1 Bayesian Posterior Accumulation
+
+```
+L_t = Σ(W_combined_i × sentiment_i)                    (log-odds)
+P_bull = σ(L_t) = 1 / (1 + e^(−L_t))                  (bullish probability)
+α = 1 + W_bull       (W_bull = Σ W_combined for positive signals)
+β = 1 + W_bear       (W_bear = Σ W_combined for negative signals)
+C_bayesian = 1 − 4αβ / (α + β)²                       (Bayesian confidence)
+H = −P_bull·log₂(P_bull) − (1−P_bull)·log₂(1−P_bull)  (Shannon entropy)
+```
+
+Uninformative prior (no signals): P_bull = 0.5, α = 1, β = 1, C_bayesian = 0, H = 1.0.
+
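The quantities in §2B.1 can be computed together from a list of (weight, sentiment) pairs. This is a sketch under assumptions: the function names are hypothetical, zero-sentiment signals are counted in neither W_bull nor W_bear, and the sigmoid is written in a numerically stable form, which the source does not specify.

```python
import math
from typing import Sequence, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binary_entropy(p: float) -> float:
    """Shannon entropy of a Bernoulli(p) variable, in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bayesian_posterior(signals: Sequence[Tuple[float, float]]):
    """signals: (W_combined, sentiment) pairs. Returns (P_bull, alpha, beta, C, H)."""
    l_t = sum(w * s for w, s in signals)
    p_bull = sigmoid(l_t)
    alpha = 1.0 + sum(w for w, s in signals if s > 0)
    beta = 1.0 + sum(w for w, s in signals if s < 0)
    c_bayesian = 1.0 - 4.0 * alpha * beta / (alpha + beta) ** 2
    return p_bull, alpha, beta, c_bayesian, binary_entropy(p_bull)


print(bayesian_posterior([]))  # (0.5, 1.0, 1.0, 0.0, 1.0)
```

The empty-input call reproduces the uninformative prior stated in the section.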
+### 2B.2 Entropy-Based Direction
+
+| Condition | Direction |
+|---|---|
+| H > 0.9 | Mixed |
+| P_bull > 0.65 | Bullish |
+| P_bull < 0.35 | Bearish |
+| otherwise | Neutral |
+
+### 2B.3 Bayesian Trend Confidence
+
+```
+confidence = clamp(0.5 × C_bayesian + 0.25 × F_count + 0.25 × C_avg_credibility − P_contradiction, 0, 1)
+```
+
+| Component | Formula |
+|---|---|
+| C_bayesian | 1 − 4αβ/(α+β)² from Beta posterior |
+| F_count | min(N_unique_sources / 15, 0.8) |
+| C_avg_credibility | mean credibility weight across active signals |
+| P_contradiction | contradiction_entropy × regime.contradiction_penalty_multiplier |
+
+### 2B.4 Weighted Disagreement Entropy (Contradiction)
+
+**Source:** `services/aggregation/contradiction.py`
+
+```
+f_pos = W_positive / (W_positive + W_negative)
+f_neg = 1 − f_pos
+H_contradiction = −f_pos·log₂(f_pos) − f_neg·log₂(f_neg)
+contradiction_score = H_contradiction × min(1.0, (W_positive + W_negative) / W_threshold)
+```
+
+Default W_threshold = 5.0. The score is 0.0 when all signal weight lies in one direction (f_pos or f_neg is 0).
+
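A compact sketch of the weighted disagreement entropy in §2B.4, with a hypothetical function name. The early return covers the one-direction case, where the entropy terms would otherwise evaluate log₂(0).

```python
import math


def contradiction_score(w_positive: float, w_negative: float,
                        w_threshold: float = 5.0) -> float:
    """Binary entropy of the weight split, scaled by total evidence."""
    if w_positive <= 0 or w_negative <= 0:
        return 0.0  # only one direction present
    total = w_positive + w_negative
    f_pos = w_positive / total
    f_neg = 1.0 - f_pos
    h = -f_pos * math.log2(f_pos) - f_neg * math.log2(f_neg)
    return h * min(1.0, total / w_threshold)


print(contradiction_score(2.5, 2.5))  # 1.0: even split, enough total weight
print(contradiction_score(1.0, 1.0))  # 0.4: even split, but only 2/5 of the threshold
```

The second call shows the role of W_threshold: a perfectly split pair of weak signals is penalised less than the same split backed by more total weight.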
+### 2B.5 Regime Detection
+
+**Source:** `services/aggregation/regime.py`
+
+```
+R = sign(EMA_20 − EMA_100)          (trend indicator)
+V_r = σ_20 / σ_100                  (volatility ratio)
+```
+
+| Condition | Regime | Threshold | Contradiction multiplier |
+|---|---|---|---|
+| V_r > 1.5 | Panic | ±0.10 | 0.4 |
+| R ≠ 0 AND V_r < 1.2 | Trend-following | ±0.15 | 0.4 |
+| R = 0 AND V_r < 1.0 | Mean-reversion | ±0.20 | 0.4 |
+| otherwise | Uncertainty | ±0.15 | 0.6 |
+
+Falls back to Uncertainty when fewer than 100 days of data are available or σ = 0.
+
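The regime table in §2B.5 reads naturally as a top-to-bottom decision list. The sketch below assumes exactly that evaluation order, and treats a zero in either σ_20 or σ_100 as the fallback case; neither detail is confirmed by the source. The `Regime` dataclass and all names are hypothetical, while the numeric values are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regime:
    name: str
    threshold: float           # value from the Threshold column (applied as +/-)
    contradiction_mult: float  # contradiction penalty multiplier


PANIC = Regime("panic", 0.10, 0.4)
TREND_FOLLOWING = Regime("trend_following", 0.15, 0.4)
MEAN_REVERSION = Regime("mean_reversion", 0.20, 0.4)
UNCERTAINTY = Regime("uncertainty", 0.15, 0.6)


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


def classify_regime(ema_20: float, ema_100: float,
                    sigma_20: float, sigma_100: float, n_days: int) -> Regime:
    if n_days < 100 or sigma_20 == 0 or sigma_100 == 0:
        return UNCERTAINTY  # documented fallback
    r = sign(ema_20 - ema_100)
    v_r = sigma_20 / sigma_100
    if v_r > 1.5:
        return PANIC
    if r != 0 and v_r < 1.2:
        return TREND_FOLLOWING
    if r == 0 and v_r < 1.0:
        return MEAN_REVERSION
    return UNCERTAINTY


print(classify_regime(105.0, 100.0, 0.03, 0.015, 250).name)  # panic
```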
+---
+
 ## 3. Macro Impact Scoring (Layer 2)

 **Source:** `services/aggregation/interpolation.py`
@@ -184,6 +344,30 @@ S_final = clamp(S_raw × R_tier, 0, 1)

 For domestic-only events, R_tier = 1.0 regardless of tier.

+### 3B. Multiplicative Macro Exposure (Probabilistic)
+
+**Active when:** `probabilistic_scoring_enabled = true`
+
+```
+S_raw = W_severity × (1 − Π_k(1 − w_k × O_k))
+     = W_severity × (1 − (1−0.35·O_geo)(1−0.25·O_supply)(1−0.25·O_commodity)(1−0.15·O_sector))
+```
+
+Zero overlap → 0.0. Maximum overlap (all O_k = 1.0) → W_severity × 0.689, since 1 − 0.65 × 0.75 × 0.75 × 0.85 ≈ 0.689.
+
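The multiplicative exposure in §3B is a "probability of at least one hit" style combination of the four overlaps. A minimal sketch, with hypothetical key names for the overlap dimensions and the weights taken from the expanded formula:

```python
# Overlap weights from the expanded formula; the dictionary keys are illustrative.
OVERLAP_WEIGHTS = {"geo": 0.35, "supply": 0.25, "commodity": 0.25, "sector": 0.15}


def macro_exposure(severity: float, overlaps: dict) -> float:
    """S_raw = severity * (1 - prod_k(1 - w_k * O_k)); missing overlaps count as 0."""
    no_exposure = 1.0
    for key, weight in OVERLAP_WEIGHTS.items():
        no_exposure *= 1.0 - weight * overlaps.get(key, 0.0)
    return severity * (1.0 - no_exposure)


full = {key: 1.0 for key in OVERLAP_WEIGHTS}
print(round(macro_exposure(1.0, full), 3))  # 0.689
print(macro_exposure(1.0, {}))              # 0.0
```

Unlike a weighted sum, the product form gives diminishing returns: each additional overlap raises the score by less than its nominal weight.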
+### 3B.1 Conditional Macro Integration
+
+When both company and macro signals exist:
+```
+modifier = clamp(1 + M_macro × sign_alignment, 0.5, 1.5)
+S_adjusted = S_company × modifier
+```
+
+sign_alignment = +1 (agree), −1 (disagree), 0 (neutral/mixed).
+
+When only macro signals exist: additive fallback with weight 0.3.
+When only company signals exist: modifier = 1.0.
+
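The two fully specified cases of §3B.1 (both signal types present, and company-only) can be sketched as below. The macro-only additive fallback is left out because the text does not give its exact form. Function and parameter names are hypothetical.

```python
from typing import Optional


def macro_modifier(m_macro: float, sign_alignment: int) -> float:
    """modifier = clamp(1 + M_macro * sign_alignment, 0.5, 1.5)."""
    return max(0.5, min(1.0 + m_macro * sign_alignment, 1.5))


def adjust_company_score(s_company: float, m_macro: Optional[float] = None,
                         sign_alignment: int = 0) -> float:
    """Apply the macro modifier; with no macro signal the modifier is 1.0."""
    if m_macro is None:
        return s_company
    return s_company * macro_modifier(m_macro, sign_alignment)


print(adjust_company_score(0.4, m_macro=0.3, sign_alignment=+1))  # about 0.52
print(adjust_company_score(0.4, m_macro=0.8, sign_alignment=-1))  # 0.2 (modifier floored at 0.5)
```

Because of the clamp, a strongly opposing macro signal can at most halve the company score; it cannot flip its sign.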
 ### 3.4 Macro Impact Confidence

 ```
@@ -256,6 +440,22 @@ S_competitive = clamp(S_pattern_avg × R_relationship × C_pattern × I_source,

 **Threshold gate:** Skipped if R_relationship < propagation_strength_threshold (default 0.2).

+### 4B. Graph-Distance Attenuation (Probabilistic)
+
+**Active when:** `probabilistic_scoring_enabled = true`
+
+```
+S_transfer = S_source × ρ_historical × e^(−d_network)
+```
+
+| Component | Description |
+|---|---|
+| S_source | Source signal strength |
+| ρ_historical | 90-day rolling Pearson correlation (default 0.3 same-sector, 0.1 cross-sector) |
+| d_network | Shortest path in competitor graph (capped at 3) |
+
+No propagation when d_network > 3 (e^(−3) ≈ 0.05).
+
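A small sketch of the attenuation in §4B. It follows the statement that nothing propagates beyond distance 3; the function name is hypothetical, and the 0.3 correlation in the example is the same-sector default from the component table.

```python
import math


def transfer_strength(s_source: float, rho_historical: float,
                      d_network: int, max_distance: int = 3) -> float:
    """S_transfer = S_source * rho * e^(-d), with no propagation past max_distance."""
    if d_network > max_distance:
        return 0.0
    return s_source * rho_historical * math.exp(-d_network)


for d in range(5):
    print(d, round(transfer_strength(0.8, 0.3, d), 4))
```

Each extra hop multiplies the transferred strength by e^(−1) ≈ 0.37, so by distance 3 only about 5% of the correlation-scaled signal remains.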
 ### 4.3 Competitive Signal as WeightedSignal

 ```
@@ -321,6 +521,19 @@ C_projected = C_base × 0.8 + min(S_macro × 0.15, 0.1)

 **Divergence detection:** Flagged when projected direction ≠ current trend direction.

+### 5B. Exponentially Weighted Momentum (Probabilistic)
+
+**Source:** `services/aggregation/projection.py`
+**Active when:** `probabilistic_scoring_enabled = true`
+
+```
+M_t = Σ_{k=0}^{K-1} λ^k × ΔS_{t-k}     (λ = 0.7, K ≤ 10)
+M_normalized = M_t / Σ_{k=0}^{K-1} λ^k   (range: [−1, 1] when each ΔS ∈ [−1, 1])
+M_adj = clamp(M_normalized / max(σ_20, 0.01), −2.0, 2.0)
+```
+
+Falls back to heuristic momentum when fewer than 2 historical cycles are available.
+
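The momentum formulas in §5B can be sketched as a weighted average followed by volatility scaling. The function name is hypothetical, and the ordering convention (most recent change first, so `deltas[0]` is ΔS_t) is an assumption made for this example. The heuristic fallback is outside the sketch, so an empty history simply raises.

```python
from typing import Sequence


def ew_momentum(deltas: Sequence[float], sigma_20: float,
                lam: float = 0.7, max_lags: int = 10) -> float:
    """Exponentially weighted, volatility-adjusted momentum (M_adj)."""
    window = list(deltas[:max_lags])  # deltas[0] is the most recent change
    if not window:
        raise ValueError("no score changes available; use the heuristic fallback")
    weights = [lam ** k for k in range(len(window))]
    m_t = sum(w * d for w, d in zip(weights, window))
    m_normalized = m_t / sum(weights)
    m_adj = m_normalized / max(sigma_20, 0.01)  # volatility floor of 0.01
    return max(-2.0, min(m_adj, 2.0))


print(round(ew_momentum([0.02, 0.01], sigma_20=0.05), 4))  # 0.3176
print(ew_momentum([0.5, 0.5, 0.5], sigma_20=0.001))        # 2.0 (clamped)
```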
 ---

 ## 6. Data Quality Suppression
@@ -388,6 +601,27 @@ Q = 0.4 × Q_confidence + 0.3 × Q_freshness + 0.3 × Q_coverage
 | paper_eligible | confidence ≥ 0.50 |
 | informational | everything else (WATCH/HOLD always informational) |

+### 7B. Expected Value Gate (Probabilistic)
+
+**Active when:** `probabilistic_scoring_enabled = true`
+
+```
+R_up = strength × σ_20 × √(horizon_days)
+R_down = (1 − strength) × σ_20 × √(horizon_days)
+EV = P_bull × R_up − (1 − P_bull) × R_down
+```
+
+| Horizon window | horizon_days |
+|---|---|
+| intraday / 1d | 1 |
+| 7d | 7 |
+| 30d | 30 |
+| 90d | 90 |
+
+- EV > 0.005 (0.5% expected return): recommendation proceeds through existing gates
+- EV ≤ 0.005: forced to informational mode regardless of confidence/strength
+- All existing eligibility gates (§7.1) remain as additional requirements
+
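The gate in §7B can be sketched as follows; the names are hypothetical and the horizon mapping is copied from the table. Expanding the formula gives EV = (P_bull + strength − 1) × σ_20 × √horizon_days, so EV is positive only when P_bull + strength exceeds 1.

```python
import math

# Horizon windows from the table above.
HORIZON_DAYS = {"intraday": 1, "1d": 1, "7d": 7, "30d": 30, "90d": 90}
EV_THRESHOLD = 0.005  # 0.5% expected return


def expected_value(p_bull: float, strength: float,
                   sigma_20: float, horizon: str) -> float:
    """EV = P_bull * R_up - (1 - P_bull) * R_down."""
    scale = sigma_20 * math.sqrt(HORIZON_DAYS[horizon])
    r_up = strength * scale
    r_down = (1.0 - strength) * scale
    return p_bull * r_up - (1.0 - p_bull) * r_down


def passes_ev_gate(ev: float, threshold: float = EV_THRESHOLD) -> bool:
    """True when the recommendation may continue to the existing gates."""
    return ev > threshold


ev = expected_value(p_bull=0.7, strength=0.6, sigma_20=0.02, horizon="7d")
print(round(ev, 4), passes_ev_gate(ev))  # 0.0159 True
```

Passing this check is necessary but not sufficient: the existing eligibility gates still apply afterwards.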
 ### 7.4 Position Sizing

 ```
@@ -649,3 +883,33 @@ Sell entire lowest-confidence positions until count is within limit.
 | Tier upgrade win rate | > 55% | risk_tier_controller.py |
 | Tier upgrade max drawdown | < 5% | risk_tier_controller.py |
 | Tier upgrade min reserve | > 20% | risk_tier_controller.py |
+| **Probabilistic pipeline** | | |
+| Sigmoid steepness (k) | 5.0 | scoring.py |
+| Sigmoid midpoint (m) | 0.5 | scoring.py |
+| Info gain lambda (λ) | 0.3 | scoring.py |
+| Info gain max clamp | 3.0 | scoring.py |
+| Default base rate | 0.10 | scoring.py |
+| Adaptive decay impact scale | 1.0 | scoring.py |
+| Adaptive decay surprise scale | 1.0 | scoring.py |
+| Adaptive decay market scale | 0.5 | scoring.py |
+| Regime return weight | 0.15 | scoring.py |
+| Regime volume weight | 0.10 | scoring.py |
+| Regime multiplier max | 2.5 | scoring.py |
+| Source accuracy min samples | 10 | source_accuracy.py |
+| Contradiction W_threshold | 5.0 | contradiction.py |
+| EMA short period | 20 days | regime.py |
+| EMA long period | 100 days | regime.py |
+| Panic volatility ratio | > 1.5 | regime.py |
+| Trend-following vol ratio | < 1.2 | regime.py |
+| Mean-reversion vol ratio | < 1.0 | regime.py |
+| Panic threshold | ±0.10 | regime.py |
+| Mean-reversion threshold | ±0.20 | regime.py |
+| Uncertainty contradiction mult | 0.6 | regime.py |
+| EW momentum decay (λ) | 0.7 | projection.py |
+| EW momentum max lags (K) | 10 | projection.py |
+| Volatility floor (σ min) | 0.01 | projection.py |
+| Momentum clamp | ±2.0 | projection.py |
+| EV threshold | 0.005 (0.5%) | eligibility.py |
+| Graph distance max | 3 | signal_propagation.py |
+| Default correlation (same-sector) | 0.3 | signal_propagation.py |
+| Default correlation (cross-sector) | 0.1 | signal_propagation.py |