docs: update equations.md with probabilistic pipeline formulas

Add sections 1B, 2B, 3B, 4B, 5B, 7B covering all new probabilistic
formulas: sigmoid gate, info gain, adaptive decay, regime multiplier,
source accuracy, Bayesian posterior, entropy direction, weighted
disagreement entropy, multiplicative macro exposure, conditional macro
integration, graph-distance attenuation, EW momentum, and EV gate.
Updated constants summary with all new parameters.
Celes Renata
2026-04-29 15:12:47 +00:00
parent 7eecd71a0d
commit ac29e62033
+264
@@ -74,6 +74,93 @@ S_avg = Σ(W_combined_i × impact_i × sentiment_i) / Σ(W_combined_i × impact_
---
## 1B. Probabilistic Signal Scoring (Feature-Flagged)
**Source:** `services/aggregation/scoring.py`
**Active when:** `probabilistic_scoring_enabled = true` in `risk_configs.config` JSONB
When the probabilistic pipeline is enabled, the combined weight formula changes:
### 1B.1 Combined Signal Weight (Probabilistic)
```
W_combined = G_sigmoid × W_recency(adaptive) × W_credibility × (1 + B_novelty) × R_info × F_accuracy × M_regime
```
| Component | Symbol | Formula | Range |
|---|---|---|---|
| Sigmoid gate | G_sigmoid | σ(k·(x - midpoint)) = 1/(1+e^(-5·(x-0.5))) | (0, 1) |
| Adaptive recency | W_recency | 2^(-t_age / τ_adaptive) | [0.01, 1.0] |
| Credibility | W_credibility | same as heuristic | [0.1, 1.0] |
| Novelty bonus | B_novelty | same as heuristic | [0, 0.25] |
| Information gain | R_info | 1 + λ·(-log₂ P(event_type)) | [1.0, 3.0] |
| Source accuracy | F_accuracy | 0.5 + accuracy_ratio (if samples ≥ 10, else 1.0) | [0.5, 1.5] |
| Regime multiplier | M_regime | 1 + 0.15·|z_r| + 0.10·|z_v| | [1.0, 2.5] |
### 1B.2 Sigmoid Confidence Gate
Replaces the binary 0/1 gate with a smooth transition:
```
G_sigmoid = σ(k·(x - m)) = 1 / (1 + e^(-k·(x - m)))
```
Default: k = 5.0, m = 0.5. At x=0.5 → 0.5; at x=0.2 → ~0.18; at x=0.8 → ~0.82.
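A minimal Python sketch of the gate, using the defaults stated above. The function name `sigmoid_gate` is illustrative and not taken from `scoring.py`.

```python
import math

def sigmoid_gate(x: float, k: float = 5.0, m: float = 0.5) -> float:
    """Smooth confidence gate: 1 / (1 + e^(-k*(x - m)))."""
    return 1.0 / (1.0 + math.exp(-k * (x - m)))

# Reproduce the three reference points quoted in the text.
for x in (0.2, 0.5, 0.8):
    print(f"x={x:.1f} -> {sigmoid_gate(x):.2f}")
```

With the default k and m this prints 0.18, 0.50, and 0.82, matching the values quoted above.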
### 1B.3 Information Gain (Surprise Weighting)
```
R_info = min(1 + λ·(-log₂ P(event_type)), 3.0)
```
| Event Type | P(event_type) | R_info (λ=0.3) |
|---|---|---|
| earnings | 0.25 | 1.60 |
| dividend | 0.15 | 1.82 |
| product_launch | 0.10 | 2.00 |
| regulatory | 0.08 | 2.09 |
| management_change | 0.06 | 2.22 |
| legal | 0.05 | 2.30 |
| restructuring | 0.04 | 2.39 |
| m_and_a | 0.03 | 2.52 |
| unknown | 0.10 (default) | 2.00 |
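The surprise weighting can be sketched in Python as follows. The base rates are copied from the table; the names `BASE_RATES`, `DEFAULT_BASE_RATE`, and `info_gain` are hypothetical and may not match `scoring.py`.

```python
import math

# Base rates P(event_type) from the table; unknown types use the default.
BASE_RATES = {
    "earnings": 0.25,
    "dividend": 0.15,
    "product_launch": 0.10,
    "regulatory": 0.08,
    "management_change": 0.06,
    "legal": 0.05,
    "restructuring": 0.04,
    "m_and_a": 0.03,
}
DEFAULT_BASE_RATE = 0.10

def info_gain(event_type: str, lam: float = 0.3, cap: float = 3.0) -> float:
    """R_info = min(1 + lam * (-log2 P(event_type)), cap)."""
    p = BASE_RATES.get(event_type, DEFAULT_BASE_RATE)
    return min(1.0 + lam * (-math.log2(p)), cap)

print(round(info_gain("earnings"), 2))  # -log2(0.25) = 2, so 1 + 0.3*2 = 1.6
```

Rarer event types carry more surprise, so they receive a larger multiplier until the cap of 3.0 applies.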
### 1B.4 Adaptive Recency Decay
```
τ_adaptive = τ_base × (1 + β_impact) × (1 + β_surprise) × (1 + β_market)
```
| Factor | Formula | Range |
|---|---|---|
| β_impact | impact_score × 1.0 | [0, 1.0] |
| β_surprise | (R_info - 1) / 2 × 1.0 | [0, 1.0] |
| β_market | (M_regime - 1) / 0.45 × 0.5 | [0, 0.5] |
Maximum adaptive half-life: 6× base (when all factors at max).
Minimum: τ_base (adaptive decay is never faster than fixed).
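The half-life stretch and the resulting recency weight can be sketched as below. Clamping each β to the range listed in the table is an assumption here, as are the function names; the time unit is whatever τ_base is expressed in.

```python
def _clamp(value: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, value))

def adaptive_half_life(tau_base: float, impact_score: float,
                       r_info: float, m_regime: float) -> float:
    """tau_adaptive = tau_base * (1+b_impact) * (1+b_surprise) * (1+b_market)."""
    # Each beta is clamped to its documented range (assumed behaviour).
    b_impact = _clamp(impact_score * 1.0, 0.0, 1.0)
    b_surprise = _clamp((r_info - 1.0) / 2.0 * 1.0, 0.0, 1.0)
    b_market = _clamp((m_regime - 1.0) / 0.45 * 0.5, 0.0, 0.5)
    return tau_base * (1 + b_impact) * (1 + b_surprise) * (1 + b_market)

def recency_weight(t_age: float, tau: float, floor: float = 0.01) -> float:
    """W_recency = 2^(-t_age / tau), kept within [floor, 1.0]."""
    return _clamp(2.0 ** (-t_age / tau), floor, 1.0)

tau = adaptive_half_life(24.0, impact_score=1.0, r_info=3.0, m_regime=2.5)
print(tau)  # all factors at max: 24 * 2 * 2 * 1.5 = 144.0
```

With every factor at zero the function returns τ_base unchanged, which is the documented minimum.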
### 1B.5 Regime Multiplier
```
z_r = (r_t - μ_20) / σ_20 (return z-score)
z_v = (ln(V_t) - μ_V) / σ_V (log-volume z-score)
M_regime = clamp(1 + 0.15·|z_r| + 0.10·|z_v|, 1.0, 2.5)
```
Defaults to 1.0 when market data unavailable or σ = 0.
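A small Python sketch of the multiplier, taking the rolling statistics as precomputed inputs. The signature is illustrative; how `scoring.py` obtains μ and σ is not shown in this document.

```python
import math

def regime_multiplier(r_t: float, mu_20: float, sigma_20: float,
                      v_t: float, mu_v: float, sigma_v: float) -> float:
    """M_regime = clamp(1 + 0.15*|z_r| + 0.10*|z_v|, 1.0, 2.5)."""
    # Fall back to the neutral multiplier when a z-score is undefined.
    if sigma_20 <= 0 or sigma_v <= 0 or v_t <= 0:
        return 1.0
    z_r = (r_t - mu_20) / sigma_20          # return z-score
    z_v = (math.log(v_t) - mu_v) / sigma_v  # log-volume z-score
    raw = 1.0 + 0.15 * abs(z_r) + 0.10 * abs(z_v)
    return max(1.0, min(2.5, raw))

# A 2-sigma return move with average volume gives roughly 1.3.
print(regime_multiplier(0.02, 0.0, 0.01, 1.0, 0.0, 1.0))
```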
### 1B.6 Source Accuracy Factor
```
F_accuracy = 0.5 + clamp(accuracy_ratio, 0, 1) if sample_count ≥ 10
F_accuracy = 1.0 if sample_count < 10
```
Stored in `source_accuracy` table, updated asynchronously from realized 7-day price outcomes.
---
## 2. Trend Summary Assembly
**Source:** `services/aggregation/worker.py`
@@ -125,6 +212,79 @@ confidence = clamp(0.3 × F_count + 0.3 × C_avg + 0.4 × A_agreement - P_cont
---
## 2B. Probabilistic Trend Assembly (Feature-Flagged)
**Source:** `services/aggregation/worker.py`, `services/aggregation/bayesian.py`
**Active when:** `probabilistic_scoring_enabled = true`
### 2B.1 Bayesian Posterior Accumulation
```
L_t = Σ(W_combined_i × sentiment_i) (log-likelihood)
P_bull = σ(L_t) = 1 / (1 + e^(-L_t)) (bullish probability)
α = 1 + W_bull (W_bull = Σ W_combined for positive signals)
β = 1 + W_bear (W_bear = Σ W_combined for negative signals)
C_bayesian = 1 - 4αβ / (α + β)² (Bayesian confidence)
H = -P_bull·log₂(P_bull) - (1-P_bull)·log₂(1-P_bull) (Shannon entropy)
```
Uninformative prior (no signals): P_bull=0.5, α=1, β=1, C=0, H=1.0.
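The accumulation step might look like the following Python sketch. It assumes each signal is a `(w_combined, sentiment)` pair with sentiment signed; the `Posterior` container and function names are hypothetical rather than taken from `bayesian.py`.

```python
import math
from dataclasses import dataclass

@dataclass
class Posterior:
    p_bull: float
    alpha: float
    beta: float
    confidence: float
    entropy: float

def binary_entropy(p: float) -> float:
    """Shannon entropy in bits; defined as 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def accumulate(signals: list[tuple[float, float]]) -> Posterior:
    """signals: (w_combined, sentiment) pairs, sentiment signed."""
    l_t = sum(w * s for w, s in signals)
    p_bull = 1.0 / (1.0 + math.exp(-l_t))
    alpha = 1.0 + sum(w for w, s in signals if s > 0)
    beta = 1.0 + sum(w for w, s in signals if s < 0)
    confidence = 1.0 - 4.0 * alpha * beta / (alpha + beta) ** 2
    return Posterior(p_bull, alpha, beta, confidence, binary_entropy(p_bull))

print(accumulate([]))  # the uninformative prior
```

An empty signal list reproduces the prior stated above: P_bull = 0.5, α = β = 1, C = 0, H = 1.0.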
### 2B.2 Entropy-Based Direction
| Condition | Direction |
|---|---|
| H > 0.9 | Mixed |
| P_bull > 0.65 | Bullish |
| P_bull < 0.35 | Bearish |
| otherwise | Neutral |
### 2B.3 Bayesian Trend Confidence
```
confidence = clamp(0.5 × C_bayesian + 0.25 × F_count + 0.25 × C_avg_credibility - P_contradiction, 0, 1)
```
| Component | Formula |
|---|---|
| C_bayesian | 1 - 4αβ/(α+β)² from Beta posterior |
| F_count | min(N_unique_sources / 15, 0.8) |
| C_avg_credibility | mean credibility weight across active signals |
| P_contradiction | contradiction_entropy × regime.contradiction_penalty_multiplier |
### 2B.4 Weighted Disagreement Entropy (Contradiction)
**Source:** `services/aggregation/contradiction.py`
```
f_pos = W_positive / (W_positive + W_negative)
f_neg = 1 - f_pos
H_contradiction = -f_pos·log₂(f_pos) - f_neg·log₂(f_neg)
contradiction_score = H_contradiction × min(1.0, (W_pos + W_neg) / W_threshold)
```
W_threshold default = 5.0. Returns 0.0 when only one direction exists.
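As a rough Python illustration of the contradiction score (the function name is an assumption, not the one used in `contradiction.py`):

```python
import math

def contradiction_score(w_pos: float, w_neg: float,
                        w_threshold: float = 5.0) -> float:
    """Weighted disagreement entropy scaled by total evidence weight."""
    # Only one direction present: no disagreement to measure.
    if w_pos <= 0.0 or w_neg <= 0.0:
        return 0.0
    total = w_pos + w_neg
    f_pos = w_pos / total
    f_neg = 1.0 - f_pos
    h = -f_pos * math.log2(f_pos) - f_neg * math.log2(f_neg)
    return h * min(1.0, total / w_threshold)

print(contradiction_score(2.5, 2.5))  # even split at the threshold weight
print(contradiction_score(1.0, 1.0))  # even split, but little evidence
```

An even split gives the maximum entropy of 1.0, and the evidence factor then scales it down when the total weight is below W_threshold (0.4 in the second call).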
### 2B.5 Regime Detection
**Source:** `services/aggregation/regime.py`
```
R = sign(EMA_20 - EMA_100) (trend indicator)
V_r = σ_20 / σ_100 (volatility ratio)
```
| Condition | Regime | Threshold | Contradiction Mult |
|---|---|---|---|
| V_r > 1.5 | Panic | ±0.10 | 0.4 |
| R ≠ 0 AND V_r < 1.2 | Trend-following | ±0.15 | 0.4 |
| R = 0 AND V_r < 1.0 | Mean-reversion | ±0.20 | 0.4 |
| otherwise | Uncertainty | ±0.15 | 0.6 |
Falls back to Uncertainty when data < 100 days or σ = 0.
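The table can be read as a top-to-bottom rule list, which the sketch below encodes. The evaluation order, the treatment of either σ being zero as the fallback case, and the function name are assumptions; `regime.py` may differ.

```python
def detect_regime(ema_20: float, ema_100: float,
                  sigma_20: float, sigma_100: float,
                  n_days: int) -> tuple[str, float, float]:
    """Return (regime, threshold, contradiction multiplier)."""
    uncertainty = ("Uncertainty", 0.15, 0.6)
    # Fallback: insufficient history or a degenerate volatility estimate.
    if n_days < 100 or sigma_20 <= 0 or sigma_100 <= 0:
        return uncertainty
    diff = ema_20 - ema_100
    r = (diff > 0) - (diff < 0)   # sign(EMA_20 - EMA_100)
    v_r = sigma_20 / sigma_100    # volatility ratio
    if v_r > 1.5:
        return ("Panic", 0.10, 0.4)
    if r != 0 and v_r < 1.2:
        return ("Trend-following", 0.15, 0.4)
    if r == 0 and v_r < 1.0:
        return ("Mean-reversion", 0.20, 0.4)
    return uncertainty

print(detect_regime(105.0, 100.0, 0.03, 0.015, 250))
```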
---
## 3. Macro Impact Scoring (Layer 2)
**Source:** `services/aggregation/interpolation.py`
@@ -184,6 +344,30 @@ S_final = clamp(S_raw × R_tier, 0, 1)
For domestic-only events, R_tier = 1.0 regardless of tier.
### 3B. Multiplicative Macro Exposure (Probabilistic)
**Active when:** `probabilistic_scoring_enabled = true`
```
S_raw = W_severity × (1 - Π_k(1 - w_k × O_k))
= W_severity × (1 - (1-0.35·O_geo)(1-0.25·O_supply)(1-0.25·O_commodity)(1-0.15·O_sector))
```
Zero overlap → 0.0. Max overlap (all 1.0) → severity × 0.689.
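The multiplicative form is a noisy-OR style combination: each overlap channel independently fails to transmit exposure with probability 1 - w_k·O_k. A short Python sketch, with the dictionary keys and function name chosen here for illustration:

```python
from math import prod

# Channel weights from the expanded formula above.
WEIGHTS = {"geo": 0.35, "supply": 0.25, "commodity": 0.25, "sector": 0.15}

def macro_exposure(severity: float, overlaps: dict[str, float]) -> float:
    """S_raw = severity * (1 - prod_k(1 - w_k * O_k)); missing keys count as 0."""
    miss = prod(1.0 - w * overlaps.get(name, 0.0) for name, w in WEIGHTS.items())
    return severity * (1.0 - miss)

full = {name: 1.0 for name in WEIGHTS}
print(round(macro_exposure(1.0, full), 3))  # 1 - 0.65*0.75*0.75*0.85 = 0.689
print(macro_exposure(1.0, {}))              # no overlap gives 0.0
```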
### 3B.1 Conditional Macro Integration
When both company and macro signals exist:
```
modifier = clamp(1 + M_macro × sign_alignment, 0.5, 1.5)
S_adjusted = S_company × modifier
```
sign_alignment = +1 (agree), -1 (disagree), 0 (neutral/mixed).
When only macro signals exist: additive fallback with weight 0.3.
When only company signals exist: modifier = 1.0.
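A Python sketch of the modifier for the case where company signals are present. Function names are hypothetical, and the macro-only additive fallback is left out because its exact form is not spelled out here.

```python
from typing import Optional

def macro_modifier(m_macro: float, sign_alignment: int) -> float:
    """modifier = clamp(1 + M_macro * sign_alignment, 0.5, 1.5)."""
    return max(0.5, min(1.5, 1.0 + m_macro * sign_alignment))

def adjust_company_score(s_company: float,
                         m_macro: Optional[float],
                         sign_alignment: int = 0) -> float:
    """Apply the macro modifier; with no macro signal the modifier is 1.0."""
    if m_macro is None:
        return s_company
    return s_company * macro_modifier(m_macro, sign_alignment)

print(adjust_company_score(0.6, 0.3, +1))   # agreeing macro amplifies
print(adjust_company_score(0.6, 0.8, -1))   # disagreeing macro, clamped at 0.5x
print(adjust_company_score(0.6, None))      # company-only: unchanged
```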
### 3.4 Macro Impact Confidence
```
@@ -256,6 +440,22 @@ S_competitive = clamp(S_pattern_avg × R_relationship × C_pattern × I_source,
**Threshold gate:** Skipped if R_relationship < propagation_strength_threshold (default 0.2).
### 4B. Graph-Distance Attenuation (Probabilistic)
**Active when:** `probabilistic_scoring_enabled = true`
```
S_transfer = S_source × ρ_historical × e^(-d_network)
```
| Component | Description |
|---|---|
| S_source | Source signal strength |
| ρ_historical | 90-day rolling Pearson correlation (default 0.3 same-sector, 0.1 cross-sector) |
| d_network | Shortest path in competitor graph (capped at 3) |
No propagation when d_network > 3 (e^(-3) ≈ 0.05).
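The attenuation can be sketched in Python as below; `transfer_strength` is an illustrative name, and the correlation is passed in rather than computed.

```python
import math

def transfer_strength(s_source: float, rho_historical: float,
                      d_network: int, max_distance: int = 3) -> float:
    """S_transfer = S_source * rho * e^(-d); zero beyond max_distance."""
    if d_network > max_distance:
        return 0.0
    return s_source * rho_historical * math.exp(-d_network)

# Same-sector default correlation (0.3) at increasing graph distance.
for d in range(5):
    print(d, round(transfer_strength(0.8, 0.3, d), 4))
```

Each extra hop multiplies the transferred strength by e^(-1) ≈ 0.37, so by three hops only about 5% of the correlation-scaled signal remains.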
### 4.3 Competitive Signal as WeightedSignal
```
@@ -321,6 +521,19 @@ C_projected = C_base × 0.8 + min(S_macro × 0.15, 0.1)
**Divergence detection:** Flagged when projected direction ≠ current trend direction.
### 5B. Exponentially Weighted Momentum (Probabilistic)
**Source:** `services/aggregation/projection.py`
**Active when:** `probabilistic_scoring_enabled = true`
```
M_t = Σ_{k=0}^{K-1} λ^k × ΔS_{t-k} (λ = 0.7, K ≤ 10)
M_normalized = M_t / Σ_{k=0}^{K-1} λ^k (range: [-1, 1])
M_adj = clamp(M_normalized / max(σ_20, 0.01), -2.0, 2.0)
```
Falls back to heuristic momentum when < 2 historical cycles available.
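A Python sketch of the three steps, assuming `deltas` holds the score changes ΔS with the most recent first. The function name is hypothetical, and the heuristic fallback is represented only by returning `None` on empty input.

```python
from typing import Optional, Sequence

def ew_momentum(deltas: Sequence[float], sigma_20: float,
                lam: float = 0.7, max_lags: int = 10) -> Optional[float]:
    """Volatility-adjusted, exponentially weighted momentum M_adj."""
    window = list(deltas[:max_lags])   # K <= max_lags most recent changes
    if not window:
        return None                    # caller would use heuristic momentum
    weights = [lam ** k for k in range(len(window))]
    m_t = sum(w * d for w, d in zip(weights, window))
    m_normalized = m_t / sum(weights)  # weighted average of the changes
    m_adj = m_normalized / max(sigma_20, 0.01)
    return max(-2.0, min(2.0, m_adj))

print(ew_momentum([0.10, 0.05, -0.02], sigma_20=0.08))
```

Dividing by the weight sum makes M_normalized a weighted average, so it stays within [-1, 1] whenever each ΔS does.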
---
## 6. Data Quality Suppression
@@ -388,6 +601,27 @@ Q = 0.4 × Q_confidence + 0.3 × Q_freshness + 0.3 × Q_coverage
| paper_eligible | confidence ≥ 0.50 |
| informational | everything else (WATCH/HOLD always informational) |
### 7B. Expected Value Gate (Probabilistic)
**Active when:** `probabilistic_scoring_enabled = true`
```
R_up = strength × σ_20 × √(horizon_days)
R_down = (1 - strength) × σ_20 × √(horizon_days)
EV = P_bull × R_up - (1 - P_bull) × R_down
```
| Horizon window | horizon_days |
|---|---|
| intraday / 1d | 1 |
| 7d | 7 |
| 30d | 30 |
| 90d | 90 |
- EV > 0.005 (0.5% expected return): recommendation proceeds through existing gates
- EV ≤ 0.005: forced to informational mode regardless of confidence/strength
- All existing eligibility gates (§7.1) remain as additional requirements
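The gate can be sketched in Python as follows. The horizon mapping mirrors the table; `expected_value` and `passes_ev_gate` are illustrative names, not necessarily those in `eligibility.py`.

```python
import math

HORIZON_DAYS = {"intraday": 1, "1d": 1, "7d": 7, "30d": 30, "90d": 90}

def expected_value(p_bull: float, strength: float,
                   sigma_20: float, horizon: str) -> float:
    """EV = P_bull * R_up - (1 - P_bull) * R_down."""
    scale = sigma_20 * math.sqrt(HORIZON_DAYS[horizon])
    r_up = strength * scale
    r_down = (1.0 - strength) * scale
    return p_bull * r_up - (1.0 - p_bull) * r_down

def passes_ev_gate(ev: float, threshold: float = 0.005) -> bool:
    """True when the recommendation may continue to the existing gates."""
    return ev > threshold

ev = expected_value(p_bull=0.8, strength=0.7, sigma_20=0.02, horizon="7d")
print(round(ev, 4), passes_ev_gate(ev))
```

Expanding the terms gives EV = σ_20 × √(horizon_days) × (P_bull + strength - 1), so the gate can only pass when P_bull + strength exceeds 1.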
### 7.4 Position Sizing
```
@@ -649,3 +883,33 @@ Sell entire lowest-confidence positions until count is within limit.
| Tier upgrade win rate | > 55% | risk_tier_controller.py |
| Tier upgrade max drawdown | < 5% | risk_tier_controller.py |
| Tier upgrade min reserve | > 20% | risk_tier_controller.py |
| **Probabilistic pipeline** | | |
| Sigmoid steepness (k) | 5.0 | scoring.py |
| Sigmoid midpoint (m) | 0.5 | scoring.py |
| Info gain lambda (λ) | 0.3 | scoring.py |
| Info gain max clamp | 3.0 | scoring.py |
| Default base rate | 0.10 | scoring.py |
| Adaptive decay impact scale | 1.0 | scoring.py |
| Adaptive decay surprise scale | 1.0 | scoring.py |
| Adaptive decay market scale | 0.5 | scoring.py |
| Regime return weight | 0.15 | scoring.py |
| Regime volume weight | 0.10 | scoring.py |
| Regime multiplier max | 2.5 | scoring.py |
| Source accuracy min samples | 10 | source_accuracy.py |
| Contradiction W_threshold | 5.0 | contradiction.py |
| EMA short period | 20 days | regime.py |
| EMA long period | 100 days | regime.py |
| Panic volatility ratio | > 1.5 | regime.py |
| Trend-following vol ratio | < 1.2 | regime.py |
| Mean-reversion vol ratio | < 1.0 | regime.py |
| Panic threshold | ±0.10 | regime.py |
| Mean-reversion threshold | ±0.20 | regime.py |
| Uncertainty contradiction mult | 0.6 | regime.py |
| EW momentum decay (λ) | 0.7 | projection.py |
| EW momentum max lags (K) | 10 | projection.py |
| Volatility floor (σ min) | 0.01 | projection.py |
| Momentum clamp | ±2.0 | projection.py |
| EV threshold | 0.005 (0.5%) | eligibility.py |
| Graph distance max | 3 | signal_propagation.py |
| Default correlation (same-sector) | 0.3 | signal_propagation.py |
| Default correlation (cross-sector) | 0.1 | signal_propagation.py |