# ADR-121: BFLD Identity Risk Scoring and Coherence Gate | Field | Value | |-------|-------| | **Status** | Proposed | | **Date** | 2026-05-24 | | **Deciders** | ruv | | **Parent** | [ADR-118](ADR-118-bfld-beamforming-feedback-layer-for-detection.md) | | **Relates to** | [ADR-024](ADR-024-contrastive-csi-embedding-model.md) (AETHER), [ADR-027](ADR-027-cross-environment-domain-generalization.md) (MERIDIAN), [ADR-029](ADR-029-ruvsense-multistatic-sensing-mode.md) (multistatic fusion), [ADR-086](ADR-086-edge-novelty-gate.md) (novelty gate precedent), [ADR-120](ADR-120-bfld-privacy-class-and-hash-rotation.md) (privacy class) | | **Companion research** | [`docs/research/soul/`](../research/soul/) — risk score doubles as Soul Signature enrollment-quality signal; §2.7 defines the Recalibrate exemption. | | **Tracking issue** | TBD | --- ## 1. Context BFLD's distinguishing primitive is the `identity_risk_score` — a scalar that says **"is this capture window currently capable of identifying a specific person?"**. The score has two consumers: 1. **The operator** — exposed as an HA diagnostic sensor (ADR-122). A spike from the long-term baseline indicates the RF environment has shifted toward a higher-leakage regime (new AP firmware, denser MIMO, attacker-grade sniffer in range). 2. **The privacy gate** (ADR-120) — when the score crosses a configurable threshold, the gate downgrades the active `privacy_class` automatically (e.g., 2 → 3) until the score recovers. The score must be: - **Bounded** in `[0, 1]` for HA gauge entities. - **Calibrated** against actual re-ID success rate, ideally on the KIT BFId dataset. - **Computable on-device** at ≥ 1 Hz on a Pi 5 core or an aarch64 cognitum-v0. - **Stable** — small environmental changes should not produce wild swings; the score is for slow-moving regime detection, not per-frame chatter. ADR-086 (edge novelty gate) establishes a precedent for an on-device gate primitive. BFLD's risk scoring borrows the gate-pattern but with identity leakage as the trigger condition. --- ## 2. Decision ### 2.1 Nine features (from BFLD spec §5) The features are computed over a sliding window of `W = 32` BFI frames (≈3 s at 10 Hz): | Feature | Definition | Source | |---------|------------|--------| | `mean_angle_delta` | mean( ‖ Φ_t − Φ_{t-1} ‖ over subcarriers ) | extractor | | `subcarrier_variance` | var( ‖ Φ ‖ over subcarrier axis ) | extractor | | `temporal_entropy` | Shannon entropy of angle-bin histogram over W | extractor | | `doppler_proxy` | FFT peak magnitude of mean-angle time series | features.rs | | `path_stability` | 1 − ‖ Φ_t − median(Φ_{t-W..t}) ‖ / scale | features.rs | | `cross_antenna_correlation` | mean Pearson correlation across n_tx × n_rx pairs | features.rs | | `burst_motion_score` | high-pass-filtered angular velocity, soft-thresholded | features.rs | | `stationarity_score` | 1 − rolling KL divergence over W/2 vs W | features.rs | | `identity_separability_score` | top-1 cosine to nearest AETHER cluster centroid | identity_risk.rs | The first eight are sensing features (also used by the presence/motion pipeline). Only the ninth depends on the AETHER embedding and therefore on `identity_class >= 1`. ### 2.2 Identity risk formula ```rust pub fn identity_risk_score( sep: f32, // identity_separability_score, [0, 1] stab: f32, // temporal_stability, [0, 1] = ema(path_stability, alpha=0.1) consist: f32,// cross_perspective_consistency, [0, 1] = multistatic.rs conf: f32, // sample_confidence, [0, 1] = f(SNR, n_subcarriers, n_rx) ) -> f32 { // Clamp inputs, then multiplicative combination — any factor near 0 dominates. let s = sep.clamp(0.0, 1.0); let t = stab.clamp(0.0, 1.0); let p = consist.clamp(0.0, 1.0); let c = conf.clamp(0.0, 1.0); (s * t * p * c).clamp(0.0, 1.0) } ``` Multiplicative combination is chosen so that **any** weak factor (e.g., very low SNR ⇒ low `conf`) collapses the score toward 0. This matches the privacy intent: when the system is uncertain, the score should be low and the operator should not be alarmed. ### 2.3 Calibration target The score is calibrated against re-ID success rate on a held-out test split of the KIT BFId dataset. A piecewise-linear isotonic regression maps raw scores into a calibrated `[0, 1]` band where `score ≥ 0.8` corresponds to `>80%` re-ID accuracy on a 5-second window in the calibration dataset. Calibration parameters live in `v2/crates/wifi-densepose-bfld/data/risk_calibration.toml` and are versioned independently of the code. A regression update is a content-only PR. ### 2.4 Coherence gate The coherence gate (per ADR-029 `coherence_gate.rs` pattern) consumes the risk score and emits one of four actions: ```rust pub enum GateAction { Accept, // score < 0.5, publish normally PredictOnly, // 0.5 <= score < 0.7, publish but flag confidence Reject, // 0.7 <= score < 0.9, drop the event Recalibrate, // score >= 0.9, drop AND rotate site_salt } ``` The `Recalibrate` action triggers a forced site-salt rotation — an aggressive response to a sustained high-risk regime. It costs the operator continuity of long-term aggregate analytics but is the right answer to an attacker-grade sniffer arriving in range. ### 2.5 Hysteresis To prevent oscillation around the gate thresholds, the gate uses ±0.05 hysteresis and a 5-second debounce. A score must cross the boundary by the hysteresis margin and persist for the debounce window before the gate action changes. ### 2.6 Soul Signature interaction — Recalibrate exemption and enrollment-quality gate Soul Signature (`docs/research/soul/`) intentionally exists in a high-separability regime — the whole point of its 60-second enrollment protocol is to push `identity_separability_score` toward 1.0. The default coherence gate (§2.4) would therefore fire `Recalibrate` constantly inside Soul Signature zones, rotating `site_salt` every few seconds and breaking enrollment. Two integrations resolve this: 1. **Recalibrate exemption.** When the gate is about to fire `Recalibrate`, it consults a `SoulMatchOracle` (provided by the Soul Signature crate when compiled with `--features soul-signature`). If the oracle reports that the current high-separability cluster matches an enrolled `person_id` above the Soul Signature acceptance threshold, the gate downgrades to `PredictOnly` instead. The high score is the *intended* outcome of a successful match, not an attack indicator. Without the `soul-signature` feature, the oracle is a no-op stub returning `MatchOutcome::NotEnrolled`, so the gate behaves exactly per §2.4. 2. **Enrollment-quality gate.** Soul Signature's enrollment protocol (`scanning-process.md` §3) requires that the sensing zone meet a minimum identity-leakage regime — too low, and the resulting signature is unreliable. The BFLD `identity_risk_score` is exactly the right signal. Soul Signature gates enrollment on `score >= ENROLL_MIN` (default `0.65`) sustained over the 60-second window. If the score drops below threshold mid-enrollment, the protocol aborts and the operator is prompted to re-attempt in better RF conditions. The exemption is asymmetric: it suppresses `Recalibrate` only for known-enrolled matches. Unknown high-separability clusters (a real attacker-grade sniffer, or an unenrolled person whose identity is unexpectedly leaky) still trigger `Recalibrate` as designed. ### 2.7 Compute budget | Stage | Target latency | Implementation | |-------|----------------|----------------| | Feature extraction (8 features) | < 3 ms per window | ndarray + nalgebra; vectorized over subcarriers | | Separability (cosine to centroids) | < 5 ms per window | RuVector RaBitQ index (ADR-085) over ≤ 1k centroids | | Risk score | < 0.1 ms | scalar multiplicative | | Gate decision + hysteresis | < 0.1 ms | scalar | Total p95 ≤ 10 ms per window on a Pi 5 core (8 ms target). Headroom on cognitum-v0 (Pi 5 + Hailo) is ample; ESP32-S3 hosts only the extraction stage (features computed; risk score is host-side per ADR-123). The `SoulMatchOracle` lookup (§2.6) adds < 1 ms when the `soul-signature` feature is enabled (RaBitQ index over enrolled centroids). --- ## 3. Consequences ### Positive - The risk score becomes a first-class diagnostic surface for operators and a structural input to the privacy gate — both consumers from a single computation. - Multiplicative combination is conservative under uncertainty; the system is biased toward "report low risk when unsure", which is the right default. - Calibration is a content-only update — no recompile needed when the calibration file changes. - The recalibration gate action gives the system a self-healing response to a sniffer arrival without operator intervention. ### Negative - Calibration requires the KIT BFId dataset; without it the score is uncalibrated and serves only as an internal trigger, not a publishable signal. - Multiplicative scoring can be dominated by `sample_confidence`, which is sensitive to channel conditions. A persistent low-SNR environment will keep the published score near 0 even when the underlying separability is high — an under-reporting failure mode that the documentation must call out. - The recalibrate action breaks historical hash continuity by design; an operator who wants long-term aggregates needs to know they will see a discontinuity on recalibrate events. ### Neutral - The nine features overlap with the existing CSI pipeline. BFLD computes them on BFI; the CSI pipeline computes them on CSI. Both can be fused via `cross_perspective_consistency`. --- ## 4. Alternatives Considered ### Alt 1: Additive scoring (`(s + t + p + c) / 4`) Rejected: a sample with high separability but very low confidence would still produce a moderate score, which over-reports risk in degraded RF conditions. ### Alt 2: Maximum scoring (`max(s, t, p, c)`) Rejected: over-reports risk because any single high factor pins the output, even if the others contradict it. ### Alt 3: Learned scoring (a small MLP) Rejected for this ADR: introduces an opaque model whose output cannot be audited from first principles. The multiplicative formula is simple, conservative, and directly explainable to operators. A learned model is a future option once enough calibration data is in hand. ### Alt 4: Per-feature thresholds instead of a continuous score Rejected: continuous score is needed for the HA gauge entity and for downstream calibration. Per-feature thresholds would force operators to interpret nine separate binaries. --- ## 5. Acceptance Criteria - [ ] **AC1**: All nine features are computed in `< 8 ms` p95 per window on a Pi 5 core. - [ ] **AC2**: `identity_risk_score` is monotonic non-decreasing in any single input when the other three are held constant. - [ ] **AC3**: Calibration regression on the KIT BFId test split: `score ≥ 0.8` corresponds to ≥ 80% re-ID accuracy ± 5%. - [ ] **AC4**: The coherence gate emits `Recalibrate` if score is ≥ 0.9 for ≥ 5 seconds. - [ ] **AC5**: Hysteresis prevents action oscillation across ± 0.05 of a threshold within a 5-second window. - [ ] **AC6**: At `privacy_class = 3`, the risk score is computed but not published to MQTT (kept local for the gate only). - [ ] **AC7**: A reproducible 1,000-frame synthetic fixture produces a deterministic score sequence (bit-identical across runs). --- ## 6. References - ADR-118 (umbrella) - ADR-024 (AETHER encoder for separability) - ADR-029 (`coherence_gate.rs` precedent) - ADR-086 (edge novelty gate pattern) - ADR-120 §2.4 (class transition consumed by gate) - KIT BFId dataset: https://publikationen.bibliothek.kit.edu/1000185756