wifi-ruview/docs/adr/ADR-121-bfld-identity-risk-scoring.md
ruv 29233db6d5 docs(adr-118): BFLD — Beamforming Feedback Layer for Detection (6 ADRs + research bundle)
Introduce the Beamforming Feedback Layer for Detection: the RuView safety layer
that ingests WiFi BFI, measures identity-leakage risk, and structurally prevents
identity-correlated data from leaving the node by default.

ADRs (6):
- ADR-118: umbrella decision, crate scaffolding, 6-phase rollout (~10.5 wk)
- ADR-119: BfldFrame wire format, magic 0xBF1D_0001, deterministic serialization
- ADR-120: 4 privacy classes, BLAKE3 keyed-hash rotation, #[must_classify] default-deny
- ADR-121: 9-feature identity-risk scoring, coherence gate with hysteresis
- ADR-122: 6 HA entities, 3 Matter clusters, mosquitto ACL, cognitum-v0 federation
- ADR-123: Pi 5 / Nexmon production capture, AX210 dev path, ESP32-S3 self-only fallback

Research bundle (docs/research/BFLD/, 13,544 words):
- SOTA survey covering BFId (KIT, ACM CCS 2025) and LeakyBeam (NDSS 2025)
- Architectural soul: defensive sensing primitive, not surveillance lens
- Six-adversary threat model with attack trees and mitigations
- Privacy-gating mechanics with structural cross-site isolation proof
- Automation/integration surface (HA, Matter, MQTT, federation)
- Concrete implementation plan with reuse map
- Evaluation strategy with red-team protocol on KIT BFId dataset
- Draft ADR, GitHub issue, and public gist

Three structural invariants enforced by the type system, not policy:
  I1 — Raw BFI never exits the node
  I2 — Identity embedding is in-RAM-only (no Serialize impl)
  I3 — Cross-site identity correlation is cryptographically impossible
       (per-site BLAKE3 keyed-hash with daily epoch rotation)

References:
  https://publikationen.bibliothek.kit.edu/1000185756 (BFId)
  https://www.ndss-symposium.org/wp-content/uploads/2025-5-paper.pdf (LeakyBeam)

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-24 12:20:52 -04:00


ADR-121: BFLD Identity Risk Scoring and Coherence Gate

| Field | Value |
|---|---|
| Status | Proposed |
| Date | 2026-05-24 |
| Deciders | ruv |
| Parent | ADR-118 |
| Relates to | ADR-024 (AETHER), ADR-027 (MERIDIAN), ADR-029 (multistatic fusion), ADR-086 (novelty gate precedent), ADR-120 (privacy class) |
| Tracking issue | TBD |

1. Context

BFLD's distinguishing primitive is the identity_risk_score — a scalar that says "is this capture window currently capable of identifying a specific person?". The score has two consumers:

  1. The operator — exposed as an HA diagnostic sensor (ADR-122). A spike from the long-term baseline indicates the RF environment has shifted toward a higher-leakage regime (new AP firmware, denser MIMO, attacker-grade sniffer in range).
  2. The privacy gate (ADR-120) — when the score crosses a configurable threshold, the gate downgrades the active privacy_class automatically (e.g., 2 → 3) until the score recovers.

The score must be:

  • Bounded in [0, 1] for HA gauge entities.
  • Calibrated against actual re-ID success rate, ideally on the KIT BFId dataset.
  • Computable on-device at ≥ 1 Hz on a Pi 5 core or an aarch64 cognitum-v0.
  • Stable — small environmental changes should not produce wild swings; the score is for slow-moving regime detection, not per-frame chatter.

ADR-086 (edge novelty gate) establishes a precedent for an on-device gate primitive. BFLD's risk scoring reuses that gate pattern, with identity leakage as the trigger condition.


2. Decision

2.1 Nine features (from BFLD spec §5)

The features are computed over a sliding window of W = 32 BFI frames (≈3 s at 10 Hz):

| Feature | Definition | Source |
|---|---|---|
| mean_angle_delta | mean( ‖ Φ_t − Φ_{t-1} ‖ over subcarriers ) | extractor |
| subcarrier_variance | var( ‖ Φ ‖ over subcarrier axis ) | extractor |
| temporal_entropy | Shannon entropy of angle-bin histogram over W | extractor |
| doppler_proxy | FFT peak magnitude of mean-angle time series | features.rs |
| path_stability | 1 − ‖ Φ_t − median(Φ_{t-W..t}) ‖ / scale | features.rs |
| cross_antenna_correlation | mean Pearson correlation across n_tx × n_rx pairs | features.rs |
| burst_motion_score | high-pass-filtered angular velocity, soft-thresholded | features.rs |
| stationarity_score | 1 − rolling KL divergence over W/2 vs W | features.rs |
| identity_separability_score | top-1 cosine to nearest AETHER cluster centroid | identity_risk.rs |

The first eight are sensing features (also used by the presence/motion pipeline). Only the ninth depends on the AETHER embedding and therefore on identity_class >= 1.
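As a rough illustration of the table above, the sketch below computes two of the sensing features and the separability feature using only the standard library. The function names, the choice of radians wrapped to [0, 2π), the bin count, and the plain `Vec<f32>` centroid layout are assumptions made for this example; the ADR does not specify them, and the real code is expected to use ndarray and the RaBitQ index (§2.6).

```rust
use std::f32::consts::TAU;

/// mean_angle_delta: mean absolute angle change between two consecutive
/// frames, averaged over subcarriers (no phase unwrapping in this sketch).
pub fn mean_angle_delta(prev: &[f32], curr: &[f32]) -> f32 {
    assert_eq!(prev.len(), curr.len(), "frames must have equal length");
    if curr.is_empty() {
        return 0.0;
    }
    let sum: f32 = prev.iter().zip(curr).map(|(a, b)| (b - a).abs()).sum();
    sum / curr.len() as f32
}

/// temporal_entropy: Shannon entropy (in bits) of a histogram of angles
/// collected over the window. `n_bins` is a hypothetical parameter.
pub fn temporal_entropy(angles: &[f32], n_bins: usize) -> f32 {
    if angles.is_empty() || n_bins == 0 {
        return 0.0;
    }
    let mut hist = vec![0usize; n_bins];
    for &a in angles {
        let wrapped = a.rem_euclid(TAU); // wrap into [0, 2π)
        let idx = ((wrapped / TAU) * n_bins as f32) as usize;
        hist[idx.min(n_bins - 1)] += 1; // guard the upper edge
    }
    let n = angles.len() as f32;
    hist.iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f32 / n;
            -p * p.log2()
        })
        .sum()
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// identity_separability_score: highest cosine similarity between the
/// embedding and any centroid, clamped to [0, 1]. Brute force, not RaBitQ.
pub fn identity_separability(embedding: &[f32], centroids: &[Vec<f32>]) -> f32 {
    centroids
        .iter()
        .map(|c| cosine(embedding, c))
        .fold(0.0_f32, f32::max)
        .clamp(0.0, 1.0)
}
```

A window in which every angle falls into the same bin yields an entropy of 0 bits, while angles spread evenly over all bins yield log2(n_bins) bits.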

2.2 Identity risk formula

pub fn identity_risk_score(
    sep: f32,    // identity_separability_score, [0, 1]
    stab: f32,   // temporal_stability, [0, 1] = ema(path_stability, alpha=0.1)
    consist: f32,// cross_perspective_consistency, [0, 1] = multistatic.rs
    conf: f32,   // sample_confidence, [0, 1] = f(SNR, n_subcarriers, n_rx)
) -> f32 {
    // Clamp inputs, then multiplicative combination — any factor near 0 dominates.
    let s = sep.clamp(0.0, 1.0);
    let t = stab.clamp(0.0, 1.0);
    let p = consist.clamp(0.0, 1.0);
    let c = conf.clamp(0.0, 1.0);
    (s * t * p * c).clamp(0.0, 1.0)
}

Multiplicative combination is chosen so that any weak factor (e.g., very low SNR ⇒ low conf) collapses the score toward 0. This matches the privacy intent: when the system is uncertain, the score should be low and the operator should not be alarmed.
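The `stab` input is described as an exponential moving average of path_stability with alpha = 0.1. A minimal sketch of that update follows, together with a NaN-tolerant variant of the score. Both `ema_update` and `identity_risk_score_nan_safe` are hypothetical names introduced here, not part of the ADR; the variant exists because `f32::clamp` returns NaN when its input is NaN, so a single NaN factor would otherwise propagate into the product.

```rust
/// One step of an exponential moving average (hypothetical helper).
/// With alpha = 0.1 each new sample contributes 10% of the new state.
pub fn ema_update(prev: f32, sample: f32, alpha: f32) -> f32 {
    alpha * sample + (1.0 - alpha) * prev
}

/// Maps NaN to 0.0 and clamps everything else to [0, 1].
/// Treating NaN as "no evidence" is an assumption of this sketch.
fn sanitize(x: f32) -> f32 {
    if x.is_nan() { 0.0 } else { x.clamp(0.0, 1.0) }
}

/// Same multiplicative combination as the ADR formula, but with NaN
/// inputs collapsed to 0.0 instead of propagating into the result.
pub fn identity_risk_score_nan_safe(sep: f32, stab: f32, consist: f32, conf: f32) -> f32 {
    sanitize(sep) * sanitize(stab) * sanitize(consist) * sanitize(conf)
}

/// Folds a sequence of path_stability samples into a temporal_stability
/// value, starting from 0.0 (the initial state is also an assumption).
pub fn temporal_stability(samples: &[f32], alpha: f32) -> f32 {
    samples.iter().fold(0.0, |acc, &s| ema_update(acc, s, alpha))
}
```

Starting from 0.0 and feeding ten samples of 1.0 with alpha = 0.1 gives 1 − 0.9^10 ≈ 0.65, which shows how slowly the stability factor, and therefore the score, can rise after a regime change.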

2.3 Calibration target

The score is calibrated against re-ID success rate on a held-out test split of the KIT BFId dataset. A piecewise-linear isotonic regression maps raw scores into a calibrated [0, 1] band where score ≥ 0.8 corresponds to ≥ 80% re-ID accuracy on a 5-second window in the calibration dataset.

Calibration parameters live in v2/crates/wifi-densepose-bfld/data/risk_calibration.toml and are versioned independently of the code. A regression update is a content-only PR.
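Applying a fitted isotonic map at runtime reduces to linear interpolation between knots. The sketch below assumes the calibration file can be loaded into a sorted list of (raw, calibrated) pairs; the schema of risk_calibration.toml is not given in this ADR, so the `knots` representation and the function name `calibrate` are hypothetical, and the knot values in the usage note are made up for illustration.

```rust
/// Maps a raw score to a calibrated score by piecewise-linear
/// interpolation. `knots` must be sorted by raw score, with calibrated
/// values non-decreasing (the isotonic property). Inputs outside the
/// knot range are pinned to the first or last calibrated value.
pub fn calibrate(raw: f32, knots: &[(f32, f32)]) -> f32 {
    let x = raw.clamp(0.0, 1.0);
    if knots.is_empty() {
        return x; // no calibration loaded: pass the raw score through
    }
    let first = knots[0];
    let last = knots[knots.len() - 1];
    if x <= first.0 {
        return first.1;
    }
    if x >= last.0 {
        return last.1;
    }
    for w in knots.windows(2) {
        let ((x0, y0), (x1, y1)) = (w[0], w[1]);
        if x <= x1 {
            if x1 == x0 {
                return y1; // vertical step: take the upper value
            }
            let t = (x - x0) / (x1 - x0);
            return y0 + t * (y1 - y0);
        }
    }
    last.1
}
```

With illustrative knots `[(0.0, 0.0), (0.5, 0.2), (1.0, 1.0)]`, a raw score of 0.25 maps to 0.1 and a raw score of 0.75 maps to 0.6. Because the knots are non-decreasing, the calibrated score preserves the ordering of raw scores.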

2.4 Coherence gate

The coherence gate (per ADR-029 coherence_gate.rs pattern) consumes the risk score and emits one of four actions:

pub enum GateAction {
    Accept,           // score < 0.5, publish normally
    PredictOnly,      // 0.5 <= score < 0.7, publish but flag confidence
    Reject,           // 0.7 <= score < 0.9, drop the event
    Recalibrate,      // score >= 0.9, drop AND rotate site_salt
}

The Recalibrate action triggers a forced site-salt rotation — an aggressive response to a sustained high-risk regime. It costs the operator continuity of long-term aggregate analytics but is the right answer to an attacker-grade sniffer arriving in range.
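The threshold bands in the enum comments can be read as a pure mapping from score to action. The sketch below shows that mapping before any hysteresis or debounce is applied; `action_for` is a hypothetical name, and the enum is repeated only so the example compiles on its own.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GateAction {
    Accept,
    PredictOnly,
    Reject,
    Recalibrate,
}

/// Stateless mapping from a calibrated score to a gate action, using the
/// band edges 0.5, 0.7 and 0.9 from the ADR. A NaN score fails every
/// comparison and falls through to Accept, so callers should sanitize
/// the score first (this fall-through is a property of the sketch, not
/// a decision made by the ADR).
pub fn action_for(score: f32) -> GateAction {
    if score >= 0.9 {
        GateAction::Recalibrate
    } else if score >= 0.7 {
        GateAction::Reject
    } else if score >= 0.5 {
        GateAction::PredictOnly
    } else {
        GateAction::Accept
    }
}
```

Checking the highest band first keeps each branch a single comparison and makes the lower bound of every band inclusive, matching the `<=` in the comments.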

2.5 Hysteresis

To prevent oscillation around the gate thresholds, the gate uses ±0.05 hysteresis and a 5-second debounce. A score must cross the boundary by the hysteresis margin and persist for the debounce window before the gate action changes.
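One possible reading of this rule is sketched below as a small state machine. It assumes that moving up past a threshold T requires score ≥ T + 0.05, that moving down requires score < T − 0.05, and that the same candidate action must be observed continuously for 5 seconds before it takes effect. The `Gate` type, its fields, and the caller-supplied monotonic timestamp are hypothetical; the ADR does not fix these details.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GateAction { Accept, PredictOnly, Reject, Recalibrate }

const ACTIONS: [GateAction; 4] = [
    GateAction::Accept,
    GateAction::PredictOnly,
    GateAction::Reject,
    GateAction::Recalibrate,
];
const THRESHOLDS: [f32; 3] = [0.5, 0.7, 0.9];
const MARGIN: f32 = 0.05;
const DEBOUNCE: Duration = Duration::from_secs(5);

pub struct Gate {
    level: usize,                       // index into ACTIONS
    pending: Option<(usize, Duration)>, // (candidate level, first seen at)
}

impl Gate {
    pub fn new() -> Self {
        Gate { level: 0, pending: None }
    }

    /// Level the score would select, with the margin applied relative to
    /// the current level so that small excursions do not count.
    fn candidate(&self, score: f32) -> usize {
        let mut lvl = self.level;
        while lvl < THRESHOLDS.len() && score >= THRESHOLDS[lvl] + MARGIN {
            lvl += 1;
        }
        while lvl > 0 && score < THRESHOLDS[lvl - 1] - MARGIN {
            lvl -= 1;
        }
        lvl
    }

    /// Feeds one score sample. `now` is a monotonic time since start,
    /// supplied by the caller so the logic stays deterministic in tests.
    pub fn update(&mut self, score: f32, now: Duration) -> GateAction {
        let cand = self.candidate(score);
        if cand == self.level {
            self.pending = None; // back inside the current band
        } else {
            match self.pending {
                Some((c, since)) if c == cand => {
                    if now.saturating_sub(since) >= DEBOUNCE {
                        self.level = cand;
                        self.pending = None;
                    }
                }
                _ => self.pending = Some((cand, now)), // restart the timer
            }
        }
        ACTIONS[self.level]
    }
}
```

Under these assumptions, a score of 0.97 held from t = 0 s switches the gate to Recalibrate at t = 5 s, and a later dip to 0.88 leaves it there because 0.88 is not below 0.9 − 0.05.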

2.6 Compute budget

| Stage | Target latency | Implementation |
|---|---|---|
| Feature extraction (8 features) | < 3 ms per window | ndarray + nalgebra; vectorized over subcarriers |
| Separability (cosine to centroids) | < 5 ms per window | RuVector RaBitQ index (ADR-085) over ≤ 1k centroids |
| Risk score | < 0.1 ms | scalar multiplicative |
| Gate decision + hysteresis | < 0.1 ms | scalar |

Total p95 ≤ 10 ms per window on a Pi 5 core (8 ms target). Headroom on cognitum-v0 (Pi 5 + Hailo) is ample; ESP32-S3 hosts only the extraction stage (features computed; risk score is host-side per ADR-123).
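A p95 budget like this one is usually verified by timing many windows and taking a percentile of the samples. The sketch below shows one way to do that with the standard library only; `measure` and `p95` are hypothetical helpers, and the nearest-rank percentile definition is an assumption, since the ADR does not say which estimator is intended.

```rust
use std::time::{Duration, Instant};

/// Runs `f` `iters` times and records the wall-clock time of each call.
pub fn measure<F: FnMut()>(mut f: F, iters: usize) -> Vec<Duration> {
    (0..iters)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .collect()
}

/// Nearest-rank 95th percentile: the sample at 1-based rank
/// ceil(0.95 * n) after sorting. Returns None for an empty slice.
pub fn p95(samples: &mut [Duration]) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    samples.sort_unstable();
    let rank = (samples.len() * 95 + 99) / 100; // integer ceil(0.95 * n)
    Some(samples[rank - 1])
}

/// True when the measured p95 is within the given budget.
pub fn within_budget(samples: &mut [Duration], budget: Duration) -> bool {
    matches!(p95(samples), Some(p) if p <= budget)
}
```

A benchmark could call `measure` around the per-window pipeline and then check `within_budget(&mut samples, Duration::from_millis(10))`. Wall-clock timing on a shared core is noisy, so a warm-up pass and a large iteration count are advisable.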


3. Consequences

Positive

  • The risk score becomes a first-class diagnostic surface for operators and a structural input to the privacy gate — both consumers from a single computation.
  • Multiplicative combination is conservative under uncertainty; the system is biased toward "report low risk when unsure", which is the right default.
  • Calibration is a content-only update — no recompile needed when the calibration file changes.
  • The recalibration gate action gives the system a self-healing response to a sniffer arrival without operator intervention.

Negative

  • Calibration requires the KIT BFId dataset; without it the score is uncalibrated and serves only as an internal trigger, not a publishable signal.
  • Multiplicative scoring can be dominated by sample_confidence, which is sensitive to channel conditions. A persistent low-SNR environment will keep the published score near 0 even when the underlying separability is high — an under-reporting failure mode that the documentation must call out.
  • The recalibrate action breaks historical hash continuity by design; an operator who wants long-term aggregates needs to know they will see a discontinuity on recalibrate events.

Neutral

  • The nine features overlap with the existing CSI pipeline. BFLD computes them on BFI; the CSI pipeline computes them on CSI. Both can be fused via cross_perspective_consistency.

4. Alternatives Considered

Alt 1: Additive scoring ((s + t + p + c) / 4)

Rejected: a sample with high separability but very low confidence would still produce a moderate score, which over-reports risk in degraded RF conditions.

Alt 2: Maximum scoring (max(s, t, p, c))

Rejected: over-reports risk because any single high factor pins the output, even if the others contradict it.
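The difference between the chosen formula and the first two alternatives is easiest to see on a single degraded sample. The sketch below evaluates all three combinators on the same four factors; the function names and the input values are illustrative only.

```rust
/// Chosen approach: product of the four factors.
pub fn multiplicative(f: [f32; 4]) -> f32 {
    f.iter().product()
}

/// Alt 1: arithmetic mean of the four factors.
pub fn additive(f: [f32; 4]) -> f32 {
    f.iter().sum::<f32>() / 4.0
}

/// Alt 2: largest of the four factors.
pub fn maximum(f: [f32; 4]) -> f32 {
    f.iter().copied().fold(0.0, f32::max)
}

/// Evaluates all three combinators on one sample, returned as
/// (multiplicative, additive, maximum).
pub fn compare(f: [f32; 4]) -> (f32, f32, f32) {
    (multiplicative(f), additive(f), maximum(f))
}
```

For factors (sep, stab, consist, conf) = (0.9, 0.9, 0.9, 0.05), the product is about 0.036, the mean is 0.6875, and the maximum is 0.9. Only the product stays in the Accept band when confidence is this low.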

Alt 3: Learned scoring (a small MLP)

Rejected for this ADR: introduces an opaque model whose output cannot be audited from first principles. The multiplicative formula is simple, conservative, and directly explainable to operators. A learned model is a future option once enough calibration data is in hand.

Alt 4: Per-feature thresholds instead of a continuous score

Rejected: continuous score is needed for the HA gauge entity and for downstream calibration. Per-feature thresholds would force operators to interpret nine separate binaries.


5. Acceptance Criteria

  • AC1: All nine features are computed in < 8 ms p95 per window on a Pi 5 core.
  • AC2: identity_risk_score is monotonic non-decreasing in any single input when the other three are held constant.
  • AC3: Calibration regression on the KIT BFId test split: score ≥ 0.8 corresponds to ≥ 80% re-ID accuracy ± 5%.
  • AC4: The coherence gate emits Recalibrate if score is ≥ 0.9 for ≥ 5 seconds.
  • AC5: A score that fluctuates within ± 0.05 of a threshold does not change the gate action; an action change requires the score to stay beyond the margin for the full 5-second debounce window.
  • AC6: At privacy_class = 3, the risk score is computed but not published to MQTT (kept local for the gate only).
  • AC7: A reproducible 1,000-frame synthetic fixture produces a deterministic score sequence (bit-identical across runs).
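AC2 and AC7 lend themselves to small property-style checks. The sketch below repeats the §2.2 formula so that it compiles on its own, then adds a seeded generator and two helpers. The `Lcg` type (a linear congruential generator with the Numerical Recipes constants), `fixture_scores`, and `monotonic_in_sep` are hypothetical stand-ins for whatever synthetic fixture the crate ends up using.

```rust
pub fn identity_risk_score(sep: f32, stab: f32, consist: f32, conf: f32) -> f32 {
    let s = sep.clamp(0.0, 1.0);
    let t = stab.clamp(0.0, 1.0);
    let p = consist.clamp(0.0, 1.0);
    let c = conf.clamp(0.0, 1.0);
    (s * t * p * c).clamp(0.0, 1.0)
}

/// Small seeded generator so the fixture needs no external crates.
pub struct Lcg(pub u32);

impl Lcg {
    /// Next value in [0, 1), built from the top 24 bits of the state.
    pub fn next_unit(&mut self) -> f32 {
        self.0 = self.0.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
        (self.0 >> 8) as f32 / (1u32 << 24) as f32
    }
}

/// AC7 sketch: score sequence for `n` synthetic frames, returned as raw
/// bit patterns so that equality means bit-identical.
pub fn fixture_scores(seed: u32, n: usize) -> Vec<u32> {
    let mut rng = Lcg(seed);
    (0..n)
        .map(|_| {
            let (s, t, p, c) =
                (rng.next_unit(), rng.next_unit(), rng.next_unit(), rng.next_unit());
            identity_risk_score(s, t, p, c).to_bits()
        })
        .collect()
}

/// AC2 sketch: sweeps `sep` over a grid with the other inputs fixed and
/// checks that the score never decreases.
pub fn monotonic_in_sep(stab: f32, consist: f32, conf: f32, steps: usize) -> bool {
    let mut prev = identity_risk_score(0.0, stab, consist, conf);
    for i in 1..=steps {
        let sep = i as f32 / steps as f32;
        let cur = identity_risk_score(sep, stab, consist, conf);
        if cur < prev {
            return false;
        }
        prev = cur;
    }
    true
}
```

Comparing `fixture_scores(42, 1000)` against a stored golden vector would cover AC7, and the same sweep can be repeated for each of the other three inputs to cover AC2 in full.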

6. References

  • ADR-118 (umbrella)
  • ADR-024 (AETHER encoder for separability)
  • ADR-029 (coherence_gate.rs precedent)
  • ADR-086 (edge novelty gate pattern)
  • ADR-120 §2.4 (class transition consumed by gate)
  • KIT BFId dataset: https://publikationen.bibliothek.kit.edu/1000185756