docs: truth-up README + user-guide on Hugging Face model release (#637)

The previous wording in both README.md and docs/user-guide.md claimed that no pretrained weights had been released yet. That was wrong: the contrastive CSI encoder, presence-detection head, and per-node LoRA adapters have been published as ruvnet/wifi-densepose-pretrained on Hugging Face for several weeks (124 downloads at time of writing), with 100% presence accuracy on the validation set and 164,183 emb/s on M4 Pro.

This commit replaces the "no shipped weights" framing with the actual state, and surfaces a real loader gap discovered during a before/after benchmark of the sensing-server:

* Baseline run (no --model): server produced presence/motion/vitals output at ~19 ticks/s, as expected.
* After run (--model models/wifi-densepose-pretrained.rvf): the progressive RVF loader errored with "invalid magic at offset 0: expected 0x52564653, got 0x7974227B" (0x7974227B is the ASCII bytes {"ty… from the JSONL header). v2/.../rvf_container.rs only parses the binary RVF segment format; the HF artifact is JSONL RVF. When the load fails, the pipeline degrades to null output (variance=0, presence=None) rather than falling back to heuristic mode.

The docs now describe (a) what works today (Python / training-side consumption of model.safetensors) and (b) what is gated on a JSONL adapter or a binary-RVF republish (sensing-server --model loading).

The 17-keypoint pose model remains separately pending (#509, ADR-079 phases P7–P9).
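
The magic value quoted in the loader error can be checked by hand. The sketch below assumes the loader reads the first four bytes of the file as a little-endian 32-bit integer, which is what the quoted value implies; the helper name `as_le_bytes` is illustrative and not part of the repository.

```python
import struct

EXPECTED_MAGIC = 0x52564653  # value the loader expected, per the error message
OBSERVED_MAGIC = 0x7974227B  # value the loader actually read at offset 0


def as_le_bytes(value: int) -> bytes:
    """Return the four file bytes that a little-endian u32 read would have consumed."""
    return struct.pack("<I", value)


# The observed value decodes to the opening of a JSON object.
print(as_le_bytes(OBSERVED_MAGIC))          # b'{"ty'

# Written most-significant byte first, the expected constant spells the magic name.
print(EXPECTED_MAGIC.to_bytes(4, "big"))    # b'RVFS'
```

The first four bytes of the file are therefore `{"ty`, which is consistent with a JSONL file being handed to a binary parser.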
2026-06-02 00:58:56 +02:00 · 2026-05-19 13:03:54 -04:00
parent 8247d28d90
commit ad15f1b049
2 changed files with 99 additions and 9 deletions
@@ -32,7 +32,7 @@ Built on [RuVector](https://github.com/ruvnet/ruvector/) and [Cognitum Seed](htt

 The system learns each environment locally using spiking neural networks that adapt in under 30 seconds, with multi-frequency mesh scanning across 6 WiFi channels that uses your neighbors' routers as free radar illuminators. Every measurement is cryptographically attested via an Ed25519 witness chain.

-RuView **ships the full training pipeline for camera-free 17-keypoint pose estimation (WiFlow + AETHER + MERIDIAN heads)** — based on the original *DensePose From WiFi* research at Carnegie Mellon University. **What ships today is the inference and training infrastructure; pretrained pose weights are not yet released** (tracked in [#509](https://github.com/ruvnet/RuView/issues/509)). With no `.rvf` model loaded, the sensing server drives the on-screen skeleton from signal-based heuristics (amplitude variance, motion-band power), not learned keypoint inference. Camera-supervised fine-tune targets **35%+ PCK@20** ([ADR-079](docs/adr/ADR-079-camera-supervised-pose-finetune.md)) — pipeline implemented, P7–P9 (data collection + training + eval) are `Pending`.
+RuView **ships pretrained CSI weights on Hugging Face** at [`ruvnet/wifi-densepose-pretrained`](https://huggingface.co/ruvnet/wifi-densepose-pretrained): a self-supervised contrastive CSI encoder (128-dim embeddings, 12.2M training steps, 60K frames), a presence-detection head reporting 100% accuracy on the validation set, and per-node LoRA adapters. Models are released as `.safetensors`, 2-bit/4-bit/8-bit quantized `.bin` (4 KB–16 KB), and a JSONL RVF container. The Python training and evaluation tooling consumes these today via `safetensors`. **Pending wiring**: the sensing-server's `--model` flag still expects binary RVF, so live-server consumption of the JSONL bundle is gated on a JSONL adapter (or a re-publish in binary RVF); see [Pretrained model on Hugging Face](#-pretrained-model-on-hugging-face) below for the workaround. **Not yet released**: a 17-keypoint pose-estimation model. The training pipeline is implemented (WiFlow + AETHER + MERIDIAN heads), but camera-supervised fine-tune phases P7–P9 of [ADR-079](docs/adr/ADR-079-camera-supervised-pose-finetune.md) are `Pending`, tracked in [#509](https://github.com/ruvnet/RuView/issues/509). The live sensing server therefore drives the on-screen output from signal-based DSP heuristics today.

 ### Built for low-power edge applications

@@ -51,19 +51,22 @@ RuView **ships the full training pipeline for camera-free 17-keypoint pose estim
 > |------|--------|-----|-------|
 > | 🫁 **Breathing rate** | ✅ Works today | Bandpass 0.1-0.5 Hz → zero-crossing BPM, circular variance on wrapped phase ([#593](https://github.com/ruvnet/RuView/issues/593)) | 6-30 BPM |
 > | 💓 **Heart rate** | ✅ Works today | Bandpass 0.8-2.0 Hz → zero-crossing BPM | 40-120 BPM (needs good SNR) |
-> | 👤 **Presence indicator** | ⚠️ Heuristic, not learned | Phase variance vs adaptive threshold (60 s ambient calibration). False-positives under strong RF interference. | < 1 ms latency |
+> | 👤 **Presence detection** | ✅ Heuristic in server · 🤗 Trained head on HF (loader wiring pending) | Live server uses phase-variance vs adaptive threshold (60 s ambient calibration). A trained `presence-head.json` reporting 100% validation accuracy is published in [`ruvnet/wifi-densepose-pretrained`](https://huggingface.co/ruvnet/wifi-densepose-pretrained) but the sensing-server's `--model` loader only accepts binary RVF today — JSONL adapter pending. | <1 ms heuristic |
+> | 🧬 **CSI embeddings** | 🤗 Trained encoder on HF | 128-dim contrastive encoder, **164,183 emb/s** on M4 Pro. Usable today from Python / training via `model.safetensors`; sensing-server consumption pending the same JSONL loader gap as above. | 8 KB q4 fits ESP32 SRAM |
 > | 🚶 **Motion / activity** | ✅ Works today | Motion-band power + phase acceleration | Real-time |
 > | 🤸 **Fall detection** | ✅ Works today | Phase acceleration > threshold + 3-frame debounce + 5 s cooldown ([#263](https://github.com/ruvnet/RuView/issues/263)) | < 200 ms |
-> | 🧮 **Multi-person slot count** | ⚠️ Heuristic, not learned | Subcarrier diversity divided by 2 (capped). **Not** a learned counter — see [firmware README](firmware/esp32-csi-node/README.md#tier-2--full-pipeline-stable) "Tier 2 caveats". Adaptive normalisation fix in [#491](https://github.com/ruvnet/RuView/pull/491). | Real-time |
-> | 🦴 **17-keypoint pose estimation** | 🔬 Pipeline only, no shipped weights | Training infrastructure complete (WiFlow + AETHER + MERIDIAN heads); pretrained `.rvf` not yet released. Fallback heuristic in the meantime. Tracked in [#509](https://github.com/ruvnet/RuView/issues/509). | Pending data collection |
+> | 🧮 **Multi-person slot count** | ⚠️ Heuristic, not learned | Subcarrier diversity divided by 2 (capped). **Not** a learned counter — see [firmware README](firmware/esp32-csi-node/README.md#tier-2--full-pipeline-stable) "Tier 2 caveats". Adaptive normalisation in [#491](https://github.com/ruvnet/RuView/pull/491). | Real-time |
+> | 🦴 **17-keypoint pose estimation** | 🔬 Pipeline only, no shipped weights | Training infrastructure complete (WiFlow + AETHER + MERIDIAN heads); the published HF model is presence + embeddings, not keypoints. Tracked in [#509](https://github.com/ruvnet/RuView/issues/509). | Pending data collection |
 > | 🧱 **Through-wall sensing** | ✅ Works today | Fresnel zone geometry + multipath modeling | Up to ~5m signal-dependent |
 > | 🧠 **Edge intelligence** | ✅ Works today | Optional Cognitum Seed for persistent vector store + kNN + witness chain | $140 total BOM |
-> | 🎯 **Camera-free pre-training** | ✅ Pipeline works | MM-Fi + Wi-Pose datasets through `wifi-densepose-train`. Released weights pending [#509](https://github.com/ruvnet/RuView/issues/509). | 84 s/epoch on M4 Pro |
+> | 🎯 **Camera-free pre-training** | ✅ Shipped weights on HF | Self-supervised contrastive encoder, 12.2M training steps on 60K frames. See [`ruvnet/wifi-densepose-pretrained`](https://huggingface.co/ruvnet/wifi-densepose-pretrained). | 84 s/epoch retrain on M4 Pro |
 > | 📷 **Camera-supervised fine-tune** | 🔬 Pipeline only | MediaPipe + ESP32 CSI paired training, [ADR-079](docs/adr/ADR-079-camera-supervised-pose-finetune.md). Target **35%+ PCK@20**. P7–P9 (data + train + eval) `Pending`. | ~19 min/epoch on laptop |
 > | 📡 **Multi-frequency mesh** | ✅ Works today | Channel hopping across 6 bands, TDM slot scheduling (ADR-029) | 3x sensing bandwidth |
 > | 🌐 **3D point cloud fusion** | 🔬 Reference impl | Camera depth (MiDaS) + WiFi CSI + mmWave radar → unified spatial model. Requires camera. | 22 ms pipeline · 19K+ points/frame |
 >
-> Legend: ✅ shipped + tested on hardware · ⚠️ ships and runs, but is a heuristic/threshold (not a learned classifier) — accuracy depends on calibration · 🔬 implementation + tests in repo, weights/data/eval pending
+> Legend: ✅ shipped + tested on hardware (some have learned weights on [HF](https://huggingface.co/ruvnet/wifi-densepose-pretrained), others are deterministic DSP) · ⚠️ ships and runs, but is a heuristic/threshold (not a learned classifier) — accuracy depends on calibration · 🔬 implementation + tests in repo, weights/data/eval pending
+>
+> 🤗 **Pretrained weights**: download from [`ruvnet/wifi-densepose-pretrained`](https://huggingface.co/ruvnet/wifi-densepose-pretrained); see [Pretrained model on Hugging Face](#-pretrained-model-on-hugging-face) below for the download command and current loader status.

 ```bash
 # Option 1: Docker (simulated data, no hardware needed)
@@ -120,6 +123,31 @@ node scripts/mincut-person-counter.js --port 5006  # Correct person counting
 > **Live ESP32 pipeline**: Connect an ESP32-S3 node → run the [sensing server](#sensing-server) → open the [pose fusion demo](https://ruvnet.github.io/RuView/pose-fusion.html) for real-time dual-modal pose estimation (webcam + WiFi CSI). See [ADR-059](docs/adr/ADR-059-live-esp32-csi-pipeline.md).


+## 🤗 Pretrained model on Hugging Face
+
+Pretrained CSI weights live at [`ruvnet/wifi-densepose-pretrained`](https://huggingface.co/ruvnet/wifi-densepose-pretrained) — 12.2M training steps on 60K frames / 610K contrastive triplets, **100% presence accuracy** on the validation set, 4-bit quantized variant fits in 8 KB. The release includes a contrastive **CSI encoder** producing 128-dim embeddings (164,183 emb/s on M4 Pro) and a **presence-detection head**. Per-node LoRA adapters are included for environment-specific fine-tuning.
+
+```bash
+# Download the model bundle
+pip install huggingface_hub
+huggingface-cli download ruvnet/wifi-densepose-pretrained --local-dir models/wifi-densepose-pretrained
+```
+
+**What works today vs. what's pending wiring:**
+
+| Consumer | Format used | Status |
+|----------|-------------|--------|
+| Python training / evaluation / embedding extraction | `model.safetensors` | ✅ Works — load with `safetensors.torch.load_file` |
+| Inspect / re-export the bundle | `model.rvf.jsonl` (line-by-line JSON) | ✅ Works — plain JSONL |
+| Sensing-server `--model <PATH>` flag | binary RVF (`RVFS` magic) | ⚠️ Loader does not yet accept the JSONL container |
+
+**Known gap:** the HF model ships in JSONL RVF format, but `v2/crates/wifi-densepose-sensing-server/src/rvf_container.rs` only parses the binary RVF segment format. Pointing `--model` at `model.rvf.jsonl` currently errors with `invalid magic at offset 0: expected 0x52564653, got 0x7974227B` and the live pipeline degrades to null output rather than falling back to heuristic mode — so for the live sensing-server, run **without** `--model` until a JSONL adapter lands (or the model is re-published as binary RVF). Use the weights from Python / training in the meantime.
+
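As a stopgap, a launcher script could check the model file before passing it to `--model`. The following is a minimal sketch, not code from the repository: `starts_with_json` and `model_args` are hypothetical helper names, and the check only rules out the JSON case. It does not prove that the file is valid binary RVF.

```python
from pathlib import Path


def starts_with_json(path: Path) -> bool:
    """True when the first non-whitespace byte is '{', i.e. the file looks like JSON/JSONL."""
    with path.open("rb") as fh:
        head = fh.read(64)
    return head.lstrip()[:1] == b"{"


def model_args(path: Path) -> list[str]:
    """Return the `--model` arguments, or an empty list when the file looks like JSONL."""
    if starts_with_json(path):
        # The binary loader would reject this file, so omit the flag
        # and let the server run in heuristic mode.
        return []
    return ["--model", str(path)]
```

Appending `model_args(path)` to the server command line keeps the flag for files that are not JSON and drops it for the current HF bundle.
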
+**Quantization choices** (all in the HF repo): `model-q2.bin` (4 KB) · `model-q4.bin` ⭐ recommended (8 KB) · `model-q8.bin` (16 KB) · `model.safetensors` full (48 KB)
+
+The separate **17-keypoint pose-estimation model** is not in this release — pipeline is implemented but keypoint weights are still pending. Tracked in [#509](https://github.com/ruvnet/RuView/issues/509); see [ADR-079](docs/adr/ADR-079-camera-supervised-pose-finetune.md) phases P7–P9.
+
+
 ## 🔬 How It Works

 WiFi routers flood every room with radio waves. When a person moves — or even breathes — those waves scatter differently. WiFi DensePose reads that scattering pattern and reconstructs what happened:
@@ -29,13 +29,14 @@ WiFi DensePose turns commodity WiFi signals into real-time human pose estimation
 8. [Vital Sign Detection](#vital-sign-detection)
 9. [CLI Reference](#cli-reference)
 10. [Observatory Visualization](#observatory-visualization)
-11. [Adaptive Classifier](#adaptive-classifier)
+11. [Loading the Pretrained Model from Hugging Face](#loading-the-pretrained-model-from-hugging-face)
+12. [Adaptive Classifier](#adaptive-classifier)
    - [Recording Training Data](#recording-training-data)
    - [Training the Model](#training-the-model)
    - [Using the Trained Model](#using-the-trained-model)
-12. [Training a Model](#training-a-model)
+13. [Training a Model](#training-a-model)
    - [CRV Signal-Line Protocol](#crv-signal-line-protocol)
-13. [RVF Model Containers](#rvf-model-containers)
+14. [RVF Model Containers](#rvf-model-containers)
 14. [Hardware Setup](#hardware-setup)
    - [ESP32-S3 Mesh](#esp32-s3-mesh)
    - [Intel 5300 / Atheros NIC](#intel-5300--atheros-nic)
@@ -793,6 +794,67 @@ The Observatory is an immersive Three.js visualization that renders WiFi sensing

 ---

+## Loading the Pretrained Model from Hugging Face
+
+A pretrained CSI encoder + presence-detection head is published on Hugging Face at [`ruvnet/wifi-densepose-pretrained`](https://huggingface.co/ruvnet/wifi-densepose-pretrained). It was trained on 60,630 frames / 610,615 contrastive triplets (12.2M steps, final loss 0.065) and reports 100% presence accuracy and ~164k embeddings/sec on an Apple M4 Pro.
+
+What it ships (and what it does not):
+
+| Capability | Status |
+|------------|--------|
+| Presence detection (occupied / empty) | ✅ Trained head — 100% accuracy on validation |
+| 128-dim CSI embeddings (re-ID, similarity, downstream training) | ✅ Trained encoder |
+| Single-person breathing / heart-rate | ⚠️ Server still uses heuristic DSP — model does not replace this yet |
+| 17-keypoint full-body pose | 🔬 No keypoint weights shipped yet — pose pipeline runs but without a learned head |
+
+### Download
+
+```bash
+pip install huggingface_hub
+huggingface-cli download ruvnet/wifi-densepose-pretrained \
+    --local-dir models/wifi-densepose-pretrained
+```
+
+The download yields a small set of files. The `.rvf.jsonl` is the canonical container, but the sensing server cannot read it yet (see the known gap under "Using the weights"):
+
+```
+models/wifi-densepose-pretrained/
+  model.rvf.jsonl       # RVF container (encoder + presence head + lora)
+  model.safetensors     # 48 KB — same encoder weights, safetensors format
+  model-q4.bin          # 8 KB — recommended quantization for edge
+  presence-head.json    # presence classifier head
+  config.json           # sona-lora rank=8 alpha=16, target encoder + task_heads
+```
+
+### Using the weights
+
+The HF artifact is in **JSONL RVF** format (one JSON object per line: `metadata`, `encoder`, `lora`). What you can do with it today:
+
+| Consumer | Format it reads | Status |
+|----------|-----------------|--------|
+| Python / PyTorch training pipeline | `model.safetensors` | ✅ Works — load with `safetensors.torch.load_file` |
+| RVF JSONL inspection / re-export | `model.rvf.jsonl` | ✅ Works — plain JSONL, parse line-by-line |
+| Sensing-server `--model <PATH>` flag | binary RVF (`RVFS` magic) | ⚠️ Does **not** accept the JSONL file yet — see gap below |
+
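Line-by-line inspection of the JSONL container can be sketched as follows. This is an illustrative helper (`summarize_jsonl` is not part of the repository) that assumes only what is stated above, namely one JSON object per line. It makes no assumption about field names and simply reports the top-level keys of each record.

```python
import json
from pathlib import Path


def summarize_jsonl(path: Path) -> list[list[str]]:
    """Return the sorted top-level keys of each non-empty line in a JSONL file."""
    summary = []
    with path.open("r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            # Non-object lines are reported as having no keys.
            summary.append(sorted(record) if isinstance(record, dict) else [])
    return summary


if __name__ == "__main__":
    bundle = Path("models/wifi-densepose-pretrained/model.rvf.jsonl")
    if bundle.exists():
        for number, keys in enumerate(summarize_jsonl(bundle), start=1):
            print(f"line {number}: {keys}")
```

Printing only the keys keeps the output short even when a record carries large weight arrays.
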
+**Known gap (tracked):** `v2/crates/wifi-densepose-sensing-server/src/rvf_container.rs` only parses the binary RVF segment format (magic `0x52564653`). Pointing `--model` at `model.rvf.jsonl` causes the progressive loader to error with `invalid magic at offset 0: expected 0x52564653, got 0x7974227B` (`0x7974227B` is the ASCII bytes `{"ty…` from the JSONL header), and the live pipeline degrades to null output rather than falling back to heuristic mode. Until a JSONL adapter lands (or the model is re-published as binary RVF), run the sensing-server **without** `--model` and consume the HF weights from Python or the training pipeline.
+
+```bash
+# Works today — Python side (training, evaluation, embedding extraction):
+python -c "
+from safetensors.torch import load_file
+state = load_file('models/wifi-densepose-pretrained/model.safetensors')
+print({k: tuple(v.shape) for k, v in state.items()})
+"
+
+# Sensing server — run heuristic for now:
+cargo run -p wifi-densepose-sensing-server --release -- \
+    --source esp32 --udp-port 5005 --http-port 3000
+```
+
+See [RVF Model Containers](#rvf-model-containers) for the binary format the loader expects, and [Training a Model](#training-a-model) for using the encoder as a starting point for environment-specific fine-tuning.
+
+---
+
 ## Adaptive Classifier

 The adaptive classifier (ADR-048) learns your environment's specific WiFi signal patterns from labeled recordings. It replaces static threshold-based classification with a trained logistic regression model that uses 15 features (7 server-computed + 8 subcarrier-derived statistics).