Context
Found during ADR-152 §2.2 measurement (b) (2026-06-10/11), when a fresh 40-minute paired collection initially aligned to zero windows and the trained-model forensics exposed silent data corruption. These bugs also retroactively explain pathologies in earlier sessions (#645, #509). Full forensic record: benchmarks/wiflow-std/RESULTS.md on branch feat/adr-152-wiflow-std-benchmark.
Bug 1 — scripts/record-csi-udp.py stamps local time with a Z (UTC) suffix
parse_csi_packet() builds timestamp via time.strftime('%Y-%m-%dT%H:%M:%S.') + ... + 'Z' — local wall time labeled as UTC. The camera collector writes true-epoch ts_ns. The aligner parses the CSI ISO string as UTC, so camera and CSI disagree by the UTC offset (−4 h under EDT) and alignment produces 0 pairs. Workaround used: --clock-offset-ms=-14400000. Fix: write datetime.now(timezone.utc).isoformat() or just use the already-present ts_ns in the aligner (preferred — see Bug 4 note).
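The recorder-side fix can be sketched as follows. This is a minimal illustration, not the code in record-csi-udp.py: it assumes each record already carries an integer ts_ns (epoch nanoseconds), and the helper names utc_iso_from_ns and iso_to_ns are hypothetical. Note that datetime.now(timezone.utc).isoformat() emits a +00:00 suffix rather than Z, so the sketch formats the Z suffix explicitly from a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def utc_iso_from_ns(ts_ns: int) -> str:
    """Format epoch nanoseconds as an ISO-8601 string that really is UTC."""
    seconds, nanos = divmod(ts_ns, 1_000_000_000)
    # tz=timezone.utc is the important part: time.strftime() with no
    # explicit time tuple formats local wall time instead.
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.strftime("%Y-%m-%dT%H:%M:%S") + f".{nanos // 1000:06d}Z"

def iso_to_ns(iso: str) -> int:
    """Parse the string back to epoch nanoseconds (microsecond precision)."""
    dt = datetime.strptime(iso, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return (whole_seconds * 1_000_000 + delta.microseconds) * 1_000
```

If the aligner reads ts_ns directly, the ISO string becomes informational only and the round trip above serves as a regression check on the recorder.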
Bug 2 — scripts/align-ground-truth.js dilutes window confidence with non-detection frames
loadGroundTruth() keeps records with keypoints: [] (empty array is truthy) at confidence 0; window avgConf then averages detections and empties. At a normal ~27% MediaPipe detection rate, every window's avgConf lands ~0.22 < the 0.5 threshold → all windows rejected even when detections themselves average 0.80 confidence. Fix: skip empty-keypoint records at load (treat as gaps); confidence statistics should be over detections only. --min-camera-frames still guards sparse windows.
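The dilution and the proposed fix can be sketched in a few lines. The aligner itself is JavaScript; Python is used here only to illustrate the logic. The keypoints field comes from the records described above, while the confidence field name and both helper names are assumptions made for this example.

```python
from statistics import mean

def load_detections(records):
    """Drop records with an empty keypoints list; they are gaps, not detections."""
    # An explicit length check avoids relying on truthiness, which differs
    # between languages: [] is truthy in JavaScript but falsy in Python.
    return [r for r in records if len(r.get("keypoints") or []) > 0]

def window_confidence(records):
    """Mean confidence over detections only; None when the window has none."""
    detections = load_detections(records)
    if not detections:
        return None
    return mean(r["confidence"] for r in detections)

# 27 detections at 0.80 plus 73 empty frames at 0.0, mirroring the ~27% rate.
detected = [{"keypoints": [[0.5, 0.5]], "confidence": 0.8} for _ in range(27)]
empty = [{"keypoints": [], "confidence": 0.0} for _ in range(73)]
window = detected + empty

diluted = mean(r["confidence"] for r in window)   # about 0.216, below 0.5
detections_only = window_confidence(window)       # about 0.80, above 0.5
```

With the empties included, the mean is 0.27 × 0.80 ≈ 0.22, which is the collapse described above. Averaging over detections only recovers 0.80.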
Bug 3 — heterogeneous csi_shape with silent zero-padding
extractCsiMatrix() stamps the window's subcarrier count from window[0].subcarriers and zero-pads/truncates the other 19 frames to match. Tonight's session: 1,347×[70,20], 284×[134,20], 243×[26,20], 130×[12,20], 42×[20,20] — ~20% of frames inside even native-70 windows were silently zero-padded. Mixed-subcarrier frames come from the ESP32 emitting different packet formats (HT20/HT40/fragments). Fix: either filter frames to the session's modal subcarrier count before windowing, or record the per-frame subcarrier count and reject mixed windows; never silently pad.
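The first option, filtering to the modal subcarrier count before windowing, might look like the sketch below. It is written in Python for illustration only and assumes each frame exposes an integer subcarriers count, as window[0].subcarriers suggests. The function names are hypothetical.

```python
from collections import Counter

def modal_subcarrier_count(frames):
    """Return the most common subcarrier count in the session, or None if empty."""
    counts = Counter(f["subcarriers"] for f in frames)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def homogeneous_windows(frames, window_size=20):
    """Keep only modal-count frames, then cut fixed-size windows with no padding."""
    n_sc = modal_subcarrier_count(frames)
    kept = [f for f in frames if f["subcarriers"] == n_sc]
    # Trailing frames that do not fill a whole window are dropped, not padded.
    return [
        kept[start:start + window_size]
        for start in range(0, len(kept) - window_size + 1, window_size)
    ]
```

One design caveat: dropping off-mode frames closes up the gaps they leave, so a real implementation would probably also want to check timestamp continuity inside each window.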
Bug 4 — transposed shape label in extractCsiMatrix
The matrix is filled frame-major (matrix[f * nSc + s]) but declared shape: [nSc, nFrames] (~line 351). Consumers that trust the label transpose the data. Found because the measurement-(b) trainer had to correct it on load. Fix the label or the fill order, and add a round-trip test.
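A round-trip test for the label could take roughly this shape. The sketch is in Python and is not the aligner's code: it keeps the frame-major fill (matrix[f * nSc + s]) and corrects the label to [nFrames, nSc], which is one of the two fixes proposed above. The function names and the data/shape keys are assumptions for illustration.

```python
def fill_frame_major(window, n_sc):
    """Flatten a window of frames frame-major and label it [n_frames, n_sc]."""
    n_frames = len(window)
    matrix = [0.0] * (n_frames * n_sc)
    for f, frame in enumerate(window):
        for s in range(n_sc):
            matrix[f * n_sc + s] = frame[s]
    return {"data": matrix, "shape": [n_frames, n_sc]}

def unflatten(flat, shape):
    """Rebuild rows from a row-major flat list using the declared shape."""
    rows, cols = shape
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

# Distinct values per (frame, subcarrier) so any transposition is visible.
window = [[float(f * 100 + s) for s in range(3)] for f in range(2)]
out = fill_frame_major(window, n_sc=3)

round_trip_ok = unflatten(out["data"], out["shape"]) == window
transposed_label_ok = unflatten(out["data"], [3, 2]) == window
```

A consumer that trusts the corrected label recovers the original frames, while the transposed label [nSc, nFrames] does not, which is the failure the test is meant to catch.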
Acceptance
- A fresh paired session aligns without any --clock-offset-ms flag
- Window kept-rate ≈ csi_frames/20 × detection_coverage (no silent confidence collapse)
- No zero-padded frames in output windows; csi_shape homogeneous per file
- Shape label matches memory layout (tested)
- Re-running alignment on tonight's raw files (data/recordings/csi-1781143789.csi.jsonl + data/ground-truth/keypoints_20260610_221000.jsonl) reproduces ≥2,046 pairs without workarounds
Related
#645 (paired-data quantity/quality tracking), #509 (external reproducibility), ADR-152 §2.2, the 92.9% retraction (CHANGELOG + PR #535).
🤖 Generated with claude-flow