Human Activity Detection

Human Skeleton Estimation using Single WiFi Device

Overlay of video image, camera-generated skeleton (green), and radio-generated skeleton (red)



Problem & Motivation

Most existing WiFi-based sensing methods rely on channel state information (CSI) collected across separate transmitter (Tx) and receiver (Rx) devices. These setups suffer from hardware imperfections (carrier frequency offset, CFO; sampling time offset, STO; carrier phase offset, CPO) and from environmental sensitivity, especially in the phase of the CSI. This makes them unreliable, particularly when they are moved to new, unseen environments. We develop SiWiS for fine-grained human detection (pose estimation and mask segmentation) in indoor settings. It needs no camera at inference time, which preserves privacy, while remaining robust and generalizable.

Core Ideas

SiWiS is a combined hardware and neural network system built on a single WiFi device. The hardware add-on is a sensing module (patch antennas, RF switch, amplifier, mixer, ADC) that attaches to a commercial WiFi router. It works by self-mixing: a local copy of the transmitted OFDM (802.11) signal is mixed with its reflections from the environment. This design co-locates the transmitter and receiver, which helps avoid the CFO, STO, and CPO errors that plague multi-device CSI systems. Signal processing then extracts features from the mixed signals; in particular, the phase of certain processed signals varies linearly with an object's displacement (its change in distance), enabling phase-coherent sensing.
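The linear phase-distance relationship can be illustrated with an idealized single-tone model. The sketch below is not the SiWiS signal-processing chain, which operates on wideband OFDM signals; it only assumes one carrier tone, one reflector, and a hypothetical carrier frequency of 5.18 GHz. Mixing the reflection with the conjugate of the local transmit copy leaves a phase of -4πfd/c for a reflector at distance d, and any oscillator phase or time offset shared by both paths cancels because they come from the same device.

```python
import cmath
import math

C = 299_792_458.0   # speed of light in m/s
F_CARRIER = 5.18e9  # assumed carrier frequency in Hz (illustrative value only)

def self_mixed_sample(distance_m, t=0.0, osc_phase=0.0, amplitude=0.5):
    """Mix a reflected tone with the local copy of the transmitted tone.

    Both tones share the same oscillator phase because transmitter and
    receiver are co-located, so that phase cancels in the product.
    """
    tx = cmath.exp(1j * (2 * math.pi * F_CARRIER * t + osc_phase))
    delay = 2.0 * distance_m / C  # round-trip propagation delay
    rx = amplitude * cmath.exp(1j * (2 * math.pi * F_CARRIER * (t - delay) + osc_phase))
    return rx * tx.conjugate()

def unwrap(phases):
    """Remove 2*pi jumps so consecutive phases differ by less than pi."""
    out = [phases[0]]
    for p in phases[1:]:
        step = (p - out[-1] + math.pi) % (2 * math.pi) - math.pi
        out.append(out[-1] + step)
    return out

# A reflector moving away from the device in 2 mm steps, starting at 2 m.
distances = [2.0 + 0.002 * k for k in range(20)]
phases = unwrap([cmath.phase(self_mixed_sample(d)) for d in distances])

measured_slope = (phases[-1] - phases[0]) / (distances[-1] - distances[0])
expected_slope = -4 * math.pi * F_CARRIER / C  # radians per metre
print(f"measured {measured_slope:.3f} rad/m, expected {expected_slope:.3f} rad/m")
```

In a two-device setup the transmit and receive oscillators differ, so an extra time-varying phase term would survive the mixing step; in this model it drops out, which is the intuition behind the single-device design.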

Deep Learning & Task Architecture

On the software side, SiWiS uses a dual-branch deep neural network (DNN) that performs two tasks in parallel: (i) mask segmentation of the human body and (ii) human pose estimation (keypoints). During training, SiWiS uses synchronized video frames and vision-based models to generate ground truth (masks, keypoints), allowing cross-modal supervision. At inference time, only the WiFi-based sensing signal is required. The network consists of a signal encoder (convolutions plus self-attention over time) and decoders that upsample via cross-attention to produce spatial heat maps for segmentation and pose.
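As a rough illustration of cross-modal supervision, the sketch below turns camera-derived keypoints into Gaussian heat-map targets and scores a radio-branch prediction against them. The loss terms (binary cross-entropy for the mask, mean squared error for the heat maps), the Gaussian width, the weighting, and all function names are assumptions made for this example; the source does not specify the training objective.

```python
import math

def keypoint_heatmap(x, y, width, height, sigma=1.0):
    """Gaussian heat map peaking at pixel (x, y); rows are indexed by y."""
    return [[math.exp(-((col - x) ** 2 + (row - y) ** 2) / (2 * sigma ** 2))
             for col in range(width)] for row in range(height)]

def mse(pred, target):
    """Mean squared error between two 2-D grids of equal shape."""
    cells = [(p - t) ** 2 for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(cells) / len(cells)

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped away from 0 and 1."""
    total, count = 0.0, 0
    for pr, tr in zip(pred, target):
        for p, t in zip(pr, tr):
            p = min(max(p, eps), 1.0 - eps)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
            count += 1
    return total / count

def cross_modal_loss(pred_mask, pred_heatmaps, cam_mask, cam_keypoints, pose_weight=1.0):
    """Score radio-branch outputs against camera-derived ground truth."""
    height, width = len(cam_mask), len(cam_mask[0])
    targets = [keypoint_heatmap(x, y, width, height) for x, y in cam_keypoints]
    pose = sum(mse(p, t) for p, t in zip(pred_heatmaps, targets)) / len(targets)
    return bce(pred_mask, cam_mask) + pose_weight * pose

# Toy 4x3 scene: a vertical body mask and two keypoints given as (x, y).
cam_mask = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
cam_keypoints = [(1, 0), (2, 2)]

uninformed_mask = [[0.5] * 4 for _ in range(3)]
blank_heatmaps = [[[0.0] * 4 for _ in range(3)] for _ in cam_keypoints]
perfect_heatmaps = [keypoint_heatmap(x, y, 4, 3) for x, y in cam_keypoints]

loss_bad = cross_modal_loss(uninformed_mask, blank_heatmaps, cam_mask, cam_keypoints)
loss_good = cross_modal_loss(cam_mask, perfect_heatmaps, cam_mask, cam_keypoints)
print(f"uninformed prediction: {loss_bad:.4f}, perfect prediction: {loss_good:.2e}")
```

The camera appears only in the targets passed to `cross_modal_loss`, so once training is done the radio branch can run without any video input.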

Results & Contributions

The prototype demonstrates that SiWiS substantially outperforms traditional CSI-based WiFi sensing methods in accuracy for both mask segmentation and pose estimation. Importantly, zero-shot experiments (transfer to new, unseen environments) show good generalization, thanks largely to the phase coherence and to a hardware design that reduces sensitivity to environment and hardware variation. SiWiS can also resist interference from other WiFi or non-WiFi sources by detecting whether the excitation (signal source) is its own device or an external one. Overall, the paper contributes (a) a hardware add-on enabling OFDM self-mixing on a single device for sensing, (b) a DNN architecture for fine-grained human detection, and (c) validation that this setup improves generalizability over prior CSI-based methods.

Publication

  • SiWiS: Fine-grained human detection using single Wi-Fi device [PDF]
    K. Song*, Q. Wang*, S. Zhang*, and H. Zeng,
    ACM MobiCom, 2024.