A frontend engineer explores hand gesture detection on the web using Google MediaPipe. Learn how to capture hand landmarks, create feature vectors, and build a stable gesture recognition system.

Dec 26, 2025

I’ve been curious about hand-gesture detection, especially for web applications, since I mainly work as a frontend engineer.
To explore this, I started experimenting with Google MediaPipe to understand how it works, how the callbacks operate, and how to detect gestures.
Previously, I tried a simple approach:
Camera → landmarks → indexTip (x, y, z) → compare (my recorded gesture) → sound
However, this method turned out to be unstable: a single point is sensitive to where the hand sits in the frame, how far it is from the camera, and natural frame-to-frame jitter.
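That first attempt boiled down to something like this (a minimal sketch; the threshold and function names are my own illustration, not the actual code). MediaPipe returns landmarks as normalized `{x, y, z}` points, and the index fingertip is landmark 8:

```javascript
// Euclidean distance between two MediaPipe landmark points.
function euclidean(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Naive approach: compare the live index fingertip against a recorded
// one with a fixed threshold. Works only if the hand stays in the same
// spot at the same distance from the camera.
function matchesRecordedGesture(indexTip, recordedTip, threshold = 0.05) {
  return euclidean(indexTip, recordedTip) < threshold;
}
```

The fixed threshold on raw coordinates is exactly why this breaks: moving the whole hand a few centimeters changes every coordinate, even when the gesture itself is unchanged.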
After learning the library more deeply, I realized MediaPipe actually provides 21 landmarks: 1 for the wrist and 20 across the fingers (4 each for the thumb, index, middle, ring, and pinky).
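For reference, MediaPipe's hand landmark indexing puts the wrist at index 0 and the fingertips at 4, 8, 12, 16, and 20, which can be captured as constants:

```javascript
// MediaPipe hand landmark layout: wrist + 4 joints per finger.
const WRIST = 0;
const FINGERTIPS = { thumb: 4, index: 8, middle: 12, ring: 16, pinky: 20 };
const NUM_LANDMARKS = 21; // 1 wrist + 5 fingers × 4 joints
```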
With this, a more robust approach is:
Camera → landmarks → FEATURE VECTOR (using all 21 points) → gesture classification → sound
Result:
This method is much more stable and accurate because it captures the full hand posture instead of relying on a single point.
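One way to build such a feature vector (a sketch of the idea, not the exact code behind the demo): translate all 21 landmarks into a wrist-relative frame and normalize the scale, so the vector no longer depends on where the hand is in the frame or how far it is from the camera, then classify by nearest recorded gesture:

```javascript
// Build a 63-number feature vector (21 landmarks × x, y, z) that is
// invariant to hand position and distance from the camera.
function toFeatureVector(landmarks) {
  const wrist = landmarks[0];
  // Translate: express every point relative to the wrist.
  const relative = landmarks.map(p => [p.x - wrist.x, p.y - wrist.y, p.z - wrist.z]);
  // Scale: divide by the largest wrist-to-point distance.
  const scale = Math.max(...relative.map(([x, y, z]) => Math.hypot(x, y, z))) || 1;
  return relative.flatMap(p => p.map(v => v / scale));
}

// Classify by nearest recorded gesture (Euclidean distance in feature space).
function classify(vector, recordedGestures) {
  let best = null;
  let bestDist = Infinity;
  for (const [name, ref] of Object.entries(recordedGestures)) {
    const dist = Math.hypot(...vector.map((v, i) => v - ref[i]));
    if (dist < bestDist) { bestDist = dist; best = name; }
  }
  return best;
}
```

Because the whole posture contributes to the distance, a small wobble in one fingertip barely moves the vector, which is what makes the classification stable.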
🔗 Link: https://hand-humorism.alimnfl.com/
#frontend #machinelearning #this-sound-just-for-entertain :)