KalidoKit: driving avatars with Facemesh, Blazepose, Handpose, and Holistic

by Admin on 22 Nov 2021

KalidoKit is a blendshape and kinematics solver that works with the output of several tracking models: Facemesh, Blazepose, Handpose, and Holistic. Let's look at what it can do.

The avatar is driven by the movements of a real person's limbs, face, and hands.

The main application of this technology is virtual streamers (VTubers).

It can, for example, make an avatar dance.

It can also capture whole-body movements, facial expressions, and gestures together, as in the animation at the beginning of the article.

Beyond driving avatars, you can use your imagination to build other small, fun applications.

The project is built on TensorFlow.js.

Project address: https://github.com/yeemachine/kalidokit

Combined with an avatar rendering engine, the captured keypoints can drive 2D and 3D avatars to achieve the effect shown at the beginning of the article.

Both Live2D models and 3D VRM models can be driven this way.

There is too much technology involved to cover in one article, so today we focus on the basic keypoint detection techniques: face keypoint detection, human pose estimation, and hand pose estimation.

Face keypoint detection

Face keypoint detection comes in sparse and dense variants.

The basic sparse variant detects 68 keypoints.

Generally speaking, 68 keypoints are enough to detect closed eyes, head pose, and whether the mouth is open or closed.
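
As a rough illustration of how closed eyes and an open mouth can be read off 68 keypoints, here is a minimal Python sketch. It assumes the common iBUG 68-point layout (0-indexed: points 36-41 and 42-47 outline the two eyes, points 60-67 the inner lips) and uses the well-known eye aspect ratio heuristic. The function names and the 0.2 threshold are illustrative assumptions, not something taken from KalidoKit.

```python
import math


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def eye_aspect_ratio(eye):
    """Eye aspect ratio for six (x, y) points ordered p1..p6 around the eye.

    p1 and p4 are the eye corners; (p2, p6) and (p3, p5) are the
    upper/lower eyelid pairs. The ratio shrinks toward 0 as the eye closes.
    """
    p1, p2, p3, p4, p5, p6 = eye
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))


def eyes_closed(landmarks, threshold=0.2):
    """True if the mean eye aspect ratio is below an assumed threshold."""
    first_eye = landmarks[36:42]   # assumed iBUG indices for one eye
    second_eye = landmarks[42:48]  # assumed iBUG indices for the other eye
    ear = (eye_aspect_ratio(first_eye) + eye_aspect_ratio(second_eye)) / 2.0
    return ear < threshold


def mouth_open_ratio(landmarks):
    """Inner-lip opening height divided by inner-lip width.

    Assumes 60/64 are the inner mouth corners and 62/66 the inner
    top and bottom lip centers.
    """
    height = dist(landmarks[62], landmarks[66])
    width = dist(landmarks[60], landmarks[64])
    return height / width
```

Because both measures are ratios of distances, they do not depend on how large the face appears in the image. In practice the thresholds would need tuning per camera and per person.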

Of course, there are also denser keypoint detectors.

Some applications, such as beauty filters, need a dense keypoint detection algorithm with thousands of keypoints.

The idea is the same in both cases: regress the coordinates of the keypoints. These models are usually used together with a face detection algorithm.
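
To show how a keypoint model and a face detector typically fit together, here is a small Python sketch. It assumes a two-stage pipeline in which the detector returns a face box as (x, y, width, height) in pixels and the keypoint model returns coordinates normalized to [0, 1] within that box. Both conventions are assumptions for illustration; real models differ in their output formats.

```python
def to_image_coords(normalized_points, face_box):
    """Map keypoints from face-box-relative coordinates to image pixels.

    normalized_points: iterable of (u, v) with u, v in [0, 1], relative
                       to the top-left corner of the face box (assumed).
    face_box:          (x, y, width, height) of the detected face in pixels.
    """
    x, y, width, height = face_box
    return [(x + u * width, y + v * height) for u, v in normalized_points]


# Usage: a face detected at (100, 50) with size 200x100 pixels.
points = to_image_coords([(0.5, 0.5), (0.0, 1.0)], (100, 50, 200, 100))
print(points)
```

The first point is the center of the face box and the second its bottom-left corner, so the sketch only rescales and shifts; it does no detection itself.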

For those who want to learn face keypoint detection algorithms, we recommend two introductory projects.

  1. https://github.com/1adrianb/face-alignment
  2. https://github.com/ChanChiChoi/awesome-Face_Recognition

One is a basic introductory project, and the other integrates the mainstream algorithms for face keypoints.

Human pose estimation

Human pose estimation is also a fundamental problem in computer vision.

As the name suggests, it estimates the positions of human body keypoints, such as the head, left hand, and right foot.

It is generally divided into four tasks:

  1. Single-Person Skeleton Estimation (SPSE)
  2. Multi-Person Pose Estimation
  3. Video Pose Tracking
  4. 3D Skeleton Estimation

Simply put, it detects the joints of the human skeleton in order to determine the body's pose.
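
Once the joints are detected, a pose can be described by the angles between them. The following Python sketch computes the angle at one joint from three 2D keypoints, for example shoulder, elbow, and wrist. The function name and the choice of 2D coordinates are assumptions for illustration.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b, between segments b->a and b->c.

    a, b, c are (x, y) keypoints, e.g. shoulder, elbow, wrist.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("keypoints must not coincide with the joint")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


# Usage: an elbow bent at a right angle.
print(joint_angle((0, 1), (0, 0), (1, 0)))
```

A fully extended arm gives an angle close to 180 degrees, and a tightly bent one an angle close to 0.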

Human pose estimation has a wide range of applications: pose detection and action prediction for pedestrians in street scenes in autonomous driving; pedestrian re-identification and monitoring of specific actions in security; and special effects in the film industry.

For those who want to learn, you can read this compiled paper at:


Hand pose estimation

Hand joints are more flexible and agile, and fingers often occlude one another, so hand pose estimation is somewhat more complicated.

The principle, however, is similar to human pose estimation.

Besides regular gesture recognition, it can also be used for special effects.
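
As a simple example of gesture recognition on top of hand keypoints, the Python sketch below reports which fingers are extended. It assumes the 21-keypoint hand layout used by MediaPipe-style hand models (0 is the wrist; the index, middle, ring, and pinky fingers have their PIP joints at 6, 10, 14, 18 and their tips at 8, 12, 16, 20). The heuristic is an illustrative assumption: a finger counts as extended if its tip is farther from the wrist than its PIP joint. The thumb is left out because it needs a different test.

```python
import math

# (PIP joint index, fingertip index) per finger, assuming the
# 21-keypoint MediaPipe-style hand layout.
FINGERS = {
    "index": (6, 8),
    "middle": (10, 12),
    "ring": (14, 16),
    "pinky": (18, 20),
}


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def extended_fingers(hand):
    """Return the names of fingers whose tip is farther from the wrist
    than the finger's PIP joint (a rough 'finger is straight' heuristic).

    hand: list of 21 (x, y) keypoints.
    """
    wrist = hand[0]
    return [
        name
        for name, (pip, tip) in FINGERS.items()
        if dist(hand[tip], wrist) > dist(hand[pip], wrist)
    ]
```

With this helper, a "victory" sign would show up as only the index and middle fingers being extended. Because it compares distances to the wrist, the check does not depend on how the hand is rotated in the image.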

In fact, many effects applied to people rely on these keypoints to position the effect correctly.
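
To make the positioning idea concrete, here is a minimal Python sketch that derives where to draw an overlay, such as a pair of glasses, from two eye keypoints. The function name, the choice of eye centers as anchors, and the reference eye distance of 100 pixels are all assumptions for illustration.

```python
import math


def overlay_transform(eye_a, eye_b, base_eye_distance=100.0):
    """Compute placement of an overlay from two eye-center keypoints.

    eye_a, eye_b:       (x, y) pixel positions of the eye that appears on
                        the left of the image and the one on the right.
    base_eye_distance:  eye distance, in pixels, at which the overlay
                        artwork is assumed to fit at scale 1.0.

    Returns (center, scale, rotation_degrees).
    """
    dx = eye_b[0] - eye_a[0]
    dy = eye_b[1] - eye_a[1]
    center = ((eye_a[0] + eye_b[0]) / 2.0, (eye_a[1] + eye_b[1]) / 2.0)
    scale = math.hypot(dx, dy) / base_eye_distance
    rotation = math.degrees(math.atan2(dy, dx))
    return center, scale, rotation


# Usage: a level face with eyes 100 pixels apart.
print(overlay_transform((100, 100), (200, 100)))
```

The overlay follows the face as it moves closer (larger scale), farther away (smaller scale), or tilts (non-zero rotation), which is the behavior most keypoint-driven effects need.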

As above, to learn, you can see this integrated material at:


