Computer Vision News - March 2022

Computer Vision Tool
INTRODUCTION TO MEDIAPIPE (1/2)
by Marica Muffoletto @maricaS8

Hi everyone, and welcome to a new full issue of Computer Vision News magazine and another review of a great computer vision tool that employs machine learning.

MediaPipe is an open-source framework (available here) for machine learning solutions. Although it is still in the alpha stage, in the 2 years since its first release MediaPipe has built up a huge number of demos and projects that demonstrate its use. MediaPipe is free, written in C++, and can be deployed to any platform, from WebAssembly to Android to macOS.

MediaPipe ML pipelines use a wide range of algorithms, from classical computer vision software to state-of-the-art deep learning networks such as MobileNetV3, introduced by Google itself in 2019. MediaPipe provides flexibility, through a modular architecture, and speed, which comes from GPU acceleration and multi-threading. According to its users, it is in fact much faster than its competitors, which puts it among the best tools in the field.

What does it offer?

MediaPipe can be used for building multimodal (e.g. video, audio, any time series data), cross-platform (i.e. Android, iOS, web, edge devices) applied ML pipelines. It offers solutions for several computer vision problems, including:

1. Face Detection: a solution that comes with 6 landmarks and multi-face support.
2. Face Mesh: an algorithm that estimates 468 3D face landmarks in real time, even on mobile devices. It employs machine learning (ML) to infer the 3D surface geometry, requiring only a single camera input without the need for a dedicated depth sensor.
3. Iris: accurate iris estimation, able to track landmarks involving the iris, pupil and eye contours using a single RGB camera, in real time, without the need for specialized hardware.
4. Hands: a high-fidelity hand and finger tracking solution. It employs machine learning to infer 21 3D landmarks of a hand from just a single frame.
5. Pose: infers 33 3D landmarks and a background segmentation mask for the whole body from RGB video frames.
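
To make the list above a little more concrete, here is a minimal Python sketch of how the Hands solution might be called on a single image. It assumes the `mediapipe` Python package and its `mp.solutions.hands` interface as documented around the time of writing. The helper names `detect_hand_landmarks` and `to_pixels` and the `EXPECTED_LANDMARKS` table are made up for this illustration, and the assumption that x and y come back normalized to the image size is taken from MediaPipe's Python documentation, not from this article.

```python
# Landmark counts per solution, as quoted in the list above.
EXPECTED_LANDMARKS = {"face_detection": 6, "face_mesh": 468, "hands": 21, "pose": 33}


def detect_hand_landmarks(rgb_image):
    """Return one list of (x, y, z) tuples per detected hand.

    Assumes `rgb_image` is an H x W x 3 uint8 NumPy array in RGB order and
    that the installed `mediapipe` package exposes `mp.solutions.hands`.
    """
    import mediapipe as mp  # imported lazily; only needed for detection

    with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
        results = hands.process(rgb_image)

    # multi_hand_landmarks is None when no hand is found.
    detected = results.multi_hand_landmarks or []
    return [[(lm.x, lm.y, lm.z) for lm in hand.landmark] for hand in detected]


def to_pixels(landmarks, width, height):
    """Map normalized (x, y, z) landmarks to integer pixel coordinates.

    Assumes x and y are normalized to [0, 1] by image width and height.
    Results are clamped so points on or past the border stay inside the image.
    """
    pixels = []
    for x, y, _z in landmarks:
        px = min(max(int(x * width), 0), width - 1)
        py = min(max(int(y * height), 0), height - 1)
        pixels.append((px, py))
    return pixels
```

If a hand is found, each inner list returned by `detect_hand_landmarks` should hold `EXPECTED_LANDMARKS["hands"]` entries, and `to_pixels` turns them into coordinates that can be drawn on the original image.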
