
AIR – Analysis, Interpretation and Recognition of 2D (touch) and 3D Gestures for New Man-Machine Interactions


With the development of touch-screen and motion-capture technology, new forms of human-computer interaction have gained popularity in recent years. Several artificial intelligence methods have been designed to take advantage of the interaction potential offered by 2D and 3D action gestures. Gestural controls let the user trigger many actions simply by performing 2D or 3D gestures. Recognition of human actions (2D and 3D action gestures) has recently become an active research topic in Artificial Intelligence, Computer Vision, Pattern Recognition and Man-Machine Interaction.

This course addresses this emerging scientific topic: Analysis, Interpretation and Recognition of 2D (touch) and 3D Gestures for new Man-Machine Interactions. Technically, an action is a sequence generated by a human subject during the performance of a task. Action recognition is the process of labelling such a motion sequence according to the motion it depicts. The course presents the specifics of motion capture and modelling, as well as the recognition process, for these two kinds of actions (2D and 3D action gestures), and also the potential convergence of the scientific approaches used for each. The course also introduces notions of user-centered design, user needs, acceptability and user testing, to illustrate the importance of considering the user when developing such new human-computer interactions.


2D gesture, 3D gesture, classification, recognition, analysis, human-machine interaction, computer vision, pattern recognition




  • Signal acquisition, Pre-processing and Normalization

    • Motion capture (MoCap) systems that extract 3D joint positions using markers and a high-precision camera array.
    • Microsoft Kinect or Leap Motion sensor: the Shotton algorithm largely eases the task of extracting 3D joint positions.
    • Pen-based and multi-touch capture on touch screens: smartphones, tablet PCs and tangible surfaces that support simultaneous participation of multiple users
    • Morphology normalisation pre-processing
    • Joint trajectory modelling
  • Feature Extraction

    • 2D and 3D feature extraction
    • Sub-stroke representation
    • Temporal, shape and motion relations between sub-strokes
  • Artificial Intelligence for 2D and 3D Action recognition

    • Eager and lazy Recognition
    • Skeleton-based human action recognition
    • Several Recognition and Machine Learning Approaches:
      1. Graph modelling, matching and embedding algorithm
      2. Dynamic Time Warping (DTW)
      3. Hidden Markov Model (HMM)
      4. Support Vector Machine (SVM)
      5. Neural Network (NN)
      6. Reject Option…
  • 2D and 3D Segmentation and action detection

    • Direct manipulation and indirect commands
    • Early detection of an action, in an unsegmented stream
    • Temporal segmentation methods
    • Sliding Window approach
  • Human-centered design (ISO 9241-210) and test protocol

    • The goal of the user-centered design process is to obtain a product that is functional, operational and satisfying for the user, by applying human factors, ergonomics, and usability knowledge and techniques.
    • Test protocols
    • Data analysis
  • Example and demo
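
To make a few of the outline items concrete, here is a minimal sketch, not the course's own material, that combines three of the listed ideas: Dynamic Time Warping (DTW) as a distance between gesture sequences, a reject option based on a distance threshold, and a sliding window over an unsegmented stream. The function names (`dtw_distance`, `classify`, `detect`), the toy templates and the threshold value are all assumptions chosen for illustration. A gesture is represented here simply as a list of 2D points.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two sequences of points (tuples of floats)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance (Python 3.8+)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(sequence, templates, reject_threshold):
    """Nearest-template label, or None when the best match is too far (reject)."""
    label, best = min(
        ((lab, dtw_distance(sequence, tpl)) for lab, tpl in templates),
        key=lambda pair: pair[1],
    )
    return label if best <= reject_threshold else None


def detect(stream, templates, window, step, reject_threshold):
    """Slide a fixed-size window over an unsegmented stream of points.

    Returns a list of (start_index, label) for windows that are not rejected.
    """
    detections = []
    for start in range(0, len(stream) - window + 1, step):
        segment = stream[start:start + window]
        label = classify(segment, templates, reject_threshold)
        if label is not None:
            detections.append((start, label))
    return detections


# Hypothetical toy templates: one example trajectory per gesture class.
templates = [
    ("swipe_right", [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]),
    ("swipe_up", [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]),
]

# A slower swipe to the right: one extra sample, absorbed by the warping.
query = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(classify(query, templates, reject_threshold=1.0))

# An unsegmented stream: idle samples, then a swipe, then idle samples again.
idle = [(5.0, 5.0)] * 3
stream = idle + templates[0][1] + idle
print(detect(stream, templates, window=4, step=1, reject_threshold=1.0))
```

In this sketch the reject threshold does double duty: it discards isolated gestures that match no template, and in the sliding-window loop it filters out windows that contain only part of a gesture. A real system would typically normalise the trajectories first and choose the threshold from validation data.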

Acquired skills

Comprehensive view of the processing chain, from signal acquisition through pre-processing, classification and interpretation to user feedback.
Link between pattern recognition issues and human-machine interaction.
Link between 2D and 3D gesture recognition approaches.


Eric Anquetil (course coordinator), Richard Kulpa, Nathalie Girard