
Real-Time Single Camera Hand Gesture Recognition System for Remote Deaf-Blind


Abstract. This paper presents a fast approach for marker-less full-DOF hand tracking that relies only on depth information from a single depth camera. The system can be useful in many applications, ranging from tele-presence to remote control of robotic actuators and interaction with 3D virtual environments. We applied the proposed technology to enable remote transmission of signs from Tactile Sign Languages (i.e., Sign Languages with tactile feedback), allowing non-invasive remote communication not only among deaf-blind users, but also with deaf, blind and hearing people who are proficient in Sign Languages. We show that our approach paves the way to fluid and natural remote communication for deaf-blind people, which has been impossible until now. This system is a first prototype for the PARLOMA project, which aims at designing a remote communication system for deaf-blind people.

Keywords: Real-time Markerless Hand Tracking, Hand Gesture Recognition, Tactile Sign-Language Communication, Haptic Interface

1 Introduction

The problem of tracking the joints of human hands and recognizing a wide set of signs from single marker-less visual observations is of both theoretical interest and practical importance. In recent years, promising results in terms of performance and robustness have been achieved, due in part to rapid advances in modern sensing technologies.

Many approaches have been presented in the hand gesture recognition area [16, 20, 27]; they differ in the algorithm used, in the type of camera, in their theoretical justification, and so on. Thanks to recent price reductions, new data sources have become available to mass consumers, such as time-of-flight [7] and structured light cameras [5], which ease the task of hand gesture recognition. However, a robust solution has yet to be found, as existing approaches very often require an intensive tuning phase, the use of coloured or sensorized gloves, or a working framework that embeds more than one imaging sensor. Currently, traditional vision-based hand gesture recognition methods do not achieve satisfactory performance for real-life applications [15]. On the other hand, the development of RGB-D cameras, which can generate depth images with little noise even in very low illumination conditions, has recently accelerated the search for innovative solutions. In addition, the advent of modern programming frameworks for GPUs enables real-time processing even for complex approaches (i.e., those that do not rely on overly simplistic assumptions), which would otherwise be much slower if executed on CPUs.

Hand gestures are a natural part of human interaction, both with machines and with other humans. As they represent a simple and intuitive way of conveying information and commands (e.g., zoom in or out, drag), hand gesture recognition is of great importance for Human Machine Interaction (HMI) as well. Human interaction relies heavily on hand gestures, above all when subjects with severe disabilities are involved and speech is absent, as in Sign Language (SL) based interaction. In both these fields (HMI and SL based interaction) it is necessary to support real-time unaided gesture recognition, as markers or gloves are cumbersome and hinder natural interaction. It is preferable to develop a system that relies on a single camera and does not require any calibration or tuning phase. Extensive initialization would represent a barrier for users who are not comfortable with technology, in particular when severe disabilities such as deaf-blindness are targeted.

Indeed, deaf-blind people can use neither speech nor standard SLs, in the latter case because they cannot perceive the signs made by the signer. For this reason their communication is based on a different mechanism: the receiver's hands are placed on those of the signer in order to follow the signs made. Since the communication is still based on SL, but with tactile feedback, this variant is called tactile SL (tSL). Therefore, while two hearing speakers or two deaf signers can communicate either in person or remotely (through phone calls or video-calling systems), as of now there is no way for two deaf-blind persons to communicate with each other if they are not in the same place, given the basic need to touch each other's hands. Moreover, one-to-many communication is not possible, and the same signs must be repeated in front of each listener if the same message is to reach several people [19].

In this context, the PARLOMA project[1] aims at designing a system able to capture messages produced in SL and reproduce them remotely in tSL, in order to overcome the spatial limitation posed by tSL communication. Indeed, the project lays the basis for experimenting with a “telephone for deaf-blind people”.

In this paper, we present an approach that makes remote communication between deaf-blind people feasible. The proposed solution is based on a reliable marker-less hand gesture recognition method, which is targeted at recognizing static signs from Italian SL (LIS) and can run at up to 30 fps, which is the maximum operating frequency of the Kinect sensor[2].

To show its effectiveness, in addition to a quantitative and qualitative analysis, we also present an experimental apparatus in which signs from a subset of the LIS hand-spelling alphabet are recognized and sent over the Internet, so that a compatible robotic actuator can reproduce them remotely and any listener proficient in LIS can understand the meaning of what is signed. This work is a first step toward complete remote deaf-blind communication, in which more complex and also dynamic signs will be recognized.

This paper is organized as follows: Section 2 reviews related work; Section 3 discusses the theoretical background and practical implementation of our solution; Section 4 presents the results of our experiments and summarizes the pipeline of the remote communication system we developed; finally, Section 5 presents our conclusions.

  • [1]
  • [2]