Abstract:
Domestic service robots have recently attracted close attention from researchers. When collaborating with humans, robots should be able to clearly understand the instructions given by their users. Voice interfaces are frequently used as a means of interaction between users and robots, as they require minimal effort from the users. However, the information conveyed through voice instructions is often ambiguous and cumbersome because it includes imprecise terms. Voice instructions are often accompanied by gestures, especially when referring to objects, locations, directions, etc. in the environment. However, the information conveyed solely through gestures is also imprecise. Therefore, it is more effective to consider a multimodal interface rather than a unimodal interface in order to understand user instructions. Moreover, the information conveyed through gestures can be used to improve the understanding of user instructions related to object placement. This research proposes a method to enhance the interpretation of user instructions related to object placement by jointly evaluating the information conveyed through voice and gestures. The main objective of the system is to enhance the correlation between the user's expectation and the actual placement of the object by interpreting the uncertain information included in user commands. Furthermore, several human studies have been carried out to identify the factors that may influence object placement and their level of influence. The proposed system is capable of adapting its interpretation to the spatial arrangement of the robot's workspace as well as to the position and orientation of the human user. A fuzzy logic system is proposed to evaluate the information conveyed through these two modalities while considering the arrangement, size, and shape of the workspace.
Experiments have been carried out to evaluate the performance of the proposed system. The experimental results validate the performance gain of the proposed multimodal system over unimodal systems.