A great challenge in everyday manipulation tasks lies in the perceptual capabilities required for their successful accomplishment.
In our perception framework, RoboSherlock, we investigate techniques for combining knowledge processing and reasoning with robot perception mechanisms, while maintaining close integration with planning libraries.
Recognizing class/instance labels and the 6DoF pose of objects is necessary for a robot perception system, but it is not sufficient for achieving human-like manipulation of objects. Executing a task often means detecting functional parts of objects, deducing which object is missing from a scene, finding objects contained in other objects, and so on. These tasks are usually highly knowledge-intensive and go beyond the capabilities of the perception systems currently used in robotics. RoboSherlock acts as a middleware for perception algorithms, offering solutions for reasoning about the perception tasks the robot can execute and seamless integration of existing robot perception systems. It builds on the concept of unstructured information management (UIM), which was successfully applied in natural language processing for open-domain question answering.
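To make the UIM analogy concrete, the sketch below shows how an expert perception algorithm could be wrapped as an annotator that reads a shared analysis structure and writes its results back into it. All names here (Cas, Annotator, ShapeAnnotator) are hypothetical illustrations, not RoboSherlock's actual API.

```python
# A minimal sketch of the unstructured-information-management (UIM) idea:
# each perception expert is an "annotator" that reads a shared analysis
# structure and appends its own results to it. All names are hypothetical
# and for illustration only, not RoboSherlock's actual API.

from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    """One object hypothesis, e.g. a point cluster or an image region."""
    region: tuple                                   # (x, y, w, h) in the image
    annotations: dict = field(default_factory=dict)

@dataclass
class Cas:
    """Common analysis structure shared by all annotators in the pipeline."""
    image: object                                   # stand-in for sensor data
    hypotheses: list = field(default_factory=list)

class Annotator:
    """Base interface: every expert algorithm implements process()."""
    def process(self, cas: Cas) -> None:
        raise NotImplementedError

class ShapeAnnotator(Annotator):
    """Toy expert: labels each hypothesis by the aspect ratio of its region."""
    def process(self, cas: Cas) -> None:
        for hyp in cas.hypotheses:
            _, _, w, h = hyp.region
            hyp.annotations["shape"] = (
                "elongated" if max(w, h) > 2 * min(w, h) else "compact")
```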
Given an input image or a sequence of images, RoboSherlock first generates object hypotheses. These hypotheses can be, for example, supporting planes, point clusters, or image regions. They are then investigated further by expert perception algorithms that return annotations such as shape, colour, location, etc. Using these pieces of information, RoboSherlock decides which object is needed for a given manipulation task and identifies the functional parts of these objects (e.g., the handle of a cup or the opening of a bottle).
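Continuing the sketch above, the end-to-end flow could be summarized as: generate hypotheses once, let every expert annotator enrich them, then select the hypotheses whose annotations satisfy the task query. Again, segment() and perceive() are hypothetical stand-ins chosen for illustration, not functions from RoboSherlock itself.

```python
# Continuing the sketch above: generate hypotheses, run every expert
# annotator over them, then pick the hypotheses matching a task query.
# segment() and perceive() are hypothetical stand-ins for illustration.

def segment(image) -> list:
    """Stand-in generator; a real one would detect planes or point clusters."""
    return [Hypothesis(region=(10, 10, 40, 120)),   # tall, bottle-like region
            Hypothesis(region=(60, 30, 50, 55))]    # roughly square region

def perceive(image, annotators, query: dict) -> list:
    cas = Cas(image=image, hypotheses=segment(image))
    for annotator in annotators:                    # each expert adds results
        annotator.process(cas)
    # keep only hypotheses whose annotations satisfy the task query
    return [h for h in cas.hypotheses
            if all(h.annotations.get(k) == v for k, v in query.items())]

# e.g. "find the elongated object" for a grasping task
matches = perceive(image=None, annotators=[ShapeAnnotator()],
                   query={"shape": "elongated"})
print(matches)   # -> only the tall hypothesis remains
```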
More details about RoboSherlock, its data representation, and the results it produces can be found on the project website.