|
- Abstract
The identification of the material of a real object is a fundamental aspect of noncontact visual perception. Material recognition helps a person identify the correct object among various similar candidates; for example, a person can easily distinguish a paper cup from a ceramic cup by sight alone. Inspired by this natural human capability, we turned our attention to material recognition for robot vision applications. A service robot that works in a home environment and interacts with human users needs a vision system that is close to the human one. In the real world, a person uses several attributes to specify a particular object unambiguously. Consequently, to establish proper interaction between user and robot, a robot should be able to understand and extract such attributes of an object, such as shape, size, and color. Here we are particularly interested in an object's material, because humans also often use it to specify a target object. Material information, or alternatively information on the size of surface microparticles, can be obtained from the optical signal reflected by the object's surface. To enhance the discriminating features among various types of surfaces, our proposed method uses incident light of a longer wavelength.
Initially, we proposed a method that recognizes an object's material using only a time-of-flight range sensor, as shown in fig. We measure reflection intensity patterns with respect to surface orientation for objects of various materials and classify these patterns with a Random Forest classifier to identify the material of the reflecting surface. Although it showed promising results, the method needed a long processing time and could handle only objects with homogeneous surfaces, such as single-color objects. In our current research, we have solved these problems and present a modified version of the method that works for multicolor objects with a short processing time. We combine a conventional USB camera with the range sensor and detect same-color regions for reflection pattern estimation. We have also devised a filtering method to reduce the range sensor noise. This allows us to use a small number of data points for estimation and increases the processing speed. Experiments using 50 real household objects in 5 material classes show that the method is usable for our interactive robot vision system under development.
Our previous method needs only a time-of-flight range sensor to identify the material of a target object. Although it produced promising results, it needed a long processing time and could handle only objects with homogeneous surfaces, such as single-color objects. We have removed these limitations and present a modified version of our previous method that works for multicolor objects with a short processing time. The visual appearance of an object's surface depends on several factors: the illumination condition, the geometric structure of the surface, and the surface reflectance properties, often characterized by the bidirectional reflectance distribution function (BRDF). We consider this BRDF to recognize object material. In our study we modify the Torrance-Sparrow model to represent the surface reflectance components. We choose a large number of small patches on the object surface and obtain their directions and intensities, as shown in fig. We then fit the data to our modified Torrance-Sparrow model to obtain a reflection pattern curve. The data are normalized by dividing by the intensity value at 0 degrees. We construct an 81-dimensional feature vector by sampling the normalized curve from 0 to 80 degrees at one-degree intervals. Different materials show different patterns. We obtain many patterns from training data and construct a Random Forest classifier.
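The feature construction above can be sketched as follows. This is a minimal illustration, not our actual implementation: `reflection_curve` stands in for the fitted (modified Torrance-Sparrow) curve, and the synthetic Gaussian-like lobe is only a placeholder for demonstration.

```python
import numpy as np

def reflection_feature(reflection_curve):
    """81-dim feature: curve values at 0..80 degrees, normalized by the 0-degree value."""
    angles = np.arange(0, 81)                                   # one-degree intervals
    values = np.array([reflection_curve(a) for a in angles], dtype=float)
    return values / values[0]                                   # normalize at 0 degrees

# Synthetic stand-in for a fitted reflection pattern curve:
curve = lambda a: np.exp(-(a / 30.0) ** 2)
feat = reflection_feature(curve)
print(feat.shape)   # (81,)
print(feat[0])      # 1.0
```

Feature vectors of this form, collected from training objects, are what the Random Forest classifier is trained on.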
The range data collected by the 3D range sensor are very noisy, so in our previous method the orientation angles computed from the range data contained large errors. To cope with this, we used data from a large number of points and applied a complicated smoothing filter to each point, which made the method slow. In addition, we assumed that all patches have the same properties, so the method works only for single-color objects. The SwissRanger camera uses infrared light to illuminate the scene and has a visible-light elimination filter in front of its CCD array. However, we know empirically that reflection intensity values are smaller for darker objects. We examined this issue experimentally and found that the normalized reflection patterns are similar for the same material regardless of color. This allows our previous method to recognize an object's material regardless of its color, as long as the object has a single color and uniform brightness. However, if it has parts of different colors, the method cannot work.
To solve this problem, we have given up our policy of using only the SwissRanger and added a color camera to the range sensor. We set the camera so that we can obtain gray-scale values of the points corresponding to the range sensor data. We observed experimentally that the reflection intensity of a point is approximately proportional to its gray level in the gray-scale image. We have therefore devised two methods. The first is the normalization method: we normalize the IR intensity values by dividing them by the gray levels of the corresponding points.
The second is the equal gray-level method. We apply gray-level segmentation to the camera image, choose the largest region (or a set of regions with similar gray levels whose combined area is largest), and use only the data in those regions for recognition. The normalization method needs precise positional calibration between the range sensor and the gray-scale camera, and the gray levels of the camera are not very stable. The equal gray-level method is simple and practical as long as it can find enough data points. As we describe in the next section, we have modified our method to work with a small number of data points. We have therefore adopted the equal gray-level method. The whole process is illustrated in fig.
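A simplified sketch of the equal gray-level selection is shown below. For brevity it groups pixels globally by gray-level bin rather than performing true region segmentation, and the bin count and function names are our own choices, not part of the method description.

```python
import numpy as np

def equal_gray_mask(gray, n_bins=16):
    """Boolean mask of the pixels in the most populous gray-level bin."""
    bins = np.minimum((gray.astype(int) * n_bins) // 256, n_bins - 1)
    counts = np.bincount(bins.ravel(), minlength=n_bins)
    return bins == np.argmax(counts)     # only these pixels are used for recognition

gray = np.zeros((4, 4), dtype=np.uint8)
gray[:, :3] = 200                        # bright area covers 12 of 16 pixels
mask = equal_gray_mask(gray)
print(mask.sum())                        # 12
```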
The data from the SwissRanger 3D camera are noisy. There are two types of noise: saturation noise and random noise. Random noise occurs for all types of surfaces. Although it is hard to eliminate completely, we capture 20 range images of a scene and filter them to reduce the random noise.
If the surface of an object is glossy or shiny, like ceramic or metal, the CCD array of the SR4000 may become saturated and give falsified output, causing the saturation noise. In this case the most significant bit (MSB) of the intensity value is flagged and the pixel intensity takes the highest value, while the 3D values (x, y, and z) of the saturated pixel become zero. The device manufacturer suggests several ways to avoid this situation: reducing the integration time of the sensor, increasing the distance between the sensor and the object, or changing the orientation of the object surface. However, we cannot adopt any of these solutions because they directly interfere with our material identification method. Therefore, we have developed a filtering method to remove the saturation noise from the 3D depth map. If the system finds a saturated pixel on the surface of an object, it keeps the pixel's intensity value unchanged as the flagged saturation value. It then fits a quadratic surface to the range data of the 5x5 neighboring points around the saturated pixel and estimates the range of the saturated point from the fitted surface.
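The quadratic-surface fill can be sketched as below. This is an illustrative version under our own assumptions: saturated pixels are marked by a range value of zero, the fit uses least squares over the valid 5x5 neighbors, and all names are hypothetical.

```python
import numpy as np

def fill_saturated(depth, r, c):
    """Estimate the range at a saturated pixel (r, c) from a quadratic surface
    fitted to the valid values in its 5x5 neighborhood."""
    ys, xs = np.mgrid[-2:3, -2:3]                       # 5x5 neighborhood offsets
    patch = depth[r - 2:r + 3, c - 2:c + 3]
    valid = patch != 0                                   # saturated pixels read zero
    x, y, z = xs[valid], ys[valid], patch[valid]
    A = np.column_stack([x * x, y * y, x * y, x, y, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)         # fit z = ax^2+by^2+cxy+dx+ey+f
    return coef[5]                                       # surface value at the center

# Synthetic depth map with one simulated saturated pixel:
depth = np.fromfunction(lambda i, j: 0.01 * i * i + 0.5 * j + 10.0, (7, 7))
depth[3, 3] = 0.0
print(round(fill_saturated(depth, 3, 3), 3))             # recovers ~11.59
```

Because the synthetic surface is itself quadratic, the fit recovers the missing value exactly; on real range data the estimate is only as good as the local quadratic approximation.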
After removing the saturation noise, we have 20 saturation-noise-free range images. We then compute the median at each pixel location over these 20 images to obtain a range image with reduced random noise, and construct the reflection pattern from this image. In our previous method, we generated surface patches for all pixels. However, since the noise is now much reduced, we do not need a large number of pixels to estimate the reflection pattern; we experimentally determined that 20 pixels suffice.
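The per-pixel median over the 20 captures is a one-liner; the sketch below demonstrates it on synthetic data (the 176x144 frame size matches the SR4000, and the noise level is invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
true_depth = np.full((144, 176), 0.4)                        # SR4000-like frame, meters
frames = true_depth + rng.normal(0, 0.01, (20, 144, 176))    # 20 noisy captures

denoised = np.median(frames, axis=0)     # median at each pixel over the 20 frames
print(float(np.abs(denoised - true_depth).mean()) < 0.01)    # True
```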
To select these 20 pixels, we first cluster all pixels into 20 groups according to pixel intensity. From each group we then select the highest-intensity pixel and generate a surface patch from that pixel and its 5x5 neighbors to estimate the reflection pattern, as shown in fig.
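The pixel selection can be sketched as follows. Here the 20 intensity clusters are formed by simply sorting pixels and splitting them into equal-size groups; this is one straightforward choice, not necessarily the clustering used in our system.

```python
import numpy as np

def select_pixels(intensity, k=20):
    """Pick the brightest pixel from each of k intensity groups."""
    flat = intensity.ravel()
    order = np.argsort(flat)                 # pixel indices, ascending intensity
    groups = np.array_split(order, k)         # k equal-size intensity clusters
    chosen = [g[-1] for g in groups]          # brightest pixel of each cluster
    return np.unravel_index(chosen, intensity.shape)

rng = np.random.default_rng(1)
intensity = rng.random((144, 176))
rows, cols = select_pixels(intensity)
print(len(rows))    # 20
```

Each of the 20 returned pixel locations then seeds a surface patch built from its 5x5 neighborhood.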
For the experiments we prepared 50 household objects of various sizes, shapes, and colors, including multicolor objects. The objects were divided into 5 material groups (plastic, paper, wood, fabric, and ceramic), with 10 objects per group. In our reflection pattern classification experiment, we took 3 objects from each class out of the 50 to train the classifier. When obtaining the reflection patterns in the training stage, we placed each object about 40 cm in front of the range camera. We then performed the recognition experiment 5 times for each of the remaining objects, each time randomly changing the orientation of the target object with respect to the viewing direction. The recognition rate of the method is 71.5%, and the total time required to recognize each object is 2 seconds. Our previous method was not able to recognize ceramic, because ceramic objects yield a large amount of saturation noise. The modified method can identify ceramic and works for multicolor objects, and even under these harder conditions it shows equivalent recognition results. Although the figure itself may not seem high, the recognition rate is reasonable because surface roughness varies considerably even among objects of the same material. This level of recognition can be useful in the interactive object recognition framework. In addition, the processing time has been much reduced. Since we capture 20 images with the sensor to reduce noise, we need at least about a second; we would need to improve the sensor to further reduce the recognition time.
We combine a conventional USB camera with the range sensor and detect same-color regions for reflection pattern estimation. We have devised a filtering method to reduce the range sensor noise, which allows us to use a small number of data points for estimation and increases the processing speed. Experimental results are promising. We are now working on an interactive robot vision system in which the robot uses this material recognition method to respond to users' utterances referring to material. Although we will work to further shorten the processing time, the current processing time may not be a major problem in this application, since the robot can execute this process in the background before the actual interaction about material.
Our proposed method may also be applicable in many other fields of machine vision involving surface investigation, for example, estimating a person's age by analyzing 2D and 3D facial images taken with visible and NIR light sources under arbitrary environmental conditions.
- Publications
|