Technical Paper on Thresholding Digital Image Processing for Vision Systems

Robotic systems are gaining popularity in industry owing to the precision, speed and quality they achieve. The most important element of these systems is the vision system, and several techniques are used to improve its performance. This paper presents one such technique, thresholding, which digitizes a given image so that it can be processed by the vision computer. Thresholding is the most popular technique in industrial robotics applications because it is inexpensive, easy to set up, and well suited to applications in which the lighting at the scene of interest can be controlled. The paper provides the background needed to understand image analysis with vision systems and then deals with image processing techniques and thresholding.

Keywords:

 CCD camera, vision, thresholding, gray level, image processing.

Introduction:

Machine vision has been an active area of research for more than 35 years and is gradually being introduced into real-world applications. Most of the applications developed and built in the seventies and eighties were based on dedicated methods combined with specific application knowledge: a typical application used special sensory equipment such as laser range cameras, often in combination with well-controlled (artificial or structured) lighting. For the description of geometric information or motion it was often assumed that the environment was constrained to a limited number of well-defined objects that were modeled a priori, and most of their characteristics were exploited in the image processing and analysis. Little insight into the general problem of image-based scene description and interpretation was gained from these applications, since they were to a large extent based directly on image-derived features. The approaches generally lacked robustness, often became ill posed under even slight variations in the conditions of the original application, and produced very little in the way of general "high-level" algorithms. The reconstruction approach set forward by Marr in his now famous book "Vision" thus received little attention in terms of industrial use.

Recent research has, however, indicated that some of the robustness and ill-posedness problems may be eliminated if the algorithms are applied in the context of a controllable sensor system. The explicit control of the sensory system to improve robustness and eliminate ill-posed conditions is often termed "active vision". In addition, it has been suggested that the general machine vision problem of providing a full 3-D symbolic description of the environment without any prior knowledge is much too hard to be solved at this time, and that robust solutions may be found provided that task-specific knowledge is utilized in the design and processing for a specific application. Such an approach to machine vision is termed "purposive vision". The aim of purposive vision is not to fall back on the application-specific strategy adopted in the seventies and eighties, but rather to complement "general" machine vision techniques with domain-specific information so that the entire system can be controlled to provide the needed robustness. Control is thus a significant issue in purposive vision.

A significant application area for machine vision is robotics. Much of the work in robotics has been based on other sensory modalities, such as ultrasonic sonar, because it has been difficult to obtain sufficiently good depth data using "shape from X" techniques. The introduction of a priori information may, however, change this situation. In well-known scenarios it is possible to construct a model of the environment and subsequently compare sensor readings with predictions obtained from that model. Progress in areas such as CAD modeling means that it is now possible to integrate CAD systems into the control of robots, and the introduction of such models in turn makes it possible to exploit machine vision methods, since the needed a priori information may be extracted from the CAD model. To ensure that the systems constructed can be used not just for one specific application but for a variety of applications, the trend is towards layered control.
In layered control the hardware of the robot is interfaced to the rest of the system through device-level software. This software transforms robot-specific commands and feedback into a standard representation that may be shared by several different platforms, so the robot can be changed without a complete redesign of all the software. Above the device level is a set of control layers which handle path planning, control and the associated perception. In robot control, at least for mobile robots, there is a trend towards using a set of different layers, each responsible for a specific task. For example, one layer may be responsible for "survival", making sure that the robot does not bump into objects in the environment and that it moves if it is on a collision course with another object. Another layer might be responsible for constructing a map of the environment to facilitate navigation or localization of target objects. The use of different layers for different tasks differs from the approach traditionally used in robotics, where control is integrated in a single "perceive-plan-control" cycle. The two approaches to robot control are illustrated in figure 1.
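As an illustration of the device-level idea described above, the following sketch shows one possible way of expressing a standard representation and a "survival" layer in Python. All class names, method names and the clearance value are hypothetical choices made for this example; they do not correspond to any particular robot vendor's API.

    # Minimal sketch of layered control: a device-level abstraction plus one layer.
    from abc import ABC, abstractmethod


    class RobotDevice(ABC):
        """Device level: hides robot-specific commands and feedback behind a
        standard representation shared by the higher control layers."""

        @abstractmethod
        def move_to(self, x: float, y: float, theta: float) -> None:
            """Translate a standard pose command into robot-specific motion."""

        @abstractmethod
        def read_pose(self) -> tuple:
            """Return the robot's pose in the shared (x, y, theta) convention."""


    class SurvivalLayer:
        """Example control layer: vetoes motion when an obstacle is too close."""

        def __init__(self, device: RobotDevice, min_clearance: float = 0.3):
            self.device = device
            self.min_clearance = min_clearance  # assumed value, in metres

        def motion_allowed(self, nearest_obstacle_distance: float) -> bool:
            # The layer only checks for collisions; other layers plan the path.
            return nearest_obstacle_distance > self.min_clearance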



Introduction to Vision Systems:

The typical vision system consists of the camera and digitizing hardware, a digital computer, and hardware and software necessary to interface them. This interface hardware and software is often referred to as a preprocessor. The operation of the vision system consists of three functions:
1. Sensing and digitizing image data.
2. Image processing and analysis.
3. Application.
The sensing and digitizing functions involve the input of vision data by means of a camera focused on the scene of interest. Special lighting techniques are frequently used to obtain an image of sufficient contrast for later processing. The image viewed by the camera is typically digitized and stored in computer memory. The digital image is called a frame of vision data and is frequently captured by a hardware device called a frame grabber, which is capable of digitizing images at a rate of 30 frames per second. Each frame consists of a matrix of data representing the projection of the scene sensed by the camera. The elements of the matrix are called pixels, and their number is determined by a sampling process performed on each image frame.
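As a rough illustration of the sensing and digitizing step, the sketch below grabs a single frame and treats it as a matrix of pixels. OpenCV is used here only as a convenient software stand-in for a frame grabber; the camera index 0 is an assumption, and the 30 frames-per-second rate is a property of the hardware rather than something this code enforces.

    # Grab one frame and inspect it as a matrix of sampled gray levels.
    import cv2

    cap = cv2.VideoCapture(0)          # open the default camera (assumed index)
    ok, frame = cap.read()             # grab a single frame
    if ok:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        rows, cols = gray.shape        # the frame is a matrix of pixels
        print("Frame size:", rows, "x", cols, "pixels")
        print("Intensity of pixel (0, 0):", gray[0, 0])
    cap.release()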

A single pixel is the projection of a small portion of the scene which reduces that portion to a single value.
The value is a measure of the light intensity for that element of the scene. Each pixel intensity is converted to a digital value.
The digitized image matrix for each frame is stored and then subjected to the image processing and analysis functions for data reduction and interpretation of the image.
Typically an image frame will be thresholded to produce a binary image, and then various feature measurements will further reduce the data representation of the image.
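A minimal sketch of this data-reduction idea is given below: a gray frame is thresholded into a binary image and then reduced further to two simple feature measurements (area and centroid). The synthetic test frame and the threshold value of 128 are assumptions chosen for illustration, not values prescribed by the paper.

    import cv2
    import numpy as np

    # Stand-in frame: a dark background with one bright rectangular "object".
    gray = np.zeros((240, 320), dtype=np.uint8)
    gray[100:140, 150:200] = 200

    _, binary = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)  # assumed cutoff

    ys, xs = np.nonzero(binary)            # coordinates of the "on" pixels
    area = xs.size                         # feature 1: object area in pixels
    centroid = (xs.mean(), ys.mean())      # feature 2: object centroid
    print("Area:", area, "Centroid:", centroid)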

Image Processing versus Image Analysis:

Image processing relates to the preparation of an image for later analysis and use. Images captured by a camera or a similar technique (e.g., by a scanner) are not necessarily in a form that can be used by image analysis routines. Some may need improvement to reduce noise, others may need to be simplified, and still others may need to be enhanced, altered, segmented, filtered, etc. Image processing is the collection of routines and techniques that improve, simplify, enhance, or otherwise alter an image. Image analysis is the collection of processes in which a captured image that has been prepared by image processing is analyzed in order to extract information about the image and to identify objects or facts about the object or its environment.
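The distinction can be illustrated with a short sketch: the first step prepares the image (processing), while the second extracts a fact from it (analysis). The synthetic image, the filter size and the threshold value are illustrative assumptions rather than steps mandated by the text.

    import cv2
    import numpy as np

    # Stand-in image: two bright parts on a slightly noisy dark background.
    img = np.full((200, 200), 40, dtype=np.uint8)
    img[30:70, 30:70] = 180
    img[120:170, 110:170] = 200
    img = cv2.add(img, np.random.randint(0, 20, img.shape, dtype=np.uint8))

    # Image processing: prepare the image (smooth away the noise).
    processed = cv2.medianBlur(img, 3)

    # Image analysis: extract information (how many distinct regions are present).
    _, binary = cv2.threshold(processed, 100, 255, cv2.THRESH_BINARY)
    num_labels, labels = cv2.connectedComponents(binary)
    print("Regions found (including the background label):", num_labels)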

Two- and Three-Dimensional Images:

Although all real scenes are three dimensional, images can be either two or three dimensional. Two-dimensional images are used when the depth of the scene or its features need not be determined. As an example, consider defining the surrounding contour or the silhouette of an object; in that case it is not necessary to determine the depth of any point on the object. Another example is the use of a vision system for inspection of an integrated circuit board. Here, too, there is no need to know the depth relationship between different parts, and since all parts are fixed to a flat plane, no information about the surface is necessary. Thus, two-dimensional image analysis and inspection will suffice. Three-dimensional image processing deals with operations that require motion detection, depth measurement, remote sensing, relative positioning and navigation. All three-dimensional vision systems share the problem of coping with many-to-one mappings of scenes to images; to extract information from these scenes, image processing techniques are combined with artificial intelligence techniques. In this paper we consider a vision system for two-dimensional image processing only.

Acquisition of Images:

There are two types of vision cameras: analog and digital. Analog cameras are not very common anymore, but are still around; they used to be standard at television stations. Digital cameras are much more common and are mostly similar to each other. A video camera is a digital camera with an added videotape recording section; otherwise the mechanism of image acquisition is the same as in other cameras that do not record an image. Whether the captured image is analog or digital, in vision systems the image is eventually digitized. In digital form, all data are binary and are stored in a computer file or memory chip.


Vidicon Camera:

A vidicon camera is an analog camera that transforms an image into an analog electrical signal. The signal, a variable voltage (or current) versus time, can be stored, digitized, broadcast, or reconstructed into an image. With the use of a lens, the scene is projected onto a screen made up of two layers: a transparent metallic film and a photoconductive mosaic that is sensitive to light. The mosaic reacts to the varying intensity of light by varying its resistance; as the image is projected onto it, the magnitude of the resistance at each location varies with the intensity of the light. An electron gun generates and sends a continuous cathode beam through two pairs of capacitors (deflectors) that are perpendicular to each other. Depending on the charge on each pair of capacitors, the electron beam is deflected up or down and left or right, and is projected onto the photoconductive mosaic. At each instant, as the beam of electrons hits the mosaic, the charge is conducted to the metallic film and can be measured at the output port. The voltage measured at the output is V = IR, where I is the current (of the beam of electrons) and R is the resistance of the mosaic at the point of interest.
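The readout relation V = IR can be illustrated with a small toy model, assuming (purely for illustration) a fixed beam current and a mosaic whose resistance falls as the local light intensity rises. None of the numeric values below come from the paper.

    # Toy model of vidicon readout: V = I * R at the scanned point.
    beam_current = 1.0e-6       # electron-beam current I, in amperes (assumed)
    dark_resistance = 2.0e6     # mosaic resistance of a dark point, in ohms (assumed)

    def output_voltage(relative_intensity):
        """Resistance falls as light intensity rises; the output is V = I * R."""
        resistance = dark_resistance / (1.0 + 9.0 * relative_intensity)
        return beam_current * resistance

    for intensity in (0.0, 0.5, 1.0):   # dark, mid and bright scene points
        print(intensity, output_voltage(intensity))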


Digital Camera:

A digital camera is based on solid-state technology. As with other cameras, a set of lenses is used to project the area of interest onto the image area of the camera. The main part of the camera is a solid-state silicon wafer image area that has hundreds of thousands of extremely small photosensitive areas, called photosites, printed on it. Each small area of the wafer is a pixel. As the image is projected onto the image area, a charge develops at each pixel location of the wafer that is proportional to the intensity of light at that location. Thus, a digital camera is also called a charge coupled device (CCD) camera or a charge integrated device (CID) camera. The collection of charges, if read sequentially, is a representation of the image pixels. The wafer may have as many as 520,000 pixels in an area with dimensions of a fraction of an inch (3/16 × 1/4 in.). Obviously, it is impossible to have direct wire connections to all of these pixels to measure the charge in each one. To read such an enormous number of pixels 30 times a second, the charges are moved to optically isolated shift registers next to each photosite, moved down to an output line, and then read. The result is that every 1/30 of a second the charges in all pixel locations are read sequentially and stored or recorded. The output is a discrete representation of the image, a voltage sampled in time, as shown in figure (a); figure (b) shows the CCD element of a VHS camera. Similar to CCD cameras for visible light, long-wavelength infrared cameras yield a television-like image of the infrared emissions of the scene.
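A simplified sketch of this readout idea follows: the two-dimensional array of photosite charges is shifted out row by row into a single serial stream, once per frame. The array size and the random charge values are arbitrary illustrations, and the shift-register hardware is reduced here to a simple reshape.

    import numpy as np

    # Stand-in photosite charges for one frame (arbitrary 480 x 640 array).
    charges = np.random.randint(0, 256, size=(480, 640))

    def read_out(frame):
        """Emulate moving each row to the output line and reading it sequentially."""
        return frame.reshape(-1)       # row-by-row serial stream of pixel values

    serial_stream = read_out(charges)
    print("Pixels read this frame:", serial_stream.size)  # ~30 such reads per second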

Digital Images:

The sampled images from the aforementioned process are first digitized through an analog-to-digital converter (ADC) and then either stored in the computer storage unit in an image format such as TIFF, JPG, or BMP, or displayed on a monitor. Since it is digitized, the stored information is a collection of 0's and 1's that represent the intensity of light at each pixel; a digitized image is nothing more than a computer file that contains these 0's and 1's, sequentially stored to represent the intensity of light at each pixel. The file can be accessed and read by a program, duplicated and manipulated, or rewritten in a different form. Vision routines generally access this information, perform some function on the data, and either display the result or store the manipulated result in a new file. An image that has different gray levels at each pixel location is called a gray image. The gray values are digitized by a digitizer, yielding strings of 0's and 1's that are sequentially displayed or stored. A color image is obtained by superimposing three images of red, green, and blue hues, each with a varying intensity and each equivalent to a gray image (but in a colored state); when the image is digitized, it will similarly have strings of 0's and 1's for each hue. A binary image is one in which each pixel is either fully light or fully dark, a 0 or a 1. To obtain a binary image, in most cases a gray image is converted by using the histogram of the image and a cutoff called a threshold. A histogram shows the distribution of the different gray levels in the image. One can pick a value that best defines a cutoff level with the least distortion and use it as a threshold, assigning 0's (or "off") to all pixels whose gray levels are below the threshold value and 1's (or "on") to all pixels whose gray values are above it. Changing the threshold changes the binary image. The advantage of a binary image is that it requires far less memory and can be processed much faster than gray or color images.
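A minimal sketch of the histogram-and-threshold step might look like the following. The synthetic gradient image stands in for a captured frame, and the cutoff value of 128 is an assumption chosen for illustration.

    import numpy as np

    # Stand-in gray image: a simple dark-to-light gradient (a real system would
    # use a digitized camera frame here).
    gray = np.tile(np.arange(256, dtype=np.uint8), (64, 1))

    hist = np.bincount(gray.ravel(), minlength=256)   # distribution of gray levels
    threshold = 128                                   # assumed cutoff value
    binary = (gray >= threshold).astype(np.uint8)     # 0 = "off", 1 = "on"

    print("Most common gray level:", int(hist.argmax()))
    print("Fraction of 'on' pixels:", float(binary.mean()))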

Image Processing Techniques:

Image processing techniques are used to enhance, improve, or otherwise alter an image and to prepare it for image analysis. Usually, during image processing information is not extracted from the image; the intention is to remove faults, trivial information, or information that may be important but not useful, and to improve the image. As an example, suppose that an image was obtained while the object was moving, and as a result the image is blurred. It would be desirable to reduce or remove the blurring before information about the object (such as its nature, shape, location, orientation, etc.) is determined. Again, consider an image that is corrupted by direct lighting reflected back into the camera, or an image that is noisy because of low light. In all these cases it is desirable to improve the image and prepare it before image analysis routines are applied.
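As a hedged example of such preparation, the sketch below smooths away low-light noise and applies a crude unsharp mask to counteract mild blurring. The synthetic test image, the filter sizes and the weighting factors are illustrative choices, not values taken from the paper.

    import cv2
    import numpy as np

    # Stand-in for a blurred, low-light frame: a bright bar, smeared and noisy.
    img = np.zeros((200, 200), dtype=np.uint8)
    img[80:120, 40:160] = 150
    img = cv2.blur(img, (9, 9))                                  # simulate blur
    img = cv2.add(img, np.random.randint(0, 25, img.shape, dtype=np.uint8))  # noise

    denoised = cv2.GaussianBlur(img, (3, 3), 0)                  # suppress the noise

    # Crude unsharp masking: emphasize the image relative to a smoothed copy.
    background = cv2.GaussianBlur(denoised, (9, 9), 0)
    sharpened = cv2.addWeighted(denoised, 1.8, background, -0.8, 0)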

The various techniques employed in image processing and analysis are:
1. Image data reduction
2. Segmentation
3. Feature extraction
4. Object recognition

This paper primarily deals with the process of segmentation; further discussion of the other techniques is therefore omitted.

Segmentation:

Segmentation is the generic name for a number of different techniques that divide an image into segments corresponding to its constituents. In segmentation, the objective is to group areas of an image having similar characteristics or features into distinct entities representing parts of the image. One of the most important segmentation techniques, and the one this paper deals with, is thresholding.

Thresholding:

Thresholding is a binary conversion technique in which each pixel is converted into a binary value, either black or white. This is accomplished by utilizing a frequency histogram of the image and establishing which intensity (gray level) is to be the border between black and white. To improve the ability to differentiate, special lighting techniques must often be employed. It should be pointed out that the above method of using a histogram is only one of a large number of ways to threshold an image. Such a method is said to use a global threshold for the entire image. When it is not possible to find a single threshold for an entire image, one approach is to partition the total image into smaller rectangular areas (windows) and determine a threshold for each window being analyzed. Images of a weld pool were taken in real time and digitized using the thresholding technique. The images were thresholded at various threshold values, as well as at the optimum value, to show the importance of choosing an appropriate threshold. Two such sample images are shown here, and they clearly illustrate the importance of the chosen threshold. The optimum threshold is determined from the histogram.
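The difference between a single global threshold and a per-window threshold can be sketched as follows. Otsu's method is used here as one common way of picking a threshold from the histogram, and OpenCV's adaptive threshold stands in for the window-by-window approach; the synthetic "weld pool" image, the window size and the offset are illustrative assumptions.

    import cv2
    import numpy as np

    # Stand-in for a digitized weld-pool frame: a bright blob on a background
    # whose brightness drifts from left to right (uneven illumination).
    x = np.linspace(0, 80, 320, dtype=np.float32)
    gray = np.tile(x, (240, 1))
    cv2.circle(gray, (160, 120), 40, 220, -1)        # the "weld pool"
    gray = gray.astype(np.uint8)

    # Global threshold chosen automatically from the histogram (Otsu's method).
    otsu_value, global_binary = cv2.threshold(gray, 0, 255,
                                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Window-based (adaptive) threshold: a separate cutoff per local neighbourhood.
    local_binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                         cv2.THRESH_BINARY, 31, 5)

    print("Histogram-derived (Otsu) threshold:", otsu_value)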

Conclusion:

Thresholding is the most widely used technique for segmentation in industrial vision applications. The reasons are that it is fast and easily implemented and that the lighting is usually controllable in an industrial setting. In this paper, as an example, a weld pool image is digitized using the thresholding technique, and the effect of choosing various thresholds is demonstrated. The technique can also be applied to scenes in which multiple objects occupy the viewport.
