Robotic systems are gaining popularity
in industry owing to the precision, speed, and quality they achieve. The most
important element of these systems is their vision system, and several
techniques are used to improve vision-system performance. This paper presents
one such technique, thresholding, which digitizes a given image for processing
by the vision computer. Thresholding is the most popular technique in
industrial robotics applications because it is inexpensive and easy to set up,
and because in most industrial applications the lighting at the scene of
interest can be controlled. The paper provides the background needed to
understand image analysis using vision systems and discusses image-processing
techniques with emphasis on thresholding.
Keywords:
CCD camera, vision,
thresholding, gray level, image processing.
Introduction:
Machine vision has now been an active area
of research for more than 35 years, and it is gradually being introduced into
real-world applications. Most of the applications developed and built in the
seventies and eighties were based on dedicated methods combined with specific
application knowledge. In a typical application, special sensory equipment such
as laser range cameras was used, often in combination with well-controlled
(artificial or structured) lighting. For the description of geometric
information or motion, it was often assumed that the environment was
constrained to a limited number of well-defined objects that were modeled a
priori, and most of their characteristics were exploited in the image
processing and analysis. Little insight into the general problem of image-based
scene description and interpretation was gained from these applications,
because they were based largely on image-derived features. The approaches
generally lacked robustness and often became ill posed under even a slight
variation in the conditions of the original application. Very little in the way
of general "high-level" algorithms came out of this work, and the
reconstruction approach set forward by Marr in his now-famous book "Vision"
thus received little attention in terms of industrial use.
Recent research has, however, indicated that some of the robustness and
ill-posedness problems may be eliminated if the algorithms are applied in the
context of a controllable sensor system. The explicit control of the sensory
system to improve robustness and eliminate ill-posed conditions is often termed
"active vision". In addition, it has been suggested that the general
machine-vision problem of providing a full 3-D symbolic description of the
environment without any prior knowledge is much too hard to solve at this time,
and that robust solutions may be found provided that task-specific knowledge is
utilized in the design and processing for a specific application. Such an
approach to machine vision is termed "purposive vision". The aim of purposive
vision is not to fall back on the application-specific strategy adopted in the
seventies and eighties, but rather to complement "general" machine-vision
techniques with domain-specific information so that the entire system can be
controlled to provide the needed robustness. Control is thus a significant
issue in purposive vision.
A significant application area for machine vision is robotics. Much of the work
in robotics has been based on sensory modalities such as ultrasonic sonar,
because it has been difficult to obtain sufficiently good depth data using
"shape from X" techniques. The introduction of a priori information may,
however, change this situation. In well-known scenarios, it is possible to
construct a model of the environment and subsequently compare sensor readings
with predictions obtained from that model. Progress in areas such as CAD
modeling means that it is now possible to integrate CAD systems into the
control of robots. The introduction of such models also makes it possible to
exploit machine-vision methods, because the needed a priori information may be
extracted from the CAD model.
To ensure that the systems constructed may be used not just for one specific
application but for a variety of applications, the trend is towards layered
control. In layered control, the hardware of the robot is interfaced to the
rest of the system through device-level software. This software transforms
robot-specific commands and feedback into a standard representation that may be
shared by several different platforms; it thus becomes simple to change robots
without a complete redesign of all the software. Above the device level is a
set of control layers that handle path planning, control, and the associated
perception. In robot control, at least for mobile robots, there is a trend
towards a set of different layers, each responsible for a specific task. For
example, one layer may be responsible for "survival": making sure that the
robot does not bump into objects in the environment and that it moves if it is
on a collision course with another object. Another layer might be responsible
for constructing a map of the environment to facilitate navigation or
localization of target objects. The use of different layers for different tasks
differs from the approach traditionally used in robotics, where control is
integrated in a "perceive-plan-control" cycle. The two approaches to control of
a robot are illustrated in figure 1.
Introduction to Vision Systems:
The typical vision system consists of the
camera and digitizing hardware, a digital computer, and hardware and software
necessary to interface them. This interface hardware and software is often
referred to as a preprocessor. The operation of the vision system consists of
three functions:
1. Sensing and digitizing image data.
2. Image processing and analysis.
3. Application.
The sensing and digitizing functions
involve the input of vision data by means of a camera focused on the scene of
interest. Special lighting techniques are frequently used to obtain an image of
sufficient contrast for later processing. The image viewed by the camera is
typically digitized and stored in computer memory. The digital image is called
a frame of vision data, and is frequently captured by a hardware device called
a frame grabber. These devices are capable of digitizing images at the rate of
30 frames per second. Each frame consists of a matrix of data representing
projections of the scene sensed by the camera. The elements of the matrix are
called pixels. The number of pixels is determined by a sampling process
performed on each image frame.
A single pixel is the projection of a small
portion of the scene which reduces that portion to a single value.
The value is a measure of the light
intensity for that element of the scene. Each pixel intensity is converted to a
digital value.
The digitized image matrix for each frame
is stored and then subjected to the image-processing and analysis function for
data reduction and interpretation of the image.
Typically an image frame will be
thresholded to produce a binary image, and then various feature measurements
will further reduce the data representation of the image.
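As an illustration of this data-reduction pipeline, the following sketch (using NumPy; the array values are hypothetical, not real frame data) thresholds a small gray-level frame into a binary image and then computes simple feature measurements such as object area and centroid:

```python
import numpy as np

# A hypothetical 5x5 gray-level frame (intensities 0-255)
frame = np.array([
    [10,  12,  11, 200, 210],
    [ 9, 205, 215, 220,  13],
    [11, 210, 225, 218,  10],
    [12, 208, 212,  11,   9],
    [10,  11,  12,  10,  11],
])

# Threshold to a binary image: 1 for object pixels, 0 for background
threshold = 128
binary = (frame > threshold).astype(np.uint8)

# Simple feature measurements that further reduce the data representation
area = binary.sum()                     # number of object pixels
rows, cols = np.nonzero(binary)
centroid = (rows.mean(), cols.mean())   # object centroid (row, col)
```

Here a 25-value frame is reduced first to 25 bits, then to just three numbers (area and centroid), which is typically all the application layer needs.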
Image Processing versus Image Analysis:
Image processing relates to the preparation
of an image for later analysis and use. Images captured by a camera or a
similar technique (e.g. by a scanner) are not necessarily in a form that can be
used by image analysis routines. Some may need improvement to reduce noise,
others may need to be simplified, and still others may need to be enhanced, altered,
segmented, filtered, etc. Image processing is
the collection of routines and techniques that improve, simplify, enhance, or
otherwise alter an image. Image analysis is
the collection of processes in which a captured image that is prepared by image
processing is analyzed in order to extract information about the image and to
identify objects or facts about the object or its environment.
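To make the distinction concrete, here is a minimal sketch (NumPy; the image values are hypothetical): applying a 3x3 mean filter is an image-processing step, because it alters the image, while measuring the fraction of bright pixels in the result is an image-analysis step, because it extracts a fact about the image:

```python
import numpy as np

# Hypothetical 4x4 gray image with noise on the object boundary
img = np.array([
    [10, 10, 10, 90],
    [10, 80, 90, 90],
    [10, 90, 90, 90],
    [10, 10, 90, 90],
], dtype=float)

# Image processing: smooth interior pixels with a 3x3 mean filter
smoothed = img.copy()
for r in range(1, img.shape[0] - 1):
    for c in range(1, img.shape[1] - 1):
        smoothed[r, c] = img[r-1:r+2, c-1:c+2].mean()

# Image analysis: extract information rather than alter the image
bright_fraction = (smoothed > 50).mean()
print(bright_fraction)  # 0.5
```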
Two- and Three-Dimensional Images:
Although all real scenes are three-dimensional, images can be either two- or
three-dimensional. Two-dimensional images are used when the depth of the scene
or its features need not be determined. As an example, consider defining the
surrounding contour or silhouette of an object; in that case it is not
necessary to determine the depth of any point on the object. Another example is
the use of a vision system for inspection of an integrated-circuit board. Here,
too, there is no need to know the depth relationship between different parts,
and since all parts are fixed to a flat plane, no information about the surface
is necessary. Thus, two-dimensional image analysis and inspection will suffice.
Three-dimensional image processing deals with operations that require motion
detection, depth measurement, remote sensing, relative positioning, and
navigation. All three-dimensional vision systems share the problem of coping
with many-to-one mappings of scenes to images; to extract information from
these scenes, image-processing techniques are combined with
artificial-intelligence techniques.
In this paper we consider a vision system for two-dimensional image processing
only.
Acquisition of Images:
There are two types of vision cameras:
analog and digital. Analog cameras are no longer very common, but are still
around; they used to be standard at television stations. Digital cameras are
much more common and are mostly similar to each other. A video camera is a
digital camera with an added videotape-recording section; otherwise, the
mechanism of image acquisition is the same as in other cameras that do not
record an image. Whether the captured image is analog or digital, in vision
systems the image is eventually digitized. In digital form, all data are binary
and are stored in a computer file or memory chip.
Vidicon Camera:
A vidicon camera is an analog camera that
transforms an image into an analog electrical signal. The signal, a variable
voltage (or current) versus time, can be stored, digitized, broadcast, or
reconstructed into an image. With the use of a lens, the scene is projected
onto a screen made up of two layers: a transparent metallic film and a
photoconductive mosaic that is sensitive to light. The mosaic reacts to the
varying intensity of light by varying its resistance; as the image is projected
onto it, the magnitude of the resistance at each location varies with the
intensity of light. An electron gun generates and sends a continuous cathode
beam through two pairs of capacitors (deflectors) that are perpendicular to
each other. Depending on the charge on each pair of capacitors, the electron
beam is deflected up or down and left or right, and is projected onto the
photoconductive mosaic. At each instant, as the beam of electrons hits the
mosaic, the charge is conducted to the metallic film and can be measured at the
output port. The voltage measured at the output is V = IR, where I is the
current (of the beam of electrons) and R is the resistance of the mosaic at the
point of interest.
Digital Camera:
A digital camera is based on solid-state
technology. As with other cameras, a set of lenses is used to project the area
of interest onto the image area of the camera. The main part of the camera is a
solid-state silicon-wafer image area that has hundreds of thousands of
extremely small photosensitive areas, called photosites, printed on it. Each
small area of the wafer is a pixel. As the image is projected onto the image
area, a charge develops at each pixel location that is proportional to the
intensity of light at that location. Thus, a digital camera is also called a
Charge Coupled Device (CCD) camera or, in a related design, a Charge Injection
Device (CID) camera. The collection of charges, if read sequentially, is a
representation of the image pixels. The wafer may have as many as 520,000
pixels in an area with dimensions of a fraction of an inch (3/16 in x 1/4 in).
Obviously, it is impossible to have direct wire connections to all of these
pixels in order to measure the charge in each one. Instead, 30 times a second,
the charges are moved to optically isolated shift registers next to each
photosite, moved down to an output line, and then read. The result is that
every 1/30th of a second the charges in all pixel locations are read
sequentially and stored or recorded. The output is a discrete representation of
the image (a voltage sampled in time), as shown in figure (a); figure (b) shows
the CCD element of a VHS camera. Similar to CCD cameras for visible light,
long-wavelength infrared cameras yield a television-like image of the infrared
emissions of the scene.
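The sequential readout described above can be sketched in a few lines (NumPy; the charge values are hypothetical): a 2-D array of accumulated pixel charges is shifted out row by row into a single output stream, just as the shift registers deliver one voltage sample per pixel:

```python
import numpy as np

# Hypothetical 3x4 array of accumulated pixel charges (one frame)
charges = np.array([
    [5, 7, 6, 4],
    [8, 9, 7, 5],
    [4, 6, 8, 7],
])

# Sequential readout: each row is shifted to the output line in turn,
# producing one discrete sample per pixel (a voltage sampled in time)
output_stream = charges.flatten()  # row-major order, like the readout
```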
Digital Images:
The sampled images from the aforementioned
process are first digitized through an analog-to-digital converter (ADC) and
then either stored in the computer storage unit in an image format such as
TIFF, JPG, or BMP, or displayed on a monitor. Since the image is digitized, the
stored information is a collection of 0's and 1's that represent the intensity
of light at each pixel; a digitized image is nothing more than a computer file
that contains these 0's and 1's, sequentially stored to represent the intensity
of light at each pixel. The files can be accessed and read by a program, can be
duplicated and manipulated, or can be rewritten in a different form. Vision
routines generally access this information, perform some function on the data,
and either display the result or store the manipulated result in a new file. An
image that has different gray levels at each pixel location is called a gray
image. The gray values are digitized by a digitizer, yielding strings of 0's
and 1's that are sequentially displayed or stored. A color image is obtained by
superimposing three images of red, green, and blue hues, each with a varying
intensity and each equivalent to a gray image (but in a colored state); when
the image is digitized, it will similarly have strings of 0's and 1's for each
hue. A binary image is one in which each pixel is either fully light or fully
dark: a 0 or a 1. To achieve a binary image, in most cases a gray image is
converted by using the histogram of the image and a cutoff called a threshold.
A histogram shows the distribution of the different gray levels. One can pick
the value that best determines a cutoff level with least distortion and use
that value as a threshold, assigning 0's ("off") to all pixels whose gray
levels are below the threshold value and 1's ("on") to all pixels whose gray
values are above it. Changing the threshold changes the binary image. The
advantage of a binary image is that it requires far less memory and can be
processed much faster than gray or color images.
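The histogram-and-cutoff procedure just described can be sketched as follows (NumPy; the gray values are hypothetical). The histogram counts how many pixels fall at each gray level, a cutoff is chosen between the two clusters, and every pixel is mapped to 0 or 1:

```python
import numpy as np

# Hypothetical 8-level gray image (values 0-7)
gray = np.array([
    [1, 1, 2, 6, 7],
    [0, 1, 6, 7, 6],
    [1, 2, 7, 6, 7],
])

# Histogram: number of pixels at each gray level 0..7
hist = np.bincount(gray.flatten(), minlength=8)
print(hist.tolist())   # [1, 4, 2, 0, 0, 0, 4, 4]

# The histogram is bimodal; a cutoff of 4 separates the two clusters
threshold = 4
binary = (gray >= threshold).astype(np.uint8)
```

The empty bins between the two clusters (levels 3 to 5) are exactly where a threshold can be placed with least distortion.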
Image Processing Techniques:
Image-processing techniques are used to enhance,
improve, or otherwise alter an image and to prepare it for image analysis.
Usually, information is not extracted from the image during image processing.
The intention is to remove faults, trivial information, or information that may
be important but not useful, and to improve the image. As an example, suppose
that an image was obtained while the object was moving, and as a result the
image is blurred. It would be desirable to reduce or remove the blurring before
information about the object (such as its nature, shape, location, orientation,
etc.) is determined. Similarly, consider an image that is corrupted by direct
lighting reflected back into the camera, or an image that is noisy because of
low light. In all these cases, it is desirable to improve the image and prepare
it before image-analysis routines are used.
The various techniques employed in image
processing and analysis are:
1. Image data reduction
2. Segmentation
3. Feature extraction
4. Object recognition
This paper primarily deals with the process
of segmentation; further discussion of the other techniques is therefore
omitted.
Segmentation:
Segmentation is the generic name for a
number of different techniques that divide an image into segments of its
constituents. In segmentation, the objective is to group areas of an image
having similar characteristics or features into distinct entities representing
parts of the image. One of the most important segmentation techniques, and the
one this paper deals with, is thresholding.
Thresholding:
Thresholding is a binary conversion
technique in which each pixel is converted into a binary value, either black or
white. This is accomplished by utilizing a frequency histogram of the image and
establishing what intensity (gray level) is to be the border between black and
white. To improve the ability to differentiate, special lighting techniques
must often be employed. It should be pointed out that the above method of using
a histogram is only one of a large number of ways to threshold an image. Such a
method is said to use a global threshold for the entire image. When it is not
possible to find a single threshold for an entire image, one approach is to
partition the total image into smaller rectangular windows and determine a
threshold for each window being analyzed. Images of a weld pool were taken in
real time and digitized using the thresholding technique. The images were
thresholded at various threshold values, including the optimum value, to show
the importance of choosing an appropriate threshold. Two such sample images are
shown here; they clearly show the importance of the threshold chosen. The
optimum threshold is determined from the histogram.
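The paper does not specify how the optimum threshold was computed; one standard histogram-based choice is Otsu's method, which picks the threshold that maximizes the between-class variance of the two resulting pixel groups. A minimal sketch (NumPy; the sample image is hypothetical, not the weld-pool data):

```python
import numpy as np

def otsu_threshold(gray, levels=256):
    """Return the threshold that maximizes between-class variance."""
    hist = np.bincount(gray.flatten(), minlength=levels).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, levels):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0          # class means
        mu1 = (np.arange(t, levels) * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Hypothetical image: dark background (~20) and bright region (~200)
img = np.concatenate([np.full(60, 20), np.full(40, 200)]).reshape(10, 10)
t = otsu_threshold(img)
binary = (img >= t).astype(np.uint8)
```

Any cutoff between the two intensity clusters separates them; Otsu's criterion selects one automatically instead of requiring manual inspection of the histogram.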
Conclusion:
Thresholding is the most widely used
technique for segmentation in industrial vision applications. The reasons are
that it is fast and easily implemented and that the lighting is usually
controllable in an industrial setting. In this paper, as an example, a
weld-pool image is digitized using the thresholding technique, and the effect
of choosing various thresholds is demonstrated. The technique can also be
applied to scenes in which multiple objects occupy the viewport.
Technical Paper on Thresholding Digital Image Processing for Vision Systems
By siva