Traditional biology has centered on the study of one gene
at a time, that is, on identifying all the factors that
regulate the activity level of a particular gene. This approach
has been painstakingly slow but very rewarding in terms of
the specific information it yields about the gene being studied.
With the announcement that the entire sequence of the human
genome has been determined, the rush is on to utilize the
vast amount of information now available.
There are an estimated 80,000 to 100,000 unique genes encoded
within the human genome and in order to determine the various
functions of all of these genes, scientists are actively developing
methods to analyze many of them in parallel. This tack makes
numerous new and interesting experiments possible, for example
identifying all the genes whose activity changes in response
to a biologically active molecule such as a hormone or a drug.
While much of the early work in this area has relied on data-collection
methods that employ serial scanning, new protocols based on
digital imaging are now proving highly successful. Recent
advances in electronics technology and new methods of high-volume
manufacturing are making it possible for builders of microarray
scanners to affordably integrate advanced CCD imagers as components
in their systems. On a smaller scale, it is even possible
for individual researchers to use currently available camera
systems for microarray imaging experiments in their own laboratories.
Measurement of Gene Activity
In order to measure a gene's activity, scientists collect
the messenger RNA (mRNA), which carries information from the
gene in the nucleus to the cytoplasm, where it is usually
translated into a protein product. Once the RNA from a cell
population or tissue has been collected, the preparation is
typically converted into complementary DNA (cDNA) and then
amplified with the polymerase chain reaction (PCR). A sample
of the amplified cDNA product can be labeled with fluorescent
tags so that the population it came from can be identified.
Typically, the researcher uses two populations of cells:
one representing the control and one representing the experimental
treatment. As an example, a cultured cell line's response
to insulin can be measured by preparing two cell populations:
one treated with insulin and one mock treated. The cDNA from
the insulin-treated cells can be labeled with Fluor 1 (green
color) and the cDNA from the control cells can be labeled with
Fluor 2 (red color). These probes can then be used to interrogate
a microarray of immobilized DNA targets on a glass surface,
where each (x,y) coordinate represents a known DNA sequence.
When the green and red probes are hybridized to the array,
the composite color is a measure of the gene activity ratio.
A green color would indicate a gene that is on with insulin
treatment and off when insulin is absent. A red color would
indicate a gene that is on when insulin is absent and off
with insulin treatment. A yellow color would indicate a gene
that does not change significantly with insulin treatment.
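The color reading described above can be sketched numerically as a log ratio of the two channel intensities. This is only an illustration: the function name `classify_spot` and the twofold threshold are assumptions made for the example, not values given in this article, and the intensities are assumed to be background-corrected already.

```python
import math


def classify_spot(green, red, fold_threshold=2.0):
    """Classify one spot from its two channel intensities.

    green: intensity from the insulin-treated sample (Fluor 1)
    red:   intensity from the mock-treated control (Fluor 2)
    fold_threshold: hypothetical cutoff for calling a change
    """
    if green <= 0 or red <= 0:
        # A ratio is meaningless without signal in both channels.
        return "undetermined"

    log_ratio = math.log2(green / red)
    cutoff = math.log2(fold_threshold)

    if log_ratio >= cutoff:
        return "green"   # on with insulin, off without it
    if log_ratio <= -cutoff:
        return "red"     # on without insulin, off with it
    return "yellow"      # no significant change


if __name__ == "__main__":
    for g, r in [(4000, 500), (500, 4000), (1000, 1100)]:
        print(g, r, classify_spot(g, r))
```

Working with the logarithm of the ratio treats a twofold increase and a twofold decrease symmetrically, which is why it is used here instead of the raw ratio.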
Basic Analytical Approaches
Once the arrays have been hybridized and washed under the
appropriate stringency conditions, the user needs to read
the arrays at the two wavelengths of interest. The wavelengths
used are usually toward the red end of the spectrum to reduce
autofluorescence from the system. The most commonly used
labels for microarray experiments are Cy3 and Cy5, depicted
as green and red, respectively. Two basic approaches can be
used to measure these signals: serial scanning or imaging.
In the serial scanning approach, a confocal fluorescence scanning
device collects data serially using a common optical pathway
for delivering the excitation beam and for collecting the
emitted light. The fluorescent signal is measured by a
photomultiplier tube (PMT) and digitized to generate the output
intensity at each spot on the microarray. After the first
fluorescent channel has been measured, the optics are switched
and the second channel is then measured.
In the imaging approach, a wide-field illumination scheme
is utilized to achieve parallel excitation on most or all
of the microarray. The whole array is then imaged in parallel,
with an exposure time chosen to obtain the best signal-to-noise
ratio in the data. A camera digitizes the signal and all of
the digital data are delivered to the computer in an image
format. For the second wavelength, the optics must be changed
and another exposure taken.
Scanning vs. Imaging
So which is the better method for analyzing this kind of data?
Right now, the majority of the microarray scanners on the
market employ the serial scanning approach. This is due to
the simplicity of design of a system that uses a laser for
illumination coupled with a PMT for a detector. In terms of
performance, however, there are distinct advantages to moving
to an imaging-based approach.
For example, because serial scanners rely on a few laser lines
for illumination, the choice of fluorescent probes is limited
to those whose excitation spectra overlap a laser line well
enough to be useful. In the imaging approach, broadband sources
such as xenon (Xe) or mercury (Hg) lamps can be used to generate
an almost continuous usable spectrum of illumination light.
Furthermore, by using a CCD-type detector to measure the
fluorescence signals, an imaging device can achieve quantum
efficiencies (QE) on the order of 90%, versus the 15-20%
QE obtainable with a PMT-type device. This difference translates
into a sensitivity advantage of up to sixfold. The CCDs can
also be run with very low-noise analog electronics to achieve
a readout noise lower than 4 electrons rms, whereas the PMTs
can have noise terms that are significantly higher.
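The sixfold figure follows directly from the ratio of the quoted quantum efficiencies (90% against 15%). The sketch below reproduces that arithmetic and adds a simple signal-to-noise estimate for a CCD pixel. The noise model (photon shot noise plus read noise, with dark current ignored) and the function names are assumptions made for illustration, not a description of any particular camera.

```python
import math


def qe_advantage(qe_ccd, qe_pmt):
    """Ratio of detected signal for the same number of incident photons."""
    return qe_ccd / qe_pmt


def ccd_snr(incident_photons, qe, read_noise_e):
    """Estimate SNR for one CCD pixel.

    Assumed model: signal = photons * QE (in electrons); noise is the
    quadrature sum of shot noise, sqrt(signal), and read noise.
    Dark current is ignored in this sketch.
    """
    signal_e = incident_photons * qe
    noise_e = math.sqrt(signal_e + read_noise_e ** 2)
    return signal_e / noise_e


if __name__ == "__main__":
    print("QE advantage:", qe_advantage(0.90, 0.15))
    print("SNR at 1000 photons:", round(ccd_snr(1000, 0.90, 4.0), 2))
```

With a read noise of only a few electrons, the shot-noise term dominates at all but the faintest signal levels, so the higher quantum efficiency carries through to the final measurement.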
Another key advantage of the imaging-based approach is the
highly parallel nature of the data collection. If the user
needs to collect 5,000 data points in one experiment and 20,000
in another, the scanning approach will take four times as
long for the second experiment as for the first. In an optimized
imaging experiment, both measurements are made under identical
exposure conditions, so the acquisition times are equal. As the
number of elements in an array increases, the advantage of
imaging over serial scanning becomes more pronounced.
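The scaling argument can be made concrete with a toy timing model. It assumes that serial scanning time grows in proportion to the number of data points and that a single exposure covers the whole array. The per-point dwell time and the exposure time below are hypothetical values chosen only to show the trend.

```python
def serial_scan_time(n_points, seconds_per_point):
    """Serial scanning: each data point is measured one after another."""
    return n_points * seconds_per_point


def imaging_time(n_points, exposure_seconds):
    """Imaging: one exposure covers every point that fits in the field
    of view, so the time does not depend on n_points in this model."""
    return exposure_seconds


if __name__ == "__main__":
    dwell = 0.001      # hypothetical seconds per data point
    exposure = 10.0    # hypothetical exposure time in seconds
    for n in (5_000, 20_000):
        print(n, serial_scan_time(n, dwell), imaging_time(n, exposure))
```

Going from 5,000 to 20,000 points quadruples the scanning time in this model while leaving the imaging time unchanged, which is the comparison made in the paragraph above.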
Conversely, one of the arguments in favor of serial scanners
over imaging devices is the former's ability to achieve a
larger number of effective pixels. This ability is useful
for the currently accepted standard of measuring each 100-micron
DNA spot with a 10 x 10-pixel array, a practice that yields
70-80 (on average) individual valid measurements within the
circular spot. The rationale for obtaining this highly oversampled
data is that it enables the interpreting software to derive
detailed statistics on the variation from spot to spot and
to use this information to assess the quality of the data
in each spot. While this method does give the software a good
idea of how uniformly the signal is distributed across the
spot, it does not address the core issue of whether the data
are useful or quantitative.
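The 70-80 figure can be checked by counting how many pixels of a 10 x 10 grid fall inside a circle inscribed in that grid. The sketch below counts a pixel as valid when its center lies within the circle; that criterion is an assumption, and real analysis software may use a different rule or exclude edge pixels.

```python
def pixels_inside_spot(grid_size=10, spot_diameter_px=10.0):
    """Count grid pixels whose centers lie inside a centered circular spot."""
    radius = spot_diameter_px / 2.0
    center = grid_size / 2.0
    count = 0
    for row in range(grid_size):
        for col in range(grid_size):
            # Pixel centers sit at half-integer coordinates.
            dx = (col + 0.5) - center
            dy = (row + 0.5) - center
            if dx * dx + dy * dy <= radius * radius:
                count += 1
    return count


if __name__ == "__main__":
    print(pixels_inside_spot())  # prints 80
```

The area ratio of a circle to its bounding square is pi/4, or about 78.5%, so a count in the range of 70 to 80 out of 100 pixels is what one would expect.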
Throughout biology, the standard way to ensure the usefulness
or validity of the data is the replicate method. Biochemical
assays are typically done in at least triplicate, and plate-based
assays in at least duplicate. The variation across duplicates
or triplicates is expressed as the coefficient of variation;
this number places a boundary on the usefulness of the data.
The same principle should be applied in the microarray field,
that is, all spots should be represented at least twice on
each microarray and these spots should not be in the same
region of the array. This approach eliminates the need for
the high level of oversampling employed with serial scanning.
Under these conditions, the finite number of pixels on the
CCD is no longer a limiting factor.
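As a minimal sketch of the replicate method applied to array data, the coefficient of variation (CV) for the replicate spots of one gene is the standard deviation of their intensities divided by their mean. The acceptance limit of 15% used below is a hypothetical value for illustration; the article does not specify one.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (needs >= 2 values)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


def replicates_acceptable(values, max_cv=0.15):
    """Return True when replicate spots agree within the assumed CV limit."""
    return coefficient_of_variation(values) <= max_cv


if __name__ == "__main__":
    replicate_intensities = [100.0, 110.0, 90.0]  # made-up example values
    cv = coefficient_of_variation(replicate_intensities)
    print(f"CV = {cv:.1%}, acceptable = {replicates_acceptable(replicate_intensities)}")
```

Because each replicate is a separate spot in a different region of the array, this statistic also captures position-dependent effects that oversampling within a single spot cannot reveal.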
This article was written by Mark Christenson, Ph.D., Senior
Scientist, and Jeff Grant, Senior Technical Writer, at Roper
Scientific, Inc. The authors can be reached at
mchristenson@roperscientific.com and jgrant@roperscientific.com.
Learn more about Roper Scientific at www.roperscientific.com.