Perspectives

A Picture is Worth a Million Numbers: Extracting Information from Biological Images with CellProfiler

by Anne E. Carpenter, Ph.D. and Mark Bray, Ph.D.
Broad Institute, Cambridge, MA

Biologists are increasingly interested in extracting quantitative information from images, especially as automated microscopes become widespread. Traditionally, researchers gained tremendous insight from images through qualitative visual inspection or small-scale manual quantification, using interactive programs like Adobe Photoshop. Today, however, automated microscopy can easily acquire images faster than a researcher can inspect them. Properly configured, automated image analysis software can extract the biological information contained within images, producing quantitative measurements of features from every cell in a population while simultaneously speeding analysis, reducing subjectivity, and increasing statistical power. We have previously written a review article and tutorial on quantitative biological image analysis that offers guidance, practical tips, and an extensive list of resources in the field[1].

To enable rapid development and execution of automated image analysis routines for biological images, we have developed CellProfiler for biologists (http://www.cellprofiler.org). CellProfiler is a free, open-source software package designed to enable scientists without prior programming experience to automatically quantify relevant features of samples in large numbers of images[2-3]. We created CellProfiler to meet the need for flexible, powerful, user-friendly, and inexpensive software for image analysis, especially for large-scale, image-based experiments.
It incorporates several cutting-edge techniques from the image processing and computer vision literature, packaged in an easy-to-use program. CellProfiler allows individual modules, each designed for a particular task, to be chained together into complex analysis pipelines (Figure 1). It is distributed under an open-source license so that programmers can design new modules and features if necessary.

Figure 1: Schematic of image analysis modularity in CellProfiler.

Many biological phenotypes can be measured using existing modules in CellProfiler, including the number, size, and shape of biological objects such as cells, cellular compartments, colonies, and organisms (Figure 2). The intensity and texture of biological markers in multiple image channels can also be measured. To enable exploration and analysis of the dizzying array of measurements that can be extracted from images, we also developed the open-source software CellProfiler Analyst.

Figure 2: Examples of typical image-based assays. For each panel, the left image shows the raw image from fluorescence or bright field microscopy; the right image shows the identified objects to be used for downstream quantification. Additional example images and assays may be found at CellProfiler Images.

For complex or subtle phenotypes, such as morphological abnormalities or unusual cell types, it is often not clear which features are suited for classifying the cells or other biological objects in the images. In such cases, machine learning is a powerful tool that even beginners can use. First, CellProfiler is used to generate a large, standard set of measurements from each cell or other biological object. Then, the biologist trains the Classifier function of CellProfiler Analyst to recognize objects with the biological phenotype of interest[4].
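The idea of chaining task-specific modules to produce per-object measurements can be illustrated with a minimal sketch. This is not CellProfiler's code or API: the function names (`threshold`, `label_objects`, `measure_objects`, `run_pipeline`), the fixed intensity cutoff, and the toy image are all hypothetical, chosen only to show how an "identify objects, then measure them" pipeline fits together in plain Python.

```python
from collections import deque


def threshold(image, cutoff):
    """Hypothetical module 1: mark pixels brighter than `cutoff` as foreground."""
    return [[pixel > cutoff for pixel in row] for row in image]


def label_objects(mask):
    """Hypothetical module 2: group touching foreground pixels into objects.

    Uses a breadth-first flood fill with 4-connectivity; background stays 0.
    """
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels


def measure_objects(image, labels):
    """Hypothetical module 3: area and mean intensity for each labeled object."""
    totals = {}
    for image_row, label_row in zip(image, labels):
        for pixel, label in zip(image_row, label_row):
            if label:
                area, intensity_sum = totals.get(label, (0, 0))
                totals[label] = (area + 1, intensity_sum + pixel)
    return {
        label: {"area": area, "mean_intensity": intensity_sum / area}
        for label, (area, intensity_sum) in totals.items()
    }


def run_pipeline(image, cutoff):
    """Chain the three modules, passing each output to the next step."""
    mask = threshold(image, cutoff)
    labels = label_objects(mask)
    return measure_objects(image, labels)


# Toy 5x6 "image" containing two bright objects on a dark background.
toy_image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0, 0],
    [0, 7, 8, 0, 5, 0],
    [0, 0, 0, 0, 6, 0],
    [0, 0, 0, 0, 0, 0],
]

measurements = run_pipeline(toy_image, cutoff=2)
for label, features in sorted(measurements.items()):
    print(label, features)
```

Each function stands in for one module, so a step can be swapped (for example, a different way of separating foreground from background) without touching the others. Real biological images generally need far more robust object identification than a fixed cutoff, which is the kind of work the existing modules are meant to handle.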
With the Classifier function, the researcher performs “supervised learning”: the tool presents the user with examples of cells, which the user visually sorts into bins based on the phenotype displayed (Figure 3). The computer then iteratively learns the differences between the phenotypes with feedback from the biologist. Once the classifier is sufficiently accurate, the computer scores all of the images in the experiment. By providing a simple, user-friendly interface, the Classifier function enables biologists to get up and running within minutes, usually training the computer to recognize complicated phenotypes in less than a day.

Figure 3: The Classifier machine learning tool in CellProfiler Analyst. Individual cells are “fetched” from images in the experiment according to settings at the top of the window. Fetched cells appear in the middle “unclassified” bin for the biologist to sort into two or more phenotypic bins at the bottom of the window. Cells in these phenotypic bins serve as the “training set” that iteratively improves the machine learning algorithm.

These automated software tools are especially useful for very large experiments of hundreds to millions of images. Large-scale microscopy experiments that rely on high-throughput acquisition and analysis of cellular images, known as high-content screens (HCS), are rapidly growing within the research community as a means of answering fundamental questions about basic biological processes and human disease[5-8]. Image-based assays provide a combination of quality and quantity of information from a single sample that is unmatched by other modalities. Over the past couple of decades, automated microscopy has opened the door to systematic image collection and analysis of cellular function that was simply not feasible before.
This richness of information stems from several sources:

Our research group continues to develop image processing algorithms and data mining techniques for challenging image-based experiments, especially high-throughput imaging screens. With the advent of automated microscopy and the successful demonstration of sophisticated image analysis methods, the number of quantitative microscopy experiments is only expected to grow. Because nearly every biology laboratory uses microscopy in some fashion, the modern biologist will be well served by becoming familiar with the software and strategies involved in quantitative image analysis.

- To learn more about the CellProfiler project, visit: CellProfiler.
- To learn more about the Carpenter lab, visit: Carpenter lab.

**We are grateful to our collaborators for the use of their images.**

References:

1. Ljosa, V. and A.E. Carpenter, Introduction to the quantitative analysis of two-dimensional fluorescence microscopy images for cell-based screening. PLoS Comput Biol, 2009. 5(12): p. e1000603.
2. Carpenter, A.E., et al., CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol, 2006. 7(10): p. R100.
3. Lamprecht, M.R., D.M. Sabatini, and A.E. Carpenter, CellProfiler: free, versatile software for automated biological image analysis. Biotechniques, 2007. 42(1): p. 71-5.
4. Jones, T.R., et al., Scoring diverse cellular morphologies in image-based screens with iterative feedback and machine learning. Proc Natl Acad Sci U S A, 2009. 106(6): p. 1826-31.
5. Carpenter, A.E. and D.M. Sabatini, Systematic genome-wide screens of gene function. Nat Rev Genet, 2004. 5(1): p. 11-22.
6. Echeverri, C.J. and N. Perrimon, High-throughput RNAi screening in cultured cells: a user's guide. Nat Rev Genet, 2006. 7(5): p. 373-84.
7. Abraham, V.C., D.L. Taylor, and J.R. Haskins, High content screening applied to large-scale cell biology. Trends Biotechnol, 2004. 22(1): p. 15-22.
8. Perlman, Z.E., et al., Multidimensional drug profiling by automated microscopy. Science, 2004. 306(5699): p. 1194-8.