From Surf Wiki (app.surf) — the open knowledge base
MA plot
Diagram in computational biology
Diagram in computational biology
Within computational biology, an MA plot is an application of a Bland–Altman plot for visual representation of genomic data. The plot visualizes the differences between measurements taken in two samples, by transforming the data onto M (log ratio) and A (mean average) scales, then plotting these values. Though originally applied in the context of two channel DNA microarray gene expression data, MA plots are also used to visualise high-throughput sequencing analysis.
Explanation
Microarray data is often normalized within arrays to control for systematic biases in dye coupling and hybridization efficiencies, as well as other technical biases in the DNA probes and the print tip used to spot the array. By minimizing these systematic variations, true biological differences can be found. To determine whether normalization is needed, one can plot Cy5 (R) intensities against Cy3 (G) intensities and see whether the slope of the line is around 1. An improved method, which is basically a scaled, 45 degree rotation of the R vs. G plot is an MA-plot. The MA-plot is a plot of the distribution of the red/green intensity ratio ('M'), that is plotted against the average intensity ('A'). M and A are defined by the following equations.
:M=\log_2(R/G)=\log_2(R)-\log_2(G)
:A=\frac12 \log_2(RG) = \frac12 (\log_2(R) + \log_2(G))
M is, therefore, the binary logarithm of the intensity ratio (or difference between log intensities) and A is the average log intensity for a dot in the plot. MA plots are then used to visualize intensity-dependent ratio of raw microarray data (microarrays typically show a bias here, with higher A resulting in higher |M|, i.e. the brighter the spot the more likely an observed difference between sample and control). The MA plot puts the variable M on the y-axis and A on the x-axis and gives a quick overview of the distribution of the data.
In many microarray gene expression experiments, an underlying assumption is that most of the genes would not see any change in their expression; therefore, the majority of the points on the y-axis (M) would be located at 0, since log(1) is 0. If this is not the case, then a normalization method such as LOESS should be applied to the data before statistical analysis. (On the diagram below see the red line running below the zero mark before normalization, it should be straight. Since it is not straight, the data should be normalized. After being normalized, the red line is straight on the zero line and shows as pink/black.)
Packages
Several Bioconductor packages, for the R software, provide the facility for creating MA plots. These include affy (ma.plot, mva.pairs), limma (plotMA), marray (maPlot), and edgeR(maPlot)
Similar "RA" plots can be generated using the raPlot function in the caroline CRAN R package.
An interactive MA plot to filter genes by M, A and p-values, search by names or with a lasso, and save selected genes, is available as an R-Shiny code Enhanced-MA-Plot.
Example in the R programming language
library(affy)
if (require(affydata))
{
data(Dilution)
}
y <- (exprs(Dilution)[, c("20B", "10A")])
x11()
ma.plot( rowMeans(log2(y)), log2(y[, 1])-log2(y[, 2]), cex=1 )
title("Dilutions Dataset (array 20B v 10A)")
library(preprocessCore)
#do a quantile normalization
x <- normalize.quantiles(y)
x11()
ma.plot( rowMeans(log2(x)), log2(x[, 1])-log2(x[, 2]), cex=1 )
title("Post Norm: Dilutions Dataset (array 20B v 10A)")
References
References
- (11 November 2009). "edgeR: a Bioconductor package for differential expression analysis of digital gene expression data". Bioinformatics.
- (5 December 2014). "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2". Genome Biology.
- [[Jean Yang. YH Yang]], [[Sandrine Dudoit. S Dudoit]], P Luu, DM Lin, V Peng, J Ngai, [[Terry Speed. TP Speed]]. (2002). Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. ''Nucleic Acids Research'' vol. 30 (4) pp. e15.
- [[Sandrine Dudoit. Dudoit, S]], [[Jean Yang. Yang, YH]], Callow, MJ, [[Terry Speed. Speed, TP]]. (2002). Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. ''Stat. Sin.'' 12:1 111–139
This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
Ask Mako anything about MA plot — get instant answers, deeper analysis, and related topics.
Research with MakoFree with your Surf account
Create a free account to save articles, ask Mako questions, and organize your research.
Sign up freeThis content may have been generated or modified by AI. CloudSurf Software LLC is not responsible for the accuracy, completeness, or reliability of AI-generated content. Always verify important information from primary sources.
Report