Bioconductor Code: limma

8 Aug 2014: limma 3.21.12

- The definition of the M and A axes for an MA-plot of single channel
data is changed slightly. Previously the A-axis was the average of
all arrays in the dataset -- this has been the definition since MA-plots
were introduced for single channel data in April 2003. Now an
artificial array is formed by averaging all arrays other than the
one to be plotted. Then a mean-difference plot is formed from the
specified array and the artificial array. This change ensures the
specified and artificial arrays are computed from independent data,
and ensures the MA-plot will reduce to a correct mean-difference
plot when there are just two arrays in the dataset.
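
The new axis definitions described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: `ma_coords` is a hypothetical helper name that does not exist in limma, and the matrix `E` is made-up data standing in for single-channel log-expression values.

```r
# Hypothetical helper (not part of limma): M and A values for one column of a
# matrix of single-channel log-expression values, following the commit message.
ma_coords <- function(E, array = 1L) {
  E <- as.matrix(E)
  if (ncol(E) < 2L) stop("Need at least two columns")
  array <- as.integer(array[1L])
  # Artificial array: average of every array except the one being plotted
  ave <- rowMeans(E[, -array, drop = FALSE], na.rm = TRUE)
  list(M = E[, array] - ave,        # difference: specified minus artificial
       A = (E[, array] + ave) / 2)  # mean of specified and artificial
}

# Made-up data: 4 probes on 3 arrays
E <- cbind(a1 = c(8, 9, 10, 11), a2 = c(7, 9, 12, 10), a3 = c(9, 11, 8, 12))
ma <- ma_coords(E, array = 1L)

# With only two arrays the artificial array is simply the other array,
# so the result is an ordinary mean-difference plot of the two columns.
ma2 <- ma_coords(E[, 1:2], array = 1L)
```

Because the artificial array excludes the plotted column, the two quantities being compared share no data, which is the independence property the commit message refers to.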

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/limma@93246 bc3139a8-67e5-0310-9ffc-ced21a209358

Gordon Smyth authored on 08/08/2014 05:19:48
Showing 12 changed files

DESCRIPTION index f1fef2229..8e7bc1e20 100755
NAMESPACE index 357432b15..df348c461 100644
R/diffSplice.R index c2981a594..db7623a9c 100644
R/goana.R index 1231b43ca..e062de5d0 100644
R/plots-ma.R index 954456c7b..7417152ae 100755
R/read-ilmn.R index 8530abaea..568594bf7 100644
inst/doc/changelog.txt index 636c4a6eb..7b3b00742 100755
man/genas.Rd index c1efe8198..ea5d08b21 100644
man/goana.Rd index 86cd4ee3a..a026351bc 100644
man/plotSplice.Rd index 7be592e8c..b36dd6747 100644
man/plotma.Rd index f8924d2ae..ed4548ff4 100755
man/topSplice.Rd index a286349b8..609a7fabf 100644

DESCRIPTION

History View file @ c05cb5963

@@ -1,13 +1,13 @@
                      Package: limma
                     -Version: 3.21.10
                     -Date: 2014/06/27
                     +Version: 3.21.12
                     +Date: 2014/08/08
                      Title: Linear Models for Microarray Data
                      Description: Data analysis, linear models and differential expression for microarray data.
                      Author: Gordon Smyth [cre,aut], Matthew Ritchie [ctb], Jeremy Silver [ctb], James Wettenhall [ctb], Natalie Thorne [ctb], Davis McCarthy [ctb], Di Wu [ctb], Yifang Hu [ctb], Wei Shi [ctb], Belinda Phipson [ctb], Alicia Oshlack [ctb], Carolyn de Graaf [ctb], Mette Langaas [ctb], Egil Ferkingstad [ctb], Marcus Davy [ctb], Francois Pepin [ctb], Dongseok Choi [ctb]
                      Maintainer: Gordon Smyth <[email protected]>
                      License: GPL (>=2)
                      Depends: R (>= 2.3.0), methods
                     -Suggests: statmod (>= 1.2.2), splines, locfit, MASS, ellipse, affy, vsn, AnnotationDbi, org.Hs.eg.db, GO.db, illuminaio
                     +Suggests: statmod (>= 1.2.2), splines, locfit, MASS, ellipse, affy, vsn, AnnotationDbi, org.Hs.eg.db, GO.db, illuminaio, BiasedUrn
                      LazyLoad: yes
                      URL: http://bioinf.wehi.edu.au/limma
                      biocViews: ExonArray, GeneExpression, Transcription, AlternativeSplicing, DifferentialExpression, DifferentialSplicing, GeneSetEnrichment, DataImport, Genetics, Bayesian, Clustering, Regression, TimeCourse, Microarray, microRNAArray, mRNAMicroarray, OneChannel, ProprietaryPlatforms, TwoChannel, RNASeq, BatchEffect, MultipleComparison, Normalization, Preprocessing, QualityControl

NAMESPACE

@@ -50,6 +50,7 @@ S3method("dimnames<-",RGList)
                      S3method("dimnames<-",EList)
                      S3method("dimnames<-",EListRaw)
                      S3method(fitted,MArrayLM)
                     +S3method(goana,MArrayLM)
                      S3method(length,MAList)
                      S3method(length,MArrayLM)
                      S3method(length,RGList)

R/diffSplice.R

@@ -2,7 +2,7 @@ diffSplice <- function(fit,geneid,exonid=NULL,verbose=TRUE)
                      #	Test for splicing variants between conditions
                      #	using linear model fit of exon data.
                      #	Charity Law and Gordon Smyth
                     -#	Created 13 Dec 2013.  Last modified 18 Feb 2014.
                     +#	Created 13 Dec 2013.  Last modified 7 Aug 2014.
+                     {
                      	exon.genes <- fit$genes
                      	if(is.null(exon.genes)) exon.genes <- data.frame(ExonID=1:nrow(fit))
@@ -106,78 +106,98 @@ diffSplice <- function(fit,geneid,exonid=NULL,verbose=TRUE)
                      	out$gene.F.p.value <- gene.F.p.value
                      #	Which columns of exon.genes contain gene level annotation?
                     -	exon.lastexon <- cumsum(gene.nexons)
                     -	exon.firstexon <- exon.lastexon-gene.nexons+1
                     +	gene.lastexon <- cumsum(gene.nexons)
                     +	gene.firstexon <- gene.lastexon-gene.nexons+1
                      	no <- logical(nrow(exon.genes))
                     -	isdup <- vapply(exon.genes,duplicated,no)[-exon.firstexon,,drop=FALSE]
                     +	isdup <- vapply(exon.genes,duplicated,no)[-gene.firstexon,,drop=FALSE]
                      	isgenelevel <- apply(isdup,2,all)
                     -	out$gene.genes <- exon.genes[exon.lastexon,isgenelevel, drop=FALSE]
                     +	out$gene.genes <- exon.genes[gene.lastexon,isgenelevel, drop=FALSE]
                      	out$gene.genes$NExons <- gene.nexons
                     +	out$gene.firstexon <- gene.firstexon
                     +	out$gene.lastexon <- gene.lastexon
+                    +
                     +#	Simes adjustment of exon level p-values
                     +	simes <- function(p,n) {
                     +		p <- p[-which.max(p)]
                     +		min(sort(p)*(n-1)/(1:(n-1)))
                     +	}
                     +	out$gene.simes.p.value <- out$gene.F.p.value
                     +	for (i in 1:ngenes) for (j in 1:ncol(fit)) {
                     +		out$gene.simes.p.value[i,j] <- simes(exon.p.value[gene.firstexon[i]:gene.lastexon[i],j],gene.nexons[i])
                     +	}
                      	out
+                     }
                     -topSplice <- function(fit, coef=ncol(fit), level="exon", number=10, FDR=1)
                     -#	Collate voomex results in data.frame, ordered from most significant at top
                     +topSplice <- function(fit, coef=ncol(fit), level="hybrid", number=10, FDR=1)
                     +#	Collate diffSplice results into data.frame, ordered from most significant at top
                      #	Gordon Smyth
                     -#	Created 18 Dec 2013.  Last modified 17 March 2014.
                     +#	Created 18 Dec 2013.  Last modified 7 Aug 2014.
+                     {
                      	coef <- coef[1]
                     -	exon <- match.arg(level,c("exon","gene"))
                     -	if(level=="exon") {
                     +	level <- match.arg(level,c("hybrid","exon","gene"))
                     +	switch(level,
                     +	"exon" = {
                      		number <- min(number,nrow(fit$coefficients))
                      		P <- fit$p.value[,coef]
                      		BH <- p.adjust(P, method="BH")
                      		if(FDR<1) number <- min(number,sum(BH<FDR))
                      		o <- order(P)[1:number]
                      		data.frame(fit$genes[o,,drop=FALSE],logFC=fit$coefficients[o,coef],t=fit$t[o,coef],P.Value=P[o],FDR=BH[o])
                     -	} else {
                     +	},
                     +	gene = {
                      		number <- min(number,nrow(fit$gene.F))
                      		P <- fit$gene.F.p.value[,coef]
                      		BH <- p.adjust(P, method="BH")
                      		if(FDR<1) number <- min(number,sum(BH<FDR))
                      		o <- order(P)[1:number]
                      		data.frame(fit$gene.genes[o,,drop=FALSE],F=fit$gene.F[o,coef],P.Value=P[o],FDR=BH[o])
                     +	},
                     +	hybrid = {
                     +		number <- min(number,nrow(fit$gene.F))
                     +		P <- fit$gene.simes.p.value[,coef]
                     +		BH <- p.adjust(P, method="BH")
                     +		if(FDR<1) number <- min(number,sum(BH<FDR))
                     +		o <- order(P)[1:number]
                     +		data.frame(fit$gene.genes[o,,drop=FALSE],P.Value=P[o],FDR=BH[o])
+                     	}
                     +	)
+                     }
                     -plotSplice <- function(fit, coef=ncol(fit), geneid=NULL, rank=1L, FDR = 0.05)
                     -#	Plot exons of most differentially spliced gene
                     +plotSplice <- function(fit, coef=ncol(fit), geneid=NULL, genecolname=NULL, rank=1L, FDR = 0.05)
                     +#	Plot exons of chosen gene
                     +#	fit is output from diffSplice
                      #	Gordon Smyth and Yifang Hu
                      #	Created 3 Jan 2014.  Last modified 19 March 2014.
+                     {
                     -	# Gene labelling including gene symbol
                     -	genecolname <- fit$genecolname
                     -	genelab <- grep(paste0(genecolname,"|Symbol|symbol"), colnames(fit$gene.genes), value = T)
+                    -
                     +	if(is.null(genecolname))
                     +		genecolname <- fit$genecolname
                     +	else
                     +		genecolname <- as.character(genecolname)
+                    +
                      	if(is.null(geneid)) {
                     +#		Find gene from specified rank
                      		if(rank==1L)
                      			i <- which.min(fit$gene.F.p.value[,coef])
                      		else
                      			i <- order(fit$gene.F.p.value[,coef])[rank]
+                    -
                     -		geneid <- paste(fit$gene.genes[i,genelab], collapse = ".")
+                    -
                     +		geneid <- paste(fit$gene.genes[i,genecolname], collapse = ".")
                      	} else {
                     -		i <- which(fit$gene.genes[,fit$genecolname]==geneid)
+                    -
                     -		geneid <- paste(fit$gene.genes[i,genelab], collapse = ".")
+                    -
                     +#		Find gene from specified name
                     +		geneid <- as.character(geneid)
                     +		i <- which(fit$gene.genes[,genecolname]==geneid)[1]
                      		if(!length(i)) stop(paste("geneid",geneid,"not found"))
+                     	}
                     -	exon.lastexon <- cumsum(fit$gene.genes$NExons[1:i])
                     -	j <- (exon.lastexon[i-1]+1):exon.lastexon[i]
                     -	exoncolname <- fit$exoncolname
                     +#	Row numbers containing exons
                     +	j <- fit$gene.firstexon[i]:fit$gene.lastexon[i]
                     -	if(is.null(exoncolname)){
                     +	exoncolname <- fit$exoncolname
                     +	if(is.null(exoncolname)) {
                      		plot(fit$coefficients[j,coef], xlab = "Exon", ylab = "logFC (this exon vs rest)", main = geneid, type = "b")
                     -	}
+                    -
                     -	# Plot exons and mark exon ids on the axis
                     -	if(!is.null(exoncolname)) {
                     +	} else {
                      		exon.id <- fit$genes[j, exoncolname]
                      		xlab <- paste("Exon", exoncolname, sep = " ")
@@ -186,25 +206,24 @@ plotSplice <- function(fit, coef=ncol(fit), geneid=NULL, rank=1L, FDR = 0.05)
                      		axis(1, at = 1:length(j), labels = exon.id, las = 2, cex.axis = 0.6)
                      		mtext(xlab, side = 1, padj = 5.2)
                     -		# Mark the topSpliced exons
                     +#		Mark the topSpliced exons
                      		top <- topSplice(fit, coef = coef, number = Inf, level = "exon", FDR = FDR)
                      		m <- which(top[,genecolname] %in% fit$gene.genes[i,genecolname])
+                    -
                      		if(length(m) > 0){
+                    -
                     -			if(length(m) == 1) cex <- 1.5 else{
+                    -
                     +			if(length(m) == 1)
                     +				cex <- 1.5
                     +			else {
                      				abs.fdr <- abs(log10(top$FDR[m]))
                      				from <- range(abs.fdr)
                      				to <- c(1,2)
                      				cex <- (abs.fdr - from[1])/diff(from) * diff(to) + to[1]
+                    -
+                     			}
                      			mark <- match(top[m, exoncolname], exon.id)
                      			points((1:length(j))[mark], fit$coefficients[j[mark], coef], col = "red", pch = 16, cex = cex)
+                     		}
+                    +
+                     	}
                      	abline(h=0,lty=2)
+                    -
                     -}
                     \ No newline at end of file
                     +	invisible()
                     +}
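
The Simes adjustment added to diffSplice in the hunk above can be tried in isolation. The function body below is copied from the diff; the p-values and the names `p.exon` and `p.gene` are made up for illustration, and the sketch assumes a gene with at least two exons.

```r
# Copied from the diffSplice.R hunk: drop the largest p-value, then apply
# the Simes formula to the remaining n-1 sorted p-values.
simes <- function(p, n) {
  p <- p[-which.max(p)]
  min(sort(p) * (n - 1) / (1:(n - 1)))
}

# Made-up exon-level p-values for one gene with four exons
p.exon <- c(0.01, 0.04, 0.03, 0.5)
p.gene <- simes(p.exon, n = length(p.exon))
```

In diffSplice this value is computed for every gene and coefficient and stored in gene.simes.p.value, which the new "hybrid" level of topSplice then ranks.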

R/goana.R

@@ -1,30 +1,81 @@
                     -goana <- function(fit, coef = ncol(fit), geneid = "GeneID", FDR = 0.05, species = "Hs"){
                     +goana <- function(de,...) UseMethod("goana")
+                    +
                     +goana.MArrayLM <- function(de, coef = ncol(de), geneid = rownames(de), FDR = 0.05, species = "Hs", trend = FALSE, ...)
                      #  Gene ontology analysis of DE genes from linear model fit
                      #  Gordon Smyth and Yifang Hu
                     -#  Created 20 June 2014. Last modified 24 June 2014.
+                    -
                     -	# Check input
                     -	if(!is(fit, "MArrayLM")) stop("fit must be an MArrayLM object.")
                     -	if(is.null(fit$p.value)) stop("p value not found in fit object (from eBayes).")
                     -	if(is.null(fit$coefficients)) stop("coefficient not found in fit object.")
+                    -
                     -	if(length(geneid) == nrow(fit)){
                     +#  Created 20 June 2014.  Last modified 23 July 2014.
                     +{
                     +	# Check fit
                     +	if(is.null(de$p.value)) stop("p value not found in fit object (from eBayes).")
                     +	if(is.null(de$coefficients)) stop("coefficient not found in fit object.")
                     +	if(length(coef) != 1) stop("coef length needs to be 1.")
                     +	ngenes <- nrow(de)
+                    +
                     +	# Check geneid
                     +	# Can be either a vector of IDs or a column name
                     +	geneid <- as.character(geneid)
                     +	if(length(geneid) == ngenes) {
                     +		universe <- geneid
                     +	} else
                     +		if(length(geneid) == 1L) {
                     +			universe <- de$genes[[geneid]]
                     +			if(is.null(universe)) stop(paste("Column",geneid,"not found in de$genes"))
                     +		} else
                     +			stop("geneid has incorrect length")
+                    +
                     +	# Check trend
                     +	# Can be logical, or a numeric vector of covariate values, or the name of the column containing the covariate values
                     +	if(is.logical(trend)) {
                     +		if(trend) {
                     +			covariate <- de$Amean
                     +			if(is.null(covariate)) stop("Amean not found in fit")
                     +		}
                     +	} else
                     +		if(is.numeric(trend)) {
                     +			if(length(trend) != ngenes) stop("If trend is numeric, then length must equal nrow(de)")
                     +			covariate <- trend
                     +			trend <- TRUE
                     +		} else {
                     +			if(is.character(trend)) {
                     +				if(length(trend) != 1L) stop("If trend is character, then length must be 1")
                     +				covariate <- de$genes[[trend]]
                     +				if(is.null(covariate)) stop(paste("Column",trend,"not found in de$genes"))
                     +				trend <- TRUE
                     +			} else
                     +				stop("trend is neither logical, numeric nor character")
                     +		}
+                    +
                     +	# Check FDR
                     +	if(!is.numeric(FDR) | length(FDR) != 1) stop("FDR must be numeric and of length 1.")
                     +	if(FDR < 0 | FDR > 1) stop("FDR should be between 0 and 1.")
                     -		fit$genes$GeneID <- geneid
                     -		EG.col <- "GeneID"
                     +	# Get up and down DE genes
                     +	fdr.coef <- p.adjust(de$p.value[,coef], method = "BH")
                     +	EG.DE.UP <- universe[fdr.coef < FDR & de$coef[,coef] > 0]
                     +	EG.DE.DN <- universe[fdr.coef < FDR & de$coef[,coef] < 0]
                     +	de.gene <- list(Up=EG.DE.UP, Down=EG.DE.DN)
+                    +
                     +	# Fit monotonic cubic spline for DE genes vs. gene.weights
                     +	if(trend) {
                     +			PW <- isDE <- rep(0,nrow(de))
                     +			isDE[fdr.coef < FDR] <- 1
                     +			o <- order(covariate)
                     +			PW[o] <- tricubeMovingAverage(isDE[o],span=0.5,full.length=TRUE)
+                     	}
                     +	if(!trend) PW <- NULL
                     -	else if(length(geneid) == 1) EG.col <- as.character(geneid)
+                    -
                     -	if(is.null(fit$genes)) stop("no annotation (genes) found.")
+                    -
                     -	EG.All <- as.character(fit$genes[[EG.col]])
+                    -
                     -	# Get up and down DE genes
                     -	fdr.coef <- p.adjust(fit$p.value[,coef], method = "BH")
                     +	NextMethod(de = de.gene, universe = universe, species = species, weights = PW, ...)
                     +}
                     -	EG.DE.UP <- EG.All[fdr.coef < FDR & fit$coef[,coef] > 0]
                     -	EG.DE.DN <- EG.All[fdr.coef < FDR & fit$coef[,coef] < 0]
                     +goana.default <- function(de, universe = NULL, species = "Hs", weights = NULL, ...)
                     +#  Gene ontology analysis of DE genes
                     +#  Gordon Smyth and Yifang Hu
                     +#  Created 20 June 2014.  Last modified 23 July 2014.
                     +{
                     +	# check de
                     +	if(!is.list(de)) de <- list(DE1 = de)
                     +	if(is.null(names(de))) names(de) <- paste0("DE", 1:length(de))
                     +	de <- lapply(de, as.character)
                      	# Select species
                      	species <- match.arg(species, c("Hs", "Mm", "Rn", "Dm"))
@@ -42,46 +93,102 @@ goana <- function(fit, coef = ncol(fit), geneid = "GeneID", FDR = 0.05, species
                      	d <- duplicated(EG.GO[,c("gene_id", "go_id", "Ontology")])
                      	EG.GO <- EG.GO[!d, ]
                     +	# Check universe
                     +	if(is.null(universe)) universe <- EG.GO$gene_id
                     +	universe <- as.character(universe)
+                    +
                     +	# Check weights
                     +	if(!is.null(weights)){
                     +		if(length(weights)!=length(universe)) stop("length(weights) must equal length(universe).")
                     +	}
+                    +
                      	# Reduce to universe
                     -	EG.GO <- EG.GO[EG.GO$gene_id %in% EG.All, ]
                     +	EG.GO <- EG.GO[EG.GO$gene_id %in% universe, ]
                     +	if(!length(EG.GO$gene_id)) stop("Universe is empty.")
+                    +
                      	Total <- length(unique(EG.GO$gene_id))
                      	# Overlap with DE genes
                     -	isDE.UP <- (EG.GO$gene_id %in% EG.DE.UP)
                     -	isDE.DN <- (EG.GO$gene_id %in% EG.DE.DN)
                     -	TotalDE.UP <- length(unique(EG.GO$gene_id[isDE.UP]))
                     -	TotalDE.DN <- length(unique(EG.GO$gene_id[isDE.DN]))
                     +	isDE <- lapply(de, function(x) EG.GO$gene_id %in% x)
                     +	TotalDE <- lapply(isDE, function(x) length(unique(EG.GO$gene_id[x])))
                     +	nDE <- length(isDE)
+                    +
                     +	if(length(weights)) {
                     +		# Probability weight for each gene
                     +		m <- match(EG.GO$gene_id, universe)
                     +		PW2 <- list(weights[m])
                     +		X <- do.call(cbind, c(N=1, isDE, PW=PW2))
                     +	} else
                     +		X <- do.call(cbind, c(N=1, isDE))
                     -	X <- cbind(N=1, Up=isDE.UP, Down=isDE.DN)
                      	group <- paste(EG.GO$go_id, EG.GO$Ontology, sep=".")
                      	S <- rowsum(X, group=group, reorder=FALSE)
                     -	# Fisher's exact test
                     -	p.UP <- phyper(q=S[,"Up"]-0.5,m=TotalDE.UP,n=Total-TotalDE.UP,k=S[,"N"],lower.tail=FALSE)
                     -	p.DN <- phyper(q=S[,"Down"]-0.5,m=TotalDE.DN,n=Total-TotalDE.DN,k=S[,"N"],lower.tail=FALSE)
                     +	P <- matrix(0, nrow = nrow(S), ncol = nDE)
+                    +
                     +	if(length(weights)) {
+                    +
                     +		# Calculate weight
                     +		require("BiasedUrn", character.only = TRUE)
                     +		PW.ALL <- sum(weights[universe %in% EG.GO$gene_id])
                     +		AVE.PW <- S[,"PW"]/S[,"N"]
                     +		W <- AVE.PW*(Total-S[,"N"])/(PW.ALL-S[,"N"]*AVE.PW)
+                    +
                     +		# Wallenius' noncentral hypergeometric test
                     +		for(j in 1:nDE){
+                    +
                     +			for(i in 1:nrow(S)){
+                    +
                     +				P[i,j] <- pWNCHypergeo(S[i,1+j], S[i,"N"], Total-S[i,"N"], TotalDE[[j]], W[i],lower.tail=FALSE)+
                     +					dWNCHypergeo(S[i,1+j], S[i,"N"], Total-S[i,"N"], TotalDE[[j]], W[i])
                     +			}
+                    +
                     +		}
+                    +
                     +		S <- S[,-ncol(S)]
+                    +
                     +	} else {
+                    +
                     +		# Fisher's exact test
                     +		for(j in 1:nDE){
+                    +
                     +			P[,j] <- phyper(q=S[,1+j]-0.5,m=TotalDE[[j]],n=Total-TotalDE[[j]], k=S[,"N"],lower.tail=FALSE)
                     +		}
+                    +
                     +	}
                      	# Assemble output
                      	g <- strsplit2(rownames(S),split="\\.")
                      	TERM <- select(GO.db,keys=g[,1],columns="TERM")
                     -	Results <- data.frame(Term=TERM[[2]],Ont=g[,2],S,P.Up=p.UP,P.Down=p.DN,stringsAsFactors=FALSE)
                     +	Results <- data.frame(Term = TERM[[2]], Ont = g[,2], S, P, stringsAsFactors=FALSE)
                      	rownames(Results) <- g[,1]
                     +	# Name P value for the DE genes
                     +	iTON <- c(1:3)
                     +	iDE <- 3+c(1:nDE)
                     +	PDE<- paste0("P.", colnames(Results)[iDE])
                     +	colnames(Results)[-c(iTON,iDE)] <- PDE
+                    +
                      	Results
+                     }
                      topGO <- function(results, ontology = c("BP", "CC", "MF"), sort = "up", number = 20L){
                      #  Extract sorted goana gene ontology test results
                      #  Gordon Smyth and Yifang Hu
                     -#  Created 20 June 2014. Last modified 24 June 2014.
+                    -
                     -	# Check input
                     -	if(any(! c("Term", "N", "Up", "Down", "Ont", "P.Up", "P.Down") %in% colnames(results))) stop("Results column names don't match GOTest results.")
+                    -
                     +#  Created 20 June 2014. Last modified 21 July 2014.
+                    +
                     +	# Check results
                     +	if(!is.data.frame(results)) stop("Expect a dataframe with goana results.")
+                    +
                     +	# Check ontology
                      	ontology <- match.arg(ontology, c("BP", "CC", "MF"), several.ok = TRUE)
+                    -
                     -	sort <- match.arg(tolower(sort), c("up", "down"))
                     -	sort <- switch(sort, up = "P.Up", down = "P.Down")
+                    -
+                    +
                     +	# Check sort and sort by P value
                     +	if(length(sort) != 1) stop("sort length needs to be 1.")
                     +	sort <- tolower(colnames(results)) %in% tolower(paste0("P.", sort))
                     +	if(sum(sort)!=1) stop("sort not found.")
+                    +
                     +	# Check number
                      	if(!is.numeric(number)) stop("Need to input number.")
                      	if(number < 1L) return(data.frame())
@@ -89,10 +196,9 @@ topGO <- function(results, ontology = c("BP", "CC", "MF"), sort = "up", number =
                      	sel <- results$Ont %in% ontology
                      	results <- results[sel,]
                     -	# Sort by Up or Down p value
                     +	# Sort by p value
                      	o <- order(results[, sort], rownames(results))
                      	results <- results[o,]
                      	if(number >= nrow(results)) results else results[1:number,]
                     -}
+                    -
                     +}
                     \ No newline at end of file
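
The unweighted branch of goana.default labels its phyper call "Fisher's exact test". A minimal sketch of why that holds, using made-up counts named after the variables in the diff (`x` and the `p.*` names are illustrative only):

```r
# Made-up counts, named after the variables in goana.default
Total   <- 100  # genes in the universe that have GO annotation
TotalDE <- 20   # of which are differentially expressed
N       <- 10   # genes annotated to one GO term
x       <- 5    # DE genes annotated to that term

# Upper-tail hypergeometric probability P(X >= x);
# subtracting 0.5 from q makes the upper tail include x itself.
p.phyper <- phyper(q = x - 0.5, m = TotalDE, n = Total - TotalDE, k = N,
                   lower.tail = FALSE)

# The same probability summed directly from the hypergeometric density
p.sum <- sum(dhyper(x:N, m = TotalDE, n = Total - TotalDE, k = N))

# And via a one-sided Fisher's exact test on the corresponding 2x2 table
tab <- matrix(c(x, TotalDE - x, N - x, Total - TotalDE - (N - x)), nrow = 2)
p.fisher <- fisher.test(tab, alternative = "greater")$p.value
```

All three values should agree up to numerical tolerance. The phyper form is vectorised over GO terms, which is presumably why the source uses it rather than calling fisher.test once per term.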

R/plots-ma.R

@@ -3,7 +3,7 @@
                      plotMA <- function(MA,...) UseMethod("plotMA")
                     -plotMA.RGList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
                     +plotMA.RGList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status=NULL, values=NULL, pch=NULL, col=NULL, cex=NULL, legend=TRUE, zero.weights=FALSE, ...)
                      #	MA-plot with color coding for controls
                      #	Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
                      #	Last modified 23 April 2013.
@@ -12,7 +12,7 @@ plotMA.RGList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[arr
                      	plotMA(MA,array=1,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend,zero.weights=zero.weights,...)
+                     }
                     -plotMA.MAList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
                     +plotMA.MAList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status=NULL, values=NULL, pch=NULL, col=NULL, cex=NULL, legend=TRUE, zero.weights=FALSE, ...)
                      #	MA-plot with color coding for controls
                      #	Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
                      #	Last modified 23 April 2013.
@@ -20,7 +20,7 @@ plotMA.MAList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[arr
                      	x <- as.matrix(MA$A)[,array]
                      	y <- as.matrix(MA$M)[,array]
                      	if(is.null(MA$weights)) w <- NULL else w <- as.matrix(MA$weights)[,array]
                     -	if(missing(status)) status <- MA$genes$Status
                     +	if(is.null(status)) status <- MA$genes$Status
                      	if(!is.null(w) && !zero.weights) {
                      		i <- is.na(w) | (w <= 0)
                      		y[i] <- NA
@@ -28,7 +28,7 @@ plotMA.MAList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[arr
                      	.plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+                     }
                     -plotMA.MArrayLM <- function(MA, coef=ncol(MA), xlab="AveExpr", ylab="logFC", main=colnames(MA)[coef], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
                     +plotMA.MArrayLM <- function(MA, coef=ncol(MA), xlab="AveExpr", ylab="logFC", main=colnames(MA)[coef], xlim=NULL, ylim=NULL, status=NULL, values=NULL, pch=NULL, col=NULL, cex=NULL, legend=TRUE, zero.weights=FALSE, ...)
                      #	MA-plot with color coding for controls
                      #	Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
                      #	Last modified 21 March 2014.
@@ -37,7 +37,7 @@ plotMA.MArrayLM <- function(MA, coef=ncol(MA), xlab="AveExpr", ylab="logFC", mai
                      	x <- MA$Amean
                      	y <- as.matrix(MA$coef)[,coef]
                      	if(is.null(MA$weights)) w <- NULL else w <- as.matrix(MA$weights)[,coef]
                     -	if(missing(status)) status <- MA$genes$Status
                     +	if(is.null(status)) status <- MA$genes$Status
                      	if(!is.null(w) && !zero.weights) {
                      		i <- is.na(w) | (w <= 0)
                      		y[i] <- NA
@@ -45,7 +45,7 @@ plotMA.MArrayLM <- function(MA, coef=ncol(MA), xlab="AveExpr", ylab="logFC", mai
                      	.plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+                     }
                     -plotMA.EList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
                     +plotMA.EList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status=NULL, values=NULL, pch=NULL, col=NULL, cex=NULL, legend=TRUE, zero.weights=FALSE, ...)
                      #	MA-plot with color coding for controls
                      #	Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
                      #	Last modified 23 April 2013.
@@ -59,7 +59,7 @@ plotMA.EList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[arra
                      		x <- rowMeans(MA$E,na.rm=TRUE)
                      	y <- MA$E[,array]-x
                      	if(is.null(MA$weights)) w <- NULL else w <- as.matrix(MA$weights)[,array]
                     -	if(missing(status)) status <- MA$genes$Status
                     +	if(is.null(status)) status <- MA$genes$Status
                      	if(!is.null(w) && !zero.weights) {
                      		i <- is.na(w) | (w <= 0)
@@ -68,34 +68,29 @@ plotMA.EList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[arra
                      	.plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+                     }
                     -plotMA.default <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
                     +plotMA.default <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status=NULL, values=NULL, pch=NULL, col=NULL, cex=NULL, legend=TRUE, ...)
                      #	MA-plot with color coding for controls
                      #	Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
                     -#	Last modified 23 April 2013.
                     +#	Last modified 8 August 2014.
+                     {
                      #	Data is assumed to be single-channel
                      	MA <- as.matrix(MA)
                      	narrays <- ncol(MA)
                     -	if(narrays < 2) stop("Need at least two arrays")
                     -	if(narrays > 5)
                     -		x <- apply(MA,1,median,na.rm=TRUE)
                     -	else
                     -		x <- rowMeans(MA,na.rm=TRUE)
                     -	y <- MA[,array]-x
                     -	w <- NULL
                     -	if(missing(status)) status <- NULL
                     +	if(narrays<2) stop("Need at least two columns")
                     +	array <- as.integer(array[1L])
                     +	Ave <- rowMeans(MA[,-array,drop=FALSE],na.rm=TRUE)
                     +	y <- MA[,array]-Ave
                     +	x <- (MA[,array]+Ave)/2
                     -	if(!is.null(w) && !zero.weights) {
                     -		i <- is.na(w) | (w <= 0)
                     -		y[i] <- NA
                     -	}
                      	.plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+                     }
                     -.plotMAxy <- function(x, y, xlab="A", ylab="M", main=NULL, xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, ...)
                     +# Call this plotWithHighlights and document?
+                    +
                     +.plotMAxy <- function(x, y, xlab="A", ylab="M", main=NULL, xlim=NULL, ylim=NULL, status=NULL, values=NULL, pch=NULL, col=NULL, cex=NULL, legend=TRUE, pch0=16, col0="black", cex0=0.3, ...)
                      #	MA-plot with color coding for controls
                      #	Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
                     -#	Last modified 23 April 2013.
                     +#	Last modified 13 April 2014.
+                     {
                      #	Check legend
                      	legend.position <- "topleft"
@@ -114,65 +109,69 @@ plotMA.default <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[ar
                      #	If no status information, just plot points normally
                      	if(is.null(status) || all(is.na(status))) {
                     -		if(missing(pch)) pch <- 16
                     -		if(missing(cex)) cex <- 0.3
                     -		points(x,y,pch=pch[[1]],cex=cex[1])
                     +		points(x,y,pch=pch0,cex=cex0)
                      		return(invisible())
+                     	}
                     -#	From here, status is not NULL and not all missing
                     +#	From here, status is not NULL and not all NA
                      #	Check values
                     -	if(missing(values)) {
                     -		if(is.null(attr(status,"values")))
                     -			values <- names(sort(table(status),decreasing=TRUE))
                     -		else
                     -			values <- attr(status,"values")
                     +#	Default is to set the most frequent status value as background, and to highlight all other status values in order of frequency
                     +	if(is.null(values)) values <- attr(status,"values")
                     +	if(is.null(values)) {
                     +		status.values <- names(sort(table(status),decreasing=TRUE))
                     +		status <- as.character(status)
                     +		values <- status.values[-1]
+                     	}
                      	nvalues <- length(values)
                     +	if(nvalues==0L) {
                     +		points(x,y,pch=pch0,cex=cex0)
                     +		return(invisible())
                     +	}
+                    +
                     +#	From here, values has positive length
                      #	Plot non-highlighted points
                     -	sel <- !(status %in% values)
                     -	nonhi <- any(sel)
                     -	if(nonhi) points(x[sel],y[sel],pch=16,cex=0.3)
                     +	bg <- !(status %in% values)
                     +	nonhi <- any(bg)
                     +	if(nonhi) points(x[bg],y[bg],pch=pch0,cex=cex0)
                     -	if(missing(pch)) {
                     -		if(is.null(attr(status,"pch")))
                     -			pch <- rep(16,nvalues)
                     -		else
                     -			pch <- attr(status,"pch")
                     -	}
                     +#	Check parameters for plotting highlighted points
                     -	if(missing(cex)) {
                     -		if(is.null(attr(status,"cex"))) {
                     -			cex <- rep(1,nvalues)
                     -			if(!nonhi) cex[1] <- 0.3
                     -		} else
                     -			cex <- attr(status,"cex")
                     -	}
                     +	if(is.null(pch)) pch <- attr(status,"pch")
                     +	if(is.null(pch)) pch <- pch0
                     +	pch <- rep(pch,length=nvalues)
                     -	if(missing(col)) {
                     -		if(is.null(attr(status,"col"))) {
                     -			col <- nonhi + 1:nvalues
                     -		} else
                     -			col <- attr(status,"col")
                     -	}
                     +	if(is.null(cex)) cex <- attr(status,"cex")
                     +	if(is.null(cex)) cex <- 1
                     +	cex <- rep(cex,length=nvalues)
                     -	pch <- rep(pch,length=nvalues)
                     +	if(is.null(col)) col <- attr(status,"col")
                     +	if(is.null(col)) col <- nonhi + 1L:nvalues
                      	col <- rep(col,length=nvalues)
                     -	cex <- rep(cex,length=nvalues)
                     -#	Plot highlighted classes of points
                     +#	Plot highlighted points
                      	for (i in 1:nvalues) {
                      		sel <- status==values[i]
                      		points(x[sel],y[sel],pch=pch[[i]],cex=cex[i],col=col[i])
+                     	}
                      	if(legend) {
                     +		if(nonhi) {
                     +#			Include background value in legend
                     +			bg.value <- unique(status[bg])
                     +			if(length(bg.value) > 1) bg.value <- "Other"
                     +			values <- c(bg.value,values)
                     +			pch <- c(pch0,pch)
                     +			col <- c(col0,col)
                     +			cex <- c(cex0,cex)
                     +		}
                     +		h <- cex>0.5
                     +		cex[h] <- 0.5+0.8*(cex[h]-0.5)
                      		if(is.list(pch))
                     -			legend(legend.position,legend=values,fill=col,col=col,cex=0.9)
                     +			legend(legend.position,legend=values,fill=col,col=col,cex=0.9,pt.cex=cex)
                      		else
                     -			legend(legend.position,legend=values,pch=pch,,col=col,cex=0.9)
                     +			legend(legend.position,legend=values,pch=pch,,col=col,cex=0.9,pt.cex=cex)
+                     	}
                      	invisible()
+                     }
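
The new single-channel axis definition in plotMA.default above can be tried outside limma. The following is a minimal base-R sketch, not the package code: the helper name ma_axes and the toy matrix are assumptions made for illustration, and only the arithmetic (average every column except the plotted one, then take difference and mean) is taken from the diff.

```r
# Hypothetical helper (not part of limma): computes the M and A values
# that the revised plotMA.default passes on to the plotting routine.
ma_axes <- function(E, array = 1L) {
  E <- as.matrix(E)
  if (ncol(E) < 2L) stop("Need at least two columns")
  array <- as.integer(array[1L])
  # Artificial array: row-wise average of all columns except the plotted one
  ave <- rowMeans(E[, -array, drop = FALSE], na.rm = TRUE)
  list(M = E[, array] - ave,        # difference (y-axis)
       A = (E[, array] + ave) / 2)  # mean (x-axis)
}

# Toy log-expression matrix: 3 probes, 3 arrays (made-up numbers)
E <- cbind(a1 = c(8, 10, 12), a2 = c(9, 10, 11), a3 = c(7, 10, 13))

ax  <- ma_axes(E, array = 1)         # a1 against the average of a2 and a3
two <- ma_axes(E[, 1:2], array = 1)  # two arrays: plain mean-difference plot
```

With two columns the artificial array is just the other column, so M and A are the ordinary difference and mean of the pair, which is the property the commit message describes.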

R/read-ilmn.R


@@ -3,7 +3,7 @@
                      read.ilmn <- function(files=NULL, ctrlfiles=NULL, path=NULL, ctrlpath=NULL, probeid="Probe", annotation=c("TargetID", "SYMBOL"), expr="AVG_Signal", other.columns="Detection",sep="\t", quote="\"", verbose=TRUE, ...)
                      #	Read one or more files of Illumina BeadStudio output
                      #	Wei Shi and Gordon Smyth.
                     -#	Created 15 July 2009. Last modified 27 November 2013.
                     +#	Created 15 July 2009. Last modified 21 July 2014.
+                     {
                      	if(!is.null(files)){
                      		f <- unique(files)
@@ -34,12 +34,22 @@ read.ilmn <- function(files=NULL, ctrlfiles=NULL, path=NULL, ctrlpath=NULL, prob
                      			else
                      				elist.ctrl <- cbind(elist.ctrl, elist.ctrl1)
+                     		}
                     -		elist.ctrl$genes$Status <- elist.ctrl$genes[,ncol(elist.ctrl$genes)]
+                    +
                     +		if(is.null(elist.ctrl$genes)) elist.ctrl$genes <- data.frame(Status = rep("negative", nrow(elist.ctrl)))
                     +		else {
                     +			ctrl.status.negative <- apply(elist.ctrl$genes, 2, function(x) sum(tolower(x) %in% "negative"))
                     +			STATUS.col <- which.max(ctrl.status.negative)
                     +			if(ctrl.status.negative[STATUS.col] > 0) elist.ctrl$genes$Status <- elist.ctrl$genes[[STATUS.col]]
                     +			else elist.ctrl$genes$Status <- "negative"
                     +		}
+                     	}
                      	if(!is.null(files))
                      		if(!is.null(ctrlfiles)){
                     -			colnames(elist.ctrl$genes) <- colnames(elist$genes)
                     +			REG.col <- setdiff(colnames(elist$genes), colnames(elist.ctrl$genes))
                     +			if(length(REG.col)) for(i in REG.col) elist.ctrl$genes[[i]] <- NA
                     +			CTRL.col <- setdiff(colnames(elist.ctrl$genes), colnames(elist$genes))
                     +			if(length(CTRL.col)) for(i in CTRL.col) elist$genes[[i]] <- NA
                      			return(rbind(elist, elist.ctrl))
+                     		}
                      		else
@@ -64,7 +74,7 @@ read.ilmn.targets <- function(targets, ...)
                      .read.oneilmnfile <- function(fname, probeid, annotation, expr, other.columns, sep, quote, verbose, ...)
                      #	Read a single file of Illumina BeadStudio output
                      #	Wei Shi and Gordon Smyth
                     -#	Created 15 July 2009. Last modified 16 June 2014.
                     +#	Created 15 July 2009. Last modified 21 July 2014.
+                     {
                      	h <- readGenericHeader(fname,columns=expr,sep=sep)
                      	skip <- h$NHeaderRecords
@@ -101,7 +111,7 @@ read.ilmn.targets <- function(targets, ...)
                      #	Add probe annotation
                      	if(length(anncol)) {
                      		elist$genes <- x[,anncol,drop=FALSE]
                     -		if(!any(duplicated(pids))) row.names(elist$genes) <- pids
                     +		if(length(pids) & !any(duplicated(pids))) row.names(elist$genes) <- pids
+                     	}
                      #	elist$targets <- data.frame(SampleNames=snames, stringsAsFactors=FALSE)
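
The control-probe handling added to read.ilmn above does two things: it picks the annotation column that most often contains "negative" as the Status column, and it pads the regular and control annotation with NA columns so both have the same column names. A rough base-R sketch of those two steps on plain data frames follows. The data, the column names and the helper pad_columns are invented for illustration; the real function works on EList objects.

```r
# Made-up control-probe annotation; column names are illustrative only
ctrl.genes <- data.frame(
  TargetID = c("NEGATIVE", "negative", "BIOTIN", "Negative"),
  ProbeID  = c("1", "2", "3", "4"),
  stringsAsFactors = FALSE
)

# Count case-insensitive matches to "negative" in each column
n.negative <- apply(ctrl.genes, 2, function(x) sum(tolower(x) %in% "negative"))
status.col <- which.max(n.negative)
if (n.negative[status.col] > 0) {
  ctrl.genes$Status <- ctrl.genes[[status.col]]
} else {
  # No column mentions "negative": label every control probe as negative
  ctrl.genes$Status <- "negative"
}

# Hypothetical helper: add to `a` any column that only `b` has, filled with NA
pad_columns <- function(a, b) {
  for (nm in setdiff(colnames(b), colnames(a))) a[[nm]] <- NA
  a
}

reg.genes <- data.frame(ProbeID = c("10", "11"), SYMBOL = c("TP53", "MYC"),
                        stringsAsFactors = FALSE)
ctrl.padded <- pad_columns(ctrl.genes, reg.genes)
reg.padded  <- pad_columns(reg.genes, ctrl.genes)
```

After padding, both data frames carry the same set of column names, which is what lets the regular and control probes be stacked without renaming columns as the old code did.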

inst/doc/changelog.txt


@@ -1,3 +1,28 @@
                     + 8 Aug 2014: limma 3.21.12
+                    +
                     +- The definition of the M and A axes for an MA-plot of single channel
                     +  data is changed slightly.  Previously the A-axis was the average of
                     +  all arrays in the dataset -- this has been the definition since MA-plots
                     +  were introduced for single channel data in April 2003.  Now an
                     +  artificial array is formed by averaging all arrays other than the
                     +  one to be plotted.  Then a mean-difference plot is formed from the
                     +  specified array and the artificial array.  This change ensures the
                     +  specified and artificial arrays are computed from independent data,
                     +  and ensures the MA-plot will reduce to a correct mean-difference
                     +  plot when there are just two arrays in the dataset.
+                    +
                     + 7 Aug 2014: limma 3.21.11
+                    +
                     +- diffSplice() now includes Simes adjusted p-values
+                    +
                     +- topSplice() has a new level="hybrid" argument for ranking genes by
                     +  Simes adjusted p-values.
+                    +
                     +- goana() is now an S3 generic function.
+                    +
                     +- goana() can now optionally adjust for gene length or abundance
                     +  bias.
+                    +
 June 2014: limma 3.21.10
                      - remove col argument from plotMDS(), as it handled by ... as are
@@ -3448,6 +3473,10 @@ Apr 25 2003: limma 0.9.7
                      The smawehi package was renamed to limma, with the title "Linear Models
                      for Microarray Data" and became part of the Bioconductor project.
                     +Apr 7, 2003
+                    +
                     +MA plots added for both two-color and single channel data.
+                    +
 November 2002: smawehi 0.1
                      smawehi package made publicly available for the first time, through
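
The 3.21.11 changelog entry above mentions Simes adjusted p-values for ranking genes from exon-level tests. As a hedged illustration, the standard Simes combination of n p-values is the minimum over i of n * p_(i) / i, where p_(i) is the i-th smallest p-value. The sketch below applies that formula per gene in base R. The function name simes and the toy data are assumptions, and whether diffSplice computes the statistic in exactly this way is not shown in this diff.

```r
# Standard Simes combination of a vector of p-values (hypothetical helper name)
simes <- function(p) {
  p <- sort(p[!is.na(p)])
  n <- length(p)
  min(n * p / seq_len(n))
}

# Made-up exon-level p-values for two genes
exon.p <- c(0.001, 0.20, 0.03, 0.50, 0.04, 0.60)
gene   <- c("g1",  "g1", "g1", "g2", "g2", "g2")

# One Simes p-value per gene
gene.simes <- tapply(exon.p, gene, simes)
```

A gene with one strongly significant exon (g1 here) gets a small gene-level p-value even when its other exons show nothing, which is why this kind of summary suits a gene-level ranking built from exon-level tests.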

man/genas.Rd


@@ -30,10 +30,10 @@ The method is explained briefly in Majewski et al (2010) and in full detail in P
                      The \code{subset} argument specifies whether and how the fit object should be subsetted.
                      Ideally, only genes that are truly differentially expressed for one or both of the contrasts should be used to estimate the biological correlation.
                      The default is \code{"all"}, which uses all genes in the fit object to estimate the biological correlation.
                     -The option \code{"Fpval"} chooses genes based on how many F-test p-values are estimated to be truly significant using the method \code{propNotDE}.
                     +The option \code{"Fpval"} chooses genes based on how many F-test p-values are estimated to be truly significant using the function \code{propTrueNull}.
                      This should capture genes that display any evidence of differential expression in either of the two contrasts.
                      The options \code{"p.union"} and \code{"p.int"} are based on the moderated t p-values from both contrasts.
                     -From the \code{propNotDE} method an estimate of the number of p-values truly significant in either of the two contrasts can be obtained.
                     +From the \code{propTrueNull} function an estimate of the number of p-values truly significant in either of the two contrasts can be obtained.
                      "p.union" takes the union of these genes and \code{"p.int"} takes the intersection of these genes.
                      The other options, \code{"logFC"} and \code{"predFC"} subsets on genes that attain a logFC or predFC at least as large as the 90th percentile of the log fold changes or predictive log fold changes on the absolute scale.

man/goana.Rd


@@ -1,48 +1,97 @@
                      \name{goana}
                      \alias{goana}
                     +\alias{goana.default}
                     +\alias{goana.MArrayLM}
                      \title{Gene Ontology Analysis of Differentially Expressed Genes}
                      \description{
                     -Hypergeometric tests on up and down differentially expressed genes for over-representation of GO Terms.
                     +Fisher's exact test or Wallenius' noncentral hypergeometric test on differentially expressed genes for over-representation of gene ontology (GO) terms.
+                     }
                      \usage{
                     -goana(fit, coef = ncol(fit), geneid = "GeneID", FDR = 0.05, species = "Hs")
                     +\method{goana}{default}(de, universe = NULL, species = "Hs", weights = NULL, \dots)
                     +\method{goana}{MArrayLM}(de, coef = ncol(de), geneid = rownames(de), FDR = 0.05,
                     +      species = "Hs", trend = FALSE, \dots)
+                     }
                      \arguments{
                     -  \item{fit}{should be an object of class \code{MArrayLM} as produced by \code{lmFit} and \code{eBayes}.}
                     +  \item{de}{Differentially expressed genes.  Can be a \code{MArrayLM} fit object, or a vector or list of Entrez Gene IDs of differentially expressed genes.}
                     +  \item{universe}{vector specifying the Entrez Gene identifiers of the universe. If \code{NULL}, all Entrez Gene IDs with any gene ontology annotation will be the universe.}
                     +  \item{species}{character string specifying the species. Possible values are \code{"Hs"}, \code{"Mm"}, \code{"Rn"} or \code{"Dm"}.}
                     +  \item{weights}{probability weighting of the universe.}
                        \item{coef}{column number or column name specifying which coefficient or contrast of the linear model is of interest.}
                     -  \item{geneid}{Entrez gene identifiers. Either a vector of length nrow(fit) or the name of the column of fit$genes containing the Entrez Gene IDs.}
                     +  \item{geneid}{Entrez Gene identifiers. Either a vector of length \code{nrow(de)} or the name of the column of \code{de$genes} containing the Entrez Gene IDs.}
                        \item{FDR}{numeric. False discovery rate of differentially expressed genes less than this cutoff.}
                     -  \item{species}{character string specifying the species. Possible values are \code{"Hs"}, \code{"Mm"}, \code{"Rn"} or \code{"Dm"}.}
                     +  \item{trend}{adjust analysis for gene length or abundance?
                     +  Can be logical, or a numeric vector of covariate values, or the name of the column of \code{de$genes} containing the covariate values.
                     +  If \code{TRUE}, then \code{de$Amean} is used as the covariate.}
                     +  \item{\dots}{further arguments.}
+                     }
                      \details{
                     -This function is used in conjunction with \code{\link{lmFit}}, \code{\link{eBayes}} and \code{\link{topGO}}.
                     +Performs a Gene Ontology enrichment analysis for one or more gene lists using the appropriate Bioconductor organism package.
                     +The gene lists must be supplied as Entrez Gene IDs.
+                    +
                     +If applied to a linear model fit, then the differentially expressed genes are extracted automatically from the fit object, and analyses are done separately for the up and down significant genes.
+                    +
                     +If \code{trend=FALSE}, the function computes one-sided hypergeometric tests.
                     +If \code{trend=TRUE} or a covariate is supplied, then a trend is fitted to the differential expression results and the method of Young et al (2010) is used to adjust for this trend.
                     +The adjusted test uses Wallenius' noncentral hypergeometric distribution.
+                     }
                      \value{
                     -  A dataframe with rows for GO IDs and the following columns:
                     +  A dataframe with rows for GO IDs and the following possible columns:
                        \item{Term}{GO terms.}
                        \item{Ont}{GO term ontology from BP, CC and MF.}
                        \item{N}{Number of genes in the GO term.}
                        \item{Up}{Number of up regulated differentially expressed genes for testing.}
                        \item{Down}{Number of down regulated differentially expressed genes for testing.}
                     -  \item{P.Up}{P value from hypergeometric test of up regulated differentially expressed genes for over representation of GO terms.}
                     -  \item{P.Down}{P value from hypergeometric test of down regulated differentially expressed genes for over representation of GO terms.}
                     +  \item{DE1}{Number of differentially expressed genes for testing.}
                     +  \item{P.Up}{P value from hypergeometric test or Wallenius' noncentral hypergeometric test of up regulated differentially expressed genes for over representation of GO terms.}
                     +  \item{P.Down}{P value from hypergeometric test or Wallenius' noncentral hypergeometric test of down regulated differentially expressed genes for over representation of GO terms.}
                     +  \item{P.DE1}{P value from hypergeometric test or Wallenius' noncentral hypergeometric test of differentially expressed genes for over representation of GO terms.}
                     +}
+                    +
                     +\references{
                     +  Young, M. D., Wakefield, M. J., Smyth, G. K., Oshlack, A. (2010).
                     +  Gene ontology analysis for RNA-seq: accounting for selection bias.
                     +  \emph{Genome Biology} 11, R14.
                     +  \url{http://genomebiology.com/2010/11/2/R14}
+                     }
                      \seealso{
                      \code{\link{topGO}}
                     -The gostats package also does GO analyses with some extra options.
                     +The goseq package implements a similar GO analysis.
                     +The goseq version will work with a variety of gene identifiers, not only Entrez Gene as here, and includes a database of gene length information for various species.
+                    +
                     +The gostats package also does GO analyses with some different options.
+                     }
                      \author{Gordon Smyth and Yifang Hu}
                      \examples{
                      \dontrun{
                     -fit <- lmFit(y,design)
                     +fit <- lmFit(y, design)
                      fit <- eBayes(fit)
                     -go.results <- goana(fit)
                     -topGO(go.results, sort = "up")
                     -topGO(go.results, sort = "down")
+                    +
                     +# goana without adjusting for gene length bias
                     +go.fisher <- goana(fit)
                     +topGO(go.fisher, sort = "up")
                     +topGO(go.fisher, sort = "down")
+                    +
                     +# goana adjusting for gene length bias
                     +go.len <- goana(fit, geneid = "GeneID", trend = "Length")
                     +topGO(go.len, sort = "up")
                     +topGO(go.len, sort = "down")
+                    +
                     +# goana adjusting for gene abundance
                     +go.abund <- goana(fit, geneid = "GeneID", trend = TRUE)
                     +topGO(go.abund, sort = "up")
                     +topGO(go.abund, sort = "down")
+                    +
                     +# goana.default
                     +go.de <- goana(list(DE1 = EG.DE1, DE2 = EG.DE2, DE3 = EG.DE3))
                     +topGO(go.de, sort = "DE1")
                     +topGO(go.de, sort = "DE2")
                     +topGO(go.de, ontology = "BP", sort = "DE3")
                     +topGO(go.de, ontology = "CC", sort = "DE3")
                     +topGO(go.de, ontology = "MF", sort = "DE3")
+                     }
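
The goana documentation above says that with trend=FALSE the function computes one-sided hypergeometric tests. The base-R sketch below shows what such a test looks like for a single GO term. All counts are invented for illustration and none of the variable names come from limma.

```r
# Made-up counts for one GO term
N.universe <- 10000  # genes in the universe
N.term     <- 200    # universe genes annotated to the term
N.de       <- 500    # differentially expressed (DE) genes
N.de.term  <- 25     # DE genes annotated to the term

# Over-representation p-value: P(X >= N.de.term) when N.de genes are
# drawn without replacement from the universe
p.over <- phyper(N.de.term - 1, m = N.term, n = N.universe - N.term,
                 k = N.de, lower.tail = FALSE)

# The same test expressed as a one-sided Fisher's exact test on the 2x2 table
tab <- matrix(c(N.de.term,        N.term - N.de.term,
                N.de - N.de.term, N.universe - N.term - N.de + N.de.term),
              nrow = 2)
p.fisher <- fisher.test(tab, alternative = "greater")$p.value
```

The two p-values agree, which is why the description line can call the unadjusted test either hypergeometric or Fisher's exact. The trend-adjusted variant based on Wallenius' noncentral hypergeometric distribution is not covered by this sketch.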

man/plotSplice.Rd


@@ -1,22 +1,24 @@
                     -\title{Plot exons on differentially spliced gene}
                     +\title{Differential splicing plot}
                      \name{plotSplice}
                      \alias{plotSplice}
                      \description{
                     -Plot exons of differentially spliced gene.
                     +Plot relative log-fold changes by exons for the specified gene.
+                     }
                      \usage{
                     -plotSplice(fit, coef=ncol(fit), geneid=NULL, rank=1L, FDR = 0.05)
                     +plotSplice(fit, coef=ncol(fit), geneid=NULL, genecolname=NULL, rank=1L, FDR = 0.05)
+                     }
                      \arguments{
                        \item{fit}{\code{MArrayLM} fit object produced by \code{diffSplice}.}
                        \item{coef}{the coefficient (column) of fit for which differential splicing is assessed.}
                        \item{geneid}{character string, ID of the gene to plot.}
                     +  \item{genecolname}{column name of \code{fit$genes} containing gene IDs. Defaults to \code{fit$genecolname}.}
                        \item{rank}{integer, if \code{geneid=NULL} then this ranked gene will be plotted.}
                        \item{FDR}{numeric, mark exons with false discovery rate less than this cutoff.}
+                     }
                      \details{
                     -Plots interaction log2-fold-change by exon for the specified gene.
                     +Plot relative log2-fold-changes by exon for the specified gene.
                     +The relative logFC is the difference between the exon's logFC and the overall logFC for the gene, as computed by \code{diffSplice}.
+                     }
                      \value{A plot is created on the current graphics device.}
@@ -27,3 +29,4 @@ Plots interaction log2-fold-change by exon for the specified gene.
                      An overview of diagnostic functions available in LIMMA is given in \link{09.Diagnostics}.
+                     }
                      \examples{# See diffSplice}
                     +\keyword{rna-seq}

man/plotma.Rd


@@ -10,16 +10,15 @@
                      Creates an MA-plot with color coding for control spots.
+                     }
                      \usage{
                     -\method{plotMA}{default}(MA, array = 1, xlab = "A", ylab = "M", main = colnames(MA)[array],
                     -       xlim = NULL, ylim = NULL, status, values, pch, col, cex, legend = TRUE,
                     -       zero.weights = FALSE, ...)
                     -\method{plotMA}{MArrayLM}(MA, coef = ncol(MA), xlab = "AveExpr", ylab = "logFC",
                     -       main = colnames(MA)[coef], xlim = NULL, ylim = NULL, status, values, pch, col,
                     -       cex, legend = TRUE, zero.weights = FALSE, ...)
                     +\method{plotMA}{default}(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL,
                     +       ylim=NULL, status=NULL, values=NULL, pch=NULL, col=NULL, cex=NULL, legend=TRUE, \dots)
                     +\method{plotMA}{MArrayLM}(MA, coef=ncol(MA), xlab="AveExpr", ylab="logFC",
                     +       main=colnames(MA)[coef], xlim=NULL, ylim=NULL, status=NULL, values=NULL, pch=NULL,
                     +       col=NULL, cex=NULL, legend=TRUE, zero.weights=FALSE, \dots)
+                     }
                      \arguments{
                     -  \item{MA}{an \code{RGList}, \code{MAList}, \code{EList} or \code{MArrayLM} object.
                     -  Alternatively a \code{matrix} or \code{ExpressionSet} object.}
                     +  \item{MA}{an \code{RGList}, \code{MAList}, \code{EList}, \code{ExpressionSet} or \code{MArrayLM} object.
                     +  Alternatively a numeric \code{matrix}.}
                        \item{array}{integer giving the array to be plotted.}
                        \item{coef}{integer giving the linear model coefficient to be plotted.}
                        \item{xlab}{character string giving label for x-axis}
@@ -28,7 +27,7 @@ Creates an MA-plot with color coding for control spots.
                        \item{xlim}{numeric vector of length 2 giving limits for x-axis, defaults to min and max of the data}
                        \item{ylim}{numeric vector of length 2 giving limits for y-axis, defaults to min and max of the data}
                        \item{status}{character vector giving the control status of each spot on the array, of same length as the number of rows of \code{MA$M}.
                     -  If omitted, all points are plotted in the default color, symbol and size.}
                     +  If omitted, all points are plotted in the default color (\code{"black"}), symbol (\code{pch=16}) and size (\code{cex=0.3}).}
                        \item{values}{character vector giving values of \code{status} to be highlighted on the plot. Defaults to unique values of \code{status}.
                        Ignored if there is no \code{status} vector.}
                        \item{pch}{vector or list of plotting characters. Default is integer code 16 which gives a solid circle.
@@ -36,25 +35,27 @@ Creates an MA-plot with color coding for control spots.
                        \item{col}{numeric or character vector of colors, of the same length as \code{values}. Defaults to \code{1:length(values)}.
                        Ignored if there is no \code{status} vector.}
                        \item{cex}{numeric vector of plot symbol expansions, of the same length as \code{values}.
                     -  Defaults to 0.3 for the most common status value and 1 for the others.
                     +  Default is 1.
                        Ignored if there is no \code{status} vector.}
                        \item{legend}{logical, should a legend of plotting symbols and colors be included. Can also be a character string giving position to place legend. Ignored if there is no \code{status} vector.}
                        \item{zero.weights}{logical, should spots with zero or negative weights be plotted?}
                     -  \item{...}{any other arguments are passed to \code{plot}}
                     +  \item{\dots}{any other arguments are passed to \code{plot}}
+                     }
                      \details{
                      An MA-plot is a plot of log-intensity ratios (M-values) versus log-intensity averages (A-values).
                     -If \code{MA} is an \code{RGList} or \code{MAList} then this function produces an ordinary within-array MA-plot.
                     -If \code{MA} is an \code{MArrayLM} object, then the plot is an fitted model MA-plot in which the estimated coefficient is on the y-axis and the average A-value is on the x-axis.
                     +For two color data objects, a within-array MA-plot is produced with the M and A values computed from the two channels for the specified array.
                     +This is the same as a mean-difference plot (\code{\link{mdplot}}) with the red and green log2-intensities of the array providing the two columns.
+                    +
                     +For single channel data objects, a between-array MA-plot is produced.
                     +An artificial array is produced by averaging all the arrays other than the array specified.
                     +A mean-difference plot is then produced from the specified array and the artificial array.
                     +Note that this procedure reduces to an ordinary mean-difference plot when there are just two arrays total.
                     -If \code{MA} is a \code{matrix} or \code{ExpressionSet} object, then this function produces a between-array MA-plot.
                     -In this case the A-values in the plot are the average log-intensities across the arrays and the M-values are the deviations of the log-intensities for the specified array from the average.
                     -If there are more than five arays, then the average is computed robustly using medians.
                     -With five or fewer arrays, it is computed by means.
                     +If \code{MA} is an \code{MArrayLM} object, then the plot is a fitted model MA-plot in which the estimated coefficient is on the y-axis and the average A-value is on the x-axis.
                      The \code{status} vector is intended to specify the control status of each spot, for example "gene", "ratio control", "house keeping gene", "buffer" and so on.
                     -The vector is usually computed using the function \code{\link{controlStatus}} and a spot-types file.
                     +The vector is often computed using the function \code{\link{controlStatus}} and a spot-types file.
                      However the function may be used to highlight any subset of spots.
                      The \code{status} can be included as the component \code{MA$genes$Status} instead of being passed as an argument to \code{plotMA}.
@@ -77,8 +78,7 @@ status[4:6] <- "M=3"
                      MA$M[4:6] <- 3
                      status[7:9] <- "M=-3"
                      MA$M[7:9] <- -3
                     -plotMA(MA,main="MA-Plot with Simulated Data",
                     -       status=status,values=c("M=0","M=3","M=-3"),col=c("blue","red","green"))
                     +plotMA(MA,main="MA-Plot with Simulated Data",status=status,values=c("M=0","M=3","M=-3"),col=c("blue","red","green"))
                      #  Same as above
                      attr(status,"values") <- c("M=0","M=3","M=-3")
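
Review note: the single-channel procedure described in the new details text can be sketched in a few lines of R. This is an illustration only, not the code in R/plots-ma.R. The helper name `between_array_ma` is hypothetical, and the sketch assumes that "averaging" means a plain row-wise mean of the other arrays.

```r
# Hypothetical helper (not limma's plotMA code): M and A values for one
# column of a single-channel log-expression matrix.
between_array_ma <- function(x, array = 1) {
  x <- as.matrix(x)
  stopifnot(ncol(x) >= 2)
  # Artificial array: row-wise mean of every array except the one plotted
  # (assumption: plain means, missing values ignored)
  artificial <- rowMeans(x[, -array, drop = FALSE], na.rm = TRUE)
  list(
    A = (x[, array] + artificial) / 2,  # mean of the two arrays
    M = x[, array] - artificial         # difference of the two arrays
  )
}

# With two arrays the artificial array is just the other column, so the
# result is an ordinary mean-difference plot of the two columns.
x <- cbind(a = c(8, 10, 12), b = c(7, 10, 14))
ma <- between_array_ma(x, array = 1)
plot(ma$A, ma$M, xlab = "A", ylab = "M")
```

Because the specified array is excluded from the average, the two quantities being compared are computed from different columns of the data.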

man/topSplice.Rd

History View file @ c05cb5963

@@ -5,18 +5,27 @@
                      Top table ranking the most differentially spliced genes or exons.
                      }
                      \usage{
                     -topSplice(fit, coef=ncol(fit), level="exon", number=10, FDR=1)
                     +topSplice(fit, coef=ncol(fit), level="hybrid", number=10, FDR=1)
                      }
                      \arguments{
                        \item{fit}{\code{MArrayLM} fit object produced by \code{diffSplice}.}
                        \item{coef}{the coefficient (column) of fit for which differential splicing is assessed.}
                     -  \item{level}{character string, should the table be by \code{"exon"} or by \code{"gene"}.}
                     +  \item{level}{character string, possible values are \code{"gene"}, \code{"exon"} or \code{"hybrid"}.
                     +    \code{"gene"} gives F-tests for each gene.
                     +    \code{"exon"} gives t-tests for each exon.
                     +    \code{"hybrid"} gives gene level results with p-values derived from the exon-level tests.}
                        \item{number}{integer, maximum number of rows to output.}
                        \item{FDR}{numeric, only show exons or genes with false discovery rate less than this cutoff.}
                      }
                      \details{
                     -Ranks genes by the Plots interaction log2-fold-change by exon for the specified gene.
                     +Ranks genes or exons by evidence for differential splicing.
                     +The gene-level test is an F-test for any differences in exon usage between experimental conditions.
                     +The exon-level tests are t-tests for differences between each exon and all other exons for the same gene.
                     +
                     +The hybrid testing method processes the exon-level p-values to give an overall call of differential splicing for each gene.
                     +It returns the minimum Simes-adjusted p-value for each gene.
                     +It is likely to be more powerful than the genewise F-tests if only a minority of exons for a gene are differentially spliced.
                      }
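
Review note: the hybrid method described in the details above can be illustrated with the standard Simes procedure. The sketch below is not the diffSplice or topSplice code. The helper name `simes_p` and the example p-values are hypothetical, and it assumes that the "minimum Simes-adjusted p-value" is the smallest of n * p_(i) / i over the sorted exon-level p-values of a gene.

```r
# Hypothetical helper (not limma code): Simes combined p-value for one gene
simes_p <- function(p) {
  p <- sort(p[!is.na(p)])   # order the exon-level p-values
  n <- length(p)
  min(n * p / seq_len(n))   # smallest Simes-adjusted p-value
}

# Made-up exon-level p-values for two genes
exon_p <- c(0.001, 0.20, 0.50, 0.90, 0.04, 0.05)
gene   <- c("g1", "g1", "g1", "g1", "g2", "g2")

# One p-value per gene, derived from that gene's exon-level tests
gene_p <- tapply(exon_p, gene, simes_p)
print(gene_p)
```

In this example a single strongly significant exon in g1 is enough to give the gene a small p-value, which matches the remark that the hybrid method can be more powerful when only a minority of exons are differentially spliced.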
                      \value{A data.frame with any annotation columns found in \code{fit} plus the following columns