Title General Sample Size and Power Analysis for Microarray and Next-Generation Sequencing Data
Maintainer Maarten van Iterson <mviterson@gmail.com>
Description General sample size and power analysis for microarray and next-generation sequencing data.
Depends R (>= 2.12), methods, qvalue, lattice, limma
Suggests BiocStyle, genefilter, edgeR, DESeq
Collate 'zzz.R' 'numericalintegration.R' 'trimmingbinning.R'
'DistributionClass.R' 'PilotDataClass.R' 'SampleSizeClass.R' 'bitriangular.R' 'deconvolution.R' 'conjugategradient.R' 'Ferreira.R' 'tikhonov.R' 'powerandsamplesize.R'
dbitri
Nutrigenomics
pbitri
qbitri
rbitri
show-methods
simdat
Density function for a bi-triangular random variable.
dbitri(x, a = log2(1.2), b = log2(4), m = log2(2))
a: location of point a. Default a = log2(1.2).
b: location of point b. Default b = log2(4).
m: location of the midpoint of the triangle. Default m = log2(2).
For more details see M. Langaas et al. JRSS B 2005.
Test statistics derived from a deepSAGE experiment
Vector of test statistics obtained by performing a likelihood ratio test using edgeR
’t Hoen, P.A.C., Ariyurek, Y., Thygesen, H.H., Vreugdenhil, E., Vossen, R.H.A.M., de Menezes, R.X., Boer, J.M., van Ommen, G.B. and den Dunnen, J.T., Deep Sequencing-based Expression analysis shows Major Advances in Robustness, Resolution and Inter-lab Portability over Five Microarray Platforms, Nucleic Acids Research, 2008.
Test statistics from a Nutrigenomics gene expression profiling experiment
There are five sets of test statistics, each representing a different compound and exposure time. Test statistics were obtained using an empirical Bayes linear model.
A data frame with 16539 test statistics for five experiments.
The first row indicates the effective sample size of the experiment. Column names refer to the compound and exposure time (see details).
In this experiment the outcome of specific PPAR-alpha activation on murine small intestinal gene expression was examined using Affymetrix GeneChip Mouse 430 2.0 arrays. PPAR-alpha was activated by several PPAR-alpha agonists that differed in activating potency. In this paper the data of three agonists were used, namely Wy14,643, fenofibrate and trilinolenin (C18:3). The first two compounds belong to the fibrate class of drugs that are widely prescribed to treat dyslipidemia, whereas trilinolenin is an agonist frequently found in the human diet. For intestinal PPAR-alpha, Wy14,643 is the most potent agonist, followed by C18:3 and fenofibrate. Since time of exposure also affects the effect size, intestines were collected 6 hrs (all three agonists) or 5 days (Wy14,643 and fenofibrate only) after exposure.
van Iterson, M., ’t Hoen, P.A.C., Pedotti, P., Hooiveld, G.J.E.J., den Dunnen, J.T., van Ommen, G.J.B., Boer, J.M., Menezes, R.X., Relative power and sample size analysis on gene expression profiling data, BMC Genomics, (2009).
Distribution function for a bi-triangular random variable.
pbitri(q, a = log2(1.2), b = log2(4), m = log2(2))
a: location of point a. Default a = log2(1.2).
b: location of point b. Default b = log2(4).
m: location of the midpoint of the triangle. Default m = log2(2).
For more details see M. Langaas et al. JRSS B 2005.
User friendly interface to class "PilotData"
pilotData(statistics = NULL, samplesize = NULL,
distribution = c("norm", "t", "f", "chisq"), ...)
samplesize: total sample size of the pilot data, or the effective sample size in the two-group case (see Details for more information).
distribution: type of the null/alternative distribution, one of ’norm’, ’t’, ’f’ or ’chisq’.
...: additional arguments for the distribution, such as degrees of freedom.
In the two-group case the effective sample size is defined as the square root of the inverse of 1/n1 + 1/n2, i.e. sqrt(1 / (1/n1 + 1/n2)).
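As a small illustration of this definition, the effective sample size can be computed in base R. The helper name effective_sample_size is hypothetical and not part of SSPA; this is only a sketch of the stated formula.

```r
# Hypothetical helper (not part of SSPA): effective sample size for a
# two-group design, sqrt(1 / (1/n1 + 1/n2)).
effective_sample_size <- function(n1, n2) {
  stopifnot(n1 > 0, n2 > 0)
  sqrt(1 / (1 / n1 + 1 / n2))
}

# For a balanced design with J samples per group this reduces to sqrt(J / 2).
J <- 10
n_eff <- effective_sample_size(J, J)
```

For a balanced design the result is the value one would pass as samplesize to pilotData, for example samplesize = sqrt(J/2).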
pd <- pilotData(statistics=rnorm(100), samplesize=10, distribution="norm")
pd
plot(pd)
Methods for Function plot in Package SSPA
Plot function for objects of class PilotData and SampleSize
signature(x = "PilotData") Diagnostic plots of the PilotData.
signature(x = "SampleSize") Plot the estimated density of effect sizes.
Predict power for given vector of sample sizes
predictpower(object, samplesizes, alpha = 0.1,
Quantile function for a bi-triangular random variable.
qbitri(p, a = log2(1.2), b = log2(4), m = log2(2))
a: location of point a. Default a = log2(1.2).
b: location of point b. Default b = log2(4).
m: location of the midpoint of the triangle. Default m = log2(2).
For more details see M. Langaas et al. JRSS B 2005.
Random generation of bitriangular distributed values.
rbitri(n, a = log2(1.2), b = log2(4), m = log2(2))
a: location of point a. Default a = log2(1.2).
b: location of point b. Default b = log2(4).
m: location of the midpoint of the triangle. Default m = log2(2).
For more details see M. Langaas et al. JRSS B 2005.
hist(rbitri(100), freq=FALSE)
curve(dbitri, add=TRUE)
User friendly interface to class "SampleSize"
method = c("deconv", "congrad", "tikhonov", "ferreira"),
control = list(from = -6, to = 6, resolution = 2^9))
method: estimation method, one of ’deconv’, ’congrad’, ’tikhonov’ or ’ferreira’. See ’Details’.
control: a list of control parameters. See ’Details’.
The default method is ’deconv’, which is a kernel deconvolution density estimator. The ’congrad’ method (’nncg’) is a nonnegative conjugate gradient algorithm based on R’s implementation. The ’tikhonov’ method implements ridge regression with optimal penalty selection using the L-curve approach. Higher-order penalties are possible as well, using a transformation to standard form (see Hansen).
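To make the ridge-regression idea concrete, here is a minimal base R sketch of zero-order Tikhonov regularization for a linear system A x ≈ b. It is not the package's implementation: the function name ridge_solve, the toy matrix and the lambda values are assumptions chosen for illustration, and the L-curve penalty selection is not shown.

```r
# Hypothetical helper (not part of SSPA): zero-order Tikhonov / ridge solution.
ridge_solve <- function(A, b, lambda) {
  stopifnot(lambda >= 0, nrow(A) == length(b))
  p <- ncol(A)
  # Penalized normal equations: (A'A + lambda * I) x = A'b
  drop(solve(crossprod(A) + lambda * diag(p), crossprod(A, b)))
}

# Toy problem with a known coefficient vector and a little noise.
set.seed(1)
A <- matrix(rnorm(50 * 5), nrow = 50)
b <- drop(A %*% c(1, -2, 0.5, 0, 3)) + rnorm(50, sd = 0.1)

x_ls    <- ridge_solve(A, b, lambda = 0)   # ordinary least squares
x_ridge <- ridge_solve(A, b, lambda = 10)  # shrunken solution
```

With lambda = 0 the sketch reduces to ordinary least squares; increasing lambda shrinks the solution norm, which is the trade-off that a criterion such as the L-curve is meant to balance.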
The ’control’ argument is a list that can supply any of the following components. Logical checks are performed per method.
– pi0Method: the pi0 estimation method, one of ’Langaas’, ’Storey’, ’Ferreira’, ’Userde-
– pi0: if method = ’ferreira’, a grid of pi0 values needs to be supplied, e.g. seq(0.1, 0.99, 0.01)
– adjust: Default TRUE, adjust the pi0 estimate if the density of effect sizes is somewhere negative.
– a: Adjust pi0 better approach suggested by Efron. Symmetric range around zero of size
– bandwith: Default NULL uses 1/sqrt(log(length(statistics)))
– kernel: Either ’fan’, ’wand’ or ’sinc’ kernels can be used.
– from: Density of effect sizes should be estimated from = -6
– resolution: Density of effect sizes should be estimated on 2^9 points.
– verbose: Default FALSE; if TRUE, additional information is printed to the console.
– integration: ’midpoint’, ’trapezoidal’, ’simpson’
– scale: ’pdfstat’, ’cdfstat’, ’cdfpval’
– verbose: Default FALSE; if TRUE, additional information is printed to the console.
– integration: ’midpoint’, ’trapezoidal’, ’simpson’
– scale: ’pdfstat’, ’cdfstat’, ’cdfpval’
– method: ’lcurve’, ’gcv’, ’aic’
– log: TRUE
– penalty: 0
– lambda: 10^seq(-10, 10, length=100)
– verbose: Default FALSE; if TRUE, additional information is printed to the console.
van Iterson, M., P. ’t Hoen, P. Pedotti, G. Hooiveld, J. den Dunnen, G. van Ommen, J. Boer, and R. de Menezes (2009): ’Relative power and sample size analysis on gene expression profiling data,’ BMC Genomics, 10, 439–449.
Ferreira, J. and A. Zwinderman (2006a): ’Approximate Power and Sample Size Calculations with the Benjamini-Hochberg Method,’ The International Journal of Biostatistics, 2, 1.
Ferreira, J. and A. Zwinderman (2006b): ’Approximate Sample Size Calculations with Microarray Data: An Illustration,’ Statistical Applications in Genetics and Molecular Biology, 5, 1.
Hansen, P. (2010): Discrete Inverse Problems: Insight and Algorithms, SIAM: Fundamentals of Algorithms series.
Langaas, M., B. Lindqvist, and E. Ferkingstad (2005): ’Estimating the proportion of true null hypotheses, with application to DNA microarray data,’ Journal of the Royal Statistical Society Series B, 67, 555–572.
Storey, J. (2003): ’The positive false discovery rate: A Bayesian interpretation and the q-value,’ Annals of Statistics, 31, 2013–2035.
m <- 5000 ##number of genes
J <- 10 ##sample size per group
pi0 <- 0.8 ##proportion of non-differentially expressed genes
m0 <- as.integer(m*pi0)
mu <- rbitri(m - m0, a = log2(1.2), b = log2(4), m = log2(2)) #effect size distribution
data <- simdat(mu, m=m, pi0=pi0, J=J, noise=NULL)
library(genefilter)
stat <- rowttests(data, factor(rep(c(0, 1), each=J)), tstatOnly=TRUE)$statistic
pd <- pilotData(statistics=stat, samplesize=sqrt(J/2), distribution="norm")
ss <- sampleSize(pd, method="deconv")
plot(ss)
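The example above obtains its test statistics from rowttests in genefilter. For readers who want to see what such a statistic involves, the following base R sketch computes a pooled-variance two-sample t statistic per row. The function name row_t and the toy data are hypothetical, and the sign convention (first factor level minus second) is an assumption that may differ from other implementations.

```r
# Hypothetical helper (not part of SSPA or genefilter): row-wise
# equal-variance two-sample t statistics for a features x samples matrix.
row_t <- function(x, group) {
  group <- factor(group)
  stopifnot(nlevels(group) == 2, length(group) == ncol(x))
  g1 <- x[, group == levels(group)[1], drop = FALSE]
  g2 <- x[, group == levels(group)[2], drop = FALSE]
  n1 <- ncol(g1)
  n2 <- ncol(g2)
  m1 <- rowMeans(g1)
  m2 <- rowMeans(g2)
  # Pooled standard deviation per row
  ss1 <- rowSums((g1 - m1)^2)
  ss2 <- rowSums((g2 - m2)^2)
  sp <- sqrt((ss1 + ss2) / (n1 + n2 - 2))
  (m1 - m2) / (sp * sqrt(1 / n1 + 1 / n2))
}

# Toy data: 20 features, 3 samples per group.
set.seed(42)
x <- matrix(rnorm(20 * 6), nrow = 20)
grp <- factor(rep(c(0, 1), each = 3))
tstat <- row_t(x, grp)
```

The resulting vector has one statistic per feature and could be passed as the statistics argument of pilotData in the same way as stat in the example.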
General show method for Classes PilotData and SampleSize
Methods for function show in package SSPA
signature(object = "PilotData") Show the content of a PilotData object in a user-friendly way.
signature(object = "SampleSize") Show the content of a SampleSize object in a user-friendly way.
Generate simulated microarray data using the bitriangular distribution.
simdat(mu, m, pi0, J, nullX = function(x) rnorm(x, 0, 1),
nullY = function(x) rnorm(x, 0, 1), noise = 0.01)
mu: vector of effect sizes drawn from the bitriangular distribution.
m: number of features (genes, tags, ...).
pi0: proportion of nondifferentially expressed features.
nullX: the distribution of nondifferentially expressed features.
nullY: the distribution of nondifferentially expressed features.
noise: standard deviation of the additive noise.
Matrix of size m x (2J), containing the simulated values.
##generate two-group microarray data
m <- 5000 ##number of genes
J <- 10 ##sample size per group
pi0 <- 0.8 ##proportion of non-differentially expressed genes
m0 <- as.integer(m*pi0)
mu <- rbitri(m - m0, a = log2(1.2), b = log2(4), m = log2(2)) #effect size distribution
data <- simdat(mu, m=m, pi0=pi0, J=J, noise=0.01)