Updated according to second review

Divyagash authored on 09/02/2018 22:42:04
Showing 10 changed files

... ...
@@ -1,7 +1,7 @@
 Package: scmeth
 Type: Package
 Title: Functions to conduct quality control analysis in methylation data
-Version: 0.99.17
+Version: 0.99.18
 Author: Divy Kangeyan <[email protected]>
 Maintainer: Divy Kangeyan <[email protected]>
 Depends: R (>= 3.5.0)
... ...
@@ -22,6 +22,7 @@ coverage <- function(bs,subSample=1e6,offset=50000) {
 
     if (nCpGs<(subSample+offset)){
         bs <- bs
+        subSample <- nCpGs
     }else{
         bs <- bs[offset:(subSample+offset)]
     }
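Read in isolation, the guard touched by this hunk behaves roughly like the sketch below. It is only an illustration under stated assumptions: `clampSubSample` is a hypothetical helper that does not exist in scmeth, a plain vector stands in for the bsseq object, and `nCpGs` is assumed to be the number of CpG positions in that object (its definition lies outside the lines shown in the diff).

```r
# Hypothetical helper mirroring the branch in the hunk above; not part of scmeth.
# `x` is a plain vector standing in for the CpG rows of a bsseq object.
clampSubSample <- function(x, subSample = 1e6, offset = 50000) {
    nCpGs <- length(x)  # assumption: nCpGs is the number of CpG positions
    if (nCpGs < (subSample + offset)) {
        # Too few CpGs for the requested window: keep everything and
        # record the number of CpGs actually used (the line added here).
        subSample <- nCpGs
    } else {
        # Enough CpGs: take the window that starts at index `offset`.
        x <- x[offset:(subSample + offset)]
    }
    list(x = x, subSample = subSample)
}

small <- clampSubSample(seq_len(100), subSample = 1000, offset = 50)
large <- clampSubSample(seq_len(5000), subSample = 1000, offset = 50)
```

With the added assignment, any later code that uses `subSample` sees the real number of CpGs for small inputs rather than the requested default. Note that `offset:(subSample+offset)` selects `subSample + 1` elements; the sketch reproduces that behaviour unchanged.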
... ...
@@ -2,7 +2,7 @@
 #'of CpGs observed in certain base pair long region.
 #'@param bs bsseq object
 #'@param organism scientific name of the organism of interest,
-#'e.g. Mus musculus or Homo sapiens
+#'e.g. Mmusculus or Hsapiens
 #'@param windowLength Length of the window to calculate the density
 #'Default value for window length is 1000 basepairs.
 #'@return Data frame with sample name and coverage in repeat masker regions
... ...
@@ -34,6 +34,7 @@ cpgDiscretization <- function(bs,subSample=1e6,offset=50000,coverageVec=NULL){
 
     if (nCpGs<(subSample+offset)){
         bs <- bs
+        subSample <- nCpGs
     }else{
         bs <- bs[offset:(subSample+offset)]
     }
... ...
@@ -30,6 +30,7 @@ downsample <- function(bs,dsRates = c(0.01,0.02,0.05, seq(0.1,0.9,0.1)),subSampl
 
     if (nCpGs<(subSample+offset)){
         bs <- bs
+        subSample <- nCpGs
     }else{
         bs <- bs[offset:(subSample+offset)]
     }
... ...
@@ -1,7 +1,7 @@
 #'Provides Coverage metrics in the repeat masker region
 #'@param bs bsseq object
 #'@param organism scientific name of the organism of interest,
-#'e.g. Mus musculus or Homo sapiens
+#'e.g. Mmusculus or Hsapiens
 #'@param genome reference alignment, i.e. mm10 or hg38
 #'@return Data frame with sample name and coverage in repeat masker regions
 #'@examples
... ...
@@ -11,7 +11,7 @@ cpgDensity(bs, organism, windowLength = 1000)
 \item{bs}{bsseq object}
 
 \item{organism}{scientific name of the organism of interest,
-e.g. Mus musculus or Homo sapiens}
+e.g. Mmusculus or Hsapiens}
 
 \item{windowLength}{Length of the window to calculate the density
 Default value for window length is 1000 basepairs.}
... ...
@@ -10,7 +10,7 @@ repMask(bs, organism, genome)
 \item{bs}{bsseq object}
 
 \item{organism}{scientific name of the organism of interest,
-e.g. Mus musculus or Homo sapiens}
+e.g. Mmusculus or Hsapiens}
 
 \item{genome}{reference alignment, i.e. mm10 or hg38}
 }
new file mode 100644
... ...
@@ -0,0 +1,20 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/scmeth.R
+\docType{package}
+\name{scmeth}
+\alias{scmeth}
+\alias{scmeth-package}
+\title{scmeth: a package to conduct quality control analysis for methylation data.
+Most functions can be applied to both bulk and single-cell methylation
+while other functions are specific to single-cell methylation data.
+scmeth is especially customized to use the output from the FireCloud
+implementation of methylation pipeline to produce comprehensive
+quality control report}
+\description{
+scmeth: a package to conduct quality control analysis for methylation data.
+Most functions can be applied to both bulk and single-cell methylation
+while other functions are specific to single-cell methylation data.
+scmeth is especially customized to use the output from the FireCloud
+implementation of methylation pipeline to produce comprehensive
+quality control report
+}
... ...
@@ -31,23 +31,22 @@ Contents
 Though a small chemical change in the genome, DNA methylation has significant
 impact in several diseases, developmental processes and other biological 
 changes. Hence methylation data should be analyzed carefully to gain 
-biological insights. **scmeth** package offers a few functions to asseess
+biological insights. **scmeth** package offers a few functions to assess
 the quality of the methylation data. 
 </p>
 
 <p style="text-align: justify;">
 This bioconductor package contains functions to perform quality control and 
 preprocessing analysis for single-cell methylation data. *scmeth* is 
-especially customized to use the output from the fireCloud implementation of 
-methylation pipeline. For now only human and mouse genomes are supported in 
-this package but in the future we will expand to other organisms. In addition 
-to individual functions, **report** function in the package provides all 
-inclusive report using most of the functions. If users prefer
-they can just use the **report** function to gain summary of their data.
+especially customized to use the output from the FireCloud implementation of 
+methylation pipeline. In addition to individual functions, **report** function
+in the package provides all inclusive report using most of the functions. If
+users prefer they can just use the **report** function to gain summary of 
+their data.
 </p>
 
-2. Installation
+2. Installation and package loading
+------------------------------------
 **scmeth** is available in bioconductor and can be downloaded using the 
 following commands
 ```{r, eval=FALSE}
... ...
@@ -55,14 +54,19 @@ source("http://bioconductor.org/biocLite.R")
 biocLite("scmeth")
 ```
 
+Load the package
+```{r, warning=FALSE, message=FALSE}
+library(scmeth)
+```
+
 3. Input files
 ---------------------
 <p style="text-align: justify;">
 
 
-Main input for most of the function is a *bsseq* object. In the fireCloud 
+Main input for most of the function is a *bsseq* object. In the FireCloud 
 implementation it is stored as hdf5 file which can be read via 
-*loadHDF5SummarizedExperiment* function in *SummarizedExperiment* package.
+*loadHDF5SummarizedExperiment* function in *HDF5Array* package.
 Code chunk below shows how it can be loaded.
 </p>
 
... ...
@@ -99,13 +103,12 @@ when subsetting from the beginning of the data.
 </p>
 
 ```{r, eval=FALSE}
-library(scmeth)
-scmeth::report(bsObject, '~/Documents',Hsapiens,"hg38")
+report(bsObject, '~/Documents',Hsapiens,"hg38")
 ```
 
 
 <p style="text-align: justify;">
-Command above generayed an html report named *qcReport.html*. It will be stored
+Command above generated an html report named *qcReport.html*. It will be stored
 in the indicated directory. 
 </p>
 
... ...
@@ -129,8 +132,6 @@ sample. **coverage** function can be used to get this information.
 </p>
 Loading the data
 ```{r,  warning=FALSE, message=FALSE, comment=FALSE}
-#
-library(scmeth)
 directory <- system.file("extdata","bismark_data",package='scmeth')
 bsObject <- HDF5Array::loadHDF5SummarizedExperiment(directory)
 ```
... ...
@@ -145,7 +146,7 @@ succeeded. **readmetrics** function outputs a visualization showing number
 of reads seen in each samples and of those reads what proportion of 
 them were mapped to the reference genome. 
 ```{r,fig.width=6,fig.height=3}
-scmeth::readmetrics(bsObject)
+readmetrics(bsObject)
 ```
 
 ### repmask
... ...
@@ -163,7 +164,7 @@ the genome build information.
 ```{r, warning=FALSE,message=FALSE}
 library(BSgenome.Mmusculus.UCSC.mm10)
 load(system.file("extdata",'bsObject.rda',package='scmeth'))
-scmeth::repMask(bs,Mmusculus,"mm10")
+repMask(bs,Mmusculus,"mm10")
 ```
 
 ### Coverage by Chromosome
... ...
@@ -177,7 +178,7 @@ only the CpGs covered in chromosome 1 is shown.)
 </p>
 
 ```{r, warning=FALSE}
-scmeth::chromosomeCoverage(bsObject)
+chromosomeCoverage(bsObject)
 ```
 
 ### featureCoverage
... ...
@@ -195,7 +196,7 @@ that region.
 ```{r, warning=FALSE,message=FALSE}
 library(annotatr)
 featureList <- c('genes_exons','genes_introns')
-DT::datatable(scmeth::featureCoverage(bsObject,features=featureList,"hg38"))
+DT::datatable(featureCoverage(bsObject,features=featureList,"hg38"))
 ```
 </p>
 
... ...
@@ -217,7 +218,7 @@ obtained uniformly across the regions.
 
 ```{r,warning=FALSE,message=FALSE}
 library(BSgenome.Hsapiens.NCBI.GRCh38)
-DT::datatable(scmeth::cpgDensity(bsObject,Hsapiens,windowLength=1000))
+DT::datatable(cpgDensity(bsObject,Hsapiens,windowLength=1000))
 ```
 
 ### downsample
... ...
@@ -236,7 +237,7 @@ report renders this information into a plot. Downsampling rate ranges from
 </p>
 
 ```{r,warning=FALSE}
-DT::datatable(scmeth::downsample(bsObject))
+DT::datatable(downsample(bsObject))
 ```
 
 
... ...
@@ -248,13 +249,13 @@ However there could be fluctuations in the beginning or the end of the read due
 to the quality of the bases. Single cell sequencing samples also can show 
 jagged trend in the methylation bias plot due to low read count. Methylation
 bias can be assessed via **mbiasPlot** function. This function takes the mbias
-file generated from firecloud pipeline and generates the methylation bias plot.
+file generated from FireCloud pipeline and generates the methylation bias plot.
 </p>
 
 ```{r,warning=FALSE,message=FALSE,fig.width=6,fig.height=6}
 
 methylationBiasFile <- '2017-04-21_HG23KBCXY_2_AGGCAGAA_TATCTC_pe.M-bias.txt'
-scmeth::mbiasplot(mbiasFiles=system.file("extdata",methylationBiasFile,
+mbiasplot(mbiasFiles=system.file("extdata",methylationBiasFile,
                                           package='scmeth'))
 ```
 </p>
... ...
@@ -272,7 +273,7 @@ are large number of intermediate methylation this indicates there might be
 some error in sequencing. 
 </p>
 ```{r,warning=FALSE,message=FALSE,fig.width=6,fig.height=3}
-scmeth::methylationDist(bsObject)
+methylationDist(bsObject)
 ```
 
 
... ...
@@ -288,7 +289,7 @@ rate below 95% indicates some problem with sample preparation.
 sample.
 </p>
 ```{r,warning=FALSE,message=FALSE,fig.width=4,fig.height=6}
-scmeth::bsConversionPlot(bsObject)
+bsConversionPlot(bsObject)
 ```
 
 ```{r,warning=FALSE,message=FALSE}