Bioconductor Code: SplicingGraphs

- compare() was renamed pcompare() in S4Vectors --> change code accordingly
- use UTF-8 encoding in DESCRIPTION and Rd files so I can have my accents back

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@119306 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 12/07/2016 09:05:23
Showing 1 changed file

man/countReads-methods.Rd @ 2ef41b9

@@ -147,7 +147,7 @@ reportReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
 }
 
 \author{
-  H. Pages
+  H. Pagès
 }
 
 \seealso{

fix some broken links

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@98050 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 06/01/2015 07:13:31
Showing 1 changed file

man/countReads-methods.Rd @ 95b4fa0

@@ -113,7 +113,7 @@ reportReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                      }
                                      \value{
                                     -  A \link[IRanges]{DataFrame} object with one row per:
                                     +  A \link[S4Vectors]{DataFrame} object with one row per:
                                        \itemize{
                                          \item unique splicing graph edge, if \code{by="sgedge"};
                                          \item unique \emph{reduced} splicing graph edge, if \code{by="rsgedge"};

arghh, forgot to escape % when using %in% in \examples section

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@76172 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 01/05/2013 17:55:45
Showing 1 changed file

man/countReads-methods.Rd @ a97aaba

@@ -211,7 +211,7 @@ ambiguous_reads
                                      ## Reads that are ambiguous at the "rsgedge" level must also be
                                      ## ambiguous at the "sgedge" level:
                                     -stopifnot(all(ambiguous_reads$rsgedge %in% ambiguous_reads$sgedge))
                                     +stopifnot(all(ambiguous_reads$rsgedge \%in\% ambiguous_reads$sgedge))
                                      ## However, there is no reason why reads that are ambiguous at the
                                      ## "tx" level should also be ambiguous at the "sgedge" or "rsgedge"

Put \pkg{} around package names in the man pages.

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@76137 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 30/04/2013 01:20:58
Showing 1 changed file

man/countReads-methods.Rd @ 7bc3409

@@ -151,7 +151,7 @@ reportReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
 }
 
 \seealso{
-  This man page is part of the SplicingGraphs package.
+  This man page is part of the \pkg{SplicingGraphs} package.
   Please see \code{?`\link{SplicingGraphs-package}`} for an overview of the
   package and for an index of its man pages.
 }

\tabular produces an ugly table. Using \preformatted instead.

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@76133 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 29/04/2013 22:17:15
Showing 1 changed file

man/countReads-methods.Rd @ 88f0c8a

@@ -73,11 +73,12 @@ reportReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                          resolution is defined by combining the relationships between consecutive
                                          levels. All possible parent-child relationships are summarized in the
                                          following table:
                                     -    \tabular{rlll}{
                                     -                    \tab to: sgedge   \tab to: rsgedge  \tab to: tx      \cr
                                     -      from: rsgedge \tab one-to-many  \tab              \tab             \cr
                                     -      from: tx      \tab many-to-many \tab many-to-many \tab             \cr
                                     -      from: gene    \tab one-to-many  \tab one-to-many  \tab one-to-many \cr
                                     +    \preformatted{
                                     +                    | to: sgedge   | to: rsgedge  | to: tx
                                     +      --------------+--------------+--------------+------------
                                     +      from: rsgedge | one-to-many  |              |
                                     +      from: tx      | many-to-many | many-to-many |
                                     +      from: gene    | one-to-many  | one-to-many  | one-to-many
                                          }
                                        }
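
The parent-child table in the hunk above can also be held as a small lookup matrix. The following is a minimal sketch in base R only; the names `relationship` and `is_one_to_many()` are hypothetical and are not part of the SplicingGraphs package.

```r
# Rows are parent ("from") levels, columns are child ("to") levels.
# Cells left as NA correspond to the empty cells of the table.
relationship <- matrix(
    NA_character_, nrow = 3, ncol = 3,
    dimnames = list(from = c("rsgedge", "tx", "gene"),
                    to   = c("sgedge", "rsgedge", "tx"))
)
relationship["rsgedge", "sgedge"] <- "one-to-many"
relationship["tx", c("sgedge", "rsgedge")] <- "many-to-many"
relationship["gene", c("sgedge", "rsgedge", "tx")] <- "one-to-many"

# Hypothetical helper: TRUE only when the table says "one-to-many".
is_one_to_many <- function(from, to) {
    identical(relationship[from, to], "one-to-many")
}

print(relationship)
```

For instance, `is_one_to_many("tx", "sgedge")` is FALSE here, which matches the man page's remark that switching between `by="tx"` and `by="sgedge"` can either add or remove ambiguities.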

- Rename getReads() -> reportReads().
- Complete man page for countReads()/reportReads().

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@76132 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 29/04/2013 22:07:38
Showing 1 changed file

man/countReads-methods.Rd @ 189b61d

@@ -11,18 +11,18 @@
                                      }
                                      \description{
                                     -  \code{getReads} returns the reads assigned to a SplicingGraphs object,
                                     -  summarized either by splicing graph edge, \emph{reduced} splicing graph
                                     -  edge, transcript, or gene.
                                     -
                                        \code{countReads} counts the reads assigned to a SplicingGraphs object.
                                        The counting can be done by splicing graph edge, \emph{reduced} splicing
                                        graph edge, transcript, or gene.
                                     +
                                     +  \code{reportReads} is similar to \code{countReads} but returns right before
                                     +  the final counting step, that is, the returned DataFrame contains the reads
                                     +  instead of their counts.
                                      }
                                      \usage{
                                     -getReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                      countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                     +reportReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                      }
                                      \arguments{
@@ -30,14 +30,85 @@ countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                          A \link{SplicingGraphs} object.
                                        }
                                        \item{by}{
                                     -    Summarize/count by splicing graph edge (\code{by="sgedge"}), by
                                     -    \emph{reduced} splicing graph edge (\code{by="rsgedge"}), by transcript
                                     -    (\code{by="tx"}), or by gene (\code{by="gene"}).
                                     +    Can be \code{"sgedge"}, \code{"rsgedge"}, \code{"tx"}, or \code{"gene"}.
                                     +    Specifies the \emph{level of resolution} that summarization should be
                                     +    performed at. See Details section below.
                                        }
                                      }
                                      \details{
                                     -  TODO
                                     +  \subsection{Levels of resolution}{
                                     +    \code{countReads} and \code{reportReads} allow summarization of the reads
                                     +    at different levels of resolution. The level of resolution is determined
                                     +    by the type of feature that one chooses via the \code{by} argument.
                                     +    The supported resolutions are (from highest to lowest resolution):
                                     +    \enumerate{
                                     +      \item \code{by="sgedge"} for summarization at the splicing graph edge
                                     +            level (i.e. at the exons/intron level);
                                     +      \item \code{by="rsgedge"} for summarization at the \emph{reduced}
                                     +            splicing graph edge level;
                                     +      \item \code{by="tx"} for summarization at the transcript level;
                                     +      \item \code{by="gene"} for summarization at the gene level.
                                     +    }
                                     +  }
                                     +
                                     +  \subsection{Relationship between levels of resolution}{
                                     +    There is a parent-child relationship between the features
                                     +    corresponding to a given level of resolution (the parent features)
                                     +    and those corresponding to a higher level of resolution (the child
                                     +    features).
                                     +
                                     +    For example, in the case of the 2 first levels of resolution listed
                                     +    above, the parent-child relationship is the following: the parent
                                     +    features are the \emph{reduced} splicing graph edges, the child features
                                     +    are the splicing graph edges, and each parent feature is obtained by
                                     +    merging one or more child features together.
                                     +    Similarly, transcripts can be seen as parent features of \emph{reduced}
                                     +    splicing graph edges, and genes as parent features of transcripts.
                                     +    Note that, the rsgedge/sgedge and gene/tx relationships are one-to-many,
                                     +    but the tx/rsgedge relationship is many-to-many because a given edge can
                                     +    belong to more than one transcript.
                                     +
                                     +    Finally the parent-child relationships between 2 arbitrary levels of
                                     +    resolution is defined by combining the relationships between consecutive
                                     +    levels. All possible parent-child relationships are summarized in the
                                     +    following table:
                                     +    \tabular{rlll}{
                                     +                    \tab to: sgedge   \tab to: rsgedge  \tab to: tx      \cr
                                     +      from: rsgedge \tab one-to-many  \tab              \tab             \cr
                                     +      from: tx      \tab many-to-many \tab many-to-many \tab             \cr
                                     +      from: gene    \tab one-to-many  \tab one-to-many  \tab one-to-many \cr
                                     +    }
                                     +  }
                                     +
                                     +  \subsection{Multiple hits and ambiguous reads}{
                                     +    An important distinction needs to be made between a read that hits a
                                     +    given feature multiple times and a read that hits more than one feature.
                                     +
                                     +    If the former, the read is counted/reported only once for that feature.
                                     +    For example, when summarizing at the transcript level, a read is
                                     +    counted/reported only once for a given transcript, even if that read
                                     +    hits more than one splicing graph edge (or \emph{reduced} splicing graph
                                     +    edge) associated with that transcript.
                                     +
                                     +    If the latter, the read is said to be \emph{ambiguous}. An ambiguous read
                                     +    is currently counted/reported for each feature where it has a hit.
                                     +    This is a temporary situation: in the near future the user will be offered
                                     +    options to handle ambiguous reads in different ways.
                                     +  }
                                     +
                                     +  \subsection{Ambiguous reads and levels of resolution}{
                                     +    A read might be ambiguous at one level of resolution but not at the other.
                                     +    Also the number of ambiguous reads is typically affected by the level
                                     +    of resolution. However, even though higher resolution generally means
                                     +    more ambiguous reads, this is only true when the switch from one level
                                     +    of resolution to the other implies a parent-child relationship between
                                     +    features that is one-to-many.
                                     +    So, based on the above table, this is always true, except when
                                     +    switching from using \code{by="tx"} to using \code{by="sgedge"} or
                                     +    \code{by="rsgedge"}. In those cases, the switch can produce more
                                     +    ambiguities but it can also produce less.
                                     +  }
                                      }
                                      \value{
@@ -49,9 +120,9 @@ countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                          \item gene if \code{by="gene"}.
                                        }
                                     -  And with one column per sample (containing the reads for that sample for
                                     -  \code{getReads}, and the counts for that sample for \code{countReads}),
                                     -  plus the following two additional leading columns:
                                     +  And with one column per sample (containing the counts for that sample for
                                     +  \code{countReads}, and the reads for that sample for \code{reportReads}),
                                     +  plus the two following left columns:
                                        \itemize{
                                          \item if \code{by="sgedge"}: \code{"sgedge_id"}, containing the
                                                \emph{global splicing graph edge ids}, and \code{"ex_or_in"},
@@ -63,6 +134,15 @@ countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                          \item if \code{by="tx"}: \code{"tx_id"} and \code{"gene_id"};
                                          \item if \code{by="gene"}: \code{"gene_id"} and \code{"tx_id"}.
                                        }
                                     +
                                     +  For \code{countReads}, each column of counts is of type integer and
                                     +  is named after the corresponding sample.
                                     +  For \code{reportReads}, each column of reads is a CharacterList object
                                     +  and its name is the name of the corresponding sample with the
                                     +  \code{".hits"} suffix added to it.
                                     +  In both cases, the name of the sample is the name that was passed to
                                     +  \code{assignReads} when the reads of a given sample were initially
                                     +  assigned. See \code{?\link{assignReads}} for more information.
                                      }
                                      \author{
@@ -85,29 +165,59 @@ example(assignReads)
                                      ## ---------------------------------------------------------------------
                                      ## 2. Summarize the reads by splicing graph edge
                                      ## ---------------------------------------------------------------------
                                     -getReads(sg)
                                      countReads(sg)
                                     +reportReads(sg)
                                      ## ---------------------------------------------------------------------
                                      ## 3. Summarize the reads by reduced splicing graph edge
                                      ## ---------------------------------------------------------------------
                                     -getReads(sg, by="rsgedge")
                                      countReads(sg, by="rsgedge")
                                     +reportReads(sg, by="rsgedge")
                                      ## ---------------------------------------------------------------------
                                      ## 4. Summarize the reads by transcript
                                      ## ---------------------------------------------------------------------
                                     -getReads(sg, by="tx")
                                      countReads(sg, by="tx")
                                     +reportReads(sg, by="tx")
                                      ## ---------------------------------------------------------------------
                                     -## 4. Summarize the reads by gene
                                     +## 5. Summarize the reads by gene
                                      ## ---------------------------------------------------------------------
                                     -getReads(sg, by="gene")
                                      countReads(sg, by="gene")
                                     +reportReads(sg, by="gene")
                                     +
                                     +## ---------------------------------------------------------------------
                                     +## 6. A close look at ambiguous reads
                                     +## ---------------------------------------------------------------------
                                     +resolutions <- c("sgedge", "rsgedge", "tx", "gene")
                                     +
                                     +reported_reads <- lapply(resolutions,
                                     +    function(by) {
                                     +        reported_reads <- reportReads(sg, by=by)
                                     +        unlist(reported_reads$TOYREADS.hits)
                                     +    })
                                     +
                                     +## The set of reported reads is the same at all levels of resolution:
                                     +unique_reported_reads <- lapply(reported_reads, unique)
                                     +stopifnot(identical(unique_reported_reads,
                                     +                    rep(unique_reported_reads[1], 4)))
                                     +
                                     +## Extract ambigous reads for each level of resolution:
                                     +ambiguous_reads <- lapply(reported_reads,
                                     +                          function(x) unique(x[duplicated(x)]))
                                     +names(ambiguous_reads) <- resolutions
                                     +ambiguous_reads
                                     +
                                     +## Reads that are ambiguous at the "rsgedge" level must also be
                                     +## ambiguous at the "sgedge" level:
                                     +stopifnot(all(ambiguous_reads$rsgedge %in% ambiguous_reads$sgedge))
                                     +
                                     +## However, there is no reason why reads that are ambiguous at the
                                     +## "tx" level should also be ambiguous at the "sgedge" or "rsgedge"
                                     +## level!
                                      ## ---------------------------------------------------------------------
                                     -## 5. Remove the reads from 'sg'.
                                     +## 7. Remove the reads from 'sg'.
                                      ## ---------------------------------------------------------------------
                                      sg <- removeReads(sg)
                                      countReads(sg)
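
The ambiguity check added in example 6 of the diff above relies on a base R idiom: once the per-feature hits are flattened with `unlist()`, any read name that occurs more than once was reported for more than one feature. The following is a minimal sketch on made-up data; the `hits` list and the read names are hypothetical and only stand in for a `.hits` column as returned by `reportReads()`.

```r
# Hypothetical hits: one element per feature, each holding the names
# of the reads reported for that feature (each read at most once per
# feature, as described in the man page).
hits <- list(
    edge1 = c("read1", "read2"),
    edge2 = c("read2", "read3"),
    edge3 = c("read3", "read2")
)

# Flatten, then keep the names that show up under more than one feature.
flat <- unlist(hits, use.names = FALSE)
ambiguous <- unique(flat[duplicated(flat)])

# Counting is then just the number of reads per feature, so an
# ambiguous read contributes to every feature it hits.
counts <- lengths(hits)

print(ambiguous)
print(counts)
```

Because an ambiguous read is counted for each feature where it has a hit, the per-feature counts can add up to more than the number of distinct reads (6 versus 3 in this toy data).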

Add getReads(). Similar to countReads() but returns right before the final counting step, that is, the returned DataFrame contains the reads instead of their counts.

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@76127 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 29/04/2013 18:02:30
Showing 1 changed file

man/countReads-methods.Rd @ 05410e1

@@ -7,15 +7,21 @@
                                      \title{
                                     -  Summarize the reads assigned to the edges of a SplicingGraphs object
                                     +  Summarize the reads assigned to a SplicingGraphs object
                                      }
                                      \description{
                                     -  \code{countReads} returns a summarized count of the reads assigned to
                                     -  the edges of a SplicingGraphs object.
                                     +  \code{getReads} returns the reads assigned to a SplicingGraphs object,
                                     +  summarized either by splicing graph edge, \emph{reduced} splicing graph
                                     +  edge, transcript, or gene.
                                     +
                                     +  \code{countReads} counts the reads assigned to a SplicingGraphs object.
                                     +  The counting can be done by splicing graph edge, \emph{reduced} splicing
                                     +  graph edge, transcript, or gene.
                                      }
                                      \usage{
                                     +getReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                      countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                      }
@@ -24,9 +30,9 @@ countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                          A \link{SplicingGraphs} object.
                                        }
                                        \item{by}{
                                     -    Summarize by splicing graph edge (\code{by="sgedge"}), by \emph{reduced}
                                     -    splicing graph edge (\code{by="rsgedge"}), by transcript (\code{by="tx"}),
                                     -    or by gene (\code{by="gene"}).
                                     +    Summarize/count by splicing graph edge (\code{by="sgedge"}), by
                                     +    \emph{reduced} splicing graph edge (\code{by="rsgedge"}), by transcript
                                     +    (\code{by="tx"}), or by gene (\code{by="gene"}).
                                        }
                                      }
@@ -43,8 +49,9 @@ countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                          \item gene if \code{by="gene"}.
                                        }
                                     -  And with one column per sample (containing the counts for that sample),
                                     -  plus the two following additional leading columns:
                                     +  And with one column per sample (containing the reads for that sample for
                                     +  \code{getReads}, and the counts for that sample for \code{countReads}),
                                     +  plus the following two additional leading columns:
                                        \itemize{
                                          \item if \code{by="sgedge"}: \code{"sgedge_id"}, containing the
                                                \emph{global splicing graph edge ids}, and \code{"ex_or_in"},
@@ -76,16 +83,32 @@ countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                      example(assignReads)
                                      ## ---------------------------------------------------------------------
                                     -## 2. Summarize the reads assigned to 'sg'
                                     +## 2. Summarize the reads by splicing graph edge
                                     +## ---------------------------------------------------------------------
                                     +getReads(sg)
                                     +countReads(sg)
                                     +
                                     +## ---------------------------------------------------------------------
                                     +## 3. Summarize the reads by reduced splicing graph edge
                                     +## ---------------------------------------------------------------------
                                     +getReads(sg, by="rsgedge")
                                     +countReads(sg, by="rsgedge")
                                     +
                                     +## ---------------------------------------------------------------------
                                     +## 4. Summarize the reads by transcript
                                     +## ---------------------------------------------------------------------
                                     +getReads(sg, by="tx")
                                     +countReads(sg, by="tx")
                                     +
                                     +## ---------------------------------------------------------------------
                                     +## 4. Summarize the reads by gene
                                      ## ---------------------------------------------------------------------
                                     -countReads(sg)  # nb of reads per splicing graph edge
                                     -countReads(sg, by="rsgedge")  # ... per reduced splicing graph edge
                                     -countReads(sg, by="tx")  # ... per transcript
                                     -countReads(sg, by="gene")  # ... per gene
                                     +getReads(sg, by="gene")
                                     +countReads(sg, by="gene")
                                      ## ---------------------------------------------------------------------
                                     -## 3. Remove the reads from 'sg'.
                                     +## 5. Remove the reads from 'sg'.
                                      ## ---------------------------------------------------------------------
                                     -removeReads(sg)
                                     +sg <- removeReads(sg)
                                      countReads(sg)
                                      }

countReads() now supports 'by="gene"' for counting hits by gene (in addition to 'by="sgedge"', 'by="rsgedge"', and 'by="tx"').

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@76090 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 26/04/2013 23:16:07
Showing 1 changed file

man/countReads-methods.Rd @ 55bd937

@@ -39,11 +39,12 @@ countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                        \itemize{
                                          \item unique splicing graph edge, if \code{by="sgedge"};
                                          \item unique \emph{reduced} splicing graph edge, if \code{by="rsgedge"};
                                     -    \item transcript if \code{by="tx"}.
                                     +    \item transcript if \code{by="tx"};
                                     +    \item gene if \code{by="gene"}.
                                        }
                                     -  And with one column per sample (containing the counts for each sample),
                                     -  and the two following additional leading columns:
                                     +  And with one column per sample (containing the counts for that sample),
                                     +  plus the two following additional leading columns:
                                        \itemize{
                                          \item if \code{by="sgedge"}: \code{"sgedge_id"}, containing the
                                                \emph{global splicing graph edge ids}, and \code{"ex_or_in"},
@@ -52,7 +53,8 @@ countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                                \emph{global reduced splicing graph edge ids}, and
                                                \code{"ex_or_in"}, containing the type of edge (exon, intron,
                                                or mixed);
                                     -    \item if \code{by="tx"}: \code{"tx_id"} and \code{"gene_id"}.
                                     +    \item if \code{by="tx"}: \code{"tx_id"} and \code{"gene_id"};
                                     +    \item if \code{by="gene"}: \code{"gene_id"} and \code{"tx_id"}.
                                        }
                                      }
@@ -74,22 +76,16 @@ countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                      example(assignReads)
                                      ## ---------------------------------------------------------------------
                                     -## 2. Count the number of reads per splicing graph edge
                                     +## 2. Summarize the reads assigned to 'sg'
                                      ## ---------------------------------------------------------------------
                                     -countReads(sg)
                                     -
                                     -## ---------------------------------------------------------------------
                                     -## 3. Count the number of reads per reduced splicing graph edge
                                     -## ---------------------------------------------------------------------
                                     -countReads(sg, by="rsgedge")
                                     +countReads(sg)  # nb of reads per splicing graph edge
                                     +countReads(sg, by="rsgedge")  # ... per reduced splicing graph edge
                                     +countReads(sg, by="tx")  # ... per transcript
                                     +countReads(sg, by="gene")  # ... per gene
                                      ## ---------------------------------------------------------------------
                                     -## 4. Count the number of reads per transcript
                                     -## ---------------------------------------------------------------------
                                     -countReads(sg, by="tx")
                                    -
                                     -## ---------------------------------------------------------------------
                                     -## 5. Remove the reads from 'sg'.
                                     +## 3. Remove the reads from 'sg'.
                                      ## ---------------------------------------------------------------------
                                      removeReads(sg)
                                     +countReads(sg)
                                     }
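
The hunks above add by="gene" and list the two leading columns returned for each value of 'by'. A minimal sketch of that mapping in base R follows. The helper leading_columns() is hypothetical and not part of SplicingGraphs, and resolving 'by' with match.arg() is an assumption based on the usual R idiom for a default vector of choices, not something this diff confirms.

```r
## Hypothetical helper (NOT part of SplicingGraphs): returns the two
## leading column names that the man page documents for each 'by' value.
leading_columns <- function(by=c("sgedge", "rsgedge", "tx", "gene"))
{
    ## Assumed idiom: a missing 'by' resolves to the first choice.
    by <- match.arg(by)
    switch(by,
        sgedge=c("sgedge_id", "ex_or_in"),
        rsgedge=c("rsgedge_id", "ex_or_in"),
        tx=c("tx_id", "gene_id"),
        gene=c("gene_id", "tx_id"))
}

leading_columns()            # same as leading_columns(by="sgedge")
leading_columns(by="gene")
```

If countReads() follows the same idiom, calling it without 'by' would summarize per splicing graph edge, which is what the comment on the first example line says.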


Split countReads-methods.R into 2 units: assignReads.R and countReads-methods.R

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@76089 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 26/04/2013 22:47:35
Showing 1 changed files

man/countReads-methods.Rd

History View file @ 75d6518

@@ -2,54 +2,31 @@
                                      \alias{countReads-methods}
                                     -\alias{assignReads}
                                      \alias{countReads}
                                      \alias{countReads,SplicingGraphs-method}
                                     -\alias{removeReads}
                                      \title{
                                     -  Assign reads to the edges of a SplicingGraphs object and summarize them
                                     +  Summarize the reads assigned to the edges of a SplicingGraphs object
                                     }
                                      \description{
                                     -  \code{assignReads} assigns reads to the exonic and intronic edges of a
                                     -  \link{SplicingGraphs} object.
                                    -
                                     -  \code{countReads} returns a summarized count of the assigned reads.
                                    -
                                     -  \code{removeReads} removes all the reads assigned to a
                                     -  \link{SplicingGraphs} object.
                                     +  \code{countReads} returns a summarized count of the reads assigned to
                                     +  the edges of a SplicingGraphs object.
                                     }
                                      \usage{
                                     -assignReads(sg, reads, sample.name=NA)
                                    -
                                     -countReads(x, by=c("sgedge", "rsgedge", "tx"))
                                    -
                                     -removeReads(sg)
                                     +countReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
                                     }
                                      \arguments{
                                     -  \item{sg, x}{
                                     +  \item{x}{
                                          A \link{SplicingGraphs} object.
                                       }
                                     -  \item{reads}{
                                     -    A \link[GenomicRanges]{GAlignments},
                                     -    \link[GenomicRanges]{GAlignmentPairs}, or
                                     -    \link[GenomicRanges]{GRangesList} object, containing the
                                     -    reads to assign to the exons and introns in \code{sg}.
                                     -    It must have unique names on it, typically the QNAME ("query name")
                                     -    field coming from the BAM file. More on this in the 'About the read
                                     -    names' section below.
                                     -  }
                                     -  \item{sample.name}{
                                     -    A single string containing the name of the sample where the reads
                                     -    are coming from.
                                     -  }
                                        \item{by}{
                                     -    Summarize by splicing graph edge (\code{"sgedge"}), by \emph{reduced}
                                     -    splicing graph edge (\code{"rsgedge"}), or by transcript (\code{"tx"}).
                                     +    Summarize by splicing graph edge (\code{by="sgedge"}), by \emph{reduced}
                                     +    splicing graph edge (\code{by="rsgedge"}), by transcript (\code{by="tx"}),
                                     +    or by gene (\code{by="gene"}).
                                       }
                                     }
@@ -57,65 +34,16 @@ removeReads(sg)
                                        TODO
                                     }
                                     -\section{About read names}{
                                     -  The read names are typically imported from the BAM file by calling
                                     -  \code{\link[GenomicRanges]{readGAlignments}} (or
                                     -  \code{\link[GenomicRanges]{readGAlignmentPairs}}) with
                                     -  \code{use.names=TRUE}. This extracts the "query names" from the
                                     -  file (stored in the QNAME field), and makes them the names of the
                                     -  returned object.
                                    -
                                     -  The \code{reads} object must have unique names on it. The presence of
                                     -  duplicated names generally indicates one (or both) of the following
                                     -  situations:
                                    -
                                     -  \itemize{
                                     -    \item (a) \code{reads} contains paired-end reads that have not been
                                     -              paired;
                                     -    \item (b) some of the reads are \emph{secondary alignments}.
                                     -  }
                                    -
                                     -  If (a): you can find out whether reads in a BAM file are single- or
                                     -  paired-end with the \code{\link[Rsamtools]{quickCountBam}} utility
                                     -  from the Rsamtools package. If they're paired-end, load them with
                                     -  \code{\link[GenomicRanges]{readGAlignmentPairs}}
                                     -  instead of \code{\link[GenomicRanges]{readGAlignments}}, and that
                                     -  will pair them.
                                    -
                                     -  If (b): you can filter out secondary alignments by passing
                                     -  \code{'isNotPrimaryRead=FALSE'} to \code{\link[Rsamtools]{scanBamFlag}}
                                     -  when preparing the \link[Rsamtools]{ScanBamParam} object used to load
                                     -  the reads. For example:
                                     -  \preformatted{
                                     -    library(Rsamtools)
                                     -    flag0 <- scanBamFlag(isNotPrimaryRead=FALSE,
                                     -                         isNotPassingQualityControls=FALSE,
                                     -                         isDuplicate=FALSE)
                                     -    param0 <- ScanBamParam(flag=flag0)
                                     -    reads <- readGAlignments("path/to/BAM/file", use.names=TRUE,
                                     -                                  param=param0)
                                     -  }
                                     -  This will filter out records that have flag 0x100 (secondary alignment)
                                     -  set to 1. See \code{?\link[Rsamtools]{scanBamFlag}} in the Rsamtools
                                     -  package for more information.
                                     -  See the SAM Specs on the SAMtools project page at
                                    -  \url{http://samtools.sourceforge.net/} for a description of the
                                     -  SAM/BAM flags.
                                     -}
                                    -
                                      \value{
                                     -  For \code{assignReads}: the supplied \link{SplicingGraphs} object with
                                     -  the reads assigned to it.
                                    -
                                     -  For \code{countReads}: a \link[IRanges]{DataFrame} object.
                                     -  It has one row per:
                                     +  A \link[IRanges]{DataFrame} object with one row per:
                                        \itemize{
                                          \item unique splicing graph edge, if \code{by="sgedge"};
                                          \item unique \emph{reduced} splicing graph edge, if \code{by="rsgedge"};
                                          \item transcript if \code{by="tx"}.
                                       }
                                     -  The returned \link[IRanges]{DataFrame} object has one column of counts per
                                     -  sample, and the two following additional leading columns:
                                    +
                                     +  And with one column per sample (containing the counts for each sample),
                                     +  and the two following additional leading columns:
                                        \itemize{
                                          \item if \code{by="sgedge"}: \code{"sgedge_id"}, containing the
                                                \emph{global splicing graph edge ids}, and \code{"ex_or_in"},
@@ -136,83 +64,32 @@ removeReads(sg)
                                        This man page is part of the SplicingGraphs package.
                                        Please see \code{?`\link{SplicingGraphs-package}`} for an overview of the
                                        package and for an index of its man pages.
                                    -
                                     -  Other topics related to this man page and documented in other packages:
                                     -  \itemize{
                                     -    \item The \link[GenomicRanges]{GRangesList},
                                     -          \link[GenomicRanges]{GAlignments}, and
                                     -          \link[GenomicRanges]{GAlignmentPairs} classes
                                     -          in the GenomicRanges package.
                                    -
                                     -    \item The \code{\link[Rsamtools]{quickCountBam}} and
                                     -          \code{\link[Rsamtools]{ScanBamParam}} functions in the
                                     -          Rsamtools package.
                                     -  }
                                     }
                                      \examples{
                                      ## ---------------------------------------------------------------------
                                     -## 1. Make SplicingGraphs object 'sg' from toy gene model (see
                                     -##    '?SplicingGraphs')
                                     -## ---------------------------------------------------------------------
                                     -example(SplicingGraphs)
                                     -sg
                                    -
                                     -## 'sg' has 1 element per gene and 'names(sg)' gives the gene ids.
                                     -names(sg)
                                    -
                                     +## 1. Make SplicingGraphs object 'sg' from toy gene model and assign toy
                                     +##    reads to it (see '?assignReads')
                                      ## ---------------------------------------------------------------------
                                     -## 2. Load toy reads
                                     -## ---------------------------------------------------------------------
                                     -## Load toy reads (single-end) from a BAM file. We filter out secondary
                                     -## alignments, reads not passing quality controls, and PCR or optical
                                     -## duplicates (see ?scanBamFlag in the Rsamtools package for more
                                     -## information):
                                     -flag0 <- scanBamFlag(isNotPrimaryRead=FALSE,
                                     -                     isNotPassingQualityControls=FALSE,
                                     -                     isDuplicate=FALSE)
                                     -param0 <- ScanBamParam(flag=flag0)
                                     -gal <- readGAlignments(toy_reads_bam(), use.names=TRUE, param=param0)
                                     -gal
                                    -
                                     -## ---------------------------------------------------------------------
                                     -## 3. Assign the reads to the exons and introns in 'sg'
                                     -## ---------------------------------------------------------------------
                                     -## The same read can be assigned to more than 1 exon or intron (e.g. a
                                     -## junction read with 1 gap can be assigned to 2 exons and 1 intron).
                                     -sg <- assignReads(sg, gal, sample.name="TOYREADS")
                                    -
                                     -## See the assignments to the splicing graph edges.
                                     -edge_by_tx <- sgedgesByTranscript(sg, with.hits.mcols=TRUE)
                                     -edge_data <- mcols(unlist(edge_by_tx))
                                     -colnames(edge_data)
                                     -head(edge_data)
                                     -edge_data[ , c("sgedge_id", "TOYREADS.hits")]
                                    -
                                     -edge_by_gene <- sgedgesByGene(sg, with.hits.mcols=TRUE)
                                     -mcols(unlist(edge_by_gene))
                                    -
                                     -## See the assignments to the reduced splicing graph edges.
                                     -redge_by_gene <- rsgedgesByGene(sg, with.hits.mcols=TRUE)
                                     -mcols(unlist(redge_by_gene))
                                     +example(assignReads)
                                      ## ---------------------------------------------------------------------
                                     -## 4. Count the number of reads per splicing graph edge
                                     +## 2. Count the number of reads per splicing graph edge
                                      ## ---------------------------------------------------------------------
                                      countReads(sg)
                                      ## ---------------------------------------------------------------------
                                     -## 5. Count the number of reads per reduced splicing graph edge
                                     +## 3. Count the number of reads per reduced splicing graph edge
                                      ## ---------------------------------------------------------------------
                                      countReads(sg, by="rsgedge")
                                      ## ---------------------------------------------------------------------
                                     -## 6. Count the number of reads per transcript
                                     +## 4. Count the number of reads per transcript
                                      ## ---------------------------------------------------------------------
                                      countReads(sg, by="tx")
                                      ## ---------------------------------------------------------------------
                                     -## 7. Remove the reads from 'sg'.
                                     +## 5. Remove the reads from 'sg'.
                                      ## ---------------------------------------------------------------------
                                      removeReads(sg)
                                     }
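
The 'About read names' section moved out of this file requires unique read names and attributes duplicates to unpaired paired-end reads or to secondary alignments (flag 0x100). The base R sketch below illustrates both checks on made-up vectors. The names 'qname', 'flag', 'is_secondary', and 'keep' are hypothetical, and this does not show how Rsamtools or SplicingGraphs test the flag internally; in practice the filtering would be done with scanBamFlag() as in the removed \preformatted block.

```r
## Hypothetical read names and SAM flags; in practice they would come
## from the QNAME and FLAG fields of a BAM file.
qname <- c("r1", "r2", "r2", "r3")
flag  <- c(0L, 16L, 272L, 256L)

## Duplicated names are the situation the section warns about.
anyDuplicated(qname) > 0

## Bit 0x100 of the SAM flag marks a secondary alignment.
is_secondary <- bitwAnd(flag, 0x100L) != 0L

## Dropping secondary alignments leaves unique names in this toy data.
keep <- !is_secondary
anyDuplicated(qname[keep]) == 0
```

In this toy data the duplicate "r2" disappears once the records with bit 0x100 set are dropped; duplicates caused by unpaired paired-end reads would remain and call for readGAlignmentPairs instead.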


Starting unit tests + other minor tweaks.

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/SplicingGraphs@76087 bc3139a8-67e5-0310-9ffc-ced21a209358

Herve Pages authored on 26/04/2013 22:09:30
Showing 1 changed files

man/countReads-methods.Rd

History View file @ 94444bd

                                     new file mode 100644
@@ -0,0 +1,218 @@
                                     +\name{countReads-methods}
                                    +
                                     +\alias{countReads-methods}
                                    +
                                     +\alias{assignReads}
                                     +\alias{countReads}
                                     +\alias{countReads,SplicingGraphs-method}
                                     +\alias{removeReads}
                                    +
                                    +
                                     +\title{
                                     +  Assign reads to the edges of a SplicingGraphs object and summarize them
                                     +}
                                    +
                                     +\description{
                                     +  \code{assignReads} assigns reads to the exonic and intronic edges of a
                                     +  \link{SplicingGraphs} object.
                                    +
                                     +  \code{countReads} returns a summarized count of the assigned reads.
                                    +
                                     +  \code{removeReads} removes all the reads assigned to a
                                     +  \link{SplicingGraphs} object.
                                     +}
                                    +
                                     +\usage{
                                     +assignReads(sg, reads, sample.name=NA)
                                    +
                                     +countReads(x, by=c("sgedge", "rsgedge", "tx"))
                                    +
                                     +removeReads(sg)
                                     +}
                                    +
                                     +\arguments{
                                     +  \item{sg, x}{
                                     +    A \link{SplicingGraphs} object.
                                     +  }
                                     +  \item{reads}{
                                     +    A \link[GenomicRanges]{GAlignments},
                                     +    \link[GenomicRanges]{GAlignmentPairs}, or
                                     +    \link[GenomicRanges]{GRangesList} object, containing the
                                     +    reads to assign to the exons and introns in \code{sg}.
                                     +    It must have unique names on it, typically the QNAME ("query name")
                                     +    field coming from the BAM file. More on this in the 'About the read
                                     +    names' section below.
                                     +  }
                                     +  \item{sample.name}{
                                     +    A single string containing the name of the sample where the reads
                                     +    are coming from.
                                     +  }
                                     +  \item{by}{
                                     +    Summarize by splicing graph edge (\code{"sgedge"}), by \emph{reduced}
                                     +    splicing graph edge (\code{"rsgedge"}), or by transcript (\code{"tx"}).
                                     +  }
                                     +}
                                    +
                                     +\details{
                                     +  TODO
                                     +}
                                    +
                                     +\section{About read names}{
                                     +  The read names are typically imported from the BAM file by calling
                                     +  \code{\link[GenomicRanges]{readGAlignments}} (or
                                     +  \code{\link[GenomicRanges]{readGAlignmentPairs}}) with
                                     +  \code{use.names=TRUE}. This extracts the "query names" from the
                                     +  file (stored in the QNAME field), and makes them the names of the
                                     +  returned object.
                                    +
                                     +  The \code{reads} object must have unique names on it. The presence of
                                     +  duplicated names generally indicates one (or both) of the following
                                     +  situations:
                                    +
                                     +  \itemize{
                                     +    \item (a) \code{reads} contains paired-end reads that have not been
                                     +              paired;
                                     +    \item (b) some of the reads are \emph{secondary alignments}.
                                     +  }
                                    +
                                     +  If (a): you can find out whether reads in a BAM file are single- or
                                     +  paired-end with the \code{\link[Rsamtools]{quickCountBam}} utility
                                     +  from the Rsamtools package. If they're paired-end, load them with
                                     +  \code{\link[GenomicRanges]{readGAlignmentPairs}}
                                     +  instead of \code{\link[GenomicRanges]{readGAlignments}}, and that
                                     +  will pair them.
                                    +
                                     +  If (b): you can filter out secondary alignments by passing
                                     +  \code{'isNotPrimaryRead=FALSE'} to \code{\link[Rsamtools]{scanBamFlag}}
                                     +  when preparing the \link[Rsamtools]{ScanBamParam} object used to load
                                     +  the reads. For example:
                                     +  \preformatted{
                                     +    library(Rsamtools)
                                     +    flag0 <- scanBamFlag(isNotPrimaryRead=FALSE,
                                     +                         isNotPassingQualityControls=FALSE,
                                     +                         isDuplicate=FALSE)
                                     +    param0 <- ScanBamParam(flag=flag0)
                                     +    reads <- readGAlignments("path/to/BAM/file", use.names=TRUE,
                                     +                                  param=param0)
                                     +  }
                                     +  This will filter out records that have flag 0x100 (secondary alignment)
                                     +  set to 1. See \code{?\link[Rsamtools]{scanBamFlag}} in the Rsamtools
                                     +  package for more information.
                                     +  See the SAM Specs on the SAMtools project page at
                                    +  \url{http://samtools.sourceforge.net/} for a description of the
                                     +  SAM/BAM flags.
                                     +}
                                    +
                                     +\value{
                                     +  For \code{assignReads}: the supplied \link{SplicingGraphs} object with
                                     +  the reads assigned to it.
                                    +
                                     +  For \code{countReads}: a \link[IRanges]{DataFrame} object.
                                     +  It has one row per:
                                     +  \itemize{
                                     +    \item unique splicing graph edge, if \code{by="sgedge"};
                                     +    \item unique \emph{reduced} splicing graph edge, if \code{by="rsgedge"};
                                     +    \item transcript if \code{by="tx"}.
                                     +  }
                                     +  The returned \link[IRanges]{DataFrame} object has one column of counts per
                                     +  sample, and the two following additional leading columns:
                                     +  \itemize{
                                     +    \item if \code{by="sgedge"}: \code{"sgedge_id"}, containing the
                                     +          \emph{global splicing graph edge ids}, and \code{"ex_or_in"},
                                     +          containing the type of edge (exon or intron);
                                     +    \item if \code{by="rsgedge"}: \code{"rsgedge_id"}, containing the
                                     +          \emph{global reduced splicing graph edge ids}, and
                                     +          \code{"ex_or_in"}, containing the type of edge (exon, intron,
                                     +          or mixed);
                                     +    \item if \code{by="tx"}: \code{"tx_id"} and \code{"gene_id"}.
                                     +  }
                                     +}
                                    +
                                     +\author{
                                     +  H. Pages
                                     +}
                                    +
                                     +\seealso{
                                     +  This man page is part of the SplicingGraphs package.
                                     +  Please see \code{?`\link{SplicingGraphs-package}`} for an overview of the
                                     +  package and for an index of its man pages.
                                    +
                                     +  Other topics related to this man page and documented in other packages:
                                     +  \itemize{
                                     +    \item The \link[GenomicRanges]{GRangesList},
                                     +          \link[GenomicRanges]{GAlignments}, and
                                     +          \link[GenomicRanges]{GAlignmentPairs} classes
                                     +          in the GenomicRanges package.
                                    +
                                     +    \item The \code{\link[Rsamtools]{quickCountBam}} and
                                     +          \code{\link[Rsamtools]{ScanBamParam}} functions in the
                                     +          Rsamtools package.
                                     +  }
                                     +}
                                    +
                                     +\examples{
                                     +## ---------------------------------------------------------------------
                                     +## 1. Make SplicingGraphs object 'sg' from toy gene model (see
                                     +##    '?SplicingGraphs')
                                     +## ---------------------------------------------------------------------
                                     +example(SplicingGraphs)
                                     +sg
                                    +
                                     +## 'sg' has 1 element per gene and 'names(sg)' gives the gene ids.
                                     +names(sg)
                                    +
                                     +## ---------------------------------------------------------------------
                                     +## 2. Load toy reads
                                     +## ---------------------------------------------------------------------
                                     +## Load toy reads (single-end) from a BAM file. We filter out secondary
                                     +## alignments, reads not passing quality controls, and PCR or optical
                                     +## duplicates (see ?scanBamFlag in the Rsamtools package for more
                                     +## information):
                                     +flag0 <- scanBamFlag(isNotPrimaryRead=FALSE,
                                     +                     isNotPassingQualityControls=FALSE,
                                     +                     isDuplicate=FALSE)
                                     +param0 <- ScanBamParam(flag=flag0)
                                     +gal <- readGAlignments(toy_reads_bam(), use.names=TRUE, param=param0)
                                     +gal
                                    +
                                     +## ---------------------------------------------------------------------
                                     +## 3. Assign the reads to the exons and introns in 'sg'
                                     +## ---------------------------------------------------------------------
                                     +## The same read can be assigned to more than 1 exon or intron (e.g. a
                                     +## junction read with 1 gap can be assigned to 2 exons and 1 intron).
                                     +sg <- assignReads(sg, gal, sample.name="TOYREADS")
                                    +
                                     +## See the assignments to the splicing graph edges.
                                     +edge_by_tx <- sgedgesByTranscript(sg, with.hits.mcols=TRUE)
                                     +edge_data <- mcols(unlist(edge_by_tx))
                                     +colnames(edge_data)
                                     +head(edge_data)
                                     +edge_data[ , c("sgedge_id", "TOYREADS.hits")]
                                     +
                                     +edge_by_gene <- sgedgesByGene(sg, with.hits.mcols=TRUE)
                                     +mcols(unlist(edge_by_gene))
                                     +
                                     +## See the assignments to the reduced splicing graph edges.
                                     +redge_by_gene <- rsgedgesByGene(sg, with.hits.mcols=TRUE)
                                     +mcols(unlist(redge_by_gene))
                                     +
                                     +## ---------------------------------------------------------------------
                                     +## 4. Count the number of reads per splicing graph edge
                                     +## ---------------------------------------------------------------------
                                     +countReads(sg)
                                     +
                                     +## ---------------------------------------------------------------------
                                     +## 5. Count the number of reads per reduced splicing graph edge
                                     +## ---------------------------------------------------------------------
                                     +countReads(sg, by="rsgedge")
                                     +
                                     +## ---------------------------------------------------------------------
                                     +## 6. Count the number of reads per transcript
                                     +## ---------------------------------------------------------------------
                                     +countReads(sg, by="tx")
                                     +
                                     +## ---------------------------------------------------------------------
                                     +## 7. Remove the reads from 'sg'.
                                     +## ---------------------------------------------------------------------
                                     +removeReads(sg)
                                     +}

...	...	@@ -151,7 +151,7 @@ reportReads(x, by=c("sgedge", "rsgedge", "tx", "gene"))
151	151	}
152	152
153	153	\seealso{
154		- This man page is part of the SplicingGraphs package.
	154	+ This man page is part of the \pkg{SplicingGraphs} package.
155	155	Please see \code{?`\link{SplicingGraphs-package}`} for an overview of the
156	156	package and for an index of its man pages.
157	157	}