Improving documentation (with help from ChatGPT-4)

MarekGierlinski authored on 25/04/2023 08:22:49
Showing 5 changed files

... ...
@@ -72,10 +72,12 @@
72 72
 
73 73
  - Pre-release Bioconductor version.
74 74
 
75
-## Version 0.99.1 (2023-04-24)
75
+## Version 0.99.1 (2023-04-25)
76 76
 
77
- - BioPlanet database vanished from internet and there is no sign of it coming back. Removing all BioPlanet-relaed code and replacing BioPlanet with GO in the vignette and examples (this, alas, makes it longer to check).
78
- - OK, it is back, but I keep GO examples and vinettes.
77
+ - BioPlanet database vanished from internet and there is no sign of it coming back. Removing all BioPlanet-related code and replacing BioPlanet with GO in the vignette and examples (this, alas, makes it longer to check).
78
+ - OK, it is back, but I keep GO examples and vignettes.
79
+ - Minor improvements to documentation.
80
+ 
79 81
  
80 82
  
81 83
  
... ...
@@ -1,38 +1,39 @@
1
-#' Prepare term data for enrichment analysis
1
+#' Prepare Term Data for Enrichment Analysis
2 2
 #'
3
-#' Prepare term data downloaded with \code{fetch_*} functions for fast
4
-#' enrichment analysis.
3
+#' Process term data downloaded with the \code{fetch_*} functions, preparing it
4
+#' for fast enrichment analysis using \code{functional_enrichment}.
5 5
 #'
6 6
 #' @details
7 7
 #'
8
-#' Takes two tibbles with functional term information (\code{terms}) and
9
-#' feature mapping (\code{mapping}) and converts them into an object required by
10
-#' \code{functional_enrichment} for fast analysis. Terms and mapping can be
11
-#' created with database access functions in this package, for example
12
-#' \code{fetch_reactome} or \code{fetch_go_from_go}.
8
+#' This function takes two tibbles containing functional term information
9
+#' (\code{terms}) and feature mapping (\code{mapping}), and converts them into
10
+#' an object required by \code{functional_enrichment} for efficient analysis.
11
+#' Terms and mapping can be generated with the database access functions
12
+#' included in this package, such as \code{fetch_reactome} or
13
+#' \code{fetch_go_from_go}.
13 14
 #'
14 15
 #' @param terms A tibble with at least two columns: \code{term_id} and
15
-#'   \code{term_name}. Contains information about functional term
16
-#'   names/descriptions.
17
-#' @param mapping A tibble with at least two columns, containing mapping between
18
-#'   functional terms and features. One column needs to be called \code{term_id}
19
-#'   and the other column has a name specified by \code{feature_name} argument.
20
-#'   For example, if \code{mapping} contains columns \code{term_id},
21
-#'   \code{accession_number} and  \code{gene_symbol} then setting
22
-#'   \code{feature_name = "gene_symbol"} indicates that gene symbols will be
23
-#'   used for enrichment.
24
-#' @param all_features A vector with all feature ids used as background for
16
+#'   \code{term_name}. This tibble contains information about functional term
17
+#'   names and descriptions.
18
+#' @param mapping A tibble with at least two columns, containing the mapping
19
+#'   between functional terms and features. One column must be named
20
+#'   \code{term_id}, while the other column should have a name specified by the
21
+#'   \code{feature_name} argument. For example, if \code{mapping} contains
22
+#'   columns \code{term_id}, \code{accession_number}, and \code{gene_symbol},
23
+#'   setting \code{feature_name = "gene_symbol"} indicates that gene symbols
24
+#'   will be used for enrichment analysis.
25
+#' @param all_features A vector with all feature IDs used as the background for
25 26
 #'   enrichment. If not specified, all features found in \code{mapping} will be
26 27
 #'   used, resulting in a larger object size.
27
-#' @param feature_name The name of the column in \code{mapping} tibble to be
28
-#'   used as feature.For example, if \code{mapping} contains columns \code{term_id},
29
-#'   \code{accession_number} and  \code{gene_symbol} then setting
30
-#'   \code{feature_name = "gene_symbol"} indicates that gene symbols will be
31
-#'   used for enrichment.
28
+#' @param feature_name The name of the column in the \code{mapping} tibble to be
29
+#'   used as the feature identifier. For example, if \code{mapping} contains
30
+#'   columns \code{term_id}, \code{accession_number}, and \code{gene_symbol},
31
+#'   setting \code{feature_name = "gene_symbol"} indicates that gene symbols
32
+#'   will be used for enrichment analysis.
32 33
 #'
33
-#'
34
-#' @return An object class \code{fenr_terms} required by
34
+#' @return An object of class \code{fenr_terms} required by
35 35
 #'   \code{functional_enrichment}.
36
+#' @importFrom assertthat assert_that
36 37
 #' @export
37 38
 #' @examples
38 39
 #' data(exmpl_all)
... ...
@@ -43,35 +44,32 @@ prepare_for_enrichment <- function(terms, mapping, all_features = NULL, feature_
43 44
   feature_id <- term_id <- NULL
44 45
 
45 46
   # Argument checks
46
-  if (!is.data.frame(terms) && !tibble::is_tibble(terms)) {
47
-    stop("'terms' must be a data frame or tibble.")
48
-  }
47
+  assert_that(is.data.frame(terms) || tibble::is_tibble(terms),
48
+              msg = "'terms' must be a data frame or tibble.")
49 49
 
50
-  if (!is.data.frame(mapping) && !tibble::is_tibble(mapping)) {
51
-    stop("'mapping' must be a data frame or tibble.")
52
-  }
50
+  assert_that(is.data.frame(mapping) || tibble::is_tibble(mapping),
51
+              msg = "'mapping' must be a data frame or tibble.")
53 52
 
54
-  if (!is.null(all_features) && !is.vector(all_features)) {
55
-    stop("'all_features' must be a vector or NULL.")
56
-  }
53
+  assert_that(is.null(all_features) || is.vector(all_features),
54
+              msg = "'all_features' must be a vector or NULL.")
57 55
 
58
-  if (!is.character(feature_name) || length(feature_name) != 1) {
59
-    stop("'feature_name' must be a single string.")
60
-  }
56
+  assert_that(is.character(feature_name) && length(feature_name) == 1,
57
+              msg = "'feature_name' must be a single string.")
61 58
 
62 59
   # Check terms
63
-  if (!all(c("term_id", "term_name") %in% colnames(terms)))
64
-    stop("Column names in 'terms' should be 'term_id' and 'term_name'.")
65
-  if(anyDuplicated(terms$term_id) > 0)
66
-    stop("Duplicated term_id detected in 'terms'.")
60
+  assert_that(all(c("term_id", "term_name") %in% colnames(terms)),
61
+              msg = "Column names in 'terms' should be 'term_id' and 'term_name'.")
62
+
63
+  assert_that(anyDuplicated(terms$term_id) == 0,
64
+              msg = "Duplicated term_id detected in 'terms'.")
67 65
 
68 66
   # Check mapping
69
-  if (!("term_id" %in% colnames(mapping)))
70
-    stop("'mapping' should contain a column named 'term_id'.")
67
+  assert_that("term_id" %in% colnames(mapping),
68
+              msg = "'mapping' should contain a column named 'term_id'.")
71 69
 
72 70
   # Check for feature name
73
-  if (!(feature_name %in% colnames(mapping)))
74
-    stop(feature_name, " column not found in mapping table. Check feature_name argument.")
71
+  assert_that(feature_name %in% colnames(mapping),
72
+              msg = paste0(feature_name, " column not found in mapping table. Check 'feature_name' argument."))
75 73
 
76 74
   # Replace empty all_features with everything from mapping
77 75
   map_features <- mapping[[feature_name]] |>
... ...
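The hunk above replaces `if (...) stop(...)` guards with `assertthat::assert_that(..., msg = ...)`, which states the condition that must hold instead of its negation. A minimal sketch of the two styles side by side, assuming the `assertthat` package is installed; `check_old` and `check_new` are hypothetical helper names used only for this illustration and are not part of `fenr`:

```r
library(assertthat)

# Old style: test the negated condition, then call stop()
check_old <- function(feature_name) {
  if (!is.character(feature_name) || length(feature_name) != 1) {
    stop("'feature_name' must be a single string.")
  }
  invisible(TRUE)
}

# New style: state the condition that must be TRUE
check_new <- function(feature_name) {
  assert_that(is.character(feature_name) && length(feature_name) == 1,
              msg = "'feature_name' must be a single string.")
  invisible(TRUE)
}

# Both helpers reject the same bad input; capture the error messages
msg_old <- tryCatch(check_old(42), error = function(e) conditionMessage(e))
msg_new <- tryCatch(check_new(42), error = function(e) conditionMessage(e))

# TRUE when the two styles report the same message
identical(msg_old, msg_new)
```

Note that the logical operators flip in the conversion (`!a || !b` becomes `a && b`), which is the part most worth reviewing in a change like this.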
@@ -134,37 +132,39 @@ prepare_for_enrichment <- function(terms, mapping, all_features = NULL, feature_
134 132
 }
135 133
 
136 134
 
137
-#' Fast functional enrichment
138
-#'
139
-#' Fast functional enrichment based on hypergeometric distribution. Can be used
140
-#' in interactive applications.
135
+
136
+#' Fast Functional Enrichment
141 137
 #'
142
-#' @details
138
+#' Perform fast functional enrichment analysis based on the hypergeometric
139
+#' distribution. Designed for use in interactive applications.
143 140
 #'
144
-#' Functional enrichment in a selection (e.g. differentially expressed genes) of
145
-#' features, using hypergeometric probability (that is, Fisher's exact test). A
146
-#' feature can be a gene, protein, etc. \code{term_data} is an object with
147
-#' functional term information and feature-term mapping
141
+#' @details This function carries out functional enrichment analysis on a
142
+#'   selection of features (e.g., differentially expressed genes) using the
143
+#'   hypergeometric probability distribution (Fisher's exact test). Features can
144
+#'   be genes, proteins, etc. The \code{term_data} object contains functional
145
+#'   term information and feature-term mapping.
148 146
 #'
149
-#' @param feat_all A character vector with all feature identifiers. This is the
150
-#'   background for enrichment.
147
+#' @param feat_all A character vector with all feature identifiers, serving as
148
+#'   the background for enrichment.
151 149
 #' @param feat_sel A character vector with feature identifiers in the selection.
152
-#' @param term_data An object class \code{fenr_terms}, created by
150
+#' @param term_data An object of class \code{fenr_terms}, created by
153 151
 #'   \code{prepare_for_enrichment}.
154
-#' @param feat2name An optional named list to convert feature ids into feature
152
+#' @param feat2name An optional named list to convert feature IDs into feature
155 153
 #'   names.
156 154
 #'
157
-#' @return A tibble with enrichment results. For each term the following
158
-#'   quantities are reported: \itemize{ \item{\code{N_with} - number of features
159
-#'   with this term among all features} \item{\code{n_with_sel} - number
160
-#'   of features with this term in the selection} \item{\code{n_expect} -
161
-#'   expected number of features with this term in the selection, under the null
162
-#'   hypothesis that terms are mapped to features randomly}
163
-#'   \item{\code{enrichment} - ratio of n_with_sel / n_expect}
164
-#'   \item{\code{odds_ratio} - odds ratio for enrichment; is infinite, when all
165
-#'   features with the given term are in the selection} \item{\code{p_value} -
166
-#'   p-value from a single hypergeometric test} \item{\code{p_adjust} - p-value
167
-#'   adjusted for multiple tests using Benjamini-Hochberg approach}}.
155
+#' @return A tibble with enrichment results, providing the following information
156
+#'   for each term:
157
+#'   \itemize{
158
+#'     \item{\code{N_with} - number of features with this term among all features}
159
+#'     \item{\code{n_with_sel} - number of features with this term in the selection}
160
+#'     \item{\code{n_expect} - expected number of features with this term in the selection,
161
+#'       under the null hypothesis that terms are mapped to features randomly}
162
+#'     \item{\code{enrichment} - ratio of n_with_sel / n_expect}
163
+#'     \item{\code{odds_ratio} - odds ratio for enrichment; is infinite when all
164
+#'       features with the given term are in the selection}
165
+#'     \item{\code{p_value} - p-value from a single hypergeometric test}
166
+#'     \item{\code{p_adjust} - p-value adjusted for multiple tests using the Benjamini-Hochberg approach}
167
+#'   }.
168 168
 #'
169 169
 #' @importFrom assertthat assert_that
170 170
 #' @importFrom methods is
... ...
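The quantities listed in the `@return` section above can be illustrated for a single term using base R only. This is a sketch with made-up counts, not the package's implementation; in particular, computing `odds_ratio` as the sample odds ratio of the 2x2 table is an assumption, chosen because it becomes infinite when all features with the term are in the selection, as the documentation describes.

```r
# Toy counts (made-up numbers, for illustration only)
N_all      <- 1000  # all background features
N_with     <- 50    # background features annotated with the term
n_sel      <- 100   # features in the selection
n_with_sel <- 12    # selected features annotated with the term

# Expected count under the null hypothesis, and the enrichment ratio
n_expect   <- n_sel * N_with / N_all
enrichment <- n_with_sel / n_expect

# Upper-tail hypergeometric probability: P(X >= n_with_sel)
p_hyper <- phyper(n_with_sel - 1, N_with, N_all - N_with, n_sel,
                  lower.tail = FALSE)

# The same test as a 2x2 contingency table and one-sided Fisher's exact test
# Rows: with term / without term; columns: selected / not selected
tab <- matrix(c(n_with_sel,          n_sel - n_with_sel,
                N_with - n_with_sel, N_all - N_with - (n_sel - n_with_sel)),
              nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# The two p-values agree up to numerical tolerance
all.equal(p_hyper, p_fisher)

# Sample odds ratio (assumed definition); Inf when tab[1, 2] is zero
odds_ratio <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

# Benjamini-Hochberg adjustment across several terms
# (the other two p-values are made up)
p_adjust <- p.adjust(c(p_hyper, 0.2, 0.8), method = "BH")
```

With these counts `n_expect` is 5 and `enrichment` is 2.4; the p-values are left for the reader to print.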
@@ -2,44 +2,47 @@
2 2
 % Please edit documentation in R/enrichment.R
3 3
 \name{functional_enrichment}
4 4
 \alias{functional_enrichment}
5
-\title{Fast functional enrichment}
5
+\title{Fast Functional Enrichment}
6 6
 \usage{
7 7
 functional_enrichment(feat_all, feat_sel, term_data, feat2name = NULL)
8 8
 }
9 9
 \arguments{
10
-\item{feat_all}{A character vector with all feature identifiers. This is the
11
-background for enrichment.}
10
+\item{feat_all}{A character vector with all feature identifiers, serving as
11
+the background for enrichment.}
12 12
 
13 13
 \item{feat_sel}{A character vector with feature identifiers in the selection.}
14 14
 
15
-\item{term_data}{An object class \code{fenr_terms}, created by
15
+\item{term_data}{An object of class \code{fenr_terms}, created by
16 16
 \code{prepare_for_enrichment}.}
17 17
 
18
-\item{feat2name}{An optional named list to convert feature ids into feature
18
+\item{feat2name}{An optional named list to convert feature IDs into feature
19 19
 names.}
20 20
 }
21 21
 \value{
22
-A tibble with enrichment results. For each term the following
23
-  quantities are reported: \itemize{ \item{\code{N_with} - number of features
24
-  with this term among all features} \item{\code{n_with_sel} - number
25
-  of features with this term in the selection} \item{\code{n_expect} -
26
-  expected number of features with this term in the selection, under the null
27
-  hypothesis that terms are mapped to features randomly}
28
-  \item{\code{enrichment} - ratio of n_with_sel / n_expect}
29
-  \item{\code{odds_ratio} - odds ratio for enrichment; is infinite, when all
30
-  features with the given term are in the selection} \item{\code{p_value} -
31
-  p-value from a single hypergeometric test} \item{\code{p_adjust} - p-value
32
-  adjusted for multiple tests using Benjamini-Hochberg approach}}.
22
+A tibble with enrichment results, providing the following information
23
+  for each term:
24
+  \itemize{
25
+    \item{\code{N_with} - number of features with this term among all features}
26
+    \item{\code{n_with_sel} - number of features with this term in the selection}
27
+    \item{\code{n_expect} - expected number of features with this term in the selection,
28
+      under the null hypothesis that terms are mapped to features randomly}
29
+    \item{\code{enrichment} - ratio of n_with_sel / n_expect}
30
+    \item{\code{odds_ratio} - odds ratio for enrichment; is infinite when all
31
+      features with the given term are in the selection}
32
+    \item{\code{p_value} - p-value from a single hypergeometric test}
33
+    \item{\code{p_adjust} - p-value adjusted for multiple tests using the Benjamini-Hochberg approach}
34
+  }.
33 35
 }
34 36
 \description{
35
-Fast functional enrichment based on hypergeometric distribution. Can be used
36
-in interactive applications.
37
+Perform fast functional enrichment analysis based on the hypergeometric
38
+distribution. Designed for use in interactive applications.
37 39
 }
38 40
 \details{
39
-Functional enrichment in a selection (e.g. differentially expressed genes) of
40
-features, using hypergeometric probability (that is, Fisher's exact test). A
41
-feature can be a gene, protein, etc. \code{term_data} is an object with
42
-functional term information and feature-term mapping
41
+This function carries out functional enrichment analysis on a
42
+  selection of features (e.g., differentially expressed genes) using the
43
+  hypergeometric probability distribution (Fisher's exact test). Features can
44
+  be genes, proteins, etc. The \code{term_data} object contains functional
45
+  term information and feature-term mapping.
43 46
 }
44 47
 \examples{
45 48
 data(exmpl_all, exmpl_sel)
... ...
@@ -2,7 +2,7 @@
2 2
 % Please edit documentation in R/enrichment.R
3 3
 \name{prepare_for_enrichment}
4 4
 \alias{prepare_for_enrichment}
5
-\title{Prepare term data for enrichment analysis}
5
+\title{Prepare Term Data for Enrichment Analysis}
6 6
 \usage{
7 7
 prepare_for_enrichment(
8 8
   terms,
... ...
@@ -13,41 +13,42 @@ prepare_for_enrichment(
13 13
 }
14 14
 \arguments{
15 15
 \item{terms}{A tibble with at least two columns: \code{term_id} and
16
-\code{term_name}. Contains information about functional term
17
-names/descriptions.}
16
+\code{term_name}. This tibble contains information about functional term
17
+names and descriptions.}
18 18
 
19
-\item{mapping}{A tibble with at least two columns, containing mapping between
20
-functional terms and features. One column needs to be called \code{term_id}
21
-and the other column has a name specified by \code{feature_name} argument.
22
-For example, if \code{mapping} contains columns \code{term_id},
23
-\code{accession_number} and  \code{gene_symbol} then setting
24
-\code{feature_name = "gene_symbol"} indicates that gene symbols will be
25
-used for enrichment.}
19
+\item{mapping}{A tibble with at least two columns, containing the mapping
20
+between functional terms and features. One column must be named
21
+\code{term_id}, while the other column should have a name specified by the
22
+\code{feature_name} argument. For example, if \code{mapping} contains
23
+columns \code{term_id}, \code{accession_number}, and \code{gene_symbol},
24
+setting \code{feature_name = "gene_symbol"} indicates that gene symbols
25
+will be used for enrichment analysis.}
26 26
 
27
-\item{all_features}{A vector with all feature ids used as background for
27
+\item{all_features}{A vector with all feature IDs used as the background for
28 28
 enrichment. If not specified, all features found in \code{mapping} will be
29 29
 used, resulting in a larger object size.}
30 30
 
31
-\item{feature_name}{The name of the column in \code{mapping} tibble to be
32
-used as feature.For example, if \code{mapping} contains columns \code{term_id},
33
-\code{accession_number} and  \code{gene_symbol} then setting
34
-\code{feature_name = "gene_symbol"} indicates that gene symbols will be
35
-used for enrichment.}
31
+\item{feature_name}{The name of the column in the \code{mapping} tibble to be
32
+used as the feature identifier. For example, if \code{mapping} contains
33
+columns \code{term_id}, \code{accession_number}, and \code{gene_symbol},
34
+setting \code{feature_name = "gene_symbol"} indicates that gene symbols
35
+will be used for enrichment analysis.}
36 36
 }
37 37
 \value{
38
-An object class \code{fenr_terms} required by
38
+An object of class \code{fenr_terms} required by
39 39
   \code{functional_enrichment}.
40 40
 }
41 41
 \description{
42
-Prepare term data downloaded with \code{fetch_*} functions for fast
43
-enrichment analysis.
42
+Process term data downloaded with the \code{fetch_*} functions, preparing it
43
+for fast enrichment analysis using \code{functional_enrichment}.
44 44
 }
45 45
 \details{
46
-Takes two tibbles with functional term information (\code{terms}) and
47
-feature mapping (\code{mapping}) and converts them into an object required by
48
-\code{functional_enrichment} for fast analysis. Terms and mapping can be
49
-created with database access functions in this package, for example
50
-\code{fetch_reactome} or \code{fetch_go_from_go}.
46
+This function takes two tibbles containing functional term information
47
+(\code{terms}) and feature mapping (\code{mapping}), and converts them into
48
+an object required by \code{functional_enrichment} for efficient analysis.
49
+Terms and mapping can be generated with the database access functions
50
+included in this package, such as \code{fetch_reactome} or
51
+\code{fetch_go_from_go}.
51 52
 }
52 53
 \examples{
53 54
 data(exmpl_all)
... ...
@@ -7,7 +7,7 @@ output:
7 7
     toc_float: true
8 8
     css: style.css
9 9
 abstract: |
10
-  `fenr` performs functional enrichment analysis quickly, typically in a fraction of a second, making it ideal for interactive applications, e.g. Shiny apps. To achieve this, `fenr` downloads functional data (e.g. GO terms of KEGG pathways) in advance, storing them in a format designed for fast analysis of any arbitrary selection of features (genes or proteins).
10
+  The `fenr` R package enables rapid functional enrichment analysis, typically completing in a fraction of a second, which makes it well-suited for interactive applications, such as Shiny apps. To accomplish this, fenr pre-downloads functional data (e.g., GO terms or KEGG pathways) and stores them in a format optimized for swift analysis of any arbitrary selection of features, including genes or proteins.
11 11
 vignette: >
12 12
   %\VignetteIndexEntry{Fast functional enrichment}
13 13
   %\VignetteEngine{knitr::rmarkdown}
... ...
@@ -24,26 +24,25 @@ knitr::opts_chunk$set(
24 24
 
25 25
 # Purpose
26 26
 
27
-Functional enrichment determines whether some biological functions or pathways are enriched in a selection of features (genes, proteins etc.). The selection often comes from differential expression analysis, while functions and pathways are obtained from databases as *GO*, *Reactome* or *KEGG*. At its simplest, enrichment analysis tells us if a given function is enriched in the selection based on Fisher's test. The null hypothesis is that the proportion of features annotated with that function is the same among selected and non-selected features. 
27
+Functional enrichment analysis determines if specific biological functions or pathways are overrepresented in a set of features (e.g., genes, proteins). These sets often originate from differential expression analysis, while the functions and pathways are derived from databases such as *GO*, *Reactome*, or *KEGG*. In its simplest form, enrichment analysis employs Fisher's test to evaluate if a given function is enriched in the selection. The null hypothesis asserts that the proportion of features annotated with that function is the same between selected and non-selected features.
28 28
 
29
-Performing functional enrichment involves downloading large data sets from the aforementioned databases before the actual analysis is done. Downloading data takes time, while Fisher's test can be performed quickly. The purpose of this package is to separate the two and allow for fast enrichment analysis for a given database on various selections of features. It is designed with interactive applications, like Shiny, in mind. A small Shiny app is included in the package to demonstrate usage of `fenr`.
29
+Functional enrichment analysis requires downloading large datasets from the aforementioned databases before conducting the actual analysis. While downloading data is time-consuming, Fisher's test can be performed rapidly. This package aims to separate these two steps, enabling fast enrichment analysis for various feature selections using a given database. It is specifically designed for interactive applications like Shiny. A small Shiny app, included in the package, demonstrates the usage of `fenr`.
30 30
 
31 31
 ## Caveats
32 32
 
33
-Functional enrichment is not the final answer about biology. Quite often is does not give any answer about biology. In particular, when arbitrary groups of genes are selected, enrichment tells us only about simplified statistical overrepresentation of a functional term in the selection. Statistics does not equal biology. This package is meant to be only a tool to explore data and search for clues. Any further statements about biology need independent validation.
34
-
33
+Functional enrichment analysis should not be considered the ultimate answer in understanding biological systems. In many instances, it may not provide clear insights into biology. Specifically, when arbitrary groups of genes are selected, enrichment analysis only reveals the statistical overrepresentation of a functional term within the selection, which may not directly correspond to biological relevance. This package serves as a tool for data exploration; any conclusions drawn about biology require independent validation and further investigation.
35 34
 
36 35
 # Installation
37 36
 
38 37
 `fenr` can be installed from GitHub (you need to install `remotes` package first).
39 38
 
40
-```
39
+```{r install, eval=FALSE}
41 40
 remotes::install_github("bartongroup/fenr", build_vignettes = TRUE)
42 41
 ```
43 42
 
44 43
 # Example
45 44
 
46
-Package `fenr` and example data are loaded with
45
+Package `fenr` and example data are loaded with the following commands:
47 46
 
48 47
 ```{r load_fenr}
49 48
 library(fenr)
... ...
@@ -52,13 +51,13 @@ data(exmpl_all, exmpl_sel)
52 51
 
53 52
 ## Data preparation
54 53
 
55
-The first step is to download functional term data. `fenr` supports downloads from *Gene Ontology*, *Reactome*, *KEGG* and *WikiPathways*. Other ontologies can be used as long as they are converted into a suitable format (see function `prepare_for_enrichment` for details). The following command downloads functional terms and gene mapping from Gene Ontology (GO):
54
+The initial step involves downloading functional term data. `fenr` supports data downloads from *Gene Ontology*, *Reactome*, *KEGG*, *BioPlanet*, and *WikiPathways*. Custom ontologies can also be used, provided they are converted into an appropriate format (refer to the `prepare_for_enrichment` function for more information). The command below downloads functional terms and gene mapping from Gene Ontology (GO):
56 55
 
57 56
 ```{r fetch_go}
58 57
 go <- fetch_go(species = "sgd")
59 58
 ```
60 59
 
61
-This is a list with two tibbles. The first tibble contains term information:
60
+This command returns a list with two tibbles. The first tibble contains term information:
62 61
 
63 62
 ```{r go_terms}
64 63
 go$terms
... ...
@@ -70,23 +69,22 @@ The second tibble contains gene-term mapping:
70 69
 go$mapping
71 70
 ```
72 71
 
73
-Next, these user-friendly data need to be converted into machine-friendly object suitable for fast functional enrichment with the following function:
72
+To make these user-friendly data more suitable for rapid functional enrichment analysis, they need to be converted into a machine-friendly object using the following function:
74 73
 
75 74
 ```{r prepare_for_enrichment}
76 75
 go_terms <- prepare_for_enrichment(go$terms, go$mapping, exmpl_all, feature_name = "gene_symbol")
77 76
 ```
78 77
 
79
-`exmpl_all` is an example of gene background - a vector with gene symbols related to all detections in an imaginary RNA-seq experiment. As different datasets use different features (gene id, gene symbol, protein id), the column name containing features in `go$mapping` needs to be specified with `feature_name = "gene_symbol"`. The result, `go_terms`, is a data structure containing all the mappings in a quickly accessible form. From this point on, `go_terms` can be used to do multiple functional enrichments on various gene selections.
78
+`exmpl_all` is an example of gene background - a vector with gene symbols related to all detections in an imaginary RNA-seq experiment. Since different datasets use different features (gene id, gene symbol, protein id), the column name containing features in `go$mapping` needs to be specified using `feature_name = "gene_symbol"`. The resulting object, `go_terms`, is a data structure containing all the mappings in a quickly accessible form. From this point on, `go_terms` can be employed to perform multiple functional enrichment analyses on various gene selections.
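For a custom ontology, the two input tables only need the columns described in the `prepare_for_enrichment` documentation. The sketch below builds a hypothetical toy ontology (all identifiers are invented) and runs the same conditions as the argument checks shown in the diff of `R/enrichment.R`. Plain data frames are used here because the checks accept them; tibbles would work equally well.

```r
# Hypothetical hand-made ontology: two terms, four genes
terms <- data.frame(
  term_id   = c("T1", "T2"),
  term_name = c("toy pathway one", "toy pathway two")
)

# One row per term-feature pair; extra columns are allowed
mapping <- data.frame(
  term_id     = c("T1", "T1", "T2", "T2", "T2"),
  gene_symbol = c("GENE1", "GENE2", "GENE2", "GENE3", "GENE4")
)

feature_name <- "gene_symbol"

# Same conditions as the checks in prepare_for_enrichment
stopifnot(
  all(c("term_id", "term_name") %in% colnames(terms)),
  anyDuplicated(terms$term_id) == 0,
  "term_id" %in% colnames(mapping),
  feature_name %in% colnames(mapping)
)

# Background used when all_features is not supplied: every feature found
# in the mapping (use of unique() is an assumption about the elided code)
all_features <- unique(mapping[[feature_name]])
```

Tables that pass these checks could then be passed on in the same way as `go$terms` and `go$mapping` in the call above, with `feature_name = "gene_symbol"`.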
80 79
 
81 80
 ## Functional enrichment
82 81
 
83
-There are two gene sets attached to the package. `exmpl_all` contains all background gene symbols and `exmpl_sel` contains genes of interest. Functional enrichment in the selection can be found using one fast function call:
82
+The package includes two pre-defined gene sets. `exmpl_all` contains all background gene symbols, while `exmpl_sel` comprises the genes of interest. To perform functional enrichment analysis on the selected genes, you can use the following single, efficient function call:
84 83
 
85 84
 ```{r enrichment}
86 85
 enr <- functional_enrichment(exmpl_all, exmpl_sel, go_terms)
87 86
 ```
88 87
 
89
-
90 88
 ## The output
91 89
 
92 90
 The result of `functional_enrichment` is a tibble with enrichment results.
... ...
@@ -98,19 +96,18 @@ enr |>
98 96
 
99 97
 The columns are as follows
100 98
 
101
- - `N_with` - number of features (genes) with this term in the background of all genes,
102
- - `n_with_sel` - number of features with this term in the selection,
103
- - `n_expect` - expected number of features with this term under the null hypothesis (terms are randomly distributed),
104
- - `enrichment` - ratio of observed to expected,
105
- - `odds_ratio` - effect size, odds ratio from the contingency table,
106
- - `ids` - identifiers of features with term in the selection,
107
- - `p_value` - raw p-value from hypergeometric distribution,
108
- - `p_adjust` - p-value adjusted for multiple tests using Benjamini-Hochberg approach.
109
- 
110
- 
111
-# Interactive example
99
+ - `N_with`: The number of features (genes) associated with this term in the background of all genes.
100
+ - `n_with_sel`: The number of features associated with this term in the selection.
101
+ - `n_expect`: The expected number of features associated with this term under the null hypothesis (terms are randomly distributed).
102
+ - `enrichment`: The ratio of observed to expected.
103
+ - `odds_ratio`: The effect size, represented by the odds ratio from the contingency table.
104
+ - `ids`: The identifiers of features with the term in the selection.
105
+ - `p_value`: The raw p-value from the hypergeometric distribution.
106
+ - `p_adjust`: The p-value adjusted for multiple tests using the Benjamini-Hochberg approach.
107
+
108
+# Interactive Example
112 109
 
113
-A small Shiny app is included in the package to illustrate usage of `fenr` in intractive environment. All slow data loading and preparation is done before the app is started.
110
+A small Shiny app is included in the package to demonstrate the usage of `fenr` in an interactive environment. All time-consuming data loading and preparation tasks are performed before the app is launched.
114 111
 
115 112
 ```{r interactive_prepare, eval=FALSE}
116 113
 data(yeast_de)
... ...
@@ -119,9 +116,9 @@ term_data <- fetch_terms_for_example(yeast_de)
119 116
  
120 117
 `yeast_de` is the result of differential expression (using `edgeR`) on a subset of 6+6 replicates from [Gierlinski et al. (2015)](https://academic.oup.com/bioinformatics/article/31/22/3625/240923).
121 118
 
122
-The function `fetch_terms_for_example` uses `fetch_*` functions from `fenr` to download and process data from *GO*, *Reactome* and *KEGG*. One can see how this is done, step by step, by reading the function code from [GitHub](https://github.com/bartongroup/fenr/blob/main/R/iteractive_example.R). The object `term_data` is a named list of `fenr_terms` objects, one for each ontology.
119
+The function `fetch_terms_for_example` uses `fetch_*` functions from `fenr` to download and process data from *GO*, *Reactome* and *KEGG*.  You can view the step-by-step process by examining the function code on [GitHub](https://github.com/bartongroup/fenr/blob/main/R/iteractive_example.R). The object `term_data` is a named list of `fenr_terms` objects, one for each ontology.
123 120
 
124
-Once the slow part is over, the Shiny app can be started with
121
+After completing the slow tasks, you can start the Shiny app by running:
125 122
 
126 123
 ```{r shiny_app, eval=FALSE}
127 124
 enrichment_interactive(yeast_de, term_data)