Bioconductor Code: SpaNorm

spell check in vignette and documentation

Dharmesh Bhuva authored on 13/10/2024 09:52:03
Showing 4 changed files

R/AllClasses.R index c16e20d..973cd95 100644
R/mainSpaNorm.R index e085f62..c31164a 100644
README.md index e21d637..0824314 100644
vignettes/SpaNorm.Rmd index e3f33bd..047ee6e 100644

R/AllClasses.R

History View file @ f2bc1ec

@@ -11,7 +11,7 @@
                      #' @slot W a matrix, specifying the covariate matrix of the linear model.
                      #' @slot alpha a matrix, specifying the coefficients of the linear model.
                      #' @slot gmean a numeric, specifying the mean estimate for each gene in the linear model.
                     -#' @slot psi a numeric, specifying the over-dispersion parameter for each geneif a negative binomial model was used (or a vector of NAs if another gene model is used).
                     +#' @slot psi a numeric, specifying the over-dispersion parameter for each gene if a negative binomial model was used (or a vector of NAs if another gene model is used).
                      #' @slot isbio a logical, specifying the columns of the covariate matrix that represent biology.
                      #' @slot loglik a numeric, specifying the log-likelihood of the model at each external iteration.
                      #'

R/mainSpaNorm.R

@@ -1,4 +1,4 @@
                     -#' Spatially-dependent normalisation for spatial transcriptomics datas
                     +#' Spatially-dependent normalisation for spatial transcriptomics data
                      #'
                      #' Performs normalisation of spatial transcriptomics data using spatially-dependent spot- and gene- specific size factors.
                      #'
@@ -15,8 +15,8 @@
                      #' @param maxit.psi a numeric, specifying the maximum number of IRLS iterations to estimate the dispersion parameter (default is 25).
                      #' @param maxn.psi a numeric, specifying the maximum number of cells/spots to sample for dispersion estimation (default is 500).
                      #' @param tol a numeric, specifying the tolerance for convergence (default is 1e-4).
                     -#' @param overwrite a logical, specifying wether to force recomputation and overwrite an existing fit (default FALSE). Note that if df.tps, batch, lambda.a, or gene.model are changed, the model is recomputed and overwritten.
                     -#' @param verbose a logical, specifying wether to show update messages (default TRUE).
                     +#' @param overwrite a logical, specifying whether to force recomputation and overwrite an existing fit (default FALSE). Note that if df.tps, batch, lambda.a, or gene.model are changed, the model is recomputed and overwritten.
                     +#' @param verbose a logical, specifying whether to show update messages (default TRUE).
                      #' @param ... other parameters fitting parameters.
                      #'
                      #' @details SpaNorm works by first fitting a spatial regression model for library size to the data. Normalised data can then be computed using various adjustment approaches. When a negative binomial gene-model is used, the data can be adjusted using the following approaches: 'logpac', 'pearson', 'medbio', and 'meanbio'.

README.md

@@ -10,7 +10,7 @@ SpaNorm is a spatially aware library size normalisation method that removes libr
                      SpaNorm uses a unique approach to spatially constraint modelling approach to model gene expression (e.g., counts) and remove library size effects, while retaining biology. It achieves this through three key innovations:
                      1. Computing spatially smooth functions (using thin plate splines) to represent the gene- and location-/cell-/spot- specific size factors.
                     -1. Optmial decomposition of spatial variation into spatially smooth library size associated (technical) and library size independent (biology) variation using generalized linear models (GLMs).
                     +1. Optimal decomposition of spatial variation into spatially smooth library size associated (technical) and library size independent (biology) variation using generalized linear models (GLMs).
                      1. Adjustment of data using percentile adjusted counts (PAC) (Salim et al., 2022), as well as other adjustment approaches (e.g., Pearson).
                      ## Installation

vignettes/SpaNorm.Rmd

@@ -120,7 +120,7 @@ HumanDLPFC = SpaNorm(HumanDLPFC)
                      HumanDLPFC
                      ```
                     -The above output (which can be switched off by setting `verbose = FALSE`), shows the two steps of normalisation. In the model fitting step, `r round(0.25 * ncol(HumanDLPFC))` cells/spots are used to fit the negative binomial (NB) model. Subsequent output shows that this fit is performed by alternating between estimation of the dispersion parameter and estimation of the NB parameters by fixing the dispersion. The output also shows that each intermmediate fit converges, and so does the final fit. The accuracy of the fit can be controlled by modifying the tolerance parameter `tol` (default `1e-4`).
                     +The above output (which can be switched off by setting `verbose = FALSE`), shows the two steps of normalisation. In the model fitting step, `r round(0.25 * ncol(HumanDLPFC))` cells/spots are used to fit the negative binomial (NB) model. Subsequent output shows that this fit is performed by alternating between estimation of the dispersion parameter and estimation of the NB parameters by fixing the dispersion. The output also shows that each intermediate fit converges, and so does the final fit. The accuracy of the fit can be controlled by modifying the tolerance parameter `tol` (default `1e-4`).
                      Next, data is adjusted using the fit model. The following approaches are implemented for count data:
@@ -144,9 +144,9 @@ p_logpac = plotSpatial(
                      p_region + p_logpac
                      ```
                     -# Computing alternative adjusments using a precomputed SpaNorm fit
                     +# Computing alternative adjustments using a precomputed SpaNorm fit
                     -As no appropriate slot exists for storing model parameters, we currently save them in the metadata slot with the name "SpaNorm". This also means that subsetting features (i.e., genes) or observatins (i.e., cells/spots/loci) does not subset the model. In such an instance, the SpaNorm function will realise that the model no longer matches the data and restimates when called. If instead the model is valid for the data, the existing fit is extracted and reused.
                     +As no appropriate slot exists for storing model parameters, we currently save them in the metadata slot with the name "SpaNorm". This also means that subsetting features (i.e., genes) or observations (i.e., cells/spots/loci) does not subset the model. In such an instance, the SpaNorm function will realise that the model no longer matches the data and re-estimates when called. If instead the model is valid for the data, the existing fit is extracted and reused.
                      The fit can be manually retrieved as below for users wishing to reuse the model outside the SpaNorm framework. Otherwise, calling `SpaNorm()` on an object containing the fit will automatically use it.
@@ -226,7 +226,7 @@ p_logpac_2 + p_logpac_6
                      # Enhancing signal
                     -As the counts for the MOBP gene are very low, we see artefacts in the adjusted counts. As we have a model for the genes, we can increase the signal by adjusting all means by a constant factor. Applying a scale factor of 4 shows how the adjusted data are more continuous, with significant enrichment in the white matter.
                     +As the counts for the MOBP gene are very low, we see artifacts in the adjusted counts. As we have a model for the genes, we can increase the signal by adjusting all means by a constant factor. Applying a scale factor of 4 shows how the adjusted data are more continuous, with significant enrichment in the white matter.
                      ```{r fig.width=7.5, fig.height=4.25}
                      # scale.factor = 1 (default)