... | ... |
@@ -14,10 +14,13 @@ author: |
14 | 14 |
package: "`r BiocStyle::pkg_ver('spicyR')`" |
15 | 15 |
vignette: > |
16 | 16 |
%\VignetteIndexEntry{"Introduction to lisaClust"} |
17 |
- %\VignetteEngine{knitr::rmarkdown} |
|
18 | 17 |
%\VignetteEncoding{UTF-8} |
18 |
+ %\VignetteEngine{knitr::rmarkdown} |
|
19 | 19 |
output: |
20 | 20 |
BiocStyle::html_document |
21 |
+editor_options: |
|
22 |
+ markdown: |
|
23 |
+ wrap: 72 |
|
21 | 24 |
--- |
22 | 25 |
|
23 | 26 |
```{r, include = FALSE} |
... | ... |
@@ -35,7 +38,6 @@ if (!require("BiocManager")) |
35 | 38 |
BiocManager::install("lisaClust") |
36 | 39 |
``` |
37 | 40 |
|
38 |
- |
|
39 | 41 |
```{r message=FALSE, warning=FALSE} |
40 | 42 |
# load required packages |
41 | 43 |
library(lisaClust) |
... | ... |
@@ -43,30 +45,28 @@ library(spicyR) |
43 | 45 |
library(ggplot2) |
44 | 46 |
library(SingleCellExperiment) |
45 | 47 |
``` |
46 |
- |
|
47 |
- |
|
48 | 48 |
|
49 | 49 |
# Overview |
50 |
- Clustering local indicators of spatial association (LISA) functions is a |
|
51 |
- methodology for identifying consistent spatial organisation of multiple |
|
52 |
- cell-types in an unsupervised way. This can be used to enable the |
|
53 |
- characterization of interactions between multiple cell-types simultaneously and |
|
54 |
- can complement traditional pairwise analysis. In our implementation our LISA |
|
55 |
- curves are a localised summary of an L-function from a Poisson point process |
|
56 |
- model. Our framework `lisaClust` can be used to provide a high-level summary |
|
57 |
- of cell-type colocalization in high-parameter spatial cytometry data, |
|
58 |
- facilitating the identification of distinct tissue compartments or |
|
59 |
- identification of complex cellular microenvironments. |
|
60 |
- |
|
61 | 50 |
|
51 |
+Clustering local indicators of spatial association (LISA) functions is a |
|
52 |
+methodology for identifying consistent spatial organisation of multiple |
|
53 |
+cell-types in an unsupervised way. This can be used to enable the |
|
54 |
+characterization of interactions between multiple cell-types |
|
55 |
+simultaneously and can complement traditional pairwise analysis. In our |
|
56 |
+implementation our LISA curves are a localised summary of an L-function |
|
57 |
+from a Poisson point process model. Our framework `lisaClust` can be |
|
58 |
+used to provide a high-level summary of cell-type colocalization in |
|
59 |
+high-parameter spatial cytometry data, facilitating the identification |
|
60 |
+of distinct tissue compartments or identification of complex cellular |
|
61 |
+microenvironments. |
|
62 | 62 |
|
63 | 63 |
# Quick start |
64 | 64 |
|
65 | 65 |
## Generate toy data |
66 | 66 |
|
67 |
-TO illustrate our `lisaClust` framework, here we consider a very simple toy |
|
68 |
-example where two cell-types are completely separated spatially. We simulate |
|
69 |
-data for two different images. |
|
67 |
+To illustrate our `lisaClust` framework, here we consider a very simple |
|
68 |
+toy example where two cell-types are completely separated spatially. We |
|
69 |
+simulate data for two different images. |
|
70 | 70 |
|
71 | 71 |
```{r eval=TRUE} |
72 | 72 |
set.seed(51773) |
... | ... |
@@ -79,42 +79,55 @@ imageID <- rep(c('s1', 's2'),c(800,800)) |
79 | 79 |
|
80 | 80 |
cells <- data.frame(x, y, cellType, imageID) |
81 | 81 |
|
82 |
-ggplot(cells, aes(x,y, colour = cellType)) + geom_point() + facet_wrap(~imageID) |
|
82 |
+ggplot(cells, aes(x,y, colour = cellType)) + geom_point() + facet_wrap(~imageID) + theme_minimal() |
|
83 | 83 |
|
84 | 84 |
|
85 | 85 |
``` |
86 | 86 |
|
87 |
-## Create SegmentedCellExperiment object |
|
87 |
+## Create SingleCellExperiment object |
|
88 | 88 |
|
89 |
-First we store our data in a `SegmentedCells` object. |
|
89 |
+First we store our data in a `SingleCellExperiment` object. |
|
90 | 90 |
|
91 | 91 |
```{r} |
92 | 92 |
|
93 |
-cellExp <- SegmentedCells(cells, cellTypeString = 'cellType') |
|
94 |
- |
|
95 |
- |
|
93 |
+SCE <- SingleCellExperiment(colData = cells) |
|
94 |
+SCE |
|
96 | 95 |
``` |
96 |
+ |
|
97 | 97 |
## Running lisaClust |
98 | 98 |
|
99 |
-We can then use a convience function `lisaClust` to simultaneously calculate local indicators of spatial association (LISA) functions |
|
100 |
-using the `lisa` function and perform k-means clustering. |
|
99 |
+We can then use the convenience function `lisaClust` to simultaneously |
|
100 |
+calculate local indicators of spatial association (LISA) functions using |
|
101 |
+the `lisa` function and perform k-means clustering. The number of |
|
102 |
+clusters can be specified with the `k =` parameter. In the example |
|
103 |
+below, we've chosen `k = 2`, resulting in a total of 2 clusters. |
|
104 |
+ |
|
105 |
+These clusters are stored in `colData` of the `SingleCellExperiment` |
|
106 |
+object, as a new column with the column name `region`. |
|
101 | 107 |
|
102 | 108 |
```{r} |
103 |
-cellExp <- lisaClust(cellExp, k = 2) |
|
109 |
+SCE <- lisaClust(SCE, k = 2) |
|
110 |
+colData(SCE) |> head() |
|
104 | 111 |
``` |
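
To give a feel for the idea behind this step, here is a minimal base R sketch under loud assumptions: it does not compute the L-function-based LISA curves that `lisaClust` uses, but substitutes a much cruder per-cell summary (counts of neighbours of each cell type within a fixed radius) and then clusters those summaries with k-means. The data, the `radius` value and the names `localCounts` and `region` are all hypothetical.

```r
# Toy stand-in for the clustering step: NOT the LISA curves computed by
# lisaClust, just neighbour counts within a fixed radius.
set.seed(1)
n <- 200

# Two cell types occupying separate halves of the unit square (made-up data)
x <- c(runif(n, 0, 0.5), runif(n, 0.5, 1))
y <- runif(2 * n)
cellType <- rep(c("c1", "c2"), each = n)

# Pairwise distances between all cells
d <- as.matrix(dist(cbind(x, y)))
radius <- 0.1

# For each cell, count cells of each type within the radius
# (a cell is included in the count for its own type, since its distance to itself is 0)
localCounts <- sapply(c("c1", "c2"), function(ct) {
  rowSums(d[, cellType == ct, drop = FALSE] <= radius)
})

# Cluster the per-cell summaries into k = 2 regions
km <- kmeans(localCounts, centers = 2, nstart = 10)
region <- paste("region", km$cluster, sep = "_")

# Cross-tabulate the inferred regions against the known cell types
table(region, cellType)
```

Cells whose neighbourhoods look alike end up in the same region, which is the intuition the real LISA-curve clustering builds on.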
105 | 112 |
|
106 |
- |
|
107 | 113 |
## Plot identified regions |
108 | 114 |
|
109 |
-The `hatchingPlot` function can be used to construct a `ggplot` object where the |
|
110 |
-regions are marked by different hatching patterns. This allows us to plot both |
|
111 |
-regions and cell-types on the same visualization. |
|
115 |
+`lisaClust` also provides the convenient `hatchingPlot` function to |
|
116 |
+visualise the different regions that have been demarcated by the |
|
117 |
+clustering. `hatchingPlot` outputs a `ggplot` object where the regions |
|
118 |
+are marked by different hatching patterns. In a real biological dataset, |
|
119 |
+this allows us to plot both regions and cell-types on the same |
|
120 |
+visualization. |
|
112 | 121 |
|
122 |
+In the example below, we can visualise our simulated data where our 2 |
|
123 |
+cell types have been separated neatly into 2 distinct regions based on |
|
124 |
+which cell type each region is dominated by. `region_2` is dominated by |
|
125 |
+the red cell type `c1`, and `region_1` is dominated by the blue cell |
|
126 |
+type `c2`. |
|
113 | 127 |
|
114 | 128 |
```{r} |
115 |
-hatchingPlot(cellExp, useImages = c('s1','s2')) |
|
129 |
+hatchingPlot(SCE, useImages = c('s1','s2')) |
|
116 | 130 |
``` |
117 |
- |
|
118 | 131 |
## Using other clustering methods. |
119 | 132 |
|
120 | 133 |
While the `lisaClust` function is convenient, we have not implemented an exhaustive |
... | ... |
@@ -129,178 +142,122 @@ localised summary of an L-function from a Poisson point process model. The radii |
129 | 142 |
that will be calculated over can be set with `Rs`. |
130 | 143 |
|
131 | 144 |
```{r} |
132 |
- |
|
133 |
-lisaCurves <- lisa(cellExp, Rs = c(20, 50, 100)) |
|
134 |
- |
|
145 |
+lisaCurves <- lisa(SCE, Rs = c(20, 50, 100)) |
|
135 | 146 |
``` |
136 | 147 |
|
137 | 148 |
### Perform some clustering |
138 | 149 |
|
139 | 150 |
The LISA curves can then be used to cluster the cells. Here we use k-means |
140 | 151 |
clustering, though other clustering methods such as SOM could be used. We can store these |
141 |
-cell clusters or cell "regions" in our `SegmentedCells` object using the |
|
152 |
+cell clusters or cell "regions" in our `SingleCellExperiment` object using the |
|
142 | 153 |
`colData() <-` function. |
143 | 154 |
|
144 | 155 |
```{r} |
145 |
- |
|
156 |
+# Custom clustering algorithm |
|
146 | 157 |
kM <- kmeans(lisaCurves,2) |
147 |
-cellAnnotation(cellExp, "region") <- paste('region',kM$cluster,sep = '_') |
|
148 |
-``` |
|
149 |
- |
|
150 |
- |
|
151 |
- |
|
152 |
-## Alternative hatching plot |
|
153 |
- |
|
154 |
-We could also create this plot using `geom_hatching` and `scale_region_manual`. |
|
155 |
- |
|
156 |
- |
|
157 |
-```{r} |
|
158 |
- |
|
159 |
-df <- as.data.frame(cellSummary(cellExp)) |
|
160 |
- |
|
161 |
-p <- ggplot(df,aes(x = x,y = y, colour = cellType, region = region)) + |
|
162 |
- geom_point() + |
|
163 |
- facet_wrap(~imageID) + |
|
164 |
- geom_hatching(window = "concave", |
|
165 |
- line.spacing = 11, |
|
166 |
- nbp = 50, |
|
167 |
- line.width = 2, |
|
168 |
- hatching.colour = "gray20", |
|
169 |
- window.length = NULL) + |
|
170 |
- theme_minimal() + |
|
171 |
- scale_region_manual(values = 6:7, labels = c('ab','cd')) |
|
172 |
- |
|
173 |
-p |
|
174 |
- |
|
175 |
-``` |
|
176 |
-## Faster ploting |
|
177 |
- |
|
178 |
-The `hatchingPlot` can be quite slow for large images and high `nbp` or `linewidth`. |
|
179 |
-It is often useful to simply plot the regions without the cell type information. |
|
180 |
- |
|
181 |
-```{r} |
|
182 |
- |
|
183 |
-df <- as.data.frame(cellSummary(cellExp)) |
|
184 |
-df <- df[df$imageID == "s1", ] |
|
185 |
- |
|
186 |
-p <- ggplot(df,aes(x = x,y = y, colour = region)) + |
|
187 |
- geom_point() + |
|
188 |
- theme_classic() |
|
189 |
-p |
|
190 | 158 |
|
159 |
+# Storing clusters into colData |
|
160 |
+colData(SCE)$custom_region <- paste('region',kM$cluster,sep = '_') |
|
161 |
+colData(SCE) |> head() |
|
191 | 162 |
``` |
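
As one illustration of swapping in another clustering method, the sketch below uses hierarchical clustering from base R instead of k-means. It assumes the curves are available as a numeric matrix with one row per cell; the matrix `curves` here is random stand-in data rather than real `lisa()` output, and `customRegion` is a hypothetical name.

```r
# Hypothetical stand-in for a cells x features matrix such as `lisaCurves`
set.seed(2)
curves <- rbind(
  matrix(rnorm(50 * 3, mean = 0), ncol = 3),
  matrix(rnorm(50 * 3, mean = 5), ncol = 3)
)

# Ward hierarchical clustering on Euclidean distances, cut into 2 groups
hc <- hclust(dist(curves), method = "ward.D2")
cl <- cutree(hc, k = 2)

# Same labelling convention as the k-means example
customRegion <- paste("region", cl, sep = "_")
table(customRegion)
```

The resulting character vector has one label per cell, so it could be assigned to a `colData` column in the same way as the k-means labels.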
192 | 163 |
|
193 |
-# Using a SingleCellExperiment |
|
194 |
- |
|
195 |
-The `lisaClust` function also works with a `SingleCellExperiment`. First lets |
|
196 |
-create a `SingleCellExperiment` object. |
|
197 |
- |
|
198 |
- |
|
199 |
- |
|
200 |
-```{r} |
|
201 |
- |
|
202 |
-sce <- SingleCellExperiment(colData = cellSummary(cellExp)) |
|
203 |
- |
|
204 |
-``` |
|
205 |
- |
|
206 |
- |
|
207 |
-`lisaClust` just needs columns in `colData` corresponding to the x and y coordinates of the |
|
208 |
-cells, a column annotating the cell types of the cells and a column indicating |
|
209 |
-which image each cell came from. |
|
210 |
- |
|
211 |
-```{r} |
|
212 |
-sce <- lisaClust(sce, |
|
213 |
- k = 2, |
|
214 |
- spatialCoords = c("x", "y"), |
|
215 |
- cellType = "cellType", |
|
216 |
- imageID = "imageID") |
|
217 |
- |
|
218 |
-``` |
|
219 |
-We can then plot the regions using the following. |
|
220 |
- |
|
221 |
-```{r} |
|
222 |
- |
|
223 |
-hatchingPlot(sce) |
|
224 |
- |
|
225 |
-``` |
|
226 |
- |
|
227 |
- |
|
228 | 164 |
|
229 | 165 |
|
230 | 166 |
|
231 | 167 |
# Damond et al. islet data. |
232 | 168 |
|
233 |
-Here we apply our `lisaClust` framework to three images of pancreatic islets |
|
234 |
-from *A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry* by |
|
235 |
-Damond et al. (2019). |
|
169 |
+Next, we apply our `lisaClust` framework to three images of pancreatic |
|
170 |
+islets from *A Map of Human Type 1 Diabetes Progression by Imaging Mass |
|
171 |
+Cytometry* by Damond et al. (2019). |
|
236 | 172 |
|
237 | 173 |
## Read in data |
238 | 174 |
|
239 |
-We will start by reading in the data and storing it as a `SegmentedCells` |
|
240 |
-object. Here the data is in a format consistent with that outputted by |
|
241 |
-CellProfiler. |
|
175 |
+We will start by reading in the data and storing it as a |
|
176 |
+`SingleCellExperiment` object. Here the data is in a format consistent with |
|
177 |
+that produced by CellProfiler. |
|
178 |
+ |
|
242 | 179 |
```{r} |
243 | 180 |
isletFile <- system.file("extdata","isletCells.txt.gz", package = "spicyR") |
244 | 181 |
cells <- read.table(isletFile, header = TRUE) |
245 |
-cellExp <- SegmentedCells(cells, cellProfiler = TRUE) |
|
182 |
+damonSCE <- SingleCellExperiment(assays = list(intensities = t(cells[,grepl(names(cells), pattern = "Intensity_")])), |
|
183 |
+ colData = cells[,!grepl(names(cells), pattern = "Intensity_")] |
|
184 |
+ ) |
|
246 | 185 |
|
247 | 186 |
``` |
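
The column split in this chunk can be easier to follow on a tiny table. The sketch below uses a made-up three-cell data frame (the marker column names are hypothetical; only the `Intensity_` prefix and the image and location column names are taken from the code above) to show why the intensity columns are transposed: assays are stored as features by cells, while `colData` holds one row per cell.

```r
# Hypothetical miniature of a CellProfiler-style table
cells <- data.frame(
  ImageNumber = c(1, 1, 2),
  Location_Center_X = c(10.5, 20.1, 5.0),
  Location_Center_Y = c(3.2, 8.8, 7.4),
  Intensity_MarkerA = c(0.1, 0.9, 0.4),
  Intensity_MarkerB = c(1.2, 0.3, 0.7)
)

# Flag the marker intensity columns by their shared prefix
isIntensity <- grepl("Intensity_", names(cells))

# Assays are features x cells, so the intensity columns are transposed
intensities <- t(cells[, isIntensity])

# Per-cell metadata stays cells x variables
meta <- cells[, !isIntensity]

dim(intensities)
dim(meta)
```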
248 | 187 |
|
249 |
- |
|
250 | 188 |
## Cluster cell-types |
251 | 189 |
|
252 |
-This data does not include annotation of the cell-types of each cell. Here we |
|
253 |
-extract the marker intensities from the `SegmentedCells` object using |
|
254 |
-`cellMarks`. We then perform k-means clustering with eight clusters and store |
|
255 |
-these cell-type clusters in our `SegmentedCells` object using `cellType() <-`. |
|
190 |
+This data does not include annotation of the cell-types of each cell. |
|
191 |
+Here we extract the marker intensities from the `SingleCellExperiment` object |
|
192 |
+using `assay()`. We then perform k-means clustering with 10 |
|
193 |
+clusters and store these cell-type clusters in our `SingleCellExperiment` |
|
194 |
+object using `colData() <-`. |
|
195 |
+ |
|
256 | 196 |
```{r} |
257 |
-markers <- cellMarks(cellExp) |
|
197 |
+markers <- t(assay(damonSCE, "intensities")) |
|
258 | 198 |
kM <- kmeans(markers,10) |
259 |
-cellType(cellExp) <- paste('cluster', kM$cluster, sep = '') |
|
199 |
+colData(damonSCE)$cluster <- paste('cluster', kM$cluster, sep = '') |
|
200 |
+colData(damonSCE)[, c("ImageNumber", "cluster")] |> head() |
|
260 | 201 |
``` |
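
Because k-means starts from random centres, cluster numbers can change between runs unless the random seed is fixed. The sketch below, on a hypothetical random matrix standing in for `markers`, shows one way to make the labels reproducible; the seed value and the `nstart` and `iter.max` settings are arbitrary choices for illustration.

```r
# Hypothetical cells x markers matrix standing in for `markers`
set.seed(3)
markers <- matrix(rnorm(300 * 4), ncol = 4)

# Fixing the seed immediately before clustering makes the labels reproducible
set.seed(51773)
km1 <- kmeans(markers, centers = 10, nstart = 5, iter.max = 50)
set.seed(51773)
km2 <- kmeans(markers, centers = 10, nstart = 5, iter.max = 50)

# Check that both runs assign every cell to the same cluster
identical(km1$cluster, km2$cluster)

# Same labelling convention as above
cluster <- paste("cluster", km1$cluster, sep = "")
head(cluster)
```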
261 | 202 |
|
262 | 203 |
## Generate LISA curves |
263 | 204 |
|
264 |
-As before, we can calculate perform k-means clustering on the local indicators |
|
265 |
-of spatial association (LISA) functions using the `lisaClust` function. |
|
205 |
+As before, we can perform k-means clustering on the local |
|
206 |
+indicators of spatial association (LISA) functions using the `lisaClust` |
|
207 |
+function, remembering to specify the `imageID`, `cellType`, and `spatialCoords` |
|
208 |
+columns in `colData`. |
|
266 | 209 |
|
267 | 210 |
```{r} |
268 | 211 |
|
269 |
-cellExp <- lisaClust(cellExp, k = 2, Rs = c(10,20,50)) |
|
212 |
+damonSCE <- lisaClust(damonSCE, |
|
213 |
+ k = 2, |
|
214 |
+ Rs = c(10,20,50), |
|
215 |
+ imageID = "ImageNumber", |
|
216 |
+ cellType = "cluster", |
|
217 |
+ spatialCoords = c("Location_Center_X", "Location_Center_Y")) |
|
270 | 218 |
|
271 | 219 |
``` |
272 | 220 |
|
273 |
-These regions are stored in cellExp and can be extracted. |
|
221 |
+These regions are stored in `colData` and can be extracted. |
|
274 | 222 |
|
275 | 223 |
```{r} |
276 |
- |
|
277 |
-cellAnnotation(cellExp, "region") |> |
|
278 |
- head() |
|
224 |
+colData(damonSCE)[, c("ImageNumber", "region")] |> |
|
225 |
+ head(20) |
|
279 | 226 |
``` |
280 | 227 |
|
281 |
- |
|
282 | 228 |
## Examine cell type enrichment |
283 | 229 |
|
284 |
-We should check to see which cell types appear more frequently in each region than |
|
285 |
-expected by chance. |
|
230 |
+`lisaClust` also provides a convenient function, `regionMap`, for examining which |
|
231 |
+cell types are located in which regions. In this example, we use this to check |
|
232 |
+which cell types appear more frequently in each region than expected by chance. |
|
286 | 233 |
|
287 |
-```{r} |
|
288 |
-regionMap(cellExp, type = "bubble") |
|
289 |
-``` |
|
234 |
+Here, we clearly see that clusters 2, 5, 1, and 8 are highly concentrated in |
|
235 |
+region 1, whilst all other clusters are thinly spread out across region 2. |
|
290 | 236 |
|
237 |
+We can further segregate these cells by increasing the number of clusters, i.e. |
|
238 |
+increasing the parameter `k = ` in the `lisaClust()` function, but for the purposes |
|
239 |
+of demonstration, let's take a look at the `hatchingPlot` of these regions. |
|
291 | 240 |
|
241 |
+```{r} |
|
242 |
+regionMap(damonSCE, |
|
243 |
+ imageID = "ImageNumber", |
|
244 |
+ cellType = "cluster", |
|
245 |
+ spatialCoords = c("Location_Center_X", "Location_Center_Y"), |
|
246 |
+ type = "bubble") |
|
247 |
+``` |
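
To make "more frequently than expected by chance" concrete, here is a small base R sketch of an observed-over-expected ratio computed from a region by cell-type table. This is an assumption about the kind of quantity being visualised, not the implementation of `regionMap`, and the six cells and their labels are invented.

```r
# Hypothetical per-cell labels standing in for the `region` and `cluster` columns
region  <- c("region_1", "region_1", "region_1", "region_2", "region_2", "region_2")
cluster <- c("cluster1", "cluster1", "cluster2", "cluster2", "cluster2", "cluster2")

# Observed number of cells of each type in each region
observed <- table(region, cluster)

# Expected counts if region and cell type were independent
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Ratios above 1 suggest a cell type is over-represented in a region
enrichment <- observed / expected
round(enrichment, 2)
```

In this toy table, `cluster1` appears only in `region_1`, so its ratio there is above 1 and its ratio in `region_2` is 0.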
|
292 | 248 |
|
293 | 249 |
## Plot identified regions |
294 | 250 |
|
295 |
-Finally, we can use `hatchingPlot` to construct a `ggplot` object where the |
|
296 |
-regions are marked by different hatching patterns. This allows us to visualize |
|
297 |
-the two regions and ten cell-types simultaneously. |
|
251 |
+Finally, we can use `hatchingPlot` to construct a `ggplot` object where |
|
252 |
+the regions are marked by different hatching patterns. This allows us to |
|
253 |
+visualize the two regions and ten cell-types simultaneously. |
|
298 | 254 |
|
299 | 255 |
```{r} |
300 |
-hatchingPlot(cellExp) |
|
256 |
+hatchingPlot(damonSCE, |
|
257 |
+ cellType = "cluster", |
|
258 |
+ spatialCoords = c("Location_Center_X", "Location_Center_Y")) |
|
301 | 259 |
``` |
302 | 260 |
|
303 |
- |
|
304 | 261 |
# sessionInfo() |
305 | 262 |
|
306 | 263 |
```{r} |