remove SegmentedCells from vignette + updates

alexq authored on 25/10/2023 03:38:07
Showing 1 changed files

... ...
@@ -14,10 +14,13 @@ author:
14 14
 package: "`r BiocStyle::pkg_ver('spicyR')`"
15 15
 vignette: >
16 16
   %\VignetteIndexEntry{"Inroduction to lisaClust"}
17
-  %\VignetteEngine{knitr::rmarkdown}
18 17
   %\VignetteEncoding{UTF-8}
18
+  %\VignetteEngine{knitr::rmarkdown}
19 19
 output: 
20 20
   BiocStyle::html_document
21
+editor_options: 
22
+  markdown: 
23
+    wrap: 72
21 24
 ---
22 25
 
23 26
 ```{r, include = FALSE}
... ...
@@ -35,7 +38,6 @@ if (!require("BiocManager"))
35 38
 BiocManager::install("lisaClust")
36 39
 ```
37 40
 
38
-
39 41
 ```{r message=FALSE, warning=FALSE}
40 42
 # load required packages
41 43
 library(lisaClust)
... ...
@@ -43,30 +45,28 @@ library(spicyR)
43 45
 library(ggplot2)
44 46
 library(SingleCellExperiment)
45 47
 ```
46
- 
47
- 
48 48
 
49 49
 # Overview
50
- Clustering local indicators of spatial association (LISA) functions is a 
51
- methodology for identifying consistent spatial organisation of multiple 
52
- cell-types in an unsupervised way. This can be used to enable the 
53
- characterization of interactions between multiple cell-types simultaneously and 
54
- can complement traditional pairwise analysis. In our implementation our LISA 
55
- curves are a localised summary of an L-function from a Poisson point process 
56
- model.  Our framework `lisaClust` can be used to provide a high-level summary 
57
- of cell-type colocalization in high-parameter spatial cytometry data, 
58
- facilitating the identification of distinct tissue compartments or 
59
- identification of complex cellular microenvironments.
60
-
61 50
 
51
+Clustering local indicators of spatial association (LISA) functions is a
52
+methodology for identifying consistent spatial organisation of multiple
53
+cell-types in an unsupervised way. This can be used to enable the
54
+characterization of interactions between multiple cell-types
55
+simultaneously and can complement traditional pairwise analysis. In our
56
+implementation our LISA curves are a localised summary of an L-function
57
+from a Poisson point process model. Our framework `lisaClust` can be
58
+used to provide a high-level summary of cell-type colocalization in
59
+high-parameter spatial cytometry data, facilitating the identification
60
+of distinct tissue compartments or identification of complex cellular
61
+microenvironments.
62 62
 
63 63
 # Quick start
64 64
 
65 65
 ## Generate toy data
66 66
 
67
-TO illustrate our `lisaClust` framework, here we consider a very simple toy 
68
-example where two cell-types are completely separated spatially. We simulate 
69
-data for two different images.
67
+To illustrate our `lisaClust` framework, here we consider a very simple
68
+toy example where two cell-types are completely separated spatially. We
69
+simulate data for two different images.
70 70
 
71 71
 ```{r eval=T}
72 72
 set.seed(51773)
... ...
@@ -79,42 +79,55 @@ imageID <- rep(c('s1', 's2'),c(800,800))
79 79
 
80 80
 cells <- data.frame(x, y, cellType, imageID)
81 81
 
82
-ggplot(cells, aes(x,y, colour = cellType)) + geom_point() + facet_wrap(~imageID)
82
+ggplot(cells, aes(x,y, colour = cellType)) + geom_point() + facet_wrap(~imageID) + theme_minimal()
83 83
 
84 84
 
85 85
 ```
86 86
 
87
-## Create SegmentedCellExperiment object
87
+## Create Single Cell Experiment object
88 88
 
89
-First we store our data in a `SegmentedCells` object. 
89
+First we store our data in a `SingleCellExperiment` object.
90 90
 
91 91
 ```{r}
92 92
 
93
-cellExp <- SegmentedCells(cells, cellTypeString = 'cellType')
94
-
95
-
93
+SCE <- SingleCellExperiment(colData = cells)
94
+SCE
96 95
 ```
96
+
97 97
 ## Running lisaCLust
98 98
 
99
-We can then use a convience function `lisaClust` to simultaneously calculate local indicators of spatial association (LISA) functions 
100
-using the `lisa` function and perform k-means clustering. 
99
+We can then use the convenience function `lisaClust` to simultaneously
100
+calculate local indicators of spatial association (LISA) functions using
101
+the `lisa` function and perform k-means clustering. The number of
102
+clusters can be specified with the `k =` parameter. In the example
103
+below, we've chosen `k = 2`, resulting in a total of 2 clusters.
104
+
105
+These clusters are stored in `colData` of the `SingleCellExperiment`
106
+object, as a new column with the column name `regions`.
101 107
 
102 108
 ```{r}
103
-cellExp <- lisaClust(cellExp, k = 2)
109
+SCE <- lisaClust(SCE, k = 2)
110
+colData(SCE) |> head()
104 111
 ```
105 112
 
106
-
107 113
 ## Plot identified regions
108 114
 
109
-The `hatchingPlot` function can be used to construct a `ggplot` object where the 
110
-regions are marked by different hatching patterns. This allows us to plot both 
111
-regions and cell-types on the same visualization.
115
+`lisaClust` also provides the convenient `hatchingPlot` function to
116
+visualise the different regions that have been demarcated by the
117
+clustering. `hatchingPlot` outputs a `ggplot` object where the regions
118
+are marked by different hatching patterns. In a real biological dataset,
119
+this allows us to plot both regions and cell-types on the same
120
+visualization.
112 121
 
122
+In the example below, we can visualise our simulated data where our 2
123
+cell types have been separated neatly into 2 distinct regions based on
124
+which cell type each region is dominated by. `region_2` is dominated by
125
+the red cell type `c1`, and `region_1` is dominated by the blue cell
126
+type `c2`.
113 127
 
114 128
 ```{r}
115
-hatchingPlot(cellExp, useImages = c('s1','s2'))
129
+hatchingPlot(SCE, useImages = c('s1','s2'))
116 130
 ```
117
-
118 131
 ## Using other clustering methods.
119 132
 
120 133
 While the `lisaClust` function is convenient, we have not implemented an exhaustive
... ...
@@ -129,178 +142,122 @@ localised summary of an L-function from a Poisson point process model. The radii
129 142
 that will be calculated over can be set with `Rs`.
130 143
 
131 144
 ```{r}
132
-
133
-lisaCurves <- lisa(cellExp, Rs = c(20, 50, 100))
134
-
145
+lisaCurves <- lisa(SCE, Rs = c(20, 50, 100))
135 146
 ```
136 147
 
137 148
 ### Perform some clustering
138 149
 
139 150
 The LISA curves can then be used to cluster the cells. Here we use k-means 
140 151
 clustering, other clustering methods like SOM could be used. We can store these 
141
-cell clusters or cell "regions" in our `SegmentedCells` object using the 
152
+cell clusters or cell "regions" in our `SingleCellExperiment` object using the 
142 153
 `cellAnnotation() <-` function.
143 154
 
144 155
 ```{r}
145
-
156
+# Custom clustering algorithm
146 157
 kM <- kmeans(lisaCurves,2)
147
-cellAnnotation(cellExp, "region") <- paste('region',kM$cluster,sep = '_')
148
-```
149
-
150
-
151
-
152
-## Alternative hatching plot
153
-
154
-We could also create this plot using `geom_hatching` and `scale_region_manual`.
155
-
156
-
157
-```{r}
158
-
159
-df <- as.data.frame(cellSummary(cellExp))
160
-
161
-p <- ggplot(df,aes(x = x,y = y, colour = cellType, region = region)) + 
162
-  geom_point() + 
163
-  facet_wrap(~imageID) +
164
-  geom_hatching(window = "concave", 
165
-                line.spacing = 11, 
166
-                nbp = 50, 
167
-                line.width = 2, 
168
-                hatching.colour = "gray20",
169
-                window.length = NULL) +
170
-  theme_minimal() + 
171
-  scale_region_manual(values = 6:7, labels = c('ab','cd'))
172
-
173
-p
174
-
175
-```
176
-## Faster ploting
177
-
178
-The `hatchingPlot` can be quite slow for large images and high `nbp` or `linewidth`.
179
-It is often useful to simply plot the regions without the cell type information.
180
-
181
-```{r}
182
-
183
-df <- as.data.frame(cellSummary(cellExp))
184
-df <- df[df$imageID == "s1", ]
185
-
186
-p <- ggplot(df,aes(x = x,y = y, colour  = region)) + 
187
-  geom_point() +
188
-  theme_classic()
189
-p
190 158
 
159
+# Storing clusters into colData
160
+colData(SCE)$custom_region <- paste('region',kM$cluster,sep = '_')
161
+colData(SCE) |> head()
191 162
 ```
192 163
 
193
-# Using a SingleCellExperiment
194
-
195
-The `lisaClust` function also works with a `SingleCellExperiment`. First lets
196
-create a `SingleCellExperiment` object. 
197
-
198
-
199
-
200
-```{r}
201
-
202
-sce <- SingleCellExperiment(colData = cellSummary(cellExp))
203
-
204
-```
205
-
206
-
207
-`lisaClust` just needs columns in `colData` corresponding to the x and y coordinates of the 
208
-cells, a column annotating the cell types of the cells and a column indicating 
209
-which image each cell came from.
210
-
211
-```{r}
212
-sce <- lisaClust(sce, 
213
-                 k = 2, 
214
-                 spatialCoords = c("x", "y"), 
215
-                 cellType = "cellType",
216
-                 imageID = "imageID")
217
-
218
-```
219
-We can then plot the regions using the following.
220
-
221
-```{r}
222
-
223
-hatchingPlot(sce)
224
-
225
-```
226
-
227
-
228 164
 
229 165
 
230 166
 
231 167
 # Damond et al. islet data.
232 168
 
233
-Here we apply our `lisaClust` framework to three images of pancreatic islets 
234
-from *A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry* by 
235
-Damond et al. (2019).
169
+Next, we apply our `lisaClust` framework to three images of pancreatic
170
+islets from *A Map of Human Type 1 Diabetes Progression by Imaging Mass
171
+Cytometry* by Damond et al. (2019).
236 172
 
237 173
 ## Read in data
238 174
 
239
-We will start by reading in the data and storing it as a `SegmentedCells` 
240
-object. Here the data is in a format consistent with that outputted by 
241
-CellProfiler.
175
+We will start by reading in the data and storing it as a
176
+`SingleCellExperiment` object. Here the data is in a format consistent with
177
+that outputted by CellProfiler.
178
+
242 179
 ```{r}
243 180
 isletFile <- system.file("extdata","isletCells.txt.gz", package = "spicyR")
244 181
 cells <- read.table(isletFile, header = TRUE)
245
-cellExp <- SegmentedCells(cells, cellProfiler = TRUE)
182
+damonSCE <- SingleCellExperiment(assay = list(intensities = t(cells[,grepl(names(cells), pattern = "Intensity_")])),
183
+                                 colData = cells[,!grepl(names(cells), pattern = "Intensity_")]
184
+                                 )
246 185
 
247 186
 ```
248 187
 
249
-
250 188
 ## Cluster cell-types
251 189
 
252
-This data does not include annotation of the cell-types of each cell. Here we 
253
-extract the marker intensities from the `SegmentedCells` object using 
254
-`cellMarks`. We then perform k-means clustering with eight clusters and store 
255
-these cell-type clusters in our `SegmentedCells` object using `cellType() <-`.
190
+This data does not include annotation of the cell-types of each cell.
191
+Here we extract the marker intensities from the `SingleCellExperiment` object
192
+using `assay()`. We then perform k-means clustering with 10
193
+clusters and store these cell-type clusters in our `SingleCellExperiment`
194
+object using `colData() <-`.
195
+
256 196
 ```{r}
257
-markers <- cellMarks(cellExp)
197
+markers <- t(assay(damonSCE, "intensities"))
258 198
 kM <- kmeans(markers,10)
259
-cellType(cellExp) <- paste('cluster', kM$cluster, sep = '')
199
+colData(damonSCE)$cluster <- paste('cluster', kM$cluster, sep = '')
200
+colData(damonSCE)[, c("ImageNumber", "cluster")] |> head()
260 201
 ```
261 202
 
262 203
 ## Generate LISA curves
263 204
 
264
-As before, we can calculate perform k-means clustering on the local indicators 
265
-of spatial association (LISA) functions using the `lisaClust` function. 
205
+As before, we can perform k-means clustering on the local
206
+indicators of spatial association (LISA) functions using the `lisaClust`
207
+function, remembering to specify the `imageID`, `cellType`, and `spatialCoords` 
208
+columns in `colData`.
266 209
 
267 210
 ```{r}
268 211
 
269
-cellExp <- lisaClust(cellExp, k = 2, Rs = c(10,20,50))
212
+damonSCE <- lisaClust(damonSCE, 
213
+                      k = 2, 
214
+                      Rs = c(10,20,50), 
215
+                      imageID = "ImageNumber", 
216
+                      cellType = "cluster",
217
+                      spatialCoords = c("Location_Center_X", "Location_Center_Y"))
270 218
 
271 219
 ```
272 220
 
273
-These regions are stored in cellExp and can be extracted.
221
+These regions are stored in `colData` and can be extracted.
274 222
 
275 223
 ```{r}
276
-
277
-cellAnnotation(cellExp, "region") |>
278
-  head()
224
+colData(damonSCE)[, c("ImageNumber", "region")] |>
225
+  head(20)
279 226
 ```
280 227
 
281
-
282 228
 ## Examine cell type enrichment
283 229
 
284
-We should check to see which cell types appear more frequently in each region than
285
-expected by chance. 
230
+`lisaClust` also provides a convenient function, `regionMap`, for examining which 
231
+cell types are located in which regions. In this example, we use this to check
232
+which cell types appear more frequently in each region than expected by chance.
286 233
 
287
-```{r}
288
-regionMap(cellExp, type = "bubble")
289
-```
234
+Here, we clearly see that clusters 2, 5, 1, and 8 are highly concentrated in 
235
+region 1, whilst all other clusters are thinly spread out across region 2.
290 236
 
237
+We can further segregate these cells by increasing the number of clusters, i.e.
238
+increasing the parameter `k = ` in the `lisaClust()` function, but for the purposes
239
+of demonstration, let's take a look at the `hatchingPlot` of these regions.
291 240
 
241
+```{r}
242
+regionMap(damonSCE, 
243
+          imageID = "ImageNumber", 
244
+          cellType = "cluster",
245
+          spatialCoords = c("Location_Center_X", "Location_Center_Y"),
246
+          type = "bubble")
247
+```
292 248
 
293 249
 ## Plot identified regions
294 250
 
295
-Finally, we can use `hatchingPlot` to construct a `ggplot` object where the 
296
-regions are marked by different hatching patterns. This allows us to visualize 
297
-the two regions and ten cell-types simultaneously.
251
+Finally, we can use `hatchingPlot` to construct a `ggplot` object where
252
+the regions are marked by different hatching patterns. This allows us to
253
+visualize the two regions and ten cell-types simultaneously.
298 254
 
299 255
 ```{r}
300
-hatchingPlot(cellExp)
256
+hatchingPlot(damonSCE, 
257
+             cellType = "cluster",
258
+             spatialCoords = c("Location_Center_X", "Location_Center_Y"))
301 259
 ```
302 260
 
303
-
304 261
 # sessionInfo()
305 262
 
306 263
 ```{r}