---
title: "Clustering of Local Indicators of Spatial Assocation (LISA) curves"
date: "`r BiocStyle::doc_date()`"
author:
- name: Nicolas Canete
  affiliation:  
  - &WIMR Westmead Institute for Medical Research, University of Sydney, Australia
  email: [email protected]
- name: Ellis Patrick
  affiliation:
  - &WIMR Westmead Institute for Medical Research, University of Sydney, Australia
  - School of Mathematics and Statistics, University of Sydney, Australia
  email: [email protected]
package: "`r BiocStyle::pkg_ver('spicyR')`"
vignette: >
  %\VignetteIndexEntry{"Inroduction to lisaClust"}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
output: 
  BiocStyle::html_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE, message = FALSE, warning = FALSE
)
library(BiocStyle)
```

# Installation

```{r, eval = FALSE}
if (!require("BiocManager"))
    install.packages("BiocManager")
BiocManager::install("lisaClust")
```


```{r message=FALSE, warning=FALSE}
# load required packages
library(lisaClust)
library(spicyR)
library(ggplot2)
library(SingleCellExperiment)
```
 
 

# Overview
 Clustering local indicators of spatial association (LISA) functions is a 
 methodology for identifying consistent spatial organisation of multiple 
 cell-types in an unsupervised way. This can be used to enable the 
 characterization of interactions between multiple cell-types simultaneously and 
 can complement traditional pairwise analysis. In our implementation our LISA 
 curves are a localised summary of an L-function from a Poisson point process 
 model.  Our framework `lisaClust` can be used to provide a high-level summary 
 of cell-type colocalization in high-parameter spatial cytometry data, 
 facilitating the identification of distinct tissue compartments or 
 identification of complex cellular microenvironments.



# Quick start

## Generate toy data

TO illustrate our `lisaClust` framework, here we consider a very simple toy 
example where two cell-types are completely separated spatially. We simulate 
data for two different images.

```{r eval=T}
set.seed(51773)
x <- round(c(runif(200),runif(200)+1,runif(200)+2,runif(200)+3,
           runif(200)+3,runif(200)+2,runif(200)+1,runif(200)),4)*100
y <- round(c(runif(200),runif(200)+1,runif(200)+2,runif(200)+3,
             runif(200),runif(200)+1,runif(200)+2,runif(200)+3),4)*100
cellType <- factor(paste('c',rep(rep(c(1:2),rep(200,2)),4),sep = ''))
imageID <- rep(c('s1', 's2'),c(800,800))

cells <- data.frame(x, y, cellType, imageID)

ggplot(cells, aes(x,y, colour = cellType)) + geom_point() + facet_wrap(~imageID)


```

## Create SegmentedCellExperiment object

First we store our data in a `SegmentedCells` object. 

```{r}

cellExp <- SegmentedCells(cells, cellTypeString = 'cellType')


```
## Running lisaCLust

We can then use a convience function `lisaClust` to simultaneously calculate local indicators of spatial association (LISA) functions 
using the `lisa` function and perform k-means clustering. 

```{r}
cellExp <- lisaClust(cellExp, k = 2)
```


## Plot identified regions

The `hatchingPlot` function can be used to construct a `ggplot` object where the 
regions are marked by different hatching patterns. This allows us to plot both 
regions and cell-types on the same visualization.


```{r}
hatchingPlot(cellExp, imageID = c('s1','s2'))
```

## Using other clustering methods.

While the `lisaClust` function is convenient, we have not implemented an exhaustive
suite of clustering methods as it is very easy to do this yourself. There are 
just two simple steps.

### Generate LISA curves

We can calculate local indicators of spatial association (LISA) functions 
using the `lisa` function. Here the LISA curves are a 
localised summary of an L-function from a Poisson point process model. The radii 
that will be calculated over can be set with `Rs`.

```{r}

lisaCurves <- lisa(cellExp, Rs = c(20, 50, 100))

```

### Perform some clustering

The LISA curves can then be used to cluster the cells. Here we use k-means 
clustering, other clustering methods like SOM could be used. We can store these 
cell clusters or cell "regions" in our `SegmentedCells` object using the 
`cellAnnotation() <-` function.

```{r}

kM <- kmeans(lisaCurves,2)
cellAnnotation(cellExp, "region") <- paste('region',kM$cluster,sep = '_')
```



## Alternative hatching plot

We could also create this plot using `geom_hatching` and `scale_region_manual`.


```{r}

df <- as.data.frame(cellSummary(cellExp))

p <- ggplot(df,aes(x = x,y = y, colour = cellType, region = region)) + 
  geom_point() + 
  facet_wrap(~imageID) +
  geom_hatching(window = "concave", 
                line.spacing = 11, 
                nbp = 50, 
                line.width = 2, 
                hatching.colour = "gray20",
                window.length = NULL) +
  theme_minimal() + 
  scale_region_manual(values = 6:7, labels = c('ab','cd'))

p

```
## Faster ploting

The `hatchingPlot` can be quite slow for large images and high `nbp` or `linewidth`.
It is often useful to simply plot the regions without the cell type information.

```{r}

df <- as.data.frame(cellSummary(cellExp))
df <- df[df$imageID == "s1", ]

p <- ggplot(df,aes(x = x,y = y, colour  = region)) + 
  geom_point() +
  theme_classic()
p

```

## Using a SingleCellExperiment

The `lisaClust` function also works with a `SingleCellExperiment`. First lets
create a `SingleCellExperiment` object. 



```{r}

sce <- SingleCellExperiment(colData = cells)

```


`lisaClust` just needs columns in `colData` corresponding to the x and y coordinates of the 
cells, a column annotating the cell types of the cells and a column indicating 
which image each cell came from.

```{r}
sce <- lisaClust(sce, 
                 k = 2, 
                 spatialCoords = c("x", "y"), 
                 cellType = "cellType",
                 imageID = "imageID")

```
We can then plot the regions using the following.

```{r}

df <- as.data.frame(colData(sce))

df <- df[df$imageID == "s1", ]

p <- ggplot(df,aes(x = x,y = y, colour = cellType, region = region)) + 
  geom_point() +
  geom_hatching() 
p

```





# Damond et al. islet data.

Here we apply our `lisaClust` framework to three images of pancreatic islets 
from *A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry* by 
Damond et al. (2019).

## Read in data

We will start by reading in the data and storing it as a `SegmentedCells` 
object. Here the data is in a format consistent with that outputted by 
CellProfiler.
```{r}
isletFile <- system.file("extdata","isletCells.txt.gz", package = "spicyR")
cells <- read.table(isletFile, header = TRUE)
cellExp <- SegmentedCells(cells, cellProfiler = TRUE)

```


## Cluster cell-types

This data does not include annotation of the cell-types of each cell. Here we 
extract the marker intensities from the `SegmentedCells` object using 
`cellMarks`. We then perform k-means clustering with eight clusters and store 
these cell-type clusters in our `SegmentedCells` object using `cellType() <-`.
```{r}
markers <- cellMarks(cellExp)
kM <- kmeans(markers,10)
cellType(cellExp) <- paste('cluster', kM$cluster, sep = '')
```

## Generate LISA curves

As before, we can calculate perform k-means clustering on the local indicators 
of spatial association (LISA) functions using the `lisaClust` function. 

```{r}

cellExp <- lisaClust(cellExp, k = 2, Rs = c(10,20,50))

```

These regions are stored in cellExp and can be extracted.

```{r}

cellAnnotation(cellExp, "region") |>
  head()
```


## Examine cell type enrichment

We should check to see which cell types appear more frequently in each region than
expected by chance. 

```{r}
regionMap(cellExp, type = "bubble")
```



## Plot identified regions

Finally, we can use `hatchingPlot` to construct a `ggplot` object where the 
regions are marked by different hatching patterns. This allows us to visualize 
the two regions and ten cell-types simultaneously.

```{r}
hatchingPlot(cellExp)
```


# sessionInfo()

```{r}
sessionInfo()
```