Brings Seurat to the tidyverse!
website: stemangiola.github.io/tidyseurat/
tidyseurat provides a bridge between the Seurat single-cell package [@butler2018integrating; @stuart2019comprehensive] and the tidyverse [@wickham2019welcome]. It creates an invisible layer that enables viewing the Seurat object as a tidyverse tibble, and provides Seurat-compatible dplyr, tidyr, ggplot and plotly functions.
Functions/utilities available
dplyr: all dplyr APIs, as for any tibble
tidyr: all tidyr APIs, as for any tibble
ggplot2: ggplot, as for any tibble
plotly: plot_ly, as for any tibble
tidy: add the tidyseurat invisible layer over a Seurat object
as_tibble: convert cell-wise information to a tbl_df
join_features: add feature-wise information; returns a tbl_df
aggregate_cells: aggregate cell gene-transcription abundance as pseudobulk tissue
Installation
From CRAN
From Github (development)
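The two routes would typically look like the sketch below. The GitHub call assumes that the devtools package is installed and that the repository path is stemangiola/tidyseurat, which is inferred from the website address above rather than stated in this document.

```r
# From CRAN (stable release)
install.packages("tidyseurat")

# From GitHub (development version)
# Assumption: devtools is available and the repository is stemangiola/tidyseurat
devtools::install_github("stemangiola/tidyseurat")
```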
Create tidyseurat, the best of both worlds!
This is a Seurat object, but it is evaluated as a tibble, so it is fully compatible with both the Seurat and tidyverse APIs.
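A minimal sketch of how such an object can be obtained, assuming the example data is the 80-cell pbmc_small object distributed with the SeuratObject package (older Seurat releases shipped it in Seurat itself). Loading tidyseurat is what switches on the tibble-like view.

```r
library(dplyr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small <- SeuratObject::pbmc_small

# With tidyseurat loaded, printing shows the tibble abstraction
pbmc_small
```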
It looks like a tibble
## # A Seurat-tibble abstraction: 80 × 15
## # Features=230 | Cells=80 | Active assay=RNA | Assays=RNA
##    .cell orig.ident nCount_RNA nFeature_RNA RNA_snn_res.0.8 letter.idents groups
##    <chr> <fct>           <dbl>        <int> <fct>           <fct>         <chr>
##  1 ATGC… SeuratPro…         70           47 0               A             g2
##  2 CATG… SeuratPro…         85           52 0               A             g1
##  3 GAAC… SeuratPro…         87           50 1               B             g2
##  4 TGAC… SeuratPro…        127           56 0               A             g2
##  5 AGTC… SeuratPro…        173           53 0               A             g2
##  6 TCTG… SeuratPro…         70           48 0               A             g1
##  7 TGGT… SeuratPro…         64           36 0               A             g1
##  8 GCAG… SeuratPro…         72           45 0               A             g1
##  9 GATA… SeuratPro…         52           36 0               A             g1
## 10 AATG… SeuratPro…        100           41 0               A             g1
## # ℹ 70 more rows
## # ℹ 8 more variables: RNA_snn_res.1 <fct>, PC_1 <dbl>, PC_2 <dbl>, PC_3 <dbl>,
## #   PC_4 <dbl>, PC_5 <dbl>, tSNE_1 <dbl>, tSNE_2 <dbl>
But it is a Seurat object after all
## $RNA
## Assay data with 230 features for 80 cells
## Top 10 variable features:
## PPBP, IGLL5, VDAC3, CD1C, AKR1C3, PF4, MYL9, GNLY, TREML1, CA2
Preliminary plots
Set colours and theme for plots.
library(ggplot2)

# Use colourblind-friendly colours
friendly_cols <- c("#88CCEE", "#CC6677", "#DDCC77", "#117733", "#332288", "#AA4499", "#44AA99", "#999933", "#882255", "#661100", "#6699CC")

# Set theme: a list of scales plus a theme, which can be added to any ggplot with `+`
# (linewidth requires ggplot2 >= 3.4.0; on older versions use size instead)
my_theme <-
  list(
    scale_fill_manual(values = friendly_cols),
    scale_color_manual(values = friendly_cols),
    theme_bw() +
      theme(
        panel.border = element_blank(),
        axis.line = element_line(),
        panel.grid.major = element_line(linewidth = 0.2),
        panel.grid.minor = element_line(linewidth = 0.1),
        text = element_text(size = 12),
        legend.position = "bottom",
        aspect.ratio = 1,
        strip.background = element_blank(),
        axis.title.x = element_text(margin = margin(t = 10, r = 10, b = 10, l = 10)),
        axis.title.y = element_text(margin = margin(t = 10, r = 10, b = 10, l = 10))
      )
  )
We can treat pbmc_small effectively as a normal tibble for plotting.
Here we plot number of features per cell.
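As a sketch, such a plot might be built by piping the object straight into ggplot. The column names nFeature_RNA and groups come from the printed object above; the histogram geometry, the bin count and the use of theme_bw() in place of my_theme are assumptions made to keep the example self-contained.

```r
library(dplyr)
library(ggplot2)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small <- SeuratObject::pbmc_small

# Cell-wise columns are available to aes() as in any tibble
p_features <- pbmc_small %>%
  ggplot(aes(nFeature_RNA, fill = groups)) +
  geom_histogram(bins = 30) +
  theme_bw()

p_features
```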
Here we plot total features per cell.
Here we plot abundance of two features for each group.
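One way to sketch this uses join_features to bring feature abundance into a long table, one row per cell and feature. The two features chosen here (HLA-DRA and LYZ) are illustrative, and because the name of the abundance column has varied between tidyseurat versions, the sketch looks it up rather than hard-coding it. The shape argument is assumed to be supported by the installed version.

```r
library(dplyr)
library(ggplot2)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small <- SeuratObject::pbmc_small

# Long format: one row per cell-feature pair (illustrative feature choice)
abundance_tbl <- pbmc_small %>%
  join_features(features = c("HLA-DRA", "LYZ"), shape = "long")

# Locate the RNA abundance column instead of assuming its exact name
abundance_col <- grep("abundance_RNA$", colnames(abundance_tbl), value = TRUE)[1]

p_abundance <- abundance_tbl %>%
  ggplot(aes(groups, .data[[abundance_col]] + 1, fill = groups)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(alpha = 0.5, width = 0.2) +
  scale_y_log10() +
  theme_bw()
```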
Preprocess the dataset
We can also treat the object as a Seurat object and proceed with data processing.
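A processing chain consistent with the output below (an SCT assay becomes active and PCA coordinates are present) could look like this sketch. The exact functions and arguments are assumptions, since the original code is not shown here.

```r
library(dplyr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small <- SeuratObject::pbmc_small

# Standard Seurat functions work unchanged on the tidyseurat view
pbmc_small_pca <- pbmc_small %>%
  SCTransform(verbose = FALSE) %>%
  FindVariableFeatures(verbose = FALSE) %>%
  RunPCA(verbose = FALSE)

pbmc_small_pca
```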
## # A Seurat-tibble abstraction: 80 × 17
## # Features=220 | Cells=80 | Active assay=SCT | Assays=RNA, SCT
##    .cell orig.ident nCount_RNA nFeature_RNA RNA_snn_res.0.8 letter.idents groups
##    <chr> <fct>           <dbl>        <int> <fct>           <fct>         <chr>
##  1 ATGC… SeuratPro…         70           47 0               A             g2
##  2 CATG… SeuratPro…         85           52 0               A             g1
##  3 GAAC… SeuratPro…         87           50 1               B             g2
##  4 TGAC… SeuratPro…        127           56 0               A             g2
##  5 AGTC… SeuratPro…        173           53 0               A             g2
##  6 TCTG… SeuratPro…         70           48 0               A             g1
##  7 TGGT… SeuratPro…         64           36 0               A             g1
##  8 GCAG… SeuratPro…         72           45 0               A             g1
##  9 GATA… SeuratPro…         52           36 0               A             g1
## 10 AATG… SeuratPro…        100           41 0               A             g1
## # ℹ 70 more rows
## # ℹ 10 more variables: RNA_snn_res.1 <fct>, nCount_SCT <dbl>,
## #   nFeature_SCT <int>, PC_1 <dbl>, PC_2 <dbl>, PC_3 <dbl>, PC_4 <dbl>,
## #   PC_5 <dbl>, tSNE_1 <dbl>, tSNE_2 <dbl>
If a tool is not included in the tidyseurat collection, we can use as_tibble to permanently convert the tidyseurat object into a tibble.
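A minimal sketch of the conversion, run on the unprocessed pbmc_small object for brevity. The result is a plain tibble of cell-wise information, so the Seurat assays are no longer attached to it.

```r
library(dplyr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small <- SeuratObject::pbmc_small

# Drop the Seurat layer and keep only the cell-wise table
cell_tbl <- pbmc_small %>% as_tibble()

# Any function that expects a data frame can now be used
cell_tbl %>%
  select(.cell, groups, nCount_RNA) %>%
  head()
```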
We proceed with cluster identification with Seurat.
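A sketch of this step using Seurat's graph-based clustering with default parameters. It repeats the preprocessing so that it runs on its own; the choice of FindNeighbors followed by FindClusters is an assumption consistent with the SCT_snn_res.0.8 and seurat_clusters columns in the output below.

```r
library(dplyr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small_cluster <- SeuratObject::pbmc_small %>%
  SCTransform(verbose = FALSE) %>%
  FindVariableFeatures(verbose = FALSE) %>%
  RunPCA(verbose = FALSE) %>%
  # Build the nearest-neighbour graph, then cluster it
  FindNeighbors(verbose = FALSE) %>%
  FindClusters(verbose = FALSE)

pbmc_small_cluster
```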
## # A Seurat-tibble abstraction: 80 × 19
## # Features=220 | Cells=80 | Active assay=SCT | Assays=RNA, SCT
##    .cell orig.ident nCount_RNA nFeature_RNA RNA_snn_res.0.8 letter.idents groups
##    <chr> <fct>           <dbl>        <int> <fct>           <fct>         <chr>
##  1 ATGC… SeuratPro…         70           47 0               A             g2
##  2 CATG… SeuratPro…         85           52 0               A             g1
##  3 GAAC… SeuratPro…         87           50 1               B             g2
##  4 TGAC… SeuratPro…        127           56 0               A             g2
##  5 AGTC… SeuratPro…        173           53 0               A             g2
##  6 TCTG… SeuratPro…         70           48 0               A             g1
##  7 TGGT… SeuratPro…         64           36 0               A             g1
##  8 GCAG… SeuratPro…         72           45 0               A             g1
##  9 GATA… SeuratPro…         52           36 0               A             g1
## 10 AATG… SeuratPro…        100           41 0               A             g1
## # ℹ 70 more rows
## # ℹ 12 more variables: RNA_snn_res.1 <fct>, nCount_SCT <dbl>,
## #   nFeature_SCT <int>, SCT_snn_res.0.8 <fct>, seurat_clusters <fct>,
## #   PC_1 <dbl>, PC_2 <dbl>, PC_3 <dbl>, PC_4 <dbl>, PC_5 <dbl>, tSNE_1 <dbl>,
## #   tSNE_2 <dbl>
Now we can interrogate the object as if it were a regular tibble data frame.
## # A tibble: 6 × 3
##   groups seurat_clusters     n
##   <chr>  <fct>           <int>
## 1 g1     0                  23
## 2 g1     1                  17
## 3 g1     2                   4
## 4 g2     0                  17
## 5 g2     1                  13
## 6 g2     2                   6
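The same kind of query works on any cell-wise column. The sketch below counts cells per combination of groups and letter.idents, a column that ships with pbmc_small, so it runs without the clustering step; substituting seurat_clusters on a clustered object should give a table shaped like the one above.

```r
library(dplyr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small <- SeuratObject::pbmc_small

# count() returns an ordinary tibble of cell counts per combination
cell_counts <- pbmc_small %>%
  count(groups, letter.idents)

cell_counts
```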
We can identify cluster markers using Seurat.
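A hedged sketch of marker detection followed by a tidyverse summary. For brevity it uses the identity classes that ship with pbmc_small rather than freshly computed clusters, and the thresholds are illustrative. The column name avg_log2FC is assumed; older Seurat releases call it avg_logFC.

```r
library(dplyr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small <- SeuratObject::pbmc_small

# FindAllMarkers returns a data frame, so dplyr verbs apply directly
markers <- pbmc_small %>%
  FindAllMarkers(only.pos = TRUE, min.pct = 0.25, verbose = FALSE) %>%
  group_by(cluster) %>%
  slice_max(avg_log2FC, n = 10) %>%  # top markers per cluster
  ungroup()

markers
```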
Reduce dimensions
We can calculate the first 3 UMAP dimensions using the Seurat framework.
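A minimal sketch, run on the PCA that already ships with pbmc_small so that it stands alone. The number of principal components passed to dims is an assumption, not a value from this document.

```r
library(dplyr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject,
# which already contains a "pca" reduction
pbmc_small_umap <- SeuratObject::pbmc_small %>%
  RunUMAP(reduction = "pca", dims = 1:10, n.components = 3L, verbose = FALSE)

# The embedding has one column per requested UMAP dimension
dim(Embeddings(pbmc_small_umap, reduction = "umap"))
```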
We can then plot them in 3D using plotly.
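The sketch below shows one way to draw such a plot. UMAP column names differ between Seurat versions (UMAP_1 versus umap_1), so it converts to a tibble and looks the columns up instead of hard-coding them; this is a deliberate simplification, not necessarily how the original plot was made. Colouring by groups is also an assumption.

```r
library(dplyr)
library(plotly)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
umap_tbl <- SeuratObject::pbmc_small %>%
  RunUMAP(reduction = "pca", dims = 1:10, n.components = 3L, verbose = FALSE) %>%
  as_tibble()

# Find the three UMAP columns regardless of letter case
umap_cols <- grep("^umap_[1-3]$", colnames(umap_tbl), ignore.case = TRUE, value = TRUE)
stopifnot(length(umap_cols) == 3)

fig <- plotly::plot_ly(
  x = umap_tbl[[umap_cols[1]]],
  y = umap_tbl[[umap_cols[2]]],
  z = umap_tbl[[umap_cols[3]]],
  color = umap_tbl$groups,
  type = "scatter3d",
  mode = "markers"
)
```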
[Figure: screenshot of the plotly 3D plot]
Cell type prediction
We can infer cell type identities using SingleR [@aran2019reference] and manipulate the output using tidyverse.
We can easily summarise the results. For example, we can see how cell type classification overlaps with cluster classification.
We can easily reshape the data for building information-rich faceted plots.
We can easily plot gene correlation per cell category, adding multi-layer annotations.
Nested analyses
A powerful tool we can use with tidyseurat is nest. We can easily perform independent analyses on subsets of the dataset. First we classify cell types into lymphoid and myeloid; then we nest based on the new classification.
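A sketch of the nesting step. Since the cell type labels are not available without running SingleR, it uses a hypothetical stand-in classification (cell_class, derived from the groups column) purely to show the mechanics; a lymphoid/myeloid column would be used the same way.

```r
library(dplyr)
library(tidyr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small_nested <- SeuratObject::pbmc_small %>%
  # Hypothetical stand-in for the lymphoid/myeloid classification
  mutate(cell_class = if_else(groups == "g1", "class_1", "class_2")) %>%
  # One row per class; the data column holds a Seurat object for each subset
  nest(data = -cell_class)

pbmc_small_nested
```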
Now, independently for the lymphoid and myeloid subsets, we can (i) find variable features, (ii) reduce dimensions, and (iii) cluster, using both tidyverse and Seurat seamlessly.
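The per-subset analysis can be sketched with purrr::map inside mutate. As before, cell_class is a hypothetical stand-in for the lymphoid/myeloid labels. The ScaleData step and npcs = 10 are assumptions added so that the sketch runs on the small unprocessed example data.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small_nested_reanalysed <- SeuratObject::pbmc_small %>%
  mutate(cell_class = if_else(groups == "g1", "class_1", "class_2")) %>%
  nest(data = -cell_class) %>%
  # Run the same Seurat workflow on each subset independently
  mutate(data = map(
    data,
    ~ .x %>%
      FindVariableFeatures(verbose = FALSE) %>%
      ScaleData(verbose = FALSE) %>%
      RunPCA(npcs = 10, verbose = FALSE) %>%
      FindNeighbors(verbose = FALSE) %>%
      FindClusters(verbose = FALSE)
  ))
```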
Now we can unnest and plot the new classification.
Aggregating cells
Sometimes, it is necessary to aggregate the gene-transcript abundance from a group of cells into a single value. For example, when comparing groups of cells across different samples with fixed-effect models.
In tidyseurat, cell aggregation can be achieved using the aggregate_cells function.
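A minimal sketch, assuming that the first argument after the object names the grouping column and that assays selects which assay to aggregate. Here cells are pooled by the groups column, giving one pseudobulk value per feature and group.

```r
library(dplyr)
library(Seurat)
library(tidyseurat)

# Assumption: pbmc_small is the example dataset shipped with SeuratObject
pbmc_small <- SeuratObject::pbmc_small

# Aggregate RNA abundance across all cells in each group
pseudobulk <- pbmc_small %>%
  aggregate_cells(groups, assays = "RNA")

pseudobulk
```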