scanpy anndata Objects from a Manuscript
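This guide shows how to access and analyze the scanpy anndata objects associated with a recent manuscript. These objects are intended for computational biologists and data scientists working in genomics and related fields. Three replicates are available for download:
- Replicate 1: https://s3.amazonaws.com/dp-lab-data-public/palantir/human_cd34_bm_rep1.h5ad
- Replicate 2: https://s3.amazonaws.com/dp-lab-data-public/palantir/human_cd34_bm_rep2.h5ad
- Replicate 3: https://s3.amazonaws.com/dp-lab-data-public/palantir/human_cd34_bm_rep3.h5ad
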
Each anndata object contains the following elements:
- .X: Filtered, normalized, and log-transformed count matrix.
- .raw: Original, filtered raw count matrix.
- .obsm['MAGIC_imputed_data']: Imputed count matrix using the MAGIC algorithm.
- .obsm['tsne']: t-SNE maps (as presented in the manuscript), generated using scaled diffusion components.
- .obs['clusters']: Cell clustering information.
- .obs['palantir_pseudotime']: Cell pseudo-time ordering, as determined by Palantir.
- .obs['palantir_diff_potential']: Palantir-determined differentiation potential of cells.
- .obsm['palantir_branch_probs']: Probabilities of cells branching into different lineages, according to Palantir.
- .uns['palantir_branch_probs_cell_types']: Labels for Palantir branch probabilities.
- .uns['ct_colors']: Color codes for cell types, as used in the manuscript.
- .uns['cluster_colors']: Color codes for cell clusters, as used in the manuscript.

In [1]:
import scanpy as sc

# Read in the data, with backup URLs provided
adata_Rep1 = sc.read(
    "../data/human_cd34_bm_rep1.h5ad",
    backup_url="https://s3.amazonaws.com/dp-lab-data-public/palantir/human_cd34_bm_rep1.h5ad",
)
adata_Rep2 = sc.read(
    "../data/human_cd34_bm_rep2.h5ad",
    backup_url="https://s3.amazonaws.com/dp-lab-data-public/palantir/human_cd34_bm_rep2.h5ad",
)
adata_Rep3 = sc.read(
    "../data/human_cd34_bm_rep3.h5ad",
    backup_url="https://s3.amazonaws.com/dp-lab-data-public/palantir/human_cd34_bm_rep3.h5ad",
)
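
The loading cell above gives sc.read a local path plus a backup_url, so each file is fetched only when it is not already on disk. As a rough illustration of that read-or-download pattern outside scanpy, the sketch below uses only the Python standard library. The helper names local_path_for and ensure_local are hypothetical (they are not part of scanpy), and the injectable fetch argument is an assumption added so the logic can be exercised without network access.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL and file names taken from the loading cell above.
BASE_URL = "https://s3.amazonaws.com/dp-lab-data-public/palantir/"
REPLICATE_FILES = [
    "human_cd34_bm_rep1.h5ad",
    "human_cd34_bm_rep2.h5ad",
    "human_cd34_bm_rep3.h5ad",
]


def local_path_for(url, data_dir="../data"):
    """Map a download URL to a path inside data_dir (hypothetical helper)."""
    return Path(data_dir) / url.rsplit("/", 1)[-1]


def ensure_local(url, data_dir="../data", fetch=urlretrieve):
    """Return the local path for url, downloading only if the file is missing.

    fetch(url, filename) defaults to urllib's urlretrieve; accepting another
    callable is an assumption made here so the function can be tested offline.
    """
    path = local_path_for(url, data_dir)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        fetch(url, str(path))
    return path
```

Calling ensure_local(BASE_URL + REPLICATE_FILES[0]) would place the first replicate under ../data/ on first use and reuse the cached copy afterwards; the returned path can then be passed to sc.read.
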
Out[2]:
AnnData object with n_obs × n_vars = 5780 × 14651
    obs: 'clusters', 'palantir_pseudotime', 'palantir_diff_potential'
    uns: 'cluster_colors', 'ct_colors', 'palantir_branch_probs_cell_types'
    obsm: 'tsne', 'MAGIC_imputed_data', 'palantir_branch_probs'
Out[3]:
AnnData object with n_obs × n_vars = 6501 × 14913
    obs: 'clusters', 'palantir_pseudotime', 'palantir_diff_potential'
    uns: 'cluster_colors', 'ct_colors', 'palantir_branch_probs_cell_types'
    obsm: 'tsne', 'MAGIC_imputed_data', 'palantir_branch_probs'
Out[4]:
AnnData object with n_obs × n_vars = 12046 × 14044
    obs: 'clusters', 'palantir_pseudotime', 'palantir_diff_potential'
    uns: 'cluster_colors', 'ct_colors', 'palantir_branch_probs_cell_types'
    obsm: 'tsne', 'MAGIC_imputed_data', 'palantir_branch_probs'

Converting anndata Objects to Seurat Objects Using R
For researchers working with R and Seurat, converting the anndata objects to Seurat objects involves the following steps:
- Set Up R Environment and Libraries: Load the Seurat and anndata libraries.
- Download and Read the Data: Use curl::curl_download to download the anndata files from the provided URLs, then read them with the read_h5ad method from the anndata library.
- Create Seurat Objects: Use the CreateSeuratObject function to convert the data into Seurat objects, incorporating counts and metadata from the anndata object.

In [ ]:
# This cell only exists to allow running R code inside this Python notebook using a conda kernel
import sys
import os

# Get the path to the python executable
python_executable_path = sys.executable

# Extract the path to the environment from the path to the python executable
env_path = os.path.dirname(os.path.dirname(python_executable_path))
print(
    f"Conda env path: {env_path}\n"
    "Please make sure you have R installed in the conda environment."
)

os.environ['R_HOME'] = os.path.join(env_path, 'lib', 'R')

%load_ext rpy2.ipython
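
The cell above assumes that R lives under lib/R inside the active conda environment. A small check along the following lines could confirm that assumption before the rpy2 extension is loaded. The function names r_home_for and has_r are hypothetical, and the bin/R location being tested is an assumption about a typical conda layout on Linux or macOS, not something stated in the notebook.

```python
import os
import sys


def r_home_for(python_executable):
    """Derive <env>/lib/R from <env>/bin/python, mirroring the cell above."""
    env_path = os.path.dirname(os.path.dirname(python_executable))
    return os.path.join(env_path, "lib", "R")


def has_r(r_home):
    """Return True if an R launcher exists under r_home (assumed layout)."""
    return os.path.isfile(os.path.join(r_home, "bin", "R"))


r_home = r_home_for(sys.executable)
if not has_r(r_home):
    print(f"No R installation found under {r_home}; install R in this environment first.")
```

If the message is printed, setting R_HOME as in the cell above would point rpy2 at a directory that does not contain R, so it is worth installing R into the environment first.
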
In [6]:
%%R
library(Seurat)
library(anndata)

create_seurat <- function(url) {
  # Map the download URL to a local path and fetch the file only if it is missing
  file_path <- sub("https://s3.amazonaws.com/dp-lab-data-public/palantir/", "../data/", url)
  if (!file.exists(file_path)) {
    curl::curl_download(url, file_path)
  }
  data <- read_h5ad(file_path)

  # anndata stores cells x genes; Seurat expects genes x cells, hence t().
  # Note that data$X is the normalized, log-transformed matrix described above.
  seurat_obj <- CreateSeuratObject(
    counts = t(data$X),
    meta.data = data$obs,
    project = "CD34+ Bone Marrow Cells"
  )

  # Attach the manuscript t-SNE coordinates as a dimensional reduction
  tsne_data <- data$obsm[["tsne"]]
  rownames(tsne_data) <- rownames(data$obs)
  colnames(tsne_data) <- c("tSNE_1", "tSNE_2")
  seurat_obj[["tsne"]] <- CreateDimReducObject(
    embeddings = tsne_data,
    key = "tSNE_"
  )

  # Store the MAGIC-imputed matrix as a second assay
  imputed_data <- t(data$obsm[["MAGIC_imputed_data"]])
  colnames(imputed_data) <- rownames(data$obs)
  rownames(imputed_data) <- rownames(data$var)
  seurat_obj[["MAGIC_imputed"]] <- CreateAssayObject(counts = imputed_data)

  # Add the Palantir branch probabilities as per-cell metadata columns
  fate_probs <- as.data.frame(data$obsm[["palantir_branch_probs"]])
  colnames(fate_probs) <- data$uns[["palantir_branch_probs_cell_types"]]
  rownames(fate_probs) <- rownames(data$obs)
  seurat_obj <- AddMetaData(seurat_obj, metadata = fate_probs)

  return(seurat_obj)
}

human_cd34_bm_Rep1 <- create_seurat("https://s3.amazonaws.com/dp-lab-data-public/palantir/human_cd34_bm_rep1.h5ad")
human_cd34_bm_Rep2 <- create_seurat("https://s3.amazonaws.com/dp-lab-data-public/palantir/human_cd34_bm_rep2.h5ad")
human_cd34_bm_Rep3 <- create_seurat("https://s3.amazonaws.com/dp-lab-data-public/palantir/human_cd34_bm_rep3.h5ad")
R[write to console]: Loading required package: SeuratObject
R[write to console]: Loading required package: sp
R[write to console]: Attaching package: ‘SeuratObject’
R[write to console]: The following object is masked from ‘package:base’: intersect
WARNING: The R package "reticulate" only fixed recently an issue that caused a segfault when used with rpy2: https://github.com/rstudio/reticulate/pull/1188 Make sure that you use a version of that package that includes the fix.
R[write to console]: Attaching package: ‘anndata’
R[write to console]: The following object is masked from ‘package:SeuratObject’: Layers
R[write to console]: Warning: R[write to console]: Feature names cannot have underscores ('_'), replacing with dashes ('-')
R[write to console]: Warning: R[write to console]: Data is of class matrix. Coercing to dgCMatrix.
R[write to console]: Warning: R[write to console]: Feature names cannot have underscores ('_'), replacing with dashes ('-')
R[write to console]: Warning: R[write to console]: Feature names cannot have underscores ('_'), replacing with dashes ('-')
R[write to console]: Warning: R[write to console]: Feature names cannot have underscores ('_'), replacing with dashes ('-')
R[write to console]: Warning: R[write to console]: Data is of class matrix. Coercing to dgCMatrix.
R[write to console]: Warning: R[write to console]: Feature names cannot have underscores ('_'), replacing with dashes ('-')
R[write to console]: Warning: R[write to console]: Feature names cannot have underscores ('_'), replacing with dashes ('-')
R[write to console]: Warning: R[write to console]: Feature names cannot have underscores ('_'), replacing with dashes ('-')
R[write to console]: Warning: R[write to console]: Data is of class matrix. Coercing to dgCMatrix.
R[write to console]: Warning: R[write to console]: Feature names cannot have underscores ('_'), replacing with dashes ('-')
R[write to console]: Warning: R[write to console]: Feature names cannot have underscores ('_'), replacing with dashes ('-')
An object of class Seurat
29302 features across 5780 samples within 2 assays
Active assay: RNA (14651 features, 0 variable features)
 1 layer present: counts
 1 other assay present: MAGIC_imputed
 1 dimensional reduction calculated: tsne

An object of class Seurat
29826 features across 6501 samples within 2 assays
Active assay: RNA (14913 features, 0 variable features)
 1 layer present: counts
 1 other assay present: MAGIC_imputed
 1 dimensional reduction calculated: tsne

An object of class Seurat
28088 features across 12046 samples within 2 assays
Active assay: RNA (14044 features, 0 variable features)
 1 layer present: counts
 1 other assay present: MAGIC_imputed
 1 dimensional reduction calculated: tsne
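
The feature totals in the Seurat printouts can be misread as gene counts. Each Seurat object reports the sum over its two assays (RNA and MAGIC_imputed), and both assays hold the same genes, so the total is twice the gene count of the corresponding anndata object. The short check below reproduces that arithmetic from the numbers shown in this guide; the dictionary layout and variable names are only an illustrative choice.

```python
# (cells, genes) per replicate, copied from the AnnData printouts in this guide.
anndata_shapes = {
    "Rep1": (5780, 14651),
    "Rep2": (6501, 14913),
    "Rep3": (12046, 14044),
}

# Total features reported by the corresponding Seurat printouts.
seurat_totals = {"Rep1": 29302, "Rep2": 29826, "Rep3": 28088}

N_ASSAYS = 2  # RNA and MAGIC_imputed

for rep, (n_cells, n_genes) in anndata_shapes.items():
    expected = N_ASSAYS * n_genes
    status = "matches" if expected == seurat_totals[rep] else "differs from"
    print(f"{rep}: {N_ASSAYS} x {n_genes} = {expected} ({status} the Seurat total)")
```
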