This is a PyTorch implementation of the Harmony algorithm for single-cell sequencing data integration. Please see Ilya Korsunsky et al., 2019 for details.
This package is published on PyPI:
pip install harmony-pytorch
Given an embedding X as an N-by-d matrix stored as a NumPy array (N is the number of cells, d is the number of embedding components) and cell attributes as a DataFrame df_metadata, use Harmony for data integration as follows:
from harmony import harmonize
Z = harmonize(X, df_metadata, batch_key = 'Channel')
where Channel is the attribute (column) in df_metadata that identifies the batch of each cell.
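To make the expected inputs concrete, the following sketch builds a synthetic embedding and a matching metadata table. The sizes, random values, and batch labels are invented for illustration; only the names X, df_metadata, and Channel come from the text above. The rows of df_metadata are assumed to be in the same order as the rows of X. The call to harmonize is wrapped in a hypothetical helper, integrate, so that the input construction can be run without the package installed.

```python
import numpy as np
import pandas as pd

# Hypothetical sizes, chosen only for illustration.
n_cells, n_components = 200, 20

rng = np.random.default_rng(0)

# X: one row per cell, one column per embedding component (e.g. PCs).
X = rng.normal(size=(n_cells, n_components))

# df_metadata: one row per cell, assumed to follow the row order of X.
# 'Channel' holds the batch label of each cell (labels are invented).
df_metadata = pd.DataFrame(
    {"Channel": rng.choice(["batch_a", "batch_b", "batch_c"], size=n_cells)}
)


def integrate(X, df_metadata, batch_key="Channel"):
    """Run Harmony on X; requires harmony-pytorch to be installed."""
    from harmony import harmonize  # imported lazily on purpose
    return harmonize(X, df_metadata, batch_key=batch_key)
```

With the package installed, Z = integrate(X, df_metadata) should return the integrated embedding, which we would expect to have the same shape as X.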
Alternatively, if there are multiple attributes for batches, write:
Z = harmonize(X, df_metadata, batch_key = ['Lab', 'Date'])

Input as MultimodalData Object
Harmony-pytorch works easily with the count matrix data structure from the PegasusIO package. Let data be a MultimodalData object in Python:
from harmony import harmonize
Z = harmonize(data.obsm['X_pca'], data.obs, batch_key = 'Channel')
data.obsm['X_pca_harmony'] = Z
This calculates the harmonized PCA matrix for the default UnimodalData of data.
Given a UnimodalData object unidata, you can run Harmony with the same code: simply substitute unidata for data.
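Because the same three lines apply to any object that exposes obsm and obs, the pattern can be wrapped in a small helper. The sketch below is illustrative only: store_harmonized, its parameters, the stand-in object, and fake_harmonize are hypothetical names that are not part of Harmony-pytorch or PegasusIO. The integration function is passed in as an argument so the example runs without either package.

```python
from types import SimpleNamespace

import numpy as np
import pandas as pd


def store_harmonized(data, harmonize_fn, batch_key,
                     rep="X_pca", out="X_pca_harmony"):
    """Harmonize data.obsm[rep] and store the result in data.obsm[out].

    `data` is any object exposing `.obsm` (dict-like) and `.obs`
    (a DataFrame); `harmonize_fn` is passed in so that this helper
    does not import anything itself.
    """
    Z = harmonize_fn(data.obsm[rep], data.obs, batch_key=batch_key)
    data.obsm[out] = Z
    return Z


# Stand-in object for demonstration only; NOT a real MultimodalData.
demo = SimpleNamespace(
    obsm={"X_pca": np.zeros((4, 2))},
    obs=pd.DataFrame({"Channel": ["a", "a", "b", "b"]}),
)


# Placeholder that returns a copy of its input instead of running Harmony.
def fake_harmonize(X, df, batch_key):
    return X.copy()


store_harmonized(demo, fake_harmonize, batch_key="Channel")
```

In real use, one would pass harmonize (imported from harmony) as harmonize_fn and a MultimodalData or UnimodalData object as data.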
Input as AnnData Object
Harmony-pytorch also works easily with the annotated count matrix data structure from the anndata package. Let adata be an AnnData object in Python:
from harmony import harmonize
Z = harmonize(adata.obsm['X_pca'], adata.obs, batch_key = '<your-batch-key>')
adata.obsm['X_harmony'] = Z
where <your-batch-key> should be replaced by the actual batch key attribute name in your data.
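Since the batch key has to name an attribute that actually exists in the metadata, it can help to check this before running the integration. The following is a minimal sketch of such a check; missing_batch_keys is a hypothetical helper, not part of Harmony-pytorch, and the small table stands in for adata.obs. It accepts either a single key or a list of keys, matching the two forms of batch_key shown earlier.

```python
import pandas as pd


def missing_batch_keys(obs, batch_key):
    """Return the batch key names that are not columns of `obs`."""
    keys = [batch_key] if isinstance(batch_key, str) else list(batch_key)
    return [k for k in keys if k not in obs.columns]


# Invented metadata table standing in for adata.obs.
obs = pd.DataFrame({"Lab": ["x", "y"], "Date": ["d1", "d2"]})

print(missing_batch_keys(obs, "Lab"))                # nothing missing
print(missing_batch_keys(obs, ["Lab", "Channel"]))   # 'Channel' is absent
```

An empty result means every requested key is present, so the call to harmonize can proceed with that batch_key.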
For details about the AnnData data structure, please refer to its documentation.