matchclot.preprocessing.preprocess module

Summary

Classes:

lsiTransformer

tfidfTransformer

Functions:

harmony

Harmony batch effect correction applied jointly to multiple dataframes

lsi

LSI analysis (following the Seurat v3 approach)

tfidf

TF-IDF normalization (following the Seurat v3 approach)

Reference

tfidf(X)[source]

TF-IDF normalization (following the Seurat v3 approach)

Parameters:

X – Input matrix

Returns:

TF-IDF normalized matrix

Return type:

X_tfidf
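The following is a minimal sketch of what a TF-IDF normalization of this kind might compute, not the module's actual implementation. It assumes a dense cells-by-features count matrix in which every cell and every feature has at least one count; the name tfidf_sketch and the example matrix are hypothetical.

```python
import numpy as np


def tfidf_sketch(X):
    """Illustrative TF-IDF on a dense cells x features count matrix.

    Hypothetical helper: assumes no all-zero rows or columns.
    """
    X = np.asarray(X, dtype=float)
    # Term frequency: scale each row (cell) so that it sums to 1.
    tf = X / X.sum(axis=1, keepdims=True)
    # Inverse document frequency: number of cells divided by
    # the total count of each feature across all cells.
    idf = X.shape[0] / X.sum(axis=0)
    return tf * idf


counts = np.array([[1, 0, 3],
                   [0, 2, 2],
                   [4, 1, 0]])
X_tfidf = tfidf_sketch(counts)
print(X_tfidf)
```

Zero counts stay zero after normalization, so a sparse input would keep its sparsity pattern under the same arithmetic.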

class tfidfTransformer[source]

Bases: object

fit(X)[source]

transform(X)[source]

fit_transform(X)[source]

class lsiTransformer(n_components=20, drop_first=True, use_highly_variable=None)[source]

Bases: object

fit(adata)[source]

transform(adata)[source]

fit_transform(adata)[source]

lsi(adata, n_components=20, use_highly_variable=None, **kwargs)[source]

LSI analysis (following the Seurat v3 approach)

Parameters:

adata (AnnData) – Input dataset

n_components (int) – Number of dimensions to use

use_highly_variable (Optional[bool]) – Whether to use highly variable features only, stored in adata.var['highly_variable']. By default, highly variable features are used if they have been determined beforehand.

**kwargs – Additional keyword arguments are passed to sklearn.utils.extmath.randomized_svd()

Return type:

None
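As a rough illustration of the general LSI idea (TF-IDF, log scaling, then a truncated SVD), the sketch below works on a plain NumPy matrix rather than an AnnData object. It rests on several assumptions and is not the module's code: the log scaling with a factor of 1e4 is a common convention assumed here, an exact numpy.linalg.svd stands in for sklearn.utils.extmath.randomized_svd(), the boolean mask plays the role of adata.var['highly_variable'], and the name lsi_sketch is hypothetical.

```python
import numpy as np


def lsi_sketch(X, n_components=2, feature_mask=None):
    """Illustrative LSI embedding of a dense cells x features count matrix.

    Hypothetical helper: assumes no all-zero rows or columns after masking.
    """
    X = np.asarray(X, dtype=float)
    if feature_mask is not None:
        # Keep only the selected features (stand-in for highly variable ones).
        X = X[:, feature_mask]
    # TF-IDF: row-normalized counts times inverse feature frequency.
    tf = X / X.sum(axis=1, keepdims=True)
    idf = X.shape[0] / X.sum(axis=0)
    # Assumed log scaling with a factor of 1e4.
    X_norm = np.log1p(tf * idf * 1e4)
    # Exact SVD used here in place of a randomized SVD.
    U, S, Vt = np.linalg.svd(X_norm, full_matrices=False)
    # Left singular vectors of the leading components, one row per cell.
    return U[:, :n_components]


rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(6, 8)) + 1  # strictly positive toy counts
mask = np.array([True] * 5 + [False] * 3)   # hypothetical feature selection
embedding = lsi_sketch(counts, n_components=2, feature_mask=mask)
print(embedding.shape)
```

Since the documented return type of lsi is None, the real function presumably stores its result on the input object instead of returning it; the sketch returns the embedding only to keep the example self-contained.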

harmony(df_list, use_gpu=True)[source]

Harmony batch effect correction applied jointly to multiple dataframes