:Name: metacells
:Version: 0.9.4
:Home page: https://github.com/tanaylab/metacells.git
:Summary: Single-cell RNA Sequencing Analysis
:Upload time: 2023-10-24 13:45:10
:Author: Oren Ben-Kiki
:Requires Python: >=3.7
:License: MIT license
:Keywords: metacells

Metacells 0.9.4 - Single-cell RNA Sequencing Analysis
=====================================================

.. image:: https://readthedocs.org/projects/metacells/badge/?version=latest
    :target: https://metacells.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

The metacells package implements the improved metacell algorithm [1]_ for single-cell RNA sequencing (scRNA-seq) data
analysis within the `scipy <https://www.scipy.org/>`_ framework, and the projection algorithm based on it [2]_. The
original metacell algorithm [3]_ was implemented in R. The Python package contains various algorithmic improvements and
scales to larger data sets (millions of cells).

Metacell Analysis
-----------------

Naively, scRNA-seq data is a set of cell profiles, where for each one, for each gene, we get a count of the mRNA
molecules that existed in the cell for that gene. This serves as an indicator of how "expressed" or "active" the gene
is.

As in any real world technology, the raw data may suffer from technical artifacts (counting the molecules of two cells
in one profile, counting the molecules from a ruptured cell, counting only the molecules from the cell nucleus, etc.).
This requires pruning the raw data to exclude such artifacts.

Data from current scRNA-seq technology is also very sparse (typically <<10% of the RNA molecules are counted). This
introduces large sampling variance on top of the original signal, which itself contains significant inherent biological
noise.

Analyzing scRNA-seq data therefore requires processing the profiles in bulk. Classically, this has been done by directly
clustering the cells using various methods.

In contrast, the metacell approach groups together profiles of cells of the "same" biological state, using the
*minimal* number of profiles needed for computing robust statistics (in particular, mean gene expression). Each such
group is a single "metacell".

By summing profiles of cells of the "same" state together, each metacell greatly reduces the sampling variance, and
provides a more robust estimation of the transcription state. Note a metacell is *not* a cell type (multiple metacells
may belong to the same "type", or even have the "same" state, if the data sufficiently over-samples this state). Also, a
metacell is *not* a parametric model of the cell state. It is merely a more robust description of some cell state.
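
The variance reduction can be illustrated with a toy simulation. This is only a minimal sketch using the Python standard
library, not the package's algorithm: the gene fraction, the number of UMIs per cell and the group size are made-up
values, and the profiles are pooled by plain summation.

.. code-block:: python

    import random
    import statistics

    TRUE_FRACTION = 0.01  # assumed "true" fraction of the gene (made-up value)
    UMIS_PER_CELL = 200   # assumed number of UMIs captured per cell (made-up value)


    def sample_gene_umis(rng):
        """Count how many of the sampled UMIs of one cell belong to the gene."""
        return sum(rng.random() < TRUE_FRACTION for _ in range(UMIS_PER_CELL))


    def estimate_fraction(cells_count, rng):
        """Estimate the gene fraction from the summed UMIs of several cells."""
        gene_umis = sum(sample_gene_umis(rng) for _ in range(cells_count))
        return gene_umis / (cells_count * UMIS_PER_CELL)


    rng = random.Random(123456)
    single_cell_estimates = [estimate_fraction(1, rng) for _ in range(200)]
    pooled_estimates = [estimate_fraction(50, rng) for _ in range(40)]

    single_cell_sd = statistics.stdev(single_cell_estimates)
    pooled_sd = statistics.stdev(pooled_estimates)
    print(f"single-cell sd: {single_cell_sd:.4f}, 50-cell sd: {pooled_sd:.4f}")

In this toy model the spread of the pooled estimate should shrink roughly with the square root of the number of cells
that are summed.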

The metacells should therefore be further analyzed as if they were cells, using additional methods to classify cell
types, detect cell trajectories and/or lineage, build parametric models for cell behavior, etc. Using metacells as input
for such analysis techniques should benefit both from the more robust, less noisy input; and also from the (~100-fold)
reduction in the number of cells to analyze when dealing with large data (e.g. analyzing millions of individual cells).

A common use case is taking a new data set and using an existing atlas with annotations (in particular, "type"
annotations) to provide initial annotations for the new data set. As of version 0.9 this capability is provided
by this package.

Metacell projection provides a quantitative "projected" genes profile for each query metacell in the atlas, together
with a "corrected" one for the same subset of genes shared between the query and the atlas. Actual correction is
optional, to be used only if there are technological differences between the data sets, e.g. 10X v2 vs. 10X v3. This
allows performing a quantitative comparison between the projected and corrected gene expression profiles for determining
whether the query metacell is a novel state that does not exist in the atlas, or, if it does match an atlas state,
analyze any differences it may still have. This serves both for quality control and for quantitative analysis of
perturbed systems (e.g. knockouts or disease models) in comparison to a baseline atlas.

Terminology and Results Format
------------------------------

**NOTE**: Version 0.9 **breaks compatibility** with version 0.8 when it comes to some APIs and the names and semantics
of the result annotations. See below for the description of updated results (and how they differ from version 0.8). The
new format is meant to improve the usability of the system in downstream analysis pipelines. For convenience we also
list here the results of the new projection pipeline added in version 0.9.*. Versions 0.9.1 and 0.9.2 contain some bug
fixes. Version 0.9.3 allows specifying target UMIs for the metacells, in addition to the target size in cells, and
adaptively tries to satisfy both. This should produce better-sized metacells "out of the box" compared to the 0.9.[0-2]
versions. The latest published version, 0.9.4, contains minor bug fixes and updates for newer versions of dependency
packages.

If you have existing metacell data that was computed using version 0.8, you can use the provided
`conversion script <https://github.com/tanaylab/metacells/blob/master/bin/convert_0.8_to_0.9.py>`_
to migrate your data to the format described below, while preserving any additional annotations you may have
created for your data (e.g. metacells type annotations). The script will not modify your existing data files, so you can
examine the results and tweak them if necessary.

In the upcoming version 0.10 we will migrate from using ``AnnData`` to using ``daf`` to represent the data (``h5ad``
files will still be supported, either directly through an adapter or via a conversion process). This will again
unavoidably break API compatibility, but will provide many advantages over the restricted ``AnnData`` APIs.

We apologize for the inconvenience.

Metacells Computation
.....................

In theory, the only inputs required for metacell analysis are cell gene profiles with a UMIs count per gene per cell. In
practice, a key part of the analysis is specifying lists of genes for special treatment. We use the following
terminology for these lists:

``excluded_gene``, ``excluded_cell`` masks
    Excluded genes (and/or cells) are totally ignored by the algorithm (e.g. mitochondrial genes, cells with too few
    UMIs).

    Deciding on the "right" list of excluded genes (and cells) is crucial for creating high-quality metacells. We rely
    on the analyst to provide this list based on prior biological knowledge. To support this supervised task, we provide
    the ``exclude_genes`` and ``exclude_cells`` functions which implement "reasonable" strategies for detecting some
    (not all) of the genes and cells to exclude. For example, these will exclude any genes found by
    ``find_bursty_lonely_genes`` (called ``find_noisy_lonely_genes`` in v0.8). Additional considerations might be to
    use ``relate_genes`` to (manually) exclude genes that are highly correlated with known-to-need-to-be-excluded genes,
    or exclude any cells that are marked as doublets, etc.

    Currently the 1st step of the processing must be to create a "clean" data set which lacks the excluded genes and
    cells (e.g. using ``extract_clean_data``). When we switch to ``daf`` we'll just stay with the original data set and
    apply the exclusion masks to the rest of the algorithm.

``lateral_gene`` mask
    Lateral genes are forbidden from being selected for computing cells similarity (e.g., cell cycle genes). In version
    0.8 these were called "forbidden" genes. Lateral genes are still counted towards the total UMIs count when computing
    gene expression levels for cells similarity. In addition, lateral genes are still used to compute deviant (outlier)
    cells. That is, each computed metacell should still have a consistent gene expression level even for lateral genes.

    The motivation is that we don't want the algorithm to even try to create metacells based on these genes. Since these
    genes may be very strong (again, cell cycle), they would overcome the cell-type genes we are interested in,
    resulting in for example an "M-state" metacell which combines cells from several (similar) cell types.

    Deciding on the "right" list of lateral genes is crucial for creating high-quality metacells. We rely on the analyst
    to provide this list based on prior biological knowledge. To support this supervised task, we provide the
    ``relate_genes`` pipeline for identifying genes closely related to known lateral genes, so they can be added to the
    list.

``noisy_gene`` mask
    Noisy genes are given more freedom when computing deviant (outlier) cells. That is, we don't expect the expression
    level of such genes in the cells in the same metacell to be as consistent as we do for regular (non-noisy) genes.
    Note this isn't related to the question of whether the gene is lateral or not. That is, a gene may be lateral,
    noisy, both, or neither.

    The motivation is that some genes are inherently bursty and therefore cause many cells which are otherwise a good
    match for their metacell to be marked as deviant (outliers). This can be detected by examining the
    ``deviant_fold`` matrix (see below).

    Deciding on the "right" list of noisy genes is again crucial for creating high-quality metacells (and minimizing the
    fraction of outlier cells). Again we rely on the analyst here.
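
Conceptually, each of these lists is just a boolean mask over the genes. The following is a minimal sketch in plain
Python (it does not use the package's API); the gene names, the pattern and the choice of which genes are lateral or
noisy are hypothetical and serve only to show that a gene may be lateral, noisy, both, or neither.

.. code-block:: python

    import re

    # Hypothetical gene names and lists; real lists come from prior biological knowledge.
    gene_names = ["MT-CO1", "MKI67", "TOP2A", "HBB", "CD3E", "GATA1"]
    excluded_patterns = ["MT-.*"]       # e.g. mitochondrial genes
    lateral_names = {"MKI67", "TOP2A"}  # e.g. cell cycle genes
    noisy_names = {"HBB", "TOP2A"}      # hypothetical bursty genes


    def matches_any(name, patterns):
        """Whether the whole gene name matches any of the regular expressions."""
        return any(re.fullmatch(pattern, name) for pattern in patterns)


    excluded_gene = [matches_any(name, excluded_patterns) for name in gene_names]
    lateral_gene = [name in lateral_names for name in gene_names]
    noisy_gene = [name in noisy_names for name in gene_names]

    for name, excluded, lateral, noisy in zip(gene_names, excluded_gene, lateral_gene, noisy_gene):
        print(f"{name}: excluded={excluded} lateral={lateral} noisy={noisy}")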

Having determined the inputs and possibly tweaking the hyper-parameters (a favorite one is the ``target_metacell_size``,
which by default is 160K UMIs; this may be reduced for small data sets and may be increased for larger data sets), one
typically runs ``divide_and_conquer_pipeline`` to obtain the following:

``metacell`` (index) vs. ``metacell_name`` (string) per cell
    The result of computing metacells for a set of cells with the above assigns each cell a metacell index. We also give
    each metacell a name of the format ``M<index>.<checksum>`` where the checksum reflects the cells grouped into the
    metacell. This protects the analyst from mistakenly applying metadata assigned to metacells from an old computation
    to different newly computed metacells.

    We provide functions (``convey_obs_to_group``, ``convey_group_to_obs``) for conveying between per-cell and
    per-metacell annotations, which all currently use the metacell integer indices (this will change when we switch to
    ``daf``). The metacell string names are safer to use, especially when slicing the data.

``dissolve`` cells mask
    Whether the cell was in a candidate metacell that was dissolved due to being too small (too few cells and/or total
    UMIs). This may aid quality control when there are a large number of outliers; lowering the ``target_metacell_size``
    may help avoid this.

``selected_gene`` mask
    Whether each gene was ever selected to be used to compute the similarity between cells to compute the metacells.
    When using the divide-and-conquer algorithm, this mask is different for each pile (especially in the second phase
    when piles are homogeneous). This mask is the union of all the masks used in all the piles. It is useful for
    ensuring no should-be-lateral genes were selected as this would reduce the quality of the metacells. If such genes
    exist, add them to the ``lateral_gene`` mask and recompute the metacells.
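
To see why a name containing a checksum is safer than a bare index, consider the following minimal sketch. The
``metacell_name`` helper and its checksum (a few hex digits of a SHA-256 digest of the sorted cell names) are
assumptions made for illustration only; the package's actual checksum may be computed differently.

.. code-block:: python

    import hashlib


    def metacell_name(index, cell_names):
        """Build an M<index>.<checksum> name from the cells grouped into a metacell.

        Hypothetical helper: the real checksum used by the package may differ.
        """
        text = "\n".join(sorted(cell_names))
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return f"M{index}.{digest[:8]}"


    old_name = metacell_name(7, ["cell_a", "cell_b", "cell_c"])
    same_name = metacell_name(7, ["cell_c", "cell_a", "cell_b"])
    new_name = metacell_name(7, ["cell_a", "cell_b", "cell_x"])
    print(old_name, same_name, new_name)

Two computations that place different cells in "metacell 7" get different names, so metadata keyed by the old name
cannot silently attach to the new metacell.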

Having computed the metacells, the next step is to run ``collect_metacells`` to create a new ``AnnData`` object for them
(when using ``daf``, they will be created in the same dataset for easier analysis), which will contain all the per-gene
metadata, and also:

``X`` per gene per metacell
    Once the metacells have been computed (typically using ``divide_and_conquer_pipeline``), we can collect the gene
    expression levels profile for each one. The main motivation for computing metacells is that they allow for a robust
    estimation of the gene expression level, and therefore we by default compute a matrix of gene fractions (which sum
    to 1) in each metacell, rather than providing a UMIs count for each. This simplifies the further analysis of the
    computed metacells (this is known as ``e_gc`` in the old R metacells package).

    Note that the expression level of noisy genes is less reliable, as we do not guarantee the cells in each metacell
    have a consistent expression level for such genes. Our estimator therefore uses a normal weighted mean for most
    genes and a normalized geometric mean for the noisy gene. Since the sizes of the cells collected into the same
    metacell may vary, our estimator also ensures one large cell doesn't dominate the results. That is, the computed
    fractions are *not* simply "sum of the gene UMIs in all cells divided by the sum of all gene UMIs in all cells".

``grouped`` per metacell
    The number of cells grouped into each metacell.

``total_umis`` per metacell, and per gene per metacell
    We still provide the total UMIs count for each gene in each metacell (summed over the metacell's cells), and the
    total UMIs in each metacell. Note that the estimated fraction of each gene in the metacell is *not* its total UMIs
    divided by the metacell's total UMIs; the actual estimator is more complex.

    The total UMIs are important to ensure that analysis is meaningful. For example, comparing expression levels of
    lowly-expressed genes in two metacells will yield wildly inaccurate results unless a sufficient number of UMIs were
    used (the sum of UMIs of the gene in both compared metacells). The functions provided here for computing fold
    factors (log base 2 of the ratio) and related comparisons automatically ignore cases when this sum is below some
    threshold (40) by considering the effective fold factor to be 0 (that is, "no difference").

``metacells_level`` per cell or metacell
    This is 0 for rare gene module metacells, 1 for metacells computed from the main piles in the 2nd divide-and-conquer
    phase and 2 for metacells computed for their outliers.
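
The effect of the total UMIs threshold on fold factors can be sketched as follows. This is a minimal illustration, not
the package's implementation: the ``fold_factor`` helper is hypothetical, and the small regularization constant added
to the fractions is an assumption used here only to avoid taking the log of zero.

.. code-block:: python

    import math

    MIN_TOTAL_UMIS = 40    # threshold quoted in the text above
    REGULARIZATION = 1e-5  # assumed small constant to avoid taking the log of zero


    def fold_factor(fraction_a, fraction_b, umis_a, umis_b):
        """Log base 2 of the ratio of two gene fractions, or 0 if there are too few UMIs."""
        if umis_a + umis_b < MIN_TOTAL_UMIS:
            return 0.0  # too few UMIs for a reliable comparison: "no difference"
        return math.log2((fraction_a + REGULARIZATION) / (fraction_b + REGULARIZATION))


    print(fold_factor(8e-4, 1e-4, umis_a=120, umis_b=15))  # supported by 135 UMIs
    print(fold_factor(8e-4, 1e-4, umis_a=24, umis_b=3))    # ignored: only 27 UMIs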

If using ``divide_and_conquer_pipeline``, the following are also computed (but not by the simple
``compute_divide_and_conquer_metacells``):

``rare_gene_module_<N>`` mask (for N = 0, ...)
    A mask of the genes combined into each of the detected "rare gene modules". This is done in (expensive)
    pre-processing before the full divide-and-conquer algorithm to increase the sensitivity of the method, by creating
    metacells computed only from cells that express each rare gene module.

``rare_gene`` mask
    A mask of all the genes in all the rare gene modules, for convenience.

``rare_gene_module`` per cell or metacell
    The index of the rare gene module each cell or metacell expresses (or negative in the common case where it
    expresses none of them).

``rare_cell``, ``rare_metacell`` masks
    A mask of all the cells or metacells expressing any of the rare gene modules, for convenience.

In theory one is free to go on to use the metacells for further analysis, but it is prudent to perform quality control
first. One obvious measure is the number of outlier cells (with a negative metacell index and a metacell name of
``Outliers``). In addition, one should compute and look at the following (an easy way to compute all of them at once is
to call ``compute_for_mcview``; this will change in the future):

``most_similar``, ``most_similar_name`` per cell (computed by ``compute_outliers_most_similar``)
    For each outlier cell (whose metacell index is ``-1`` and metacell name is ``Outliers``), the index and name of the
    metacell which is the "most similar" to the cell (has highest correlation).

``deviant_fold`` per gene per cell (computed by ``compute_deviant_folds``)
    For each cell, for each gene, the ``deviant_fold`` holds the fold factor (log base 2) between the expression level
    of the gene in the cell and the metacell it belongs to (or the most similar metacell for outlier cells). This uses
    the same (strong) normalization factor we use when computing deviant (outlier) cells, so for outliers, you should
    see some (non-excluded, non-noisy) genes with a fold factor above 3 (8x), or some (non-excluded, noisy) genes with a
    fold factor above 5 (32x), which justify why we haven't merged that cell into a metacell; for cells grouped into
    metacells, you shouldn't see (many) such genes. If there is a large number of outlier cells and a few non-noisy
    genes have a high fold factor for many of them, you should consider marking these genes as noisy and recomputing the
    metacells. If they are already marked as noisy, you may want to completely exclude them.

``inner_fold`` per gene per metacell (computed by ``compute_inner_folds``)
    For each metacell, for each gene, the ``inner_fold`` is the strongest (highest absolute value) ``deviant_fold`` of
    any of the cells contained in the metacell. Both this and the ``inner_stdev_log`` below can be used for quality
    control over the consistency of the gene expression in the metacell.

``significant_inner_folds_count`` per gene
    For each gene, the number of metacells in which there's at least one cell with a high ``deviant_fold`` (that is,
    where the ``inner_fold`` is high). This helps in identifying troublesome genes, which can be then marked as noisy,
    lateral or even excluded, depending on their biological significance.

``inner_stdev_log`` per gene per metacell (computed by ``compute_inner_stdev_logs``)
    For each metacell, for each gene, the standard deviation of the log (base 2) of the fraction of the gene across the
    cells of the metacell. Ideally, the standard deviation should be ~1/3rd of the ``deviants_min_gene_fold_factor``
    (which is ``3`` by default), indicating that (almost) all cells are within that maximal fold factor. In practice we
    may see higher values - the lower, the better. Both this and the ``inner_fold`` above can be used for quality control
    over the consistency of the gene expression in the metacell.

``marker_gene`` mask (computed by ``find_metacells_marker_genes``)
    Given the computed metacells, we can identify genes that have a sufficient number of effective UMIs (in some
    metacells) and also have a wide range of expressions (between different metacells). These genes serve as markers for
    identifying the "type" of the metacell (or, more generally, the "gene programs" that are active in each metacell).

    Typically, analysis groups the marker genes into "gene modules" (or, more generally, "gene programs"), and then uses
    the notion of "type X expresses the gene module/programs Y, Z, ...". As of version 0.9, collecting such gene modules
    (or programs) is left to the analyst with little or no direct support in this package, other than providing the rare
    gene modules (which by definition would apply only to a small subset of the metacells).

``x``, ``y`` per metacell (computed by ``compute_umap_by_markers``)
    A common and generally effective way to visualize the computed metacells is to project them to a 2D view. Currently
    we do this by giving UMAP a distance metric between metacells based on a logistic function based on the expression
    levels of the marker genes. In version 0.8 this was based on picking (some of) the selected genes.

    This view is good for quality control. If it forces "unrelated" cell types together, this might mean that more genes
    should be made lateral, or noisy, or even excluded; or maybe the data contains a metacell of doublets; or metacells
    mixing cells from different types, if too many genes were marked as lateral or noisy, or excluded. It takes a
    surprisingly small number of such doublet/mixture metacells to mess up the UMAP projection.

    Also, one shouldn't read too much into the 2D layout, as by definition it can't express the "true" structure of the
    data. Looking at specific gene-gene plots gives much more robust insight into the actual differences between the
    metacell types, helps identify doublets, etc.

``obs_outgoing_weights`` per metacell per metacell (also computed by ``compute_umap_by_markers``)
    The (sparse) matrix of weights of the graph used to generate the ``x`` and ``y`` 2D projection. This graph is *very*
    sparse, that is, has a very low degree for the nodes. It is meant to be used only in conjunction with the 2D
    coordinates for visualization, and should **not** be used by any downstream analysis to determine which metacells
    are "near" each other for any other purpose.
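
The relation between ``deviant_fold``, ``inner_fold`` and ``significant_inner_folds_count`` described above can be
sketched in plain Python. The toy matrix and the threshold below are assumptions for illustration; the package computes
these from the actual data.

.. code-block:: python

    # Hypothetical toy data: deviant_fold[cell][gene], and the metacell index of each cell.
    deviant_fold = [
        [0.5, -1.0, 0.2],
        [-3.5, 0.4, 0.1],
        [1.0, 0.3, -0.2],
        [0.2, 4.0, 0.0],
    ]
    metacell_of_cell = [0, 0, 1, 1]
    SIGNIFICANT_FOLD = 3.0  # assumed threshold, matching the fold factor of 3 (8x) above

    genes_count = len(deviant_fold[0])
    metacells_count = max(metacell_of_cell) + 1

    # inner_fold[metacell][gene]: the deviant fold with the highest absolute value.
    inner_fold = [[0.0] * genes_count for _ in range(metacells_count)]
    for cell_folds, metacell in zip(deviant_fold, metacell_of_cell):
        for gene, fold in enumerate(cell_folds):
            if abs(fold) > abs(inner_fold[metacell][gene]):
                inner_fold[metacell][gene] = fold

    # For each gene, the number of metacells with a significant inner fold.
    significant_inner_folds_count = [
        sum(abs(folds[gene]) >= SIGNIFICANT_FOLD for folds in inner_fold)
        for gene in range(genes_count)
    ]
    print(inner_fold, significant_inner_folds_count)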

Metacells Projection
....................

For the use case of projecting metacells we use the following terminology:

``atlas``
    A set of metacells with associated metadata, most importantly a ``type`` annotation per metacell. In addition, the
    atlas may provide an ``essential_gene_of_<type>`` mask for each type. For a query metacell to project successfully
    to a given type, the query's expression of the type's essential genes must match the atlas. We also use the metadata
    listed above (specifically, ``lateral_gene``, ``noisy_gene`` and ``marker_gene``).

``query``
    A set of metacells with minimal associated metadata, specifically without a ``type``. This may optionally contain
    its own ``lateral_gene``, ``noisy_gene`` and/or even ``marker_gene`` annotations.

``ignored_gene`` mask, ``ignored_gene_of_<type>`` mask
    A set of genes to not even try to match between the query and the atlas. In general the projection matches only a
    subset of the genes (that are common to the atlas and the query). However, the analyst has the option to force
    additional genes to be ignored, either in general or only when projecting metacells of a specific type. Manually
    ignoring specific genes which are known not to match (e.g., due to the query being some experiment, e.g. a knockout
    or a disease model) can improve the quality of the projection for the genes which do match.

Given these two input data sets, the ``projection_pipeline`` computes the following (inside the query ``AnnData``
object):

``atlas_gene`` mask
    A mask of the query genes that also exist in the atlas. We match genes by their name; if projecting query data from
    a different technology, we expect the caller to modify the query gene names to match the atlas before projecting
    it.

``atlas_lateral_gene``, ``atlas_noisy_gene``, ``atlas_marker_gene``, ``essential_gene_of_<type>`` masks
    These masks are copied from the atlas to the query (restricting them to the common ``atlas_gene`` subset).

``projected_noisy_gene``
    The mask of the genes that were considered "noisy" when computing the projection. By default this is the union
    of the noisy atlas and query genes.

``corrected_fraction`` per gene per query metacell
    For each ``atlas_gene``, its fraction in each query metacell, out of only the atlas genes. This may be further
    corrected (see below) if projecting between different scRNA-seq technologies (e.g. 10X v2 and 10X v3). For
    non-``atlas_gene`` this is 0.

``projected_fraction`` per gene per query metacell
    For each ``atlas_gene``, its fraction in its projection on the atlas. This projection is computed as a weighted
    average of some atlas metacells (see below), which are all sufficiently close to each other (in terms of gene
    expression), so averaging them is reasonable to capture the fact the query metacell may be along some position on
    some gradient that isn't an exact match for any specific atlas metacell. For non-``atlas_gene`` this is 0.

``total_atlas_umis`` per query metacell
    The total UMIs of the ``atlas_gene`` in each query metacell. This is used in the analysis as described for
    ``total_umis`` above, that is, to ensure comparing expression levels will ignore cases where the total number of
    UMIs of both compared gene profiles is too low to make a reliable determination. In such cases we take the fold
    factor to be 0.

``weights`` per query metacell per atlas metacell
    The weights used to compute the ``projected_fraction``. Due to ``AnnData`` limitations this is returned as a
    separate object, but in ``daf`` we should be able to store this directly into the query object.
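
The way ``weights`` and the atlas profiles combine into the ``projected_fraction`` can be sketched as a plain weighted
average. The numbers below are made up, and the assumption that each row of weights sums to 1 is ours; how the weights
themselves are chosen is not shown.

.. code-block:: python

    # Hypothetical toy data: atlas_fraction[atlas_metacell][gene].
    atlas_fraction = [
        [0.10, 0.30, 0.60],
        [0.20, 0.20, 0.60],
        [0.70, 0.20, 0.10],
    ]
    # weights[query_metacell][atlas_metacell]; each row is assumed to sum to 1.
    weights = [
        [0.75, 0.25, 0.0],
        [0.0, 0.0, 1.0],
    ]

    genes_count = len(atlas_fraction[0])
    projected_fraction = [
        [
            sum(weight * fractions[gene] for weight, fractions in zip(query_weights, atlas_fraction))
            for gene in range(genes_count)
        ]
        for query_weights in weights
    ]
    print(projected_fraction)

The first query metacell lands between the first two (similar) atlas metacells, while the second matches the third
atlas metacell exactly.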

In theory, this would be enough for looking at the query metacells and comparing them to the atlas, and to project
metadata from the atlas to the query (e.g., the metacell type) using ``convey_atlas_to_query``. In practice, there is a
significant amount of quality control one needs to apply before accepting these results, which we compute as follows:

``correction_factor`` per gene
    If projecting a query on an atlas with different technologies (e.g., 10X v3 to 10X v2), the automatically computed
    factor by which we multiplied the query gene fractions to compensate for the systematic difference between the
    technologies (1.0 for uncorrected genes and 0.0 for non-``atlas_gene``).

``projected_type`` per query metacell
    For each query metacell, the best atlas ``type`` we can assign to it based on its projection. Note this does not
    indicate that the query metacell is "truly" of this type; to make this determination one needs to look at the
    quality control data below.

``projected_secondary_type`` per query metacell
    In some cases, a query metacell may fail to project well to a single region of the atlas, but does project well to a
    combination of two distinct atlas regions. This may be due to the query metacell containing doublets, or a mixture
    of cells which match different atlas regions (e.g. due to sparsity of data in the query data set). Either way, if
    this happens, we place here the type that best describes the secondary region the query metacell was projected to;
    otherwise this would be the empty string. Note that the ``weights`` matrix above does not distinguish between the
    regions.

``fitted_gene_of_<type>`` mask
    For each type, the genes that were projected well from the query to the atlas for most metacells of that type; any
    ``atlas_gene`` outside this mask failed to project well from the query to the atlas for most metacells of this type.
    For non-``atlas_gene`` this is set to ``False``.

    Whether failing to project well some of the ``atlas_gene`` for most metacells of some ``projected_type`` indicates
    that they aren't "truly" of that type is a decision which only the analyst can make, based on prior biological
    knowledge of the relevant genes.

``fitted`` mask per gene per query metacell
    For each ``atlas_gene`` for each query metacell, whether the gene was expected to be projected well, based on the
    query metacell ``projected_type`` (and the ``projected_secondary_type``, if any). For non-``atlas_gene`` this is set
    to ``False``. This does not guarantee the gene was actually projected well.

``misfit`` mask per gene per query metacell
    For each ``atlas_gene`` for each query metacell, whether the ``corrected_fraction`` of the gene was significantly
    different from the ``projected_fraction`` (that is, whether the gene was not projected well for this metacell). For
    non-``atlas_gene`` this is set to ``False``, to make it easier to identify problematic genes.

    This is expected to be rare for ``fitted`` genes and common for the rest of the ``atlas_gene``. If too many
    ``fitted`` genes are also ``misfit``, then one should be suspicious whether the query metacell is "truly" of the
    ``projected_type``.

``essential`` mask per gene per query metacell
    Which of the ``atlas_gene`` were also listed in the ``essential_gene_of_<type>`` for the ``projected_type`` (and
    also the ``projected_secondary_type``, if any) of each query metacell.

    If an ``essential`` gene is also a ``misfit`` gene, then one should be very suspicious whether the query metacell is
    "truly" of the ``projected_type``.

``projected_correlation`` per query metacell
    The correlation between the ``corrected_fraction`` and the ``projected_fraction`` expression levels of only the
    ``fitted`` genes of each query metacell. This serves as a very rough estimator for the quality of the projection for
    this query metacell (e.g. can be used to compute R^2 values).

    In general we expect high correlation (more than 0.9 in most metacells) since we restricted the ``fitted`` genes
    mask only to genes we projected well.

``projected_fold`` per gene per query metacell
    The fold factor between the ``corrected_fraction`` and the ``projected_fraction`` (0 for non-``atlas_gene``). If
    the absolute value of this is high (3 for 8x ratio) then the gene was not projected well for this metacell.

    It is expected this would have low values for most ``fitted`` genes and high values for the rest of the
    ``atlas_gene``, but specific values will vary from one query metacell to another. This allows the analyst to make
    fine-grained determination about the quality of the projection, and/or identify quantitative differences between the
    query and the atlas (e.g., when studying perturbed systems such as knockouts or disease models).

``similar`` mask per query metacell
    A conservative determination of whether the query metacell is "similar" to its projection on the atlas. This is
    based on whether the number of ``misfit`` genes for the query metacell is low enough (by default, up to 3 genes),
    and also that at least 75% of the ``essential`` genes of the query metacell were not ``misfit`` genes. Note that
    this explicitly allows for a ``projected_secondary_type``, that is, a metacell of doublets will be "similar" to the
    atlas, but a metacell of a novel state missing from the atlas will be "dissimilar".

    The final determination of whether to accept the projection is, as always, up to the analyst, based on prior
    biological knowledge, the context of the collection of the query (and atlas) data sets, etc. The analyst need not
    (indeed, *should not*) blindly accept the ``similar`` determination without examining the rest of the quality
    control data listed above.
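
The ``similar`` rule described above can be sketched for a single query metacell as follows. The ``is_similar`` helper
is hypothetical and only restates the two conditions from the text; in particular, treating a metacell with no
essential genes as passing the second condition is our assumption.

.. code-block:: python

    MAX_MISFIT_GENES = 3               # default quoted in the text above
    MIN_ESSENTIAL_FIT_FRACTION = 0.75  # "at least 75%" quoted in the text above


    def is_similar(misfit, essential):
        """Decide whether one query metacell is "similar" to its projection.

        Both arguments are per-gene boolean lists for that metacell.
        """
        misfit_count = sum(misfit)
        essential_count = sum(essential)
        fit_essential_count = sum(e and not m for e, m in zip(essential, misfit))
        if misfit_count > MAX_MISFIT_GENES:
            return False
        if essential_count == 0:
            return True  # assumption: with no essential genes only the misfit count matters
        return fit_essential_count >= MIN_ESSENTIAL_FIT_FRACTION * essential_count


    essential = [True, True, True, True, False, False]
    print(is_similar([False, False, True, False, False, False], essential))  # 3 of 4 essential genes fit
    print(is_similar([True, True, False, False, False, False], essential))   # only 2 of 4 essential genes fit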

Installation
------------

In short: ``pip install metacells``. Note that ``metacells`` requires many "heavy" dependencies, most notably ``numpy``,
``pandas``, ``scipy``, ``scanpy``, which ``pip`` should automatically install for you. If you are running inside a
``conda`` environment, you might prefer to use it to first install these dependencies, instead of having ``pip`` install
them from ``PyPI``.

Note that ``metacells`` only runs natively on Linux and MacOS. To run it on a Windows computer, you must activate
`Windows Subsystem for Linux <https://docs.microsoft.com/en-us/windows/wsl>`_ and install ``metacells`` within it.

The metacells package contains extensions written in C++. The ``metacells`` distribution provides pre-compiled Python
wheels for both Linux and MacOS, so installing it using ``pip`` should not require a C++ compilation step.

Note that for X86 CPUs, these pre-compiled wheels were built to use AVX2 (Haswell/Excavator CPUs or newer), and will not
work on older CPUs which are limited to SSE. Also, these wheels will not make use of any newer instructions (such as
AVX512), even if available. While these wheels may not be the perfect match for the machine you are running on, they
are expected to work well for most machines.

To see the native capabilities of your machine on Linux, you can run ``grep flags /proc/cpuinfo | head -1``, which
prints a long list of supported CPU features in an arbitrary order; this may include ``sse``, ``avx2``, ``avx512``,
etc. To test only whether AVX2 is supported, run ``grep avx2 /proc/cpuinfo | head -1``; empty output means it is not.
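
The same check can be scripted, for example before deciding between the pre-compiled wheel and a local build. The
following is a minimal sketch, assuming a Linux machine that exposes ``/proc/cpuinfo``; the helper names
``parse_cpu_flags`` and ``supports_avx2`` are made up for this example and are not part of ``metacells``.

```python
from pathlib import Path
from typing import Optional, Set


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the CPU feature names listed on the first ``flags`` line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no ``flags`` line at all (e.g. a non-x86 CPU)


def supports_avx2(cpuinfo_path: str = "/proc/cpuinfo") -> Optional[bool]:
    """Return whether AVX2 is listed, or None if the file is unavailable."""
    path = Path(cpuinfo_path)
    if not path.exists():
        return None  # not Linux, or /proc is not mounted: cannot tell from here
    return "avx2" in parse_cpu_flags(path.read_text())


if __name__ == "__main__":
    print("AVX2 supported:", supports_avx2())
```

Unlike the plain ``grep``, this compares whole feature names, so it will not be fooled by another flag that merely
contains the text ``avx2``.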

You can avoid installing the pre-compiled wheel by running ``pip install metacells --no-binary :all:``. This will force
``pip`` to compile the C++ extensions locally on your machine, optimizing for its native capabilities, whatever these
may be. Note that ``:all:`` applies to every package ``pip`` installs in that run, including the dependencies; using
``--no-binary metacells`` instead restricts the source build to ``metacells`` itself. Compiling locally will take much
longer but may give you *somewhat* faster results (note: the results will **not** be exactly the same as when running
the precompiled wheel due to differences in floating-point rounding). Also, this requires you to have a C++ compiler
which supports C++14 installed (either ``g++`` or ``clang``). Installing a C++ compiler depends on your specific system
(using ``conda`` may make this less painful).
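
The warning above about floating-point rounding can be made concrete with a small, generic example. It shows only one
common source of such differences (the same numbers added in a different order), which is an assumption about the kind
of effect involved and not a description of what the ``metacells`` extensions actually do.

```python
import math

# The same three numbers, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two sums are not bit-for-bit identical...
print(left_to_right == right_to_left)

# ...but they agree to within normal floating-point tolerance.
print(math.isclose(left_to_right, right_to_left))
```

When comparing results computed with the pre-compiled wheel against results from a locally compiled build, it is
therefore safer to compare with a tolerance than to test for exact equality.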

Vignettes
---------

The latest vignettes can be found `here <https://github.com/tanaylab/metacells-vignettes>`_.

References
----------

Please cite the references appropriately in case they are used:

.. [1] Ben-Kiki, O., Bercovich, A., Lifshitz, A. et al. Metacell-2: a divide-and-conquer metacell algorithm for scalable
   scRNA-seq analysis. Genome Biol 23, 100 (2022). https://doi.org/10.1186/s13059-022-02667-1

.. [2] Ben-Kiki, O., Bercovich, A., Lifshitz, A. et al. MCProj: metacell projection for interpretable and quantitative
   use of transcriptional atlases. Genome Biol 24, 220 (2023). https://doi.org/10.1186/s13059-023-03069-7

.. [3] Baran, Y., Bercovich, A., Sebe-Pedros, A. et al. MetaCell: analysis of single-cell RNA-seq data using K-nn graph
   partitions. Genome Biol 20, 206 (2019). https://doi.org/10.1186/s13059-019-1812-2

License (MIT)
-------------

Copyright © 2020-2023 Weizmann Institute of Science

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit
persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the
Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


History
=======

0.5
---

* First published version.

0.6
---

* More robust graph partition.
* Allow forcing feature genes.
* Rename "project" to "convey" to prepare for addition of atlas projection functionality.

0.7.0
-----

* Switch to new project template.
* Fix some edge cases in the pipeline.
* Switch to using ``psutil`` for detecting system resources.
* Fix binary wheel issues.
* Give up on using travis-ci.

0.8.0
-----

* Add inner fold factor computation for metacells quality control.
* Add deviant fold factor computation for metacells quality control.
* Add projection of query data onto an atlas.
* Self-adjusting pile sizes.
* Add convenience function for computing data for MCView.
* Better control over filtering using absolute fold factors.
* Fix edge case in computing noisy lonely genes.
* Additional outliers certificates.
* Stricter deviants detection policy.

0.9.0
-----

* Improved and published projection algorithm.
* Restrict number of rare gene candidates.
* Tighter control over metacells size and internal quality.
* Improved divide-and-conquer strategy.
* Base deviants (outliers) on gaps between cells.
* Terminology changes (see the README for details).
* Projection!

0.9.1
-----

* Fix build for python 3.11.
* More robust gene selection, KNN graph creation, and metacells partition.
* More thorough binary wheels.

0.9.2
-----

* Fix numpy compatibility issue.
* Fix K of UMAP skeleton KNN graph (broken in 0.9.1).

0.9.3
-----

* Allow specifying both target UMIs and target size (in cells) for the metacells, and adaptively try to
  satisfy both. This should produce better-sized metacells "out of the box" compared to 0.9.[0-2].

0.9.4
-----

* Fix minor bug in regularization of metacell fractions.
* Fix issue with canonical sparse matrices after downsampling (probably due to ``scipy.sparse`` updates).
* Fix using deprecated AnnData APIs.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/tanaylab/metacells.git",
    "name": "metacells",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "metacells",
    "author": "Oren Ben-Kiki",
    "author_email": "oren@ben-kiki.org",
    "download_url": "https://files.pythonhosted.org/packages/6f/50/edf7073ca3efda2fad0fd94afc1115af81be5afed49d399f803c80bca242/metacells-0.9.4.tar.gz",
    "platform": null,
    "description": "Metacells 0.9.4 - Single-cell RNA Sequencing Analysis\n=====================================================\n\n.. image:: https://readthedocs.org/projects/metacells/badge/?version=latest\n    :target: https://metacells.readthedocs.io/en/latest/?badge=latest\n    :alt: Documentation Status\n\nThe metacells package implements the improved metacell algorithm [1]_ for single-cell RNA sequencing (scRNA-seq) data\nanalysis within the `scipy <https://www.scipy.org/>`_ framework, and projection algorithm based on it [2]_. The original\nmetacell algorithm [3]_ was implemented in R. The python package contains various algorithmic improvements and is\nscalable for larger data sets (millions of cells).\n\nMetacell Analysis\n-----------------\n\nNaively, scRNA_seq data is a set of cell profiles, where for each one, for each gene, we get a count of the mRNA\nmolecules that existed in the cell for that gene. This serves as an indicator of how \"expressed\" or \"active\" the gene\nis.\n\nAs in any real world technology, the raw data may suffer from technical artifacts (counting the molecules of two cells\nin one profile, counting the molecules from a ruptured cells, counting only the molecules from the cell nucleus, etc.).\nThis requires pruning the raw data to exclude such artifacts.\n\nThe current technology scRNA-seq data is also very sparse (typically <<10% the RNA molecules are counted). This\nintroduces large sampling variance on top of the original signal, which itself contains significant inherent biological\nnoise.\n\nAnalyzing scRNA-seq data therefore requires processing the profiles in bulk. Classically, this has been done by directly\nclustering the cells using various methods.\n\nIn contrast, the metacell approach groups together profiles of the \"same\" biological state into groups of cells of the\n\"same\" biological state, with the *minimal* number of profiles needed for computing robust statistics (in particular,\nmean gene expression). 
Each such group is a single \"metacell\".\n\nBy summing profiles of cells of the \"same\" state together, each metacell greatly reduces the sampling variance, and\nprovides a more robust estimation of the transcription state. Note a metacell is *not* a cell type (multiple metacells\nmay belong to the same \"type\", or even have the \"same\" state, if the data sufficiently over-samples this state). Also, a\nmetacell is *not* a parametric model of the cell state. It is merely a more robust description of some cell state.\n\nThe metacells should therefore be further analyzed as if they were cells, using additional methods to classify cell\ntypes, detect cell trajectories and/or lineage, build parametric models for cell behavior, etc. Using metacells as input\nfor such analysis techniques should benefit both from the more robust, less noisy input; and also from the (~100-fold)\nreduction in the number of cells to analyze when dealing with large data (e.g. analyzing millions of individual cells).\n\nA common use case is taking a new data set and using an existing atlas with annotations (in particular, \"type\"\nannotations) to provide initial annotations for the new data set. As of version 0.9 this capability is provided\nby this package.\n\nMetacell projection provides a quantitative \"projected\" genes profile for each query metacell in the atlas, together\nwith a \"corrected\" one for the same subset of genes shared between the query and the atlas. Actual correction is\noptional, to be used only if there are technological differences between the data sets, e.g. 10X v2 vs. 10X v3. This\nallows performing a quantitative comparison between the projected and corrected gene expression profiles for determining\nwhether the query metacell is a novel state that does not exist in the atlas, or, if it does match an atlas state,\nanalyze any differences it may still have. This serves both for quality control and for quantitative analysis of\nperturbed systems (e.g. 
knockouts or disease models) in comparison to a baseline atlas.\n\nTerminology and Results Format\n------------------------------\n\n**NOTE**: Version 0.9 **breaks compatibility** with version 0.8 when it comes to some APIs and the names and semantics\nof the result annotations. See below for the description of updated results (and how they differ from version 0.8). The\nnew format is meant to improve the usability of the system in downstream analysis pipelines. For convenience we also\nlist here the results of the new projection pipeline added in version 0.9.*. Versions 0.9.1 and 0.9.2 contain some bug\nfixes. Version 0.9.3 allows specifying target UMIs for the metacells, in addition to the target size in cells, and\nadaptively tries to satisfy both. This should produce better-sized metacells \"out of the box\" compared to the 0.9.[0-2]\nversions. The latest published version, 0.9.4, contains minor bug fixes and updates for newer versions of dependency\npackages.\n\nIf you have existing metacell data that was computed using version 0.8 (the current published version you will get\nfrom using ``pip install metacells``, you can use the provided\n`conversion script <https://github.com/tanaylab/metacells/blob/master/bin/convert_0.8_to_0.9.py>`_\nscript to migrate your data to the format described below, while preserving any additional annotations you may have\ncreated for your data (e.g. metacells type annotations). The script will not modify your existing data files, so you can\nexamine the results and tweak them if necessary.\n\nIn the upcoming version 0.10 we will migrate from using ``AnnData`` to using ``daf`` to represent the data (``h5ad``\nfiles will still be supported, either directly through an adapter or via a conversion process). 
This will again\nunavoidingly break API compatibility, but will provide many advantages over the restricted ``AnnData`` APIs.\n\nWe apologize for the inconvenience.\n\nMetacells Computation\n.....................\n\nIn theory, the only inputs required for metacell analysis are cell gene profiles with a UMIs count per gene per cell. In\npractice, a key part of the analysis is specifying lists of genes for special treatment. We use the following\nterminology for these lists:\n\n``excluded_gene``, ``excluded_cell`` masks\n    Excluded genes (and/or cells) are totally ignored by the algorithm (e.g. mytochondrial genes, cells with too few\n    UMIs).\n\n    Deciding on the \"right\" list of excluded genes (and cells) is crucial for creating high-quality metacells. We rely\n    on the analyst to provide this list based on prior biological knowledge. To support this supervised task, we provide\n    the ``excluded_genes`` and ``exclude_cells`` functions which implement \"reasonable\" strategies for detecting some\n    (not all) of the genes and cells to exclude. For example, these will exclude any genes found by\n    ``find_bursty_lonely_genes``, (called ``find_noisy_lonely_genes`` in v0.8). Additional considerations might be to\n    use ``relate_genes`` to (manually) exclude genes that are highly correlated with known-to-need-to-be-excluded genes,\n    or exclude any cells that are marked as doublets, etc.\n\n    Currently the 1st step of the processing must be to create a \"clean\" data set which lacks the excluded genes and\n    cells (e.g. using ``extract_clean_data``). When we switch to ``daf`` we'll just stay with the original data set and\n    apply the exclusion masks to the rest of the algorithm.\n\n``lateral_gene`` mask\n    Lateral genes are forbidden from being selected for computing cells similarity (e.g., cell cycle genes). In version\n    0.8 these were called \"forbidden\" genes. 
Lateral genes are still counted towards the total UMIs count when computing\n    gene expression levels for cells similarity. In addition, lateral genes are still used to compute deviant (outlier)\n    cells. That is, each computed metacell should still have a consistent gene expression level even for lateral genes.\n\n    The motivation is that we don't want the algorithm to even try to create metacells based on these genes. Since these\n    genes may be very strong (again, cell cycle), they would overcome the cell-type genes we are interested in,\n    resulting in for example an \"M-state\" metacell which combines cells from several (similar) cell types.\n\n    Deciding on the \"right\" list of lateral genes is crucial for creating high-quality metacells. We rely on the analyst\n    to provide this list based on prior biological knowledge. To support this supervised task, we provide the\n    ``relate_genes`` pipeline for identifying genes closely related to known lateral genes, so they can be added to the\n    list.\n\n``noisy_gene`` mask\n    Noisy genes are given more freedom when computing deviant (outlier) cells. That is, we don't expect the expression\n    level of such genes in the cells in the same metacell to be as consistent as we do for regular (non-noisy) genes.\n    Note this isn't related to the question of whether the gene is lateral of not. That is, a gee maybe lateral, noisy,\n    both, or neither.\n\n    The motivation is that some genes are inherently bursty and therefore cause many cells which are otherwise a good\n    match for their metacell to be marked as deviant (outliers). An indication for this is by examining the\n    ``deviant_fold`` matrix (see below).\n\n    Deciding on the \"right\" list of noisy genes is again crucial for creating high-quality metacells (and minimizing the\n    fraction of outlier cells). 
Again we rely on the analyst here,\n\nHaving determined the inputs and possibly tweaking the hyper-parameters (a favorite one is the ``target_metacell_size``,\nwhich by default is 160K UMIs; this may be reduced for small data sets and may be increased for larger data sets), one\ntypically runs ``divide_and_conquer_pipeline`` to obtain the following:\n\n``metacell`` (index) vs. ``metacell_name`` (string) per cell\n    The result of computing metacells for a set of cells with the above assigns each cell a metacell index. We also give\n    each metacell a name of the format ``M<index>.<checksum>`` where the checksum reflects the cells grouped into the\n    metacell. This protects the analyst from mistakenly applying metadata assigned to metacells from an old computation\n    to different newly computed metacells.\n\n    We provide functions (``convey_obs_to_group``, ``convey_group_to_obs``) for conveying between per-cell and\n    per-metacell annotations, which all currently use the metacell integer indices (this will change when we switch to\n    ``daf``). The metacell string names are safer to use, especially when slicing the data.\n\n``dissolve`` cells mask\n    Whether the cell was in a candidate matecall that was dissolved due to being too small (too few cells and/or total\n    UMIs). This may aid quality control when there are a large number of outliers; lowering the ``target_metacell_size``\n    may help avoid this.\n\n``selected_gene`` mask\n    Whether each gene was ever selected to be used to compute the similarity between cells to compute the metacells.\n    When using the divide-and-conquer algorithm, this mask is different for each pile (especially in the second phase\n    when piles are homogeneous). This mask is the union of all the masks used in all the piles. It is useful for\n    ensuring no should-be-lateral genes were selected as this would reduce the quality of the metacells. 
If such genes\n    exist, add them to the ``lateral_gene`` mask and recompute the metacells.\n\nHaving computed the metacells, the next step is to run ``collect_metacells`` to create a new ``AnnData`` object for them\n(when using ``daf``, they will be created in the same dataset for easier analysis), which will contain all the per-gene\nmetadata, and also:\n\n``X`` per gene per metacell\n    Once the metacells have been computed (typically using ``divide_and_conquer_pipeline``), we can collect the gene\n    expression levels profile for each one. The main motivation for computing metacells is that they allow for a robust\n    estimation of the gene expression level, and therefore we by default compute a matrix of gene fractions (which sum\n    to 1) in each metacell, rather than providing a UMIs count for each. This simplifies the further analysis of the\n    computed metacells (this is known as ``e_gc`` in the old R metacells package).\n\n    Note that the expression level of noisy genes is less reliable, as we do not guarantee the cells in each metacell\n    have a consistent expression level for such genes. Our estimator therefore uses a normal weighted mean for most\n    genes and a normalized geometric mean for the noisy gene. Since the sizes of the cells collected into the same\n    metacell may vary, our estimator also ensures one large cell doesn't dominate the results. That is, the computed\n    fractions are *not* simply \"sum of the gene UMIs in all cells divided by the sum of all gene UMIs in all cells\".\n\n``grouped`` per metacell\n    The number of cells grouped into each metacell.\n\n``total_umis`` per metacell, and per gene per metacell\n    We still provide the total UMIs count for each each gene for each cell in each metacell, and the total UMIs in each\n    metacell. 
Note that the estimated fraction of each gene in the metacell is *not* its total UMIs divided by the\n    metacell's total UMIs; the actual estimator is more complex.\n\n    The total UMIs are important to ensure that analysis is meaningful. For example, comparing expression levels of\n    lowly-expressed genes in two metacells will yield wildly inaccurate results unless a sufficient number of UMIs were\n    used (the sum of UMIs of the gene in both compared metacells). The functions provided here for computing fold\n    factors (log base 2 of the ratio) and related comparisons automatically ignore cases when this sum is below some\n    threshold (40) by considering the effective fold factor to be 0 (that is, \"no difference\").\n\n``metacells_level`` per cell or metacell\n    This is 0 for rare gene module metacells, 1 for metacells computed from the main piles in the 2nd divide-and-conquer\n    phase and 2 for metacells computed for their outliers.\n\nIf using ``divide_and_conquer_pipeline``, the following are also computed (but not by the simple\n``compute_divide_and_conquer_metacells``:\n\n``rare_gene_module_<N>`` mask (for N = 0, ...)\n    A mask of the genes combined into each of the detected \"rare gene modules\". 
This is done in (expensive)\n    pre-processing before the full divide-and-conquer algorithm to increase the sensitivity of the method, by creating\n    metacells computed only from cells that express each rare gene module.\n\n``rare_gene`` mask\n    A mask of all the genes in all the rare gene modules, for convenience.\n\n``rare_gene_module`` per cell or metacell\n    The index of the rare gene module each cell or metacell expresses (or negative for the common case it expresses none\n    of them).\n\n``rare_cell``, ``rare_metacell`` masks\n    A mask of all the cells or metacells expressing any of the rare gene modules, for convenience.\n\nIn theory one is free to go use the metacells for further analysis, but it is prudent to perform quality control first.\nOne obvious measure is the number of outlier cells (with a negative metacell index and a metacell name of ``Outliers``).\nIn addition, one should compute and look at the following (an easy way to compute all of them at once is to call\n``compute_for_mcview``, this will change in the future):\n\n``most_similar``, ``most_similar_name`` per cell (computed by ``compute_outliers_most_similar``)\n    For each outlier cell (whose metacell index is ``-1`` and metacell name is ``Outliers``), the index and name of the\n    metacell which is the \"most similar\" to the cell (has highest correlation).\n\n``deviant_fold`` per gene per cell (computed by ``compute_deviant_folds``)\n    For each cell, for each gene, the ``deviant_fold`` holds the fold factor (log base 2) between the expression level\n    of the gene in the cell and the metacell it belongs to (or the most similar metacell for outlier cells). 
This uses\n    the same (strong) normalization factor we use when computing deviant (outlier) cells, so for outliers, you should\n    see some (non-excluded, non-noisy) genes with a fold factor above 3 (8x), or some (non-excluded, noisy) genes with a\n    fold factor above 5 (32x), which justify why we haven't merged that cell into a metacell; for cells grouped into\n    metacells, you shouldn't see (many) such genes. If there is a large number of outlier cells and a few non-noisy\n    genes have a high fold factor for many of them, you should consider marking these genes as noisy and recomputing the\n    metacells. If they are already marked as noisy, you may want to completely exclude them.\n\n``inner_fold`` per gene per metacell (computed by ``compute_inner_folds``)\n    For each metacell, for each gene, the ``inner_fold`` is the strongest (highest absolute value) ``deviant_fold`` of\n    any of the cells contained in the metacell. Both this and the ``inner_stdev_log`` below can be used for quality\n    control over the consistency of the gene expression in the metacell.\n\n``significant_inner_folds_count`` per gene\n    For each gene, the number of metacells in which there's at least one cell with a high ``deviant_fold`` (that is,\n    where the ``inner_fold`` is high). This helps in identifying troublesome genes, which can be then marked as noisy,\n    lateral or even excluded, depending on their biological significance.\n\n``inner_stdev_log`` per gene per metacell (computed by ``compute_inner_stdev_logs``)\n    For each metacell, for each gene, the standard deviation of the log (base 2) of the fraction of the gene across the\n    cells of the metacell. Ideally, the standard deviation should be ~1/3rd of the ``deviants_min_gene_fold_factor``\n    (which is ``3`` by default), indicating that (all)most cells are within that maximal fold factor. In practice we may\n    see higher values - the lower, the better. 
Both this and the ``inner_fold`` above can be used for quality control over the consistency of the gene expression in the metacell.\n\n``marker_gene`` mask (computed by ``find_metacells_marker_genes``)\n    Given the computed metacells, we can identify genes that have a sufficient number of effective UMIs (in some\n    metacells) and also have a wide range of expressions (between different metacells). These genes serve as markers for\n    identifying the \"type\" of the metacell (or, more generally, the \"gene programs\" that are active in each metacell).\n\n    Typically analysis groups the marker genes into \"gene modules\" (or, more generally, \"gene programs\"), and then use\n    the notion of \"type X expresses the gene module/programs Y, Z, ...\". As of version 0.9, collecting such gene modules\n    (or programs) is left to the analyst with little or no direct support in this package, other than providing the rare\n    gene modules (which by definition would apply only to a small subset of the metacells).\n\n``x``, ``y`` per metacell (computed by ``compute_umap_by_markers``)\n    A common and generally effective way to visualize the computed metacells is to project them to a 2D view. Currently\n    we do this by giving UMAP a distance metric between metacells based on a logistic function based on the expression\n    levels of the marker genes. In version 0.8 this was based on picking (some of) the selected genes.\n\n    This view is good for quality control. If it forces \"unrelated\" cell types together, this might mean that more genes\n    should be made lateral, or noisy, or even excluded; or maybe the data contains a metacell of doublets; or metacells\n    mixing cells from different types, if too many genes were marked as lateral or noisy, or excluded. 
It takes a\n    surprising small number of such doublet/mixture metacells to mess up the UMAP projection.\n\n    Also, one shouldn't read too much from the 2D layout, as by definition it can't express the \"true\" structure of the\n    data. Looking at specific gene-gene plots gives much more robust insight into the actual differences between the\n    metacell types, identify doublets, etc.\n\n``obs_outgoing_weights`` per metacell per metacell (also computed by ``compute_umap_by_markers``)\n    The (sparse) matrix of weights of the graph used to generate the ``x`` and ``y`` 2D projection. This graph is *very*\n    sparse, that is, has a very low degree for the nodes. It is meant to be used only in conjunction with the 2D\n    coordinates for visualization, and should **not** be used by any downstream analysis to determine which metacells\n    are \"near\" each other for any other purpose.\n\nMetacells Projection\n....................\n\nFor the use case of projecting metacells we use the following terminology:\n\n``atlas``\n    A set of metacells with associated metadata, most importantly a ``type`` annotation per metacell. In addition, the\n    atlas may provide an ``essential_gene_of_<type>`` mask for each type. For a query metacell to successfully project\n    to a given type will require that the query's expression of the type's essential genes matches the atlas. We also\n    use the metadata listed above (specifically, ``lateral_gene``, ``noisy_gene`` and ``marker_gene``).\n\n``query``\n    A set of metacells with minimal associated metadata, specifically without a ``type``. This may optionally contain\n    its own ``lateral_gene``, ``noisy_gene`` and/or even ``marker_gene`` annotations.\n\n``ignored_gene`` mask, ``ignored_gene_of_<type>`` mask\n    A set of genes to not even try to match between the query and the atlas. In general the projection matches only a\n    subset of the genes (that are common to the atlas and the query). 
However, the analyst has the option to force\n    additional genes to be ignored, either in general or only when projecting metacells of a specific type. Manually\n    ignoring specific genes which are known not to match (e.g., due to the query being some experiment, e.g. a knockout\n    or a disease model) can improve the quality of the projection for the genes which do match.\n\nGiven these two input data sets, the ``projection_pipeline`` computes the following (inside the query ``AnnData``\nobject):\n\n``atlas_gene`` mask\n    A mask of the query genes that also exist in the atlas. We match genes by their name; if projecting query data from\n    a different technology, we expect the caller to modify the query gene names to match the atlas before projecting\n    it.\n\n``atlas_lateral_gene``, ``atlas_noisy_gene``, ``atlas_marker_gene``, ``essential_gene_of_<type>`` masks\n    These masks are copied from the atlas to the query (restricting them to the common ``atlas_gene`` subset).\n\n``projected_noisy_gene``\n    The mask of the genes that were considered \"noisy\" when computing the projection. By default this is the union\n    of the noisy atlas and query genes.\n\n``corrected_fraction`` per gene per query metacell\n    For each ``atlas_gene``, its fraction in each query metacell, out of only the atlas genes. This may be further\n    corrected (see below) if projecting between different scRNA-seq technologies (e.g. 10X v2 and 10X v3). For\n    non-``atlas_gene`` this is 0.\n\n``projected_fraction`` per gene per query metacell\n    For each ``atlas_gene``, its fraction in its projection on the atlas. This projection is computed as a weighted\n    average of some atlas metacells (see below), which are all sufficiently close to each other (in terms of gene\n    expression), so averaging them is reasonable to capture the fact the query metacell may be along some position on\n    some gradient that isn't an exact match for any specific atlas metacell. 
For non-``atlas_gene`` this is 0.\n\n``total_atlas_umis`` per query metacell\n    The total UMIs of the ``atlas_gene`` in each query metacell. This is used in the analysis as described for\n    ``total_umis`` above, that is, to ensure comparing expression levels will ignore cases where the total number of\n    UMIs of both compared gene profiles is too low to make a reliable determination. In such cases we take the fold\n    factor to be 0.\n\n``weights`` per query metacell per atlas metacsll\n    The weights used to compute the ``projected_fractions``. Due to ``AnnData`` limitations this is returned as a\n    separate object, but in ``daf`` we should be able to store this directly into the query object.\n\nIn theory, this would be enough for looking at the query metacells and comparing them to the atlas, and to project\nmetadata from the atlas to the query (e.g., the metacell type) using ``convey_atlas_to_query``. In practice, there is\nsignificant amount of quality control one needs to apply before accepting these results, which we compute as follows:\n\n``correction_factor`` per gene\n    If projecting a query on an atlas with different technologies (e.g., 10X v3 to 10X v2), an automatically computed\n    factor we multiplied the query gene fractions by to compensate for the systematic difference between the\n    technologies (1.0 for uncorrected genes and 0.0 for non-``atlas_gene``).\n\n``projected_type`` per query metacell\n    For each query metacell, the best atlas ``type`` we can assign to it based on its projection. Note this does not\n    indicate that the query metacell is \"truly\" of this type; to make this determination one needs to look at the\n    quality control data below.\n\n``projected_secondary_type`` per query metacell\n    In some cases, a query metacell may fail to project well to a single region of the atlas, but does project well to a\n    combination of two distinct atlas regions. 
This may be due to the query metacell containing doublets, of a mixture\n    of cells which match different atlas regions (e.g. due to sparsity of data in the query data set). Either way, if\n    this happens, we place here the type that best describes the secondary region the query metacell was projected to;\n    otherwise this would be the empty string. Note that the ``weights`` matrix above does not distinguish between the\n    regions.\n\n``fitted_gene_of_<type>`` mask\n    For each type, the genes that were projected well from the query to the atlas for most cells of that type; any\n    ``atlas_gene`` outside this mask failed to project well from the query to the atlas for most metacells of this type.\n    For non-``atlas_gene`` this is set to ``False``.\n\n    Whether failing to project well some of the ``atlas_gene`` for most metacells of some ``projected_type`` indicates\n    that they aren't \"truly\" of that type is a decision which only the analyst can make based, on prior biological\n    knowledge of the relevant genes.\n\n``fitted`` mask per gene per query metacell\n    For each ``atlas_gene`` for each query metacell, whether the gene was expected to be projected well, based on the\n    query metacell ``projected_type`` (and the ``projected_secondary_type``, if any). For non-``atlas_gene`` this is set\n    to ``False``. This does not guarantee the gene was actually projected well.\n\n``misfit`` mask per gene per query metacell\n    For each ``atlas_gene`` for each query metacell, whether the ``corrected_fraction`` of the gene was significantly\n    different from the ``projected_fractions`` (that is, whether the gene was not projected well for this metacell). For\n    non-``atlas_gene`` this is set to ``False``, to make it easier to identify problematic genes.\n\n    This is expected to be rare for ``fitted`` genes and common for the rest of the ``atlas_gene``. 
If too many\n    ``fitted`` genes are also ``misfit``, then one should be suspicious whether the query metacell is \"truly\" of the\n    ``projected_type``.\n\n``essential`` mask per gene per query metacell\n    Which of the ``atlas_gene`` were also listed in the ``essential_gene_of_<type>`` for the ``projected_type`` (and\n    also the ``projected_secondary_type``, if any) of each query metacell.\n\n    If an ``essential`` gene is also a ``misfit`` gene, then one should be very suspicious whether the query metacell is\n    \"truly\" of the ``projected_type``.\n\n``projected_correlation`` per query metacell\n    The correlation between the ``corrected_fraction`` and the ``projected_fraction`` expression levels of only the\n    ``fitted`` genes of each query metacell. This serves as a very rough estimator for the quality of the\n    projection for this query metacell (e.g. can be used to compute R^2 values).\n\n    In general we expect high correlation (more than 0.9 in most metacells) since we restricted the ``fitted`` genes\n    mask only to genes we projected well.\n\n``projected_fold`` per gene per query metacell\n    The fold factor between the ``corrected_fraction`` and the ``projected_fraction`` (0 for non-``atlas_gene``). If\n    the absolute value of this is high (e.g., 3 for an 8x ratio) then the gene was not projected well for this\n    metacell.\n\n    It is expected this would have low values for most ``fitted`` genes and high values for the rest of the\n    ``atlas_gene``, but specific values will vary from one query metacell to another. 
This allows the analyst to make a\n    fine-grained determination about the quality of the projection, and/or identify quantitative differences between the\n    query and the atlas (e.g., when studying perturbed systems such as knockouts or disease models).\n\n``similar`` mask per query metacell\n    A conservative determination of whether the query metacell is \"similar\" to its projection on the atlas. This is\n    based on whether the number of ``misfit`` genes for the query metacell is low enough (by default, up to 3 genes), and\n    also that at least 75% of the ``essential`` genes of the query metacell were not ``misfit`` genes. Note that this\n    explicitly allows for a ``projected_secondary_type``, that is, a metacell of doublets will be \"similar\" to the\n    atlas, but a metacell of a novel state missing from the atlas will be \"dissimilar\".\n\n    The final determination of whether to accept the projection is, as always, up to the analyst, based on prior\n    biological knowledge, the context of the collection of the query (and atlas) data sets, etc. The analyst need not\n    (indeed, *should not*) blindly accept the ``similar`` determination without examining the rest of the quality\n    control data listed above.\n\nInstallation\n------------\n\nIn short: ``pip install metacells``. Note that ``metacells`` requires many \"heavy\" dependencies, most notably ``numpy``,\n``pandas``, ``scipy``, and ``scanpy``, which ``pip`` should automatically install for you. If you are running inside a\n``conda`` environment, you might prefer to use it to first install these dependencies, instead of having ``pip`` install\nthem from ``PyPI``.\n\nNote that ``metacells`` only runs natively on Linux and MacOS. To run it on a Windows computer, you must activate\n`Windows Subsystem for Linux <https://docs.microsoft.com/en-us/windows/wsl>`_ and install ``metacells`` within it.\n\nThe metacells package contains extensions written in C++. 
The ``metacells`` distribution provides pre-compiled Python\nwheels for both Linux and MacOS, so installing it using ``pip`` should not require a C++ compilation step.\n\nNote that for X86 CPUs, these pre-compiled wheels were built to use AVX2 (Haswell/Excavator CPUs or newer), and will not\nwork on older CPUs which are limited to SSE. Also, these wheels will not make use of any newer instructions (such as\nAVX512), even if available. While these wheels may not be the perfect match for the machine you are running on, they\nare expected to work well for most machines.\n\nTo see the native capabilities of your machine, you can run ``grep flags /proc/cpuinfo | head -1``, which will give you\na long list of supported CPU features in an arbitrary order, which may include ``sse``, ``avx2``, ``avx512``, etc. You\ncan therefore simply run ``grep avx2 /proc/cpuinfo | head -1`` to test whether AVX2 is supported by your machine.\n\nYou can avoid installing the pre-compiled wheel by running ``pip install metacells --no-binary :all:``. This will force\n``pip`` to compile the C++ extensions locally on your machine, optimizing for its native capabilities, whatever these\nmay be. This will take much longer but may give you *somewhat* faster results (note: the results will **not** be exactly\nthe same as when running the precompiled wheel due to differences in floating-point rounding). Also, this requires you\nto have a C++ compiler which supports C++14 installed (either ``g++`` or ``clang``). Installing a C++ compiler depends\non your specific system (using ``conda`` may make this less painful).\n\nVignettes\n---------\n\nThe latest vignettes can be found `here <https://github.com/tanaylab/metacells-vignettes>`_.\n\nReferences\n----------\n\nPlease cite the references appropriately in case they are used:\n\n.. [1] Ben-Kiki, O., Bercovich, A., Lifshitz, A. et al. Metacell-2: a divide-and-conquer metacell algorithm for scalable\n   scRNA-seq analysis. Genome Biol 23, 100 (2022). 
https://doi.org/10.1186/s13059-022-02667-1\n\n.. [2] Ben-Kiki, O., Bercovich, A., Lifshitz, A. et al. MCProj: metacell projection for interpretable and quantitative\n   use of transcriptional atlases. Genome Biol 24, 220 (2023). https://doi.org/10.1186/s13059-023-03069-7\n\n.. [3] Baran, Y., Bercovich, A., Sebe-Pedros, A. et al. MetaCell: analysis of single-cell RNA-seq data using K-nn graph\n   partitions. Genome Biol 20, 206 (2019). `10.1186/s13059-019-1812-2 <https://doi.org/10.1186/s13059-019-1812-2>`_\n\nLicense (MIT)\n-------------\n\nCopyright \u00a9 2020-2023 Weizmann Institute of Science\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated\ndocumentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the\nrights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit\npersons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the\nSoftware.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE\nWARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR\nCOPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR\nOTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n\n\nHistory\n=======\n\n0.5\n---\n\n* First published version.\n\n0.6\n---\n\n* More robust graph partition.\n* Allow forcing feature genes.\n* Rename \"project\" to \"convey\" to prepare for addition of atlas projection functionality.\n\n0.7.0\n-----\n\n* Switch to new project template.\n* Fix some edge cases in the pipeline.\n* Switch to using ``psutil`` for detecting system resources.\n* Fix binary wheel issues.\n* Give up on using travis-ci.\n\n0.8.0\n-----\n\n* Add inner fold factor computation for metacells quality control.\n* Add deviant fold factor computation for metacells quality control.\n* Add projection of query data onto an atlas.\n* Self-adjusting pile sizes.\n* Add convenience function for computing data for MCView.\n* Better control over filtering using absolute fold factors.\n* Fix edge case in computing noisy lonely genes.\n* Additional outliers certificates.\n* Stricter deviants detection policy\n\n0.9.0\n-----\n\n* Improved and published projection algorithm.\n* Restrict number of rare gene candidates.\n* Tighter control over metacells size and internal quality.\n* Improved divide-and-conquer strategy.\n* Base deviants (outliers) on gaps between cells.\n* Terminology changes (see the README for details).\n* Projection!\n\n0.9.1\n-----\n\n* Fix build for python 3.11.\n* More robust gene selection, KNN graph creation, and metacells partition.\n* More thorough binary wheels.\n\n0.9.2\n-----\n\n* Fix numpy compatibility issue.\n* Fix K of UMAP skeleton KNN graph (broken in 0.9.1).\n\n0.9.3\n-----\n\n* Allow specifying both target UMIs and target size (in cells) for the metacells, and adaptively try to\n  satisfy both. 
This should produce better-sized metacells \"out of the box\" compared to 0.9.[0-2].\n\n0.9.4\n-----\n\n* Fix minor bug in regularization of metacell fractions.\n* Fix issue with canonical sparse matrices after downsampling (probably due to scipy.sparse updates?)\n* Fix using deprecated AnnData APIs.\n",
    "bugtrack_url": null,
    "license": "MIT license",
    "summary": "Single-cell RNA Sequencing Analysis",
    "version": "0.9.4",
    "project_urls": {
        "Homepage": "https://github.com/tanaylab/metacells.git"
    },
    "split_keywords": [
        "metacells"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cc91aee1abfbae841b2b07885cff30d31bceba65711a54c09d3d99c1a722c288",
                "md5": "008c4a3f48d41d0b5c0e1a6acc8e6981",
                "sha256": "81f4d16ce51312aebe993223cba10a77d4c19ccd7e225fa6d2f1efb5dc63a0ab"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp310-cp310-macosx_10_9_x86_64.whl",
            "has_sig": false,
            "md5_digest": "008c4a3f48d41d0b5c0e1a6acc8e6981",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.7",
            "size": 5699677,
            "upload_time": "2023-10-24T13:45:23",
            "upload_time_iso_8601": "2023-10-24T13:45:23.120525Z",
            "url": "https://files.pythonhosted.org/packages/cc/91/aee1abfbae841b2b07885cff30d31bceba65711a54c09d3d99c1a722c288/metacells-0.9.4-cp310-cp310-macosx_10_9_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f86b8aba7f87522b335e28b85071d42259423c8ecbbb12d2df9e295226501a45",
                "md5": "ca53e41ee99c29ef3f95f2ddc0104432",
                "sha256": "fc75fdfc16fefbfcc0c49be34ea669e8c6331ee60fe269cd81c29b29f3c36cfa"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp310-cp310-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "ca53e41ee99c29ef3f95f2ddc0104432",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.7",
            "size": 5528031,
            "upload_time": "2023-10-24T13:45:27",
            "upload_time_iso_8601": "2023-10-24T13:45:27.818198Z",
            "url": "https://files.pythonhosted.org/packages/f8/6b/8aba7f87522b335e28b85071d42259423c8ecbbb12d2df9e295226501a45/metacells-0.9.4-cp310-cp310-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "29579c05848f312a7738b373a9983caaad1485660f126a362d188db79f94ca06",
                "md5": "b24603f559edb03f435b559552d32934",
                "sha256": "ec5180fe07d7f82428e7d9224a27b05ed0325727af2b7df2074a38e2f0371d7e"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "b24603f559edb03f435b559552d32934",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.7",
            "size": 81741571,
            "upload_time": "2023-10-24T13:45:38",
            "upload_time_iso_8601": "2023-10-24T13:45:38.716155Z",
            "url": "https://files.pythonhosted.org/packages/29/57/9c05848f312a7738b373a9983caaad1485660f126a362d188db79f94ca06/metacells-0.9.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d72b0522063bbb6a66a9de957f9bb268367bdf8185a28168e4c912ef08faf417",
                "md5": "c71ff3512520e92b62ef0f3dc29dfb54",
                "sha256": "0928ba838c6bbfa6c48aaeff0d7cb788c3dd00e97d43fa2a3dc4b257e8f8ac35"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp310-cp310-musllinux_1_1_x86_64.whl",
            "has_sig": false,
            "md5_digest": "c71ff3512520e92b62ef0f3dc29dfb54",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.7",
            "size": 82615838,
            "upload_time": "2023-10-24T13:45:54",
            "upload_time_iso_8601": "2023-10-24T13:45:54.384573Z",
            "url": "https://files.pythonhosted.org/packages/d7/2b/0522063bbb6a66a9de957f9bb268367bdf8185a28168e4c912ef08faf417/metacells-0.9.4-cp310-cp310-musllinux_1_1_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5eac5e1844f9d7fb1bdb7fec3b6d0b46c17e1fa774a6ceb1d1cd7aa560e66068",
                "md5": "2a87541bb238adbc986c718f944f382b",
                "sha256": "e094fecd89588113b7649700d8723b4b8e12807606000e24eae285c7acbf74a3"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp311-cp311-macosx_10_9_x86_64.whl",
            "has_sig": false,
            "md5_digest": "2a87541bb238adbc986c718f944f382b",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.7",
            "size": 5701722,
            "upload_time": "2023-10-24T13:46:02",
            "upload_time_iso_8601": "2023-10-24T13:46:02.088956Z",
            "url": "https://files.pythonhosted.org/packages/5e/ac/5e1844f9d7fb1bdb7fec3b6d0b46c17e1fa774a6ceb1d1cd7aa560e66068/metacells-0.9.4-cp311-cp311-macosx_10_9_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "56a41b836574028759a654a48e73c36c3e55330477efe0272386e5fcb87f85cf",
                "md5": "e2da01f2b82e898235033d2b7f4288cc",
                "sha256": "9171e96c78d768628334d3f1cc67f6c2cbee2239caece9a0f0b906a2b3d47e60"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp311-cp311-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "e2da01f2b82e898235033d2b7f4288cc",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.7",
            "size": 5525327,
            "upload_time": "2023-10-24T13:46:06",
            "upload_time_iso_8601": "2023-10-24T13:46:06.936389Z",
            "url": "https://files.pythonhosted.org/packages/56/a4/1b836574028759a654a48e73c36c3e55330477efe0272386e5fcb87f85cf/metacells-0.9.4-cp311-cp311-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "00cbff2aaafda6870af868c8621c65c8c0f79b6768e52e8f851696a0b5bc2c0e",
                "md5": "939aee858ef203b1558d24b1a223ebdc",
                "sha256": "3214ae8b7ce9ee76844d208cd20ec40686df6a623c64922d3c8d2620b60e39e7"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "939aee858ef203b1558d24b1a223ebdc",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.7",
            "size": 82094940,
            "upload_time": "2023-10-24T13:46:19",
            "upload_time_iso_8601": "2023-10-24T13:46:19.470033Z",
            "url": "https://files.pythonhosted.org/packages/00/cb/ff2aaafda6870af868c8621c65c8c0f79b6768e52e8f851696a0b5bc2c0e/metacells-0.9.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "62d904391cd325f5baaef98a962a522c2f0005ec4e98a548fb8c8bc137b5c271",
                "md5": "ce498ebf0a25fdce79dad353bbf5ffcc",
                "sha256": "34e7275b253ce7f3af9dfa465070a0b2bb21a494655cfbffdd79d8a3089ac897"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp311-cp311-musllinux_1_1_x86_64.whl",
            "has_sig": false,
            "md5_digest": "ce498ebf0a25fdce79dad353bbf5ffcc",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.7",
            "size": 82903204,
            "upload_time": "2023-10-24T13:46:35",
            "upload_time_iso_8601": "2023-10-24T13:46:35.211004Z",
            "url": "https://files.pythonhosted.org/packages/62/d9/04391cd325f5baaef98a962a522c2f0005ec4e98a548fb8c8bc137b5c271/metacells-0.9.4-cp311-cp311-musllinux_1_1_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2164a141cad3214837bfd1813dc08f584bc793d8db83586851567c581515163a",
                "md5": "0ef013cf2c399d4a9528d26a79666d86",
                "sha256": "e71d2ac49e84cc9870e0d428c02f65fa594ef61139c6c935b63b447f0f88e606"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp312-cp312-macosx_10_9_x86_64.whl",
            "has_sig": false,
            "md5_digest": "0ef013cf2c399d4a9528d26a79666d86",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.7",
            "size": 5742940,
            "upload_time": "2023-10-24T13:46:41",
            "upload_time_iso_8601": "2023-10-24T13:46:41.943059Z",
            "url": "https://files.pythonhosted.org/packages/21/64/a141cad3214837bfd1813dc08f584bc793d8db83586851567c581515163a/metacells-0.9.4-cp312-cp312-macosx_10_9_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9f42a84a79fa4f2fa408f435411e5967b0b2f8716b4425e043f8db0086e2679c",
                "md5": "b48232127a6d51f56f00a87161284bbd",
                "sha256": "301cef72e09173a71a58323be4952d4642c92149733b3134a5e27f6f547253bb"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp312-cp312-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "b48232127a6d51f56f00a87161284bbd",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.7",
            "size": 5544583,
            "upload_time": "2023-10-24T13:46:46",
            "upload_time_iso_8601": "2023-10-24T13:46:46.109857Z",
            "url": "https://files.pythonhosted.org/packages/9f/42/a84a79fa4f2fa408f435411e5967b0b2f8716b4425e043f8db0086e2679c/metacells-0.9.4-cp312-cp312-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "aaac377afcd66720e85c2feb6a09070c3459c87429b52c6b7b676415d7b5c6bd",
                "md5": "5314b5f60328ab6bc922c2b9a4c43bb3",
                "sha256": "1a0bfff4fa47321ad2f3743a40d585a726703d514c2057b5d007a5c92aeaf2eb"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "5314b5f60328ab6bc922c2b9a4c43bb3",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.7",
            "size": 82103524,
            "upload_time": "2023-10-24T13:46:57",
            "upload_time_iso_8601": "2023-10-24T13:46:57.453085Z",
            "url": "https://files.pythonhosted.org/packages/aa/ac/377afcd66720e85c2feb6a09070c3459c87429b52c6b7b676415d7b5c6bd/metacells-0.9.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1089f37d7f2d14e2e34ab8ec4bd37c4ebe0a40b3d46612dc80a83f8b5455d7ff",
                "md5": "c5122906311c580c092b98e590a6e666",
                "sha256": "11d246ad79a9cf258666d06d2dcff831c5960f5686e544a0c815c6e70967a899"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp312-cp312-musllinux_1_1_x86_64.whl",
            "has_sig": false,
            "md5_digest": "c5122906311c580c092b98e590a6e666",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.7",
            "size": 82362921,
            "upload_time": "2023-10-24T13:47:12",
            "upload_time_iso_8601": "2023-10-24T13:47:12.588731Z",
            "url": "https://files.pythonhosted.org/packages/10/89/f37d7f2d14e2e34ab8ec4bd37c4ebe0a40b3d46612dc80a83f8b5455d7ff/metacells-0.9.4-cp312-cp312-musllinux_1_1_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "72692fd20dd92f8d39dbeb7b07eae00767a2d5629acd8bd1437b7cc2aca99f24",
                "md5": "8766d7648477505ac8a3432e694ee334",
                "sha256": "3c70e34998d9c12a2255eb6a1d3cc6695dbcba70a25abc78467051db756d2da0"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp37-cp37m-macosx_10_9_x86_64.whl",
            "has_sig": false,
            "md5_digest": "8766d7648477505ac8a3432e694ee334",
            "packagetype": "bdist_wheel",
            "python_version": "cp37",
            "requires_python": ">=3.7",
            "size": 5513300,
            "upload_time": "2023-10-24T13:47:21",
            "upload_time_iso_8601": "2023-10-24T13:47:21.071546Z",
            "url": "https://files.pythonhosted.org/packages/72/69/2fd20dd92f8d39dbeb7b07eae00767a2d5629acd8bd1437b7cc2aca99f24/metacells-0.9.4-cp37-cp37m-macosx_10_9_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4b91bfb26e83dcd70af3e47fe4063d5ca9e7d7981be139848cc36b48d7a31e91",
                "md5": "08336c43af7b73acd8bb36042ef0eaab",
                "sha256": "468af6bb396b01315add4676b02673edc9da30ff5123715aa738cf2cb50a931d"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "08336c43af7b73acd8bb36042ef0eaab",
            "packagetype": "bdist_wheel",
            "python_version": "cp37",
            "requires_python": ">=3.7",
            "size": 82108978,
            "upload_time": "2023-10-24T13:47:32",
            "upload_time_iso_8601": "2023-10-24T13:47:32.410204Z",
            "url": "https://files.pythonhosted.org/packages/4b/91/bfb26e83dcd70af3e47fe4063d5ca9e7d7981be139848cc36b48d7a31e91/metacells-0.9.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0a1d24ab89743effa6b8215a9c45f9f1282d7676b8fffbc7947beb54112142ff",
                "md5": "9a8bad27829b4999909ea00eef426b32",
                "sha256": "2dc456aaaa2c735301a8860b9bc2a248e2f8265a1795ae9c2a3a46a8bfa0408e"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp37-cp37m-musllinux_1_1_x86_64.whl",
            "has_sig": false,
            "md5_digest": "9a8bad27829b4999909ea00eef426b32",
            "packagetype": "bdist_wheel",
            "python_version": "cp37",
            "requires_python": ">=3.7",
            "size": 82415885,
            "upload_time": "2023-10-24T13:47:47",
            "upload_time_iso_8601": "2023-10-24T13:47:47.731030Z",
            "url": "https://files.pythonhosted.org/packages/0a/1d/24ab89743effa6b8215a9c45f9f1282d7676b8fffbc7947beb54112142ff/metacells-0.9.4-cp37-cp37m-musllinux_1_1_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d51af618e42aab039ff2839d282bd952add13da92d61b23d9692122c96ed7c55",
                "md5": "72c5758bb45789edc5433d08aed3f826",
                "sha256": "1532c9bf99e18d1498ba940437761c42f07fff045fb5bf98e0ca79c80978910c"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp38-cp38-macosx_10_9_x86_64.whl",
            "has_sig": false,
            "md5_digest": "72c5758bb45789edc5433d08aed3f826",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.7",
            "size": 5700100,
            "upload_time": "2023-10-24T13:47:54",
            "upload_time_iso_8601": "2023-10-24T13:47:54.837335Z",
            "url": "https://files.pythonhosted.org/packages/d5/1a/f618e42aab039ff2839d282bd952add13da92d61b23d9692122c96ed7c55/metacells-0.9.4-cp38-cp38-macosx_10_9_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f3a3f0ae2554659549d52b1a9a57e959b5651b4ac3a9126ee68f51ed74851156",
                "md5": "19d7a97e48bceb332122a1d524c2dceb",
                "sha256": "7266909b6938d6d0e1acb5de86d1f718166a4b50c20440484e8cd446d2cb2030"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp38-cp38-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "19d7a97e48bceb332122a1d524c2dceb",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.7",
            "size": 5528015,
            "upload_time": "2023-10-24T13:47:59",
            "upload_time_iso_8601": "2023-10-24T13:47:59.807954Z",
            "url": "https://files.pythonhosted.org/packages/f3/a3/f0ae2554659549d52b1a9a57e959b5651b4ac3a9126ee68f51ed74851156/metacells-0.9.4-cp38-cp38-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "eb1c8bf7cc6841209c647d98549f72a824c0a600d23534785ebf1adc1cac89e6",
                "md5": "18e93418ed7360837497c624d5f1ac68",
                "sha256": "ce199d0991721934fbbf25361de018b14521e5b872fe9ccf2868dce7ff52f457"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "18e93418ed7360837497c624d5f1ac68",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.7",
            "size": 82199481,
            "upload_time": "2023-10-24T13:48:13",
            "upload_time_iso_8601": "2023-10-24T13:48:13.828767Z",
            "url": "https://files.pythonhosted.org/packages/eb/1c/8bf7cc6841209c647d98549f72a824c0a600d23534785ebf1adc1cac89e6/metacells-0.9.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1f899afcfcaa1eb3c0f8ad5dc171f94a16d9a5a815903829729e3bf23f96d722",
                "md5": "109b46887cd9974e792bd7b437df2830",
                "sha256": "ead9a68e0e510de477c8cc31f694dc6599e4104fc1a613a74bbd31853335ed1d"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp38-cp38-musllinux_1_1_x86_64.whl",
            "has_sig": false,
            "md5_digest": "109b46887cd9974e792bd7b437df2830",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.7",
            "size": 83302307,
            "upload_time": "2023-10-24T13:48:25",
            "upload_time_iso_8601": "2023-10-24T13:48:25.644107Z",
            "url": "https://files.pythonhosted.org/packages/1f/89/9afcfcaa1eb3c0f8ad5dc171f94a16d9a5a815903829729e3bf23f96d722/metacells-0.9.4-cp38-cp38-musllinux_1_1_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dbfa87d1be06fd71c7116270cbcc73057b87bb11c84afe8747efe07eeb8eba99",
                "md5": "93061896a50eeb3ddc67217374b1d4f4",
                "sha256": "bfb49ec13ec624539657ea95787ac1e36b7a9a3c41551c95784f0af0545fb958"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp39-cp39-macosx_10_9_x86_64.whl",
            "has_sig": false,
            "md5_digest": "93061896a50eeb3ddc67217374b1d4f4",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.7",
            "size": 5699915,
            "upload_time": "2023-10-24T13:48:32",
            "upload_time_iso_8601": "2023-10-24T13:48:32.571463Z",
            "url": "https://files.pythonhosted.org/packages/db/fa/87d1be06fd71c7116270cbcc73057b87bb11c84afe8747efe07eeb8eba99/metacells-0.9.4-cp39-cp39-macosx_10_9_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dcde77b55eba84a0957f045fa7d777ba4f344f59c20d3259a1c9ce56b6b07adc",
                "md5": "7f674f59af79351c3c1abc3bee07e4b7",
                "sha256": "af342c0eab1a33f6cf82d98844b48c1c54d0f81c8270cf0b8b6b4cb44ccbfa0e"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp39-cp39-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "7f674f59af79351c3c1abc3bee07e4b7",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.7",
            "size": 5528124,
            "upload_time": "2023-10-24T13:48:37",
            "upload_time_iso_8601": "2023-10-24T13:48:37.901068Z",
            "url": "https://files.pythonhosted.org/packages/dc/de/77b55eba84a0957f045fa7d777ba4f344f59c20d3259a1c9ce56b6b07adc/metacells-0.9.4-cp39-cp39-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "77d056a0a513b59359588876bc660720def23415c1005bf750bf78b657513c75",
                "md5": "c6113a286ae841fa59e84c4b6a494958",
                "sha256": "4cde3f5744ee5b5ae0bb80ffdfe50643ab94ec885e33199d64e229097c129318"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "c6113a286ae841fa59e84c4b6a494958",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.7",
            "size": 81739441,
            "upload_time": "2023-10-24T13:48:47",
            "upload_time_iso_8601": "2023-10-24T13:48:47.979453Z",
            "url": "https://files.pythonhosted.org/packages/77/d0/56a0a513b59359588876bc660720def23415c1005bf750bf78b657513c75/metacells-0.9.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4266ff6b8625f6f79aa4ea2d98b13dacb3f09681b7e319d60f637e21d88bec78",
                "md5": "4ca9e18f0ff06a5ab8eae8936883886b",
                "sha256": "dd8fa83c28e95a4c2456e77b65a8d19f682f783d0cfb70ff40a6ba2fae88bfee"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4-cp39-cp39-musllinux_1_1_x86_64.whl",
            "has_sig": false,
            "md5_digest": "4ca9e18f0ff06a5ab8eae8936883886b",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.7",
            "size": 82657917,
            "upload_time": "2023-10-24T13:49:00",
            "upload_time_iso_8601": "2023-10-24T13:49:00.047467Z",
            "url": "https://files.pythonhosted.org/packages/42/66/ff6b8625f6f79aa4ea2d98b13dacb3f09681b7e319d60f637e21d88bec78/metacells-0.9.4-cp39-cp39-musllinux_1_1_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6f50edf7073ca3efda2fad0fd94afc1115af81be5afed49d399f803c80bca242",
                "md5": "fbd22a0e014924937273fdb53bb9dcc7",
                "sha256": "3e3856a690276701733a819d3e5ce1015bedd5553b59a20dddc02cf3bb1dc609"
            },
            "downloads": -1,
            "filename": "metacells-0.9.4.tar.gz",
            "has_sig": false,
            "md5_digest": "fbd22a0e014924937273fdb53bb9dcc7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 426586,
            "upload_time": "2023-10-24T13:45:10",
            "upload_time_iso_8601": "2023-10-24T13:45:10.914239Z",
            "url": "https://files.pythonhosted.org/packages/6f/50/edf7073ca3efda2fad0fd94afc1115af81be5afed49d399f803c80bca242/metacells-0.9.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-24 13:45:10",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "tanaylab",
    "github_project": "metacells",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "tox": true,
    "lcname": "metacells"
}
        