|logo|
|Release| |PyPI| |MIT licensed| |Travis Build| |GitHub CI| |RTD| |Issues|
|CoreInfra| |FAIRSoft|
|Paper I| |Paper II|
Description
===========
This repo contains a suite of codes to calculate correlation functions and
other clustering statistics for **simulated** galaxies in a cosmological box (co-moving XYZ)
and on **observed** galaxies with on-sky positions (RA, DEC, CZ). Read the
documentation on `corrfunc.rtfd.io <http://corrfunc.rtfd.io/>`_.
Why Should You Use It
======================
1. **Fast** Theory pair-counting is **7x** faster than ``SciPy cKDTree``, and at least **2x** faster than all existing public codes.
2. **OpenMP Parallel** All pair-counting can be run in parallel (with strong scaling efficiency >~ 95% up to 10 cores).
3. **Python Extensions** Python extensions allow you to do the compute-heavy bits using C while retaining all of the user-friendliness of Python.
4. **Weights** All correlation functions now support *arbitrary, user-specified* weights for individual points.
5. **Modular** The code is written in a modular fashion and is easily extensible to compute arbitrary clustering statistics.
6. **Future-proof** As we get access to newer instruction-sets, the codes will get updated to use the latest and greatest CPU features.
*If you use the codes for your analysis, please star this repo -- that helps us keep track of the number of users.*
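For intuition about the weighting scheme: with Corrfunc's default ``pair_product`` weight type, each pair contributes the product of its two points' weights instead of 1. Below is a minimal brute-force numpy sketch of that bookkeeping (illustration only; the helper name is invented and this is not Corrfunc's API, which handles weights inside the kernels):

```python
import numpy as np

def weighted_pair_counts(pos, w, rbins):
    """Brute-force weighted pair counts: each unique pair (i, j)
    contributes w[i] * w[j] (the 'pair_product' idea), binned by
    the 3-d pair separation."""
    diff = pos[:, None, :] - pos[None, :, :]        # (N, N, 3) separations
    r = np.sqrt((diff ** 2).sum(axis=-1))           # (N, N) distances
    iu = np.triu_indices(len(pos), k=1)             # unique pairs only
    pair_w = (w[:, None] * w[None, :])[iu]
    counts, _ = np.histogram(r[iu], bins=rbins, weights=pair_w)
    return counts

rng = np.random.default_rng(0)
pos = rng.random((200, 3)) * 100.0                  # points in a 100^3 box
w = rng.random(200)                                 # arbitrary per-point weights
rbins = np.linspace(5.0, 50.0, 6)
counts = weighted_pair_counts(pos, w, rbins)        # one weighted count per bin
```

In Corrfunc itself you would instead pass a ``weights`` array (and, if desired, ``weight_type='pair_product'``) to the pair counters and let the vectorized kernels do this internally.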
Benchmark against Existing Codes
================================
Please see this
`gist <https://gist.github.com/manodeep/cffd9a5d77510e43ccf0>`__ for
some benchmarks with current codes. If you have a pair-counter that you would like to compare, please add in a corresponding function and update the timings.
Installation
============
Pre-requisites
--------------
1. ``make >= 3.80``
2. OpenMP capable compiler like ``icc``, ``gcc >= 4.6`` or ``clang >= 3.7``. If
   not available, please disable the ``USE_OMP`` option in
   ``theory.options`` and ``mocks.options``. On an HPC cluster, consult the cluster
   documentation for how to load a compiler (often ``module load gcc`` or similar).
   If you are using Corrfunc with Anaconda Python, then ``conda install gcc`` (MAC/linux)
   should work. On MAC, ``(sudo) port install gcc5`` is also an option.
3. ``gsl >= 2.4``. On an HPC cluster, consult the cluster documentation
   (often ``module load gsl`` will work). With Anaconda Python, use
   ``conda install -c conda-forge gsl`` (MAC/linux). On MAC,
   ``(sudo) port install gsl`` is also an option.
4. ``python >= 2.7`` or ``python>=3.4`` for compiling the CPython extensions.
5. ``numpy>=1.7`` for compiling the CPython extensions.
Method 1: Source Installation (Recommended)
-------------------------------------------
::

    $ git clone https://github.com/manodeep/Corrfunc.git
    $ cd Corrfunc
    $ make
    $ make install
    $ python -m pip install . [--user]

    $ make tests  # run the C tests
    $ python -m pip install pytest
    $ python -m pytest  # run the Python tests
Assuming you have ``gcc`` in your ``PATH``, ``make`` and
``make install`` should compile and install the C libraries + Python
extensions within the source directory. If you would like to install the
CPython extensions in your environment, then
``python -m pip install . [--user]`` should be sufficient. If you are primarily
interested in the Python interface, you can condense all of the steps
by using ``python -m pip install . [--user] --install-option="CC=yourcompiler"``
after ``git clone [...]`` and ``cd Corrfunc``.
Compilation Notes
~~~~~~~~~~~~~~~~~
- If Python and/or numpy are not available, then the CPython extensions will not be compiled.
- ``make install`` simply copies files into the ``lib``, ``bin`` and ``include`` sub-directories. You do not need ``root`` permissions.
- The default compiler on MAC is set to ``clang``. If you want to specify a different compiler, call ``make CC=yourcompiler``, ``make install CC=yourcompiler``, ``make tests CC=yourcompiler``, etc. If you want to permanently change the default compiler, please edit the `common.mk <common.mk>`__ file in the base directory.
- If you are directly using ``python -m pip install . [--user] --install-option="CC=yourcompiler"``, please run a ``make distclean`` beforehand (especially if switching compilers).
- Please note that Corrfunc is compiled with optimizations for the architecture
  it is compiled on. That is, it uses ``gcc -march=native`` or similar.
  For this reason, please try to compile Corrfunc on the architecture it will
  be run on (usually this is only a concern in heterogeneous compute environments,
  like an HPC cluster with multiple node types). In many cases, you can
  compile on a more capable architecture (e.g. with AVX-512 support) and then
  run on a less capable architecture (e.g. with only AVX2), because the
  runtime dispatch will select the appropriate kernel. But the non-kernel
  elements of Corrfunc may emit AVX-512 instructions due to ``-march=native``.
  If an ``Illegal instruction`` error occurs, then you'll need to recompile
  on the target architecture.
Installation notes
~~~~~~~~~~~~~~~~~~
If compilation went smoothly, please run ``make tests`` to ensure the
code is working correctly. Depending on the hardware and compilation
options, the tests might take more than a few minutes. *Note that the
tests are exhaustive and not traditional unit tests*.
For Python tests, please run ``python -m pip install pytest`` and ``python -m pytest``
from the Corrfunc root dir.
While we have tried to ensure that the package compiles and runs out of
the box, cross-platform compatibility turns out to be incredibly hard.
If you run into any issues during compilation and you have all of the
pre-requisites, please see the `FAQ <FAQ>`__ or `email
the Corrfunc mailing list <mailto:corrfunc@googlegroups.com>`__. Also, feel free to create a new issue
with the ``Installation`` label.
Method 2: pip installation
--------------------------
The Python package is directly installable via ``python -m pip install Corrfunc``. However, in that case you will lose the ability to recompile the code. This is usually fine if you are only using the Python interface and are on a single machine, like a laptop. For usage on a cluster or another environment with multiple CPU architectures, you may find it more useful to use the Source Installation method above, in case you need to compile for a different architecture later.
Testing a pip-installed Corrfunc
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can check that a pip-installed Corrfunc is working with:
::

    $ python -m pytest --pyargs Corrfunc
The pip installation does not include all of the test data contained in the main repo,
since it would total over 100 MB and the tests that generate on-the-fly data are similarly
exhaustive. pytest will mark tests where the data files are not available as "skipped".
If you would like to run the data-based tests, please use the Source Installation method.
OpenMP on OSX
--------------
Automatically detecting OpenMP support from the compiler and the runtime is a
bit tricky. If you run into any issues compiling (or running) with OpenMP,
please refer to the `FAQ <FAQ>`__ for potential solutions.
Clustering Measures on simulated galaxies
=========================================
Input data
----------
The input galaxies (or any discrete distribution of points) are derived from a
simulation. For instance, the galaxies could be the result of a Halo Occupation
Distribution (HOD) model, a Subhalo Abundance Matching (SHAM) model, a
Semi-Empirical Model (SEM), or a Semi-Analytic Model (SAM), etc. The input set of
points can also be the dark matter halos, or the dark matter particles from
a cosmological simulation. The input set of points is expected to have
positions specified in Cartesian XYZ.
Types of available clustering statistics
----------------------------------------
All codes that work on cosmological boxes with co-moving positions are
located in the ``theory`` directory. The various clustering measures
are:
1. ``DD`` -- Measures auto/cross-correlations between two boxes.
The boxes do not need to be cubes.
2. ``xi`` -- Measures 3-d auto-correlation in a cubic cosmological box.
Assumes PERIODIC boundary conditions.
3. ``wp`` -- Measures the projected auto-correlation function in a
   cubic cosmological box. Assumes PERIODIC boundary conditions.
4. ``DDrppi`` -- Measures the auto/cross correlation function between
   two boxes, binned in ``rp`` and ``pi``. The boxes do not need to be cubes.

5. ``DDsmu`` -- Measures the auto/cross correlation function between
   two boxes, binned in ``s`` and ``mu``. The boxes do not need to be cubes.
6. ``vpf`` -- Measures the void probability function + counts-in-cells.
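For intuition, ``DD``-style pair counting with periodic boundaries can be sketched in brute-force numpy (fine for a few hundred points; Corrfunc's gridded, vectorized kernels exist precisely because this O(N^2) approach does not scale). The helper name here is invented for illustration:

```python
import numpy as np

def brute_force_dd(pos, boxsize, rbins):
    """Count unique pairs per radial bin, applying the minimum-image
    convention for periodic boundary conditions."""
    diff = pos[:, None, :] - pos[None, :, :]
    diff -= boxsize * np.round(diff / boxsize)      # periodic wrap
    r = np.sqrt((diff ** 2).sum(axis=-1))
    iu = np.triu_indices(len(pos), k=1)             # unique pairs only
    npairs, _ = np.histogram(r[iu], bins=rbins)
    return npairs

rng = np.random.default_rng(1)
boxsize = 100.0
pos = rng.random((300, 3)) * boxsize
rbins = np.logspace(np.log10(1.0), np.log10(25.0), 8)
npairs = brute_force_dd(pos, boxsize, rbins)        # pair counts per r bin
```

Against a small test set like this, the counts can be compared directly with the output of ``Corrfunc.theory.DD``.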
Clustering measures on observed galaxies
========================================
Input data
----------
The input galaxies are typically observed galaxies coming from a large-scale
galaxy survey. In addition, simulated galaxies that have been projected onto the sky
(i.e., where observational systematics have been incorporated and on-sky
positions have been generated) can also be used. We generically refer to both
these kinds of galaxies as "mocks".
The input galaxies are expected to have positions specified in spherical
co-ordinates with at least right ascension (RA) and declination (DEC).
For spatial correlation functions, an approximate "co-moving" distance
(speed of light multiplied by redshift, CZ) is also required.
Types of available clustering statistics
----------------------------------------
All codes that work on mock catalogs (RA, DEC, CZ) are located in the
``mocks`` directory. The various clustering measures are:
1. ``DDrppi_mocks`` -- The standard auto/cross correlation between two data
sets. The outputs, DD, DR and RR can be combined using ``wprp`` to
produce the Landy-Szalay estimator for `wp(rp)`.
2. ``DDsmu_mocks`` -- The standard auto/cross correlation between two data
sets. The outputs, DD, DR and RR can be combined using the Python utility
``convert_3d_counts_to_cf`` to produce the Landy-Szalay estimator for `xi(s, mu)`.
3. ``DDtheta_mocks`` -- Computes the angular correlation function between two data
   sets. The outputs from ``DDtheta_mocks`` need to be combined with
   ``wtheta`` to get the full `\omega(\theta)`.
4. ``vpf_mocks`` -- Computes the void probability function on mocks.
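To make the angular case concrete, here is a brute-force numpy sketch of the kind of counts ``DDtheta_mocks`` produces: unique pairs binned by on-sky separation ``\theta``. The helper name is invented, and taking ``arccos`` of the dot product of unit vectors is just one way to get the separations:

```python
import numpy as np

def brute_force_ddtheta(ra_deg, dec_deg, theta_bins_deg):
    """Count unique pairs per angular bin. Separations come from the
    dot product of unit vectors on the sphere."""
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    xyz = np.stack([np.cos(dec) * np.cos(ra),
                    np.cos(dec) * np.sin(ra),
                    np.sin(dec)], axis=-1)
    cos_t = np.clip(xyz @ xyz.T, -1.0, 1.0)         # guard against rounding
    theta = np.degrees(np.arccos(cos_t))
    iu = np.triu_indices(len(ra), k=1)              # unique pairs only
    counts, _ = np.histogram(theta[iu], bins=theta_bins_deg)
    return counts

rng = np.random.default_rng(2)
npts = 500
ra = rng.uniform(0.0, 360.0, npts)                  # uniform on the sphere
dec = np.degrees(np.arcsin(rng.uniform(-1.0, 1.0, npts)))
theta_bins = np.linspace(0.1, 10.0, 6)              # degrees
counts = brute_force_ddtheta(ra, dec, theta_bins)
```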
Science options
===============
If you plan to use the command-line, then you will have to specify the
code runtime options at compile-time. For theory routines, these options
are in the file `theory.options <theory.options>`__ while for the mocks, these options are
in file `mocks.options <mocks.options>`__.
**Note** All options can be specified at
runtime if you use the Python interface or the static libraries. Each of
the following ``Makefile`` options has a corresponding entry for the runtime
libraries.
Theory (in `theory.options <theory.options>`__)
-------------------------------------------------
1. ``PERIODIC`` (ignored in case of wp/xi) -- switches periodic boundary
conditions on/off. Enabled by default.
2. ``OUTPUT_RPAVG`` -- switches on output of ``<rp>`` in each ``rp``
bin. Can be a massive performance hit (~ 2.2x in case of wp).
Disabled by default.
Mocks (in `mocks.options <mocks.options>`__)
----------------------------------------------
1. ``OUTPUT_RPAVG`` -- switches on output of ``<rp>`` in each ``rp``
bin for ``DDrppi_mocks``. Enabled by default.
2. ``OUTPUT_THETAAVG`` -- switches on output of ``<\theta>`` in each ``\theta`` bin.
   Can be extremely slow (~5x) depending on compiler and CPU capabilities.
   Disabled by default.
3. ``LINK_IN_DEC`` -- creates binning in declination for ``DDtheta_mocks``. Please
   check that for your desired ``\theta`` limits, this binning does not
   produce incorrect results (due to numerical precision). Generally speaking,
   if your ``\thetamax`` (the max. ``\theta`` to consider pairs within) is too
   small (probably less than 1 degree), then you should check with and without
   this option. Errors are typically at the sub-percent level.
4. ``LINK_IN_RA`` -- creates binning in RA once binning in DEC has been
   enabled for ``DDtheta_mocks``. Same numerical issues as ``LINK_IN_DEC``.
5. ``FAST_ACOS`` -- Relevant only when ``OUTPUT_THETAAVG`` is enabled for
   ``DDtheta_mocks``. Disabled by default. An ``arccos`` is required to
   calculate ``<\theta>``. In the absence of a vectorized ``arccos`` (the Intel
   compiler ``icc`` provides one via the Intel Short Vector Math Library), this
   calculation is extremely slow. However, we can approximate ``arccos``
   using polynomials (with the `Remez Algorithm <https://en.wikipedia.org/wiki/Remez_algorithm>`_).
   The approximations are taken from implementations released by `Geometric Tools <http://geometrictools.com/>`_.
   Depending on the level of accuracy desired, this implementation of ``fast acos``
   can be tweaked in the file `utils/fast_acos.h <utils/fast_acos.h>`__. An alternate, less
   accurate implementation is already present in that file. Please check that the loss of
   precision is not important for your use-case.
6. ``COMOVING_DIST`` -- Currently there is no support in ``Corrfunc`` for different cosmologies. However, the
   mocks routines ``DDrppi_mocks`` and ``vpf_mocks`` require cosmology parameters to convert between
   redshift and co-moving distance. Both ``DDrppi_mocks`` and ``vpf_mocks`` expect to receive a ``redshift`` array
   as input; however, with this option enabled, the ``redshift`` array will be assumed to contain already-converted
   co-moving distances. So, if you have redshifts and want to use an arbitrary cosmology, then convert the redshifts
   into co-moving distances, enable this option, and pass the co-moving distance array into the routines.
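As a sketch of the conversion step described above: for a flat LCDM cosmology (the ``omega_m`` value below is an assumed example, not a Corrfunc default), the co-moving distance follows from numerically integrating ``1/E(z)``; the resulting array is what you would pass in place of the ``redshift``/``CZ`` input with ``COMOVING_DIST`` enabled:

```python
import numpy as np

C_KMS = 299792.458   # speed of light in km/s

def comoving_distance(z, omega_m=0.3):
    """Co-moving distance in Mpc/h for a flat LCDM cosmology (omega_m is
    an assumed example value), via trapezoidal integration of 1/E(z)."""
    z = np.atleast_1d(z).astype(float)
    zgrid = np.linspace(0.0, z.max(), 4096)
    inv_E = 1.0 / np.sqrt(omega_m * (1.0 + zgrid) ** 3 + (1.0 - omega_m))
    dz = zgrid[1] - zgrid[0]
    # cumulative trapezoid: D_C(z) = (c/H0) * int_0^z dz'/E(z'), H0 = 100h km/s/Mpc
    cum = np.concatenate([[0.0], np.cumsum(0.5 * (inv_E[1:] + inv_E[:-1]) * dz)])
    return (C_KMS / 100.0) * np.interp(z, zgrid, cum)

redshifts = np.array([0.1, 0.3, 0.5])
dist = comoving_distance(redshifts)   # Mpc/h; pass this array instead of
                                      # redshifts, with COMOVING_DIST enabled
```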
Common Code options for both Mocks and Theory
==============================================
1. ``DOUBLE_PREC`` -- switches on calculations in double
   precision. This option is disabled by default in the theory routines
   and enabled by default in the mocks routines.
2. ``USE_OMP`` -- uses OpenMP parallelization. Scaling is great for ``DD``
   (close to perfect scaling up to 12 threads in our tests) and okay (runtime
   becomes constant at ~6-8 threads in our tests) for ``DDrppi`` and ``wp``.
   Enabled by default. The ``Makefile`` will compare the ``CC`` variable with
   known OpenMP-enabled compilers and set compile options accordingly.
   Set in `common.mk <common.mk>`__ by default.
3. ``ENABLE_MIN_SEP_OPT`` -- uses some further optimisations based on the
minimum separation between pairs of cells. Enabled by default.
4. ``COPY_PARTICLES`` -- whether or not to create a copy of the particle
   positions (and weights, if supplied). Enabled by default (copies of the
   particle arrays **are** created).
5. ``FAST_DIVIDE`` -- Disabled by default. Divisions are slow but required
   for ``DDrppi_mocks(r_p,\pi)``, ``DDsmu_mocks(s, \mu)`` and ``DD(s, \mu)``.
   Enabling this option replaces the divisions with a reciprocal
   followed by a Newton-Raphson step. The code will run ~20% faster at the expense
   of some numerical precision. Please check that the loss of precision is not
   important for your use-case.
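To see why this trade-off works, here is a numpy sketch of the reciprocal-plus-Newton-Raphson idea. The SIMD kernels start from a hardware reciprocal-estimate instruction; the half-precision cast below merely stands in for that low-precision starting point. Each ``r <- r * (2 - y*r)`` step roughly doubles the number of correct digits:

```python
import numpy as np

def fast_divide(x, y, refinements=1):
    """Approximate x / y: take a low-precision reciprocal estimate of y,
    then refine it with Newton-Raphson iterations r <- r * (2 - y*r)."""
    r = (1.0 / y).astype(np.float16).astype(np.float64)   # crude estimate
    for _ in range(refinements):
        r = r * (2.0 - y * r)                             # Newton-Raphson step
    return x * r

y = np.linspace(0.7, 7.9, 9)
x = np.ones_like(y)
err0 = np.abs(fast_divide(x, y, refinements=0) - 1.0 / y)
err2 = np.abs(fast_divide(x, y, refinements=2) - 1.0 / y)
# err2 is orders of magnitude smaller than err0
```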
*Optimization for your architecture*
1. The values of ``bin_refine_factor`` and/or ``zbin_refine_factor`` in
   the ``countpairs_*.c`` files control the cache misses and,
   consequently, the runtime. In trial-and-error testing, Manodeep has found
   that values larger than 3 are generally slower for the theory routines but
   can be faster for mocks. However, a different
   combination of 1/2 for ``(z)bin_refine_factor`` might be faster on
   your platform.
2. If you are using the angular correlation function and need ``thetaavg``,
you might benefit from using the INTEL MKL library. The vectorized
trigonometric functions provided by MKL can provide significant speedup.
Running the codes
=================
Read the documentation on `corrfunc.rtfd.io <http://corrfunc.rtfd.io/>`_.
Using the command-line interface
--------------------------------
Navigate to the correct directory. Make sure that the options set in
either `theory.options <theory.options>`__ or `mocks.options <mocks.options>`__ in the root directory are
what you want. If not, edit those two files (and possibly
`common.mk <common.mk>`__), and recompile. Then, you can use the command-line
executables in each individual subdirectory corresponding to the
clustering measure you are interested in. For example, if you want to
compute the full 3-D correlation function, ``\xi(r)``, then run the
executable ``theory/xi/xi``. If you run executables without any arguments,
the program will output a message with all the required arguments.
Calling from C
--------------
Look under the `run_correlations.c <theory/examples/run_correlations.c>`__ and
`run_correlations_mocks.c <mocks/examples/run_correlations_mocks.c>`__ to see examples of
calling the C API directly. If you run the executables,
``run_correlations`` and ``run_correlations_mocks``, the output will
also show how to call the command-line interface for the various
clustering measures.
Calling from Python
-------------------
If all went well, the codes can be directly called from ``python``.
Please see `call_correlation_functions.py <Corrfunc/call_correlation_functions.py>`__ and
`call_correlation_functions_mocks.py <Corrfunc/call_correlation_functions_mocks.py>`__ for examples on how to
use the CPython extensions directly. Here are a few examples:
.. code:: python

    from __future__ import print_function
    import numpy as np
    from Corrfunc.theory import wp

    # Setup the problem for wp
    boxsize = 500.0
    pimax = 40.0
    nthreads = 4

    # Create a fake data-set.
    Npts = 100000
    x = np.float32(np.random.random(Npts))
    y = np.float32(np.random.random(Npts))
    z = np.float32(np.random.random(Npts))
    x *= boxsize
    y *= boxsize
    z *= boxsize

    # Setup the bins
    rmin = 0.1
    rmax = 20.0
    nbins = 20

    # Create the log-spaced bins
    rbins = np.logspace(np.log10(rmin), np.log10(rmax), nbins + 1)

    # Call wp
    wp_results = wp(boxsize, pimax, nthreads, rbins, x, y, z,
                    verbose=True, output_rpavg=True)

    # Print the results
    print("#############################################################################")
    print("##   rmin       rmax       rpavg        wp          npairs")
    print("#############################################################################")
    print(wp_results)
Author & Maintainers
=====================
Corrfunc was designed and implemented by `Manodeep Sinha <https://github.com/manodeep>`_,
with contributions from `Lehman Garrison <https://github.com/lgarrison>`_,
`Nick Hand <https://github.com/nickhand>`_, and `Arnaud de Mattia <https://github.com/adematti>`_.
Corrfunc is currently maintained by Manodeep Sinha and Lehman Garrison.
Citing
======
If you use ``Corrfunc`` for research, please cite the MNRAS code paper with the following
bibtex entry:
::

    @ARTICLE{2020MNRAS.491.3022S,
        author = {{Sinha}, Manodeep and {Garrison}, Lehman H.},
        title = "{CORRFUNC - a suite of blazing fast correlation functions on the CPU}",
        journal = {\mnras},
        keywords = {methods: numerical, galaxies: general, galaxies: haloes,
            dark matter, large-scale structure of Universe, cosmology: theory},
        year = "2020",
        month = "Jan",
        volume = {491},
        number = {2},
        pages = {3022-3041},
        doi = {10.1093/mnras/stz3157},
        adsurl = {https://ui.adsabs.harvard.edu/abs/2020MNRAS.491.3022S},
        adsnote = {Provided by the SAO/NASA Astrophysics Data System}
    }
If you are using ``Corrfunc v2.3.0`` or later, **and** you benefit from the
enhanced vectorised kernels, then please additionally cite this paper:
::

    @InProceedings{10.1007/978-981-13-7729-7_1,
        author="Sinha, Manodeep and Garrison, Lehman",
        editor="Majumdar, Amit and Arora, Ritu",
        title="CORRFUNC: Blazing Fast Correlation Functions with AVX512F SIMD Intrinsics",
        booktitle="Software Challenges to Exascale Computing",
        year="2019",
        publisher="Springer Singapore",
        address="Singapore",
        pages="3--20",
        isbn="978-981-13-7729-7",
        url={https://doi.org/10.1007/978-981-13-7729-7_1}
    }
Mailing list
============
If you have questions or comments about the package, please ask on the
mailing list: https://groups.google.com/forum/#!forum/corrfunc
LICENSE
=======
Corrfunc is released under the MIT license. Basically, do what you want
with the code, including using it in commercial applications.
Project URLs
============
- Documentation (http://corrfunc.rtfd.io/)
- Source Repository (https://github.com/manodeep/Corrfunc)
- Entry in the Astrophysical Source Code Library (ASCL) |ASCL|
- Zenodo Releases |Zenodo|
.. |logo| image:: https://github.com/manodeep/Corrfunc/blob/master/corrfunc_logo.png
:target: https://github.com/manodeep/Corrfunc
:alt: Corrfunc logo
.. |Release| image:: https://img.shields.io/github/release/manodeep/Corrfunc.svg
:target: https://github.com/manodeep/Corrfunc/releases/latest
:alt: Latest Release
.. |PyPI| image:: https://img.shields.io/pypi/v/Corrfunc.svg
:target: https://pypi.python.org/pypi/Corrfunc
:alt: PyPI Release
.. |MIT licensed| image:: https://img.shields.io/badge/license-MIT-blue.svg
:target: https://raw.githubusercontent.com/manodeep/Corrfunc/master/LICENSE
:alt: MIT License
.. |Travis Build| image:: https://travis-ci.com/manodeep/Corrfunc.svg?branch=master
:target: https://travis-ci.com/manodeep/Corrfunc
:alt: Build Status
.. |GitHub CI| image:: https://github.com/manodeep/Corrfunc/workflows/GitHub%20CI/badge.svg
:target: https://github.com/manodeep/Corrfunc/actions
:alt: GitHub Actions Status
.. |Issues| image:: https://img.shields.io/github/issues/manodeep/Corrfunc.svg
:target: https://github.com/manodeep/Corrfunc/issues
:alt: Open Issues
.. |RTD| image:: https://readthedocs.org/projects/corrfunc/badge/?version=master
:target: http://corrfunc.readthedocs.io/en/master/?badge=master
:alt: Documentation Status
.. |CoreInfra| image:: https://bestpractices.coreinfrastructure.org/projects/5037/badge
:target: https://bestpractices.coreinfrastructure.org/en/projects/5037
:alt: Core Infrastructure Best Practices Status
.. |FAIRSoft| image:: https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F-green
:target: https://fair-software.eu
:alt: Fair Software (EU) Compliance
.. |Paper I| image:: https://img.shields.io/badge/arXiv-1911.03545-%23B31B1B
:target: https://arxiv.org/abs/1911.03545
:alt: Corrfunc Paper I
.. |Paper II| image:: https://img.shields.io/badge/arXiv-1911.08275-%23B31B1B
:target: https://arxiv.org/abs/1911.08275
:alt: Corrfunc Paper II
.. |ASCL| image:: https://img.shields.io/badge/ascl-1703.003-blue.svg?colorB=262255
:target: http://ascl.net/1703.003
:alt: ascl:1703.003
.. |Zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.3634195.svg
:target: https://doi.org/10.5281/zenodo.3634195
Raw data
{
"_id": null,
"home_page": "https://github.com/manodeep/Corrfunc",
"name": "Corrfunc",
"maintainer": "Manodeep Sinha",
"docs_url": null,
"requires_python": ">=2.7,!=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*, <4",
"maintainer_email": "manodeep@gmail.com",
"keywords": "correlation functions,simulations,surveys,galaxies",
"author": "Manodeep Sinha",
"author_email": "manodeep@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/39/df/622884e96b1c0b0eafc2e532ec08e35fa60dc2fd63562c2037382692c5f4/Corrfunc-2.5.2.tar.gz",
"platform": "Linux",
"description": "|logo|\n\n|Release| |PyPI| |MIT licensed| |Travis Build| |GitHub CI| |RTD| |Issues|\n\n|CoreInfra| |FAIRSoft|\n\n|Paper I| |Paper II|\n\n\nDescription\n===========\n\nThis repo contains a suite of codes to calculate correlation functions and\nother clustering statistics for **simulated** galaxies in a cosmological box (co-moving XYZ)\nand on **observed** galaxies with on-sky positions (RA, DEC, CZ). Read the\ndocumentation on `corrfunc.rtfd.io <http://corrfunc.rtfd.io/>`_.\n\nWhy Should You Use it\n======================\n\n1. **Fast** Theory pair-counting is **7x** faster than ``SciPy cKDTree``, and at least **2x** faster than all existing public codes.\n2. **OpenMP Parallel** All pair-counting codes can be done in parallel (with strong scaling efficiency >~ 95% up to 10 cores)\n3. **Python Extensions** Python extensions allow you to do the compute-heavy bits using C while retaining all of the user-friendliness of Python.\n4. **Weights** All correlation functions now support *arbitrary, user-specified* weights for individual points\n5. **Modular** The code is written in a modular fashion and is easily extensible to compute arbitrary clustering statistics.\n6. **Future-proof** As we get access to newer instruction-sets, the codes will get updated to use the latest and greatest CPU features.\n\n*If you use the codes for your analysis, please star this repo -- that helps us keep track of the number of users.*\n\nBenchmark against Existing Codes\n================================\n\nPlease see this\n`gist <https://gist.github.com/manodeep/cffd9a5d77510e43ccf0>`__ for\nsome benchmarks with current codes. If you have a pair-counter that you would like to compare, please add in a corresponding function and update the timings.\n\nInstallation\n============\n\nPre-requisites\n--------------\n\n1. ``make >= 3.80``\n2. OpenMP capable compiler like ``icc``, ``gcc>=4.6`` or ``clang >= 3.7``. 
If\n not available, please disable ``USE_OMP`` option option in\n ``theory.options`` and ``mocks.options``. On a HPC cluster, consult the cluster\n documentation for how to load a compiler (often ``module load gcc`` or similar).\n If you are using Corrfunc with Anaconda Python, then ``conda install gcc`` (MAC/linux)\n should work. On MAC, ``(sudo) port install gcc5`` is also an option.\n3. ``gsl >= 2.4``. On an HPC cluster, consult the cluster documentation\n (often ``module load gsl`` will work). With Anaconda Python, use\n ``conda install -c conda-forge gsl`` (MAC/linux). On MAC, you can use\n ``(sudo) port install gsl`` (MAC) if necessary.\n4. ``python >= 2.7`` or ``python>=3.4`` for compiling the CPython extensions.\n5. ``numpy>=1.7`` for compiling the CPython extensions.\n\nMethod 1: Source Installation (Recommended)\n-------------------------------------------\n\n::\n\n $ git clone https://github.com/manodeep/Corrfunc.git\n $ cd Corrfunc\n $ make\n $ make install\n $ python -m pip install . [--user]\n \n $ make tests # run the C tests\n $ python -m pip install pytest\n $ python -m pytest # run the Python tests\n\nAssuming you have ``gcc`` in your ``PATH``, ``make`` and\n``make install`` should compile and install the C libraries + Python\nextensions within the source directory. If you would like to install the\nCPython extensions in your environment, then\n``python -m pip install . [--user]`` should be sufficient. If you are primarily\ninterested in the Python interface, you can condense all of the steps\nby using ``python -m pip install . [--user] --install-option=\"CC=yourcompiler\"``\nafter ``git clone [...]`` and ``cd Corrfunc``.\n\nCompilation Notes\n~~~~~~~~~~~~~~~~~\n\n- If Python and/or numpy are not available, then the CPython extensions will not be compiled.\n\n- ``make install`` simply copies files into the ``lib/bin/include`` sub-directories. 
You do not need ``root`` permissions\n\n- Default compiler on MAC is set to ``clang``, if you want to specify a different compiler, you will have to call ``make CC=yourcompiler``, ``make install CC=yourcompiler``, ``make tests CC=yourcompiler`` etc. If you want to permanently change the default compiler, then please edit the `common.mk <common.mk>`__ file in the base directory.\n\n- If you are directly using ``python -m pip install . [--user] --install-option=\"CC=yourcompiler\"``, please run a ``make distclean`` beforehand (especially if switching compilers)\n\n- Please note that Corrfunc is compiling with optimizations for the architecture\n it is compiled on. That is, it uses ``gcc -march=native`` or similar.\n For this reason, please try to compile Corrfunc on the architecture it will\n be run on (usually this is only a concern in heterogeneous compute environments,\n like an HPC cluster with multiple node types). In many cases, you can\n compile on a more capable architecture (e.g. with AVX-512 support) then\n run on a less capable architecture (e.g. with only AVX2), because the\n runtime dispatch will select the appropriate kernel. But the non-kernel\n elements of Corrfunc may emit AVX-512 instructions due to ``-march=native``.\n If an ``Illegal instruction`` error occurs, then you'll need to recompile\n on the target architecture.\n\nInstallation notes\n~~~~~~~~~~~~~~~~~~\n\nIf compilation went smoothly, please run ``make tests`` to ensure the\ncode is working correctly. Depending on the hardware and compilation\noptions, the tests might take more than a few minutes. 
*Note that the\ntests are exhaustive and not traditional unit tests*.\n\nFor Python tests, please run ``python -m pip install pytest`` and ``python -m pytest``\nfrom the Corrfunc root dir.\n\nWhile we have tried to ensure that the package compiles and runs out of\nthe box, cross-platform compatibility turns out to be incredibly hard.\nIf you run into any issues during compilation and you have all of the\npre-requisites, please see the `FAQ <FAQ>`__ or `email\nthe Corrfunc mailing list <mailto:corrfunc@googlegroups.com>`__. Also, feel free to create a new issue\nwith the ``Installation`` label.\n\n\nMethod 2: pip installation\n--------------------------\n\nThe Python package is directly installable via ``python -m pip install Corrfunc``. However, in that case you will lose the ability to recompile the code. This usually fine if you are only using the Python interface and are on a single machine, like a laptop. For usage on a cluster or other environment with multiple CPU architectures, you may find it more useful to use the Source Installation method above in case you need to compile for a different architecture later.\n\nTesting a pip-installed Corrfunc\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nYou can check that a pip-installed Corrfunc is working with:\n\n::\n\n $ python -m pytest --pyargs Corrfunc\n\n\nThe pip installation does not include all of the test data contained in the main repo,\nsince it would total over 100 MB and the tests that generate on-the-fly data are similarly\nexhaustive. pytest will mark tests where the data files are not availabe as \"skipped\".\nIf you would like to run the data-based tests, please use the Source Installation method.\n\n\nOpenMP on OSX\n--------------\n\nAutomatically detecting OpenMP support from the compiler and the runtime is a\nbit tricky. 
If you run into any issues compiling (or running) with OpenMP,\nplease refer to the `FAQ <FAQ>`__ for potential solutions.\n\n\nClustering Measures on simulated galaxies\n=========================================\n\nInput data\n----------\n\nThe input galaxies (or any discrete distribution of points) are derived from a\nsimulation. For instance, the galaxies could be a result of an Halo Occupation\nDistribution (HOD) model, a Subhalo Abundance matching (SHAM) model, a\nSemi-Empirical model (SEM), or a Semi-Analytic model (SAM) etc. The input set of\npoints can also be the dark matter halos, or the dark matter particles from\na cosmological simulation. The input set of points are expected to have\npositions specified in Cartesian XYZ.\n\nTypes of available clustering statistics\n----------------------------------------\n\nAll codes that work on cosmological boxes with co-moving positions are\nlocated in the ``theory`` directory. The various clustering measures\nare:\n\n1. ``DD`` -- Measures auto/cross-correlations between two boxes.\n The boxes do not need to be cubes.\n\n2. ``xi`` -- Measures 3-d auto-correlation in a cubic cosmological box.\n Assumes PERIODIC boundary conditions.\n\n3. ``wp`` -- Measures auto 2-d point projected correlation function in a\n cubic cosmological box. Assumes PERIODIC boundary conditions.\n\n4. ``DDrppi`` -- Measures the auto/cross correlation function between\n two boxes. The boxes do not need to be cubes.\n\n5. ``DDsmu`` -- Measures the auto/cross correlation function between\n two boxes. The boxes do not need to be cubes.\n\n6. ``vpf`` -- Measures the void probability function + counts-in-cells.\n\nClustering measures on observed galaxies\n========================================\n\nInput data\n----------\n\nThe input galaxies are typically observed galaxies coming from a large-scale\ngalaxy survey. 
In addition, simulated galaxies that have been projected onto the sky
(i.e., where observational systematics have been incorporated and on-sky
positions have been generated) can also be used. We generically refer to both
these kinds of galaxies as "mocks".

The input galaxies are expected to have positions specified in spherical
co-ordinates with at least right ascension (RA) and declination (DEC).
For spatial correlation functions, an approximate "co-moving" distance
(speed of light multiplied by redshift, CZ) is also required.

Types of available clustering statistics
----------------------------------------

All codes that work on mock catalogs (RA, DEC, CZ) are located in the
``mocks`` directory. The various clustering measures are:

1. ``DDrppi_mocks`` -- The standard auto/cross correlation between two data
   sets. The outputs (DD, DR and RR) can be combined using ``wprp`` to
   produce the Landy-Szalay estimator for `wp(rp)`.

2. ``DDsmu_mocks`` -- The standard auto/cross correlation between two data
   sets. The outputs (DD, DR and RR) can be combined using the Python utility
   ``convert_3d_counts_to_cf`` to produce the Landy-Szalay estimator for `xi(s, mu)`.

3. ``DDtheta_mocks`` -- Computes the angular correlation function between two data
   sets. The outputs from ``DDtheta_mocks`` need to be combined with
   ``wtheta`` to get the full `\omega(\theta)`.

4. ``vpf_mocks`` -- Computes the void probability function on mocks.

Science options
===============

If you plan to use the command-line interface, then you will have to specify the
code runtime options at compile-time. For the theory routines, these options
are in the file `theory.options <theory.options>`__, while for the mocks they are
in the file `mocks.options <mocks.options>`__.

**Note** All options can be specified at
runtime if you use the Python interface or the static libraries.
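As a sketch of what these compile-time options look like: the ``*.options`` files are Makefile fragments in which each option is a single line that can be commented in or out (illustrative syntax only -- consult the shipped files for the exact variable names):

```make
# Illustrative sketch of the theory.options syntax
OPT += -DPERIODIC          # periodic boundary conditions (enabled)
#OPT += -DOUTPUT_RPAVG     # uncomment to enable <rp> output per bin
```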
Each one of
the following ``Makefile`` options has a corresponding entry for the runtime
libraries.

Theory (in `theory.options <theory.options>`__)
-------------------------------------------------

1. ``PERIODIC`` (ignored in case of wp/xi) -- switches periodic boundary
   conditions on/off. Enabled by default.

2. ``OUTPUT_RPAVG`` -- switches on output of ``<rp>`` in each ``rp``
   bin. Can be a massive performance hit (~2.2x in the case of wp).
   Disabled by default.

Mocks (in `mocks.options <mocks.options>`__)
----------------------------------------------

1. ``OUTPUT_RPAVG`` -- switches on output of ``<rp>`` in each ``rp``
   bin for ``DDrppi_mocks``. Enabled by default.

2. ``OUTPUT_THETAAVG`` -- switches on output of ``<theta>`` in each ``theta``
   bin. Can be extremely slow (~5x) depending on compiler and CPU capabilities.
   Disabled by default.

3. ``LINK_IN_DEC`` -- creates binning in declination for ``DDtheta_mocks``. Please
   check that for your desired ``\theta`` limits, this binning does not
   produce incorrect results (due to numerical precision). Generally speaking,
   if your ``\thetamax`` (the max. ``\theta`` to consider pairs within) is too
   small (probably less than 1 degree), then you should check with and without
   this option. Errors are typically at the sub-percent level.

4. ``LINK_IN_RA`` -- creates binning in RA once binning in DEC has been
   enabled for ``DDtheta_mocks``. Same numerical issues as ``LINK_IN_DEC``.

5. ``FAST_ACOS`` -- Relevant only when ``OUTPUT_THETAAVG`` is enabled for
   ``DDtheta_mocks``. Disabled by default. An ``arccos`` is required to
   calculate ``<\theta>``. In the absence of a vectorized ``arccos`` (the Intel
   compiler, ``icc``, provides one via the Intel Short Vector Math Library),
   this calculation is extremely slow.
   However, we can approximate
   ``arccos`` using polynomials (via the `Remez Algorithm <https://en.wikipedia.org/wiki/Remez_algorithm>`_).
   The approximations are taken from implementations released by `Geometric Tools <http://geometrictools.com/>`_.
   Depending on the level of accuracy desired, this implementation of ``fast acos``
   can be tweaked in the file `utils/fast_acos.h <utils/fast_acos.h>`__. An alternate, less
   accurate implementation is already present in that file. Please check that the loss of
   precision is not important for your use-case.

6. ``COMOVING_DIST`` -- Currently there is no support in ``Corrfunc`` for different cosmologies. However, for
   mocks routines like ``DDrppi_mocks`` and ``vpf_mocks``, cosmology parameters are required to convert between
   redshift and co-moving distance. Both ``DDrppi_mocks`` and ``vpf_mocks`` expect to receive a ``redshift`` array
   as input; however, with this option enabled, the ``redshift`` array will be assumed to contain already-converted
   co-moving distances. So, if you have redshifts and want to use an arbitrary cosmology, then convert the redshifts
   into co-moving distances, enable this option, and pass the co-moving distance array into the routines.

Common Code options for both Mocks and Theory
==============================================

1. ``DOUBLE_PREC`` -- switches on calculations in double precision. This
   option is disabled by default for the theory routines and enabled by
   default for the mocks routines.

2. ``USE_OMP`` -- uses OpenMP parallelization. Scaling is great for DD
   (close to perfect scaling up to 12 threads in our tests) and okay (runtime
   becomes constant beyond ~6-8 threads in our tests) for ``DDrppi`` and ``wp``.
   Enabled by default. The ``Makefile`` will compare the `CC` variable with
   known OpenMP-enabled compilers and set compile options accordingly.
   Set in `common.mk <common.mk>`__ by default.

3. ``ENABLE_MIN_SEP_OPT`` -- uses further optimisations based on the
   minimum separation between pairs of cells. Enabled by default.

4. ``COPY_PARTICLES`` -- whether or not to create a copy of the particle
   positions (and weights, if supplied). Enabled by default (copies of the
   particle arrays **are** created).

5. ``FAST_DIVIDE`` -- Disabled by default. Divisions are slow but are required
   for ``DDrppi_mocks(r_p, \pi)``, ``DDsmu_mocks(s, \mu)`` and ``DD(s, \mu)``.
   Enabling this option replaces the divisions with a reciprocal
   followed by a Newton-Raphson refinement. The code will run ~20% faster at the
   expense of some numerical precision. Please check that the loss of precision
   is not important for your use-case.

*Optimization for your architecture*

1. The values of ``bin_refine_factor`` and/or ``zbin_refine_factor`` in
   the ``countpairs_*.c`` files control the cache misses and,
   consequently, the runtime. In trial-and-error tests, Manodeep has found that
   values larger than 3 are generally slower for theory routines but
   can be faster for mocks. However, some different
   combination of 1 or 2 for ``(z)bin_refine_factor`` might be faster on
   your platform.

2. If you are using the angular correlation function and need ``thetaavg``,
   you might benefit from using the Intel MKL library. The vectorized
   trigonometric functions provided by MKL can provide a significant speedup.


Running the codes
=================

Read the documentation on `corrfunc.rtfd.io <http://corrfunc.rtfd.io/>`_.


Using the command-line interface
--------------------------------

Navigate to the correct directory. Make sure that the options set in
either `theory.options <theory.options>`__ or `mocks.options <mocks.options>`__ in the root directory are
what you want. If not, edit those two files (and possibly
`common.mk <common.mk>`__), and recompile.
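Returning to the ``COMOVING_DIST`` option above: if you have redshifts and want a particular cosmology, the redshift-to-co-moving-distance conversion can be sketched in pure ``numpy`` (a flat LCDM cosmology is assumed here; the ``omega_m`` and ``h0`` values are illustrative placeholders, not Corrfunc defaults):

```python
import numpy as np

SPEED_OF_LIGHT = 299792.458  # km/s

def comoving_distance(z, omega_m=0.3, h0=70.0, nsteps=10000):
    """Line-of-sight co-moving distance in Mpc for a flat LCDM
    cosmology, via trapezoidal integration of c / H(z')."""
    z = np.atleast_1d(z).astype(float)
    zgrid = np.linspace(0.0, z.max(), nsteps)
    ez = np.sqrt(omega_m * (1.0 + zgrid) ** 3 + (1.0 - omega_m))
    integrand = SPEED_OF_LIGHT / (h0 * ez)
    # Cumulative trapezoidal integral, then interpolate to the requested z
    cum = np.concatenate(
        ([0.0], np.cumsum(np.diff(zgrid) * 0.5 * (integrand[1:] + integrand[:-1])))
    )
    return np.interp(z, zgrid, cum)

# Convert redshifts once, then (with COMOVING_DIST enabled) pass these
# distances wherever the routines expect the "redshift" array.
dist = comoving_distance(np.array([0.05, 0.1, 0.3]))
```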
Then, you can use the command-line
executables in each individual subdirectory corresponding to the
clustering measure you are interested in. For example, if you want to
compute the full 3-D correlation function, ``\xi(r)``, then run the
executable ``theory/xi/xi``. If you run an executable without any arguments,
the program will output a message listing all of the required arguments.

Calling from C
--------------

Look at `run_correlations.c <theory/examples/run_correlations.c>`__ and
`run_correlations_mocks.c <mocks/examples/run_correlations_mocks.c>`__ for examples of
calling the C API directly. If you run the executables,
``run_correlations`` and ``run_correlations_mocks``, the output will
also show how to call the command-line interface for the various
clustering measures.

Calling from Python
-------------------

If all went well, the codes can be called directly from ``python``.
Please see `call_correlation_functions.py <Corrfunc/call_correlation_functions.py>`__ and
`call_correlation_functions_mocks.py <Corrfunc/call_correlation_functions_mocks.py>`__ for examples of how to
use the CPython extensions directly. Here is a short example:

.. code:: python

    from __future__ import print_function
    import numpy as np
    from Corrfunc.theory import wp

    # Setup the problem for wp
    boxsize = 500.0
    pimax = 40.0
    nthreads = 4

    # Create a fake data-set.
    Npts = 100000
    x = np.float32(np.random.random(Npts))
    y = np.float32(np.random.random(Npts))
    z = np.float32(np.random.random(Npts))
    x *= boxsize
    y *= boxsize
    z *= boxsize

    # Setup the bins
    rmin = 0.1
    rmax = 20.0
    nbins = 20

    # Create the (logarithmically spaced) bins
    rbins = np.logspace(np.log10(rmin), np.log10(rmax), nbins + 1)

    # Call wp
    wp_results = wp(boxsize, pimax, nthreads, rbins, x, y, z,
                    verbose=True, output_rpavg=True)

    # Print the results
    print("###########################################################")
    print("## rmin        rmax        rpavg        wp        npairs")
    print("###########################################################")
    print(wp_results)


Author & Maintainers
=====================

Corrfunc was designed and implemented by `Manodeep Sinha <https://github.com/manodeep>`_,
with contributions from `Lehman Garrison <https://github.com/lgarrison>`_,
`Nick Hand <https://github.com/nickhand>`_, and `Arnaud de Mattia <https://github.com/adematti>`_.
Corrfunc is currently maintained by Manodeep Sinha and Lehman Garrison.

Citing
======

If you use ``Corrfunc`` for research, please cite the MNRAS code paper with the following
bibtex entry:

::

    @ARTICLE{2020MNRAS.491.3022S,
        author = {{Sinha}, Manodeep and {Garrison}, Lehman H.},
        title = "{CORRFUNC - a suite of blazing fast correlation functions on
            the CPU}",
        journal = {\mnras},
        keywords = {methods: numerical, galaxies: general, galaxies:
            haloes, dark matter, large-scale structure of Universe, cosmology:
            theory},
        year = "2020",
        month = "Jan",
        volume = {491},
        number = {2},
        pages = {3022-3041},
        doi = {10.1093/mnras/stz3157},
        adsurl =
        {https://ui.adsabs.harvard.edu/abs/2020MNRAS.491.3022S},
        adsnote = {Provided by the SAO/NASA
            Astrophysics Data System}
    }


If you are using ``Corrfunc v2.3.0`` or later, **and** you benefit from the
enhanced vectorised kernels, then please additionally cite this paper:

::

    @InProceedings{10.1007/978-981-13-7729-7_1,
        author="Sinha, Manodeep and Garrison, Lehman",
        editor="Majumdar, Amit and Arora, Ritu",
        title="CORRFUNC: Blazing Fast Correlation Functions with AVX512F SIMD Intrinsics",
        booktitle="Software Challenges to Exascale Computing",
        year="2019",
        publisher="Springer Singapore",
        address="Singapore",
        pages="3--20",
        isbn="978-981-13-7729-7",
        url={https://doi.org/10.1007/978-981-13-7729-7_1}
    }


Mailing list
============

If you have questions or comments about the package, please post them to the
mailing list: https://groups.google.com/forum/#!forum/corrfunc

LICENSE
=======

Corrfunc is released under the MIT license. Basically, do what you want
with the code, including using it in commercial applications.

Project URLs
============

- Documentation (http://corrfunc.rtfd.io/)
- Source Repository (https://github.com/manodeep/Corrfunc)
- Entry in the Astrophysical Source Code Library (ASCL) |ASCL|
- Zenodo Releases |Zenodo|

.. |logo| image:: https://github.com/manodeep/Corrfunc/blob/master/corrfunc_logo.png
   :target: https://github.com/manodeep/Corrfunc
   :alt: Corrfunc logo
.. |Release| image:: https://img.shields.io/github/release/manodeep/Corrfunc.svg
   :target: https://github.com/manodeep/Corrfunc/releases/latest
   :alt: Latest Release
.. |PyPI| image:: https://img.shields.io/pypi/v/Corrfunc.svg
   :target: https://pypi.python.org/pypi/Corrfunc
   :alt: PyPI Release
.. |MIT licensed| image:: https://img.shields.io/badge/license-MIT-blue.svg
   :target: https://raw.githubusercontent.com/manodeep/Corrfunc/master/LICENSE
   :alt: MIT License
.. |Travis Build| image:: https://travis-ci.com/manodeep/Corrfunc.svg?branch=master
   :target: https://travis-ci.com/manodeep/Corrfunc
   :alt: Build Status
.. |GitHub CI| image:: https://github.com/manodeep/Corrfunc/workflows/GitHub%20CI/badge.svg
   :target: https://github.com/manodeep/Corrfunc/actions
   :alt: GitHub Actions Status
.. |Issues| image:: https://img.shields.io/github/issues/manodeep/Corrfunc.svg
   :target: https://github.com/manodeep/Corrfunc/issues
   :alt: Open Issues
.. |RTD| image:: https://readthedocs.org/projects/corrfunc/badge/?version=master
   :target: http://corrfunc.readthedocs.io/en/master/?badge=master
   :alt: Documentation Status
.. |CoreInfra| image:: https://bestpractices.coreinfrastructure.org/projects/5037/badge
   :target: https://bestpractices.coreinfrastructure.org/en/projects/5037
   :alt: Core Infrastructure Best Practices Status
.. |FAIRSoft| image:: https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F-green
   :target: https://fair-software.eu
   :alt: Fair Software (EU) Compliance
.. |Paper I| image:: https://img.shields.io/badge/arXiv-1911.03545-%23B31B1B
   :target: https://arxiv.org/abs/1911.03545
   :alt: Corrfunc Paper I
.. |Paper II| image:: https://img.shields.io/badge/arXiv-1911.08275-%23B31B1B
   :target: https://arxiv.org/abs/1911.08275
   :alt: Corrfunc Paper II
.. |ASCL| image:: https://img.shields.io/badge/ascl-1703.003-blue.svg?colorB=262255
   :target: http://ascl.net/1703.003
   :alt: ascl:1703.003
.. |Zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.3634195.svg
   :target: https://doi.org/10.5281/zenodo.3634195