@lgarrison To guide what needs to/could happen for Corrfunc 3.0, here is my list:

Essential (?)
- Remove python2 support completely
  - Remove `from __future__` type constructs from python code
- Add modern packaging with `pyproject.toml`/`meson`/ whatever-else-we-should-be-using
- Solve the multiple OpenMP runtime library issue

Possibly (?)
- Add `numbins` optimisation that only uses the number of bins necessary (as determined by the min. and max. distance possible between two cell-pairs)
  - [...] `nbins-1`) [...] `simd reduction` clause, or simply a `+=`. This could be in conjunction with a bit-setting routine that sets bits and shifts to count up to 64 pairs before calling `popcnt`. However, I don't see how to do that without a LOT OF code duplication.
- Change the OpenMP parallelization to go over cell-pairs (improves cache utilisation, reduces memory requirement -> we can increase the max-bin-ref factors)
  - A `generate_cell_pairs` function that returns the potential neighbouring cell-pairs for any given primary cell

May be (?)
- Add ARM64 kernels
  - Add `-march=armv8a` (or `-mcpu=apple-m1`) to CFLAGS
- Rename package to `corrfunc` and release conda wheels
  - [...] `target_clones` to all functions
  - [...] `linux + x86_64`)
  - [...] `import Corrfunc` still works (how?)

This is also open for community discussion. If anyone has opinions on what should go in, please do add a comment.
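
To make the `numbins` and cell-pair items a bit more concrete, here is a rough Python sketch of how the two could fit together. It is only an illustration under assumptions, not existing Corrfunc code: the signature of `generate_cell_pairs` is a guess (only the name appears in the list above), `separation_bounds` and `populated_bin_range` are hypothetical helpers, the grid is assumed to be non-periodic with cubic cells, and neighbours are assumed to lie within `bin_refine` cells of the primary cell along each axis.

```python
import bisect
import itertools
import math


def generate_cell_pairs(primary, ncells, bin_refine=1):
    """Return indices of cells that may hold neighbours of `primary`.

    Hypothetical signature. Assumes a non-periodic grid, so candidate
    cells that fall outside the grid are simply skipped.
    """
    ranges = [
        range(max(0, p - bin_refine), min(n, p + bin_refine + 1))
        for p, n in zip(primary, ncells)
    ]
    return list(itertools.product(*ranges))


def separation_bounds(cell_a, cell_b, cell_size):
    """Smallest and largest distance between any two points in two cubic cells."""
    dmin_sq = dmax_sq = 0.0
    for a, b in zip(cell_a, cell_b):
        d = abs(a - b)
        # Adjacent or identical cells can touch along this axis (gap of zero).
        dmin_sq += (max(0, d - 1) * cell_size) ** 2
        # Far faces are (d + 1) cell widths apart along this axis.
        dmax_sq += ((d + 1) * cell_size) ** 2
    return math.sqrt(dmin_sq), math.sqrt(dmax_sq)


def populated_bin_range(rbins, dmin, dmax):
    """Half-open range [lo, hi) of bins reachable when dmin <= r <= dmax.

    Bin i covers rbins[i] <= r < rbins[i + 1]; `rbins` must be sorted.
    """
    nbins = len(rbins) - 1
    lo = max(0, bisect.bisect_right(rbins, dmin) - 1)
    hi = min(nbins, bisect.bisect_right(rbins, dmax))
    return lo, hi


if __name__ == "__main__":
    rbins = [0.1, 0.5, 1.0, 2.0, 4.0]
    primary = (0, 0, 0)
    for other in generate_cell_pairs(primary, ncells=(4, 4, 4), bin_refine=2):
        dmin, dmax = separation_bounds(primary, other, cell_size=1.0)
        lo, hi = populated_bin_range(rbins, dmin, dmax)
        print(other, (lo, hi))
```

With something like this, a kernel working on one cell-pair would only need to test bins `lo` to `hi - 1` instead of all of them, and a cell-pair with `lo == hi` could be skipped entirely.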