fluxbind


Name: fluxbind
Version: 0.0.1
Home page: https://github.com/compspec/fluxbind
Summary: Process mapping for Flux jobs
Upload time: 2025-10-13 04:01:13
Maintainer: Vanessa Sochat
Author: Vanessa Sochat
License: LICENSE
Keywords: cluster, orchestration, mpi, binding, topology

# fluxbind

> Intelligent detection and mapping of processors for HPC

[![PyPI version](https://badge.fury.io/py/fluxbind.svg)](https://badge.fury.io/py/fluxbind)

![img/fluxbind.png](img/fluxbind-small.png)

## Run

Use fluxbind to run a job with its tasks bound to specific cores. For Flux, this means we require exclusive node allocation, and then customize the binding on each node exactly as we want it. We do this via a shape file.
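
To make "binding" concrete: on Linux, binding a process to cores means restricting its CPU affinity mask. The sketch below uses Python's standard `os.sched_getaffinity` and `os.sched_setaffinity` to pin the current process to a single CPU. It only illustrates the underlying OS mechanism and is not how fluxbind itself applies a shape file; `pin_to_one_cpu` is a hypothetical helper name, and the calls are Linux-only.

```python
import os


def pin_to_one_cpu(pid: int = 0) -> int:
    """Pin process `pid` (0 = the calling process) to one CPU it may already use."""
    allowed = sorted(os.sched_getaffinity(pid))  # CPUs the process may currently run on
    target = allowed[0]                          # lowest-numbered allowed CPU
    os.sched_setaffinity(pid, {target})          # shrink the affinity mask to that CPU
    return target


if __name__ == "__main__":
    original = os.sched_getaffinity(0)
    cpu = pin_to_one_cpu()
    print(f"bound to CPU {cpu}; mask is now {sorted(os.sched_getaffinity(0))}")
    os.sched_setaffinity(0, original)  # restore the original mask
```

A launcher that binds many tasks would apply a different mask to each task before it starts; the shape file is where fluxbind lets you describe those per-task choices.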


### Basic Examples

```bash
# Start with a first match policy
flux start --config ./examples/config/match-first.toml

# 1. Bind each task to a unique physical core, starting from core:0 (common case)
fluxbind run -n 8 --quiet --shape ./examples/shape/1node/packed-cores-shapefile.yaml sleep 1

# 2. Reverse it!
fluxbind run -n 8 --quiet --shape ./examples/shape/1node/packed-cores-reversed-shapefile.yaml sleep 1

# 3. Packed PUs (hyperthreading), so interleaved.
fluxbind run --tasks-per-core 2 --quiet --shape ./examples/shape/1node/interleaved-shapefile.yaml sleep 1

# 4. Reverse it again!
fluxbind run --tasks-per-core 2 --quiet --shape ./examples/shape/1node/interleaved-reversed-shapefile.yaml sleep 1

# 5. An unbound rank: leave rank 0 unbound and pack all other ranks onto cores, shifted by one.
fluxbind run -N1 -n 3 --shape ./examples/shape/1node/unbound_rank.yaml sleep 1

# 6. L2 cache affinity. Give each task its own dedicated L2 cache to maximize cache performance.
# On mymachine, each core has its own private L2 cache.
# Therefore, binding one task per L2 cache is equivalent to binding one task per core.
fluxbind run -N1 -n 8 --quiet --shape ./examples/shape/1node/cache-affinity.yaml sleep 1

# 7. Reverse it
fluxbind run -N1 -n 8 --quiet --shape ./examples/shape/1node/cache-reversed-affinity.yaml sleep 1
```
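
The shape files themselves are not reproduced here, so the following is only a sketch of the rank-to-resource layouts that the comments above describe (packed cores, reversed cores, and packed PUs). The function names and the `(core, pu)` numbering are assumptions made for illustration; they are not fluxbind's API or its internal algorithm.

```python
def packed_cores(ntasks: int, ncores: int) -> dict[int, int]:
    """Rank i is bound to core i, starting from core 0."""
    if ntasks > ncores:
        raise ValueError("more tasks than cores")
    return {rank: rank for rank in range(ntasks)}


def packed_cores_reversed(ntasks: int, ncores: int) -> dict[int, int]:
    """Rank i is bound to core (ncores - 1 - i), i.e. filling from the last core down."""
    if ntasks > ncores:
        raise ValueError("more tasks than cores")
    return {rank: ncores - 1 - rank for rank in range(ntasks)}


def packed_pus(ncores: int, pus_per_core: int = 2) -> dict[int, tuple[int, int]]:
    """One rank per PU: rank -> (core, PU within that core), filling each core first."""
    return {rank: divmod(rank, pus_per_core) for rank in range(ncores * pus_per_core)}


if __name__ == "__main__":
    print(packed_cores(8, 8))
    print(packed_cores_reversed(8, 8))
    print(packed_pus(8))
```

With 8 cores and 2 PUs per core, `packed_pus(8)` yields 16 ranks, which is the task count that `--tasks-per-core 2` would imply on such a node.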

### Kripke Examples

As we prepare to test with apps, here are some tests I'm thinking of doing.

```bash
# 1. Baseline - pack each MPI rank onto its own dedicated physical core (8.693519e-09)
fluxbind run -N 1 -n 8 --shape ./examples/shape/kripke/baseline-shapefile.yaml kripke --procs 2,2,2 --zones 16,16,16 --niter 500

# 2. Spread cores (memory bandwidth)
# If Kripke is limited by memory bandwidth, placing ranks on every other core reduces contention for the shared L3 cache.
# If Kripke is memory bound, this layout might be faster than packed even with half the cores. If it is compute bound, it will be slower (1.341355e-08)
fluxbind run -N 1 -n 4 --shape ./examples/shape/kripke/memory-spread-cores-shapefile.yaml kripke --procs 2,2,1 --zones 16,16,16 --niter 500

# 3. Packed PUs (each of 8 cores has 2 PUs, 16 in total). We are testing whether Kripke can benefit from SMT (simultaneous multi-threading)
fluxbind run -N 1 --tasks-per-core 2 --shape ./examples/shape/kripke/packed-pus-shapefile.yaml kripke --procs 2,4,2 --zones 16,16,16 --niter 500

# 4. Hybrid model: launch just two MPI ranks and give each one a whole L3 cache domain to work with (1.966967e-08)
fluxbind run -N 1 -n 2 --env OMP_NUM_THREADS=4 --env OMP_PLACES=cores --shape ./examples/shape/kripke/hybrid-l3-shapefile.yaml kripke --zones 16,16,16 --niter 500 --procs 2,1,1 --layout GZD
```
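
One detail worth checking in the commands above is that the `--procs` decomposition matches the number of tasks launched. As I understand Kripke's `--procs` option, it gives the number of MPI ranks along each spatial dimension, so the product of its three values should equal the task count. The sketch below checks this for the four examples; `ranks_from_procs` is a hypothetical helper, and the 16 tasks for example 3 assume 8 cores with `--tasks-per-core 2`.

```python
from math import prod


def ranks_from_procs(procs: str) -> int:
    """Total MPI ranks implied by a Kripke-style --procs 'px,py,pz' string."""
    return prod(int(p) for p in procs.split(","))


# (label, tasks launched, --procs value), taken from the four commands above.
# Example 3 has no -n; the count assumes 8 cores x 2 tasks per core.
examples = [
    ("baseline", 8, "2,2,2"),
    ("spread cores", 4, "2,2,1"),
    ("packed PUs", 8 * 2, "2,4,2"),
    ("hybrid L3", 2, "2,1,1"),
]

for label, ntasks, procs in examples:
    assert ranks_from_procs(procs) == ntasks, label

# In the hybrid case, 2 ranks x OMP_NUM_THREADS=4 threads covers the same 8 cores.
assert 2 * 4 == 8
```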


## Predict

Use fluxbind to predict binding based on a job shape. This is prediction only: no application is executed.
Here are some examples.

```bash
# Predict binding on this machine for 8 cores
fluxbind predict core:0-7

# Predict binding on corona (based on xml) for 2 NUMA nodes
fluxbind predict --xml ./examples/topology/corona.xml numa:0,1 x core:0-2
```
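
To show how a shape expression such as `numa:0,1 x core:0-2` could be read, here is a minimal sketch that expands index lists and combines the levels. Treating `x` as a cross product (every listed core within every listed NUMA node) is my assumption, as are the function names; fluxbind's actual grammar and semantics may differ.

```python
from itertools import product


def parse_indices(spec: str) -> list[int]:
    """Expand an index list such as '0-7' or '0,1' into explicit integers."""
    out: list[int] = []
    for part in spec.split(","):
        if "-" in part:
            lo, hi = map(int, part.split("-"))
            out.extend(range(lo, hi + 1))  # ranges are inclusive on both ends
        else:
            out.append(int(part))
    return out


def expand_shape(expr: str) -> list[tuple[tuple[str, int], ...]]:
    """Expand 'type:indices x type:indices' into one tuple of (type, index) per combination."""
    levels = []
    for term in expr.split(" x "):
        kind, _, spec = term.strip().partition(":")
        levels.append([(kind, i) for i in parse_indices(spec)])
    return list(product(*levels))


if __name__ == "__main__":
    for combo in expand_shape("numa:0,1 x core:0-2"):
        print(combo)
```

Under this reading, `core:0-7` names 8 cores and `numa:0,1 x core:0-2` names 6 (NUMA node, core) combinations.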

## License

fluxbind is distributed under the terms of the MIT license.
All new contributions must be made under this license.

See [LICENSE](https://github.com/converged-computing/cloud-select/blob/main/LICENSE),
[COPYRIGHT](https://github.com/converged-computing/cloud-select/blob/main/COPYRIGHT), and
[NOTICE](https://github.com/converged-computing/cloud-select/blob/main/NOTICE) for details.

SPDX-License-Identifier: (MIT)

LLNL-CODE-842614

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/compspec/fluxbind",
    "name": "fluxbind",
    "maintainer": "Vanessa Sochat",
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "cluster, orchestration, mpi, binding, topology",
    "author": "Vanessa Sochat",
    "author_email": "vsoch@users.noreply.github.com",
    "download_url": "https://files.pythonhosted.org/packages/5f/86/b99978d29ba66a0b20f16f06856b4c55b744605055d2a55534ccec717687/fluxbind-0.0.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "LICENSE",
    "summary": "Process mapping for Flux jobs",
    "version": "0.0.1",
    "project_urls": {
        "Homepage": "https://github.com/compspec/fluxbind"
    },
    "split_keywords": [
        "cluster",
        " orchestration",
        " mpi",
        " binding",
        " topology"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e57af714a9e6c140cd2ade9f3d80f6a3fd8e167953cddaf83c33fdf971b6773b",
                "md5": "a3a197b27d7c75c320341fb2c4665e01",
                "sha256": "eeec26b98921e77ec98053be0a14e92a32628039fecf3bf8b87bd3538fed3938"
            },
            "downloads": -1,
            "filename": "fluxbind-0.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a3a197b27d7c75c320341fb2c4665e01",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 28629,
            "upload_time": "2025-10-13T04:01:12",
            "upload_time_iso_8601": "2025-10-13T04:01:12.582047Z",
            "url": "https://files.pythonhosted.org/packages/e5/7a/f714a9e6c140cd2ade9f3d80f6a3fd8e167953cddaf83c33fdf971b6773b/fluxbind-0.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5f86b99978d29ba66a0b20f16f06856b4c55b744605055d2a55534ccec717687",
                "md5": "f0dc68c444ee68ef2f8de3f2ec99fe3c",
                "sha256": "d667c109544d91d6eb3337265800d8168e8ce13e717a4a008bc45573bda9d467"
            },
            "downloads": -1,
            "filename": "fluxbind-0.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "f0dc68c444ee68ef2f8de3f2ec99fe3c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 26342,
            "upload_time": "2025-10-13T04:01:13",
            "upload_time_iso_8601": "2025-10-13T04:01:13.842708Z",
            "url": "https://files.pythonhosted.org/packages/5f/86/b99978d29ba66a0b20f16f06856b4c55b744605055d2a55534ccec717687/fluxbind-0.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-13 04:01:13",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "compspec",
    "github_project": "fluxbind",
    "github_not_found": true,
    "lcname": "fluxbind"
}
        