mpi4py-ve 1.0.1.post1

- Home page: https://github.com/SX-Aurora/mpi4py-ve/
- Summary: Python bindings for MPI
- Author / Maintainer: NEC
- License: BSD
- Keywords: scientific computing, parallel computing, message passing interface, MPI
- Upload time: 2024-03-04 02:23:22

#########
mpi4py-ve 
#########

*mpi4py-ve* is an extension to *mpi4py*, which provides Python bindings for the Message Passing Interface (MPI).
This package also supports communication of `NLCPy <https://sxauroratsubasa.sakura.ne.jp/documents/nlcpy/en/>`_ array objects (nlcpy.ndarray) between MPI processes on x86 servers of SX-Aurora TSUBASA systems.
Combining NLCPy with *mpi4py-ve* enables Python scripts to utilize multi-VE computing power.
The current version of *mpi4py-ve* is based on *mpi4py* version 3.0.3.
For details of the API, please refer to the `mpi4py manual <https://mpi4py.readthedocs.io/en/stable/>`_.
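
Scripts written for *mpi4py* typically only need to import the MPI module from mpi4pyve instead (see the Restriction section below for unsupported functions). As a minimal sketch, not taken from the package's own examples, the following prints the rank and size of MPI_COMM_WORLD:

.. code-block:: python

    # mpi4pyve exposes the same MPI module interface as mpi4py.
    from mpi4pyve import MPI

    comm = MPI.COMM_WORLD
    print(comm.Get_rank(), "/", comm.Get_size())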

************
Requirements
************

Before installation, the following components must be installed on the x86 node of your SX-Aurora TSUBASA system.

- `Alternative VE Offloading (AVEO) <https://sxauroratsubasa.sakura.ne.jp/documents/veos/en/aveo/index.html>`_
        - required version: >= 3.0.2

- `NEC MPI <https://sxauroratsubasa.sakura.ne.jp/documents/mpi/g2am01e-NEC_MPI_User_Guide_en/frame.html>`_
        - required NEC MPI version: >= 2.26.0 (for Mellanox OFED 4.x) or >= 3.5.0 (for Mellanox OFED 5.x)

- `Python <https://www.python.org/>`_
        - required version: 3.6, 3.7, or 3.8

- `NumPy <https://www.numpy.org/>`_
        - required version: v1.17, v1.18, v1.19, or v1.20

- `NLC (optional) <https://sxauroratsubasa.sakura.ne.jp/documents/sdk/SDK_NLC/UsersGuide/main/en/index.html>`_
        - required version: >= 3.0.0

- `NLCPy (optional) <https://sxauroratsubasa.sakura.ne.jp/documents/nlcpy/en/>`_
        - required version: >= 3.0.1

Since December 2022, *mpi4py-ve* has been provided as part of NEC SDK (NEC Software Development Kit for Vector Engine).
If NEC SDK on your machine was installed or updated after that date, *mpi4py-ve* is already available via the /usr/bin/python3 command.
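
As an optional check (a hedged example, not taken from the original documentation), you can confirm that the module is importable from the system Python:

    ::

    $ /usr/bin/python3 -c "import mpi4pyve"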

******************
Install from wheel
******************

You can install *mpi4py-ve* in either of the following ways.

- Install from PyPI

    ::

    $ pip install mpi4py-ve

- Install from your local computer

    1. Download `the wheel package <https://github.com/SX-Aurora/mpi4py-ve/releases>`_ from GitHub.

    2. Put the wheel package in any directory.

    3. Install the local wheel package with the pip command.

        ::

        $ pip install <path_to_wheel>

The shared objects for the Vector Host, which are included in the wheel package, are compiled with gcc 4.8.5 and tested with the following software:
    +---------+--------------------+
    | NEC MPI | v2.26.0 and v3.5.0 |
    +---------+--------------------+
    | NumPy   | v1.19.5            |
    +---------+--------------------+
    | NLCPy   | v3.0.1             |
    +---------+--------------------+
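
After installation, you can optionally confirm the installed version with pip (a standard pip command, shown here only as an illustrative check):

    ::

    $ pip show mpi4py-ve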

***********************************
Install from source (with building)
***********************************

Before building this package, you need to execute the environment setup script *necmpivars.sh* or *necmpivars.csh* once in advance.

* When using *sh* or its variant:

    **For VE30**

        ::

        $ source /opt/nec/ve3/mpi/X.X.X/bin/necmpivars.sh

    **For VE20, VE10, or VE10E**

        ::

        $ source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.sh

* When using *csh* or its variant:

    **For VE30**

        ::

        % source /opt/nec/ve3/mpi/X.X.X/bin/necmpivars.csh

    **For VE20, VE10, or VE10E**

        ::

        % source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.csh

Here, X.X.X denotes the version number of NEC MPI.

After that, execute the following commands:

    ::

    $ git clone https://github.com/SX-Aurora/mpi4py-ve.git
    $ cd mpi4py-ve
    $ python setup.py build --mpi=necmpi
    $ python setup.py install 

*******
Example
*******

**Transfer Array**

Transfers an NLCPy ndarray from MPI rank 0 to rank 1 by using comm.Send() and comm.Recv():

.. code-block:: python

    from mpi4pyve import MPI
    import nlcpy as vp

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()

    if rank == 0:
        x = vp.array([1,2,3], dtype=int)
        comm.Send(x, dest=1)

    elif rank == 1:
        y = vp.empty(3, dtype=int)
        comm.Recv(y, source=0)
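
To try this snippet, save it as, for example, transfer.py (a hypothetical file name) and launch it with two MPI processes using the *mpirun* command described in the Execution section below:

    ::

    $ mpirun -veo -np 2 $(which python) transfer.py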


**Sum of Numbers**

Sums the numbers locally, and reduces all the local sums to the root rank (rank=0):

.. code-block:: python

    from mpi4pyve import MPI
    import nlcpy as vp

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()

    N = 1000000000
    begin = N * rank // size
    end = N * (rank + 1) // size

    sendbuf = vp.arange(begin, end).sum()
    recvbuf = comm.reduce(sendbuf, MPI.SUM, root=0)
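
As a sanity check (an illustrative addition, not part of the original script), rank 0 can compare the reduced value with the closed-form sum 0 + 1 + ... + (N - 1) = N * (N - 1) / 2:

.. code-block:: python

    if rank == 0:
        # The comparison assumes the reduced value behaves like an integer scalar.
        assert recvbuf == N * (N - 1) // 2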

The following table shows the performance results [msec] on VE Type 20B:

+------+------+------+------+------+------+------+------+ 
| np=1 | np=2 | np=3 | np=4 | np=5 | np=6 | np=7 | np=8 |
+------+------+------+------+------+------+------+------+
| 35.8 | 19.0 | 12.6 | 10.1 |  8.1 |  7.0 |  6.0 |  5.5 |
+------+------+------+------+------+------+------+------+

*********
Execution
*********

When executing a Python script that uses *mpi4py-ve*, use the *mpirun* command of NEC MPI on an x86 server of SX-Aurora TSUBASA.
Before running the Python script, you need to execute one of the following environment setup scripts once in advance.

* When using *sh* or its variant:

    **For VE30**

        ::

        $ source /opt/nec/ve3/mpi/X.X.X/bin/necmpivars.sh gnu 4.8.5

    **For VE20, VE10, or VE10E**

        ::

        $ source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.sh gnu 4.8.5

* When using *csh* or its variant:

    **For VE30**

        ::

        % source /opt/nec/ve3/mpi/X.X.X/bin/necmpivars.csh gnu 4.8.5

    **For VE20, VE10, or VE10E**

        ::

        % source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.csh gnu 4.8.5

Here, X.X.X denotes the version number of NEC MPI.

When using the *mpirun* command:

    ::

    $ mpirun -veo -np N $(which python) sample.py

| Here, N is the number of MPI processes that are created on the x86 server.
| NEC MPI 2.21.0 or later supports the environment variable `NMPI_USE_COMMAND_SEARCH_PATH`.
| If `NMPI_USE_COMMAND_SEARCH_PATH` is set to `ON` and the directory of the python command is added to the environment variable PATH, you do not have to specify the full path.

    ::

    $ export NMPI_USE_COMMAND_SEARCH_PATH=ON
    $ mpirun -veo -np N python sample.py

| For details of the mpirun command, refer to the `NEC MPI User's Guide <https://sxauroratsubasa.sakura.ne.jp/documents/mpi/g2am01e-NEC_MPI_User_Guide_en/frame.html>`_.

******************
Execution Examples
******************

The following examples show how to launch MPI programs that use mpi4py-ve and NLCPy on the SX-Aurora TSUBASA.

| *ncore*: Number of cores per VE.
| a.py: Python script using mpi4py-ve and NLCPy.

* Interactive Execution

  * Execution on one VE

    Example of using 4 processes on the local VH and 4 VE processes (*ncore* / 4 OpenMP threads per process) on VE#0 of the local VH

    ::

      $ mpirun -veo -np 4 python a.py

  * Execution on multiple VEs on a VH

    Example of using 4 processes on the local VH and 4 VE processes (1 process per VE, *ncore* OpenMP threads per process) on VE#0 to VE#3 of the local VH

    ::

      $ VE_NLCPY_NODELIST=0,1,2,3 mpirun -veo -np 4 python a.py


    Example of using 32 processes on the local VH and 32 VE processes (8 processes per VE, *ncore* / 8 OpenMP threads per process) on VE#0 to VE#3 of the local VH

    ::

      $ VE_NLCPY_NODELIST=0,1,2,3 mpirun -veo -np 32 python a.py

  * Execution on multiple VEs on multiple VHs

    Example of using a total of 32 processes on the two VHs host1 and host2, and a total of 32 VE processes on VE#0 and VE#1 of each VH (8 processes per VE, *ncore* / 8 OpenMP threads per process)

    ::

      $ VE_NLCPY_NODELIST=0,1 mpirun -hosts host1,host2 -veo -np 32 python a.py

* NQSV Request Execution

  * Execution on a specific VH, on a VE

    Example of using 32 processes on logical VH#0 and 32 VE processes on logical VE#0 to logical VE#3 of logical VH#0 (8 processes per VE, *ncore* / 8 OpenMP threads per process)

    ::

      #PBS -T necmpi
      #PBS -b 2 # The number of logical hosts
      #PBS --venum-lhost=4 # The number of VEs per logical host
      #PBS --cpunum-lhost=32 # The number of CPUs per logical host

      source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.sh
      export NMPI_USE_COMMAND_SEARCH_PATH=ON
      mpirun -host 0 -veo -np 32 python a.py

  * Execution on a specific VH, on a specific VE

    Example of using 16 processes on logical VH#0 and 16 VE processes in total on logical VE#0 and logical VE#3 of logical VH#0 (8 processes per VE, *ncore* / 8 OpenMP threads per process)

    ::

      #PBS -T necmpi
      #PBS -b 2 # The number of logical hosts
      #PBS --venum-lhost=4 # The number of VEs per logical host
      #PBS --cpunum-lhost=16 # The number of CPUs per logical host

      source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.sh
      export NMPI_USE_COMMAND_SEARCH_PATH=ON
      VE_NLCPY_NODELIST=0,3 mpirun -host 0 -veo -np 16 python a.py

  * Execution on all assigned VEs

    Example of using 32 processes in total on 4 VHs and 32 VE processes in total from logical VE#0 to logical VE#7 on each VH (1 process per VE, *ncore* OpenMP threads per process).

    ::

      #PBS -T necmpi
      #PBS -b 4 # The number of logical hosts
      #PBS --venum-lhost=8 # The number of VEs per logical host
      #PBS --cpunum-lhost=8 # The number of CPUs per logical host
      #PBS --use-hca=2 # The number of HCAs

      source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.sh
      export NMPI_USE_COMMAND_SEARCH_PATH=ON
      mpirun -veo -np 32 python a.py

Here, X.X.X denotes the version number of NEC MPI.

*********
Profiling
*********

NEC MPI provides a facility for displaying MPI communication information.
There are two formats of MPI communication information, as follows:

+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
| Reduced Format  | The maximum, minimum, and average values of MPI communication information of all MPI processes are displayed.                                                                        |
+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Extended Format | MPI communication information of each MPI process is displayed in the ascending order of their ranks in the communicator MPI_COMM_WORLD after the information in the reduced format. |
+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

You can control the display and format of MPI communication information by setting the environment variable NMPI_COMMINF at runtime as shown in the following table.

The Settings of NMPI_COMMINF:

+--------------+-----------------------+ 
| NMPI_COMMINF | Displayed Information |
+--------------+-----------------------+
| NO           | (Default) No Output   |
+--------------+-----------------------+
| YES          | Reduced Format        |
+--------------+-----------------------+
| ALL          | Extended Format       |
+--------------+-----------------------+

When using the *mpirun* command:

    ::

    $ export NMPI_COMMINF=ALL
    $ mpirun -veo -np N python sample.py

***************************************************
Use mpi4py-ve with homebrew classes (without NLCPy)
***************************************************

The following link may be useful when using *mpi4py-ve* with homebrew classes (without NLCPy):

* `use mpi4py-ve with homebrew classes (without NLCPy) <https://github.com/SX-Aurora/mpi4py-ve/blob/v1.0.0/docs/vai_spec_example.rst>`_

***************
Other Documents
***************

The following link may be useful for understanding *mpi4py-ve* in more detail:

* `mpi4py-ve tutorial <https://github.com/SX-Aurora/mpi4py-ve/blob/v1.0.0/docs/index.rst>`_

***********
Restriction
***********
* The current version of *mpi4py-ve* does not support some functions that are listed in the section "List of Unsupported Functions" of `mpi4py-ve tutorial <https://github.com/SX-Aurora/mpi4py-ve/blob/v1.0.0/docs/index.rst>`_.
* Communication of data of type bool between NumPy and NLCPy will fail because the two libraries use a different number of bytes for this type.

*******
Notices
*******
* If you import NLCPy before calling MPI_Init()/MPI_Init_thread(), a runtime error will be raised.

    Not recommended usage: ::

        $ mpirun -veo -np 1 $(which python) -c "import nlcpy; from mpi4pyve import MPI"
        RuntimeError: NLCPy must be import after MPI initialization

    Recommended usage: ::

        $ mpirun -veo -np 1 $(which python) -c "from mpi4pyve import MPI; import nlcpy" 

    MPI_Init() or MPI_Init_thread() is called when you import the MPI module from the mpi4pyve package.
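
    Inside a script, the same rule applies; a minimal sketch of a safe module prologue:

    .. code-block:: python

        # Importing MPI from mpi4pyve triggers MPI_Init()/MPI_Init_thread();
        # NLCPy can then be imported safely afterwards.
        from mpi4pyve import MPI
        import nlcpy as vp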

* If you use the Lock/Lock_all functions for one-sided communication with NLCPy array data, you need to insert NLCPy synchronization control explicitly.

    Synchronization usage:

    .. code-block:: python

        import mpi4pyve
        from mpi4pyve import MPI
        import nlcpy as vp

        comm = MPI.COMM_WORLD
        size = comm.Get_size()
        rank = comm.Get_rank()

        array = vp.array(0, dtype=int)

        if rank == 0:
            win_n = MPI.Win.Create(array, comm=MPI.COMM_WORLD)
        else:
            win_n = MPI.Win.Create(None, comm=MPI.COMM_WORLD)
        if rank == 0:
            array.fill(1)
            array.venode.synchronize()  # NLCPy synchronization control (see note above)
            comm.Barrier()
        if rank != 0:
            comm.Barrier()
            win_n.Lock(MPI.LOCK_EXCLUSIVE, 0)
            win_n.Get([array, MPI.INT], 0)
            win_n.Unlock(0)
            assert array == 1
        comm.Barrier()
        win_n.Free()

*******
License
*******

| The 2-clause BSD license (see *LICENSE* file).
| *mpi4py-ve* is derived from mpi4py (see *LICENSE_DETAIL/LICENSE_DETAIL* file).



            
