bitfount


| Field | Value |
| --- | --- |
| Name | bitfount |
| Version | 0.9.3 |
| Home page | https://github.com/bitfount/bitfount |
| Summary | Machine Learning and Federated Learning Library. |
| Upload time | 2024-02-07 11:44:14 |
| Author | Bitfount |
| Requires Python | `>=3.8,<3.11,!=3.9.7` |
| License | Apache License 2.0 |
| Keywords | federated learning, privacy, ai, machine learning |

<div align="center">
<img src="https://bitfount-web-resources.s3.eu-west-2.amazonaws.com/bitfount_logo_horizontal.png" width="600px">

**Federated learning and data analytics that just works**

---

<br/>
<!-- Github workflow badges are case sensitive - the name must match the name of the workflow exactly -->

![Python versions](https://img.shields.io/pypi/pyversions/bitfount)
[![PyPI Latest Release](https://img.shields.io/pypi/v/bitfount.svg)](https://pypi.org/project/bitfount/)
[![PyPI Downloads](https://pepy.tech/badge/bitfount)](https://pepy.tech/project/bitfount)
![](https://github.com/bitfount/bitfount/workflows/CI/badge.svg?branch=develop)
![](https://github.com/bitfount/bitfount/workflows/tutorials/badge.svg?branch=develop)
[![codecov](https://codecov.io/gh/bitfount/bitfount/branch/develop/graph/badge.svg?token=r1hulrgehK)](https://codecov.io/gh/bitfount/bitfount)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![security: bandit](https://img.shields.io/badge/security-bandit-yellow.svg)](https://github.com/PyCQA/bandit)
[![mypy type checked](https://img.shields.io/badge/mypy-checked-blue)](https://github.com/python/mypy)
[![flake8](https://img.shields.io/badge/linter-flake8-success)](https://github.com/PyCQA/flake8)
[![license](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/bitfount/bitfount/blob/develop/LICENSE)

<!-- ![docs-coverage](interrogate.svg) -->

</div>

## Table of Contents

- [Using the Docker images](#using-the-docker-images)
- [Running the Python code](#running-the-python-code)
  - [Installation](#installation)
  - [Getting started (Tutorials)](#getting-started-tutorials)
  - [Federated training scripts](#federated-training-scripts)
  - [Basic Local Usage](#basic-local-usage)
- [License](#license)

## Using the Docker images

There are two Docker images: one for running a Pod (`ghcr.io/bitfount/pod:stable`)
and another for running a modelling task (`ghcr.io/bitfount/modeller:stable`).

Both images require a `config.yaml` file. By default, they try to load it
from `/mount/config/config.yaml` inside the container.
The easiest way to provide this file is to mount or bind a volume to the container.
How you do this depends on your platform or environment (Docker, docker-compose, ECS).
If you have any problems doing this, feel free to reach out to us.

Alternatively, you can copy a config file into a stopped container using [docker cp](https://docs.docker.com/engine/reference/commandline/cp/).

If you're using a CSV data source, you'll also need to mount your data to the container
at the path specified in your config. For simplicity, it's easiest to
put your config and your CSV in the same directory and then mount that directory to the container.
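As a rough illustration of the mounting approach, the sketch below builds a `docker run` argument list that bind-mounts one host directory at `/mount/config`. The helper name `build_docker_run_args` and the example directory are hypothetical, and the sketch assumes that the CSV path in your `config.yaml` points inside `/mount/config`; adapt it to your own platform.

```python
from pathlib import Path
from typing import List

# Default config directory inside the container, as described above.
CONTAINER_CONFIG_DIR = "/mount/config"
POD_IMAGE = "ghcr.io/bitfount/pod:stable"


def build_docker_run_args(host_dir: str, image: str = POD_IMAGE) -> List[str]:
    """Build (but do not run) a `docker run` command for a Bitfount image.

    host_dir is a hypothetical directory holding config.yaml and, optionally,
    the CSV file referenced by that config.
    """
    host_path = Path(host_dir).resolve()
    return [
        "docker",
        "run",
        # Bind-mount the host directory over the default config location.
        "--volume",
        f"{host_path}:{CONTAINER_CONFIG_DIR}",
        image,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand.
    print(" ".join(build_docker_run_args("./pod_files")))
```

Building the command as a list keeps paths with spaces intact if you later pass it to `subprocess.run`.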

Once your container is running you will need to check the logs and complete the login step,
allowing your container to authenticate with Bitfount.
The process is the same as when running locally (e.g. the tutorials),
except that we can't open the login page automatically for you.

## Running the Python code

### Installation

#### Where to get it

Binary installers for the latest released version are available at the [Python
Package Index (PyPI)](https://pypi.org/project/bitfount).

`pip install bitfount`

For DICOM support, you will need to install the DICOM extras:

`pip install 'bitfount[dicom]'`

If you want to use differential privacy (DP), you will need to install the DP extras:

`pip install 'bitfount[dp]'`

Ensure you are using Python 3.8 or 3.9; the DP extra is not supported on Python 3.10.
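The version constraints can be checked before installing. The following is a minimal sketch, assuming the `requires_python` specifier from the package metadata for this release (`>=3.8,<3.11,!=3.9.7`) and the DP restriction stated above; the function names are illustrative and not part of `bitfount`.

```python
import sys
from typing import Tuple

Version = Tuple[int, int, int]


def is_supported_python(version: Version) -> bool:
    """Mirror the metadata specifier >=3.8,<3.11,!=3.9.7."""
    in_range = (3, 8) <= version[:2] < (3, 11)
    return in_range and version != (3, 9, 7)


def supports_dp_extra(version: Version) -> bool:
    """The DP extra is only described as working on Python 3.8 and 3.9."""
    return is_supported_python(version) and version[:2] in {(3, 8), (3, 9)}


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor, sys.version_info.micro)
    print("bitfount supported:", is_supported_python(current))
    print("bitfount[dp] supported:", supports_dp_extra(current))
```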

If you are planning to use the `bitfount` package with Jupyter Notebooks, we recommend installing the `bitfount[tutorials]` extra, which ensures that you are running compatible Jupyter dependencies.

`pip install 'bitfount[tutorials]'`

#### Installation from sources

To install `bitfount` from source, you first need to create a Python virtual environment.

In the `bitfount` directory (same one where you found this file after cloning the git repo), execute:

`pip install -r requirements/requirements.in`

These requirements are set to permissive ranges but are not guaranteed to work for all releases, especially the latest versions. For pinned versions of these requirements, which are guaranteed to work, run the following commands instead:

```bash
#!/bin/bash
PYTHON_VERSION=$(python -c "import platform; print(''.join(platform.python_version_tuple()[:2]))")
pip install -r requirements/${PYTHON_VERSION}/requirements.txt
```

To use differential privacy (DP), you also need to install the DP requirements. Note that these are only compatible with Python 3.8 and 3.9, and only with non-ARM architectures:

```bash
#!/bin/bash
PYTHON_VERSION=$(python -c "import platform; print(''.join(platform.python_version_tuple()[:2]))")
PLATFORM_PROCESSOR=$(python -c "import platform; print(platform.processor())")

if [[ ${PYTHON_VERSION} == "38" || ${PYTHON_VERSION} == "39" ]] && [[ ${PLATFORM_PROCESSOR} != "arm" ]]; then
    pip install -r requirements/${PYTHON_VERSION}/differential_privacy/requirements-dp.txt
fi
```
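The same selection logic as the shell snippets can be sketched in Python, which may be easier to reuse from a setup script. The function names are hypothetical; the directory layout and the `"arm"` comparison are taken from the shell snippets above.

```python
import platform
from typing import List, Tuple

# Version directories for which DP requirements are provided.
DP_PYTHON_VERSIONS = {"38", "39"}


def version_dir(version: Tuple[int, int]) -> str:
    """Turn (3, 9) into "39", matching the requirements/<version>/ layout."""
    return f"{version[0]}{version[1]}"


def requirement_files(
    version: Tuple[int, int], processor: str, want_dp: bool = False
) -> List[str]:
    """List the pinned requirement files to install for this interpreter."""
    directory = version_dir(version)
    files = [f"requirements/{directory}/requirements.txt"]
    # DP requirements only apply to Python 3.8/3.9 on non-ARM processors.
    if want_dp and directory in DP_PYTHON_VERSIONS and processor != "arm":
        files.append(
            f"requirements/{directory}/differential_privacy/requirements-dp.txt"
        )
    return files


if __name__ == "__main__":
    major, minor, _ = platform.python_version_tuple()
    current = (int(major), int(minor))
    for path in requirement_files(current, platform.processor(), want_dp=True):
        print(f"pip install -r {path}")
```

The sketch only prints the `pip` commands, so you can review them before running anything.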

On macOS, you also need to install `libomp`:

`brew install libomp`

### Getting started (Tutorials)

In order to run the tutorials, you also need to install the tutorial requirements:

```bash
#!/bin/bash
PYTHON_VERSION=$(python -c "import platform; print(''.join(platform.python_version_tuple()[:2]))")
pip install -r requirements/${PYTHON_VERSION}/requirements-tutorial.txt
```

To get started using the Bitfount package in a federated setting, we recommend
that you start with our tutorials. Run `jupyter notebook` and open the first
tutorial in the "Connecting Data & Creating Pods" folder: `running_a_pod.ipynb`.

### Federated training scripts

Some simple scripts have been provided to run a Pod or Modelling job from a config file.

> ⚠️ If you are running from a source install (such as from `git clone`) you will
> need to use <span style="white-space: nowrap">`python -m scripts.<script_name>`</span>
> rather than use `bitfount <script_name>` directly.

To run a pod:

`bitfount run_pod --path_to_config_yaml=<CONFIG_FILE>`

To run a modelling job:

`bitfount run_modeller --path_to_config_yaml=<CONFIG_FILE>`
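If you want to launch these scripts from Python, for example from a test harness, the two invocation styles described above can be sketched as follows. The helper `build_script_command` is hypothetical, and the sketch assumes that the source-install form accepts the same `--path_to_config_yaml` flag.

```python
import sys
from typing import List


def build_script_command(
    script_name: str, config_file: str, from_source: bool = False
) -> List[str]:
    """Build the command line for `run_pod` or `run_modeller`."""
    if from_source:
        # Source installs use `python -m scripts.<script_name>`;
        # sys.executable selects the currently running interpreter.
        launcher = [sys.executable, "-m", f"scripts.{script_name}"]
    else:
        launcher = ["bitfount", script_name]
    return launcher + [f"--path_to_config_yaml={config_file}"]


if __name__ == "__main__":
    # Print the command rather than executing it; pass the list to
    # subprocess.run(...) once the config file exists.
    print(" ".join(build_script_command("run_pod", "pod.yaml")))
```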

### Basic Local Usage

As well as providing the ability to use data in remote pods, this package also enables local ML training. Some example code for this purpose is given below.

**1\. Import bitfount**

```python
import bitfount as bf
```

**2\. Create DataSource and load data**

```python
census_income = bf.CSVSource(
    path="https://bitfount-hosted-downloads.s3.eu-west-2.amazonaws.com/bitfount-tutorials/census_income.csv",
    ignore_cols=["fnlwgt"],
)
census_income.load_data()
```

**3\. Create Schema**

```python
schema = bf.BitfountSchema(
    census_income,
    table_name="census_income",
    force_stypes={
        "census_income": {
            "categorical": [
                "TARGET",
                "workclass",
                "marital-status",
                "occupation",
                "relationship",
                "race",
                "native-country",
                "gender",
                "education"
            ]
        }
    }
)
```

**4\. Transform Data**

```python
clean_data = bf.CleanDataTransformation()
processor = bf.TransformationProcessor([clean_data], schema.get_table_schema("census_income"))
census_income.data = processor.transform(census_income.data)
schema.add_datasource_tables(census_income, table_name="census_income")
```

**5\. Create DataStructure**

```python
census_income_data_structure = bf.DataStructure(
    table="census_income",
    target="TARGET",
)
```

**6\. Create and Train Model**

```python
nn = bf.PyTorchTabularClassifier(
    datastructure=census_income_data_structure,
    schema=schema,
    epochs=2,
    batch_size=256,
    optimizer=bf.Optimizer("RAdam", {"lr": 0.001}),
)
nn.fit(census_income)
nn.serialize("demo_task_model.pt")
```

**7\. Evaluate**

```python
preds, target = nn.evaluate()
metrics = bf.MetricCollection.create_from_model(nn)
results = metrics.compute(target, preds)
print(results)
```

**8\. Assert results**

```python
import numpy as np
# Use np.isnan: an identity check against np.nan does not reliably detect NaN.
assert not np.isnan(nn._validation_results[-1]["validation_loss"])
assert results["AUC"] > 0.7
```
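Checking a loss value for NaN is easy to get wrong, because NaN values are not guaranteed to be the same object and never compare equal to each other. The standalone sketch below uses only the standard library (`math.isnan` plays the role that `np.isnan` plays for NumPy values); the helper `is_valid_loss` is illustrative and not part of `bitfount`.

```python
import math


def is_valid_loss(value: float) -> bool:
    """Return True when the loss is a real number rather than NaN."""
    return not math.isnan(value)


if __name__ == "__main__":
    fresh_nan = float("nan")  # e.g. the result of a diverged training step
    # Identity and equality checks both miss this NaN...
    print("passes identity check:", fresh_nan is not math.nan)
    print("passes equality check:", fresh_nan != math.nan)
    # ...while an explicit isnan test catches it.
    print("is_valid_loss:", is_valid_loss(fresh_nan))
```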

## License

The license for this software is available in the `LICENSE` file.
This can be found in the GitHub repository, as well as inside the Docker images.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/bitfount/bitfount",
    "name": "bitfount",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<3.11,!=3.9.7",
    "maintainer_email": "",
    "keywords": "federated learning,privacy,AI,machine learning",
    "author": "Bitfount",
    "author_email": "info@bitfount.com",
    "download_url": "https://files.pythonhosted.org/packages/87/83/2e0fe99084bcc288a7c06441f722cdaf1ab1444e34871381a753dad5b49f/bitfount-0.9.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "Machine Learning and Federated Learning Library.",
    "version": "0.9.3",
    "project_urls": {
        "Documentation": "https://docs.bitfount.com/",
        "Homepage": "https://bitfount.com",
        "Hub": "https://hub.bitfount.com",
        "Source Code": "https://github.com/bitfount/bitfount/"
    },
    "split_keywords": [
        "federated learning",
        "privacy",
        "ai",
        "machine learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "47504c3cf463e97b54db0e4311cff365e4d544856de96664f3084e4a3dd05dfc",
                "md5": "9734f778ab7b4664b575f781f67d0ca6",
                "sha256": "12b5b6f03fc7035c4474c5a5ae14961ef64275e3e8401385b10832209936c246"
            },
            "downloads": -1,
            "filename": "bitfount-0.9.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9734f778ab7b4664b575f781f67d0ca6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<3.11,!=3.9.7",
            "size": 604290,
            "upload_time": "2024-02-07T11:44:11",
            "upload_time_iso_8601": "2024-02-07T11:44:11.179182Z",
            "url": "https://files.pythonhosted.org/packages/47/50/4c3cf463e97b54db0e4311cff365e4d544856de96664f3084e4a3dd05dfc/bitfount-0.9.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "87832e0fe99084bcc288a7c06441f722cdaf1ab1444e34871381a753dad5b49f",
                "md5": "ccbe8ac349de23031db49480c8348ef4",
                "sha256": "3a17b1124f85b650838cf98c7b74d31150838efbdc1584605d2b60819b9a08cd"
            },
            "downloads": -1,
            "filename": "bitfount-0.9.3.tar.gz",
            "has_sig": false,
            "md5_digest": "ccbe8ac349de23031db49480c8348ef4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<3.11,!=3.9.7",
            "size": 529083,
            "upload_time": "2024-02-07T11:44:14",
            "upload_time_iso_8601": "2024-02-07T11:44:14.802158Z",
            "url": "https://files.pythonhosted.org/packages/87/83/2e0fe99084bcc288a7c06441f722cdaf1ab1444e34871381a753dad5b49f/bitfount-0.9.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-07 11:44:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "bitfount",
    "github_project": "bitfount",
    "github_not_found": true,
    "lcname": "bitfount"
}
        