dsmanager


- Name: dsmanager
- Version: 1.2.9.1
- Summary: Data Science tools to ease access and use of data and models
- Upload time: 2023-02-04 13:54:44
- Author: Rayane AMROUCHE
- Requires Python: >=3.8,<4
- License: GPL-3.0-or-later

            <h1 align="center"
>Data Science Manager 👨‍💻
</h1>
<p
>
  <a
  href="#"
  target="_blank"
  >
    <img
    alt="Version"
    src="https://img.shields.io/badge/version-1.2-blue.svg?cacheSeconds=2592000"
    />
  </a>
  <a
  href="http://dsmanager.rtfd.io/"
  target="_blank"
  >
    <img
    alt="Documentation"
    src="https://img.shields.io/badge/documentation-rtfd-orange.svg"
    />
  </a>
  <a
  href="LICENSE"
  target="_blank"
  >
    <img
    alt="License: Adel Rayane Amrouche"
    src="https://img.shields.io/badge/License-Adel Rayane Amrouche-yellow.svg"
    />
  </a>
</p>

> Data Science tools to ease access and use of data and models

## Install

The easiest way to install dsmanager is with `pip`:

```sh
pip install dsmanager
```

or `poetry`

```sh
poetry add dsmanager
```

or `conda`

```sh
conda install dsmanager
```

Optional dependency groups (extras) are available depending on your needs:

```sh
# Quotes keep shells such as zsh from expanding the square brackets.
pip install "dsmanager[sharepoint]" # Add Sharepoint source handling
pip install "dsmanager[salesforce]" # Add SalesForce source handling
pip install "dsmanager[kaggle]" # Add Kaggle source handling
pip install "dsmanager[snowflake]" # Add Snowflake source handling
pip install "dsmanager[mysql]" # Add MySQL source handling
pip install "dsmanager[pgsql]" # Add PostgreSQL source handling
pip install "dsmanager[all_sources]" # All the supported sources
```

## Usage

The DS Manager has 3 main components:

- A **DataManager** component
- A **Controller** component
- A **Model** component

### DataManager

The DataManager manages several types of data sources, including:

- File (local or online files)
- Http (HTTP requests)
- Ftp (FTP-hosted files)
- Sql (SQL database tables)
- Sharepoint (Microsoft OneDrive files)
- SalesForce (SalesForce classes)
- Kaggle (Kaggle datasets)

The first step is to instantiate the DataManager with the path to a metadata file.

```python
from dsmanager import DataManager
dm = DataManager("data/metadata.json")
```

The metadata file is generated if it does not exist. It consists of a dict of sources that follows this schema:

```json
{
  "SOURCE_NAME": {
    "source_type": "name_of_the_source",
    "args": {}
  }
}
```
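
As an illustration, a metadata file of this shape can also be generated with a short script. The sketch below uses only the standard library and fills in a `file` source with the fields listed in the `file` read schema shown later in this section. The source name `sales`, the CSV path, and the `sep` argument are hypothetical placeholders, not values the library requires.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical source entry: "sales", the path, and "sep" are placeholders.
metadata = {
    "sales": {
        "source_type": "file",
        "path": "data/sales.csv",
        "file_type": "csv",
        "encoding": "utf-8",
        "args": {"sep": ";"},
    }
}

# Write to a temporary directory so the sketch does not touch real files.
metadata_path = Path(tempfile.mkdtemp()) / "metadata.json"
metadata_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")

# Read it back to confirm the file holds a dict of sources.
loaded = json.loads(metadata_path.read_text(encoding="utf-8"))
print(sorted(loaded))  # ['sales']
```

The resulting path could then be passed to `DataManager(...)` in the same way as `"data/metadata.json"` above.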

Each source has a `source_type` that names the kind of data source it uses. You can list the available source types with:

```python
DataManager().datasources
```

Each data source has its own read and write schemas because each requires its own parameters. Optional extra arguments can be passed through the `args` parameter.

You can obtain the schemas for a specific datasource with the following commands:

```python
source_name = "file"
DataManager().datasources[source_name].read_schema  # Use write_schema for output sources.
```

Output:

```json
{
    "source_type": "file",
    "path": "local_path | online_uri",
    "file_type": "csv | excel | text | json | ...",
    "encoding": "utf-8",
    "args": {
        "pandas_read_file_argument_keyword": "value_for_this_argument"
    }
}
```
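
One way to use a schema like this is to check a metadata entry before loading it. The helper below is a minimal sketch and not part of dsmanager: `missing_keys` is a hypothetical function, and it assumes that every top-level key of the schema except `args` is expected in the entry.

```python
# Read schema for the "file" source, copied from the output above.
FILE_READ_SCHEMA = {
    "source_type": "file",
    "path": "local_path | online_uri",
    "file_type": "csv | excel | text | json | ...",
    "encoding": "utf-8",
    "args": {"pandas_read_file_argument_keyword": "value_for_this_argument"},
}


def missing_keys(entry: dict, schema: dict) -> list:
    """Return schema keys absent from entry, treating "args" as optional."""
    return sorted(key for key in schema if key != "args" and key not in entry)


# Hypothetical entry that omits "file_type" and "encoding".
entry = {"source_type": "file", "path": "data/sales.csv"}
print(missing_keys(entry, FILE_READ_SCHEMA))  # ['encoding', 'file_type']
```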

## Development

### Source code

You can check out the latest sources with:

```sh
git clone https://gitlab.com/bigrayou/dsmanager
```

### Testing

After installation, you can launch the test suite from outside the dsmanager directory (you will need to have pytest >= 7.1.3 installed):

```sh
pytest -v
```

### Dependencies

The DSManager requires:

- aiohttp >=3.8.3
- cryptography 38.0.4
- dash >=2.7.1,<3.0.0
- llvmlite >=0.39.1,<0.40.0
- nest-asyncio >=1.5.6,<2.0.0
- numba >=0.56.4,<0.57.0
- numexpr >=2.8.4,<3.0.0
- numpy >=1.23.3,<2.0.0
- openpyxl >=3.0.10,<4.0.0
- optuna >=3.0.5,<4.0.0
- pandas >=1.5.0,<2.0.0
- paramiko >=2.12.0,<3.0.0
- pickle-mixin >=1.0.2,<2.0.0
- python-dotenv >=0.21.0,<0.22.0
- requests >=2.28.1,<3.0.0
- scikit-learn >=1.2.0,<2.0.0
- setuptools >=65.6.3,<66.0.0
- shap >=0.41.0,<0.42.0
- sqlalchemy >=1.4.45,<2.0.0
- sweetviz >=2.1.4,<3.0.0
- tqdm >=4.64.1,<5.0.0

Optionally, the DSManager may require:

- azure-common >=1.1.28,<2.0.0
- azure-storage-blob >=12.14.1,<13.0.0
- azure-storage-common >=2.1.0,<3.0.0
- kaggle >=1.5.12,<2.0.0
- mysqlclient >=2.1.1,<3.0.0
- psycopg2-binary >=2.9.5,<3.0.0
- shareplum >=0.5.1,<0.6.0
- simple-salesforce >=1.12.2,<2.0.0
- snowflake-connector-python >=2.9.0,<3.0.0
- snowflake-sqlalchemy >=1.4.4,<2.0.0
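
To see which of these optional packages are present in the current environment, something like the following could be used. It is a sketch that relies only on `importlib.metadata` from the standard library. `installed_version` is a hypothetical helper, and the distribution names are taken from the list above. It reports installed versions but does not check them against the version ranges.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the optional dependency list above.
OPTIONAL_PACKAGES = [
    "azure-common",
    "azure-storage-blob",
    "azure-storage-common",
    "kaggle",
    "mysqlclient",
    "psycopg2-binary",
    "shareplum",
    "simple-salesforce",
    "snowflake-connector-python",
    "snowflake-sqlalchemy",
]


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Map each optional package to its installed version (or None).
report = {name: installed_version(name) for name in OPTIONAL_PACKAGES}
for name, found in report.items():
    print(f"{name}: {found or 'not installed'}")
```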

## Author

👤 **Rayane Amrouche**

- Github: [@AARayane](https://github.com/AARayane)
- Gitlab: [@Bigrayou](https://gitlab.com/bigrayou)

            
