search-data-collector

Name	search-data-collector
Version	0.6.1
home_page	None
Summary	None
upload_time	2024-08-15 12:20:47
maintainer	None
docs_url	None
author	None
requires_python	>=3.8
license	MIT License Copyright (c) 2021 Simon Blanke Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
keywords	visualization data-science
requirements	No requirements were recorded.

<H1 align="center">
    Search Data Collector
</H1>


<p align="center">
  <a href="https://github.com/SimonBlanke/search-data-collector/actions">
    <img src="https://github.com/SimonBlanke/search-data-collector/actions/workflows/tests.yml/badge.svg?branch=main" alt="img not loaded: try F5 :)">
  </a>
  <a href="https://app.codecov.io/gh/SimonBlanke/search-data-collector">
    <img src="https://img.shields.io/codecov/c/github/SimonBlanke/search-data-collector/main&logo=codecov" alt="img not loaded: try F5 :)">
  </a>
</p>


<br>

<H2 align="center">
    Thread-safe and atomic collection of tabular data into csv-files.
</H2>

<br>

## Introduction

The search-data-collector provides a single class with the following methods to manage data:
 - save
 - append
 - load
 - remove

The Search-Data-Collector was created as a utility for the [Gradient-Free-Optimizers](https://github.com/SimonBlanke/Gradient-Free-Optimizers) and [Hyperactive](https://github.com/SimonBlanke/Hyperactive) packages. It is intended as a tool for collecting search-data from an optimization run. The search-data can be collected during the run, one dictionary at a time, via `append`, or after the run as a dataframe via `save`. <br>
The `append` method is thread-safe so that it works with Hyperactive's multiprocessing. The `save` method is atomic, so interrupting a save does not cause accidental data loss. <br>
For the Hyperactive package, the search-data-collector handles functions in the data by converting them to strings. When loading the data, you can pass the search-space to `load` to convert the strings back to functions.



<br>

## Disclaimer

This project is in an early development stage and is sparsely tested. If you encounter bugs or have suggestions for improvements, then please open an issue.


<br>

## Installation

```console
pip install search-data-collector 
```


<br>

## Examples


<br>

### Append search-data

```python
import numpy as np
from hyperactive import Hyperactive
from search_data_collector import CsvSearchData

collector = CsvSearchData("./search_data.csv")  # the csv is created automatically


def parabola_function(para):
    loss = para["x"] * para["x"] + para["y"] * para["y"]

    data_dict = dict(para)  # copy the parameter dictionary
    data_dict["score"] = -loss  # add the score to the dictionary
    collector.append(data_dict)  # you can append a dictionary to the csv

    return -loss


search_space = {
    "x": list(np.arange(-10, 10, 0.1)),
    "y": list(np.arange(-10, 10, 0.1)),
}


hyper = Hyperactive()
hyper.add_search(parabola_function, search_space, n_iter=1000)
hyper.run()
search_data = hyper.search_data(parabola_function)

search_data = collector.load(search_space)  # load data

print("\n search_data \n", search_data)
```
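
The introduction describes `append` as thread-safe. The sketch below illustrates one common way to get that property, using only the standard library: a lock that serializes writes to the csv. It is not the implementation used by this package; `append_row`, `demo_append` and the module-level `_lock` are hypothetical names, and a `threading.Lock` only protects threads within one process (separate processes would need something like a file lock).

```python
import csv
import os
import tempfile
import threading

_lock = threading.Lock()  # hypothetical lock shared by all threads in this process


def append_row(path, row):
    """Append one dict as a csv row; write the header if the file is new or empty."""
    with _lock:  # only one thread may touch the file at a time
        is_new = not os.path.exists(path) or os.path.getsize(path) == 0
        with open(path, "a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(row))
            if is_new:
                writer.writeheader()
            writer.writerow(row)


def demo_append(n_threads=8, rows_per_thread=50):
    """Append rows from several threads and return the rows read back."""
    path = os.path.join(tempfile.mkdtemp(), "search_data.csv")

    def worker(thread_id):
        for i in range(rows_per_thread):
            score = -(thread_id**2 + i**2)
            append_row(path, {"x": thread_id, "y": i, "score": score})

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    with open(path, newline="") as f:
        return list(csv.DictReader(f))
```

Because the header check and the write happen under the same lock, the header is written once and no row is interleaved with another.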


<br>

### Save search-data

```python
import numpy as np
from hyperactive import Hyperactive
from search_data_collector import CsvSearchData

collector = CsvSearchData("./search_data.csv")  # the csv is created automatically


def parabola_function(para):
    loss = para["x"] * para["x"] + para["y"] * para["y"]

    return -loss


search_space = {
    "x": list(np.arange(-10, 10, 0.1)),
    "y": list(np.arange(-10, 10, 0.1)),
}


hyper = Hyperactive()
hyper.add_search(parabola_function, search_space, n_iter=1000)
hyper.run()
search_data = hyper.search_data(parabola_function)

collector.save(search_data)  # save a dataframe instead

search_data = collector.load(search_space)  # load data

print("\n search_data \n", search_data)
```
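
The introduction describes `save` as atomic. A typical way to achieve this, sketched here with the standard library only, is to write the complete file under a temporary name and then rename it over the target with `os.replace`. This is an illustration of the general technique and not necessarily how this package does it; `atomic_save` and `demo_save` are hypothetical names.

```python
import csv
import os
import tempfile


def atomic_save(path, rows, fieldnames):
    """Write rows to a temporary file, then swap it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the temp file in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # replaces an existing file in a single step
    except BaseException:
        # On any failure (including KeyboardInterrupt) the old file is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def demo_save():
    """Save twice to the same path and return what ends up in the file."""
    path = os.path.join(tempfile.mkdtemp(), "search_data.csv")
    atomic_save(path, [{"x": 1, "score": -1}], ["x", "score"])
    atomic_save(path, [{"x": 2, "score": -4}, {"x": 3, "score": -9}], ["x", "score"])
    with open(path, newline="") as f:
        return list(csv.DictReader(f))
```

A reader of the target path sees either the old file or the new one, never a half-written csv.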



<br>

### Functions in the search-space/search-data

```python
import numpy as np
from hyperactive import Hyperactive
from search_data_collector import CsvSearchData

collector = CsvSearchData("./search_data.csv")  # the csv is created automatically


def parabola_function(para):
    loss = para["x"] * para["x"] + para["y"] * para["y"]

    return -loss


# just some dummy functions to show how this works


def function1():
    print("this is function1")


def function2():
    print("this is function2")


def function3():
    print("this is function3")


search_space = {
    "x": list(np.arange(-10, 10, 0.1)),
    "y": list(np.arange(-10, 10, 0.1)),
    "string.example": ["string1", "string2", "string3"],
    "function.example": [function1, function2, function3],
}


hyper = Hyperactive()
hyper.add_search(parabola_function, search_space, n_iter=30)
hyper.run()
search_data = hyper.search_data(parabola_function)

collector.save(search_data)  # save a dataframe instead of appending a dictionary

search_data = collector.load()  # load data

print(
    "\n In this dataframe the 'function.example'-column contains strings, which are the '__name__' of the functions. \n search_data \n ",
    search_data,
    "\n",
)

search_data = collector.load(search_space)  # load data with search-space

print(
    "\n In this dataframe the 'function.example'-column contains the functions again. \n search_data \n ",
    search_data,
    "\n",
)
```
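
The round trip in the example above (functions stored as their `__name__`, then restored when the search-space is passed to `load`) can be pictured with a small, library-independent sketch. The helpers `to_storable` and `from_storable` are hypothetical and only illustrate the idea; they are not part of this package's API.

```python
def function1():
    return "this is function1"


def function2():
    return "this is function2"


search_space = {
    "x": [0.0, 0.5, 1.0],
    "function.example": [function1, function2],
}


def to_storable(row):
    """Replace callables by their __name__ so the row can be written to a csv."""
    return {k: v.__name__ if callable(v) else v for k, v in row.items()}


def from_storable(row, search_space):
    """Map stored names back to the callables listed in the search-space."""
    restored = dict(row)
    for key, choices in search_space.items():
        # Build a name -> function lookup from the callables in this dimension.
        by_name = {c.__name__: c for c in choices if callable(c)}
        if key in restored and restored[key] in by_name:
            restored[key] = by_name[restored[key]]
    return restored


row = {"x": 0.5, "function.example": function2}
stored = to_storable(row)  # "function.example" is now the string "function2"
restored = from_storable(stored, search_space)  # and here it is function2 again
```

The lookup relies on the functions in one search-space dimension having distinct names, which is why the search-space is needed to undo the conversion.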

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "search-data-collector",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Simon Blanke <simon.blanke@yahoo.com>",
    "keywords": "visualization, data-science",
    "author": null,
    "author_email": "Simon Blanke <simon.blanke@yahoo.com>",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2021 Simon Blanke  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": null,
    "version": "0.6.1",
    "project_urls": {
        "Bug Reports": "https://github.com/SimonBlanke/search-data-collector/issues",
        "Homepage": "https://github.com/SimonBlanke/search-data-collector",
        "Source": "https://github.com/SimonBlanke/search-data-collector"
    },
    "split_keywords": [
        "visualization",
        " data-science"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "345800bde402c439cfd10981d9aa1eca899ccd67279a9d7211b6d570f62ec955",
                "md5": "95a11138d1530a483e9755ad63567711",
                "sha256": "bfcbf5c103bd11df4a7e763e260658730e759817b0dd506614625c4886c6d000"
            },
            "downloads": -1,
            "filename": "search_data_collector-0.6.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "95a11138d1530a483e9755ad63567711",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 8073,
            "upload_time": "2024-08-15T12:20:47",
            "upload_time_iso_8601": "2024-08-15T12:20:47.412447Z",
            "url": "https://files.pythonhosted.org/packages/34/58/00bde402c439cfd10981d9aa1eca899ccd67279a9d7211b6d570f62ec955/search_data_collector-0.6.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-15 12:20:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "SimonBlanke",
    "github_project": "search-data-collector",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "tox": true,
    "lcname": "search-data-collector"
}
