| Field | Value |
|---|---|
| Name | sim-datasets |
| Version | 0.1.1 |
| home_page | None |
| Summary | A unified platform solution for symbolic regression, providing comprehensive support for Scientific-Intelligent-Modeling toolkits. Seamlessly integrates with ModelScope and Hugging Face for efficient dataset access. |
| upload_time | 2025-07-17 15:59:18 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.8 |
| license | GPL-3.0 |
| keywords | datasets, machine learning, scientific modeling, symbolic regression |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# SIM-Datasets
A unified platform solution for symbolic regression, providing comprehensive support for Scientific-Intelligent-Modeling toolkits. Seamlessly integrates with ModelScope and Hugging Face for efficient dataset access.
## 🌟 Key Features
- 🔄 **Multi-Source Support**: Works with both the Hugging Face and ModelScope platforms
- ⚡ **Smart Source Selection**: Automatically selects the fastest download source
- 🚀 **Concurrent Downloads**: Downloads asynchronously, with up to 20 concurrent tasks
- 📊 **Real-time Progress**: Displays detailed download progress and status
- 📁 **Smart Caching**: Automatically caches download results to avoid repeated downloads
- 🛠️ **Command Line Tools**: Provides a convenient command-line interface
- 🔧 **Proxy Support**: Complete proxy configuration support
- 📋 **Dataset Management**: Unified dataset list and configuration management
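
The concurrency cap mentioned in the feature list can be illustrated with a semaphore-bounded `asyncio` pattern. This is only a minimal sketch of the general technique, not the package's actual implementation: `download_all` and `fake_fetch` are hypothetical names, and the fake fetcher stands in for a real network call.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

MAX_CONCURRENT_TASKS = 20  # the cap quoted in the feature list


async def download_all(
    names: List[str],
    fetch: Callable[[str], Awaitable[str]],
    limit: int = MAX_CONCURRENT_TASKS,
) -> Dict[str, List[Tuple[str, str]]]:
    """Run fetch(name) for every name with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)
    downloaded: List[Tuple[str, str]] = []
    failed: List[Tuple[str, str]] = []

    async def worker(name: str) -> None:
        # The semaphore blocks here once `limit` workers are already running.
        async with semaphore:
            try:
                downloaded.append((name, await fetch(name)))
            except Exception as exc:  # record the failure and keep going
                failed.append((name, str(exc)))

    await asyncio.gather(*(worker(name) for name in names))
    return {"downloaded": downloaded, "failed": failed}


async def fake_fetch(name: str) -> str:
    """Hypothetical stand-in for a download: sleeps, then fails on one name."""
    await asyncio.sleep(0.01)
    if name == "broken":
        raise RuntimeError("simulated network error")
    return f"/tmp/cache/{name}"


if __name__ == "__main__":
    summary = asyncio.run(download_all(["BPG0", "BPG1", "broken"], fake_fetch, limit=2))
    print(f"ok={len(summary['downloaded'])} failed={len(summary['failed'])}")
```

Collecting failures instead of raising lets one bad dataset fail without cancelling the rest of the batch.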
## 📦 Installation
### Install from Source
```bash
# Clone repository
git clone https://github.com/scientific-intelligent-modelling/scientific-intelligent-modelling.git
cd scientific-intelligent-modelling
# Install the package in editable mode (also pulls in its dependencies)
pip install -e .
```
### Install from PyPI
```bash
pip install sim-datasets
```
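
To confirm that the install succeeded, one option is to ask the standard library for the installed distribution's version. The sketch below uses only `importlib.metadata` (available from Python 3.8); `installed_version` is a hypothetical helper name, not part of `sim-datasets`.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("sim-datasets")
    print(version if version else "sim-datasets is not installed")
```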
## 🚀 Quick Start
### Basic Usage
```python
from sim_datasets import get_datasets_list, download_dataset, download_single_dataset
# Get dataset list
datasets = get_datasets_list('llm-srbench')
print(f"Found {len(datasets)} datasets")
# Download single dataset
result = download_single_dataset('llm-srbench/bio_pop_growth/BPG0')
print(f"Dataset downloaded: {result['cache_path']}")
# Download entire dataset collection
result = download_dataset('llm-srbench')
print(f"Downloaded {len(result['downloaded'])} datasets")
```
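
The identifier `'llm-srbench/bio_pop_growth/BPG0'` appears to follow a `collection/subset/name` layout. That reading is an assumption inferred from the example and the subset names listed under Supported Datasets; the README does not state the format. A small hypothetical parser (`parse_dataset_ref` is not a `sim_datasets` function) shows how such a path could be split:

```python
from typing import List, NamedTuple, Optional


class DatasetRef(NamedTuple):
    collection: str
    subset: Optional[str]
    name: Optional[str]


def parse_dataset_ref(ref: str) -> DatasetRef:
    """Split 'collection[/subset[/name]]' into its parts (assumed layout)."""
    parts: List[Optional[str]] = [p for p in ref.strip("/").split("/") if p]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected 1 to 3 path segments, got {ref!r}")
    # Pad with None so a bare collection name still yields three fields.
    parts += [None] * (3 - len(parts))
    return DatasetRef(*parts)


if __name__ == "__main__":
    print(parse_dataset_ref("llm-srbench/bio_pop_growth/BPG0"))
```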
### Advanced Usage
```python
from sim_datasets import download_dataset_parallel
# Concurrent download (recommended for large datasets)
result = download_dataset_parallel(
    'llm-srbench',
    source='huggingface',      # or 'modelscope'
    max_workers=10,            # number of concurrent workers
    proxy='http://proxy:8080'  # optional proxy
)
print(f"Successfully downloaded: {len(result['downloaded'])}")
print(f"Failed: {len(result['failed'])}")
```
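
The `source` argument above selects a platform explicitly. How the package chooses the fastest source automatically is not documented here; one plausible approach is to time a small probe against each platform and keep the quickest. The sketch below is an assumption-laden illustration: `pick_fastest_source` and the probe callables are hypothetical, and the sleeps simulate network latency.

```python
import time
from typing import Callable, Dict, Optional


def pick_fastest_source(probes: Dict[str, Callable[[], None]]) -> Optional[str]:
    """Time each probe once and return the name of the quickest one.

    Probes that raise are treated as unreachable and skipped.
    Returns None when every probe fails.
    """
    timings: Dict[str, float] = {}
    for name, probe in probes.items():
        start = time.perf_counter()
        try:
            probe()
        except Exception:
            continue
        timings[name] = time.perf_counter() - start
    return min(timings, key=timings.get) if timings else None


if __name__ == "__main__":
    # Simulated probes; a real probe might fetch a small file from each platform.
    best = pick_fastest_source({
        "huggingface": lambda: time.sleep(0.05),
        "modelscope": lambda: time.sleep(0.001),
    })
    print(f"fastest source: {best}")
```

A single timing is noisy, so a real implementation would likely repeat the probe or apply a timeout.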
## 📋 Supported Datasets
### LLM-SRBench Datasets
- **Biological Population Growth** (`bio_pop_growth`): Biological population dynamics modeling data
- **Chemical Reactions** (`chem_react`): Chemical reaction kinetics data
- **LSR Transform** (`lsrtransform`): Common physical models transformed into less common mathematical representations
- **Materials Science** (`matsci`): Materials science related data
- **Physical Oscillations** (`phys_osc`): Physical oscillation system data
### SRBench 1.0 Datasets
- **Feynman Equations** (`feynman`): Feynman physics equation data
- **Strogatz Systems** (`strogatz`): Strogatz nonlinear system data
- **Black Box Functions** (`blackbox`): Black box function data
### SRSD Datasets
- **Feynman Easy** (`srsd-feynman_easy`): Simple Feynman equations
- **Feynman Medium** (`srsd-feynman_medium`): Medium difficulty Feynman equations
- **Feynman Hard** (`srsd-feynman_hard`): Hard Feynman equations
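
For scripting, the identifiers listed above can be collected into a lookup table. The subset identifiers below are copied from the lists; the collection key `'llm-srbench'` appears in the Quick Start, while `'srbench1.0'` and `'srsd'` are hypothetical keys chosen for this sketch and may not match the names the package expects.

```python
from typing import Dict, List, Optional

CATALOG: Dict[str, List[str]] = {
    "llm-srbench": ["bio_pop_growth", "chem_react", "lsrtransform", "matsci", "phys_osc"],
    # The two keys below are hypothetical labels, not confirmed collection names.
    "srbench1.0": ["feynman", "strogatz", "blackbox"],
    "srsd": ["srsd-feynman_easy", "srsd-feynman_medium", "srsd-feynman_hard"],
}


def find_collection(subset: str) -> Optional[str]:
    """Return the collection that lists the given subset identifier, if any."""
    for collection, subsets in CATALOG.items():
        if subset in subsets:
            return collection
    return None


if __name__ == "__main__":
    print(find_collection("strogatz"))
```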
## 📄 License
This project is licensed under the [GPL-3.0](https://opensource.org/licenses/GPL-3.0) License.
## 👥 Authors
- **Ziwen Zhang** - *Lead Developer* - [244824379@qq.com](mailto:244824379@qq.com)
- **Kai Li** - *Contributor* - [kai.li@ia.ac.cn](mailto:kai.li@ia.ac.cn)
## 🙏 Acknowledgments
Thanks to the following open source projects:
- [Hugging Face](https://huggingface.co/) - Providing dataset hosting services
- [ModelScope](https://modelscope.cn/) - Providing a model and dataset platform
- [datasets](https://github.com/huggingface/datasets) - Dataset processing library
## 📞 Contact Us
- Email: [244824379@qq.com](mailto:244824379@qq.com)
- Project Homepage: [https://github.com/scientific-intelligent-modelling/scientific-intelligent-modelling](https://github.com/scientific-intelligent-modelling/scientific-intelligent-modelling)
- Issue Reports: [GitHub Issues](https://github.com/scientific-intelligent-modelling/scientific-intelligent-modelling/issues)
---
⭐ If this project helps you, please give us a star!
Raw data
{
"_id": null,
"home_page": null,
"name": "sim-datasets",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "Ziwen Zhang <244824379@qq.com>",
"keywords": "datasets, machine learning, scientific modeling, symbolic regression",
"author": null,
"author_email": "Ziwen Zhang <244824379@qq.com>, Kai Li <kai.li@ia.ac.cn>",
"download_url": "https://files.pythonhosted.org/packages/d9/eb/ed826c94c9f26f01caf7a9aaeda7ecf5b64a0bf19ffe560f6089397078fa/sim_datasets-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPL-3.0",
"summary": "A unified platform solution for symbolic regression, providing comprehensive support for Scientific-Intelligent-Modeling toolkits. Seamlessly integrates with ModelScope and Hugging Face for efficient dataset access.",
"version": "0.1.1",
"project_urls": {
"Documentation": "https://github.com/scientific-intelligent-modelling/scientific-intelligent-modelling",
"Homepage": "https://github.com/scientific-intelligent-modelling/scientific-intelligent-modelling",
"Issues": "https://github.com/scientific-intelligent-modelling/scientific-intelligent-modelling/issues",
"Repository": "https://github.com/scientific-intelligent-modelling/scientific-intelligent-modelling"
},
"split_keywords": [
"datasets",
" machine learning",
" scientific modeling",
" symbolic regression"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "076a59b5e04fa9292b880d004715d6f3f1656a5a3109971bca8d20c71af4c3ea",
"md5": "619ec313d05622c509082c19c97a08b1",
"sha256": "b8336199ade0373d691c1e88869758a89e6dccf01965739ffd059500a5072ca4"
},
"downloads": -1,
"filename": "sim_datasets-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "619ec313d05622c509082c19c97a08b1",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 25779,
"upload_time": "2025-07-17T15:59:16",
"upload_time_iso_8601": "2025-07-17T15:59:16.891094Z",
"url": "https://files.pythonhosted.org/packages/07/6a/59b5e04fa9292b880d004715d6f3f1656a5a3109971bca8d20c71af4c3ea/sim_datasets-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d9ebed826c94c9f26f01caf7a9aaeda7ecf5b64a0bf19ffe560f6089397078fa",
"md5": "5a1505222a714a74962452eb7f9ddabe",
"sha256": "b349cbd6a2ae209bdfe91b5d9bd7acafae2c3e1d0983438205eeaea140939422"
},
"downloads": -1,
"filename": "sim_datasets-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "5a1505222a714a74962452eb7f9ddabe",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 15472,
"upload_time": "2025-07-17T15:59:18",
"upload_time_iso_8601": "2025-07-17T15:59:18.361581Z",
"url": "https://files.pythonhosted.org/packages/d9/eb/ed826c94c9f26f01caf7a9aaeda7ecf5b64a0bf19ffe560f6089397078fa/sim_datasets-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-17 15:59:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "scientific-intelligent-modelling",
"github_project": "scientific-intelligent-modelling",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "sim-datasets"
}