# PyScrew
PyScrew is a Python package designed to simplify access to industrial research data from screw driving experiments. It provides a streamlined interface for downloading, validating, and preparing experimental datasets hosted on Zenodo.
More information on the data is available here: https://zenodo.org/records/14769379
## Features
- **Easy Data Access**: Simple interface to download and extract screw driving datasets
- **Data Integrity**: Automatic checksum verification and secure extraction
- **Caching System**: Smart caching to prevent redundant downloads
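
The integrity check listed above can be pictured with a short, generic sketch. It is not PyScrew's actual implementation: the helper names `file_digest` and `verify_archive` are hypothetical, and the choice of SHA-256 is an assumption, since this README does not state which checksum algorithm the package uses.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_archive(path: Path, expected_digest: str, algorithm: str = "sha256") -> bool:
    """Compare the computed digest with the expected one (case-insensitive)."""
    return file_digest(path, algorithm) == expected_digest.lower()
```

A downloader following this pattern would typically refuse to extract an archive for which `verify_archive` returns `False`.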
## Installation
Install PyScrew directly from PyPI (requires Python 3.11 or later):
```bash
pip install pyscrew
```
## Quick start
```python
import pyscrew
# List available scenarios with their descriptions
scenarios = pyscrew.list_scenarios()
print("Available scenarios:", scenarios)
# Load and process data from a specific scenario
data = pyscrew.get_data("surface-friction")
# Access the data
print("Available measurements:", data.keys())
print("Number of torque measurements:", len(data["torque values"]))
```
## Available Scenarios
Our datasets examine various aspects of screw driving operations in industrial settings. Each scenario focuses on specific experimental conditions and research questions:
| ID | Name | Description | Samples | Classes | Documentation |
|----|------|-------------|---------|---------|---------------|
| s01 | Thread Degradation | Examines thread degradation in plastic materials through repeated fastening operations | 5,000 | 1 | [Details](docs/scenarios/s01_thread-degradation.md) |
| s02 | Surface Friction | Investigates the impact of different surface conditions (water, lubricant, adhesive, etc.) on screw driving operations | 12,500 | 8 | [Details](docs/scenarios/s02_surface-friction.md) |
| s03 | Error Collection 1 | Placeholder documentation for the upcoming scenario 3, which will cover multiple error classes | TBD | TBD | [Details](docs/scenarios/s03_error-collection-1.md) |
## Package structure
```bash
PyScrew/
├── docs/
│   └── scenarios/                     # Detailed scenario documentation
│       ├── s01_thread-degradation.md
│       ├── s02_surface-friction.md
│       └── s03_error-collection-1.md
├── src/
│   └── pyscrew/
│       ├── __init__.py                # Package initialization and version
│       ├── main.py                    # Main interface and high-level functions
│       ├── loading.py                 # Data loading from Zenodo
│       ├── processing.py              # Data processing functionality
│       ├── tools/                     # Utility scripts and tools
│       │   ├── create_label_csv.py    # Label file generation
│       │   └── get_dataset_metrics.py # Documentation metrics calculation
│       └── utils/                     # Utility functions and helpers
│           ├── data_model.py
│           └── logger.py
└── tests/                             # Test suite
```
## API Reference
### Main Functions
`get_data(scenario_name: str, cache_dir: Optional[Path] = None, force: bool = False) -> Path`
Downloads and extracts a specific dataset.
* `scenario_name`: Name of the dataset to download
* `cache_dir`: Optional custom cache directory (default: ~/.cache/pyscrew)
* `force`: Force re-download even if cached
* **Returns:** Path to extracted dataset
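
As a hedged illustration of the parameters above, the sketch below wraps `get_data` in a small helper that passes a project-local cache directory and an explicit refresh flag. The wrapper name `fetch_scenario` and its `refresh` argument are hypothetical; only `get_data` and its `scenario_name`, `cache_dir`, and `force` parameters come from this README. The return value is passed through untouched, so the sketch makes no assumption about its type.

```python
from pathlib import Path
from typing import Any, Optional


def fetch_scenario(
    scenario_name: str,
    cache_dir: Optional[Path] = None,
    refresh: bool = False,
) -> Any:
    """Hypothetical wrapper forwarding to pyscrew.get_data with the documented parameters."""
    # Imported lazily so that this helper can be defined without pyscrew installed
    import pyscrew

    # cache_dir=None keeps the documented default (~/.cache/pyscrew);
    # force=True requests a fresh download even if a cached copy exists
    return pyscrew.get_data(scenario_name, cache_dir=cache_dir, force=refresh)
```

Calling `fetch_scenario("surface-friction", cache_dir=Path("data/pyscrew"), refresh=True)` would then request a fresh download into a directory inside the current project instead of the user-wide cache.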
`list_scenarios() -> Dict[str, str]`
Lists all available datasets and their descriptions.
* **Returns:** Dictionary mapping scenario names to descriptions
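
Since the documented return value is a plain dictionary of names and descriptions, it can be rendered as an aligned listing with a few lines of standard Python. The helper `format_scenarios` below is a hypothetical convenience function, not part of PyScrew; it only assumes the `Dict[str, str]` shape given above.

```python
def format_scenarios(scenarios: dict[str, str]) -> str:
    """Render a name -> description mapping as aligned, alphabetically sorted lines."""
    if not scenarios:
        return "No scenarios available."
    # Pad every name to the length of the longest one so the descriptions line up
    width = max(len(name) for name in scenarios)
    lines = [
        f"{name.ljust(width)}  {description}"
        for name, description in sorted(scenarios.items())
    ]
    return "\n".join(lines)
```

With PyScrew installed, `print(format_scenarios(pyscrew.list_scenarios()))` would print one scenario per line.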
## Cache Structure
Downloaded data is stored in:
```bash
~/.cache/pyscrew/
├── archives/     # Compressed dataset archives
└── extracted/    # Extracted dataset files
    ├── s01_thread-degradation/
    ├── s02_surface-friction/
    ├── s03_error-collection-1/
    └── ...
```
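
Given the layout above, the datasets that are already available locally can be found by inspecting the `extracted/` directory. The following is a minimal sketch using only the standard library; `list_cached_scenarios` is a hypothetical helper, and it assumes the default cache location and directory names shown in the tree.

```python
from pathlib import Path

# Default cache root as documented above
DEFAULT_CACHE_ROOT = Path.home() / ".cache" / "pyscrew"


def list_cached_scenarios(cache_root: Path = DEFAULT_CACHE_ROOT) -> list[str]:
    """Return the sorted names of scenario folders found under <cache_root>/extracted."""
    extracted = Path(cache_root) / "extracted"
    if not extracted.is_dir():
        # Nothing has been downloaded and extracted yet
        return []
    return sorted(entry.name for entry in extracted.iterdir() if entry.is_dir())
```

An empty list simply means that no dataset has been extracted into that cache directory so far.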
## Development
The package is under active development. Planned additions include data processing utilities and data validation tools.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Citation
If you use this package in your research, please cite one of the following publications:
* West, N., & Deuse, J. (2024). A Comparative Study of Machine Learning Approaches for Anomaly Detection in Industrial Screw Driving Data. Proceedings of the 57th Hawaii International Conference on System Sciences (HICSS), 1050-1059. https://hdl.handle.net/10125/106504
* West, N., Trianni, A. & Deuse, J. (2024). Data-driven analysis of bolted joints in plastic housings with surface-based anomalies using supervised and unsupervised machine learning. CIE51 Proceedings. _(DOI will follow after publication of the proceedings)_
*A dedicated paper for this library is currently in progress.*