| Field | Value |
| --- | --- |
| Name | trane |
| Version | 0.8.0 |
| Summary | automatically generate prediction problems and labels for supervised learning. |
| upload_time | 2024-01-02 15:50:36 |
| docs_url | None |
| requires_python | <4,>=3.8 |
| license | MIT License |
| keywords | trane, data science, machine learning |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
<p align="center">
<img width=50% src="https://github.com/trane-dev/Trane/blob/main/docs/trane-header.png" alt="Trane Logo" />
</p>
<p align="center">
<a href="https://github.com/trane-dev/Trane/actions/workflows/tests.yaml" target="_blank">
<img src="https://github.com/trane-dev/Trane/actions/workflows/tests.yaml/badge.svg" alt="Tests Status" />
</a>
<a href="https://codecov.io/gh/trane-dev/Trane" target="_blank">
<img src="https://codecov.io/gh/trane-dev/Trane/branch/main/graph/badge.svg?token=HafAlYGH8F" alt="Code Coverage" />
</a>
<a href="https://badge.fury.io/py/Trane" target="_blank">
<img src="https://badge.fury.io/py/Trane.svg?maxAge=2592000" alt="PyPI Version" />
</a>
<a href="https://pepy.tech/project/Trane" target="_blank">
<img src="https://static.pepy.tech/badge/trane" alt="PyPI Downloads" />
</a>
</p>
<hr>
**Trane** is a software package that automatically generates prediction problems for temporal datasets and produces labels for supervised learning. Its goal is to streamline the machine learning problem-solving process.
## Install
Install Trane using pip:
```shell
python -m pip install trane
```
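To confirm the install, a quick check like the following may help. It is a minimal sketch that uses only the standard library; the helper names `installed_trane_version` and `python_is_supported` are made up for illustration and are not part of Trane. The Python bounds mirror the package's `requires_python` metadata (`<4,>=3.8`).

```python
import sys
from importlib import metadata


def installed_trane_version():
    """Return the installed trane version string, or None if it is not installed."""
    try:
        return metadata.version("trane")
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(version_info=sys.version_info):
    """Check an interpreter version against the requires_python range (<4,>=3.8)."""
    return (3, 8) <= tuple(version_info[:2]) and version_info[0] < 4


print("Python supported:", python_is_supported())
print("trane version:", installed_trane_version() or "not installed")
```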
## Usage
Here's a quick demonstration of Trane in action:
```python
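import trane

# Load the Airbnb sample dataset and its column metadata.
data, metadata = trane.load_airbnb()

# Generate candidate prediction problems, one set per location entity.
problem_generator = trane.ProblemGenerator(
    metadata=metadata,
    entity_columns=["location"]
)
problems = problem_generator.generate()

# Print the first five generated problems.
for problem in problems[:5]:
    print(problem)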
```
A few of the generated problems:
```
==================================================
Generated 40 total problems
--------------------------------------------------
Classification problems: 5
Regression problems: 35
==================================================
For each <location> predict if there exists a record
For each <location> predict if there exists a record with <location> equal to <str>
For each <location> predict if there exists a record with <location> not equal to <str>
For each <location> predict if there exists a record with <rating> equal to <str>
For each <location> predict if there exists a record with <rating> not equal to <str>
```
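To make these problem statements concrete, here is a minimal sketch of what a label for `For each <location> predict if there exists a record` could look like. It is plain Python and not Trane's implementation: the toy records, the cutoff time, the window length, and the helper `exists_record_labels` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Toy records (hypothetical): one row per review, with an entity column and a timestamp.
records = [
    {"location": "London", "rating": 4, "time": datetime(2021, 1, 1)},
    {"location": "London", "rating": 5, "time": datetime(2021, 1, 3)},
    {"location": "Paris", "rating": 3, "time": datetime(2021, 1, 1)},
]


def exists_record_labels(rows, entity_column, cutoff, window):
    """For each entity seen at or before the cutoff, label whether any record
    falls in the prediction window (cutoff, cutoff + window]."""
    entities = {r[entity_column] for r in rows if r["time"] <= cutoff}
    labels = {entity: False for entity in entities}
    for r in rows:
        if r[entity_column] in labels and cutoff < r["time"] <= cutoff + window:
            labels[r[entity_column]] = True
    return labels


labels = exists_record_labels(
    records, "location", cutoff=datetime(2021, 1, 2), window=timedelta(days=2)
)
for location in sorted(labels):
    print(location, labels[location])
```

Each entity gets one boolean target per cutoff time, which is why this kind of problem is counted as classification.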
With Trane's LLM add-on (`pip install "trane[llm]"`), we can use OpenAI to select the most relevant problems:
```python
from trane.llm import analyze

instructions = "determine 5 most relevant problems about user's booking preferences. Do not include 'predict the first/last X' problems"
context = "Airbnb data listings in major cities, including information about hosts, pricing, location, and room type, along with over 5 million historical reviews."

# Ask the LLM to pick the problems that best match the instructions and context.
relevant_problems = analyze(
    problems=problems,
    instructions=instructions,
    context=context,
    model="gpt-3.5-turbo-16k"
)

# Print each selected problem together with the model's stated reasoning.
for problem in relevant_problems:
    print(problem)
    print(f'Reasoning: {problem.get_reasoning()}\n')
```
Output:
```text
For each <location> predict if there exists a record
Reasoning: This problem can help identify locations with missing data or locations that have not been booked at all.

For each <location> predict the first <location> in all related records
Reasoning: Predicting the first location in all related records can provide insights into the most frequently booked locations for each city.

For each <location> predict the first <rating> in all related records
Reasoning: Predicting the first rating in all related records can provide insights into the average satisfaction level of guests for each location.

For each <location> predict the last <location> in all related records
Reasoning: Predicting the last location in all related records can provide insights into the most recent bookings for each city.

For each <location> predict the last <rating> in all related records
Reasoning: Predicting the last rating in all related records can provide insights into the recent satisfaction level of guests for each location.
```
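Four of the five problems in this sample output are "predict the first/last" problems, even though the instructions asked to exclude them. If such a constraint has to hold strictly, one option is to filter on the problem text in addition to prompting. The sketch below works on plain strings and assumes that `str(problem)` yields the description shown by `print(problem)`; the helper `drop_first_last` is hypothetical and not part of Trane.

```python
import re

# Matches descriptions such as "... predict the first <rating> in all related records".
FIRST_LAST = re.compile(r"\bpredict the (first|last)\b")


def drop_first_last(descriptions):
    """Keep only descriptions that are not 'predict the first/last X' problems."""
    return [d for d in descriptions if not FIRST_LAST.search(d)]


descriptions = [
    "For each <location> predict if there exists a record",
    "For each <location> predict the first <rating> in all related records",
    "For each <location> predict the last <location> in all related records",
]
kept = drop_first_last(descriptions)
for description in kept:
    print(description)
```

Under the same assumption, the filter could be applied to Trane objects as `[p for p in problems if not FIRST_LAST.search(str(p))]` before passing them to `analyze`.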
## Community
- **Questions or Issues?** Create a [GitHub issue](https://github.com/trane-dev/Trane/issues).
- **Want to Chat?** [Join our Slack community](https://join.slack.com/t/trane-dev/shared_invite/zt-1zglnh25c-ryuQFarw0rVgKHC6ywUOlg).
## Cite Trane
If you find Trane beneficial, consider citing our paper:
Ben Schreck, Kalyan Veeramachaneni. [What Would a Data Scientist Ask? Automatically Formulating and Solving Predictive Problems.](https://dai.lids.mit.edu/wp-content/uploads/2017/10/Trane1.pdf) *IEEE DSAA 2016*, 440-451.
BibTeX entry:
```bibtex
@inproceedings{schreck2016would,
title={What Would a Data Scientist Ask? Automatically Formulating and Solving Predictive Problems},
author={Schreck, Benjamin and Veeramachaneni, Kalyan},
booktitle={Data Science and Advanced Analytics (DSAA), 2016 IEEE International Conference on},
pages={440--451},
year={2016},
organization={IEEE}
}
```
Raw data
{
"_id": null,
"home_page": "",
"name": "trane",
"maintainer": "",
"docs_url": null,
"requires_python": "<4,>=3.8",
"maintainer_email": "MIT Data to AI Lab <dai-lab-trane@mit.edu>",
"keywords": "trane,data science,machine learning",
"author": "",
"author_email": "MIT Data to AI Lab <dai-lab-trane@mit.edu>",
"download_url": "https://files.pythonhosted.org/packages/2c/87/77b9b61a74c9b66392b9b383efdd2c572bffd4b40fd4d64e9fdb3f19a805/trane-0.8.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "automatically generate prediction problems and labels for supervised learning.",
"version": "0.8.0",
"project_urls": {
"Changes": "https://github.com/trane-dev/Trane/blob/main/docs/changelog.md",
"Chat": "https://join.slack.com/t/trane-dev/shared_invite/zt-1zglnh25c-ryuQFarw0rVgKHC6ywUOlg",
"Issue Tracker": "https://github.com/trane-dev/Trane/issues",
"Source Code": "https://github.com/trane-dev/Trane/",
"Twitter": "https://twitter.com/lab_dai"
},
"split_keywords": [
"trane",
"data science",
"machine learning"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f2f01755d68322eca0c1344c5786650ec0d4d1f2d141d1b3e9135fff28090d64",
"md5": "7fd6e736471214a7059e6ce19fe38a18",
"sha256": "9f69b86da4bd3226a1b25bb7f6fafb91ae47b9e7ef21a9dc99d4e200f6c9a8b5"
},
"downloads": -1,
"filename": "trane-0.8.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7fd6e736471214a7059e6ce19fe38a18",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4,>=3.8",
"size": 4390115,
"upload_time": "2024-01-02T15:50:33",
"upload_time_iso_8601": "2024-01-02T15:50:33.652073Z",
"url": "https://files.pythonhosted.org/packages/f2/f0/1755d68322eca0c1344c5786650ec0d4d1f2d141d1b3e9135fff28090d64/trane-0.8.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "2c8777b9b61a74c9b66392b9b383efdd2c572bffd4b40fd4d64e9fdb3f19a805",
"md5": "1ce664566a94b7eb49792eb64af887cf",
"sha256": "677514a691ba5a49a4b4569605a23990005549cd7943c71c8fc8e4ccef60684f"
},
"downloads": -1,
"filename": "trane-0.8.0.tar.gz",
"has_sig": false,
"md5_digest": "1ce664566a94b7eb49792eb64af887cf",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4,>=3.8",
"size": 4366668,
"upload_time": "2024-01-02T15:50:36",
"upload_time_iso_8601": "2024-01-02T15:50:36.697110Z",
"url": "https://files.pythonhosted.org/packages/2c/87/77b9b61a74c9b66392b9b383efdd2c572bffd4b40fd4d64e9fdb3f19a805/trane-0.8.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-02 15:50:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "trane-dev",
"github_project": "Trane",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "trane"
}
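The `digests` entries above can be used to check a downloaded file. The following is a minimal sketch using only the standard library; the helper name `sha256_matches` and the local file path are assumptions for illustration.

```python
import hashlib
import os


def sha256_matches(path, expected_hex, chunk_size=1 << 20):
    """Hash the file in chunks and compare with the expected hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


# sha256 of the sdist as listed in the metadata above.
expected = "677514a691ba5a49a4b4569605a23990005549cd7943c71c8fc8e4ccef60684f"
path = "trane-0.8.0.tar.gz"  # assumed local download location

if os.path.exists(path):
    print("digest ok:", sha256_matches(path, expected))
else:
    print("file not found:", path)
```

Reading in chunks keeps memory use flat, which matters little for a 4 MB archive but makes the helper reusable for larger files.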