taxonomy-synthesis

- Name: taxonomy-synthesis
- Version: 0.1.9
- Upload time: 2024-11-05 06:02:01
- Author: Sebastian Sosa
- Requires Python: <4.0,>=3.9

# `taxonomy-synthesis`
An AI-driven framework for synthesizing adaptive taxonomies, enabling automated data categorization and classification within dynamic hierarchical structures.

_TLDR: copy this README and throw it into ChatGPT. It will figure things out for you. (will create a "GPT" soon)_

_Join our [Discord Community](https://discord.gg/jZSVhtwTz6) for questions, discussions, and collaboration!_

_Check out our YouTube [demo video](https://www.youtube.com/shorts/6QWXs241IEo) to see Taxonomy Synthesis in action!_

## Explain Like I'm 5 🤔
Imagine you have a big box of different animals, but you're not sure how to group them. You know there are "Mammals" and "Reptiles," but you don't know the smaller groups they belong to, like which mammals are more similar or which reptiles go together. This tool uses smart AI helpers to figure out those smaller groups for you, like finding out there are "Rodents" and "Primates" among the mammals, and "Lizards" and "Snakes" among the reptiles. It then helps you sort all the animals into the right groups automatically, keeping everything neatly organized!

## Features 🛠️

- **Manual and Automatic Taxonomy Generation**: Flexibly create taxonomy trees manually or automatically from arbitrary items.
- **Recursive Tree Primitives**: Utilize a tree structure that supports recursive operations, making it easy to manage hierarchical data.
- **AI-Generated Subcategories**: Automatically generate subcategories using AI models based on the context and data provided.
- **AI Classification**: Automatically classify items into appropriate categories using advanced AI models.

## Quickstart Guide ([colab](https://colab.research.google.com/drive/1BgUYdeT6aP23nYm2zopjLNKAB_9-7Hp2?usp=sharing)) 🚀


In this quickstart, we'll walk you through the process of using `taxonomy-synthesis` to create a simplified phylogenetic tree for a list of animals. We'll demonstrate how to initialize the package, set up an OpenAI client, manually create a taxonomy tree, generate subcategories automatically, and classify items using AI.

### 1. Download and Install the Package

First, ensure you have the package installed. You can install taxonomy-synthesis directly using pip:

```bash
pip install taxonomy-synthesis
```

### 2. Set Up OpenAI Client

Before proceeding, make sure you have an OpenAI API key.

```python
# Set up the OpenAI client
from openai import OpenAI

client = OpenAI(api_key='sk-...')
```
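
Hardcoding a key in source makes it easy to leak. A minimal sketch, assuming the key is stored in an environment variable named `OPENAI_API_KEY` (the variable the official `openai` client also reads by default). `load_api_key` is a hypothetical helper written for this example, not part of `taxonomy-synthesis`:

```python
import os


def load_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# Usage (requires the openai package and a real key):
# from openai import OpenAI
# client = OpenAI(api_key=load_api_key())
```

Failing early with a clear message is usually easier to debug than an authentication error raised later, in the middle of a classification call.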

### 3. Prepare Your Data

We'll start with a list of 10 animal species, each represented with an arbitrary schema containing fields like `name`, `fun_fact`, `lifespan_years`, and `emoji`. The only required field is `id`, which should be unique for each item.

```python
# Prepare a list of items (animals) with various attributes
items = [
  {"id": "🦘", "name": "Kangaroo", "fun_fact": "Can hop at high speeds", "lifespan_years": 23, "emoji": "🦘"},
  {"id": "🐨", "name": "Koala", "fun_fact": "Sleeps up to 22 hours a day", "lifespan_years": 18, "emoji": "🐨"},
  {"id": "🐘", "name": "Elephant", "fun_fact": "Largest land animal", "lifespan_years": 60, "emoji": "🐘"},
  {"id": "🐕", "name": "Dog", "fun_fact": "Best friend of humans", "lifespan_years": 15, "emoji": "🐕"},
  {"id": "🐄", "name": "Cow", "fun_fact": "Gives milk", "lifespan_years": 20, "emoji": "🐄"},
  {"id": "🐁", "name": "Mouse", "fun_fact": "Can squeeze through tiny gaps", "lifespan_years": 2, "emoji": "🐁"},
  {"id": "🐊", "name": "Crocodile", "fun_fact": "Lives in water and land", "lifespan_years": 70, "emoji": "🐊"},
  {"id": "🐍", "name": "Snake", "fun_fact": "No legs", "lifespan_years": 9, "emoji": "🐍"},
  {"id": "🐢", "name": "Turtle", "fun_fact": "Can live over 100 years", "lifespan_years": 100, "emoji": "🐢"},
  {"id": "🦎", "name": "Gecko", "fun_fact": "Can climb walls", "lifespan_years": 5, "emoji": "🦎"}
]
```
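
Since every `id` should be unique, it can help to check the list before building `Item` objects. A minimal sketch using only the standard library. `find_duplicate_ids` is a hypothetical helper, and plain ASCII ids stand in for the emoji ids used above:

```python
from collections import Counter


def find_duplicate_ids(items):
    """Return the ids that appear more than once, in first-seen order."""
    counts = Counter(item["id"] for item in items)
    return [item_id for item_id, n in counts.items() if n > 1]


sample = [
    {"id": "kangaroo", "name": "Kangaroo"},
    {"id": "koala", "name": "Koala"},
    {"id": "kangaroo", "name": "Red kangaroo"},
]
duplicates = find_duplicate_ids(sample)
print(duplicates)  # ['kangaroo']
```

An item without an `id` key raises `KeyError` here, which also surfaces the missing required field.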

### 4. Initialize the Tree Structure

Create the root node for our taxonomy tree and initialize two subclasses: `Mammals` and `Reptiles`.

```python
from taxonomy_synthesis.models import Category, Item
from taxonomy_synthesis.tree.tree_node import TreeNode

# Create root node and two primary subclasses
root_category = Category(name="Animals", description="All animals")
mammal_category = Category(name="Mammals", description="Mammal species")
reptile_category = Category(name="Reptiles", description="Reptile species")

root_node = TreeNode(value=root_category)
mammal_node = TreeNode(value=mammal_category)
reptile_node = TreeNode(value=reptile_category)

# Add subclasses to the root node
root_node.add_child(mammal_node)
root_node.add_child(reptile_node)
```
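
To make the tree shape concrete, here is a self-contained stand-in written for illustration. `SketchNode` and its `render` method are assumptions of this sketch and are not the library's `TreeNode`. The sketch only mirrors the idea of nodes holding items and children, rendered recursively:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchNode:
    """Stand-in for a taxonomy node: a name, its items, and child nodes."""
    name: str
    items: List[str] = field(default_factory=list)
    children: List["SketchNode"] = field(default_factory=list)

    def add_child(self, child: "SketchNode") -> None:
        self.children.append(child)

    def render(self, depth: int = 0) -> str:
        """Recursively render the subtree, indenting two spaces per level."""
        line = "  " * depth + f"{self.name}: [{', '.join(self.items)}]"
        return "\n".join([line] + [c.render(depth + 1) for c in self.children])


root = SketchNode("Animals")
root.add_child(SketchNode("Mammals"))
root.add_child(SketchNode("Reptiles"))
print(root.render())
```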

### 5. Classify Items in the Root Node

Classify all items under the root node into `Mammals` or `Reptiles` using the AI classifier.

```python
from taxonomy_synthesis.tree.node_operator import NodeOperator
from taxonomy_synthesis.classifiers.gpt_classifier import GPTClassifier

# Initialize the GPT classifier and node operator
classifier = GPTClassifier(client=client)
generator = None  # We'll use manual generation for this part
operator = NodeOperator(classifier=classifier, generator=generator)

# Convert dictionary items to Item objects and classify
item_objects = [Item(**item) for item in items]
classified_items = operator.classify_items(root_node, item_objects)

print("After initial classification:")
print(root_node.print_tree())
```
_Output:_
```
After initial classification:
Animals: []
  Mammals: [🦘, 🐨, 🐘, 🐕, 🐄, 🐁]
  Reptiles: [🐊, 🐍, 🐢, 🦎]
```
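
Conceptually, this step moves each item from the parent into one of its children. The sketch below illustrates that idea with a fixed lookup table standing in for the model's answer. `distribute` and `LOOKUP` are hypothetical names, and how the library itself treats a label that matches no child is not described here, so the sketch simply sets such items aside:

```python
def distribute(items, child_names, choose):
    """Assign each item to one child name using the `choose` callable.

    Items whose chosen label is not a known child go to `unassigned`.
    """
    assigned = {name: [] for name in child_names}
    unassigned = []
    for item in items:
        label = choose(item, child_names)
        if label in assigned:
            assigned[label].append(item["id"])
        else:
            unassigned.append(item["id"])
    return assigned, unassigned


# A fixed lookup table stands in for the model's answer.
LOOKUP = {"Dog": "Mammals", "Snake": "Reptiles", "Gecko": "Reptiles"}

sketch_items = [
    {"id": "dog", "name": "Dog"},
    {"id": "snake", "name": "Snake"},
    {"id": "gecko", "name": "Gecko"},
    {"id": "frog", "name": "Frog"},
]
assigned, unassigned = distribute(
    sketch_items,
    ["Mammals", "Reptiles"],
    lambda item, names: LOOKUP.get(item["name"]),
)
print(assigned)
print(unassigned)
```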

### 6. Generate Subcategories for Mammals

Use AI to automatically generate subcategories under `Mammals` based on the provided data.

```python
from taxonomy_synthesis.generator.taxonomy_generator import TaxonomyGenerator

# Initialize the Taxonomy Generator
generator = TaxonomyGenerator(
    client=client,
    max_categories=2,
    generation_method="Create categories in accordance with the phylogenetic tree."
)
operator.generator = generator

# Generate subcategories under Mammals
new_categories = operator.generate_subcategories(mammal_node)

print("Generated subcategories under 'Mammals':")
print(mammal_node.print_tree())
```
_Output:_
```
Generated subcategories under 'Mammals':
Mammals: [🦘, 🐨, 🐘, 🐕, 🐄, 🐁]
  marsupials: []
  placentals: []
```

### 7. Reclassify Items under Mammals

Now classify the items specifically under the `Mammals` node into their newly generated subcategories.

```python
# Reclassify items under Mammals based on the new subcategories
classified_items = operator.classify_items(mammal_node, mammal_node.get_all_items())

print("After reclassification under 'Mammals':")
print(mammal_node.print_tree())
```
_Output:_
```
After reclassification under 'Mammals':
Mammals: []
  marsupials: [🦘, 🐨]
  placentals: [🐘, 🐕, 🐄, 🐁]
```
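
Because the assignment comes from a model, it can be worth verifying that every item that started under the parent ended up in exactly one subcategory. A minimal sketch on plain lists and dicts. `check_partition` is a hypothetical helper, and ASCII ids stand in for the emoji ids:

```python
def check_partition(parent_ids, children):
    """Compare the parent's original ids with the ids found in its children.

    Returns (missing, duplicated): ids that landed in no child, and ids that
    landed in more than one child.
    """
    seen = {}
    for child_name, ids in children.items():
        for item_id in ids:
            seen.setdefault(item_id, []).append(child_name)
    missing = [i for i in parent_ids if i not in seen]
    duplicated = sorted(i for i, names in seen.items() if len(names) > 1)
    return missing, duplicated


before = ["kangaroo", "koala", "elephant", "dog", "cow", "mouse"]
after = {
    "marsupials": ["kangaroo", "koala"],
    "placentals": ["elephant", "dog", "cow", "mouse"],
}
print(check_partition(before, after))  # ([], [])
```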

### 8. Print the Final Tree Structure

Finally, print the entire tree to see the categorized structure.

```python
# Print the final tree structure
print("Final taxonomy tree structure:")
print(root_node.print_tree())
```
_Output:_
```
Final taxonomy tree structure:
Animals: []
  Mammals: []
    marsupials: [🦘, 🐨]
    placentals: [🐘, 🐕, 🐄, 🐁]
  Reptiles: [🐊, 🐍, 🐢, 🦎]
```
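
Once the tree is final, a flat lookup from item to category path is often handy for downstream use. The sketch below assumes the tree has been copied by hand into nested dictionaries with `name`, `items`, and `children` keys. That layout and the `paths_by_item` helper are assumptions of this example, not an export format provided by the library:

```python
def paths_by_item(tree, prefix=""):
    """Map each item id to the slash-separated path of its category."""
    path = f"{prefix}/{tree['name']}" if prefix else tree["name"]
    mapping = {item_id: path for item_id in tree.get("items", [])}
    for child in tree.get("children", []):
        mapping.update(paths_by_item(child, path))
    return mapping


final_tree = {
    "name": "Animals",
    "children": [
        {
            "name": "Mammals",
            "children": [
                {"name": "marsupials", "items": ["kangaroo", "koala"]},
                {"name": "placentals", "items": ["elephant", "dog", "cow", "mouse"]},
            ],
        },
        {"name": "Reptiles", "items": ["crocodile", "snake", "turtle", "gecko"]},
    ],
}
lookup = paths_by_item(final_tree)
print(lookup["koala"])  # Animals/Mammals/marsupials
```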

## System Diagram 🎨

For a visual representation of the system architecture and its components, refer to the following diagram:

![v1 Class Diagram](https://github.com/user-attachments/assets/ffdbe2b1-4ad4-4b2b-9a72-5b14b2f3adfa)

## Contributing 🤗

Contributions are welcome! To get started, follow these steps to set up your development environment:

1. **Clone the Repository**:

   ```bash
   git clone https://github.com/CakeCrusher/TaxonomySynthesis.git
   cd TaxonomySynthesis
   ```

2. **Install Poetry** (if not already installed):

   ```bash
   curl -sSL https://install.python-poetry.org | python3 -
   ```

3. **Install Dependencies**:

   Use Poetry to install all the dependencies in a virtual environment:

   ```bash
   poetry install
   ```

4. **Activate the Virtual Environment**:

   To activate the virtual environment created by Poetry:

   ```bash
   poetry shell
   ```

5. **Run Pre-Commit Hooks**:

   To maintain code quality, please run pre-commit hooks before submitting any pull requests:

   ```bash
   poetry run pre-commit install
   poetry run pre-commit run --all-files
   ```

We encourage you to open issues for any bugs you encounter or features you'd like to see added. Pull requests are also highly appreciated! Let's work together to improve and expand this project.
            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "taxonomy-synthesis",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": null,
    "keywords": null,
    "author": "Sebastian Sosa",
    "author_email": "1sebastian1sosa1@gmail.com> @ OPENGPT LLC <https://opengpt.llc/",
    "download_url": "https://files.pythonhosted.org/packages/24/7b/1c4a6c4ca0eec75b7b5eff0e91612935c3bf5617432152ba96caa93729b6/taxonomy_synthesis-0.1.9.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "An AI-driven framework for synthesizing adaptive taxonomies, enabling automated data categorization and classification within dynamic hierarchical structures.",
    "version": "0.1.9",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fa96ac0cce6b7b48348f67ed04d5d6dc28d17d65335e44eae804059eb089d3a1",
                "md5": "eca348b598cfe2f98cf755828ad83807",
                "sha256": "16be28436813ff76e3b69c55cf1439c8d0e19ecb2ec1c1f6094abe7bd5acf0a4"
            },
            "downloads": -1,
            "filename": "taxonomy_synthesis-0.1.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "eca348b598cfe2f98cf755828ad83807",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 13534,
            "upload_time": "2024-11-05T06:02:00",
            "upload_time_iso_8601": "2024-11-05T06:02:00.060363Z",
            "url": "https://files.pythonhosted.org/packages/fa/96/ac0cce6b7b48348f67ed04d5d6dc28d17d65335e44eae804059eb089d3a1/taxonomy_synthesis-0.1.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "247b1c4a6c4ca0eec75b7b5eff0e91612935c3bf5617432152ba96caa93729b6",
                "md5": "60c2ebe76a2bfc832cd0a7a09f3437f3",
                "sha256": "6abae374babbba384db0cd4b21fc4cd2cf46f6fc4af041e1910314e1e80d8e65"
            },
            "downloads": -1,
            "filename": "taxonomy_synthesis-0.1.9.tar.gz",
            "has_sig": false,
            "md5_digest": "60c2ebe76a2bfc832cd0a7a09f3437f3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 12430,
            "upload_time": "2024-11-05T06:02:01",
            "upload_time_iso_8601": "2024-11-05T06:02:01.339771Z",
            "url": "https://files.pythonhosted.org/packages/24/7b/1c4a6c4ca0eec75b7b5eff0e91612935c3bf5617432152ba96caa93729b6/taxonomy_synthesis-0.1.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-05 06:02:01",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "taxonomy-synthesis"
}
        