wyn-transformers

Name: wyn-transformers
Version: 0.1.8
Summary: The official package to train a transformer from scratch.
Author: Yiqiao Yin
Requires Python: <4.0,>=3.10
Uploaded: 2024-08-27 00:09:53
Requirements: none recorded
# wyn-transformers 🧠✨

A package that allows developers to train a transformer model from scratch, tailored for sentence-to-sentence tasks, such as question-answer pairs.

Repo is [here](https://github.com/yiqiao-yin/wyn-transformers).

<details>
<summary>📺 Click here for YouTube Tutorials</summary>

1. [Introduction to wyn-transformers](https://youtu.be/D-bbwlV7arU)
2. [Train on Custom Data Frame](https://youtu.be/IZkJwIXRao4)
3. [How to Fine-tune Transformers](https://youtu.be/RJ-kxr5LMQA)
4. [Push and save model to HuggingFace cloud](https://youtu.be/CtVJuQ1knoY)
5. [Load pre-trained transformer and train again](https://youtu.be/ZgzHg7j73Ms)

*More tutorials coming soon!*

</details>

<details>
<summary>📓 Click here for Jupyter Notebook Examples</summary>

1. [Basic Transformer Training Example](https://github.com/yiqiao-yin/WYNAssociates/blob/main/docs/ref-deeplearning/ex_%20-%20wyn-transformers%20tutorial%20-%20part%201-5.ipynb)

*Check the notebooks folder in the repository for more examples.*

</details>


## Description

`wyn-transformers` is a Python package designed to simplify the process of training transformer models from scratch. It's ideal for tasks involving sentence-to-sentence transformations, like building models for question-answering systems.

## Folder Directory 📁

Here's the folder structure of the `wyn-transformers` package:

```
wyn-transformers
├── pyproject.toml
├── README.md
├── wyn_transformers
│   ├── __init__.py
│   ├── transformers.py
│   ├── inference.py
│   └── push_to_hub.py
└── tests
    └── __init__.py
```

- **`pyproject.toml`**: The configuration file for the Poetry package manager, which includes metadata and dependencies for your package.
- **`README.md`**: The markdown file that provides information and instructions about the `wyn-transformers` package.
- **`wyn_transformers`**: The main package directory containing the core Python files.
  - **`__init__.py`**: Initializes the `wyn_transformers` package.
  - **`transformers.py`**: Defines the Transformer model and helper functions.
  - **`inference.py`**: Contains functions for making inferences from the trained model and converting tokens back to text.
  - **`push_to_hub.py`**: Provides functionality to push the trained TensorFlow model to HuggingFace, requiring a HuggingFace token.
- **`tests`**: The directory for test scripts and files.
  - **`__init__.py`**: Initializes the tests package.

## Installation 🛠️

To install the `wyn-transformers` package, run:

```bash
pip install wyn-transformers
```

## Usage 🚀

### Importing the Package

```python
import numpy as np

import wyn_transformers
from wyn_transformers.transformers import *

# Hyperparameters
num_layers = 2
d_model = 64
dff = 128
num_heads = 4
input_vocab_size = 8500
maximum_position_encoding = 10000

# Instantiate the Transformer model
transformer = TransformerModel(num_layers, d_model, num_heads, dff, input_vocab_size, maximum_position_encoding)

# Compile the model
transformer.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Generate random sample data
sample_data = np.random.randint(0, input_vocab_size, size=(64, 38))

# Fit the model on the random sample data
transformer.fit(sample_data, sample_data, epochs=5)
```

### Using Custom Question-Answer Pairs 📊

You can use a pandas DataFrame to train the model with custom question-answer pairs. Here's an example to get you started:

```python
import tensorflow as tf
import pandas as pd
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Create a sample pandas DataFrame
data = {
    'question': [
        'What is the capital of France?',
        'How many continents are there?',
        'What is the largest mammal?',
        'Who wrote the play Hamlet?'
    ],
    'answer': [
        'The capital of France is Paris.',
        'There are seven continents.',
        'The blue whale is the largest mammal.',
        'William Shakespeare wrote Hamlet.'
    ]
}

# Or read it from a CSV file
# data = pd.read_csv("test.csv")

df = pd.DataFrame(data)
df

# Initialize the Tokenizer
tokenizer = Tokenizer(num_words=10000, oov_token="<OOV>")

# Fit the tokenizer on the questions and answers
tokenizer.fit_on_texts(df['question'].tolist() + df['answer'].tolist())

# Convert texts to sequences
question_sequences = tokenizer.texts_to_sequences(df['question'].tolist())
answer_sequences = tokenizer.texts_to_sequences(df['answer'].tolist())

# Pad sequences to ensure consistent input size for the model
max_length = 10  # Example fixed length; this can be adjusted as needed
question_padded = pad_sequences(question_sequences, maxlen=max_length, padding='post')
answer_padded = pad_sequences(answer_sequences, maxlen=max_length, padding='post')

# Combine questions and answers for training
sample_data = np.concatenate((question_padded, answer_padded), axis=0)

# Display the prepared sample data
print("Sample data (tokenized and padded):\n", sample_data)
```
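
Before training, it can help to sanity-check the tokenizer by peeking at a slice of the learned vocabulary; `word_index` is a standard attribute of Keras' `Tokenizer`:

```python
# Peek at the first ten entries of the learned vocabulary
# (low indices correspond to the most frequent words; the OOV token, if set, is index 1)
print(dict(list(tokenizer.word_index.items())[:10]))
```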

### Converting Tokens Back to Text ✨

Use the `inference` module to convert tokenized sequences back to readable text:

```python
import tensorflow as tf
from wyn_transformers.inference import *

# Testing the function to convert back to text
print("Original token:")
print(question_padded)
print("\nConverted back to text (questions):")
print(sequences_to_text(question_padded, tokenizer))

print("Original token:")
print(answer_padded)
print("\nConverted back to text (answers):")
print(sequences_to_text(answer_padded, tokenizer))
```
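
If you are curious how such a conversion works, here is a minimal sketch of the underlying logic; the package's actual `sequences_to_text` implementation may differ, and `sequences_to_text_sketch` is an illustrative name, not part of the published API:

```python
def sequences_to_text_sketch(sequences, tokenizer):
    """Convert padded token sequences back to strings (illustrative only)."""
    texts = []
    for seq in sequences:
        # Look up each non-padding token id in the tokenizer's reverse index
        words = [tokenizer.index_word.get(int(idx), "") for idx in seq if idx != 0]
        texts.append(" ".join(w for w in words if w))
    return texts

print(sequences_to_text_sketch(question_padded, tokenizer))
```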

### Training with Custom Data 🔄

With the custom tokenized data ready, train the model as before:

```python
# Hyperparameters
num_layers = 2
d_model = 64
dff = 128
num_heads = 4
input_vocab_size = 8500
maximum_position_encoding = 10000

# Instantiate the Transformer model
transformer = TransformerModel(num_layers, d_model, num_heads, dff, input_vocab_size, maximum_position_encoding)

# Compile the model
transformer.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Fit the model on the custom sample data
transformer.fit(sample_data, sample_data, epochs=5)
```

### Question-Answer

Once the model is trained, you will typically want to run inference so you can see how the model responds. The code below sends a question to the `predict_text` function, which tokenizes the question, runs a prediction with the trained model, and converts the numerical output back to text.

```python
# Test the function with the example input
input_text = "what is the capital of France?"
predicted_response = predict_text(input_text, transformer, tokenizer, max_length=15)
print("Predicted Response:", predicted_response)
```
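
For intuition, the sketch below shows the steps such a function typically performs, assuming a greedy argmax decode over per-position output probabilities; the packaged `predict_text` may decode differently, and `predict_text_sketch` is an illustrative name:

```python
def predict_text_sketch(input_text, model, tokenizer, max_length=15):
    """Tokenize a question, run the model, and decode the output (illustrative)."""
    # Tokenize and pad the input question
    seq = tokenizer.texts_to_sequences([input_text])
    padded = pad_sequences(seq, maxlen=max_length, padding='post')
    # Assume the model outputs per-position probabilities over the vocabulary
    preds = model.predict(padded)             # shape: (1, max_length, vocab_size)
    token_ids = np.argmax(preds, axis=-1)[0]  # greedy decode at each position
    words = [tokenizer.index_word.get(int(i), "") for i in token_ids if i != 0]
    return " ".join(w for w in words if w)
```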

### Push model to the cloud

When you reach a good stopping point, you can push your model to the HuggingFace Hub with the code below. The helper function `push_model_to_huggingface` serializes the model and uploads its artifacts, weights, and architecture to the cloud.

```python
from wyn_transformers.push_to_hub import *

# Example usage:
huggingface_token = "HF_TOKEN_HERE"
account_name = "HF_ACCOUNT_NAME"
model_name = "MODEL_NAME"

# Call the function to push the model; the tokenizer argument is optional
# result = push_model_to_huggingface(huggingface_token, account_name, transformer, model_name)
result = push_model_to_huggingface(huggingface_token, account_name, transformer, model_name, tokenizer)
print(result)
```
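
Under the hood, a push like this can be done with `huggingface_hub` directly. The following is a minimal sketch of what the helper might do, using the real `HfApi` calls (`create_repo`, `upload_file`); the actual helper may upload additional tokenizer files, and `push_sketch` is an illustrative name:

```python
from huggingface_hub import HfApi

def push_sketch(token, account_name, model, model_name):
    """Save a Keras model locally and upload it to the Hugging Face Hub (illustrative)."""
    api = HfApi(token=token)
    repo_id = f"{account_name}/{model_name}"
    api.create_repo(repo_id=repo_id, exist_ok=True)  # create the repo if needed
    local_path = f"{model_name}.keras"
    model.save(local_path)                           # serialize weights + architecture
    api.upload_file(path_or_fileobj=local_path, path_in_repo=local_path, repo_id=repo_id)
    return f"Pushed {local_path} to https://huggingface.co/{repo_id}"
```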

### Load Pre-trained Model

When you want to load the model back, use the following code to download your pre-trained transformer and continue fine-tuning.

```python
from huggingface_hub import hf_hub_download
import tensorflow as tf
import os
import json
import pickle

# Define the Hugging Face model repository path
model_repo_url = f"{account_name}/{model_name}"

# Step 1: Download the model file from Hugging Face
model_filename = f"{model_name}.keras"
model_file_path = hf_hub_download(repo_id=model_repo_url, filename=model_filename, token=huggingface_token)

# Step 2: Load the pre-trained model from the downloaded file
pre_trained_transformer = tf.keras.models.load_model(model_file_path, custom_objects={"TransformerModel": TransformerModel})

# Step 3: Compile the model to prepare for further training
pre_trained_transformer.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Step 4: Reload the tokenizer (if used) by downloading tokenizer files from Hugging Face
tokenizer_config_path = hf_hub_download(repo_id=model_repo_url, filename="tokenizer_config.json", token=huggingface_token)
vocab_path = hf_hub_download(repo_id=model_repo_url, filename="vocab.pkl", token=huggingface_token)

# Load the tokenizer configuration from the downloaded file
with open(tokenizer_config_path, "r") as f:
    tokenizer_config = json.load(f)

# Recreate the tokenizer using TensorFlow's Tokenizer class
from tensorflow.keras.preprocessing.text import Tokenizer
tokenizer = Tokenizer(
    num_words=tokenizer_config.get("num_words"),
    filters=tokenizer_config.get("filters"),
    lower=tokenizer_config.get("lower"),
    split=tokenizer_config.get("split"),
    char_level=tokenizer_config.get("char_level")
)
tokenizer.word_index = tokenizer_config.get("word_index")
tokenizer.index_word = tokenizer_config.get("index_word")

# Load the vocabulary from the pickle file (this overwrites the word_index set above)
with open(vocab_path, "rb") as f:
    tokenizer.word_index = pickle.load(f)

# Clean up downloaded files
os.remove(tokenizer_config_path)
os.remove(vocab_path)
```
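
With the model and tokenizer restored, you can resume training exactly as before, for example by reusing the `sample_data` prepared earlier:

```python
# Continue fine-tuning the reloaded model on the prepared data
pre_trained_transformer.fit(sample_data, sample_data, epochs=2)
```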

## Author 👨‍💻

Yiqiao Yin  
Personal site: [y-yin.io](https://www.y-yin.io/)  
Email: eagle0504@gmail.com  

Feel free to reach out for any questions or collaborations!