flexitransformers

Name: flexitransformers
Version: 0.2.1
Summary: This repository provides a flexible and modular implementation of the Transformer architecture,
Upload time: 2025-10-06 23:28:44
Requires Python: >=3.10
License: MIT
Keywords: transformer, nlp, deep learning, machine learning, attention

# FlexiTransformers

[![FlexiTransformers Logo](docs/_static/new_logo.png)](https://github.com/A-Elshahawy/flexitransformers)

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![PyPI version](https://badge.fury.io/py/flexitransformers.svg)](https://pypi.org/project/flexitransformers/0.2.1/) [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/) [![PyTorch](https://img.shields.io/badge/PyTorch-2.0.1%2B-red.svg)](https://pytorch.org/) [![Documentation](https://img.shields.io/badge/docs-latest-brightgreen.svg)](https://a-elshahawy.github.io/FlexiTransformers/) [![Code style: Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff) ![mypy](https://img.shields.io/badge/mypy-type%20checked-blue) ![pre-commit](https://img.shields.io/badge/pre--commit-enabled-success)

A modular transformer framework for educational purposes, enabling flexible experimentation with encoder-decoder, encoder-only (BERT-style), and decoder-only (GPT-style) architectures.

> **Note:** This library is primarily designed for educational purposes and research experimentation. For production use cases, consider mature frameworks like Hugging Face Transformers.

## Features

| Feature                        | Support                                                       |
| ------------------------------ | ------------------------------------------------------------- |
| **Model Types**          | Encoder-Decoder, Encoder-Only, Decoder-Only                   |
| **Attention Mechanisms** | Absolute, ALiBi, Relative (Transformer-XL), Rotary (RoFormer) |
| **Positional Encodings** | Absolute (sinusoidal), ALiBi, Rotary                          |
| **Normalization**        | Pre-norm, Post-norm                                           |
| **Training Utilities**   | Built-in Trainer, Callbacks, Learning rate scheduling         |
| **Custom Architectures** | Full configuration control                                    |
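
Each row in the table corresponds to constructor arguments. As a quick illustration (a sketch combining keyword arguments that all appear in the Quick Start and Configuration sections below), a single call can select the model type, positional encoding, normalization scheme, and activation:

```python
from flexit.models import FlexiTransformer

# Decoder-only model with rotary positions and pre-norm blocks;
# each feature axis from the table is selected via a keyword argument.
model = FlexiTransformer(
    model_type='decoder-only',   # Model Types
    tgt_vocab=10000,
    d_model=512,
    n_heads=8,
    n_layers=6,
    pe_type='rotary',            # Attention Mechanisms / Positional Encodings
    pre_norm=True,               # Normalization
    ff_activation='gelu',        # Feed-forward activation
    dropout=0.1,
)
```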

## Installation

**Requirements:**

* Python 3.10+
* PyTorch 2.0.1+

### Via pip

```bash
pip install flexitransformers
```

### From source

```bash
git clone https://github.com/A-Elshahawy/flexitransformers.git
cd flexitransformers
pip install -e .
```

**Import the library as `flexit` in your code.**
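
As a quick post-install sanity check (a minimal sketch; the version lookup goes through the standard library rather than assuming the package exposes a `__version__` attribute):

```python
import importlib.metadata

import flexit  # installed as `flexitransformers`, imported as `flexit`

print(importlib.metadata.version('flexitransformers'))  # e.g. 0.2.1
```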

## Quick Start

### 1. Encoder-Decoder (Seq2Seq Translation)

```python
import torch
from flexit.models import FlexiTransformer
from flexit.utils import subsequent_mask

# Define model configuration
model = FlexiTransformer(
    model_type='encoder-decoder',
    src_vocab=10000,
    tgt_vocab=10000,
    d_model=512,
    n_heads=8,
    n_layers=6,
    dropout=0.1,
    pe_type='absolute'  # or 'alibi', 'rotary'
)

# Create sample data
batch_size, seq_len = 32, 64
src = torch.randint(0, 10000, (batch_size, seq_len))
tgt = torch.randint(0, 10000, (batch_size, seq_len))

# Create masks (assuming 0 is padding)
src_mask = (src != 0).unsqueeze(-2)
tgt_mask = (tgt != 0).unsqueeze(-2) & subsequent_mask(tgt.size(-1))

# Forward pass
output = model(src, tgt, src_mask, tgt_mask)
print(f"Output shape: {output.shape}")  # [32, 64, 512]
```
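
For reference, `subsequent_mask` builds the standard causal mask that blocks attention to future positions. A minimal equivalent in plain PyTorch is sketched below (the conventional construction, not necessarily the library's exact implementation):

```python
import torch

def causal_mask(size: int) -> torch.Tensor:
    """Boolean mask of shape [1, size, size]; True where attention is allowed."""
    # Lower-triangular matrix: position i may attend to positions <= i.
    return torch.tril(torch.ones(size, size)).bool().unsqueeze(0)

print(causal_mask(4))
# tensor([[[ True, False, False, False],
#          [ True,  True, False, False],
#          [ True,  True,  True, False],
#          [ True,  True,  True,  True]]])
```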

### 2. Encoder-Only (BERT-style Classification)

```python
import torch
from flexit.models import FlexiBERT

# BERT-style model for binary classification
model = FlexiBERT(
    src_vocab=30000,
    num_classes=2,
    d_model=768,
    n_heads=12,
    n_layers=12,
    pe_type='alibi',  # ALiBi works well for BERT-style models
    dropout=0.1
)

# Input data
input_ids = torch.randint(0, 30000, (32, 128))
attention_mask = (input_ids != 0).unsqueeze(-2)

# Get classification logits
logits = model(input_ids, attention_mask)
print(f"Logits shape: {logits.shape}")  # [32, 2]
```
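
The classification logits feed directly into a standard cross-entropy loss. Below is a minimal sketch with random labels; the library's own losses in `flexit.loss` (e.g. `BertLoss`) may add behaviour beyond this:

```python
import torch
import torch.nn.functional as F

labels = torch.randint(0, 2, (32,))      # one class index per sequence
loss = F.cross_entropy(logits, labels)   # logits: [32, 2] from the model above
loss.backward()                          # gradients flow back through FlexiBERT
print(f"Loss: {loss.item():.4f}")
```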

### 3. Decoder-Only (GPT-style Language Model)

```python
import torch
from flexit.models import FlexiGPT
from flexit.utils import subsequent_mask

# GPT-style autoregressive model
model = FlexiGPT(
    tgt_vocab=50000,
    d_model=768,
    n_heads=12,
    n_layers=12,
    pe_type='rotary',  # Rotary embeddings work well for GPT-style
    dropout=0.1
)

# Input sequence
input_ids = torch.randint(0, 50000, (32, 128))
tgt_mask = subsequent_mask(input_ids.size(-1))

# Forward pass
output = model(input_ids, tgt_mask)
print(f"Output shape: {output.shape}")  # [32, 128, 768]
```
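
To turn the hidden states into generated tokens, a simple greedy loop works as a starting point. The sketch below assumes `model.generator` maps decoder states to vocabulary log-probabilities, as its use with `LossCompute` in the Training section suggests; adapt the stopping criterion and decoding strategy to your task.

```python
import torch
from flexit.utils import subsequent_mask

@torch.no_grad()
def greedy_generate(model, prompt_ids: torch.Tensor, max_new_tokens: int = 20) -> torch.Tensor:
    """Append argmax tokens one at a time. prompt_ids: [1, prompt_len]."""
    ids = prompt_ids
    for _ in range(max_new_tokens):
        mask = subsequent_mask(ids.size(-1))
        hidden = model(ids, mask)                      # [1, seq_len, d_model]
        logits = model.generator(hidden[:, -1])        # assumed: [1, vocab] log-probs
        next_id = logits.argmax(dim=-1, keepdim=True)  # greedy choice
        ids = torch.cat([ids, next_id], dim=-1)
    return ids

prompt = torch.randint(0, 50000, (1, 8))
print(greedy_generate(model, prompt).shape)  # torch.Size([1, 28])
```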

## Training

### Basic Training Loop

```python
import torch
import torch.optim as optim
from torch.utils.data import DataLoader
from flexit.train import Trainer, Batch
from flexit.loss import LossCompute
from flexit.callbacks import CheckpointCallback, EarlyStoppingCallback

# Prepare your data
train_loader = DataLoader(your_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(your_val_dataset, batch_size=64)

# Setup training components
criterion = torch.nn.CrossEntropyLoss(ignore_index=0)
loss_compute = LossCompute(
    generator=model.generator,
    criterion=criterion,
    model=model,
    grad_clip=1.0
)

optimizer = optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.98))
scheduler = optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: min((step + 1) ** -0.5, (step + 1) * 4000 ** -1.5)
)

# Initialize trainer with callbacks
trainer = Trainer(
    model=model,
    optimizer=optimizer,
    scheduler=scheduler,
    loss_fn=loss_compute,
    train_dataloader=train_loader,
    val_dataloader=val_loader,
    callbacks=[
        CheckpointCallback(save_best=True, keep_last=3),
        EarlyStoppingCallback(patience=5, min_delta=0.001)
    ]
)

# Train the model
metrics = trainer.fit(epochs=20)
print(metrics.to_dict())
```

### Custom Batch Handling

```python
from flexit.train import Batch

# For decoder-only models (GPT-style)
batch = Batch(
    tgt=sequence_tensor,  # [batch_size, seq_len]
    model_type='decoder-only',
    pad=0
)

# For encoder-only models (BERT-style)
batch = Batch(
    src=input_tensor,
    labels=label_tensor,
    model_type='encoder-only',
    pad=0
)

# For encoder-decoder models
batch = Batch(
    src=source_tensor,
    tgt=target_tensor,
    model_type='encoder-decoder',
    pad=0
)
```
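
If you prefer to build `Batch` objects yourself, one option is a `DataLoader` `collate_fn`. This is a hypothetical sketch that assumes your dataset yields equal-length `(src, tgt)` tensor pairs and that your training loop consumes `Batch` instances directly; the built-in `Trainer` above was given raw dataloaders, so check which convention your setup expects.

```python
import torch
from torch.utils.data import DataLoader
from flexit.train import Batch

def collate_seq2seq(samples):
    # samples: list of (src, tgt) pairs of equal length
    src = torch.stack([s for s, _ in samples])
    tgt = torch.stack([t for _, t in samples])
    return Batch(src=src, tgt=tgt, model_type='encoder-decoder', pad=0)

loader = DataLoader(your_dataset, batch_size=64, shuffle=True, collate_fn=collate_seq2seq)
```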

## Advanced Configuration

### Comparing Attention Mechanisms

```python
# Experiment with different attention types
configs = {
    'absolute': {'pe_type': 'absolute'},
    'alibi': {'pe_type': 'alibi'},
    'rotary': {'pe_type': 'rotary', 'rope_percentage': 0.5},
    'relative': {'pe_type': 'relative', 'max_len': 1024}
}

for name, config in configs.items():
    model = FlexiTransformer(
        model_type='decoder-only',
        tgt_vocab=10000,
        d_model=512,
        n_heads=8,
        n_layers=6,
        **config
    )
    # Train and evaluate each variant
```
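
Before committing to full training runs, a quick parameter count and dry forward pass catch configuration errors early. The helper below uses only plain PyTorch plus the decoder-only call signature from the Quick Start; `model` is whichever variant was just built in the loop above.

```python
import torch
from flexit.utils import subsequent_mask

def sanity_check(model, vocab_size: int = 10000, seq_len: int = 16) -> None:
    """Print the parameter count and the output shape of a dry-run forward pass."""
    n_params = sum(p.numel() for p in model.parameters())
    ids = torch.randint(0, vocab_size, (2, seq_len))
    out = model(ids, subsequent_mask(seq_len))  # decoder-only signature from the Quick Start
    print(f"params={n_params:,}  output_shape={tuple(out.shape)}")

sanity_check(model)
```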

### Asymmetric Encoder-Decoder

```python
# Different layer counts for encoder/decoder
model = FlexiTransformer(
    model_type='encoder-decoder',
    src_vocab=10000,
    tgt_vocab=10000,
    d_model=512,
    n_heads=8,
    n_layers=(12, 6),  # 12 encoder layers, 6 decoder layers
    dropout=0.1
)
```

### Custom Initialization

```python
model = FlexiTransformer(
    src_vocab=10000,
    tgt_vocab=10000,
    init_method='kaiming_uniform',  # or 'xavier_uniform', 'orthogonal'
    ff_activation='gelu',  # or 'relu', 'silu'
    pre_norm=True  # Pre-layer normalization (like GPT)
)
```

## Architecture Variants

### Available Model Classes

* **`FlexiTransformer`** : Base class, fully customizable
* **`FlexiBERT`** : Encoder-only, optimized for classification
* **`FlexiGPT`** : Decoder-only, optimized for generation
* **`TransformerModel`** : Standard encoder-decoder

### Configuration Options

```python
from flexit.configs import ModelConfig

config = ModelConfig(
    model_type='encoder-decoder',  # or 'encoder-only', 'decoder-only'
    src_vocab=10000,
    tgt_vocab=10000,
    d_model=512,          # Model dimension
    d_ff=2048,            # Feed-forward dimension
    n_heads=8,            # Attention heads
    n_layers=6,           # Number of layers (or tuple for asymmetric)
    dropout=0.1,
    pe_type='absolute',   # 'absolute', 'alibi', 'relative', 'rotary'
    pre_norm=True,        # Pre-norm vs post-norm
    ff_activation='relu', # 'relu', 'gelu', 'silu', etc.
    init_method='xavier_uniform'
)
```
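
How a `ModelConfig` is handed to a model depends on the constructor you use. Since the Quick Start constructors accept plain keyword arguments, one hedged option is to unpack the config's fields (a sketch that assumes `ModelConfig` is a dataclass whose field names mirror the constructor parameters):

```python
from dataclasses import asdict

from flexit.models import FlexiTransformer

# Reuse `config` from the block above. If ModelConfig is a dataclass,
# asdict() yields the constructor kwargs; vars(config) is an alternative
# for a plain attribute-bag object.
model = FlexiTransformer(**asdict(config))
```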

## API Reference

**Full documentation:** [https://a-elshahawy.github.io/FlexiTransformers/](https://a-elshahawy.github.io/FlexiTransformers/)

### Key Modules

* **`flexit.models`** : Model classes (`FlexiTransformer`, `FlexiBERT`, `FlexiGPT`)
* **`flexit.attention`** : Attention mechanisms (Absolute, ALiBi, Relative, Rotary)
* **`flexit.train`** : Training utilities (`Trainer`, `Batch`, `LossCompute`)
* **`flexit.callbacks`** : Training callbacks (`CheckpointCallback`, `EarlyStoppingCallback`)
* **`flexit.configs`** : Configuration classes (`ModelConfig`)
* **`flexit.loss`** : Loss functions (`LabelSmoothing`, `BertLoss`)

## Contributing

Contributions are welcome. Please:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/improvement`)
3. Make your changes with tests
4. Run tests and type checking (`mypy`, `ruff`)
5. Submit a pull request

For major changes, open an issue first to discuss the proposed changes.

## License

MIT License - see [LICENSE](LICENSE) file for details.

## Citation

If you use FlexiTransformers in your research, please cite:

```bibtex
@software{flexitransformers2024,
  author = {Elshahawy, Ahmed},
  title = {FlexiTransformers: A Modular Transformer Framework},
  year = {2024},
  url = {https://github.com/A-Elshahawy/flexitransformers}
}
```

## References

This library implements concepts from:

* Vaswani et al. (2017) - "Attention is All You Need"
* Press et al. (2021) - "Train Short, Test Long: Attention with Linear Biases" (ALiBi)
* Su et al. (2021) - "RoFormer: Enhanced Transformer with Rotary Position Embedding"
* Dai et al. (2019) - "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

## Contact

**Ahmed Elshahawy**

* GitHub: [@A-Elshahawy](https://github.com/A-Elshahawy)
* LinkedIn: [Ahmed Elshahawy](https://www.linkedin.com/in/ahmed-elshahawy-a42149218/)
* Email: ahmedelshahawy078@gmail.com

---

**Links:**

* [Documentation](https://a-elshahawy.github.io/FlexiTransformers/)
* [PyPI Package](https://pypi.org/project/flexitransformers/)
* [GitHub Repository](https://github.com/A-Elshahawy/flexitransformers)

            
