tnsa


Name: tnsa
Version: 7.3.2
Home page: https://github.com/TnsaAi/tnsa.stable.curiosity
Summary: A transformer model with advanced features for causal language modeling.
Upload time: 2024-11-15 15:51:35
Maintainer: None
Docs URL: None
Author: TNSA AI
Requires Python: None
License: NGen2Community License
Keywords: None
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.
# TNSA Curiosity

**TNSA Stable Curiosity** is a transformer-based model architecture designed for causal language modeling tasks. It is an enhancement of the ARCH-X9 and NGen2 models, optimized for NLP tasks such as text classification, token classification, and language generation. The architecture includes advanced mechanisms such as gradient checkpointing, making it more efficient and scalable.

## Installation

To install `tnsa`, you can use `pip` from PyPI:

``` bash
pip install tnsa
```
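
As a quick sanity check (nothing beyond the import path used throughout this README), you can confirm the install worked by importing the model class:

``` python
# Minimal check that the package installed correctly.
from tnsa.stable.curiosity import TNSAforCasualLM

print(TNSAforCasualLM)
```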

## How to use the Curiosity OpenModel Architecture (based on ARCH-X 9)

``` python
from tnsa.stable.curiosity import TNSAforCasualLM

# Initialize the model
model = TNSAforCasualLM(
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
    intermediate_act_fn='gelu',  # Can also use other activations like 'relu'
    hidden_dropout_prob=0.1,
    attention_probs_dropout_prob=0.1,
    initializer_range=0.02,
)

# Example input
input_tensor = ...  # Your input tensor here, with shape [batch_size, seq_length, hidden_size]
attention_mask = ...  # Your attention mask tensor here

# Forward pass through the model
output = model(input_tensor=input_tensor, attention_mask=attention_mask)

print(output)

# Initialize your training loop; keeping the parameters at their defaults
# re-creates NGen2-Nano Base on OpenWEB (see the sketch after this block).

```
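
Below is a minimal sketch of the training loop mentioned in the comment above. It assumes a PyTorch backend and that the model behaves like a standard `nn.Module` whose output is a tensor of the same shape as its input; the README does not state this, so the dummy tensors, the `Adam` optimizer, and the mean-squared-error objective are purely illustrative assumptions, not part of the `tnsa` API.

``` python
import torch

# Assumption: inputs are dense tensors of shape
# [batch_size, seq_length, hidden_size] plus a [batch_size, seq_length] mask,
# matching the shapes described in the example above.
batch_size, seq_length, hidden_size = 2, 16, 768
input_tensor = torch.randn(batch_size, seq_length, hidden_size)
attention_mask = torch.ones(batch_size, seq_length)

# Hypothetical optimizer and objective -- replace with whatever your task needs.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = torch.nn.MSELoss()

for step in range(10):  # toy loop; a real run would iterate over a dataset
    optimizer.zero_grad()
    output = model(input_tensor=input_tensor, attention_mask=attention_mask)
    # Assumption: `output` is a tensor; regress it toward a dummy target
    # purely to exercise the backward pass.
    loss = loss_fn(output, torch.zeros_like(output))
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss = {loss.item():.4f}")
```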

## Key Parameters
`hidden_size`: The size of the hidden layers. Defaults to 768 (same as BERT's base).

`num_hidden_layers`: The number of transformer layers. Defaults to 12.

`num_attention_heads`: The number of attention heads in each layer. Defaults to 12.

`intermediate_size`: The size of the intermediate (feedforward) layer. Defaults to 3072.

`intermediate_act_fn`: The activation function to use in the intermediate layer. Default is `gelu`.

`hidden_dropout_prob`: Dropout probability for hidden layers. Default is 0.1.

`attention_probs_dropout_prob`: Dropout probability for attention layers. Default is 0.1.

`initializer_range`: The standard deviation of the initializer. Default is 0.02.

`use_gradient_checkpointing`: A boolean flag to enable or disable gradient checkpointing for memory efficiency. Default is False.
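
To show how these parameters compose, the call below builds a deliberately small configuration with gradient checkpointing enabled. The parameter names come from the list above; the specific values (a 6-layer, 512-wide model) are arbitrary choices for illustration, not recommended settings.

``` python
from tnsa.stable.curiosity import TNSAforCasualLM

# A small, illustrative configuration; values are not tuned recommendations.
small_model = TNSAforCasualLM(
    hidden_size=512,
    num_hidden_layers=6,
    num_attention_heads=8,            # 512 / 8 = 64 dims per head
    intermediate_size=2048,
    intermediate_act_fn='relu',       # alternative to the default 'gelu'
    hidden_dropout_prob=0.1,
    attention_probs_dropout_prob=0.1,
    initializer_range=0.02,
    use_gradient_checkpointing=True,  # trade extra compute for lower memory use
)
```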

## How Curiosity `OpenModelArchitecture` Differs from ARCH-X 9 `(Closed Source)`
The `Curiosity` architecture is based on the standard transformer architecture used in `NGen2`, with the following enhancements:

`Gradient Checkpointing`: An optional feature to enable gradient checkpointing, allowing for more efficient memory usage during training. This is particularly useful when working with large models.

`Improved Attention Mechanism`: The attention mechanism has been fine-tuned for better handling of `long-range` dependencies and more accurate attention distributions.

`Optimized Architecture`: Custom improvements to layer normalization and dropout mechanisms help improve the model’s performance on various `NLP` tasks.

## Model Performance

While `Curiosity` is similar to `NGen2`, it has been fine-tuned to outperform `NGen2` on some language modeling tasks through a more efficient memory-usage pattern, which makes it better suited to large datasets and longer sequences.

## License
The code is licensed under the NGen2Community License. Please review the LICENSE file for more details. While the base of the code remains closed source, you (i.e., the user or developer) may use it to develop custom models, but may not copy or modify the code itself.

## Copyright
  **Copyright (c) 2024, TNSAAI Inc. All rights reserved.**

            
