| Field | Value |
| --- | --- |
| Name | BiLLM |
| Version | 0.1.6 |
| Summary | Tool for converting LLMs from uni-directional to bi-directional for tasks like classification and sentence embeddings. |
| home_page | None |
| upload_time | 2024-06-05 07:49:09 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.8 |
| license | MIT |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# BiLLM
Tool for converting LLMs from uni-directional to bi-directional for tasks like classification and sentence embeddings. Compatible with 🤗 transformers.
<a href="https://arxiv.org/abs/2310.01208">
<img src="https://img.shields.io/badge/Arxiv-2310.01208-yellow.svg?style=flat-square" alt="https://arxiv.org/abs/2310.01208" />
</a>
<a href="https://arxiv.org/abs/2311.05296">
<img src="https://img.shields.io/badge/Arxiv-2311.05296-yellow.svg?style=flat-square" alt="https://arxiv.org/abs/2311.05296" />
</a>
<a href="https://pypi.org/project/billm/">
<img src="https://img.shields.io/pypi/v/billm?style=flat-square" alt="PyPI version" />
</a>
<a href="https://pypi.org/project/billm/">
<img src="https://img.shields.io/pypi/dm/billm?style=flat-square" alt="PyPI Downloads" />
</a>
<a href="http://makeapullrequest.com">
<img src="https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square" alt="http://makeapullrequest.com" />
</a>
<a href="https://pdm-project.org">
<img src="https://img.shields.io/badge/pdm-managed-blueviolet" alt="https://pdm-project.org" />
</a>
## Supported Models
- LLaMA
- Mistral
- Qwen2
- OpenELM
## Usage
1) `python -m pip install -U billm`
2) Specify the start index of the bi-directional layers via `export BiLLM_START_INDEX={layer_index}`. If not specified, the default is 0, i.e., all layers are bi-directional. If set to -1, BiLLM is disabled. A combined sketch of steps 2 and 3 is shown after the import example below.
3) Import LLMs from BiLLM and initialize them as usual with transformers.
```diff
- from transformers import (
- LLamaModel,
- LLamaForCausalLM,
- LLamaForSequenceClassification,
- MistralModel,
- MistralForCausalLM,
- MistralForSequenceClassification,
- Qwen2Model,
- Qwen2ForCausalLM,
- Qwen2ForSequenceClassification
- )
+ from billm import (
+ LLamaModel,
+ LLamaForCausalLM,
+ LLamaForSequenceClassification,
+ LLamaForTokenClassification,
+ MistralModel,
+ MistralForCausalLM,
+ MistralForSequenceClassification,
+ MistralForTokenClassification,
+ Qwen2Model,
+ Qwen2ForCausalLM,
+ Qwen2ForSequenceClassification,
+ Qwen2ForTokenClassification,
+ OpenELMModel,
+ OpenELMForCausalLM,
+ OpenELMForSequenceClassification,
+ OpenELMForTokenClassification
+ )
```
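Putting steps 2 and 3 together, here is a minimal sketch. The model name and label count are placeholders, and it assumes `billm` reads `BiLLM_START_INDEX` at import time, so the variable is set before the import; exporting it in the shell as shown above works the same way.

```python
import os

# Step 2: start bi-directional attention at layer 0, i.e., make all layers bi-directional.
# Assumption: billm reads this variable when it is imported, so set it first.
os.environ['BiLLM_START_INDEX'] = '0'

from billm import MistralForSequenceClassification  # import after setting the env var

# Step 3: initialize exactly as you would with transformers.
model = MistralForSequenceClassification.from_pretrained(
    'mistralai/Mistral-7B-v0.1',  # placeholder base model
    num_labels=2,                 # placeholder number of classes
)
```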
## Examples
### NER
**training:**
```bash
$ cd examples
$ WANDB_MODE=disabled BiLLM_START_INDEX=0 CUDA_VISIBLE_DEVICES=3 python billm_ner.py \
--model_name_or_path mistralai/Mistral-7B-v0.1 \
--dataset_name_or_path conll2003 \
--push_to_hub 0
```
**inference:**
```python
from transformers import AutoTokenizer, pipeline
from peft import PeftModel, PeftConfig
from billm import MistralForTokenClassification
label2id = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
id2label = {v: k for k, v in label2id.items()}
model_id = 'WhereIsAI/billm-mistral-7b-conll03-ner'
tokenizer = AutoTokenizer.from_pretrained(model_id)
peft_config = PeftConfig.from_pretrained(model_id)
# load the base backbone with a token-classification head
model = MistralForTokenClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    num_labels=len(label2id), id2label=id2label, label2id=label2id
)
# attach the fine-tuned PEFT adapter
model = PeftModel.from_pretrained(model, model_id)
# merge and unload is necessary for inference
model = model.merge_and_unload()
token_classifier = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
sentence = "I live in Hong Kong. I am a student at Hong Kong PolyU."
tokens = token_classifier(sentence)
print(tokens)
```
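Each aggregated span returned by the pipeline follows the standard 🤗 transformers token-classification output format (keys such as `entity_group`, `word`, and `score`), so the result can be summarized along these lines:

```python
# Print one line per detected entity span.
for span in tokens:
    print(f"{span['word']!r} -> {span['entity_group']} (score={span['score']:.2f})")
```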
### Sentence Embeddings
Refer to AnglE: https://github.com/SeanLee97/AnglE
## Citation
If you use this toolkit in your work, please cite one of the following papers:
1) For sentence embeddings modeling:
```bibtex
@inproceedings{li2024bellm,
title = "BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings",
author = "Li, Xianming and Li, Jing",
booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics",
year = "2024",
publisher = "Association for Computational Linguistics"
}
```
2) For other tasks:
```bibtex
@article{li2023label,
title={Label supervised llama finetuning},
author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
journal={arXiv preprint arXiv:2310.01208},
year={2023}
}
```
Raw data
{
"_id": null,
"home_page": null,
"name": "BiLLM",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Sean Lee <xmlee97@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/e6/36/76a8fe630ebe0a2f1def45ff0031330c4d755adc4c23291c70840d2dc3e1/billm-0.1.6.tar.gz",
"platform": null,
"description": "# BiLLM\nTool for converting LLMs from uni-directional to bi-directional for tasks like classification and sentence embeddings. Compatible with \ud83e\udd17 transformers.\n\n<a href=\"https://arxiv.org/abs/2310.01208\">\n <img src=\"https://img.shields.io/badge/Arxiv-2310.01208-yellow.svg?style=flat-square\" alt=\"https://arxiv.org/abs/2310.01208\" />\n</a>\n<a href=\"https://arxiv.org/abs/2311.05296\">\n <img src=\"https://img.shields.io/badge/Arxiv-2311.05296-yellow.svg?style=flat-square\" alt=\"https://arxiv.org/abs/2311.05296\" />\n</a>\n<a href=\"https://pypi.org/project/billm/\">\n <img src=\"https://img.shields.io/pypi/v/billm?style=flat-square\" alt=\"PyPI version\" />\n</a>\n<a href=\"https://pypi.org/project/billm/\">\n <img src=\"https://img.shields.io/pypi/dm/billm?style=flat-square\" alt=\"PyPI Downloads\" />\n</a>\n<a href=\"http://makeapullrequest.com\">\n <img src=\"https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square\" alt=\"http://makeapullrequest.com\" />\n</a>\n<a href=\"https://pdm-project.org\">\n <img src=\"https://img.shields.io/badge/pdm-managed-blueviolet\" alt=\"https://pdm-project.org\" />\n</a>\n\n\n## Supported Models\n\n- LLaMA\n- Mistral\n- Qwen2\n- OpenELM\n\n## Usage\n\n1) `python -m pip install -U billm`\n\n2) Specify start index for bi-directional layers via `export BiLLM_START_INDEX={layer_index}`. if not specified, default is 0, i.e., all layers are bi-directional. If set to -1, BiLLM is disabled.\n\n3) Import LLMs from BiLLM and initialize them as usual with transformers.\n\n```diff\n- from transformers import (\n- LLamaModel,\n- LLamaForCausalLM,\n- LLamaForSequenceClassification,\n- MistralModel,\n- MistralForCausalLM,\n- MistralForSequenceClassification\n- Qwen2Model,\n- Qwen2ForCausalLM,\n- Qwen2ForSequenceClassification\n- )\n\n+ from billm import (\n+ LLamaModel,\n+ LLamaForCausalLM,\n+ LLamaForSequenceClassification,\n+ LLamaForTokenClassification,\n+ MistralModel,\n+ MistralForCausalLM,\n+ MistralForSequenceClassification,\n+ MistralForTokenClassification,\n+ Qwen2Model,\n+ Qwen2ForCausalLM,\n+ Qwen2ForSequenceClassification,\n+ Qwen2ForTokenClassification\n+ OpenELMModel,\n+ OpenELMForCausalLM,\n+ OpenELMForSequenceClassification,\n+ OpenELMForTokenClassification\n+ )\n```\n\n## Examples\n\n### NER\n\n**training:**\n\n```bash\n$ cd examples\n$ WANDB_MODE=disabled BiLLM_START_INDEX=0 CUDA_VISIBLE_DEVICES=3 python billm_ner.py \\\n--model_name_or_path mistralai/Mistral-7B-v0.1 \\\n--dataset_name_or_path conll2003 \\\n--push_to_hub 0\n```\n\n**inference:**\n\n```python\nfrom transformers import AutoTokenizer, pipeline\nfrom peft import PeftModel, PeftConfig\nfrom billm import MistralForTokenClassification\n\n\nlabel2id = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}\nid2label = {v: k for k, v in label2id.items()}\nmodel_id = 'WhereIsAI/billm-mistral-7b-conll03-ner'\ntokenizer = AutoTokenizer.from_pretrained(model_id)\npeft_config = PeftConfig.from_pretrained(model_id)\nmodel = MistralForTokenClassification.from_pretrained(\n peft_config.base_model_name_or_path,\n num_labels=len(label2id), id2label=id2label, label2id=label2id\n)\nmodel = PeftModel.from_pretrained(model, model_id)\n# merge and unload is necessary for inference\nmodel = model.merge_and_unload()\n\ntoken_classifier = pipeline(\"token-classification\", model=model, tokenizer=tokenizer, aggregation_strategy=\"simple\")\nsentence = \"I live in Hong Kong. 
I am a student at Hong Kong PolyU.\"\ntokens = token_classifier(sentence)\nprint(tokens)\n```\n\n### Sentence Embeddings\n\nrefer to AnglE: https://github.com/SeanLee97/AnglE\n\n\n## Citation\n\nIf you use this toolkit in your work, please cite the following paper:\n\n1) For sentence embeddings modeling:\n\n```bibtex\n@inproceedings{li2024bellm,\n title = \"BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings\",\n author = \"Li, Xianming and Li, Jing\",\n booktitle = \"Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics\",\n year = \"2024\",\n publisher = \"Association for Computational Linguistics\"\n}\n```\n\n2) For other tasks:\n\n```bibtex\n@article{li2023label,\n title={Label supervised llama finetuning},\n author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},\n journal={arXiv preprint arXiv:2310.01208},\n year={2023}\n}\n```\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Tool for converting LLMs from uni-directional to bi-directional for tasks like classification and sentence embeddings.",
"version": "0.1.6",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "2434025291bf2a5c3b5479b5d630cd9eaf30f2bb7ce66ba67c62e7fa58251308",
"md5": "f62641d97d3ff8fe44ab086a45c621d5",
"sha256": "4b4fd197913e36b681e8470dbbec4d56494cdc0090691a9d2a13ba6a047cee63"
},
"downloads": -1,
"filename": "billm-0.1.6-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f62641d97d3ff8fe44ab086a45c621d5",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 35241,
"upload_time": "2024-06-05T07:49:07",
"upload_time_iso_8601": "2024-06-05T07:49:07.354046Z",
"url": "https://files.pythonhosted.org/packages/24/34/025291bf2a5c3b5479b5d630cd9eaf30f2bb7ce66ba67c62e7fa58251308/billm-0.1.6-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e63676a8fe630ebe0a2f1def45ff0031330c4d755adc4c23291c70840d2dc3e1",
"md5": "fcd4c4cf352b77668d291b13f84325ad",
"sha256": "e79f543c322de9750cbd4dcfdbbf89e79ab6f13ea4b0bdbda8dcfa753acb6266"
},
"downloads": -1,
"filename": "billm-0.1.6.tar.gz",
"has_sig": false,
"md5_digest": "fcd4c4cf352b77668d291b13f84325ad",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 28567,
"upload_time": "2024-06-05T07:49:09",
"upload_time_iso_8601": "2024-06-05T07:49:09.340796Z",
"url": "https://files.pythonhosted.org/packages/e6/36/76a8fe630ebe0a2f1def45ff0031330c4d755adc4c23291c70840d2dc3e1/billm-0.1.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-05 07:49:09",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "billm"
}