Name | geov JSON |
Version |
0.0.2
JSON |
| download |
home_page | https://github.com/geov-ai/geov |
Summary | The GeoV model is a large langauge model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER). We have shared a pretrained 9B parameter model. |
upload_time | 2023-03-30 06:29:22 |
maintainer | |
docs_url | None |
author | Better Planet Investments and labml.ai |
requires_python | |
license | |
keywords |
machine
learning
|
VCS |
 |
bugtrack_url |
|
requirements |
No requirements were recorded.
|
Travis-CI |
No Travis.
|
coveralls test coverage |
No coveralls.
|
# GeoV
## Overview
The GeoV model was designed by Georges Harik and uses
[Rotary Positional Embeddings with Relative distances (RoPER)](http://research.labml.ai/RoPER.html)
by [Georges Hark](https://twitter.com/ghark) and [Varuna Jayasiri](https://twitter.com/vpj).
[RoPER](http://research.labml.ai/RoPER.html), in addition to using relative positions in the attention score
calculation by RoPE embeddings, adds relative positional information explicitly to value embeddings.
Specifically, it incorporates the relative positions of the tokens paid attention to.
RoPER has given better performance in some algorithmic tasks, and seems comparable to RoPE in language modeling.
The GeoV tokenizer uses [SentencePiece](https://github.com/google/sentencepiece)
[unigram language model](https://arxiv.org/abs/1804.10959) and tokenizes symbols,
digits and new line characters separately, in order to achieve better performance on mathematical content and code.
This model was contributed by [gharik](https://huggingface.co/gharik) and [vpj](https://huggingface.co/vpj).
We have shared 9B parameter pre-trained model at [GeoV/GeoV-9b](https://huggingface.co/GeoV/GeoV-9b),
We plan to release checkpoints around every 20b tokens trained from here until around 300b tokens.
We will also train smaller and larger versions.
Our aim is to have broadly available smaller and larger models.
This implementation is built on top of [transformers](https://github.com/huggingface/transformers) library.
## Installations
```shell
pip install geov
```
## Generation
```python
from geov import GeoVForCausalLM, GeoVTokenizer
model = GeoVForCausalLM.from_pretrained("GeoV/GeoV-9b")
tokenizer = GeoVTokenizer.from_pretrained("GeoV/GeoV-9b")
prompt = "In mathematics, topology is the study of"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
gen_tokens = model.generate(
input_ids,
do_sample=True,
temperature=0.9,
max_length=100,
)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
```
Raw data
{
"_id": null,
"home_page": "https://github.com/geov-ai/geov",
"name": "geov",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "machine learning",
"author": "Better Planet Investments and labml.ai",
"author_email": "contact@labml.ai",
"download_url": "https://files.pythonhosted.org/packages/a7/24/fc830ae9da607c5857b071056ee78972e72575e0a04b9c1ec9c40a35bd05/geov-0.0.2.tar.gz",
"platform": null,
"description": "# GeoV\n\n## Overview\n\nThe GeoV model was designed by Georges Harik and uses \n[Rotary Positional Embeddings with Relative distances (RoPER)](http://research.labml.ai/RoPER.html) \nby [Georges Hark](https://twitter.com/ghark) and [Varuna Jayasiri](https://twitter.com/vpj).\n\n[RoPER](http://research.labml.ai/RoPER.html), in addition to using relative positions in the attention score\ncalculation by RoPE embeddings, adds relative positional information explicitly to value embeddings. \nSpecifically, it incorporates the relative positions of the tokens paid attention to.\nRoPER has given better performance in some algorithmic tasks, and seems comparable to RoPE in language modeling.\n\nThe GeoV tokenizer uses [SentencePiece](https://github.com/google/sentencepiece)\n[unigram language model](https://arxiv.org/abs/1804.10959) and tokenizes symbols,\ndigits and new line characters separately, in order to achieve better performance on mathematical content and code.\n\nThis model was contributed by [gharik](https://huggingface.co/gharik) and [vpj](https://huggingface.co/vpj).\n\nWe have shared 9B parameter pre-trained model at [GeoV/GeoV-9b](https://huggingface.co/GeoV/GeoV-9b),\nWe plan to release checkpoints around every 20b tokens trained from here until around 300b tokens.\nWe will also train smaller and larger versions.\nOur aim is to have broadly available smaller and larger models.\n\nThis implementation is built on top of [transformers](https://github.com/huggingface/transformers) library.\n\n## Installations\n\n```shell\npip install geov\n```\n\n## Generation\n\n```python\nfrom geov import GeoVForCausalLM, GeoVTokenizer\n\nmodel = GeoVForCausalLM.from_pretrained(\"GeoV/GeoV-9b\")\ntokenizer = GeoVTokenizer.from_pretrained(\"GeoV/GeoV-9b\")\n\nprompt = \"In mathematics, topology is the study of\"\n\ninput_ids = tokenizer(prompt, return_tensors=\"pt\").input_ids\n\ngen_tokens = model.generate(\n input_ids,\n do_sample=True,\n temperature=0.9,\n max_length=100,\n)\ngen_text = tokenizer.batch_decode(gen_tokens)[0]\n```\n\n\n",
"bugtrack_url": null,
"license": "",
"summary": "The GeoV model is a large langauge model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER). We have shared a pretrained 9B parameter model.",
"version": "0.0.2",
"split_keywords": [
"machine",
"learning"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d66e2a0e17ee8e8c03f38652c6f02c00a080a31a8133d4477e17b9ecad436ff3",
"md5": "3fdbe9c9534e74023963a7956d0de77b",
"sha256": "2cbde524f8dbe208ea904bcf6910400d7476514e6f407cef911ddd107eb065db"
},
"downloads": -1,
"filename": "geov-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3fdbe9c9534e74023963a7956d0de77b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 14383,
"upload_time": "2023-03-30T06:29:19",
"upload_time_iso_8601": "2023-03-30T06:29:19.990535Z",
"url": "https://files.pythonhosted.org/packages/d6/6e/2a0e17ee8e8c03f38652c6f02c00a080a31a8133d4477e17b9ecad436ff3/geov-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "a724fc830ae9da607c5857b071056ee78972e72575e0a04b9c1ec9c40a35bd05",
"md5": "78351886a76142e3b02bbf854070cefd",
"sha256": "5a737bd20531ed1ad72c9742839af754371d0108fd997b02b0200c8ae6393e3f"
},
"downloads": -1,
"filename": "geov-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "78351886a76142e3b02bbf854070cefd",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 12818,
"upload_time": "2023-03-30T06:29:22",
"upload_time_iso_8601": "2023-03-30T06:29:22.228917Z",
"url": "https://files.pythonhosted.org/packages/a7/24/fc830ae9da607c5857b071056ee78972e72575e0a04b9c1ec9c40a35bd05/geov-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-30 06:29:22",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "geov-ai",
"github_project": "geov",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "geov"
}