rbpy-rb

Name	rbpy-rb
Version	0.12.4
home_page	https://github.com/readerbench/ReaderBench
Summary	ReaderBench library written in python
upload_time	2023-08-09 08:27:14
maintainer
docs_url	None
author	Woodcarver
requires_python	>=3.6
license
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# ReaderBench Python

## Install
We recommend using a virtual environment, as some dependencies require an exact version.   
If you only want to use the package, do the following:  
1. `sudo apt-get install python3-pip python3-venv python3-dev`    
2. `python3 -m venv rbenv` (create a virtual environment named rbenv)
3. `source rbenv/bin/activate` (activate the virtual environment)
4. `pip3 uninstall setuptools && pip3 install setuptools && pip3 install --upgrade pip && pip3 install --no-cache-dir rbpy-rb`
5. Use it as shown in https://github.com/readerbench/ReaderBench/blob/master/usage.py  
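
To confirm that the package landed in the active virtual environment, a quick check along these lines may help. This is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is an illustration and is not part of ReaderBench.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "rbpy-rb" is the distribution name used in the pip command above.
    version = installed_version("rbpy-rb")
    if version is None:
        print("rbpy-rb is not installed in this environment")
    else:
        print(f"rbpy-rb {version} is installed")
```

Run it with the virtual environment activated; otherwise it inspects the system interpreter instead.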

If you want to contribute to the code base of the package:   
1. `sudo apt-get install python3-pip python3-venv python3-dev`    
2. `git clone git@git.readerbench.com:ReaderBench/readerbenchpy.git && cd readerbenchpy/`  
3. `python3 -m venv rbenv` (create a virtual environment named rbenv)
4. `source rbenv/bin/activate` (activate the virtual environment)
5. `pip3 uninstall setuptools && pip3 install setuptools && pip3 install --upgrade pip`
6. `pip3 install -r requirements.txt` 
7. `python3 nltk_download.py`  
8. Optional: pre-install the English spaCy model (otherwise most English processing fails and asks you to run this command): `python3 -m spacy download en_core_web_lg`


If you also want spellchecking (hunspell), you need these non-Python libraries:
1. `sudo apt-get install libhunspell-1.6-0 libhunspell-dev hunspell-ro`
2. `pip3 install hunspell`

## Usage
For usage examples (parsing, lemmatization, NER, WordNet, content words, indices, etc.), see the file `usage.py` in 
https://github.com/readerbench/ReaderBench    


## Tips
You may also need some spaCy models, which are downloaded through spaCy.     
You have to download these models yourself, using the command:    
`python3 -m spacy download name_of_the_model`  
The logger also writes instructions on which models you need and how to download them.  
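
Since `spacy download` installs each model as an ordinary Python package, one way to find out ahead of time which models are missing is to look for them with the standard library. The sketch below makes that assumption; `missing_models` is a hypothetical helper, not a ReaderBench or spaCy function, and the model list is only an example.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_models(model_names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found as importable top-level packages."""
    return [name for name in model_names if find_spec(name) is None]


if __name__ == "__main__":
    # Example list; replace it with the models your languages need.
    for name in missing_models(["en_core_web_lg"]):
        print(f"Missing model, run: python3 -m spacy download {name}")
```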

## Developer instructions

## How to use BERT

Our models are also available on the Hugging Face platform: https://huggingface.co/readerbench 

You can use them directly from Hugging Face:
```
# tensorflow
from transformers import AutoTokenizer, TFAutoModel
tokenizer = AutoTokenizer.from_pretrained("readerbench/RoBERT-base")
model = TFAutoModel.from_pretrained("readerbench/RoBERT-base")
# "exemplu de propoziție" is Romanian for "example sentence"
inputs = tokenizer("exemplu de propoziție", return_tensors="tf")
outputs = model(inputs)

# pytorch
from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("readerbench/RoBERT-base")
model = AutoModel.from_pretrained("readerbench/RoBERT-base")
inputs = tokenizer("exemplu de propoziție", return_tensors="pt")
outputs = model(**inputs)
```

or from ReaderBench:

```
from rb.core.lang import Lang
from rb.processings.encoders.bert import BertWrapper
from tensorflow import keras

bert_wrapper = BertWrapper(Lang.RO, max_seq_len=128)
inputs, bert_layer = bert_wrapper.create_inputs_and_model()
cls_output = bert_wrapper.get_output(bert_layer, "cls") # or "pool"

# Add a decision layer and compile the model,
# e.g.:
# hidden = keras.layers.Dense(..)(cls_output)
# output = keras.layers.Dense(..)(hidden)
# model = keras.Model(inputs=inputs, outputs=[output])
# model.compile(..)

bert_wrapper.load_weights()  # must be called after compile

# Process inputs for model
feed_inputs = bert_wrapper.process_input(["text1", "text2", "text3"])
# feed_output = ...
# model.fit(feed_inputs, feed_output, ...)
```

## How to use the logger
In each file, initialize the logger:  
```python
from rb.utils.rblogger import Logger

logger = Logger.get_logger()
logger.info("info msg")
logger.warning("warning msg")
logger.error("error msg")
```
## How to push the wheel to PyPI
1. `rm -r dist/`
2. `pip3 install twine wheel`
3. `./upload_to_pypi.sh`


## How to run rb/core/cscl/csv_parser.py
1. Complete the installation steps for contributors (see Install)
2. run `pip3 install xmltodict`
3. run `export PYTHONPATH=/add/path/to/repo/readerbenchpy/`
4. add the JSON resources to a `jsons` directory in `readerbenchpy/rb/core/cscl/`
5. run `cd rb/core/cscl/ && python3 csv_parser.py`

## Supported Date Formats
ReaderBench can perform conversation analysis on chats and communities. Each utterance must have its time expressed in one of the following formats:
- %Y-%m-%d %H:%M:%S.%f %Z
- %Y-%m-%d %H:%M:%S %Z
- %Y-%m-%d %H:%M %Z
- %Y-%m-%d %H:%M:%S.%f
- %Y-%m-%d %H:%M:%S
- %Y-%m-%d %H:%M

The format codes are the standard [Python date format codes](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes).
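
To check whether your timestamps match one of these formats before feeding them to ReaderBench, you could try each format in turn with `datetime.strptime`. The following is a minimal sketch, not ReaderBench's own parser; `SUPPORTED_FORMATS` and `parse_utterance_time` are names chosen here for illustration. Note that `strptime` accepts only a limited set of names for `%Z` (such as `UTC`, `GMT`, and the local time zone names).

```python
from datetime import datetime
from typing import Optional

# The formats listed above, in the same order (most specific first).
SUPPORTED_FORMATS = [
    "%Y-%m-%d %H:%M:%S.%f %Z",
    "%Y-%m-%d %H:%M:%S %Z",
    "%Y-%m-%d %H:%M %Z",
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d %H:%M",
]


def parse_utterance_time(text: str) -> Optional[datetime]:
    """Try each supported format in turn; return None if none matches."""
    for fmt in SUPPORTED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    for sample in ["2023-08-09 08:27:14", "2023-08-09 08:27", "09/08/2023"]:
        print(sample, "->", parse_utterance_time(sample))
```

A `None` result means the timestamp needs to be converted to one of the listed formats first.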

Raw data

{
    "_id": null,
    "home_page": "https://github.com/readerbench/ReaderBench",
    "name": "rbpy-rb",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "",
    "author": "Woodcarver",
    "author_email": "batpepastrama@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/84/cc/ee85c8ac687b685437ae6d6a28c7706a77eec8affaf434ccdb8fed23df2f/rbpy-rb-0.12.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "ReaderBench library written in python",
    "version": "0.12.4",
    "project_urls": {
        "Homepage": "https://github.com/readerbench/ReaderBench"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "84ccee85c8ac687b685437ae6d6a28c7706a77eec8affaf434ccdb8fed23df2f",
                "md5": "bef3425d6cdd3daae801d5d7769c65f9",
                "sha256": "8bf4f518d8587611e2295fdb1f6f84ccc4661bf084026fc7f3227afeced2c9f9"
            },
            "downloads": -1,
            "filename": "rbpy-rb-0.12.4.tar.gz",
            "has_sig": false,
            "md5_digest": "bef3425d6cdd3daae801d5d7769c65f9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 1493610,
            "upload_time": "2023-08-09T08:27:14",
            "upload_time_iso_8601": "2023-08-09T08:27:14.784344Z",
            "url": "https://files.pythonhosted.org/packages/84/cc/ee85c8ac687b685437ae6d6a28c7706a77eec8affaf434ccdb8fed23df2f/rbpy-rb-0.12.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-08-09 08:27:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "readerbench",
    "github_project": "ReaderBench",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "rbpy-rb"
}
