PyDigger - unearthing stuff about Python


Name Version Summary Date
rs-bytepiece 0.2.2 bytepiece-rs Python binding 2023-11-12 08:52:37
count-tokens 0.7.0 Count number of tokens in the text file using tiktoken tokenizer from OpenAI. 2023-09-26 11:16:08
UnicodeTokenizer 0.2.1 UnicodeTokenizer: tokenize all Unicode text 2023-09-20 21:46:37
semiformal 0.7.0 Tokenizer for semiformal unicode text using TR-29 segmentation 2023-08-20 08:00:32
tokenizer 3.4.3 A tokenizer for Icelandic text 2023-08-11 15:09:13
tokenstream 1.6.0 A versatile token stream for handwritten parsers 2023-08-02 18:52:57
Texo 0.0.4 Sentiment Analysis Multiple language and for all products 2023-07-09 15:34:19
nepalitokenizers 0.0.1 Pre-trained Tokenizers for the Nepali language with an interface to HuggingFace's tokenizers library for customizability. 2023-06-23 22:51:43
Texo-v1 0.0.2 Sentiment Analysis Multiple language and for all products 2023-06-18 16:12:15
pinyintokenizer 0.0.2 Pinyin Tokenizer, Chinese pinyin tokenizer 2023-06-08 08:04:40
wyzard 1.0 Run various transformers models from one package. 2023-05-18 06:10:17
botok 0.8.12 Tibetan Word Tokenizer 2023-05-17 11:36:37
gpt3-tokenizer 0.1.4 Encoder/Decoder and tokens counter for GPT3 2023-05-16 00:50:21
ZiTokenizer 0.0.8 ZiTokenizer: tokenize world text as Zi 2023-04-20 17:50:08
zltk 0.0.0 A collection of commonly used functions. 2023-04-06 14:53:47
basictokenizer 0.0.4 A basic and useful tokenizer. 2023-02-06 23:07:12
code-tokenizers 0.0.5 Aligning BPE and AST 2023-02-06 03:46:53
space-wrap 0.0.3 Automated spaCy wrapper to turn plain text into spaCy doc objects 2023-01-27 21:46:55
ZiCutter 0.0.8 ZiCutter: cut character smaller 2023-01-10 18:10:33
segments 2.2.1 2022-07-08 13:42:58