PyDigger - unearthing stuff about Python

Found 19 out of 302,945. Showing 19 on page -1. Total pages: 1.

Name	Version	Summary	Date
llmvision	0.1.1	Visualize how LLMs tokenize text	2025-07-26 07:34:54
miditok	3.0.6.post1	MIDI / symbolic music tokenizers for Deep Learning models.	2025-07-22 12:41:12
rs-bpe	0.1.0	A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust	2025-03-19 05:58:24
nlpashto	0.0.25	Pashto Natural Language Processing Toolkit	2025-02-01 11:17:44
code-tokenize	0.2.1	Fast program tokenization and structural analysis in Python	2025-01-14 09:17:25
rftokenizer	2.3.0	A character-wise tokenizer for morphologically rich languages	2024-12-17 19:05:30
alphacodings	0.2.0	base26 ([A-Z]) and base52 ([A-Za-z]) encodings	2024-12-09 03:04:43
QuickBPE	2.1	A fast BPE implementation in C	2024-12-05 11:37:29
zhon	2.1.1	Zhon provides constants used in Chinese text processing.	2024-11-20 00:29:10
llama-tokens	0.0.3	A Quick Library with Llama 3.1/3.2 Tokenization - source https://github.com/jeffxtang/llama-tokens	2024-11-10 17:03:39
eKoNLPy	2.0.6	A Korean natural language processing toolkit for economic analysis	2024-11-02 00:02:54
huspacy	0.12.0	HuSpaCy: industrial strength Hungarian natural language processing	2024-10-28 10:30:55
maze-dataset	1.1.0	generating and working with datasets of mazes	2024-09-10 19:33:49
taibun	1.1.7	Taiwanese Hokkien Transliterator and Tokeniser	2024-08-31 20:25:01
bpeasy	0.1.3	Fast bare-bones BPE for modern tokenizer training	2024-08-23 10:47:52
simplemma	1.1.1	A lightweight toolkit for multilingual lemmatization and language detection.	2024-08-08 12:20:45
textmate-grammar-python	0.6.1	A lexer and tokenizer for grammar files as defined by TextMate and used in VSCode, implemented in Python.	2024-07-31 18:43:24
process-twarc	0.20.2	Tools for transforming raw data from Twarc2 to structured data for Masked Language Modeling.	2024-06-12 11:40:55
example990420	1.1.1	Taiwanese Hokkien Transliterator and Tokeniser	2024-05-01 20:28:38
