PyDigger - unearthing stuff about Python

Found 25 out of 312,775. Showing 20 on page 1. Total pages: 2.

Name	Version	Summary	Date
nupunkt-rs	0.1.1	High-performance Rust implementation of nupunkt sentence/paragraph tokenization	2025-08-16 02:05:56
llmbuilder	0.4.6	A comprehensive toolkit for building, training, and deploying language models	2025-08-14 20:16:12
tokker	0.3.9	Tokker: a fast local-first CLI tokenizer with all the best models in one place	2025-08-09 17:59:33
chunkipy	1.0.0.post1	Chunkipy is an easy-to-use library for chunking text based on the size estimator function you provide.	2025-08-08 12:37:03
maze-dataset	1.4.0	generating and working with datasets of mazes	2025-08-06 23:08:57
ultranlp	1.0.6	Ultra-fast, comprehensive NLP preprocessing library with advanced tokenization	2025-08-02 10:21:43
sakurs	0.1.1	Fast, parallel sentence boundary detection using Delta-Stack Monoid algorithm	2025-07-27 15:33:39
llmvision	0.1.1	Visualize how LLMs tokenize text	2025-07-26 07:34:54
miditok	3.0.6.post1	MIDI / symbolic music tokenizers for Deep Learning models.	2025-07-22 12:41:12
rs-bpe	0.1.0	A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust	2025-03-19 05:58:24
nlpashto	0.0.25	Pashto Natural Language Processing Toolkit	2025-02-01 11:17:44
code-tokenize	0.2.1	Fast program tokenization and structural analysis in Python	2025-01-14 09:17:25
rftokenizer	2.3.0	A character-wise tokenizer for morphologically rich languages	2024-12-17 19:05:30
alphacodings	0.2.0	base26 ([A-Z]) and base52 ([A-Za-z]) encodings	2024-12-09 03:04:43
QuickBPE	2.1	A fast BPE implementation in C	2024-12-05 11:37:29
zhon	2.1.1	Zhon provides constants used in Chinese text processing.	2024-11-20 00:29:10
llama-tokens	0.0.3	A Quick Library with Llama 3.1/3.2 Tokenization - source https://github.com/jeffxtang/llama-tokens	2024-11-10 17:03:39
eKoNLPy	2.0.6	A Korean natural language processing toolkit for economic analysis	2024-11-02 00:02:54
huspacy	0.12.0	HuSpaCy: industrial strength Hungarian natural language processing	2024-10-28 10:30:55
taibun	1.1.7	Taiwanese Hokkien Transliterator and Tokeniser	2024-08-31 20:25:01
