PyDigger - unearthing stuff about Python

Found 32 out of 317,081. Showing 20 on page 1. Total pages: 2.

Name	Version	Summary	Date
wizardhtml	1.0.1	WHATWG-compliant HTML5 toolkit: DFA tokenizer, spec-guided tree builder, DOM, configurable serializer, high-level cleaner, pretty-printer, and HTML to Markdown.	2025-08-29 12:37:55
tokenizers	0.22.0	None	2025-08-29 10:25:33
tokenpal	0.1.0	Your friendly token counting pal for OpenAI, Anthropic, and Google LLM models. Simple, accurate, fast.	2025-08-27 03:44:00
mon-tokenizer	0.1.3	A simple tokenizer for Mon text	2025-08-23 15:45:38
amharicNLP	0.8.0	amharicNLP is a Python package for Amharic Natural Language Processing (NLP) and text preprocessing.	2025-08-23 10:34:58
burmese-tokenizer	0.1.3	A simple tokenizer for Burmese text	2025-08-23 04:26:14
appserver-sdk-python-ai	0.0.21	Python SDK for AppServer AI services	2025-08-17 21:44:18
dir2text	3.0.1	A Python library and command-line tool for expressing directory structures and file contents in formats suitable for Large Language Models (LLMs). It combines directory tree visualization with file contents in a memory-efficient, streaming format.	2025-08-07 23:01:08
llmvision	0.1.1	Visualize how LLMs tokenize text	2025-07-26 07:34:54
semantic-text-splitter	0.25.0	Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from Rust and Python.	2025-03-22 06:57:25
rs-bpe	0.1.0	A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust	2025-03-19 05:58:24
ts-tokenizer	0.1.19	TS Tokenizer is a hybrid (lexicon-based and rule-based) tokenizer designed specifically for tokenizing Turkish texts.	2025-01-30 19:59:44
UniTok	4.3.6	Unified Tokenizer	2025-01-30 14:28:25
pinyintokenizer	0.0.3	Pinyin Tokenizer, a Chinese pinyin tokenizer	2025-01-28 09:07:39
pgn-tokenizer	0.1.3	A byte pair encoding tokenizer for chess portable game notation (PGN)	2025-01-25 15:02:55
count-tokens	0.7.2	Count the number of tokens in a text file using the tiktoken tokenizer from OpenAI.	2025-01-09 05:15:28
tokenlens	0.1.6	A library for accurate token counting and limit validation across various LLM providers	2025-01-05 03:40:24
token-vision	0.1.0	A fast, offline token calculator for images with various AI models (Claude, GPT-4V, Gemini)	2025-01-02 19:35:06
optilearn	1.3.6	A package to train neural networks, optimize models, transfer or copy files from one directory to another, treat short words in NLP, choose optimal data for ML models, scrape images, split time-series data into train and test sets, deal with emojis and emoticons in NLP, tokenize words, get the list of punctuation marks and English pronouns, and read text files.	2024-11-28 07:34:30
midi-neural-processor	1.0.3	Tokenize MIDI files for neural network processing	2024-11-25 08:39:21
