PyDigger - unearthing stuff about Python

Found 19 out of 333,111. Showing 19 on page 1. Total pages: 1.

Name	Version	Summary	Date
plankode	0.0.0a0	Unified Token Rendering Library for DeepSeek Models.	2025-10-07 10:53:05
rwa-sdk-qdf	0.1.0	QuantDeFi.ai RWA SDK - Professional toolkit for tokenized asset data analysis | $297B+ tracked assets	2025-09-16 03:38:58
precious-nlp	0.1.2	A tokenizer-free NLP library with T-FREE, CANINE, and byte-level approaches	2025-09-02 11:28:22
PyTokenCounter	1.8.2	A Python library for tokenizing text and counting tokens using various encoding schemes.	2025-08-22 03:20:04
romanRekhta	0.1.0	An NLP library for Roman Urdu text preprocessing, tokenization, and stopword handling.	2025-07-25 11:42:17
tokenization-scorer	1.1.8	Package for evaluating text tokenizations.	2025-01-13 10:36:40
text2text	1.9.4	Text2Text Language Modeling Toolkit	2025-01-12 22:34:27
LughaatNLP	1.3.1	A Python package for natural language processing tasks for the Urdu language, including normalization, part-of-speech (POS) tagging, named entity recognition (NER), stemming, lemmatization, tokenization, stopword removal, text-to-speech, speech-to-text, and summarization.	2024-12-30 22:07:32
python-ucto	0.6.9	This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is a regular-expression based, extensible, and advanced tokeniser written in C++ (https://languagemachines.github.io/ucto).	2024-12-17 11:56:39
tokenize-text	0.2.32	Tokenizing and processing text inputs with transformer models	2024-08-11 21:53:24
tokenize-transformer	0.2.14	Tokenizing and processing text inputs with transformer models	2024-08-01 18:39:29
BasicTextMetrics	0.2.1	Analyze textual data and extract useful metrics such as word count, character count, average word length, and most common words	2024-02-28 13:32:17
huspacy-nightly	0.11.0.dev261	HuSpaCy: industrial strength Hungarian natural language processing	2024-01-03 19:48:33
miditok-for-musiclang	0.0.1	A convenient MIDI tokenizer for Deep Learning networks, with multiple encoding strategies	2023-11-23 17:31:40
hebpipe	3.0.0.6	A pipeline for Hebrew NLP	2023-08-22 00:50:27
ipa-core	0.1.3	NLP Preprocessing Pipeline Wrappers	2023-05-12 15:14:56
pyonmttok	1.36.0	Fast and customizable text tokenization library with BPE and SentencePiece support	2023-01-11 13:46:07
mosestokenizer	1.2.1	Wrappers for several pre-processing scripts from the Moses toolkit.	2021-10-22 14:15:07
sentence-splitter	1.4	Text to sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder	2019-01-14 17:11:25
