PyDigger - unearthing stuff about Python


Name Version Summary Date
blockingpy 0.1.8 Blocking records for record linkage and data deduplication based on ANN algorithms. 2025-01-17 16:54:07
splinkclickhouse 0.3.4 Clickhouse backend support for Splink 2024-12-16 10:07:24
mail-deduplicate 7.6.1 📧 CLI to deduplicate mails from mail boxes 2024-11-30 08:27:27
mim-nlp 0.2.0 A Python package with ready-to-use models for various NLP tasks and text preprocessing utilities. The implementation allows fine-tuning. 2024-07-25 11:29:32
rensa 0.1.6 High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datasets 2024-06-25 17:10:17
process-twarc 0.20.2 Tools for transforming raw data from Twarc2 to structured data for Masked Language Modeling. 2024-06-12 11:40:55
inoutlists 1.0.1 inoutlists is a python package to parse and normalize different sources of lists (OFAC, EU, UN, etc) to a common dictionary interface. 2024-06-03 22:18:59