PyDigger - unearthing stuff about Python


Name Version Summary Date
raydp-nightly 2025.7.9.dev0 RayDP: Distributed Data Processing on Ray 2025-07-09 01:19:07
raydp 1.6.2 RayDP: Distributed Data Processing on Ray 2025-03-14 08:43:42
spark-acl-tools 0.6.7 spark_acl_tools 2025-03-11 16:27:51
spark-on-k8s 0.12.0 A Python package to submit and manage Apache Spark applications on Kubernetes. 2025-02-17 23:20:11
impyla 0.20.0 Python client for the Impala distributed query engine 2025-01-29 15:50:41
pysparta 0.5.6 Library to help ETL using pyspark 2025-01-06 19:34:53
scholar-spark-observability 0.8.0 A Python package for monitoring and observability in Apache Spark applications 2024-12-30 07:50:23
spark-dataframe-tools 0.7.2 spark_dataframe_tools 2024-12-30 03:28:34
spark-nlp 5.5.2 John Snow Labs Spark NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment. 2024-12-18 16:04:11
johnsnowlabs-for-databricks 5.5.2 The John Snow Labs Library gives you access to all of John Snow Labs Enterprise And Open Source products in an easy and simple manner. Access 10000+ state-of-the-art NLP and OCR models for Finance, Legal and Medical domains. Easily scalable to Spark Cluster 2024-12-05 17:16:56
onetl 0.12.5 One ETL tool to rule them all 2024-12-03 09:32:12
spark-dataproc-local-tools 0.1.4 spark_dataproc_local_tools 2024-11-30 08:28:43
spark-jdbc-ingestor 1.0.1 A library to handle JDBC ingestion from a SQL database in a simple and efficient way. 2024-11-19 22:24:32
aws-insurancelake-etl 4.1.3 A CDK Python app for deploying ETL jobs that operate data pipelines for InsuranceLake in AWS 2024-11-18 19:02:45
johnsnowlabs 5.5.1 The John Snow Labs Library gives you access to all of John Snow Labs Enterprise And Open Source products in an easy and simple manner. Access 10000+ state-of-the-art NLP and OCR models for Finance, Legal and Medical domains. Easily scalable to Spark Cluster 2024-11-17 16:16:43
emrrunner 1.0.9 A powerful CLI tool and API for managing Spark jobs on Amazon EMR clusters 2024-11-03 16:44:04
spark-connect-proxy 0.0.10 A reverse proxy server which allows secure connectivity to a Spark Connect server 2024-10-16 15:39:49
nlu 5.4.1 John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 20000+ of pretrained models in 200+ languages. It enables swift and simple development and research with its powerful Pythonic and Keras inspired API. It is powered by John Snow Labs powerful Spark NLP library. 2024-09-27 01:23:20
spark-datax-tools 0.7.0 spark_datax_tools 2024-09-19 00:39:50
spark-gaps-date-rorc-tools 0.2.3 spark_gaps_date_rorc_tools 2024-09-12 00:31:57