privacy-similarity

Name	privacy-similarity
Version	0.1.4
home_page	https://github.com/alexandernicholson/python-similarity
Summary	Privacy-preserving similarity search for massive DataFrames with PII
upload_time	2025-10-21 23:51:10
maintainer	None
docs_url	None
author	Alexander Nicholson
requires_python	>=3.8
license	None
keywords	privacy similarity-search differential-privacy pii embeddings faiss deduplication
requirements	No requirements were recorded.

# Privacy-Preserving Similarity Search

A Python package for privacy-preserving similarity search on massive DataFrames containing PII (Personally Identifiable Information). Perfect for finding duplicate customers, similar purchase histories, and entity resolution while maintaining strong privacy guarantees.

## Features

- **Multiple Privacy Modes**: Differential Privacy, Homomorphic Encryption, Secure Hashing
- **Scalable Search**: Built on FAISS for billion-scale vector similarity search
- **Advanced Embeddings**: Deep learning-based embeddings using Sentence Transformers
- **Smart Blocking**: LSH and clustering-based candidate generation
- **Flexible API**: Easy-to-use interface for DataFrame operations
- **Production-Ready**: Based on research from Amazon, Meta, Google, and Microsoft

## Architecture

The package implements a multi-layered architecture:

1. **Privacy Protection Layer**: Transforms sensitive data using DP, HE, or secure hashing
2. **Embedding Generation**: Converts text and structured data into dense vector representations
3. **Blocking/Filtering**: Reduces search space using LSH or clustering techniques
4. **Similarity Search**: FAISS-based nearest neighbor search, either exact (Flat) or approximate (HNSW, IVF)
5. **Post-Processing**: Refinement and deduplication
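
To make the blocking step (layer 3) more concrete, here is a minimal sketch of random-projection LSH using only the standard library. It is an illustration of the general technique, not the package's implementation: `make_hyperplanes`, `lsh_signature`, and `build_blocks` are hypothetical names chosen for this example. Each vector gets one bit per random hyperplane, and records that share a full bit signature land in the same candidate block.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vector, hyperplanes):
    """Return one bit per hyperplane: 1 if the vector is on its non-negative side."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def build_blocks(vectors, hyperplanes):
    """Group record indices by signature; each bucket is a candidate block."""
    blocks = defaultdict(list)
    for idx, vec in enumerate(vectors):
        blocks[lsh_signature(vec, hyperplanes)].append(idx)
    return dict(blocks)


vectors = [
    [1.0, 0.9, 0.0],    # record 0
    [2.0, 1.8, 0.0],    # record 1: same direction as record 0
    [-1.0, -0.9, 0.0],  # record 2: opposite direction
]
planes = make_hyperplanes(dim=3, n_bits=8)
blocks = build_blocks(vectors, planes)
print(blocks)
```

Records 0 and 1 point in the same direction, so they always share a bucket, while record 2 falls in a different one. More bits give smaller, purer blocks at the cost of missing some near neighbors, which is why practical systems usually combine several hash tables.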

### Component Diagram

```mermaid
graph TB
    subgraph Input["Input Layer"]
        DF[DataFrame with PII]
    end

    subgraph Privacy["Privacy Protection Layer"]
        DP[Differential Privacy<br/>DP-MinHash, DP-OPH]
        HE[Homomorphic Encryption<br/>Secure Inner Products]
        SH[Secure Hashing<br/>Bloom Filters, k-Anonymity]
    end

    subgraph Embeddings["Embedding Generation Layer"]
        TE[Text Embeddings<br/>Sentence Transformers]
        PII[PII Tokenizer<br/>Name/Email/Address]
        NF[Numeric Features<br/>Scaling, Encoding]
    end

    subgraph Blocking["Blocking/Filtering Layer"]
        LSH[LSH Blocking<br/>Random Projection, MinHash]
        CLUSTER[Clustering<br/>K-Means, Canopy]
        DYNAMIC[Dynamic Bucketing<br/>Adaptive Radius]
    end

    subgraph Search["Similarity Search Layer"]
        FLAT[FAISS Flat<br/>Exact Search]
        HNSW[FAISS HNSW<br/>Graph-based ANN]
        IVF[FAISS IVF<br/>Quantized Search]
    end

    subgraph Output["Output Layer"]
        RESULTS[Search Results<br/>Top-k Neighbors]
        DUPES[Duplicate Groups<br/>Connected Components]
    end

    DF --> Privacy
    DP --> Embeddings
    HE --> Embeddings
    SH --> Embeddings

    Embeddings --> TE
    Embeddings --> PII
    Embeddings --> NF

    TE --> Blocking
    PII --> Blocking
    NF --> Blocking

    LSH --> Search
    CLUSTER --> Search
    DYNAMIC --> Search

    FLAT --> Output
    HNSW --> Output
    IVF --> Output

    RESULTS --> USER[User Application]
    DUPES --> USER

    style Privacy fill:#e1f5ff
    style Embeddings fill:#fff3cd
    style Blocking fill:#d4edda
    style Search fill:#f8d7da
    style Output fill:#d1ecf1
```

### Data Flow

```mermaid
sequenceDiagram
    participant U as User
    participant API as Core API
    participant P as Privacy Layer
    participant E as Embeddings
    participant B as Blocking
    participant S as FAISS Search

    U->>API: fit(df, sensitive_columns)
    API->>P: apply_privacy(sensitive_data)
    P-->>API: protected_data
    API->>E: generate_embeddings(data)
    E-->>API: vectors
    API->>B: create_blocks(vectors)
    B-->>API: candidate_sets
    API->>S: build_index(vectors)
    S-->>API: index
    API-->>U: fitted_model

    U->>API: search(query_df, k=10)
    API->>P: apply_privacy(query_data)
    P-->>API: protected_query
    API->>E: generate_embeddings(query)
    E-->>API: query_vectors
    API->>B: filter_candidates(query_vectors)
    B-->>API: candidate_ids
    API->>S: search(query_vectors, k)
    S-->>API: neighbors, distances
    API-->>U: results_df
```

## Installation

### From PyPI (Recommended)

```bash
pip install privacy-similarity
```

### From Source

```bash
git clone https://github.com/alexandernicholson/python-similarity.git
cd python-similarity
pip install -r requirements.txt
pip install -e .
```

### GPU Support

For GPU acceleration (5-10x faster on large datasets):
```bash
pip install privacy-similarity
pip install faiss-gpu
```

## Quick Start

```python
from privacy_similarity import PrivacyPreservingSimilaritySearch
import pandas as pd

# Create sample DataFrame with customer data
df = pd.DataFrame({
    'customer_id': [1, 2, 3, 4, 5],
    'name': ['John Smith', 'Jon Smith', 'Jane Doe', 'John A. Smith', 'Alice Johnson'],
    'email': ['john@example.com', 'jon@example.com', 'jane@example.com', 'jsmith@example.com', 'alice@example.com'],
    'address': ['123 Main St', '123 Main Street', '456 Oak Ave', '123 Main St.', '789 Pine Rd'],
    'interests': ['sports, technology', 'tech, sports', 'reading, cooking', 'technology, sports', 'cooking, travel']
})

# Initialize with privacy and search parameters
searcher = PrivacyPreservingSimilaritySearch(
    privacy_mode='differential_privacy',  # or 'homomorphic', 'secure_hashing'
    epsilon=1.0,  # Differential privacy parameter
    embedding_model='sentence-transformers/all-MiniLM-L6-v2',
    index_type='HNSW',  # or 'IVF-HNSW', 'IVF-PQ' for larger datasets
    use_gpu=False
)

# Fit the model on your data
searcher.fit(
    df,
    sensitive_columns=['name', 'email', 'address'],
    embedding_columns=['interests'],
    id_column='customer_id'
)

# Find duplicates
duplicates = searcher.find_duplicates(threshold=0.85)
print(f"Found {len(duplicates)} duplicate groups")

# Search for similar records
query_df = pd.DataFrame({
    'name': ['Jonathan Smith'],
    'email': ['j.smith@example.com'],
    'address': ['123 Main Street'],
    'interests': ['sports and tech']
})

results = searcher.search(query_df, k=3, similarity_threshold=0.7)
print(results)
```

## Privacy Modes

### Differential Privacy (Recommended)
- **Overhead**: 1.5-2x
- **Use Case**: Statistical privacy guarantees for analytics
- **Parameters**: `epsilon` (privacy budget, lower = more private)

```python
searcher = PrivacyPreservingSimilaritySearch(
    privacy_mode='differential_privacy',
    epsilon=1.0  # Typical range: 0.1 (strong privacy, more noise) to 10.0 (weak privacy, less noise)
)
```
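
The effect of `epsilon` can be illustrated with the textbook Laplace mechanism, where the noise scale is `sensitivity / epsilon`. This is a generic sketch and an assumption for illustration only: the package lists DP-MinHash and DP-OPH as its mechanisms, and the helper names `laplace_noise` and `privatize` below are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two Exponential(1/scale) draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(value, sensitivity, epsilon, rng):
    """Laplace mechanism: add noise with scale = sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(42)
true_count = 100.0
errors = {}
for eps in (0.1, 1.0, 10.0):
    samples = [privatize(true_count, 1.0, eps, rng) for _ in range(2000)]
    errors[eps] = sum(abs(s - true_count) for s in samples) / len(samples)
    print(f"epsilon={eps}: mean absolute error ~ {errors[eps]:.2f}")
```

The mean absolute error shrinks roughly in proportion to `1 / epsilon`, which is the trade-off behind "lower = more private": a smaller budget means noisier, less accurate results.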

### Homomorphic Encryption
- **Overhead**: 10-100x
- **Use Case**: Cryptographic guarantees for sensitive data
- **Parameters**: `encryption_key_size`

```python
searcher = PrivacyPreservingSimilaritySearch(
    privacy_mode='homomorphic',
    encryption_key_size=2048
)
```

### Secure Hashing
- **Overhead**: ~1x (negligible)
- **Use Case**: Internal use, public data
- **Parameters**: `salt` (secret random string; reuse the same value wherever hashes must be comparable)

```python
searcher = PrivacyPreservingSimilaritySearch(
    privacy_mode='secure_hashing',
    salt='your-random-salt-string'
)
```
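
As a rough sketch of what salted hashing of a PII field can look like, the example below uses HMAC-SHA256 from the standard library. It is an assumption for illustration, not the package's internal scheme (the architecture diagram mentions Bloom filters and k-anonymity); `normalize` and `keyed_hash` are hypothetical helper names.

```python
import hashlib
import hmac


def normalize(value):
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return " ".join(value.lower().split())


def keyed_hash(value, salt):
    """Return a hex HMAC-SHA256 digest of the normalized value, keyed by the salt."""
    return hmac.new(
        salt.encode("utf-8"),
        normalize(value).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


salt = "your-random-salt-string"
a = keyed_hash("John@Example.com ", salt)
b = keyed_hash("john@example.com", salt)
c = keyed_hash("john@example.com", "a-different-salt")

print(a == b)  # same salt, same normalized value
print(a == c)  # different salt
```

The same salt maps equal values to equal digests, while a different salt produces unrelated digests, so hashes from separately salted datasets cannot be joined. A plain keyed hash only matches exact values after normalization; fuzzy matching needs an additional encoding. A salt can be generated with `secrets.token_hex(32)`.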

## Index Types

- **HNSW**: Best for <10M records, excellent accuracy and speed
- **IVF-HNSW**: Best for 10M-1B records, balanced performance
- **IVF-PQ**: Best for 1B+ records with memory constraints

## Performance Characteristics

| Index Type | Dataset Size | Queries per Second (QPS) | Recall | Memory |
|-----------|--------------|-----|--------|---------|
| HNSW | <10M | 10^4-10^5 | >95% | High |
| IVF-HNSW | 10M-1B | 10^3-10^4 | >90% | Medium |
| IVF-PQ | 1B+ | 10^2-10^3 | >85% | Low |
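
The size thresholds in the table can be turned into a small helper for picking an `index_type` value. `suggest_index_type` is a hypothetical convenience function, not part of the package; it simply encodes the dataset-size column above.

```python
def suggest_index_type(n_records):
    """Map a record count to the index type recommended in the table."""
    if n_records < 0:
        raise ValueError("n_records must be non-negative")
    if n_records < 10_000_000:
        return "HNSW"        # <10M records
    if n_records < 1_000_000_000:
        return "IVF-HNSW"    # 10M-1B records
    return "IVF-PQ"          # 1B+ records


for n in (5_000_000, 50_000_000, 2_000_000_000):
    print(n, suggest_index_type(n))
```

The returned strings match the `index_type` values shown in the Quick Start, so the result could be passed straight to the constructor. Treat the cut-offs as starting points, since the right choice also depends on vector dimensionality and available memory.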

## Use Cases

### Customer Deduplication
```python
# Find duplicate customer records
duplicates = searcher.find_duplicates(
    threshold=0.9,
    max_cluster_size=100
)

# Get detailed match information
for group in duplicates:
    print(f"Duplicate group: {group['ids']}")
    print(f"Confidence: {group['similarity']}")
```
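
The architecture diagram describes duplicate groups as connected components. A minimal sketch of that idea, assuming pairwise similarity scores are already available, is a union-find pass over all pairs at or above the threshold. `group_duplicates` and its input format are hypothetical and only illustrate the technique.

```python
def group_duplicates(pairs, threshold):
    """Group ids into connected components over pairs with similarity >= threshold.

    pairs is an iterable of (id_a, id_b, similarity) tuples.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b, sim in pairs:
        if sim >= threshold:
            union(a, b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    # Keep only groups with at least two members
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


pairs = [(1, 2, 0.93), (2, 4, 0.91), (3, 5, 0.40), (1, 4, 0.88)]
groups = group_duplicates(pairs, threshold=0.9)
print(groups)  # [[1, 2, 4]]
```

Records 1 and 4 end up in the same group through record 2 even though their direct similarity (0.88) is below the threshold. This transitive chaining is a known property of connected-component grouping, and capping the group size is one common way to keep long chains in check.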

### Similar Customer Discovery
```python
# Find customers with similar interests or purchase history
similar = searcher.search(
    query_df,
    k=10,
    similarity_threshold=0.75,
    return_distances=True
)
```

### Privacy-Preserving Analytics
```python
# Perform analytics on sensitive data without exposing PII
searcher = PrivacyPreservingSimilaritySearch(
    privacy_mode='differential_privacy',
    epsilon=0.1  # High privacy
)
```

## Advanced Features

### Custom Embeddings
```python
# Use your own embedding model
from sentence_transformers import SentenceTransformer

custom_model = SentenceTransformer('your-model-name')
searcher = PrivacyPreservingSimilaritySearch(
    embedding_model=custom_model
)
```

### Batch Processing
```python
# Process large datasets in batches
searcher.fit_batch(
    df,
    batch_size=10000,
    n_jobs=-1  # Use all CPU cores
)
```
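
How `fit_batch` splits the data internally is not documented here. As a generic illustration of batch processing, the sketch below yields fixed-size slices of a sequence; `iter_batches` is a hypothetical helper, not part of the package.

```python
def iter_batches(records, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


rows = list(range(25))
batch_sizes = [len(batch) for batch in iter_batches(rows, batch_size=10)]
print(batch_sizes)  # [10, 10, 5]
```

For a pandas DataFrame the same loop works with `df.iloc[start:start + batch_size]` as the slice, which keeps peak memory bounded by the batch size instead of the full dataset.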

### Incremental Updates
```python
# Add new records to existing index
searcher.add_records(new_df)
```

## Research Background

This package is built on state-of-the-art research from:

- **Meta/Facebook**: FAISS library for billion-scale vector search
- **Amazon**: Semantic product search and entity resolution
- **Airbnb**: Real-time personalization using embeddings
- **Academic Research**: Differential privacy, homomorphic encryption, LSH

Key papers implemented:
- FAISS: A library for efficient similarity search (Meta AI)
- Differential Privacy for MinHash and LSH
- Privacy-Preserving Text Embeddings with Homomorphic Encryption
- Neural LSH for Entity Blocking

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

MIT License

## Citation

If you use this package in your research, please cite:

```bibtex
@software{privacy_similarity,
  title={Privacy-Preserving Similarity Search},
  author={Alexander Nicholson},
  year={2025},
  url={https://github.com/alexandernicholson/python-similarity}
}
```

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/alexandernicholson/python-similarity",
    "name": "privacy-similarity",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "privacy, similarity-search, differential-privacy, pii, embeddings, faiss, deduplication",
    "author": "Alexander Nicholson",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/7d/85/6d79194cb33c773f42d885893d30e3152c6f5d4725059a529b68b56aee9d/privacy_similarity-0.1.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Privacy-preserving similarity search for massive DataFrames with PII",
    "version": "0.1.4",
    "project_urls": {
        "Changelog": "https://github.com/alexandernicholson/python-similarity/blob/main/CHANGELOG.md",
        "Documentation": "https://github.com/alexandernicholson/python-similarity/tree/main/docs",
        "Homepage": "https://github.com/alexandernicholson/python-similarity",
        "Issues": "https://github.com/alexandernicholson/python-similarity/issues",
        "Repository": "https://github.com/alexandernicholson/python-similarity"
    },
    "split_keywords": [
        "privacy",
        " similarity-search",
        " differential-privacy",
        " pii",
        " embeddings",
        " faiss",
        " deduplication"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "8ba632d0fd1ca04cb327d50ffbec42498e1a1ee231bdcf35a7c29d1de402d27e",
                "md5": "1e099d9f7d4738de89f13503837e80b2",
                "sha256": "04dfca899d0b4aa07f0229f51889b93aac08d948a797acf29e5e44ae657b691b"
            },
            "downloads": -1,
            "filename": "privacy_similarity-0.1.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "1e099d9f7d4738de89f13503837e80b2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 42204,
            "upload_time": "2025-10-21T23:51:09",
            "upload_time_iso_8601": "2025-10-21T23:51:09.312695Z",
            "url": "https://files.pythonhosted.org/packages/8b/a6/32d0fd1ca04cb327d50ffbec42498e1a1ee231bdcf35a7c29d1de402d27e/privacy_similarity-0.1.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "7d856d79194cb33c773f42d885893d30e3152c6f5d4725059a529b68b56aee9d",
                "md5": "33166845d7c815df7ffd671593852036",
                "sha256": "1646ed84e0119833226129d7f929ddc2235f1edaed977ba634fad659ead09458"
            },
            "downloads": -1,
            "filename": "privacy_similarity-0.1.4.tar.gz",
            "has_sig": false,
            "md5_digest": "33166845d7c815df7ffd671593852036",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 95745,
            "upload_time": "2025-10-21T23:51:10",
            "upload_time_iso_8601": "2025-10-21T23:51:10.285629Z",
            "url": "https://files.pythonhosted.org/packages/7d/85/6d79194cb33c773f42d885893d30e3152c6f5d4725059a529b68b56aee9d/privacy_similarity-0.1.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-21 23:51:10",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "alexandernicholson",
    "github_project": "python-similarity",
    "github_not_found": true,
    "lcname": "privacy-similarity"
}
