# PII Scanner
A library for detecting Personally Identifiable Information (PII) in text.
## Installation
```bash
pip install pii_scanner
```
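To confirm that the install succeeded, one option is to query the installed distribution's version from Python. This is a minimal sketch using the standard-library `importlib.metadata` module (available from Python 3.8); `installed_version` is a hypothetical helper name, not part of `pii_scanner`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("pii-scanner"))
```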
## Usage
```python
from pii_scanner.main import PIIScanner

# Create an instance of the scanner
pii_scanner = PIIScanner()

# Sample data containing PII
data = [
    "Ankit Gupta",
    "Lucknow",
    "+919140562195",
    "Indian",
    "Sofia Rossi",
    "Rome",
    "+390612345678",
    "Italian"
]

# Scan the structured data
results = pii_scanner.scan_structured_data(data, chunk_size=10, sample_size=1.0)

# Print the results
print(results)
```
## Output
```json
[
    {
        "text": "Ankit Gupta",
        "entity_detected": [
            {"type": "PERSON", "start": 0, "end": 11, "score": 0.85}
        ]
    },
    {
        "text": "Rome",
        "entity_detected": [
            {"type": "LOCATION", "start": 0, "end": 4, "score": 0.85}
        ]
    },
    {
        "text": "+919140562195",
        "entity_detected": [
            {"type": "PHONE_NUMBER", "start": 0, "end": 13, "score": 0.85}
        ]
    },
    {
        "text": "Sofia Rossi",
        "entity_detected": [
            {"type": "PERSON", "start": 0, "end": 11, "score": 0.85}
        ]
    },
    {
        "text": "Indian",
        "entity_detected": [
            {"type": "NATIONALITY", "start": 0, "end": 6, "score": 0.9}
        ]
    }
]
```
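Results in this shape can be post-processed with plain Python. The sketch below groups the matched substrings by entity type; it relies only on the structure shown in the sample output, and it assumes that `start` and `end` are character offsets into `text` with `end` exclusive (which matches the sample, where `end` equals the string length). `group_by_entity_type` and `min_score` are hypothetical names introduced for illustration, not part of the library.

```python
from collections import defaultdict

# Sample results copied from the output above (three entries for brevity).
results = [
    {"text": "Ankit Gupta",
     "entity_detected": [{"type": "PERSON", "start": 0, "end": 11, "score": 0.85}]},
    {"text": "Rome",
     "entity_detected": [{"type": "LOCATION", "start": 0, "end": 4, "score": 0.85}]},
    {"text": "Indian",
     "entity_detected": [{"type": "NATIONALITY", "start": 0, "end": 6, "score": 0.9}]},
]


def group_by_entity_type(results, min_score=0.0):
    """Hypothetical helper: map each entity type to the substrings it matched."""
    grouped = defaultdict(list)
    for item in results:
        for entity in item["entity_detected"]:
            # Skip detections below the chosen confidence threshold.
            if entity["score"] >= min_score:
                # Assumption: start/end are offsets into text, end exclusive.
                match = item["text"][entity["start"]:entity["end"]]
                grouped[entity["type"]].append(match)
    return dict(grouped)


print(group_by_entity_type(results))
```

Passing a higher `min_score` (for example `0.9`) keeps only the detections the scanner scored at or above that value.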
## Raw data
{
"_id": null,
"home_page": "https://github.com/devankit01/pii_scanner",
"name": "pii-scanner",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": "Ankit Gupta",
"author_email": "devankitgupta01@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/d5/2d/672e0e75d183525ef0a889e6e6524080c18589a70f46917dcc91bab45375/pii_scanner-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A library for scanning Personally Identifiable Information (PII).",
"version": "0.1.3",
"project_urls": {
"Homepage": "https://github.com/devankit01/pii_scanner"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "db800a8973b1c0fcd61016f39cd9f94fa6b7e068c59130ffc0d6ecbc53237c66",
"md5": "0c63bf722fb339b1f54aa1a470a5765f",
"sha256": "0393533a04f8bf96de37183dca0e8a8270c5503d0f72bd52c4708cf7c414834e"
},
"downloads": -1,
"filename": "pii_scanner-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0c63bf722fb339b1f54aa1a470a5765f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 5388,
"upload_time": "2024-09-29T09:13:00",
"upload_time_iso_8601": "2024-09-29T09:13:00.845725Z",
"url": "https://files.pythonhosted.org/packages/db/80/0a8973b1c0fcd61016f39cd9f94fa6b7e068c59130ffc0d6ecbc53237c66/pii_scanner-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d52d672e0e75d183525ef0a889e6e6524080c18589a70f46917dcc91bab45375",
"md5": "b853c9736a5f2a32744f735160d709b2",
"sha256": "36bf9208ea40d428a58e592527f57b0e01d7536f2853154387a7c397b1b76cab"
},
"downloads": -1,
"filename": "pii_scanner-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "b853c9736a5f2a32744f735160d709b2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 4460,
"upload_time": "2024-09-29T09:13:02",
"upload_time_iso_8601": "2024-09-29T09:13:02.282830Z",
"url": "https://files.pythonhosted.org/packages/d5/2d/672e0e75d183525ef0a889e6e6524080c18589a70f46917dcc91bab45375/pii_scanner-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-29 09:13:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "devankit01",
"github_project": "pii_scanner",
"github_not_found": true,
"lcname": "pii-scanner"
}