# PII Scanner
A library for detecting Personally Identifiable Information (PII) in text.
## Installation
```bash
pip install PIIScanner
```
## Usage
```python
from pii_scanner.scanner import PIIScanner
# Create an instance of the scanner
pii_scanner = PIIScanner()
# Sample data containing PII
data = [
    "Ankit Gupta",
    "Lucknow",
    "+919140562195",
    "Indian",
    "Sofia Rossi",
    "Rome",
    "+390612345678",
    "Italian",
]
# Scan the structured data
results = pii_scanner.scan(data=data, chunk_size=10, sample_size=1.0)
# Print the results
print(results)
```
## Output
```json
[
  {
    "text": "Ankit Gupta",
    "entity_detected": [
      {"type": "PERSON", "start": 0, "end": 11, "score": 0.85}
    ]
  },
  {
    "text": "Rome",
    "entity_detected": [
      {"type": "LOCATION", "start": 0, "end": 4, "score": 0.85}
    ]
  },
  {
    "text": "+919140562195",
    "entity_detected": [
      {"type": "PHONE_NUMBER", "start": 0, "end": 13, "score": 0.85}
    ]
  },
  {
    "text": "Sofia Rossi",
    "entity_detected": [
      {"type": "PERSON", "start": 0, "end": 11, "score": 0.85}
    ]
  },
  {
    "text": "Indian",
    "entity_detected": [
      {"type": "NATIONALITY", "start": 0, "end": 6, "score": 0.9}
    ]
  }
]
```
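The result structure shown above can be post-processed with plain Python. The following is a minimal sketch, not part of the library: `group_by_type` and `redact` are hypothetical helper names, and the code assumes that `start` and `end` are half-open character offsets into `text` (which is consistent with the sample, where "Ankit Gupta" spans 0 to 11). The sample results are copied from the output above so the snippet runs without the scanner installed.

```python
from collections import defaultdict

# Sample results copied from the output above; in real use these would be
# the return value of PIIScanner().scan(...).
results = [
    {"text": "Ankit Gupta",
     "entity_detected": [{"type": "PERSON", "start": 0, "end": 11, "score": 0.85}]},
    {"text": "Rome",
     "entity_detected": [{"type": "LOCATION", "start": 0, "end": 4, "score": 0.85}]},
    {"text": "Indian",
     "entity_detected": [{"type": "NATIONALITY", "start": 0, "end": 6, "score": 0.9}]},
]


def group_by_type(results, min_score=0.0):
    """Map each entity type to the matched substrings scoring at least min_score."""
    grouped = defaultdict(list)
    for item in results:
        for entity in item["entity_detected"]:
            if entity["score"] >= min_score:
                # Assumption: start/end are half-open offsets into item["text"].
                span = item["text"][entity["start"]:entity["end"]]
                grouped[entity["type"]].append(span)
    return dict(grouped)


def redact(item, mask="*"):
    """Return the item's text with every detected span overwritten by mask."""
    chars = list(item["text"])
    for entity in item["entity_detected"]:
        for i in range(entity["start"], min(entity["end"], len(chars))):
            chars[i] = mask
    return "".join(chars)


print(group_by_type(results, min_score=0.9))
print([redact(item) for item in results])
```

Filtering on `score` before acting on a match is a design choice rather than something the library requires; the threshold that suits a given dataset would need to be chosen empirically.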
Raw data
{
"_id": null,
"home_page": "https://github.com/devankit01/pii_scanner",
"name": "PIIScanner",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": null,
"author": "Ankit Gupta",
"author_email": "devankitgupta01@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/4e/0c/348b5e5cec05cc7d1ccd3de5e09f10cce57cf4326d71edd57b170fe20647/piiscanner-0.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A library for scanning Personally Identifiable Information (PII).",
"version": "0.1.2",
"project_urls": {
"Homepage": "https://github.com/devankit01/pii_scanner"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b73562aad6b069d6ae1b4cb5a64c8d29e9bbbe5626a9b12bd50f8bf6eb8dca2a",
"md5": "773bc8a342e1deca4bdd75a62178541a",
"sha256": "2de6e9b13676b3834e3383eb34f85f93b3b151d511a0122dc134e8cb8fbf52d6"
},
"downloads": -1,
"filename": "PIIScanner-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "773bc8a342e1deca4bdd75a62178541a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 6673,
"upload_time": "2024-12-12T13:05:40",
"upload_time_iso_8601": "2024-12-12T13:05:40.590029Z",
"url": "https://files.pythonhosted.org/packages/b7/35/62aad6b069d6ae1b4cb5a64c8d29e9bbbe5626a9b12bd50f8bf6eb8dca2a/PIIScanner-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "4e0c348b5e5cec05cc7d1ccd3de5e09f10cce57cf4326d71edd57b170fe20647",
"md5": "d6e70f275f7ba722197619e065b16ecf",
"sha256": "54112b277d0dbc4b48c76fa515c4cb67e984e6130718bf2bfbbedff217857b5e"
},
"downloads": -1,
"filename": "piiscanner-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "d6e70f275f7ba722197619e065b16ecf",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 6162,
"upload_time": "2024-12-12T13:05:42",
"upload_time_iso_8601": "2024-12-12T13:05:42.130450Z",
"url": "https://files.pythonhosted.org/packages/4e/0c/348b5e5cec05cc7d1ccd3de5e09f10cce57cf4326d71edd57b170fe20647/piiscanner-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-12 13:05:42",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "devankit01",
"github_project": "pii_scanner",
"github_not_found": true,
"lcname": "piiscanner"
}