# PII Extractor plugin: regex
This repository builds a Python package that installs a pii-extract-base
plugin that performs PII detection on text data based on regular expressions
(with optional context). The name of the plugin entry point is
`piisa-detectors-regex`.
The PII Tasks in the package are structured by language & country, since many
of the PII elements are language- and/or country-dependent.
## Requirements
The package
* needs at least Python 3.8
* needs the pii-data and the pii-extract-base base packages
* uses the regex package (instead of the standard `re` package in the core
Python library)
* uses the python-stdnum package to validate many identifiers (and the
python-phonenumbers package to validate phone numbers)
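The general idea of regex detection with optional context, followed by validation, can be sketched as below. This is only an illustration, not the package's code: it uses the standard `re` module and a hand-written Luhn check as stand-ins for the regex and python-stdnum packages so that it runs without extra dependencies, and the pattern, the context words and the names `CARD_RE`, `CONTEXT_RE`, `luhn_valid` and `find_cards` are all hypothetical.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or dashes (not taken from the package).
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
# Hypothetical context words expected near a candidate.
CONTEXT_RE = re.compile(r"\b(?:card|visa|mastercard)\b", re.IGNORECASE)


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str, context_width: int = 30):
    """Return (match, position) pairs that have context and a valid checksum."""
    found = []
    for m in CARD_RE.finditer(text):
        start = max(0, m.start() - context_width)
        window = text[start:m.end() + context_width]
        # Keep a candidate only if a context word is nearby AND it validates.
        if CONTEXT_RE.search(window) and luhn_valid(m.group()):
            found.append((m.group(), m.start()))
    return found
```

Requiring both a context word and a passing checksum trades some recall for fewer false positives, which is the usual motivation for pairing regular expressions with a validator.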
## Usage
The package does not have any user-facing entry points; it is used
automatically by the PIISA framework.
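To check whether the plugin is visible in the current environment, one option is to look for its entry point name with the standard `importlib.metadata` module. The sketch below makes no assumption about the entry point group used by the framework and searches all groups for the name; the helper `find_entry_points` is hypothetical and not part of the package.

```python
from importlib.metadata import entry_points


def find_entry_points(name: str):
    """Return all installed entry points with the given name, in any group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the returned object supports selection by name
        return list(eps.select(name=name))
    # Python 3.8 / 3.9: a plain dict mapping group name -> entry points
    return [ep for group in eps.values() for ep in group if ep.name == name]


if __name__ == "__main__":
    for ep in find_entry_points("piisa-detectors-regex"):
        print(ep.group, ep.name, ep.value)
```

If the package is not installed, the list is simply empty.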
## Building
The provided Makefile can be used to build, test and install the package:
* `make pkg` will build the Python package, creating a file that can be
installed with `pip`
* `make unit` will launch all unit tests (using pytest, so pytest must be
available)
* `make install` will install the package in a Python virtualenv. The
  virtualenv is chosen in this order:
   - the one defined in the `VENV` environment variable, if it is defined
   - the virtualenv currently activated in the shell, if there is one
   - otherwise, the default `/opt/venv/bigscience` (it will be
     created if it does not exist)
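The selection order for `make install` can be pictured with a small Python sketch. It is not the Makefile's actual logic, which is not shown here. It assumes that an activated virtualenv is detected through the `VIRTUAL_ENV` variable set by the standard activation scripts, and that an empty variable counts as undefined; `choose_venv` and `DEFAULT_VENV` are hypothetical names.

```python
import os

DEFAULT_VENV = "/opt/venv/bigscience"  # default path named in the README


def choose_venv(environ=None) -> str:
    """Pick the target virtualenv: VENV, then the active one, then the default."""
    env = os.environ if environ is None else environ
    # Assumption: empty values are treated the same as undefined ones.
    return env.get("VENV") or env.get("VIRTUAL_ENV") or DEFAULT_VENV


if __name__ == "__main__":
    print(choose_venv())
```

Passing a dictionary instead of reading `os.environ` directly makes the precedence easy to check in isolation.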
## Contributing
To add a new PII processing task, please see the contributing instructions.
## Raw data
{
"_id": null,
"home_page": "https://github.com/piisa/pii-extract-plg-regex",
"name": "pii-extract-plg-regex",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "PIISA, PII",
"author": "Paulo Villegas",
"author_email": "paulo.vllgs@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/ab/81/68ee28e00787824e53f0f00c81eec7d5cdd3ef160889dd7524d1f3632112/pii-extract-plg-regex-0.5.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache",
"summary": "Regex modules for the extraction of PII from text chunks",
"version": "0.5.1",
"project_urls": {
"Download": "https://github.com/piisa/pii-extract-plg-regex/tarball/v0.5.1",
"Homepage": "https://github.com/piisa/pii-extract-plg-regex"
},
"split_keywords": [
"piisa",
" pii"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ab8168ee28e00787824e53f0f00c81eec7d5cdd3ef160889dd7524d1f3632112",
"md5": "526e698972703cf3240043bd5eadb52c",
"sha256": "1ff882fa5a36c39633aa93c38731869acecd5f71bbecc46c856e2219b07d1d85"
},
"downloads": -1,
"filename": "pii-extract-plg-regex-0.5.1.tar.gz",
"has_sig": false,
"md5_digest": "526e698972703cf3240043bd5eadb52c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 24062,
"upload_time": "2024-01-24T19:47:26",
"upload_time_iso_8601": "2024-01-24T19:47:26.718026Z",
"url": "https://files.pythonhosted.org/packages/ab/81/68ee28e00787824e53f0f00c81eec7d5cdd3ef160889dd7524d1f3632112/pii-extract-plg-regex-0.5.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-24 19:47:26",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "piisa",
"github_project": "pii-extract-plg-regex",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "pii-extract-plg-regex"
}