incognito-anonymizer


- Name: incognito-anonymizer
- Version: 0.0.5
- Summary: A module to anonymize french text data
- Upload time: 2025-01-27 13:03:21
- Keywords: anonymizer, incognito
- Requirements: No requirements were recorded.
            
# Incognito

## Description
**Incognito** is a Python module for anonymizing French text. It uses regular expressions and other strategies to mask names and other personal information supplied by the user.  
This module was specifically designed for medical reports, ensuring that disease names remain unaltered.

[![python](https://img.shields.io/badge/Python-3.12-3776AB.svg?style=flat&logo=python&logoColor=white)](https://www.python.org)

---

## Installation
### From pip
```bash
pip install incognito-anonymizer
```
### From this repository
1. Clone the repository:
    ```bash
    git clone https://github.com/Micropot/incognito
    ```

2. Install the package and its dependencies (defined in `pyproject.toml`) from the repository root:
    ```bash
    cd incognito
    pip install .
    ```

---

## Usage

### Python API

#### Example: Providing Personal Information Directly in Code
```python
# Assumes the installed package is importable as `incognito`, as in `python -m incognito`
from incognito import anonymizer

# Initialize the anonymizer
ano = anonymizer.Anonymizer()

# Define personal information
infos = {
    "first_name": "Bob",
    "last_name": "Jungels",
    "birth_name": "",
    "birthdate": "1992-09-22",
    "ipp": "0987654321",
    "postal_code": "01000",
    "adress": ""
}

# Configure the anonymizer
ano.set_info(infos)
ano.set_strategies(['regex', 'pii'])
ano.set_masks('placeholder')

# Read and anonymize text
text_to_anonymize = ano.open_text_file("/path/to/file.txt")
anonymized_text = ano.anonymize(text_to_anonymize)

print(anonymized_text)
```
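
To give an intuition for what a strategy such as `pii` combined with a `placeholder` mask might do, here is a minimal standalone sketch. It does not use Incognito's code: the `mask_known_values` function and the `<FIRST_NAME>`-style placeholders are assumptions made for illustration, and the library's real output format may differ.

```python
import re


def mask_known_values(text: str, infos: dict[str, str]) -> str:
    """Replace each non-empty value of `infos` found in `text` with a placeholder."""
    # Mask longer values first so a short value cannot split a longer one.
    for key, value in sorted(infos.items(), key=lambda kv: -len(kv[1])):
        if not value:
            continue  # skip fields left empty, such as "birth_name"
        # Match the value as a whole token, ignoring case (hypothetical rule).
        pattern = re.compile(rf"(?<!\w){re.escape(value)}(?!\w)", re.IGNORECASE)
        text = pattern.sub(f"<{key.upper()}>", text)
    return text


infos = {"first_name": "Bob", "last_name": "Jungels", "birth_name": "", "ipp": "0987654321"}
report = "Patient Bob JUNGELS (IPP 0987654321) suivi pour un diabète."
masked = mask_known_values(report, infos)
print(masked)
```

Only the values supplied in `infos` are touched, so the disease name in the sample sentence is left as it is.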

#### Example: Using JSON File for Personal Information
```python
# Assumes the installed package is importable as `incognito`, as in `python -m incognito`
from incognito import anonymizer

# Initialize the anonymizer
ano = anonymizer.Anonymizer()

# Load personal information from JSON
infos_json = ano.open_json_file("/path/to/infofile.json")

# Configure the anonymizer
ano.set_info(infos_json)
ano.set_strategies(['regex', 'pii'])
ano.set_masks('placeholder')

# Read and anonymize text
text_to_anonymize = ano.open_text_file("/path/to/file.txt")
anonymized_text = ano.anonymize(text_to_anonymize)

print(anonymized_text)
```
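
The JSON file is expected to carry the same personal information as the dictionary in the first example. The sketch below writes such a file with the standard library only. It assumes the file uses exactly the same keys as that dictionary (including the `adress` spelling), which should be checked against the project's documentation; the file name and the temporary directory are arbitrary.

```python
import json
import tempfile
from pathlib import Path

# Same fields as the dictionary passed to set_info() in the first example
infos = {
    "first_name": "Bob",
    "last_name": "Jungels",
    "birth_name": "",
    "birthdate": "1992-09-22",
    "ipp": "0987654321",
    "postal_code": "01000",
    "adress": "",
}

# Write the file into a temporary directory (any writable path works)
info_path = Path(tempfile.mkdtemp()) / "infofile.json"
info_path.write_text(json.dumps(infos, indent=4, ensure_ascii=False), encoding="utf-8")

# Read it back to check that the content round-trips unchanged
loaded = json.loads(info_path.read_text(encoding="utf-8"))
print(loaded["last_name"])
```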

### Command-Line Interface (CLI)

#### Basic Usage
```bash
python -m incognito --input myinputfile.txt --output myanonymizedfile.txt --strategies mystrategies --mask mymasks
```

#### Find Available Strategies and Masks
```bash
python -m incognito --help
```

#### Anonymization with JSON File
```bash
python -m incognito --input myinputfile.txt --output myanonymizedfile.txt --strategies mystrategies --mask mymasks json --json myjsonfile.json
```

To view the help for the `json` subcommand:
```bash
python -m incognito json --help
```

#### Anonymization with Personal Information in CLI
```bash
python -m incognito --input myinputfile.txt --output myanonymizedfile.txt --strategies mystrategies --mask mymasks infos --first_name Bob --last_name Dylan --birthdate 1800-01-01 --ipp 0987654312 --postal_code 75001
```

To view the help for the `infos` subcommand:
```bash
python -m incognito infos --help
```
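
When many files have to be processed, the CLI call can be assembled from a script. The following sketch only builds the argument list from the flags shown above. The helper name `build_command` is hypothetical, and the `regex` and `placeholder` values are taken from the Python API examples on the assumption that the CLI accepts the same names; `python -m incognito --help` remains the authoritative source.

```python
import subprocess
import sys


def build_command(input_path, output_path, strategy, mask, json_path):
    """Assemble the argument list for the `json` subcommand shown above."""
    return [
        sys.executable, "-m", "incognito",
        "--input", input_path,
        "--output", output_path,
        "--strategies", strategy,
        "--mask", mask,
        "json", "--json", json_path,
    ]


cmd = build_command("report.txt", "report_anonymized.txt", "regex", "placeholder", "patient.json")
print(" ".join(cmd[1:]))

# Uncomment to actually run the command (requires incognito-anonymizer to be installed):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.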

---

## Unit Tests

Unit tests are included to check that the module behaves as expected. You can adapt them to your needs.

To run the tests:
```bash
make test
```

To check code coverage:
```bash
make cov
```

---

## Anonymization Process Details

### Regex Strategy
One available anonymization strategy is **Regex**. It can extract and mask specific information from the input text, such as:
- Email addresses
- Phone numbers
- French NIR (social security number)
- First and last names (if preceded by titles like "Monsieur", "Madame", "Mr", "Mme", "Docteur", "Professeur", etc.)

For more details, see the [`RegexStrategy` class](incognito/analyzer.py) and the `self.title_regex` variable.
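
As a rough illustration of the idea, the sketch below masks the four kinds of information listed above with Python's `re` module. The patterns are simplified assumptions written for this example, not the ones used by `RegexStrategy`, and the placeholder strings are likewise hypothetical.

```python
import re

# Simplified, illustrative patterns (assumptions, not the library's own regexes).
# The NIR is handled before phone numbers so its digit groups are not split up.
PATTERNS = [
    ("<EMAIL>", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("<NIR>", re.compile(
        r"(?<!\d)[12]\s?\d{2}\s?\d{2}\s?(?:\d{2}|2[AB])\s?\d{3}\s?\d{3}\s?\d{2}(?!\d)")),
    ("<PHONE>", re.compile(r"(?<!\d)(?:\+33\s?|0)[1-9](?:[\s.-]?\d{2}){4}(?!\d)")),
]

# A title followed by one or two capitalized words; the title itself is kept.
TITLE_NAME = re.compile(
    r"\b(Monsieur|Madame|Mme|Mr|Docteur|Professeur)\.?\s+"
    r"[A-ZÀ-ÖØ-Þ][\w-]+(?:\s+[A-ZÀ-ÖØ-Þ][\w-]+)?"
)


def mask_with_regex(text: str) -> str:
    """Apply each pattern in turn and replace matches with a placeholder."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return TITLE_NAME.sub(r"\1 <NAME>", text)


sample = (
    "Madame Claire Dupont (claire.dupont@example.org, 06 12 34 56 78), "
    "NIR 2 92 09 75 123 456 78, vue par le Docteur Martin."
)
masked = mask_with_regex(sample)
print(masked)
```

Names that are not introduced by a title are left untouched by this kind of pattern, which is why a regex strategy is typically combined with the personal information supplied by the user.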

---

## Documentation 
The documentation is available [`here`](https://micropot.github.io/incognito/).

## License

This project is licensed under the terms of the [MIT License](LICENSE).

---

## Contributors

- Maintainer: Micropot  
Feel free to open issues or contribute via pull requests!

## Similar project

- [EDS-Pseudo](https://github.com/aphp/eds-pseudo/tree/main)


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "incognito-anonymizer",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "anonymizer, incognito",
    "author": null,
    "author_email": "Arthur Lamard <arthur@lamard.org>",
    "download_url": "https://files.pythonhosted.org/packages/e5/81/eafba475de0987cd8d691eee32e70aae4732d06cfa7070a11ee4d91cf84b/incognito_anonymizer-0.0.5.tar.gz",
    "platform": null,
    "description": "(identical to the README text above)",
    "bugtrack_url": null,
    "license": null,
    "summary": "A module to anonymize french text data",
    "version": "0.0.5",
    "project_urls": {
        "Repository": "https://github.com/Micropot/incognito"
    },
    "split_keywords": [
        "anonymizer",
        " incognito"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "342231519b88522a941aaa41e3085dbe4e0d293601f80a2cee5a1d7cf390fc73",
                "md5": "74a99333c408db872ad03ef96e178bd8",
                "sha256": "99a267085946635f00cb0cfe83e62f6665d7c4900e98215ac4f9d824acd91178"
            },
            "downloads": -1,
            "filename": "incognito_anonymizer-0.0.5-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "74a99333c408db872ad03ef96e178bd8",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 9798,
            "upload_time": "2025-01-27T13:03:19",
            "upload_time_iso_8601": "2025-01-27T13:03:19.803419Z",
            "url": "https://files.pythonhosted.org/packages/34/22/31519b88522a941aaa41e3085dbe4e0d293601f80a2cee5a1d7cf390fc73/incognito_anonymizer-0.0.5-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e581eafba475de0987cd8d691eee32e70aae4732d06cfa7070a11ee4d91cf84b",
                "md5": "ad2a3352dfeec91d6afb5230f83c56f2",
                "sha256": "759c0e7a2a33d24f4270d1857e87b1a36a179097b97d19605a28aaf0f4f1217b"
            },
            "downloads": -1,
            "filename": "incognito_anonymizer-0.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "ad2a3352dfeec91d6afb5230f83c56f2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 15212,
            "upload_time": "2025-01-27T13:03:21",
            "upload_time_iso_8601": "2025-01-27T13:03:21.066714Z",
            "url": "https://files.pythonhosted.org/packages/e5/81/eafba475de0987cd8d691eee32e70aae4732d06cfa7070a11ee4d91cf84b/incognito_anonymizer-0.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-27 13:03:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Micropot",
    "github_project": "incognito",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "incognito-anonymizer"
}
        