# Complex Parser
Complex Parser is a Python package for extracting data from JSON-like structures, with synonym retrieval built in. Whether you're working with deeply nested JSON or simple dictionaries, it extracts specific data elements based on user-defined format keys and uses synonym retrieval to recognize keys that are named differently from one record to the next.
## Features
### Data Extraction
- **Structured Data Extraction:** Extract specific data elements from nested JSON-like structures based on user-specified format keys.
- **Customizable Format Keys:** Define format keys to precisely target the data elements you need, making it adaptable to a wide range of data structures.
- **Efficient Data Parsing:** Parses the data and extracts the relevant information with minimal computational overhead.
- **Thread-Based Chunking:** Splits larger data sets into chunks and uses threads to work through them quickly.
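The thread-based chunking idea can be sketched as follows. This is a minimal illustration and not the package's actual implementation: the helper names `chunked`, `filter_chunk`, and `filter_records`, as well as the `chunk_size` and `max_workers` parameters, are hypothetical and exist only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def filter_chunk(chunk, wanted_keys):
    """Keep the records in one chunk that contain at least one wanted key."""
    return [record for record in chunk if any(key in record for key in wanted_keys)]


def filter_records(records, wanted_keys, chunk_size=2, max_workers=4):
    """Filter `records` chunk by chunk on a thread pool, preserving input order."""
    wanted = set(wanted_keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in submission order, so the output order is stable.
        results = pool.map(lambda chunk: filter_chunk(chunk, wanted), chunked(records, chunk_size))
        return [record for chunk_result in results for record in chunk_result]


records = [
    {"name": "John", "age": 30},
    {"unknown": "Job", "age": 27},
    {"name": "Joshua", "age": 3100},
]
kept = filter_records(records, ["name"])
print(kept)
```

In CPython, threads mainly help when the per-chunk work waits on I/O; for purely CPU-bound filtering the benefit may be small.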
### Synonym Retrieval
- **Semantic Enrichment:** Enhance the semantic richness of your data by retrieving synonyms for key terms using both WordNet and custom synonym lists.
- **Flexible Synonym Loading:** Load additional synonyms from custom lists to expand the synonym pool for specific terms, allowing for fine-tuned control over synonym retrieval.
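To make the custom synonym lists concrete, here is a small sketch of how a record's keys could be matched against format keys and their synonyms. It assumes a synonym mapping shaped like the `load_lists` argument in the usage example; the functions `build_alias_map` and `resolve_keys` are hypothetical and not part of the package, and the WordNet lookup is left out to keep the example dependency-free.

```python
def build_alias_map(format_keys, load_lists):
    """Map every format key and each of its custom synonyms to the format key."""
    alias_to_key = {key: key for key in format_keys}
    for key, aliases in load_lists.items():
        for alias in aliases:
            alias_to_key[alias] = key
    return alias_to_key


def resolve_keys(record, alias_to_key):
    """Return {format_key: matching_record_key} for one record (first match wins)."""
    matches = {}
    for record_key in record:
        canonical = alias_to_key.get(record_key)
        if canonical is not None and canonical not in matches:
            matches[canonical] = record_key
    return matches


alias_to_key = build_alias_map(
    ["name", "address"],
    {"address": ["location"], "name": ["fullname"]},
)
record = {"fullname": "Job", "age": 27, "location": {"city": "London"}}
matches = resolve_keys(record, alias_to_key)
print(matches)  # {'name': 'fullname', 'address': 'location'}
```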
### Ease of Use
- **Simple Integration:** Integrate seamlessly into your Python projects with an intuitive interface and straightforward usage.
- **Comprehensive Documentation:** Detailed documentation and examples are provided for easy reference and quick integration into your projects.
## Installation
You can install Complex Parser via pip:
```bash
pip install complex-parser
```
## Usage
Here's a simple example demonstrating how to use the package:
```python
from complex_parser import extract_data

# Example data: records that name the same fields in different ways
data = {
    "people": [
        {
            "name": "John",
            "age": 30,
            "address": {
                "road": "123 Main St",
                "city": "Anytown"
            }
        },
        {
            "name": "Joshua",
            "age": 3100,
            "address": {
                "road": "657 Loud St",
                "city": "Basictown",
                "landmark": "Town Square"
            }
        },
        {
            "name": "John",
            "age": 30,
            "location": {
                "road": "8474 Main St",
                "city": "None"
            }
        },
        {
            "fullname": "Job",
            "age": 27,
            "destination": {
                "road": "8474 John's St",
                "city": "London"
            }
        },
        {
            "unknown": "Job",
            "age": 27,
            "destination": {
                "road": "8474 John's St",
                "city": "London"
            }
        }
    ]
}

# Keys to look for in each record
format_keys = ["name", "address"]

# Custom synonyms for each format key
load_lists = {
    "address": [
        "location"
    ],
    "name": [
        "fullname"
    ]
}

# Extract data with specified format keys
extracted_data = extract_data(data=data, format_keys=format_keys, load_lists=load_lists)
print(extracted_data)
```
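Output (the last record, none of whose keys is a format key or a listed synonym, is absent from the result):
```text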
[{'name': 'John', 'age': 30, 'address': {'road': '123 Main St', 'city': 'Anytown'}}, {'name': 'Joshua', 'age': 3100, 'address': {'road': '657 Loud St', 'city': 'Basictown', 'landmark': 'Town Square'}}, {'name': 'John', 'age': 30, 'location': {'road': '8474 Main St', 'city': 'None'}}, {'fullname': 'Job', 'age': 27, 'destination': {'road': "8474 John's St", 'city': 'London'}}]
```
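The output above keeps each record's original key names, such as `location` and `fullname`. If downstream code expects the format keys themselves, a post-processing step along these lines could rename them. This is only a sketch: `canonicalize` is a hypothetical helper, not something the package provides, and the sample records are copied from the output shown above.

```python
def canonicalize(records, load_lists):
    """Rename synonym keys to their format key; all other keys are left unchanged."""
    alias_to_key = {
        alias: key
        for key, aliases in load_lists.items()
        for alias in aliases
    }
    return [
        {alias_to_key.get(field, field): value for field, value in record.items()}
        for record in records
    ]


load_lists = {"address": ["location"], "name": ["fullname"]}

# Two records taken from the output shown above
extracted_data = [
    {"name": "John", "age": 30, "location": {"road": "8474 Main St", "city": "None"}},
    {"fullname": "Job", "age": 27, "destination": {"road": "8474 John's St", "city": "London"}},
]

normalized = canonicalize(extracted_data, load_lists)
print(normalized)
```

If a record contains both a format key and one of its synonyms, the value that appears later in the record overwrites the earlier one, so such collisions may need explicit handling.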
## License
This project is licensed under the Mozilla Public License Version 2.0 - see the [LICENSE](./LICENSE) file for details.
## Contributing
Contributions are welcome! Please feel free to submit bug reports, feature requests, or pull requests on the GitHub repository.
Raw data
{
"_id": null,
"home_page": "https://github.com/The-Nebula-Developers/Complex-Parser",
"name": "complex-parser",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": "",
"keywords": "python,data,parsing,json,complex,synonyms,similar,custom parsing,data parser,synomym data parser,keyword extractor,fuzzywuzzy,nltk,words,json data",
"author": "The Nebula Developer",
"author_email": "support@nebuladevs.tech",
"download_url": "https://files.pythonhosted.org/packages/7a/ec/8ae1818bad930516b5e0ea85e692483131994006e396e5089fadabfc8ba8/complex_parser-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Mozilla Public License Version 2.0",
"summary": "A versatile Python package for data extraction from JSON-like structures with user-defined format keys, enhanced with synonym retrieval capabilities.",
"version": "0.0.2",
"project_urls": {
"Homepage": "https://github.com/The-Nebula-Developers/Complex-Parser"
},
"split_keywords": [
"python",
"data",
"parsing",
"json",
"complex",
"synonyms",
"similar",
"custom parsing",
"data parser",
"synomym data parser",
"keyword extractor",
"fuzzywuzzy",
"nltk",
"words",
"json data"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "25d4aabadd38b4987116e5beeb5d183bfbc6264f9db0f5d74f03060f42e79c6f",
"md5": "950a144939e2009ff8e3100f141d89b9",
"sha256": "0099cf21db3f920093dede21b04d0ddb318a86d3e54ffa4ed4424cb9499bd17e"
},
"downloads": -1,
"filename": "complex_parser-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "950a144939e2009ff8e3100f141d89b9",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 4836,
"upload_time": "2024-03-15T20:22:41",
"upload_time_iso_8601": "2024-03-15T20:22:41.392750Z",
"url": "https://files.pythonhosted.org/packages/25/d4/aabadd38b4987116e5beeb5d183bfbc6264f9db0f5d74f03060f42e79c6f/complex_parser-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7aec8ae1818bad930516b5e0ea85e692483131994006e396e5089fadabfc8ba8",
"md5": "3b55209dcd571c7d75e697ede5878a89",
"sha256": "10b4e8b485d09575864e795ddff157ec32999d6ac8de83550728d4f968c0f9cc"
},
"downloads": -1,
"filename": "complex_parser-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "3b55209dcd571c7d75e697ede5878a89",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 4551,
"upload_time": "2024-03-15T20:22:43",
"upload_time_iso_8601": "2024-03-15T20:22:43.035988Z",
"url": "https://files.pythonhosted.org/packages/7a/ec/8ae1818bad930516b5e0ea85e692483131994006e396e5089fadabfc8ba8/complex_parser-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-15 20:22:43",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "The-Nebula-Developers",
"github_project": "Complex-Parser",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "complex-parser"
}