| Field | Value |
| --- | --- |
| Name | content-genome-mapper |
| Version | 0.1.0 |
| home_page | None |
| Summary | 🧬 A CLI tool to crawl, analyze, and extract content structure semantically. |
| upload_time | 2025-07-13 15:40:13 |
| maintainer | None |
| docs_url | None |
| author | Your Name |
| requires_python | >=3.8 |
| license | MIT |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
# 🧬 Content Genome Mapper CLI
A command-line tool to **crawl, analyze, visualize, and export semantic content data** from web pages — ideal for SEO, content analysis, and research.
---
## Features
- 🌐 Crawl web pages and store raw content
- 🔍 Extract salient entities and score their importance
- 🧩 Identify topic clusters from content
- 📊 Optional sentiment analysis (polarity & subjectivity)
- 📍 Extract named entities
- 🔄 Compare content and structure of two URLs
- 🧬 Batch process multiple URLs from a file
- 📦 Export analysis reports in Markdown and JSON formats
---
## Installation
```bash
pip install content_genome_mapper
```
The distribution is published on PyPI as `content-genome-mapper`; pip normalizes project names, so the underscore and hyphen spellings install the same package.
---
## Usage
Use the CLI tool `genome` with the following commands:
### Crawl a URL
```bash
genome crawl https://example.com
```
### Analyze a URL (optional sentiment & export)
```bash
genome analyze https://example.com --sentiment --export md
```
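For scripted runs, the command above can be assembled programmatically. The following is a minimal sketch and not part of the package: `build_analyze_command` and `run_analyze` are hypothetical helpers, and they rely only on the flags shown in this README (`--sentiment`, and `--export` with `md` or `json`).

```python
import shutil
import subprocess

# Export formats listed in this README; anything else is rejected up front.
SUPPORTED_EXPORTS = {"md", "json"}


def build_analyze_command(url, sentiment=False, export=None):
    """Return the argv list for `genome analyze` (hypothetical helper)."""
    command = ["genome", "analyze", url]
    if sentiment:
        command.append("--sentiment")
    if export is not None:
        if export not in SUPPORTED_EXPORTS:
            raise ValueError(f"unsupported export format: {export!r}")
        command += ["--export", export]
    return command


def run_analyze(url, **options):
    """Run the command, provided the `genome` executable is on PATH."""
    if shutil.which("genome") is None:
        raise RuntimeError("`genome` not found; install the package first")
    return subprocess.run(build_analyze_command(url, **options), check=True)


# Show the command without executing it.
print(build_analyze_command("https://example.com", sentiment=True, export="md"))
```

Passing the arguments as a list avoids shell quoting problems when a URL contains characters such as `&` or `?`.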
### Extract named entities
```bash
genome entities https://example.com
```
### Extract topics
```bash
genome topics https://example.com
```
### Compare two URLs
```bash
genome compare https://site1.com https://site2.com
```
### Batch analyze URLs from a text file
```bash
genome batch urls.txt
```
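The layout of `urls.txt` is not specified here. The sketch below assumes one URL per line and shows one way to produce such a file from a Python list; `write_url_list` is a hypothetical helper, not something the package provides.

```python
from pathlib import Path
from urllib.parse import urlparse


def write_url_list(urls, path="urls.txt"):
    """Write unique http(s) URLs to `path`, one per line, and return them."""
    kept = []
    for raw in urls:
        url = raw.strip()
        parts = urlparse(url)
        # Keep only absolute http(s) URLs, and skip repeats.
        if parts.scheme in ("http", "https") and parts.netloc and url not in kept:
            kept.append(url)
    Path(path).write_text("".join(f"{url}\n" for url in kept), encoding="utf-8")
    return kept
```

After calling `write_url_list([...])`, the resulting file can be passed to `genome batch urls.txt` as shown above.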
---
## Export Formats
- Markdown (`md`) — saves a nicely formatted report
- JSON (`json`) — structured report data
- CSV — planned for future release

Reports are saved in the `data/` folder, with filenames derived from the URL's domain.
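Exported JSON reports can be read back with the standard library. The JSON schema is not documented here, so this sketch makes no assumption about field names and only lists the top-level keys of each file; `load_json_reports` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_json_reports(folder="data"):
    """Map each `*.json` file name in `folder` to its parsed content."""
    reports = {}
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            reports[path.name] = json.load(handle)
    return reports


for name, report in load_json_reports().items():
    # The report schema is not documented, so only show top-level keys.
    keys = sorted(report) if isinstance(report, dict) else type(report).__name__
    print(name, keys)
```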
---
## Example Markdown Report
```markdown
# 📌 Example Page Title
**URL**: https://example.com
**Sentiment:** Polarity 0.12, Subjectivity 0.34
## 🔍 Top Salient Entities:
- Entity1 (0.92)
- Entity2 (0.75)
## 🧩 Topics:
- Topic cluster 1
- Topic cluster 2
```
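If a Markdown report needs to be post-processed, the scored entity lines can be recovered with a regular expression. This sketch assumes reports follow the layout of the example above (`- Name (score)`); `parse_entities` is a hypothetical helper, and real reports may differ.

```python
import re

# Matches list items shaped like "- Entity1 (0.92)" from the example report.
ENTITY_LINE = re.compile(r"^- (?P<name>.+) \((?P<score>\d+(?:\.\d+)?)\)\s*$")


def parse_entities(report_text):
    """Return (name, score) pairs for every scored list item in the report."""
    pairs = []
    for line in report_text.splitlines():
        match = ENTITY_LINE.match(line.strip())
        if match:
            pairs.append((match["name"], float(match["score"])))
    return pairs


sample = """\
## 🔍 Top Salient Entities:
- Entity1 (0.92)
- Entity2 (0.75)
## 🧩 Topics:
- Topic cluster 1
"""
print(parse_entities(sample))  # [('Entity1', 0.92), ('Entity2', 0.75)]
```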
---
## Dependencies
- [Typer](https://typer.tiangolo.com/) — for CLI
- [Rich](https://rich.readthedocs.io/) — for console output
- [TextBlob](https://textblob.readthedocs.io/) — for sentiment analysis
- Internal modules: `content_genome_mapper.core.*`
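The polarity and subjectivity figures in a report correspond to the two values TextBlob exposes on its `sentiment` property: polarity in the range -1.0 to 1.0 and subjectivity in the range 0.0 to 1.0. How the tool calls TextBlob internally is not shown here, so the following is only a sketch of the underlying library call, with `sentiment_scores` as a hypothetical wrapper.

```python
def sentiment_scores(text):
    """Return (polarity, subjectivity) for `text` using TextBlob."""
    # Imported here so this module still loads when TextBlob is not installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment
    return sentiment.polarity, sentiment.subjectivity
```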
---
## Contributing
Feel free to submit issues or pull requests to improve the tool!
---
## License
MIT License © Amal Alexander
---
## Contact
For questions or support, contact: amalalex95@gmail.com
## GitHub Repo
- Homepage: https://github.com/amal-alexander
- Repository: https://github.com/amal-alexander?tab=repositories
Raw data
{
"_id": null,
"home_page": null,
"name": "content-genome-mapper",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": "Your Name",
"author_email": "Amal Alexander <amalalex95@email.com>",
"download_url": "https://files.pythonhosted.org/packages/1d/f2/bda93191db397d9faaeceb05d4de4862ace208e4db6c75ffac64f2888f35/content_genome_mapper-0.1.0.tar.gz",
"platform": null,
"description": "# \ud83e\uddec Content Genome Mapper CLI\r\n\r\nA command-line tool to **crawl, analyze, visualize, and export semantic content data** from web pages \u2014 ideal for SEO, content analysis, and research.\r\n\r\n---\r\n\r\n## Features\r\n\r\n- \ud83c\udf10 Crawl web pages and store raw content \r\n- \ud83d\udd0d Extract salient entities and score their importance \r\n- \ud83e\udde9 Identify topic clusters from content \r\n- \ud83d\udcca Optional sentiment analysis (polarity & subjectivity) \r\n- \ud83d\udccd Extract named entities \r\n- \ud83d\udd04 Compare content and structure of two URLs \r\n- \ud83e\uddec Batch process multiple URLs from a file \r\n- \ud83d\udce6 Export analysis reports in Markdown and JSON formats \r\n\r\n---\r\n\r\n## Installation\r\n\r\n```bash\r\npip install content_genome_mapper\r\n```\r\n\r\n*Replace with your package install instructions if different.*\r\n\r\n---\r\n\r\n## Usage\r\n\r\nUse the CLI tool `genome` with the following commands:\r\n\r\n### Crawl a URL\r\n\r\n```bash\r\ngenome crawl https://example.com\r\n```\r\n\r\n### Analyze a URL (optional sentiment & export)\r\n\r\n```bash\r\ngenome analyze https://example.com --sentiment --export md\r\n```\r\n\r\n### Extract named entities\r\n\r\n```bash\r\ngenome entities https://example.com\r\n```\r\n\r\n### Extract topics\r\n\r\n```bash\r\ngenome topics https://example.com\r\n```\r\n\r\n### Compare two URLs\r\n\r\n```bash\r\ngenome compare https://site1.com https://site2.com\r\n```\r\n\r\n### Batch analyze URLs from a text file\r\n\r\n```bash\r\ngenome batch urls.txt\r\n```\r\n\r\n---\r\n\r\n## Export Formats\r\n\r\n- Markdown (`md`) \u2014 saves a nicely formatted report \r\n- JSON (`json`) \u2014 structured report data \r\n- CSV \u2014 planned for future release \r\n\r\nReports are saved in the `data/` folder with filenames based on the domain.\r\n\r\n---\r\n\r\n## Example Markdown Report\r\n\r\n```markdown\r\n# \ud83d\udccc Example Page Title\r\n\r\n**URL**: 
https://example.com\r\n\r\n**Sentiment:** Polarity 0.12, Subjectivity 0.34\r\n\r\n## \ud83d\udd0d Top Salient Entities:\r\n- Entity1 (0.92)\r\n- Entity2 (0.75)\r\n\r\n## \ud83e\udde9 Topics:\r\n- Topic cluster 1\r\n- Topic cluster 2\r\n```\r\n\r\n---\r\n\r\n## Dependencies\r\n\r\n- [Typer](https://typer.tiangolo.com/) \u2014 for CLI \r\n- [Rich](https://rich.readthedocs.io/) \u2014 for console output \r\n- [TextBlob](https://textblob.readthedocs.io/) \u2014 for sentiment analysis \r\n- Internal modules: `content_genome_mapper.core.*`\r\n\r\n---\r\n\r\n## Contributing\r\n\r\nFeel free to submit issues or pull requests to improve the tool!\r\n\r\n---\r\n\r\n## License\r\n\r\nMIT License \u00a9 Amal ALexander\r\n\r\n---\r\n\r\n## Contact\r\n\r\nFor questions or support, contact: amalalex95@gmail.com\r\n\r\n## Github Repo\r\nHomepage: https://github.com/amal-alexander\r\nRepository: https://github.com/amal-alexander?tab=repositories\r\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "\ud83e\uddec A CLI tool to crawl, analyze, and extract content structure semantically.",
"version": "0.1.0",
"project_urls": {
"Homepage": "https://github.com/amal-alexander",
"Repository": "https://github.com/amal-alexander?tab=repositories"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "18f04c981c6815ce674daaa12268b34cdf9d0bf43fb5e79046e4d5e037d5d477",
"md5": "70f49341b2d35d2387f2021f73ae9563",
"sha256": "6443222def58340156f94dc10f9b49235a29af14c876c231726d4bef23b54c5d"
},
"downloads": -1,
"filename": "content_genome_mapper-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "70f49341b2d35d2387f2021f73ae9563",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 8809,
"upload_time": "2025-07-13T15:40:11",
"upload_time_iso_8601": "2025-07-13T15:40:11.705227Z",
"url": "https://files.pythonhosted.org/packages/18/f0/4c981c6815ce674daaa12268b34cdf9d0bf43fb5e79046e4d5e037d5d477/content_genome_mapper-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "1df2bda93191db397d9faaeceb05d4de4862ace208e4db6c75ffac64f2888f35",
"md5": "6c36b7e166cbfc8b31f47ea16b731cb3",
"sha256": "9a132a096571fa25a8bc7268dfb35bf29d22efd0a3fade4118b3ca89a464a512"
},
"downloads": -1,
"filename": "content_genome_mapper-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "6c36b7e166cbfc8b31f47ea16b731cb3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 7928,
"upload_time": "2025-07-13T15:40:13",
"upload_time_iso_8601": "2025-07-13T15:40:13.135009Z",
"url": "https://files.pythonhosted.org/packages/1d/f2/bda93191db397d9faaeceb05d4de4862ace208e4db6c75ffac64f2888f35/content_genome_mapper-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-13 15:40:13",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "content-genome-mapper"
}
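The `sha256` values in the `digests` objects above can be used to check a downloaded wheel or sdist. The following is a minimal sketch using only the standard library; the function names are illustrative, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_sha256):
    """Compare the local file hash with a digest copied from the metadata."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

For example, `matches_published_digest("content_genome_mapper-0.1.0.tar.gz", "9a132a096571fa25a8bc7268dfb35bf29d22efd0a3fade4118b3ca89a464a512")` should return `True` for an intact copy of the sdist listed above.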