content-genome-mapper


- Name: content-genome-mapper
- Version: 0.1.0
- Summary: 🧬 A CLI tool to crawl, analyze, and extract content structure semantically.
- Upload time: 2025-07-13 15:40:13
- Author: Your Name
- Requires Python: >=3.8
- License: MIT

# 🧬 Content Genome Mapper CLI

A command-line tool to **crawl, analyze, visualize, and export semantic content data** from web pages — ideal for SEO, content analysis, and research.

---

## Features

- 🌐 Crawl web pages and store raw content  
- 🔍 Extract salient entities and score their importance  
- 🧩 Identify topic clusters from content  
- 📊 Optional sentiment analysis (polarity & subjectivity)  
- 📍 Extract named entities  
- 🔄 Compare content and structure of two URLs  
- 🧬 Batch process multiple URLs from a file  
- 📦 Export analysis reports in Markdown and JSON formats  

---

## Installation

```bash
pip install content_genome_mapper
```

---

## Usage

Use the CLI tool `genome` with the following commands:

### Crawl a URL

```bash
genome crawl https://example.com
```

### Analyze a URL (optional sentiment & export)

```bash
genome analyze https://example.com --sentiment --export md
```

### Extract named entities

```bash
genome entities https://example.com
```

### Extract topics

```bash
genome topics https://example.com
```

### Compare two URLs

```bash
genome compare https://site1.com https://site2.com
```

### Batch analyze URLs from a text file

```bash
genome batch urls.txt
```
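
The layout of `urls.txt` is not described here. The sketch below assumes one URL per line, with blank lines and `#` comment lines ignored; both conventions are assumptions, and `read_urls` is a hypothetical helper rather than part of the package. It can be used to check a URL list before passing the file to `genome batch`.

```python
from typing import List


def read_urls(text: str) -> List[str]:
    """Return the URLs found in `text`, assuming one URL per line."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (an assumed convention).
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls


if __name__ == "__main__":
    sample = "https://example.com\n\n# staging\nhttps://example.org/blog\n"
    for url in read_urls(sample):
        print(url)
```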

---

## Export Formats

- Markdown (`md`) — saves a nicely formatted report  
- JSON (`json`) — structured report data  
- CSV — planned for future release  

Reports are saved in the `data/` folder with filenames based on the domain.
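
The exact naming scheme is not spelled out, so the following is only a sketch of how a domain-based filename could be derived. The helper name `report_path` and the choice to replace dots with underscores are assumptions, not the tool's documented behavior.

```python
from urllib.parse import urlparse


def report_path(url: str, fmt: str = "md", folder: str = "data") -> str:
    """Build a report path from the URL's domain (assumed naming scheme)."""
    # `hostname` is lowercased and excludes any port or credentials.
    host = urlparse(url).hostname or "unknown"
    # Replace dots so the extension is the only dot in the filename.
    safe_name = host.replace(".", "_")
    return f"{folder}/{safe_name}.{fmt}"


if __name__ == "__main__":
    print(report_path("https://example.com/blog/post"))
    print(report_path("https://example.com", fmt="json"))
```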

---

## Example Markdown Report

```markdown
# 📌 Example Page Title

**URL**: https://example.com

**Sentiment:** Polarity 0.12, Subjectivity 0.34

## 🔍 Top Salient Entities:
- Entity1 (0.92)
- Entity2 (0.75)

## 🧩 Topics:
- Topic cluster 1
- Topic cluster 2
```
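
To make the layout above concrete, here is a minimal sketch that renders a report dictionary into the same Markdown shape. The dictionary keys (`title`, `url`, `sentiment`, `entities`, `topics`) and the function `render_markdown` are hypothetical; the structure of the tool's JSON export may differ.

```python
from typing import Any, Dict, List


def render_markdown(report: Dict[str, Any]) -> str:
    """Render a report dict (assumed keys) in the example layout."""
    lines: List[str] = [
        f"# 📌 {report['title']}",
        "",
        f"**URL**: {report['url']}",
        "",
    ]
    # Sentiment is optional, mirroring the optional --sentiment flag.
    sentiment = report.get("sentiment")
    if sentiment is not None:
        lines.append(
            f"**Sentiment:** Polarity {sentiment['polarity']:.2f}, "
            f"Subjectivity {sentiment['subjectivity']:.2f}"
        )
        lines.append("")
    lines.append("## 🔍 Top Salient Entities:")
    lines += [f"- {name} ({score:.2f})" for name, score in report["entities"]]
    lines += ["", "## 🧩 Topics:"]
    lines += [f"- {topic}" for topic in report["topics"]]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = {
        "title": "Example Page Title",
        "url": "https://example.com",
        "sentiment": {"polarity": 0.12, "subjectivity": 0.34},
        "entities": [("Entity1", 0.92), ("Entity2", 0.75)],
        "topics": ["Topic cluster 1", "Topic cluster 2"],
    }
    print(render_markdown(example))
```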

---

## Dependencies

- [Typer](https://typer.tiangolo.com/) — for CLI  
- [Rich](https://rich.readthedocs.io/) — for console output  
- [TextBlob](https://textblob.readthedocs.io/) — for sentiment analysis  
- Internal modules: `content_genome_mapper.core.*`

---

## Contributing

Feel free to submit issues or pull requests to improve the tool!

---

## License

MIT License © Amal Alexander

---

## Contact

For questions or support, contact: amalalex95@gmail.com

## GitHub Repo

- Homepage: https://github.com/amal-alexander
- Repository: https://github.com/amal-alexander?tab=repositories

            
