# 🔍 Python Repetition Hunter
> *Hunt down code repetitions like a pro detective*
A Python tool that analyzes your codebase to find repeated patterns and duplicated logic, even when variable names differ. Based on the Clojure repetition-hunter algorithm, it helps you identify opportunities for refactoring and code deduplication.
## ✨ Features
- 🎯 **Smart Pattern Detection** - Finds semantic duplications, not just copy-paste
- 🧠 **AST-Based Analysis** - Uses Abstract Syntax Trees for deep code understanding
- 🔧 **Variable Normalization** - Detects patterns even when variable names differ
- 📊 **Complexity Scoring** - Ranks findings by complexity × repetition count
- 🎛️ **Configurable Thresholds** - Tune sensitivity to your needs
- 📁 **Recursive Directory Scanning** - Analyze entire projects at once
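As an illustration of the directory scanning and AST parsing listed above, here is a minimal sketch that uses only the standard library. The helper names `iter_python_files` and `parse_files` are hypothetical and not taken from the tool's source, and silently skipping unreadable or unparsable files is an assumed policy.

```python
import ast
from pathlib import Path


def iter_python_files(paths):
    """Yield every .py file named directly or found under a directory."""
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            # rglob descends into all subdirectories
            yield from sorted(path.rglob("*.py"))
        elif path.suffix == ".py":
            yield path


def parse_files(paths):
    """Map each readable, syntactically valid file to its parsed AST."""
    trees = {}
    for path in iter_python_files(paths):
        try:
            source = path.read_text(encoding="utf-8")
            trees[path] = ast.parse(source, filename=str(path))
        except (SyntaxError, UnicodeDecodeError, OSError):
            continue  # assumed policy: skip files that cannot be read or parsed
    return trees
```

Calling `parse_files(["src/"])` would return a dictionary keyed by file path, ready for further analysis.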
## 📦 Installation
### From PyPI (Recommended)
```bash
pip install python-repetition-hunter
```
### From Source
```bash
git clone https://github.com/yourusername/python-repetition-hunter.git
cd python-repetition-hunter
pip install -e .
```
## 🚀 Quick Start
```bash
# After pip install, use the command directly
repetition-hunter my_code.py
# Scan entire project
repetition-hunter src/
# Find only high-complexity duplications
repetition-hunter --min-complexity 5 --min-repetition 3 src/
# Or run the module directly
python -m repetition_hunter my_code.py
```
## 📋 Usage
```
repetition-hunter [OPTIONS] PATHS...

Arguments:
  PATHS                           Python files or directories to analyze

Options:
  --min-complexity INT            Minimum complexity threshold (default: 3)
  --min-repetition INT            Minimum repetition count (default: 2)
  --sort [complexity|repetition]  Sort results by complexity or repetition (default: complexity)
```
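The tool's own argument handling is not shown in this README. As a rough sketch, an equivalent interface could be declared with `argparse` as below; the function name `build_parser` and the attribute names it produces are assumptions, while the options and defaults mirror the usage text above.

```python
import argparse


def build_parser():
    """Declare a CLI matching the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="repetition-hunter")
    parser.add_argument(
        "paths", nargs="+", metavar="PATHS",
        help="Python files or directories to analyze",
    )
    parser.add_argument(
        "--min-complexity", type=int, default=3,
        help="Minimum complexity threshold (default: 3)",
    )
    parser.add_argument(
        "--min-repetition", type=int, default=2,
        help="Minimum repetition count (default: 2)",
    )
    parser.add_argument(
        "--sort", choices=["complexity", "repetition"], default="complexity",
        help="Sort results by complexity or repetition (default: complexity)",
    )
    return parser


# Parse the same arguments as the third Quick Start command
args = build_parser().parse_args(
    ["--min-complexity", "5", "--min-repetition", "3", "src/"]
)
```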
## 🎯 Example Output
```
3 repetitions of complexity 12

Line 15 - src/utils.py:
if data is None:
    return None
result = []
for item in data:
    if item > 0:
        result.append(item * 2)
return result

Line 28 - src/processor.py:
if items is None:
    return None
output = []
for element in items:
    if element > 0:
        output.append(element * 2)
return output

======================================================================
```
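Once a pattern like this is reported, one possible refactoring is to move the shared logic into a single helper and call it from each location. The function name `double_positives` is hypothetical, and the sketch assumes the reported snippets are complete function bodies.

```python
from typing import Iterable, List, Optional


def double_positives(values: Optional[Iterable[float]]) -> Optional[List[float]]:
    """Return each positive value doubled, or None when no input is given."""
    if values is None:
        return None
    # Same filter-and-transform loop as the reported snippets, written once
    return [value * 2 for value in values if value > 0]
```

Both `src/utils.py` and `src/processor.py` could then call the helper instead of repeating the loop.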
## 🧪 Test It Out
The project includes `test_sample.py` with intentional duplications to demonstrate the tool:
```bash
repetition-hunter test_sample.py
```
You'll see it catches patterns like:
- Similar data processing loops with different variable names
- Duplicate validation logic
- Repeated calculation patterns
## 🔧 How It Works
1. **Parse** - Converts Python code to Abstract Syntax Trees
2. **Extract** - Identifies all meaningful code nodes (skipping trivial ones)
3. **Normalize** - Replaces variable names with generic placeholders
4. **Group** - Clusters identical normalized patterns
5. **Score** - Ranks by complexity × repetition count
6. **Report** - Shows original code locations for each pattern
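The steps above can be sketched with the standard `ast` module. This is a simplified illustration, not the tool's actual implementation: the names `Normalizer`, `normalized_key`, `complexity`, and `find_repetitions` are hypothetical, complexity is assumed to be the number of AST nodes in a subtree, and nested matches inside a larger match are not pruned.

```python
import ast
import copy
from collections import defaultdict


class Normalizer(ast.NodeTransformer):
    """Rename identifiers to v0, v1, ... in order of first appearance."""

    def __init__(self):
        self.mapping = {}

    def _placeholder(self, name):
        return self.mapping.setdefault(name, "v%d" % len(self.mapping))

    def visit_Name(self, node):
        node.id = self._placeholder(node.id)
        return node

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        return node

    def visit_FunctionDef(self, node):
        node.name = self._placeholder(node.name)
        return self.generic_visit(node)


def normalized_key(node):
    """Return a name-independent fingerprint of an AST subtree."""
    clone = copy.deepcopy(node)  # keep the original tree untouched
    return ast.dump(Normalizer().visit(clone))


def complexity(node):
    """Assumed metric: number of AST nodes in the subtree."""
    return sum(1 for _ in ast.walk(node))


def find_repetitions(source, min_complexity=3, min_repetition=2):
    tree = ast.parse(source)                              # 1. Parse
    groups = defaultdict(list)
    for node in ast.walk(tree):                           # 2. Extract
        if not isinstance(node, (ast.stmt, ast.expr)):
            continue
        size = complexity(node)
        if size < min_complexity:
            continue                                      # skip trivial nodes
        key = normalized_key(node)                        # 3. Normalize
        groups[key].append((node.lineno, size))           # 4. Group
    findings = [
        (hits[0][1], sorted(line for line, _ in hits))
        for hits in groups.values()
        if len(hits) >= min_repetition
    ]
    # 5. Score: complexity x repetition count, highest first
    findings.sort(key=lambda f: f[0] * len(f[1]), reverse=True)
    return findings                                       # 6. Report (as data)


SAMPLE = '''
def clean(data):
    result = []
    for item in data:
        if item > 0:
            result.append(item * 2)
    return result

def scrub(items):
    output = []
    for element in items:
        if element > 0:
            output.append(element * 2)
    return output
'''

findings = find_repetitions(SAMPLE)
for size, lines in findings[:3]:
    print("%d repetitions of complexity %d at lines %s" % (len(lines), size, lines))
```

Because every `Name` node is renamed, calls to built-ins such as `len` are normalized as well; a production tool might treat those separately.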
## 🎨 Why Use This?
- **Reduce Technical Debt** - Spot duplicated logic before it spreads
- **Improve Code Quality** - Identify refactoring opportunities
- **Save Time** - Automated detection vs manual code review
- **Learn Patterns** - Understand your codebase's repetition hotspots
## 🛠️ Requirements
- Python 3.6+
- No external dependencies (uses only standard library)
## 🤝 Contributing
Found a bug or have an idea? Feel free to open an issue or submit a PR!
## 📄 License
This project is open source under the MIT license. Use it, modify it, share it!
---
*Happy hunting! 🎯*
Raw data
{
"_id": null,
"home_page": "https://github.com/yourusername/python-repetition-hunter",
"name": "python-repetition-hunter",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "code-analysis, refactoring, duplication, ast, static-analysis",
"author": "Andres GU",
"author_email": "Your Name <your.email@example.com>",
"download_url": "https://files.pythonhosted.org/packages/9e/1e/01c874b1e5b75970c78b62b6ae684e1a9866837f00e4695e2f2d35d0f071/python_repetition_hunter-1.0.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Hunt down code repetitions in Python projects",
"version": "1.0.3",
"project_urls": {
"Homepage": "https://github.com/yourusername/python-repetition-hunter",
"Issues": "https://github.com/yourusername/python-repetition-hunter/issues",
"Repository": "https://github.com/yourusername/python-repetition-hunter"
},
"split_keywords": [
"code-analysis",
" refactoring",
" duplication",
" ast",
" static-analysis"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "cae31b16b019dfd0169d596eb9b278de65e28a5beb9adfeef93841a0fdae8cad",
"md5": "bbf0307338f13262a4a31793fbf86bac",
"sha256": "13e1c127f32f61807ded6f17dba3d99b8731a70935dab1f623a80392cefa0265"
},
"downloads": -1,
"filename": "python_repetition_hunter-1.0.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "bbf0307338f13262a4a31793fbf86bac",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 6363,
"upload_time": "2025-07-18T18:49:16",
"upload_time_iso_8601": "2025-07-18T18:49:16.012902Z",
"url": "https://files.pythonhosted.org/packages/ca/e3/1b16b019dfd0169d596eb9b278de65e28a5beb9adfeef93841a0fdae8cad/python_repetition_hunter-1.0.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "9e1e01c874b1e5b75970c78b62b6ae684e1a9866837f00e4695e2f2d35d0f071",
"md5": "fcd5452be60c8108b8b36280d7e08de4",
"sha256": "e6ebdf642bdc4a2cd0680a9328e6d373ccb807b219b7ff6c7e1e5b37aa81922c"
},
"downloads": -1,
"filename": "python_repetition_hunter-1.0.3.tar.gz",
"has_sig": false,
"md5_digest": "fcd5452be60c8108b8b36280d7e08de4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 6209,
"upload_time": "2025-07-18T18:49:17",
"upload_time_iso_8601": "2025-07-18T18:49:17.607492Z",
"url": "https://files.pythonhosted.org/packages/9e/1e/01c874b1e5b75970c78b62b6ae684e1a9866837f00e4695e2f2d35d0f071/python_repetition_hunter-1.0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-18 18:49:17",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "yourusername",
"github_project": "python-repetition-hunter",
"github_not_found": true,
"lcname": "python-repetition-hunter"
}