# CodeConsolidator
CodeConsolidator is a command-line tool written in Python that helps you analyze and understand your codebase. It reports your project's directory structure, file sizes, and token counts, and consolidates the content of all text-based files into a single output for analysis with Large Language Models (LLMs).
## Features
* **Directory Tree Visualization:** Generates a hierarchical tree view of your project's directory structure, optionally including file sizes and ignoring specified files/directories.
* **Codebase Statistics:** Calculates total files, directories, code size, and token counts for your project.
* **File Content Consolidation:** Consolidates the content of all text-based files into a single output file (useful for LLM analysis).
* **.gitignore Support:** Respects your .gitignore file to exclude unwanted files and directories from the analysis.
* **Customizable Output:** Choose between text or JSON output formats and customize the level of detail included.
* **Colored Console Output:** Prints a color-coded summary of the analysis in the console.
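To make the tree feature more concrete, here is a minimal sketch of how such a tree could be rendered with the standard library alone. It is not CodeConsolidator's actual implementation: the function name `build_tree` and its parameters are hypothetical, and it does not parse `.gitignore` (it only skips the names passed in `ignore_names`).

```python
from pathlib import Path


def build_tree(root, max_depth=None, show_size=False, ignore_names=(".git",)):
    """Return the lines of an indented tree for `root` (illustrative sketch only)."""
    root = Path(root).resolve()
    lines = [root.name + "/"]

    def walk(directory, prefix, depth):
        # Stop descending once the requested depth is exceeded.
        if max_depth is not None and depth > max_depth:
            return
        # Directories first, then files, each group sorted by name.
        entries = sorted(
            (e for e in directory.iterdir() if e.name not in ignore_names),
            key=lambda e: (e.is_file(), e.name),
        )
        for index, entry in enumerate(entries):
            last = index == len(entries) - 1
            connector = "└── " if last else "├── "
            label = entry.name + ("/" if entry.is_dir() else "")
            if show_size and entry.is_file():
                label += f" ({entry.stat().st_size} bytes)"
            lines.append(prefix + connector + label)
            if entry.is_dir():
                walk(entry, prefix + ("    " if last else "│   "), depth + 1)

    walk(root, "", 1)
    return lines
```

Calling `print("\n".join(build_tree("my_project", max_depth=3, show_size=True)))` would print the tree for a directory named `my_project`, listing directories before files at each level.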
## Installation
```bash
pip install tiktoken colorama
```
## Usage
```bash
python codeconsolidator.py [path_to_directory] [options]
```
## Options
* `path_to_directory`: Path to the directory you want to analyze.
* `-d`, `--max-depth`: Maximum depth for directory traversal.
* `-o`, `--output`: Output format (`text` or `json`). Default: `text`.
* `-f`, `--file`: Output file name (default: `codebase_analysis.txt` or `codebase_analysis.json`).
* `--show-tree`: Show directory tree in console output (always included in text file output).
* `--show-size`: Show file sizes in directory tree.
* `--show-ignored`: Show ignored files and directories in tree.
* `--ignore-ext`: Additional file extensions to ignore (e.g., `.pyc .log`).
* `--no-content`: Exclude file contents from the output.
* `--include-git`: Include `.git` directory in the analysis.
* `--max-size`: Maximum allowed text content size in KB (default: 10240 KB).
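The options listed above map naturally onto Python's `argparse` module. The following sketch shows one way such a parser could be declared. It is an illustration under assumptions, not the parser shipped in `codeconsolidator.py`: the function names `build_parser` and `resolve_output_file`, the help strings, and the choice of `nargs="*"` for `--ignore-ext` are all hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative sketch only)."""
    parser = argparse.ArgumentParser(prog="codeconsolidator.py")
    parser.add_argument("path_to_directory", help="Directory to analyze.")
    parser.add_argument("-d", "--max-depth", type=int, default=None,
                        help="Maximum depth for directory traversal.")
    parser.add_argument("-o", "--output", choices=["text", "json"], default="text",
                        help="Output format.")
    parser.add_argument("-f", "--file", default=None, help="Output file name.")
    parser.add_argument("--show-tree", action="store_true")
    parser.add_argument("--show-size", action="store_true")
    parser.add_argument("--show-ignored", action="store_true")
    parser.add_argument("--ignore-ext", nargs="*", default=[],
                        help="Additional file extensions to ignore.")
    parser.add_argument("--no-content", action="store_true")
    parser.add_argument("--include-git", action="store_true")
    parser.add_argument("--max-size", type=int, default=10240,
                        help="Maximum allowed text content size in KB.")
    return parser


def resolve_output_file(args):
    """Pick the documented default file name when -f/--file is not given."""
    if args.file:
        return args.file
    return "codebase_analysis.json" if args.output == "json" else "codebase_analysis.txt"
```

With this sketch, `build_parser().parse_args(["my_project", "-o", "json"])` yields a namespace whose `output` is `"json"`, and `resolve_output_file` then returns `codebase_analysis.json`.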
## Example
```bash
python codeconsolidator.py my_project -d 3 -o json --show-size --ignore-ext .pyc .log
```
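As a rough illustration of what consolidation involves, the sketch below walks a directory, skips files that cannot be decoded as UTF-8, and joins the remaining files into one string while collecting basic statistics. Everything here is an assumption made for illustration: the function name `consolidate`, the `=== path ===` separator, the statistics keys, and the whitespace-based token count are hypothetical and do not describe the tool's real output format. It also ignores `.gitignore` rules and the `--max-size` limit.

```python
from pathlib import Path


def consolidate(root, ignore_ext=(), count_tokens=None):
    """Join readable text files under `root` into one string (illustrative sketch only)."""
    root = Path(root)
    # Fallback token counter: whitespace-separated words (an assumption, not tiktoken).
    count_tokens = count_tokens or (lambda text: len(text.split()))
    stats = {"files": 0, "directories": 0, "bytes": 0, "tokens": 0}
    sections = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if ".git" in relative.parts:
            continue  # always skip the .git directory in this sketch
        if path.is_dir():
            stats["directories"] += 1
            continue
        if path.suffix in ignore_ext:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # treat undecodable or unreadable files as non-text
        stats["files"] += 1
        stats["bytes"] += path.stat().st_size
        stats["tokens"] += count_tokens(text)
        sections.append(f"=== {relative.as_posix()} ===\n{text}")
    return "\n".join(sections), stats
```

Since the tool depends on `tiktoken`, a closer approximation of its token counts could pass a tokenizer-based counter, for example `count_tokens=lambda t: len(tiktoken.get_encoding("cl100k_base").encode(t))`. Which encoding CodeConsolidator actually uses is not stated here, so `cl100k_base` is only a guess.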
# 10 LLM Prompts for Enhanced Codebase Analysis
## I. Code Quality & Understanding:
### 1. Codebase Error and Inconsistency Analysis
```text
**Objective:** Identify potential errors and inconsistencies within the provided codebase.

**Instructions:**

1. **Analyze the attached code** for the following:
    * Syntax errors and logical flaws.
    * Inconsistencies in variable and function naming conventions.
    * Code duplication.
    * Performance bottlenecks.
    * Violations of established coding best practices.
2. **Structure your analysis clearly**, pinpointing specific code snippets and providing detailed descriptions of the identified issues.
3. **Prioritize clarity and conciseness** in your explanations.

**Expected Output:** A comprehensive report detailing errors and inconsistencies, organized by code section or error type, with actionable insights for improvement.
```
### 2. Codebase Risk Assessment
```text
**Objective:** Identify code segments within the provided codebase that could potentially lead to future issues.

**Instructions:**

1. **Analyze the attached code** with a focus on:
    * Code that is difficult to understand and maintain (code smells).
    * Fragments that might cause errors under specific conditions (edge cases).
    * Code that deviates from established coding standards.
2. **Provide detailed justifications for your concerns**, explaining the potential risks associated with each identified segment.
3. **Suggest potential solutions or mitigation strategies** to address the identified risks.

**Expected Output:** A report highlighting potential risk areas within the codebase, with clear explanations of the risks and actionable recommendations for improvement.
```
### 3. Codebase Documentation Generation
```text
**Objective:** Generate comprehensive and user-friendly documentation for the provided codebase.

**Instructions:**

1. **Analyze the attached code** and identify key components, functionalities, and APIs.
2. **Generate documentation that includes:**
    * API specifications with detailed descriptions of endpoints, parameters, and responses.
    * Function descriptions with clear explanations of their purpose, inputs, and outputs.
    * Usage examples demonstrating how to interact with the codebase effectively.
3. **Structure the documentation logically** and use a consistent format for clarity.
4. **Prioritize clarity, conciseness, and accuracy** in your documentation.

**Expected Output:** Well-structured and informative documentation that facilitates understanding and utilization of the codebase by developers and other stakeholders.
```
## II. Learning & Knowledge Extraction:
### 4. User Story Reconstruction from Code
```text
**Objective:** Reconstruct the user stories that likely served as the basis for the development of the provided codebase.

**Instructions:**

1. **Analyze the attached code** to identify the core functionalities of the application.
2. **Infer the user needs** that each functionality aims to address.
3. **Formulate user stories** using the following template: "As a [user role], I want [functionality], so that [benefit]."
4. **Identify potential missing user stories**. Suggest functionalities that could be added to the application to better meet user needs.

**Expected Output:** A list of reconstructed user stories based on the code's functionalities, along with insights into potential missing user stories and suggestions for application enhancements.
```
### 5. Code-Based Mini-Lesson Generation
```text
**Objective:** Create a series of mini-lessons that explain the key concepts implemented within the provided codebase.

**Instructions:**

1. **Divide the code into logical sections** and create a separate lesson for each.
2. **Start with the simplest concepts** and gradually progress to more complex ones.
3. **Use code examples from the application** to illustrate the discussed concepts.
4. **Include exercises and quizzes** to help learners test their understanding.
5. **Focus on clarity and pedagogical effectiveness** in your lesson design.

**Expected Output:** A set of well-structured mini-lessons covering the key concepts of the application, with code examples, exercises, and quizzes to facilitate learning.
```
## III. Code Improvement & Transformation:
### 6. Codebase Best Practice Analysis
```text
**Objective:** Analyze the provided codebase and identify examples of both good and bad programming practices.

**Instructions:**

1. **Carefully review the attached code** and pinpoint instances of exemplary and problematic coding practices.
2. **For each example, provide a detailed analysis** that includes:
    * **What is good/bad about the specific solution?**
    * **What concepts or principles underpin the solution?**
    * **What are the potential positive/negative consequences of using this solution?**

**Expected Output:** A comprehensive report highlighting both positive and negative coding practices within the codebase, with in-depth explanations and analysis of their impact.
```
### 7. Codebase Translation to Another Programming Language
```text
**Objective:** Translate the provided codebase from [Source Language] to [Target Language] while preserving its functionality and structure.

**Instructions:**

1. **Analyze the attached code** written in [Source Language] and understand its logic and functionalities.
2. **Translate the code** into [Target Language], ensuring that the translated code performs the same tasks as the original code.
3. **Maintain the original code's structure and organization** as much as possible in the translated version.
4. **Adhere to the coding conventions and best practices** of the target language.
5. **Comment the translated code** to explain any significant changes or adaptations made during the translation process.

**Expected Output:** A functional codebase in [Target Language] that accurately reflects the functionality and structure of the original [Source Language] codebase.
```
### 8. Codebase Refactoring for Improved Readability and Performance
```text
**Objective:** Refactor the provided codebase to enhance its readability, maintainability, and performance.

**Instructions:**

1. **Analyze the attached code** and identify areas that can be improved in terms of code clarity, structure, and efficiency.
2. **Suggest specific code transformations and optimizations** to address the identified areas for improvement.
3. **Prioritize refactoring techniques** that improve code readability without introducing unnecessary complexity.
4. **Consider performance implications** of your suggested refactoring and aim for solutions that enhance efficiency without sacrificing clarity.
5. **Provide clear explanations** for each refactoring suggestion, justifying its benefits and potential impact.

**Expected Output:** A set of actionable refactoring suggestions with detailed explanations of their benefits and potential impact on code quality and performance.
```
## IV. Testing & Security:
### 9. Unit Test Generation for Codebase
```text
**Objective:** Generate unit tests for the provided codebase to ensure code correctness and prevent regressions.

**Instructions:**

1. **Analyze the attached code** and identify its core functions and methods.
2. **Generate unit tests** that cover a wide range of input values and expected outputs for each function/method.
3. **Follow best practices for unit testing**, including:
    * **Test one function/method per test case.**
    * **Use descriptive test names.**
    * **Assert expected outcomes clearly.**
    * **Keep tests independent and isolated.**
4. **Prioritize test coverage for critical functionalities** and edge cases.

**Expected Output:** A comprehensive suite of unit tests that can be used to verify the correctness of the codebase and prevent regressions during future development.
```
### 10. Security Vulnerability Analysis of Codebase
```text
**Objective:** Identify potential security vulnerabilities within the provided codebase.

**Instructions:**

1. **Analyze the attached code** with a focus on identifying common security weaknesses such as:
    * SQL injection.
    * Cross-site scripting (XSS).
    * Cross-site request forgery (CSRF).
    * Authentication and authorization bypasses.
    * Data exposure.
2. **For each identified vulnerability, provide a detailed explanation** of:
    * The nature of the vulnerability.
    * The potential impact of exploitation.
    * Recommendations for mitigation using secure coding practices.
3. **Prioritize vulnerabilities based on their severity and potential impact.**

**Expected Output:** A comprehensive security report highlighting potential vulnerabilities within the codebase, along with clear explanations of their risks and actionable recommendations for remediation.
```
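Each of the ten prompts above refers to "the attached code", so in practice a prompt has to be combined with the consolidated output before it is sent to an LLM. The sketch below shows one possible way to do that. The helper name `build_llm_prompt`, the `--- Attached code ---` separator, and the `replacements` parameter are hypothetical; only the default file name `codebase_analysis.txt` comes from the options documented earlier.

```python
from pathlib import Path


def build_llm_prompt(prompt_text, analysis_file="codebase_analysis.txt", replacements=None):
    """Fill [Placeholder] fields in a prompt and append the consolidated codebase."""
    # Substitute bracketed placeholders such as [Source Language], if any were supplied.
    for placeholder, value in (replacements or {}).items():
        prompt_text = prompt_text.replace(f"[{placeholder}]", value)
    codebase = Path(analysis_file).read_text(encoding="utf-8")
    return f"{prompt_text.strip()}\n\n--- Attached code ---\n{codebase}"
```

For the translation prompt, a call such as `build_llm_prompt(prompt, replacements={"Source Language": "Python", "Target Language": "Go"})` would fill in both language placeholders before attaching the analysis file.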
Raw data
{
"_id": null,
"home_page": "https://github.com/kamilstanuch/codeconsolidator",
"name": "codeconsolidator",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "code analysis, codebase, consolidation, visualization",
"author": "Kamil Stanuch",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/43/a5/f1059c21ff487c23aed01adb1c34557f6c069240f730692016dbb73d73b5/codeconsolidator-0.1.8.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Consolidates and analyzes codebases for insights.",
"version": "0.1.8",
"project_urls": {
"Homepage": "https://github.com/kamilstanuch/codeconsolidator"
},
"split_keywords": [
"code analysis",
" codebase",
" consolidation",
" visualization"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "a9797a65cca9375869b36d01045d8e54507be01e7955129c5505d0474944bc0e",
"md5": "9a6c08654604e34fc77c729fba17077c",
"sha256": "f9e3cdb0d17cc49c5c49f75158ae73d849232cd663f4d754df1f7fbb076bbbff"
},
"downloads": -1,
"filename": "codeconsolidator-0.1.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9a6c08654604e34fc77c729fba17077c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 8875,
"upload_time": "2024-09-07T13:25:59",
"upload_time_iso_8601": "2024-09-07T13:25:59.861911Z",
"url": "https://files.pythonhosted.org/packages/a9/79/7a65cca9375869b36d01045d8e54507be01e7955129c5505d0474944bc0e/codeconsolidator-0.1.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "43a5f1059c21ff487c23aed01adb1c34557f6c069240f730692016dbb73d73b5",
"md5": "0e8c2146127c00f419f620ba1fd19426",
"sha256": "9576f3d3e9446ffa67cb8ca3157600c9ff0a8cba97f102af78d45527d9870a10"
},
"downloads": -1,
"filename": "codeconsolidator-0.1.8.tar.gz",
"has_sig": false,
"md5_digest": "0e8c2146127c00f419f620ba1fd19426",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 8910,
"upload_time": "2024-09-07T13:26:01",
"upload_time_iso_8601": "2024-09-07T13:26:01.750719Z",
"url": "https://files.pythonhosted.org/packages/43/a5/f1059c21ff487c23aed01adb1c34557f6c069240f730692016dbb73d73b5/codeconsolidator-0.1.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-07 13:26:01",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "kamilstanuch",
"github_project": "codeconsolidator",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "codeconsolidator"
}