# AwesomeCure
> Analyze and cure awesome lists by collecting, processing and presenting data from listed Git projects.
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/protontypes/AwesomeCure.git/HEAD)
AwesomeCure provides basic scripts to analyze the Git projects within an Awesome list and get an overview of the open source domains they represent. It uses the GitHub API to retrieve metadata and generate various metrics about the state of open source ecosystems. The results are spreadsheets and plots that let you sort and analyze all entries according to your needs.
## Background
Awesome lists are a central part of the open source ecosystem. They allow developers to get an overview of open source projects in different domains. A state-of-the-art, cross-platform search engine for open source projects does not yet exist, so these lists serve as central indexes for the diverse open source communities. They help create interfaces between projects, concentrate development resources, and avoid reinventing the wheel again and again.
Maintaining an Awesome list requires removing inactive projects on a regular basis, investigating new projects, and engaging the community to update the list. Without these measures, new and still active projects get lost among the many inactive ones. Processing such a list makes it possible to analyze these ecosystems with data science methods in order to identify potentials and risks within the domain.
## Application
The [OpenSustain.tech](https://opensustain.tech/) website is based on such an Awesome list, giving an overview of the active open source projects in climate and sustainable technology. In the current prototype project state, AwesomeCure can only be applied to this list, but generalization to all Awesome lists is possible.
Most of the entries link to the GitHub or GitLab repository of the underlying project. AwesomeCure can therefore query the platform API for each project to extract metadata. In this way, various health indicators are extracted, such as:
* Last activity
* Community Distribution Score (how much the project depends on a single person)
* Number of reviews per pull request
* Days since the last commit and the last closed issue
* Total number of stars and contributors
* Licenses used
* And many more...
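
To make two of these indicators concrete, here is a minimal sketch. It is not taken from the AwesomeCure notebooks: the function names are hypothetical, and the share of commits made by the most active contributor is only one possible proxy for how much a project depends on a single person, not necessarily the Community Distribution Score that AwesomeCure computes.

```python
from datetime import datetime, timezone
from typing import Sequence


def days_since(timestamp_iso: str, now: datetime) -> int:
    """Whole days elapsed between an ISO 8601 timestamp and `now`."""
    # GitHub-style timestamps end in "Z"; normalise it for fromisoformat().
    then = datetime.fromisoformat(timestamp_iso.replace("Z", "+00:00"))
    return (now - then).days


def top_contributor_share(commit_counts: Sequence[int]) -> float:
    """Fraction of all commits made by the single most active contributor.

    Hypothetical proxy: values close to 1.0 suggest the project depends
    heavily on one person. Returns 0.0 when there are no commits.
    """
    total = sum(commit_counts)
    return max(commit_counts) / total if total else 0.0


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(days_since("2023-11-20T06:59:57Z", now))
    print(top_contributor_share([80, 15, 5]))
```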
## Install
Clone the GitHub repository:
```
git clone git@github.com:protontypes/AwesomeCure.git
```
Install Jupyter Notebook:
```
pip install jupyterlab
pip install notebook
```
Install the dependencies:
```
cd AwesomeCure
pip install -r requirements.txt
```
Add a `.env` file containing your personal GitHub token to the project root folder (see the [GitHub documentation on personal access tokens](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token)). Give the token the minimum permissions required. The `.env` file is excluded from version control by the `.gitignore` file, so it is not uploaded to GitHub. Open the `.env` file with your preferred editor and add:
```
GITHUB=Your_API_Key
```
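
For orientation, the sketch below shows how such a `.env` entry could be read and turned into GitHub API request headers using only the Python standard library. The helper names `read_env_value` and `github_headers` are hypothetical and are not part of AwesomeCure; the notebooks may load the token differently.

```python
from pathlib import Path


def read_env_value(path, key):
    """Return the value stored for `key` in a KEY=VALUE file, or None."""
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop optional surrounding quotes.
            return value.strip().strip('"').strip("'")
    return None


def github_headers(token):
    """Build HTTP headers for an authenticated GitHub REST API request."""
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }


if __name__ == "__main__":
    token = read_env_value(".env", "GITHUB")
    if token is None:
        print("No GITHUB entry found in .env")
    else:
        # Print only the header names so the token never reaches the terminal.
        print(sorted(github_headers(token)))
```

Keeping the token in a file that is ignored by Git, rather than in the notebook itself, means it cannot be committed by accident.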
Run Jupyter Notebook:
```
jupyter notebook
```
A browser window should open. If it does not, click (or copy and paste) the link from your terminal output.
## Architecture
The project is split into two Jupyter notebooks: one for data acquisition and one for data processing and plotting.
### Data Acquisition
The [AwesomeCure.ipynb](./awesomecure.ipynb) notebook lets you read the Awesome list from any repository. Depending on the size of the list, processing can take multiple hours.
![AwesomeCure](./docs/AwesomeCure.png)
### Data Processing
Data processing is done in the [ost_analysis.ipynb](ost_analysis.ipynb) notebook, using the CSV files produced by the data acquisition step. Since no API key is needed for this step, the processing can also be done online within Binder:
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/protontypes/AwesomeCure.git/HEAD)
![OST_Analysis](./docs/OST_Analysis.png)
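
As a rough illustration of this step, the sketch below counts projects per programming language from CSV data. The column names (`project_name`, `language`, `stars`) and the sample rows are invented for the example; the CSV files produced by the acquisition notebook may use a different layout.

```python
import csv
import io
from collections import Counter

# Made-up sample standing in for a CSV file written by the acquisition step.
SAMPLE_CSV = """project_name,language,stars
alpha,Python,120
beta,R,40
gamma,Python,15
"""


def count_languages(csv_text):
    """Count how many projects use each programming language."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["language"] for row in reader)


if __name__ == "__main__":
    # Most common languages first.
    for language, count in count_languages(SAMPLE_CSV).most_common():
        print(language, count)
```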
## Results
Plotting the dataset gives insights into the open source ecosystems from different perspectives.
### Programming Languages
![programming_languages](./docs/programming_languages.png)
### Project Scores
![Score_example_plot](./docs/Score_example_plot.png)
### Communities and Organizations
![organizations](./docs/organizations.png)
![organizations_forms](./docs/organizations_forms.png)
## Poetry plugin: a command-line alternative to the Jupyter notebooks
### Usage
The plugin depends on Poetry. From a terminal, install the project and its dependencies with Poetry:
```
poetry install
```
If needed, install the command-line interface with Poetry or pip:
```
poetry add open-sustain-tech
```
or,
```
pip install open-sustain-tech
```
To run the plugin, use the following command:
```
oss-opt
```
or, as a Python module through Poetry:
```
poetry run python -m open_sustain_tech
```
or,
```
poetry run oss-opt
```
Like the Data Acquisition notebook described in the Architecture section of the `protontypes/AwesomeCure` `README.md`, the `oss-opt` command-line interface lets you read the Awesome list from any repository.
**Note:** Depending on the size of the list, the processing can take multiple hours.
This release updates the `open_sustain_tech` command-line interface. The main changes are:
- The `OpenSustainTech` class is a subclass of the `Command` class from the cleo library. Its `handle` method runs when the user executes the command. A simple factory function creates and returns an instance of the `OpenSustainTech` class.
- The `Options` class is a data class (via the `@dataclass` decorator) that holds the options for the command-line parser. It uses the simple_parsing library to define the options and their default values.
- An `ArgumentParser` object is created and configured to use the options defined in the `Options` class. The command-line arguments are then parsed and stored in the `args` variable.
- The `OSSOptionPlugin` class is a subclass of `ApplicationPlugin` from the poetry library. Its `activate` method registers the `open_sustain_tech` command with the application. It also has a `commands` property that returns a list of available commands, in this case an instance of `OpenSustainTech`.
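
The options pattern described above can be sketched roughly as follows. To keep the example runnable without extra dependencies, it swaps in the standard library `argparse` for simple_parsing, and it leaves out the cleo and Poetry classes entirely. The option names `list_path` and `output_dir` are hypothetical and are not the plugin's real options.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class Options:
    """Command-line options with defaults (hypothetical fields)."""
    list_path: str = "README.md"   # assumed: where the Awesome list is read from
    output_dir: str = "csv"        # assumed: where result files are written


def parse_options(argv):
    """Build a parser from the dataclass fields and parse `argv`."""
    parser = argparse.ArgumentParser(prog="oss-opt")
    for field in fields(Options):
        flag = "--" + field.name.replace("_", "-")
        parser.add_argument(flag, type=field.type, default=field.default)
    args = parser.parse_args(argv)
    # argparse stores "--list-path" under "list_path", matching the field names.
    return Options(**vars(args))


if __name__ == "__main__":
    print(parse_options(["--output-dir", "results"]))
```

Deriving the parser from a data class keeps option names, types and defaults in one place, which is the same idea simple_parsing automates.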
## Many thanks to:
Tobias Augspurger
Raw data
{
"_id": null,
"home_page": "https://opensustain.tech/",
"name": "open-sustain-tech",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8,<3.12",
"maintainer_email": "",
"keywords": "awesome-list,poetry,plugin,python,version,sustainability,technology,open,source,software",
"author": "Abderrahim Guennoune",
"author_email": "aguennoune@outlook.com",
"download_url": "https://files.pythonhosted.org/packages/71/23/f56b6d0f07867cec3c0804c98f96c09a86484d5d020e4941790f11e3cc35/open_sustain_tech-1.3.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Awesome List Curated with a Poetry Plugin Support to the Open Sustain Tech CLI Option",
"version": "1.3.0",
"project_urls": {
"Bug Tracker": "https://github.com/protontypes/AwesomeCure/issues",
"Homepage": "https://opensustain.tech/",
"Repository": "https://github.com/aguennoune/AwesomeCure/tree/main"
},
"split_keywords": [
"awesome-list",
"poetry",
"plugin",
"python",
"version",
"sustainability",
"technology",
"open",
"source",
"software"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ae39ee4a9b51a9a0d162116af525ea866416b318f936324ec56ccd058306825b",
"md5": "967019b4dbcb71760eddcd7fd6d6017b",
"sha256": "870127b3d6bd1c60dbe40d41cee48efdb89cffe50f95cca101eb1fe93adf540f"
},
"downloads": -1,
"filename": "open_sustain_tech-1.3.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "967019b4dbcb71760eddcd7fd6d6017b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8,<3.12",
"size": 5004791,
"upload_time": "2023-11-20T06:59:57",
"upload_time_iso_8601": "2023-11-20T06:59:57.555874Z",
"url": "https://files.pythonhosted.org/packages/ae/39/ee4a9b51a9a0d162116af525ea866416b318f936324ec56ccd058306825b/open_sustain_tech-1.3.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7123f56b6d0f07867cec3c0804c98f96c09a86484d5d020e4941790f11e3cc35",
"md5": "c6556b157f5149344c3a200713450839",
"sha256": "7b6fb535196ed9a6eff88276b539b8348bd559626efd9e03face9a364f0ef38a"
},
"downloads": -1,
"filename": "open_sustain_tech-1.3.0.tar.gz",
"has_sig": false,
"md5_digest": "c6556b157f5149344c3a200713450839",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8,<3.12",
"size": 4993367,
"upload_time": "2023-11-20T07:01:15",
"upload_time_iso_8601": "2023-11-20T07:01:15.862299Z",
"url": "https://files.pythonhosted.org/packages/71/23/f56b6d0f07867cec3c0804c98f96c09a86484d5d020e4941790f11e3cc35/open_sustain_tech-1.3.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-11-20 07:01:15",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "protontypes",
"github_project": "AwesomeCure",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "open-sustain-tech"
}