# Sorting Hat [![tests](https://github.com/chaoss/grimoirelab-sortinghat/workflows/tests/badge.svg)](https://github.com/chaoss/grimoirelab-sortinghat/actions?query=workflow:tests+branch:master+event:push) [![PyPI version](https://badge.fury.io/py/sortinghat.svg)](https://badge.fury.io/py/sortinghat)
## Description
A tool to manage identities.
Sorting Hat maintains an SQL database of unique identities of community members across (potentially) many different sources. Identities corresponding to the same real person can be merged into the same `individual`, which has a unique UUID. For each individual, a profile can be defined with the name and other data shown by default for that person.
In addition, each individual can be related to one or more affiliations, for different time periods. This will usually correspond to different organizations in which the person was employed during those time periods.
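
The relationships described above can be pictured with a small, purely illustrative sketch. The class and field names below are assumptions chosen for this example; they are not SortingHat's actual Django models or API.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Identity:
    """One identity as found in a data source (hypothetical fields)."""
    source: str
    name: Optional[str] = None
    email: Optional[str] = None
    username: Optional[str] = None


@dataclass
class Enrollment:
    """Affiliation of an individual to an organization for a time period."""
    organization: str
    start: datetime
    end: datetime


@dataclass
class Individual:
    """A real person: merged identities, a profile name and affiliations."""
    uuid: str
    profile_name: Optional[str] = None
    identities: List[Identity] = field(default_factory=list)
    enrollments: List[Enrollment] = field(default_factory=list)

    def organizations_on(self, when: datetime) -> List[str]:
        """Return the organizations the individual was affiliated to on a date."""
        return [e.organization for e in self.enrollments if e.start <= when <= e.end]


def merge(target: Individual, other: Individual) -> Individual:
    """Move the identities and enrollments of `other` into `target`."""
    target.identities.extend(other.identities)
    target.enrollments.extend(other.enrollments)
    return target
```

In this sketch, merging two individuals simply pools their identities and affiliations under the UUID of the target; the real tool may apply additional rules.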
Sorting Hat is a part of the [GrimoireLab toolset](https://grimoirelab.github.io), which provides Python modules and scripts to analyze data sources with information about software development, and allows the production of interactive dashboards to visualize that information.
In the context of GrimoireLab, Sorting Hat is usually run after data is retrieved with [Perceval](https://github.com/chaoss/grimoirelab-perceval), to store the identities obtained into its database, and later merge them into individuals (and maybe affiliate them).
## Requirements
* Python >= 3.9
* Poetry >= 1.1.0
* MySQL >= 8.1 or MariaDB >= 10.4
* Django = 4.2
* Graphene-Django >= 2.0
* uWSGI >= 2.0
You will also need some other libraries to run the tool. You can find the
whole list of dependencies in the [pyproject.toml](pyproject.toml) file.
## Installation
### Getting the source code
To install from the source code you will need to clone the repository first:
```
$ git clone https://github.com/chaoss/grimoirelab-sortinghat
$ cd grimoirelab-sortinghat
```
### Backend
#### Prerequisites
##### Poetry
We use [Poetry](https://python-poetry.org/docs/) for managing the project.
You can install it following [these steps](https://python-poetry.org/docs/#installation).
##### mysql_config
Before you install SortingHat you might need to install the `mysql_config`
command. If you are using a Debian-based distribution, this command can be
found in either the `libmysqlclient-dev` or the `libmariadbclient-dev` package
(depending on whether you are using a MySQL or a MariaDB database server). You can
install these packages on your system with the following commands:
* **MySQL**
```
$ apt install libmysqlclient-dev
```
* **MariaDB**
```
$ apt install libmariadbclient-dev-compat
```
#### Installation and configuration
**Note**: these examples use the `sortinghat.config.settings` configuration file.
In order to use that configuration you need to define the environment variable
`SORTINGHAT_SECRET_KEY` with a secret value. See the Django documentation for
[more information](https://docs.djangoproject.com/en/4.2/ref/settings/#std:setting-SECRET_KEY).
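
If you need a value for `SORTINGHAT_SECRET_KEY`, one possible approach is to generate a random string with Python's standard `secrets` module. This is a minimal sketch; the helper name `generate_secret_key` and the chosen length are assumptions, not part of SortingHat.

```python
import secrets


def generate_secret_key(nbytes: int = 50) -> str:
    """Return a random URL-safe string suitable for use as a secret key."""
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    # Print a shell line that can be pasted into the terminal.
    print(f'export SORTINGHAT_SECRET_KEY="{generate_secret_key()}"')
```

Keep the generated value private and reuse the same value every time the service is started.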
Install the required dependencies (this will also create a virtual environment).
```
$ poetry install
```
Activate the virtual environment:
```
$ poetry shell
```
Create the database, apply migrations and fixtures, deploy static files,
and create a superuser:
```
(.venv)$ sortinghat-admin --config sortinghat.config.settings setup
```
#### Running the backend
Run the SortingHat backend Django app:
```
(.venv)$ ./manage.py runserver --settings=sortinghat.config.settings
```
### Frontend
#### Prerequisites
##### yarn
To compile and run the frontend you will need to install `yarn` first.
The latest versions of `yarn` can only be installed with `npm` - which
is distributed with [NodeJS](https://nodejs.org/en/download/).
Once `npm` is installed, run the following command to install `yarn`
on the system:
```
$ npm install -g yarn
```
Check the [official documentation](https://yarnpkg.com/getting-started)
for more information.
#### Installation and configuration
Install the required dependencies:
```
$ cd ui/
$ yarn install
```
#### Running the frontend in development mode
Run the SortingHat backend Django app:
```
(.venv)$ ./manage.py runserver --settings=config.settings.devel
```
Build the frontend and watch for changes:
```
$ yarn watch --api_url=http://localhost:8000/api/ --publicpath="/static/" --mode development
```
## SortingHat service
Starting at version 0.8, SortingHat is released with a server app. The server has two
modes, `production` and `development`.
When `production` mode is active, a WSGI app is served. The idea is to place a reverse
proxy such as NGINX in front of the WSGI app to provide
an HTTP interface.
When `development` mode is active, an HTTP server is launched, so you can interact
directly with SortingHat using HTTP requests. Note that this mode is neither
suitable nor safe for production.
You will need a Django configuration file to run the service. The file must be accessible
via the `PYTHONPATH` environment variable. You can use the one delivered with the SortingHat
package (stored in the `sortinghat/config` folder) and modify it with your parameters.
The following examples use that file.
To run the service for the first time, execute the following commands.
Build the UI:
```
$ cd ui
$ yarn install
$ yarn build --mode development
```
If you want to serve the UI at `/identities` (this requires running the server
behind a proxy server), run instead:
```
$ yarn build
```
Set a secret key:
```
$ export SORTINGHAT_SECRET_KEY="my-secret-key"
```
Set up the service by creating a database, deploying static files,
and adding a superuser to access the app:
```
$ sortinghat-admin --config sortinghat.config.settings setup
```
Run the server (use `--dev` flag for `development` mode):
```
$ sortinghatd --config sortinghat.config.settings
```
By default, this runs a WSGI server at `127.0.0.1:9314`. The `--dev` flag runs
an HTTP server at `127.0.0.1:8000`.
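
To confirm that the server is accepting connections on one of the addresses above, a plain TCP check is often enough. The following is a minimal sketch using only the standard library; the function name `is_listening` is an assumption made for this example, and the check only verifies that the port is open, not that SortingHat is answering correctly.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 8000 is the port used with the --dev flag; 9314 is the default WSGI port.
    for port in (8000, 9314):
        print(port, is_listening("127.0.0.1", port))
```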
You will also need to run some workers to execute tasks like recommendations
or affiliation. To start a worker run the command:
```
$ sortinghatw --config sortinghat.config.settings
```
To start a worker that processes jobs from a set of tenants when
`dedicated_queue` is active (see [below](#multi-tenancy))
use the next command:
```
$ sortinghatw --config sortinghat.config.settings tenant_A tenant_B
```
## Create new accounts
To create new accounts for SortingHat use the following command:
```
(.venv)$ sortinghat-admin create-user

Usage: sortinghat-admin create-user [OPTIONS]

  Create a new user given a username and password

Options:
  --username TEXT   Specifies the login for the user.
  --is-admin        Specifies if the user is superuser.
  --no-interactive  Run the command in no interactive mode.
```
## Assign users to permission groups
A user in a group automatically has the permissions granted to that group. To assign users to a permission group use the following command:
```
$ sortinghat-admin set-user-permissions username group
```
The list of groups can be customized using the configuration file `sortinghat/config/permission_groups.json`. To use a different JSON file, set the environment variable `SORTINGHAT_PERMISSION_GROUPS_LIST_PATH` to its path.
## Compatibility between versions
### SortingHat 0.8.0 and GrimoireLab 0.8.0
SortingHat 0.7.x is no longer supported, and databases created with that version
are not compatible with SortingHat 0.8.0. The `uidentities` table was renamed
to `individuals`, and the fields `created_at` and `last_modified` were added to
all tables. In addition, some columns were renamed in the `domains`, `enrollments`,
`identities`, and `profiles` tables:
* `domains`
  * `organization_id` to `organization`
* `enrollments`
  * `organization_id` to `organization`
  * `uuid` to `individual`
* `identities`
  * `uuid` to `individual`
* `profiles`
  * `country_code` to `country`
  * `uuid` to `individual`
Please update your database by running the following command:
```
$ sortinghat-admin --config sortinghat.config.settings migrate-old-database
```
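
If you maintain your own queries or scripts written against a 0.7.x database, the renames listed above can be captured in a small lookup table to help update them. This is only an illustrative sketch: the `translate` helper is hypothetical and it does not replace the `migrate-old-database` command.

```python
# Renames taken from the list above (0.7.x name -> 0.8.0 name).
TABLE_RENAMES = {"uidentities": "individuals"}

COLUMN_RENAMES = {
    "domains": {"organization_id": "organization"},
    "enrollments": {"organization_id": "organization", "uuid": "individual"},
    "identities": {"uuid": "individual"},
    "profiles": {"country_code": "country", "uuid": "individual"},
}


def translate(table: str, column: str) -> tuple:
    """Map an old (table, column) pair to its new names.

    Names that are not in the lookup tables are returned unchanged.
    """
    new_table = TABLE_RENAMES.get(table, table)
    new_column = COLUMN_RENAMES.get(new_table, {}).get(column, column)
    return new_table, new_column


if __name__ == "__main__":
    print(translate("profiles", "country_code"))
    print(translate("enrollments", "uuid"))
```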
### SortingHat 1.1.0 and GrimoireLab 1.3.0
SortingHat 1.1.0 allows the assignment of users to permission groups. By default, any
existing user in the database will have the minimum permissions, which allow
only read access. To explicitly assign a user to a permission group, run the
command described in the section
[Assign users to permission groups](#assign-users-to-permission-groups).
## Multi-tenancy
SortingHat can host multiple instances with a single service, keeping each
instance's data isolated in a separate database.
To enable this feature follow these guidelines:
- Set the `MULTI_TENANT` setting to `True`.
- Define a list of tenants using the configuration file `sortinghat/config/tenants.json`.
  To use a different JSON file, set the environment variable
  `SORTINGHAT_MULTI_TENANT_LIST_PATH` to its path. The file should have the following schema:

  ```json
  {
      "tenants": [
          {"name": "tenant A", "dedicated_queue": true},
          {"name": "tenant B", "dedicated_queue": false}
      ]
  }
  ```

  Where `name` is the name of each tenant and `dedicated_queue`
  is a boolean value that sets whether the tenant's jobs will run on a dedicated
  queue with the same name as the tenant.
- Assign users to tenants with the following command:
  `sortinghat-admin set-user-tenant username header tenant`
- The selected tenant should be included in each request using the
  `sortinghat-tenant` header.
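
As an illustration of the last point, the sketch below builds an HTTP request that carries the `sortinghat-tenant` header. It makes several assumptions: that the API is reachable at `http://localhost:8000/api/` (the URL used in the frontend development example), that it accepts GraphQL queries as JSON over POST, and that authentication is handled elsewhere. The helper name `build_tenant_request` is hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request


def build_tenant_request(api_url: str, tenant: str, query: str = "{ __typename }"):
    """Build a POST request that selects a tenant through the request header."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Header that tells the service which tenant the request is for.
            "sortinghat-tenant": tenant,
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_tenant_request("http://localhost:8000/api/", "tenant A")
    print(request.get_method(), request.full_url)
    print(request.header_items())
```

A real client would also need to authenticate before the service accepts the request.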
There are some limitations:
- The `default` database is only used to store user information and the relations between
  users and databases; it won't store anything else related to SortingHat models.
- Usernames are shared across all instances, which means that it is not possible
  to have the same username with two different passwords in different instances.
- Tenants with `dedicated_queue` set to `true` will add their jobs to the queue
  of the same name. SortingHat creates these queues, but you will have
  to run a worker that processes each of them.
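
Because each dedicated queue needs a worker, it can be handy to list the tenants that have one. The following sketch reads a tenants definition with the schema shown earlier and returns the names of the tenants whose `dedicated_queue` is `true`. The function name `dedicated_queues` is an assumption for this example and is not part of SortingHat.

```python
import json


def dedicated_queues(tenants_json: str) -> list:
    """Return the names of the tenants that use a dedicated queue."""
    config = json.loads(tenants_json)
    return [
        tenant["name"]
        for tenant in config["tenants"]
        if tenant.get("dedicated_queue", False)
    ]


EXAMPLE = """
{
    "tenants": [
        {"name": "tenant A", "dedicated_queue": true},
        {"name": "tenant B", "dedicated_queue": false}
    ]
}
"""

if __name__ == "__main__":
    print(dedicated_queues(EXAMPLE))  # ['tenant A']
```

The resulting names are the ones you would pass to `sortinghatw` so that a worker processes those queues.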
## Running tests
SortingHat comes with a comprehensive list of unit tests for both
frontend and backend.
#### Backend test suite
```
(.venv)$ ./manage.py test --settings=config.settings.config_testing
(.venv)$ ./manage.py test --settings=config.settings.config_testing_tenant
```
#### Frontend test suite
```
$ cd ui/
$ yarn test:unit
```
## License
Licensed under GNU General Public License (GPL), version 3 or later.