# git-submodule made easy with git-toprepo
The `git-toprepo` script acts a bit like a client side `git-subtree`
based on the submodules in a top repository.
It supports one level of submodules only;
recursive submodules are not resolved.
`git toprepo init <repository> [<directory>]` clones `repository` into `directory`
and replaces the submodule pointers with the actual content in the repository history.
`git toprepo fetch` fetches from the `remote` and performs the submodule resolution.
`git toprepo pull` is the same as `git toprepo fetch && git merge`.
`git toprepo push [-n/--dry-run] <rev>:<ref> ...` does a reverse submodule resolution
so that each submodule can be pushed individually to each submodule upstream.
If running with `-n` or `--dry-run`, the resulting `git push` command lines
will be printed but not executed.
## Merging strategy
The basic idea is to join all the history from all the subrepositories
in a reproducible way. This means that users can keep a mono repository
locally on their computers but still share commit hashes with everyone else.
Consider the following history and commits:
    Top repo  A---B---C---D-------E---F---G---H
                  |       |       |       |
    Submodule 1---2-------3---4---5---6---7---8
The merged history will look like:
    Mono repo A---B2---C2---D3--E5---F5--G7--H7
                  /          \  /     \  / \
                 1            D4       F6   G8
... and NOT like:
    BAD REPO  A--B2--C2--D3--D4--E5--F5--G7--H7
                 /\      /         \    /     \
                1  ------            E6       H8
The algorithm steps are:
* Any history before the submodule is added contains the submodule
  directory only (1).
* Empty edges in the submodule history are removed (`2---3`).
  Such empty edges would only pollute the graph.
  The mono repo history for the submodule directory would
  show there is no update between the two commits anyway.
* The top repo will keep the "first parent" line (`D3---E5`).
  D4 might not be buildable and would break
  `git bisect --first-parent`.
* Submodule changes are moved as late as possible before merging (F6).
  The alternative of E6 instead of F6 clutters a graph log view.
  From the top repo view, it is impossible to know if E6 or F6
  is better (buildable) or not.
* Unmerged submodule branches are branched as early as possible.
  It is easier to run `git checkout G8 && git rebase H7` than
  `git checkout H8 && git rebase H7 --onto G7`.
* Unmerged submodule branches are branched from the history of `HEAD`.
  As commit 7 can be used in multiple top repo branches,
  it is impossible to know which branch commit 8 is aimed for.
  Simply check out a different monorepo branch and run `git toprepo refilter`
  to move unmerged submodule branches around.
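
The labels in the diagrams above can be reproduced with a small toy model.
This is only a sketch under simplifying assumptions (a linear top repo history,
a linear submodule history and a single submodule); the function names are
hypothetical and it is not how `git-toprepo` itself is implemented.

```python
def first_parent_labels(top, pointers):
    """Label each top repo commit with the submodule commit it points at."""
    labels, current = [], ""
    for commit in top:
        current = pointers.get(commit, current)
        labels.append(commit + current)
    return labels


def side_commit_labels(top, pointers, sub):
    """Label the submodule commits that end up outside the first-parent line."""
    labels = []
    prev_ptr = None   # submodule commit referenced before the current bump
    bump_commit = ""  # top repo commit that introduced prev_ptr
    for i, commit in enumerate(top):
        ptr = pointers.get(commit)
        if ptr is None or ptr == prev_ptr:
            continue
        if prev_ptr is None:
            # History before the submodule was added: submodule content only.
            start, base = 0, ""
        else:
            # Intermediate commits sit on the parent of the bumping commit,
            # that is, as late as possible before merging.
            start, base = sub.index(prev_ptr) + 1, top[i - 1]
        labels += [base + s for s in sub[start:sub.index(ptr)]]
        prev_ptr, bump_commit = ptr, commit
    if prev_ptr is not None:
        # Unmerged commits branch from the earliest commit using prev_ptr.
        labels += [bump_commit + s for s in sub[sub.index(prev_ptr) + 1:]]
    return labels


top = list("ABCDEFGH")
sub = list("12345678")
pointers = {"B": "2", "D": "3", "E": "5", "G": "7"}
print(first_parent_labels(top, pointers))
# ['A', 'B2', 'C2', 'D3', 'E5', 'F5', 'G7', 'H7']
print(side_commit_labels(top, pointers, sub))
# ['1', 'D4', 'F6', 'G8']
```

Note that the empty edge `2---3` produces no side commit at all,
which matches the second algorithm step.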
## Configuration
The configuration is specified in git-config format and read by default
from `refs/meta/git-toprepo:toprepo.config` from the top repo remote.
This default loading location can be overridden by setting
`toprepo.config.default.*` in your own git-config.
### Edit default configuration
To set up and edit the configuration in the default location, run
```bash
mkdir my-toprepo-config
cd my-toprepo-config
git init
# Initial commits.
vim toprepo.config
git add toprepo.config
git commit
git push <repository> HEAD:refs/meta/git-toprepo
# Fetch to edit
git fetch <repository> refs/meta/git-toprepo
git checkout FETCH_HEAD
```
Alternatively, set up the repository with a remote using:
```bash
mkdir my-toprepo-config
cd my-toprepo-config
git init
git remote add origin <repository>
# Quote the refspec so that the shell does not try to expand the globs.
git config remote.origin.fetch '+refs/meta/*:refs/remotes/origin/meta/*'
git fetch origin
```
### Configuration loading
The configuration is specified in the git-config under the section
`toprepo.config.<name>`. The default setting is:
```ini
[toprepo.config.default]
    type = "git"
    url = .
    ref = refs/meta/git-toprepo
    path = toprepo.config
```
This will load configuration from `toprepo.config` at `refs/meta/git-toprepo`
in the top repo remote.
More configurations can be loaded recursively and they are parsed using
`git config --file - --list`.
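
As an illustration of that parsing step, the sketch below feeds configuration
text to `git config --file - --list` on stdin and collects the `key=value`
lines, accumulating repeated keys. The helper names are hypothetical and the
parser assumes that keys contain no `=` and values contain no newlines.

```python
import subprocess
from collections import defaultdict


def parse_config_list(output):
    """Parse `git config --list` output into {key: [value, ...]}."""
    values = defaultdict(list)
    for line in output.splitlines():
        if not line:
            continue
        # Split on the first "=" only, values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key].append(value)
    return dict(values)


def load_config_text(config_text):
    """Let git parse config text from stdin (requires git on PATH)."""
    result = subprocess.run(
        ["git", "config", "--file", "-", "--list"],
        input=config_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_config_list(result.stdout)


listing = (
    "toprepo.repo.myrepo.url=../some-repo.git\n"
    "toprepo.repo.myrepo.url=https://my-git-server/some-repo.git\n"
    "toprepo.repo.myrepo.fetchurl=../some-repo.git\n"
)
for key, key_values in parse_config_list(listing).items():
    print(key, key_values)
```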
#### Configuration loading related fields
The following fields are available for different
`toprepo.config.<config-name>.type`:
* `toprepo.config.<config-name>.type=file` loads a file from local disk.
  * `toprepo.config.<config-name>.path`: The path to the config file to load.
* `toprepo.config.<config-name>.type=git` loads a file from a git repository.
  * `toprepo.config.<config-name>.url`: The local or remote repository
    location. If the URL starts with `.`, it is assumed to be a URL relative
    to the top repository remote origin.
  * `toprepo.config.<config-name>.ref`: The remote reference to load.
  * `toprepo.config.<config-name>.path`: The path to the config file
    in the repository.
* `toprepo.config.<config-name>.type=none` has no more fields.
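
How a `url` starting with `.` could be resolved against the top repository
remote is sketched below. The function name is hypothetical and the
path-joining rule is an assumption, modelled on how relative submodule URLs
are commonly resolved; the exact rules used by `git-toprepo` may differ.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def resolve_config_url(origin_url, url):
    """Resolve `url` relative to `origin_url` if it starts with a dot."""
    if not url.startswith("."):
        # Absolute locations are used unchanged.
        return url
    parts = urlsplit(origin_url)
    # Join onto the origin path and collapse "." and ".." components.
    path = posixpath.normpath(posixpath.join(parts.path, url))
    return urlunsplit(parts._replace(path=path))


origin = "https://my-git-server/top-repo.git"
print(resolve_config_url(origin, "."))
# https://my-git-server/top-repo.git
print(resolve_config_url(origin, "../some-repo.git"))
# https://my-git-server/some-repo.git
```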
#### Configuration loading examples
Load from worktree:
```ini
[toprepo.config.default]
    type = "file"
    path = .gittoprepo
```
Load from remote `HEAD` (instead of `refs/meta/git-toprepo`):
```ini
[toprepo.config.default]
    type = "git"
    url = .
    ref = HEAD
    path = .gittoprepo
```
or simply:
```ini
[toprepo.config.default]
    ref = HEAD
    path = .gittoprepo
```
### Roles
Roles are used to load and filter a set of repositories.
The built-in default configuration includes:
```ini
[toprepo]
    role = default
[toprepo.role.default]
    repos = +.*
```
This means that `toprepo.role` resolves to `default` if unset, and that
the `default` role includes all repositories unless configured otherwise.
#### Role related fields
* `toprepo.role`: A named role to use. Defaults to `default`.
* `toprepo.role.<role>.repos`: Tells which sub repos to use.
  Multiple values are accumulated.
  Each value starts with `+` or `-` followed by a regexp that should match a
  whole repository name. The last matching regex decides whether the repo
  should be expanded or not.
  `toprepo.role.default.repos` defaults to `+.*`.
#### Role configuration examples
```ini
[toprepo]
    # Default to this role, git-config can override.
    role = "active-only"
[toprepo.role.all]
    repos = +.*
[toprepo.role.active-only]
    # Remove all repositories.
    repos = -.*
    # Match certain ones.
    repos = +git-toprepo
    repos = +git-filter-repo
```
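
The "last matching regex decides" rule can be sketched as follows, using the
`repos` values of the `active-only` role above. The function name is
hypothetical, and treating a repository that matches no value as not expanded
is an assumption.

```python
import re


def repo_is_expanded(repo_name, repos_values):
    """Return True if the last matching +/- pattern enables the repo."""
    expanded = False  # assumption: no match at all means "not expanded"
    for value in repos_values:
        sign, regex = value[:1], value[1:]
        if sign not in ("+", "-"):
            raise ValueError(f"repos value must start with + or -: {value!r}")
        # The regex has to match the whole repository name.
        if re.fullmatch(regex, repo_name):
            expanded = sign == "+"
    return expanded


active_only = ["-.*", "+git-toprepo", "+git-filter-repo"]
for name in ["git-toprepo", "git-filter-repo", "git-toprepo-extras", "other"]:
    print(name, repo_is_expanded(name, active_only))
```

Only `git-toprepo` and `git-filter-repo` are expanded here;
`git-toprepo-extras` is not, because the pattern must match the whole name.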
### Sub repositories
As `.gitmodules` evolves on the branches over time and
the servers might be relocated, the repository configuration shows how to
access each sub repository in the full history of the top repository.
For example, multiple URLs might have been configured in
the `.gitmodules` file, but all of them refer to the same repository.
#### Repository related fields
* `toprepo.repo.<repo-name>.urls`: Repositories with this specified URL in the
  `.gitmodules` file will use the configuration under `repo-name`.
  Multiple values are allowed, in which case `fetchUrl` must also be
  specified to make upstream connections unambiguous.
* `toprepo.repo.<repo-name>.fetchUrl`: Overrides `toprepo.repo.<repo-name>.url`
  for clone and fetch.
* `toprepo.repo.<repo-name>.pushUrl`: Overrides `toprepo.repo.<repo-name>.fetchUrl`
  for push.
* `toprepo.repo.<repo-name>.fetchArgs`: Extra command line arguments for
  git-fetch, multiple uses are accumulated.
  Default is `--prune`, `--prune-tags` and `--tags`.
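
The override chain between these fields can be sketched like this. The class
and attribute names are hypothetical and only mirror the rules listed above:
`fetchUrl` falls back to a single configured URL, and `pushUrl` falls back to
the fetch URL.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RepoConfig:
    name: str
    urls: List[str]                  # URLs as written in .gitmodules
    fetch_url: Optional[str] = None  # toprepo.repo.<repo-name>.fetchUrl
    push_url: Optional[str] = None   # toprepo.repo.<repo-name>.pushUrl

    def effective_fetch_url(self) -> str:
        if self.fetch_url is not None:
            return self.fetch_url
        if len(self.urls) != 1:
            # Several URLs make the upstream ambiguous without fetchUrl.
            raise ValueError(f"fetchUrl is required for repo {self.name!r}")
        return self.urls[0]

    def effective_push_url(self) -> str:
        if self.push_url is not None:
            return self.push_url
        return self.effective_fetch_url()


repo = RepoConfig(
    name="myrepo",
    urls=["../some-repo.git", "https://my-git-server/some-repo.git"],
    fetch_url="../some-repo.git",
)
print(repo.effective_fetch_url())
print(repo.effective_push_url())
```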
#### Repository configuration examples
```ini
# The repository will be cloned under `.git/repos/myrepo` and
# the role will filter on the myrepo identifier (case sensitive).
[toprepo.repo "myrepo"]
    url = ../some-repo.git
    url = https://my-git-server/some-repo.git
    # Multiple URLs make fetchUrl required.
    fetchUrl = ../some-repo.git
```
Note that without quotes, the repository name is read in lowercase:
```bash
$ git config --list --file - <<EOF
[toprepo.repo.LowerCase]
    url = ../LowerCase.git
[toprepo.repo "Other_Repo"]
    url = ../Other/Repo.git
EOF
toprepo.repo.lowercase.url=../LowerCase.git
toprepo.repo.Other_Repo.url=../Other/Repo.git
```
### Missing commits
Sometimes, submodules point to commits that no longer exist,
for example because their branch was removed, or that are otherwise erroneous.
To give the same view and resolved commit hashes to all users,
every missing commit needs to be listed.
git-toprepo will print the lines to add to your configuration when needed.
#### Missing commits syntax
* `toprepo.missing-commit.rev-<commit-hash>=<raw-url>`: This commit hash
  will be ignored if referenced by a submodule that has its URL in the
  `.gitmodules` file specified as `raw-url`.
#### Missing commits example
```ini
[toprepo.missing-commits]
    rev-b6a50df1c26c6b0f8755cac88203a9f4547adccd = ../some-repo.git
    rev-bfd24a62a7d5d5c67e396dd78e28137f99757508 = https://my-git-server/some-repo.git
```
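
A lookup against such a list could look like the sketch below. The function
names are hypothetical; the only behaviour taken from the description above
is that a commit is ignored when both its hash and the raw `.gitmodules` URL
match an entry.

```python
def parse_missing_commits(entries):
    """Turn (key, raw_url) config pairs into a set of (hash, raw_url)."""
    prefix = "rev-"
    missing = set()
    for key, raw_url in entries:
        if not key.startswith(prefix):
            raise ValueError(f"expected a rev-<commit-hash> key, got {key!r}")
        missing.add((key[len(prefix):], raw_url))
    return missing


def is_missing(missing, commit_hash, raw_url):
    """A commit is ignored only for the raw URL it was listed with."""
    return (commit_hash, raw_url) in missing


missing = parse_missing_commits([
    ("rev-b6a50df1c26c6b0f8755cac88203a9f4547adccd", "../some-repo.git"),
    ("rev-bfd24a62a7d5d5c67e396dd78e28137f99757508",
     "https://my-git-server/some-repo.git"),
])
print(is_missing(missing, "b6a50df1c26c6b0f8755cac88203a9f4547adccd",
                 "../some-repo.git"))
print(is_missing(missing, "b6a50df1c26c6b0f8755cac88203a9f4547adccd",
                 "https://my-git-server/some-repo.git"))
```

The second lookup is not a match, because that hash was listed with a
different raw URL.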