# ansible-content-parser

| Field           | Value                          |
| --------------- | ------------------------------ |
| Name            | ansible-content-parser         |
| Version         | 1.0.3                          |
| Summary         | The ansible content parser     |
| Upload time     | 2024-02-05 14:32:02            |
| Requires Python | >=3.10                         |
| License         | Apache                         |
| Keywords        | ansible                        |
| Requirements    | No requirements were recorded. |

## Overview

`ansible-content-parser` analyzes Ansible files in a given source
(a local directory, an archive file or a git URL)
by running `ansible-lint` internally,
updates Ansible files using the [Autofix feature of `ansible-lint`](https://ansible.readthedocs.io/projects/lint/autofix/)
and generates the `ftdata.jsonl` file, which is the training dataset for
developing custom AI models.

## Build

Run the `tox` command. The installable packages are created under
the `dist` directory.

## Installation

### Prerequisites

- Python version 3.10 or later.
- A UNIX-like OS, such as Linux or macOS.

**Note:** Installation on Microsoft Windows OS is not supported.
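
The prerequisites above can be checked from Python before installing. The following is a minimal sketch, not part of `ansible-content-parser`; the function name `check_prerequisites` is hypothetical, and it assumes that a `sys.platform` value starting with `win` identifies Microsoft Windows.

```python
import sys


def check_prerequisites(version_info=sys.version_info, platform=sys.platform):
    """Return a list of problems; an empty list means the prerequisites are met."""
    problems = []
    # Only the major and minor numbers matter for the 3.10 requirement.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10 or later is required")
    # sys.platform is "win32" on Windows, "linux" on Linux, "darwin" on macOS.
    if platform.startswith("win"):
        problems.append("Microsoft Windows is not supported")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```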

### Procedure

`ansible-content-parser` uses a newer version of `ansible-lint` and its
dependent components. To isolate them from existing Ansible installations,
install `ansible-content-parser` in a Python virtual environment
with the following steps:

1. Create a working directory and set up a Python virtual environment with `venv`:

```commandline
python -m venv ./venv
source ./venv/bin/activate
```

2. Install `ansible-content-parser` from PyPI with `pip`:

```commandline
pip install --upgrade pip
pip install --upgrade ansible-content-parser
```

3. After the installation is completed, verify that `ansible-content-parser` and `ansible-lint` are installed correctly:

```commandline
ansible-content-parser --version
ansible-lint --version
```

The application versions and their dependencies are displayed.
Ensure that both commands report the same version of `ansible-lint`.

**Important:** If the installed `ansible-lint` versions do not match, the content parser and `ansible-lint` do not produce consistent results.
For example, the following output shows a mismatch in `ansible-lint` versions:

```commandline
$ ansible-content-parser --version
ansible-content-parser 0.0.1 using ansible-lint:6.20.0 ansible-core:2.15.4
$ ansible-lint --version
ansible-lint 6.13.1 using ansible 2.15.4
A new release of ansible-lint is available: 6.13.1 → 6.20.0
```

If the `ansible-lint` versions do not match, perform the following tasks:

1. Deactivate and reactivate the virtual environment:

```commandline
deactivate
source ./venv/bin/activate
```

2. Verify that the `ansible-lint` versions match:

```commandline
ansible-content-parser --version
ansible-lint --version
```

For example, the following output shows matching `ansible-lint` versions:

```commandline
$ ansible-content-parser --version
ansible-content-parser 0.0.1 using ansible-lint:6.20.0 ansible-core:2.15.4
$ ansible-lint --version
ansible-lint 6.20.0 using ansible-core:2.15.4 ansible-compat:4.1.10 ruamel-yaml:0.17.32 ruamel-yaml-clib:0.2.7
```
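
The version comparison can also be scripted. The following is a minimal sketch that extracts the `ansible-lint` version from the two `--version` outputs shown above and compares them. The function names and the regular expression are assumptions based only on the sample output in this section, so other output formats may not be recognized.

```python
import re

# Matches "ansible-lint:6.20.0" as well as "ansible-lint 6.13.1".
LINT_VERSION = re.compile(r"ansible-lint[: ](\d+(?:\.\d+)*)")


def lint_version(version_output):
    """Return the first ansible-lint version found in the text, or None."""
    match = LINT_VERSION.search(version_output)
    return match.group(1) if match else None


def versions_match(parser_output, lint_output):
    """Return True only if both outputs report the same ansible-lint version."""
    first = lint_version(parser_output)
    second = lint_version(lint_output)
    return first is not None and first == second
```

The two strings would typically come from running `ansible-content-parser --version` and `ansible-lint --version` inside the activated virtual environment.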

## Execution

`ansible-content-parser` takes two positional arguments (`source` and `output`)
and several optional arguments.

```commandline
$ ansible-content-parser --help
usage: ansible-content-parser [-h] [--config-file CONFIG_FILE]
                              [--profile {min,basic,moderate,safety,shared,production}] [--fix WRITE_LIST]
                              [--skip-ansible-lint] [--no-exclude] [-v] [--source-license SOURCE_LICENSE]
                              [--source-description SOURCE_DESCRIPTION] [--repo-name REPO_NAME] [--repo-url REPO_URL]
                              [--version]
                              source output

Parse Ansible files in the given repository by running ansible-lint and generate a training dataset for Ansible
Lightspeed.

positional arguments:
  source                source, which can be an zip/tar archive, a git URL or a local directory
  output                output directory

options:
  -h, --help            show this help message and exit
  --config-file CONFIG_FILE
                        Specify the configuration file to use for ansible-lint. By default it will look for '.ansible-
                        lint', '.config/ansible-lint.yml', or '.config/ansible-lint.yaml' in the source repository.
  --profile {min,basic,moderate,safety,shared,production}
                        Specify which rules profile to be used for ansible-lint
  --fix WRITE_LIST      Specify how ansible-lint performs auto-fixes, including YAML reformatting. You can limit the
                        effective rule transforms (the 'write_list') by passing a keywords 'all' (=default) or 'none'
                        or a comma separated list of rule ids or rule tags.
  -S, --skip-ansible-lint
                        Skip the execution of ansible-lint.
  --no-exclude          Do not let ansible-content-parser to generate training dataset by excluding files that caused
                        lint errors. With this option specified, a single lint error terminates the execution without
                        generating the training dataset.
  -v, --verbose         Explain what is being done
  --source-license SOURCE_LICENSE
                        Specify the license that will be included in the training dataset.
  --source-description SOURCE_DESCRIPTION
                        Specify the description of the source that will be included in the training dataset.
  --repo-name REPO_NAME
                        Specify the repository name that will be included in the training dataset. If it is not
                        specified, it is generated from the source name.
  --repo-url REPO_URL   Specify the repository url that will be included in the training dataset. If it is not
                        specified, it is generated from the source name.
  --version             show program's version number and exit
```
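
When the tool is driven from a script, the command line can be assembled from the options listed in the help output. The following is a minimal sketch; `build_command` and `run_parser` are hypothetical helper names, and only flags that appear in the help text above are used.

```python
import subprocess


def build_command(source, output, profile=None, skip_ansible_lint=False, no_exclude=False):
    """Build the argument list for one ansible-content-parser run."""
    command = ["ansible-content-parser"]
    if profile is not None:
        command += ["--profile", profile]
    if skip_ansible_lint:
        command.append("--skip-ansible-lint")
    if no_exclude:
        command.append("--no-exclude")
    # The two positional arguments come last: source, then output.
    command += [source, output]
    return command


def run_parser(source, output, **options):
    """Run the tool; check=True raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_command(source, output, **options), check=True)
```

Passing the arguments as a list avoids shell quoting issues when `source` is a path that contains spaces.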

### `source` positional argument

The first positional argument is `source`, which specifies
the source repository to be used. The following three types of sources are supported:

1. A local directory.
2. An archive file in one of the formats in the following table:

| File Format      | File Extension                                |
| ---------------- | --------------------------------------------- |
| ZIP              | .zip                                          |
| Uncompressed TAR | .tar                                          |
| Compressed TAR   | .tar.gz, .tgz, .tar.bz2, .tbz2, .tar.xz, .txz |

3. Git URL, e.g. `git@github.com:ansible/workshop-examples.git` or `https://github.com/ansible/workshop-examples.git`
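
The three source types can be told apart with a few string and path checks. The following is an illustrative sketch only and does not reproduce the tool's actual detection logic. The function name `classify_source` is hypothetical, and it assumes that Git URLs start with `git@` or `https://`, as in the examples above.

```python
from pathlib import Path

# File extensions from the archive table above.
ARCHIVE_SUFFIXES = (".zip", ".tar", ".tar.gz", ".tgz", ".tar.bz2", ".tbz2", ".tar.xz", ".txz")


def classify_source(source):
    """Guess the source type: "git", "archive", "directory" or "unknown"."""
    if source.startswith(("git@", "https://")):
        return "git"
    if source.lower().endswith(ARCHIVE_SUFFIXES):
        return "archive"
    if Path(source).is_dir():
        return "directory"
    return "unknown"
```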

### `output` positional argument

The second positional argument is `output`, which specifies a writable
directory. If the directory already exists, it must be empty.
If it does not exist, it is created with the given name.

`ansible-content-parser` creates the `repository` subdirectory in the
`output` directory and copies the contents of the `source` repository
to it. The copied contents may be changed during the execution
of the Content Parser.
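
The rule for the `output` directory can be checked before starting a long run. The following is a minimal sketch of that check; the function name `output_dir_is_usable` is hypothetical, and the sketch does not test write permission.

```python
from pathlib import Path


def output_dir_is_usable(output):
    """Return True if the path does not exist yet or is an empty directory."""
    path = Path(output)
    if not path.exists():
        return True  # the directory would be newly created
    # An existing path must be a directory with no entries.
    return path.is_dir() and not any(path.iterdir())
```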

## Outputs

The following directory structure is created in the directory specified with the `output`
positional argument.

```
output/
  |-- ftdata.jsonl # Training dataset
  |-- report.txt   # A human-readable report
  |
  |-- repository/
  |     |-- (files copied from the source repository)
  |
  |-- metadata/
        |-- lint-result.json     # Metadata generated by ansible-lint
        |-- sarif.json           # ansible-lint results in SARIF format
        |-- (other metadata files generated)
```
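
After a run, a script can confirm that the entries shown in the tree above are present. The following is a minimal sketch; the names `EXPECTED_OUTPUTS` and `missing_outputs` are hypothetical. It assumes a run that completed with default options, so some entries may legitimately be absent when options such as `--skip-ansible-lint` or `--no-exclude` are used.

```python
from pathlib import Path

# Entries taken from the directory tree above, relative to the output directory.
EXPECTED_OUTPUTS = (
    "ftdata.jsonl",
    "report.txt",
    "repository",
    "metadata/lint-result.json",
    "metadata/sarif.json",
)


def missing_outputs(output):
    """Return the expected entries that do not exist under the output directory."""
    root = Path(output)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).exists()]
```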

### ftdata.jsonl

This is the training dataset file, which is the main output of `ansible-content-parser`.

It is in the JSONL (JSON Lines) format: each line is a JSON object.
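
Because each line is an independent JSON object, the file can be read one record at a time. The following is a minimal sketch; the function name `read_jsonl` is hypothetical, and no assumption is made about the keys inside each record.

```python
import json


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of the file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

For example, `sum(1 for _ in read_jsonl("output/ftdata.jsonl"))` counts the records without loading the whole file into memory.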

### report.txt

This is a human-readable report that summarizes the run
of `ansible-content-parser`. It contains sections such as:

1. File counts per type
2. List of Ansible files identified
3. Issues found by ansible-lint
4. List of Ansible modules found in tasks

**Note:** When the `--skip-ansible-lint` option is specified, the first three sections do
not appear in the report.

### metadata directory

This subdirectory contains metadata files generated
during the Content Parser run.

#### lint-result.json

`lint-result.json` is created in the `metadata` subdirectory
when `ansible-content-parser` runs. The file contains a dictionary
with two key/value pairs:

1. `files`: The list of files that were found
   during the execution. The format of each file entry is explained below.

Each file entry is a dictionary that contains the following keys:

| Key               | Description                                                   |
| ----------------- | ------------------------------------------------------------- |
| `base_kind`       | MIME type of the file, for example, `text/yaml`               |
| `dir`             | Directory where the file resides.                             |
| `exc`             | Exception found while processing this file. It can be null.   |
| `filename`        | File name                                                     |
| `kind`            | File type, for example, `playbook`, `tasks` or `role`         |
| `name`            | File name (Usually same as `filename`)                        |
| `parent`          | Name of the parent, like a role. It can be null               |
| `role`            | Ansible role. It can be null                                  |
| `stop_processing` | Identifies whether processing was stopped on this file or not |
| `updated`         | Identifies whether contents were updated by `ansible-lint`    |

The following is an example of a file entry:

```json
{
  "base_kind": "text/yaml",
  "dir": "/mnt/input/roles/delete_compute_node/tasks",
  "exc": null,
  "filename": "roles/delete_compute_node/tasks/main.yaml",
  "kind": "tasks",
  "name": "roles/delete_compute_node/tasks/main.yaml",
  "parent": "roles/delete_compute_node",
  "role": "delete_compute_node",
  "stop_processing": false,
  "updated": false
}
```

2. `excluded`: The list of paths of the files that were excluded from the second `ansible-lint`
   execution because syntax check errors were found in them during the first execution.
   Files in this list do not appear in the entries associated with the `files` key.

**Note:** If `ansible-content-parser` is executed with the `--no-exclude` option, the second execution
does not occur even if syntax check errors were found during the first execution, and
the training dataset is not created.
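
The two keys described above can be summarized with a short script. The following is a minimal sketch that relies only on the `files` and `excluded` keys and on the `kind` and `updated` fields of each file entry; the function names are hypothetical.

```python
import json
from collections import Counter


def summarize_lint_result(data):
    """Summarize a dictionary loaded from lint-result.json."""
    files = data.get("files", [])
    return {
        # Number of files per kind, for example {"tasks": 3, "playbook": 1}.
        "kinds": dict(Counter(entry["kind"] for entry in files)),
        # Number of files whose contents were updated by ansible-lint.
        "updated": sum(1 for entry in files if entry["updated"]),
        # Number of files excluded from the second ansible-lint execution.
        "excluded": len(data.get("excluded", [])),
    }


def summarize_file(path):
    """Load lint-result.json from the given path and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_lint_result(json.load(handle))
```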

#### sarif.json

This is the output of `ansible-lint` with the `--sarif-file` option.
The `report.txt` contains a summary generated
from this file in the "Issues found by ansible-lint" section.
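
A similar per-rule summary can be computed directly from the SARIF file. The following is a minimal sketch that assumes the standard SARIF 2.1.0 layout, in which each entry of `runs` has a `results` list and each result may carry a `ruleId`. The function name `count_results_by_rule` is hypothetical, and the sketch is not the code that produces `report.txt`.

```python
import json
from collections import Counter


def count_results_by_rule(sarif):
    """Count SARIF results per ruleId across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "(no ruleId)")] += 1
    return dict(counts)


def count_file(path):
    """Load a SARIF file from the given path and count its results per rule."""
    with open(path, encoding="utf-8") as handle:
        return count_results_by_rule(json.load(handle))
```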

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "ansible-content-parser",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Tami Takamiya <ttakamiy@redhat.com>",
    "keywords": "ansible",
    "author": "",
    "author_email": "Tami Takamiya <ttakamiy@redhat.com>",
    "download_url": "https://files.pythonhosted.org/packages/4f/8c/b8a4424a65bcdda439e1f7b766651e6a02617ba9133a446ebf03683ddfec/ansible-content-parser-1.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache",
    "summary": "The ansible content parser",
    "version": "1.0.3",
    "project_urls": {
        "changelog": "https://github.com/ansible/ansible-content-parser/releases",
        "homepage": "https://github.com/ansible/ansible-content-parser/",
        "repository": "https://github.com/ansible/ansible-content-parser/"
    },
    "split_keywords": [
        "ansible"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5dd4a91e36763ce9536aae7c0d03afb43f3793e46b4e943d5600fb3e7d81536d",
                "md5": "ade375beec70d79e01f84a05861fef31",
                "sha256": "afbfa32ea2748163ed3720fcd52e53b021d13ba83dc40034bd7ed83a677250db"
            },
            "downloads": -1,
            "filename": "ansible_content_parser-1.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ade375beec70d79e01f84a05861fef31",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 23697,
            "upload_time": "2024-02-05T14:32:00",
            "upload_time_iso_8601": "2024-02-05T14:32:00.458805Z",
            "url": "https://files.pythonhosted.org/packages/5d/d4/a91e36763ce9536aae7c0d03afb43f3793e46b4e943d5600fb3e7d81536d/ansible_content_parser-1.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4f8cb8a4424a65bcdda439e1f7b766651e6a02617ba9133a446ebf03683ddfec",
                "md5": "a9d8b10b460291c45f0699ace631f2af",
                "sha256": "c9712f6ce2066dd074153dc00b1e811b76934af6c9f2629a0a8f92111655c8cf"
            },
            "downloads": -1,
            "filename": "ansible-content-parser-1.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "a9d8b10b460291c45f0699ace631f2af",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 39605,
            "upload_time": "2024-02-05T14:32:02",
            "upload_time_iso_8601": "2024-02-05T14:32:02.031629Z",
            "url": "https://files.pythonhosted.org/packages/4f/8c/b8a4424a65bcdda439e1f7b766651e6a02617ba9133a446ebf03683ddfec/ansible-content-parser-1.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-05 14:32:02",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ansible",
    "github_project": "ansible-content-parser",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [],
    "tox": true,
    "lcname": "ansible-content-parser"
}
        