# bigcode-astgen-py
Generates ASTs for Python files in a format compatible with the [150k Python Dataset][1].
The code is mostly copied from the [150k Python Dataset][1] and adapted to work with Python 3.
Note that this tool can only parse code that is valid for the Python version it runs on,
because it relies on Python's `ast` module, which uses the parser of the running interpreter.
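The version constraint above can be illustrated with the standard `ast` module alone. The following is a minimal sketch, not part of this tool; the helper name `can_parse` is hypothetical. It shows that a Python 2 style `print` statement is rejected when the running interpreter is Python 3.

```python
import ast
import sys


def can_parse(source):
    """Return True if the running interpreter's parser accepts `source`."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# Python 2 print statement: a syntax error for a Python 3 parser.
legacy = "print 'hello'"
# Function-call form: valid Python 3.
modern = "print('hello')"

print(sys.version_info[:2], can_parse(legacy), can_parse(modern))
```

Files that fail this kind of check are the ones that cannot be converted when the tool runs under that interpreter.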
## Install
This tool can be installed by running
```
pip install bigcode-astgen
```
or by fetching this repository and running
```
cd ast-generators/python
pip install .
```
## CLI usage
```
bigcode-astgen-py -o <output> <input>
```
`<input>` should be a path to a file, or a glob expression matching files.
### Normal mode
In normal mode, `<input>` is interpreted as a filename and the resulting AST
is written to `<output>` if provided, otherwise printed to `stdout`.
### Batch mode
In batch mode, `<input>` is interpreted as a glob, and all matching files
are parsed. `<output>` is a prefix and `<output>.json`, `<output>.txt` and
`<output>_failed.txt` files will be created.
* `<output>.json` - contains one JSON-formatted AST per line
* `<output>.txt` - contains a filename per line, in the same order as `<output>.json`
* `<output>_failed.txt` - contains a filename per line, with the reason why it could not be parsed
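Because `<output>.txt` and `<output>.json` list their entries in the same order, the two files can be read side by side. The sketch below assumes only what is stated above (one filename per line, one JSON document per line); the function name `load_batch_results` is hypothetical and not part of the package.

```python
import json


def load_batch_results(prefix):
    """Yield (filename, parsed_ast) pairs from `<prefix>.txt` and `<prefix>.json`."""
    with open(prefix + ".txt", encoding="utf-8") as names, \
            open(prefix + ".json", encoding="utf-8") as asts:
        # Line i of the .txt file names the source of line i of the .json file.
        for name, line in zip(names, asts):
            yield name.rstrip("\n"), json.loads(line)
```

For a batch run with `-o result/asts`, iterating over `load_batch_results("result/asts")` would yield each source filename together with its decoded AST.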
### Example
#### Normal mode
```
bigcode-astgen-py bigcode_astgen/normalizer.py
```
parses `bigcode_astgen/normalizer.py` and prints the result to stdout.
#### Batch mode
```
bigcode-astgen-py --batch -o result/asts "src/**/*.py"
```
parses all `.py` files under the `src` directory and writes the results to the `result` directory
with the prefix `asts`.
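The pattern in the batch example is quoted so that the shell passes it to the tool unexpanded. How the tool expands it internally is not documented here; as an assumption, the sketch below uses the standard `glob` module with `recursive=True` to show which files a pattern such as `src/**/*.py` would match. The helper `match_python_files` is hypothetical.

```python
import glob
import os
import tempfile


def match_python_files(pattern):
    """Return the sorted paths matched by `pattern`; `**` spans nested directories."""
    return sorted(glob.glob(pattern, recursive=True))


# Build a small throwaway tree to demonstrate the matching.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src", "pkg"))
    for rel in ("src/a.py", "src/pkg/b.py", "src/notes.txt"):
        open(os.path.join(root, *rel.split("/")), "w").close()
    matches = match_python_files(os.path.join(root, "src", "**", "*.py"))
    relative = [os.path.relpath(p, root).replace(os.sep, "/") for p in matches]
    print(relative)
```

With this kind of expansion, `.py` files directly in `src` and in any nested directory are matched, while other files are ignored.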
## Python API
### `bigcode_astgen.ast_generator.parse_string`
Returns the AST nodes of the given string
Args:
* `content`: string containing a Python program
### `bigcode_astgen.ast_generator.parse_file`
Returns the AST nodes of the given file
Args:
* `filename`: path to a file containing a Python program
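Both functions above return AST nodes. Assuming the result follows the 150k Python Dataset layout (a flat list of nodes, each with a `type`, an optional `value`, and an optional `children` list of indices into the same list), it can be rendered as an indented tree. The helper `render_tree` and the sample data below are hypothetical: the sample is written by hand to illustrate the assumed layout and is not actual output of this tool.

```python
def render_tree(nodes, index=0, depth=0):
    """Return indented text lines for the subtree rooted at nodes[index]."""
    node = nodes[index]
    label = node["type"]
    if "value" in node:
        label += ": " + str(node["value"])
    lines = ["  " * depth + label]
    # Children are stored as indices into the same flat list.
    for child in node.get("children", []):
        lines.extend(render_tree(nodes, child, depth + 1))
    return lines


# Hand-written sample in the assumed layout (not real tool output).
sample = [
    {"type": "Module", "children": [1]},
    {"type": "Assign", "children": [2, 3]},
    {"type": "NameStore", "value": "x"},
    {"type": "Num", "value": "1"},
]

print("\n".join(render_tree(sample)))
```

If the package is installed and its output matches this layout, the value returned by `parse_string` or `parse_file` could be passed in place of `sample`.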
### `bigcode_astgen.ast_bulk_processor.process_files`
Processes all files matching `files_pattern` and writes the results to files prefixed with `output`
Args:
* `files_pattern`: a glob pattern matching Python files
* `output`: the path prefix (filename without extension) used for the result files
## License
I could not find the license of the [150k Python Dataset][1] source code from which
`bigcode_astgen/ast_generator.py` is copied.
Therefore, until further notice, this project is not released under the MIT license, unlike the rest of the repository.
[1]: http://www.srl.inf.ethz.ch/py150.php
Raw data
{
"_id": null,
"home_page": "https://github.com/tuvistavie/bigcode-tools/tree/master/bigcode-astgen/python",
"name": "bigcode-astgen",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "Daniel Perez",
"author_email": "tuvistavie@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/4c/b4/9a64c11d1833baddc9abeff5735972faafbf7bd22b11357d27cd76390365/bigcode-astgen-0.2.1.tar.gz",
"platform": "",
"bugtrack_url": null,
"license": "",
"summary": "Tool to search and fetch code from GitHub",
"version": "0.2.1",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b9725f9282ca476f917a969fe1ebf6f0ce947aef99b312430a96a26a7bc439c5",
"md5": "2e77ef0a2ac79a76b8684f0679fb0023",
"sha256": "9c50ee77bd90e5031b35180c852a48210ef6e899faa8f134dfbb7eaf3c8a4362"
},
"downloads": -1,
"filename": "bigcode_astgen-0.2.1-py3-none-any.whl",
"has_sig": true,
"md5_digest": "2e77ef0a2ac79a76b8684f0679fb0023",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 11944,
"upload_time": "2017-11-03T06:16:16",
"upload_time_iso_8601": "2017-11-03T06:16:16.109824Z",
"url": "https://files.pythonhosted.org/packages/b9/72/5f9282ca476f917a969fe1ebf6f0ce947aef99b312430a96a26a7bc439c5/bigcode_astgen-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "4cb49a64c11d1833baddc9abeff5735972faafbf7bd22b11357d27cd76390365",
"md5": "44687b963c7845e66a62fc23f15a7869",
"sha256": "fbb3371c2a8ba7198d4ecd6b92e62dfe67859b5911f99a6087acced2bb05f838"
},
"downloads": -1,
"filename": "bigcode-astgen-0.2.1.tar.gz",
"has_sig": true,
"md5_digest": "44687b963c7845e66a62fc23f15a7869",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 8111,
"upload_time": "2017-11-03T06:16:17",
"upload_time_iso_8601": "2017-11-03T06:16:17.657188Z",
"url": "https://files.pythonhosted.org/packages/4c/b4/9a64c11d1833baddc9abeff5735972faafbf7bd22b11357d27cd76390365/bigcode-astgen-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2017-11-03 06:16:17",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "bigcode-astgen"
}