# pyanalyze
Pyanalyze is a tool for programmatically detecting common mistakes in Python code, such as references to undefined variables and type errors.
It can be extended to add additional rules and perform checks specific to particular functions.
Some use cases for this tool include:
- **Catching bugs before they reach production**. Pyanalyze catches accidental mistakes like writing `collections.defalutdict` instead of `collections.defaultdict`, so that they won't cause errors in production. Other categories of bugs it can find include variables that may be undefined at runtime, duplicate keys in dict literals, and missing `await` keywords.
- **Making refactoring easier**. When you make a change like removing an object attribute or moving a class from one file to another, pyanalyze will often be able to flag code that you forgot to change.
- **Finding dead code**. It has an option for finding Python objects (functions and classes) that are not used anywhere in the codebase.
- **Checking type annotations**. Type annotations are useful as documentation for readers of code, but only when they are actually correct. Although pyanalyze does not support the full Python type system (see [the type system documentation](https://pyanalyze.readthedocs.io/en/latest/typesystem.html) for details), it can often detect incorrect type annotations.
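
To make the first use case concrete, here is a minimal sketch of three of the mistakes listed above. The function and variable names are hypothetical, and the exact messages pyanalyze prints for them are not reproduced here. Each mistake is silent when the module is imported and only fails (or quietly misbehaves) once the code runs, which is why a static check is useful.

```python
import collections


def count_words(words):
    # Typo: the real name is `collections.defaultdict`. Python only notices
    # when this line executes, raising AttributeError.
    counts = collections.defalutdict(int)
    for word in words:
        counts[word] += 1
    return counts


def describe(flag):
    if flag:
        message = "enabled"
    # `message` is unbound when `flag` is falsy, so this line can raise
    # UnboundLocalError at runtime.
    return message


# Duplicate key in a dict literal: the second "timeout" silently wins.
SETTINGS = {"timeout": 10, "retries": 3, "timeout": 30}
```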
## Usage
You can install pyanalyze with:
```bash
$ pip install pyanalyze
```
Once it is installed, you can run pyanalyze on a Python file or package as follows:
```bash
$ python -m pyanalyze file.py
$ python -m pyanalyze package/
```
Note that pyanalyze will try to import every Python file it is passed. If you have scripts that perform operations at import time without an `if __name__ == "__main__":` guard, pyanalyze may end up executing them.
In order to run successfully, pyanalyze needs to be able to import the code it checks. To make this work you may have to manually adjust Python's import path using the `$PYTHONPATH` environment variable.
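
Because pyanalyze imports the files it checks, any module-level statement runs during analysis. The following is a minimal sketch of a script that is safe to import. The `cleanup_temp_files` function and the `myapp-scratch` directory name are hypothetical.

```python
import tempfile
from pathlib import Path


def cleanup_temp_files(directory: Path) -> int:
    """Delete *.tmp files in `directory` and return how many were removed."""
    if not directory.is_dir():
        return 0
    removed = 0
    for path in directory.glob("*.tmp"):
        path.unlink()
        removed += 1
    return removed


def main() -> None:
    removed = cleanup_temp_files(Path(tempfile.gettempdir()) / "myapp-scratch")
    print(f"removed {removed} files")


# Without this guard, merely importing the module would delete files.
if __name__ == "__main__":
    main()
```

With the guard in place, importing the module only defines the two functions; the destructive work happens only when the file is run as a script.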
### Configuration
Pyanalyze has a number of command-line options, which you can see by running `python -m pyanalyze --help`. Important ones include `-f`, which runs an interactive prompt that lets you examine and fix each error found by pyanalyze, and `--enable`/`--disable`, which enable and disable specific error codes.
Configuration through a `pyproject.toml` file is also supported. See
[the documentation](https://pyanalyze.readthedocs.io/en/latest/configuration.html) for
details.
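
If you drive pyanalyze from a script (for example in CI), the command line can be assembled programmatically. This is a minimal sketch: `build_pyanalyze_command` is a hypothetical helper, and it assumes that each `--enable`/`--disable` flag is followed by a single error code name, which you should confirm against `python -m pyanalyze --help`.

```python
import sys
from typing import List, Sequence


def build_pyanalyze_command(
    paths: Sequence[str],
    enable: Sequence[str] = (),
    disable: Sequence[str] = (),
) -> List[str]:
    """Return an argv list that runs pyanalyze with the current interpreter."""
    command = [sys.executable, "-m", "pyanalyze"]
    for code in enable:
        # Assumption: one error code name per --enable flag.
        command += ["--enable", code]
    for code in disable:
        # Assumption: one error code name per --disable flag.
        command += ["--disable", code]
    command += list(paths)
    return command
```

The resulting list can be passed to `subprocess.run()`. Using `sys.executable` keeps the analysis on the same interpreter that runs the calling script.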
### Extending pyanalyze
The main way to extend pyanalyze is by providing a specification for a particular function. This allows you to run arbitrary code that inspects the arguments to the function and raises errors if something is wrong.
As an example, suppose your codebase contains a function `database.run_query()` that takes as an argument a SQL string, like this:
```python
database.run_query("SELECT answer, question FROM content")
```
You want to detect when a call to `run_query()` contains syntactically invalid SQL or refers to a non-existent table or column. You could set that up with code like this:
```python
from pyanalyze.error_code import ErrorCode
from pyanalyze.signature import CallContext, Signature, SigParameter
from pyanalyze.value import KnownValue, TypedValue, AnyValue, AnySource, Value

from database import run_query, parse_sql


def run_query_impl(ctx: CallContext) -> Value:
    sql = ctx.vars["sql"]
    if not isinstance(sql, KnownValue) or not isinstance(sql.val, str):
        ctx.show_error(
            "Argument to run_query() must be a string literal",
            ErrorCode.incompatible_call,
        )
        return AnyValue(AnySource.error)

    try:
        # Pass the underlying string, not the KnownValue wrapper
        parsed = parse_sql(sql.val)
    except ValueError as e:
        ctx.show_error(
            f"Invalid sql passed to run_query(): {e}",
            ErrorCode.incompatible_call,
        )
        return AnyValue(AnySource.error)

    # check that the parsed SQL is valid...

    # pyanalyze will use this as the inferred return type for the function
    return TypedValue(list)


# in pyproject.toml, set:
# known_signatures = ["<module>.get_known_argspecs"]
def get_known_argspecs(arg_spec_cache):
    return {
        # This infers the parameter types and names from the function signature
        run_query: arg_spec_cache.get_argspec(
            run_query, impl=run_query_impl
        ),
        # Alternatively, you can write the signature manually. Use only one
        # entry per function: a repeated key would override the first one.
        # run_query: Signature.make(
        #     [SigParameter("sql", annotation=TypedValue(str))],
        #     callable=run_query,
        #     impl=run_query_impl,
        # ),
    }
```
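
The example imports `run_query` and `parse_sql` from a `database` module that is not shown. To experiment with the pattern, a stand-in module could look like the sketch below. Everything in it is hypothetical, including the `SCHEMA` table: the real `parse_sql` is assumed only to accept a SQL string and raise `ValueError` on invalid input, as the `except ValueError` clause above implies.

```python
import re
from typing import Dict, List

# Hypothetical schema: table name -> known columns.
SCHEMA: Dict[str, List[str]] = {"content": ["answer", "question"]}

# Matches only the simplest form: SELECT col[, col...] FROM table
_SELECT_RE = re.compile(
    r"^\s*SELECT\s+(?P<columns>[\w\s,]+?)\s+FROM\s+(?P<table>\w+)\s*;?\s*$",
    re.IGNORECASE,
)


def parse_sql(sql: str) -> Dict[str, object]:
    """Parse a very small subset of SELECT statements or raise ValueError."""
    match = _SELECT_RE.match(sql)
    if match is None:
        raise ValueError(f"cannot parse {sql!r}")
    table = match.group("table")
    columns = [c.strip() for c in match.group("columns").split(",")]
    if table not in SCHEMA:
        raise ValueError(f"unknown table {table!r}")
    unknown = [c for c in columns if c not in SCHEMA[table]]
    if unknown:
        raise ValueError(f"unknown columns {unknown!r} in table {table!r}")
    return {"table": table, "columns": columns}


def run_query(sql: str) -> list:
    """Stand-in that validates the query and returns no rows."""
    parse_sql(sql)
    return []
```

Because the stand-in raises `ValueError` for syntax errors as well as unknown tables and columns, a single `except ValueError` in the implementation function covers all three cases.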
### Displaying and checking the type of an expression
You can use `typing_extensions.reveal_type(expr)` to display the type pyanalyze infers for an expression. This can be
useful to understand errors or to debug why pyanalyze does not catch a particular issue. For example:
```python
from typing_extensions import reveal_type
reveal_type(1) # Revealed type is 'Literal[1]' (code: reveal_type)
```
This function is also considered a builtin while type checking, so you can use `reveal_type()` in code that is type checked but not run.
For callable objects, `reveal_type()` will also display the signature inferred by pyanalyze:
```python
from typing_extensions import reveal_type
reveal_type(reveal_type) # Revealed type is 'Literal[<function reveal_type at 0x104bf55e0>]', signature is (value, /) -> None (code: reveal_type)
```
A similar function, `pyanalyze.dump_value`, can be used to get lower-level details of the `Value` object pyanalyze infers for an expression.
You can also use `pyanalyze.assert_is_value` to assert that pyanalyze infers a particular value for
an expression. This requires importing the appropriate `Value` subclass from `pyanalyze.value`. For example:
```python
from pyanalyze import assert_is_value
from pyanalyze.value import KnownValue
assert_is_value(1, KnownValue(1)) # succeeds
assert_is_value(int("2"), KnownValue(1)) # Bad value inference: expected KnownValue(val=1), got TypedValue(typ=<class 'int'>) (code: inference_failure)
```
This function is mostly useful when writing unit tests for pyanalyze or an extension.
### Ignoring errors
Sometimes pyanalyze gets things wrong and you need to ignore an error it emits. This can be done as follows:
- Add `# static analysis: ignore` on a line by itself before the line that generates the error.
- Add `# static analysis: ignore` at the end of the line that generates the error.
- Add `# static analysis: ignore` at the top of the file; this will ignore errors in the entire file.
You can add an error code, like `# static analysis: ignore[undefined_name]`, to ignore only a specific error code. This does not work for whole-file ignores. If the `bare_ignore` error code is turned on, pyanalyze will emit an error if you don't specify an error code on an ignore comment.
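
The two line-level placements can be sketched as follows. `PLUGIN_REGISTRY`, `lookup`, and `has_plugin` are hypothetical names chosen for illustration, and whether pyanalyze would actually report an error on these particular lines is an assumption; the point is only where the comments go.

```python
# Hypothetical: a name injected into the module namespace dynamically.
globals()["PLUGIN_REGISTRY"] = {"csv": "CsvPlugin"}


def lookup(kind):
    # Placement 1: on a line by itself, directly before the flagged line.
    # static analysis: ignore
    return PLUGIN_REGISTRY[kind]


def has_plugin(kind):
    # Placement 2: at the end of the flagged line, limited to one error code.
    return kind in PLUGIN_REGISTRY  # static analysis: ignore[undefined_name]
```

Preferring the form with an explicit error code keeps the suppression narrow, so unrelated problems on the same line are still reported.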
### Python version support
Pyanalyze 0.12.0 requires Python 3.8 or newer. Because it imports the code it checks, you have to run it using the same version of Python you use to run your code.
## Contributing
We welcome your contributions. See [CONTRIBUTING.md](https://github.com/quora/pyanalyze/blob/master/CONTRIBUTING.md)
for how to get started.
## Documentation
Documentation is available at [ReadTheDocs](https://pyanalyze.readthedocs.io/en/latest/)
or on [GitHub](https://github.com/quora/pyanalyze/tree/master/docs).
Raw data
{
"_id": null,
"home_page": "https://github.com/quora/pyanalyze",
"name": "pyanalyze",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "quora static analysis",
"author": "Quora, Inc.",
"author_email": "jelle@quora.com",
"download_url": "https://files.pythonhosted.org/packages/24/84/9b0b34dce4471f1f647967ef242526e6119d2a3617bba06f3b8ad1c040eb/pyanalyze-0.12.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache Software License",
"summary": "A static analyzer for Python",
"version": "0.12.0",
"project_urls": {
"Homepage": "https://github.com/quora/pyanalyze"
},
"split_keywords": [
"quora",
"static",
"analysis"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "2e9a958d3939bfe9765c4d5d301c7c0ef5e968b642ac84cf5e327ff9b2dff509",
"md5": "9a424aa6dc989bc5077dc9a4a292d017",
"sha256": "d33b9fa78f77d4231b9ef0188a781d34f55cb70514f83f20b9d5bf3abdccb522"
},
"downloads": -1,
"filename": "pyanalyze-0.12.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9a424aa6dc989bc5077dc9a4a292d017",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 365181,
"upload_time": "2024-02-25T22:59:59",
"upload_time_iso_8601": "2024-02-25T22:59:59.301320Z",
"url": "https://files.pythonhosted.org/packages/2e/9a/958d3939bfe9765c4d5d301c7c0ef5e968b642ac84cf5e327ff9b2dff509/pyanalyze-0.12.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "24849b0b34dce4471f1f647967ef242526e6119d2a3617bba06f3b8ad1c040eb",
"md5": "a6914035f28e9388359d4215da11e44d",
"sha256": "42db061d254ad830641ab892f7c456d53deffbc7327136647ed730016df056b7"
},
"downloads": -1,
"filename": "pyanalyze-0.12.0.tar.gz",
"has_sig": false,
"md5_digest": "a6914035f28e9388359d4215da11e44d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 328034,
"upload_time": "2024-02-25T23:00:02",
"upload_time_iso_8601": "2024-02-25T23:00:02.207736Z",
"url": "https://files.pythonhosted.org/packages/24/84/9b0b34dce4471f1f647967ef242526e6119d2a3617bba06f3b8ad1c040eb/pyanalyze-0.12.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-02-25 23:00:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "quora",
"github_project": "pyanalyze",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "asynq",
"specs": []
},
{
"name": "qcore",
"specs": [
[
">=",
"0.5.1"
]
]
},
{
"name": "ast_decompiler",
"specs": [
[
">=",
"0.4.0"
]
]
},
{
"name": "typeshed_client",
"specs": [
[
">=",
"2.0.0"
]
]
},
{
"name": "typing_extensions",
"specs": [
[
">=",
"4.1.1"
]
]
},
{
"name": "aenum",
"specs": [
[
">=",
"2.2.3"
]
]
},
{
"name": "codemod",
"specs": []
},
{
"name": "myst-parser",
"specs": [
[
"==",
"2.0.0"
]
]
},
{
"name": "Sphinx",
"specs": [
[
"==",
"7.2.5"
]
]
},
{
"name": "black",
"specs": [
[
"==",
"24.2.0"
]
]
},
{
"name": "flake8",
"specs": [
[
"==",
"7.0.0"
]
]
},
{
"name": "usort",
"specs": [
[
"==",
"1.0.8.post1"
]
]
}
],
"tox": true,
"lcname": "pyanalyze"
}