:Name: siggen
:Version: 2.2.20241029
:Summary: Socorro signature generation extracted as a Python library
:Upload time: 2024-10-30 01:00:37
:Author: Will Kahn-Greene
:Requires Python: >=3.9
:License: MPLv2
:Keywords: socorro

==============
socorro-siggen
==============

This is an extraction of the Socorro crash signature generation code.

:Code: https://github.com/willkg/socorro-siggen
:Documentation: Check the ``README.rst`` file
:Changelog: Check the ``HISTORY.rst`` file
:Issue tracker: https://github.com/willkg/socorro-siggen/issues
:License: MPLv2
:Status: Stable
:Community Participation Guidelines: `<https://github.com/willkg/socorro-siggen/blob/main/CODE_OF_CONDUCT.md>`_


Installing
==========

socorro-siggen is available on `PyPI <https://pypi.org/project/siggen/>`_. You
can install for library usage with::

    $ pip install siggen

You can install for CLI usage with::

    $ pip install 'siggen[cli]'

To hack on siggen, install the development requirements from a checkout of the
repository::

    $ pip install -r requirements-dev.txt


Versioning
==========

siggen is an extraction of the signature generation code in Socorro. If you are
running signature generation on crash data and you want signatures to match
equivalent crash reports in Socorro, then you need to keep siggen up-to-date.

siggen uses a calendar versioning (calver) scheme, ``MAJOR.MINOR.yyyymmdd``:

* MAJOR: indicates incompatible API changes -- listed as "big changes" in
  HISTORY.rst
* MINOR: indicates changes that are backwards-compatible
* yyyymmdd: the release date
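
As a rough illustration of the scheme above, a version string can be split
into its three parts. This is a minimal sketch using only the standard
library; ``SiggenVersion`` and ``parse_siggen_version`` are hypothetical names
made up for this example and are not part of siggen.

```python
from datetime import date, datetime
from typing import NamedTuple


class SiggenVersion(NamedTuple):
    """Hypothetical container for a MAJOR.MINOR.yyyymmdd version."""

    major: int
    minor: int
    released: date


def parse_siggen_version(version: str) -> SiggenVersion:
    """Split a MAJOR.MINOR.yyyymmdd string into its parts."""
    major, minor, stamp = version.split(".")
    # strptime raises ValueError if the last part is not a valid yyyymmdd date
    released = datetime.strptime(stamp, "%Y%m%d").date()
    return SiggenVersion(int(major), int(minor), released)


parsed = parse_siggen_version("2.2.20241029")
print(parsed.major, parsed.minor, parsed.released.isoformat())
# prints: 2 2 2024-10-29
```

Comparing the ``released`` dates of two versions gives a quick sense of how far
behind an installed copy is.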


Basic use
=========

Use it on the command line for signature generation debugging
-------------------------------------------------------------

siggen comes with several command line tools for signature generation.

``signify``
    Takes a signature generation crash data file via stdin, runs signature
    generation, and prints the output.

    This is helpful for generating signatures for crash data.

    Usage::

        $ signify --help

    Example::

        $ fetch-data 04e52a99-67d4-4d19-ad21-e29d10220905 > crash_data.json
        $ cat crash_data.json | signify

    If you pass in the ``--verbose`` flag, you'll get verbose output about
    how the signature was generated.

``fetch-data``
    Downloads processed crash data from Crash Stats and converts it to the
    signature generation crash data format.

    Usage::

        $ fetch-data --help

    Example::

        $ fetch-data 04e52a99-67d4-4d19-ad21-e29d10220905 > crash_data.json

``signature``
    Downloads processed crash data from Crash Stats, converts it to signature
    generation crash data format, and generates a signature.

    This also tells you whether the new signature matches the old one.

    This is helpful for making adjustments to the signature lists and debugging
    signature generation problems.

    Usage::

        $ signature --help

    Example::

        $ signature 04e52a99-67d4-4d19-ad21-e29d10220905


Use it as a library
-------------------

You can use socorro-siggen as a library::

    from siggen.generator import SignatureGenerator

    generator = SignatureGenerator()

    # Fill this in following the signature generation crash data schema
    # described below.
    crash_data = {
        ...
    }

    ret = generator.generate(crash_data)
    print(ret['signature'])


Things to know
==============

Things to know about siggen:

1. Make sure to use the latest version of siggen and update frequently.

2. Signatures generated will change between siggen versions. The API may be
   stable, but bug fixes and changes to the siglist files will affect signature
   generation output. Hopefully for the better!

3. If you have problems, please open up an issue. Please include the version of
   siggen.

   When using siggen, you can find the version like this::

       import siggen
       print(siggen.__version__)
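
When filing an issue, the version can also be read from the installed package
metadata without importing siggen. This is a minimal sketch using the standard
library's ``importlib.metadata``; the helper name ``installed_version`` is made
up for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "siggen") -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when siggen is not installed
print(installed_version())
```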


Signature generation crash data schema
======================================

This is the schema for the signature generation crash data structure::

  {
    crashing_thread: <int or null>,    // Optional, The index of the crashing thread in threads.
                                       // This defaults to None, which indicates there was no
                                       // crashing thread identified in the crash report.

    threads: [                         // Optional, list of stack traces for C/C++/Rust code.
      {
        frames: [                      // List of one or more frames.
          {
            function: <string>,        // Optional, The name of the function.
                                       // If this is ``None`` or not in the frame, then signature
                                       // generation will calculate something using other data in
                                       // the frame.

            module: <string>,          // Optional, name of the module
            file: <string>,            // Optional, name of the file
            line: <int>,               // Optional, line in the file
            module_offset: <string>,   // Optional, offset in hex in the module for this frame
            offset: <string>           // Optional, offset in hex for this frame

                                       // Signature parts are computed using frame data in this
                                       // order:

                                       // 1. if there's a function (and optionally line)--use
                                       //    that
                                       // 2. if there's a file and a line--use that
                                       // 3. if there's an offset and no module/module_offset--use
                                       //    that
                                       // 4. use module/module_offset
          }
          // ... additional frames
        ],

        thread_name: <string>,         // Optional, The name of the thread.
                                       // This isn't used yet, but might be in the future for
                                       // debugging purposes.

        frame_count: <int>             // Optional, This is the total number of frames. This
                                       // isn't used.
      },
      // ... additional threads
    ],

    java_stack_trace: <string>,        // Optional, If the crash is a Java crash, then this will
                                       // be the Java traceback as a single string. Signature
                                       // generation will split this string into lines and then
                                       // extract frame information from it to generate the
                                       // signature.

                                       // FIXME(willkg): Write up better description of this.

    oom_allocation_size: <int>,        // Optional, The allocation size that triggered an
                                       // out-of-memory error. This will get added to the
                                       // signature if one of the indicator functions appears in
                                       // the stack of the crashing thread.

    abort_message: <string>,           // Optional, The abort message for the crash, if there is
                                       // one. This is added to the beginning of the signature.

    hang_type: <int>,                  // Optional.
                                       // 1 here indicates this is a chrome hang and we look at
                                       // thread 0 for generation.
                                       // -1 indicates another kind of hang.

    async_shutdown_timeout: <text>,    // Optional, This is a text field encoded in JSON with
                                       // "phase" and "conditions" keys.
                                       // FIXME(willkg): Document this structure better.

    jit_category: <string>,            // Optional, If there's a JIT classification in the
                                       // crash, then that will override the signature.

    ipc_channel_error: <string>,       // Optional, If there is an IPC channel error, it
                                       // replaces the signature.

    ipc_message_name: <string>,        // Optional, This gets added to the signature if there
                                       // was an IPC message name in the crash.

    additional_minidumps: <string>,    // Optional, A crash report can contain multiple minidumps.
                                       // This is a comma-delimited list of minidumps other than
                                       // the main one that the crash had.

                                       // Example: "browser,flash1,flash2,content"

    mdsw_status_string: <string>,      // Optional, Socorro-generated
                                       // This is the minidump-stackwalk status string. This
                                       // gets generated when the Socorro processor runs the
                                       // minidump through minidump-stackwalk. If you're not
                                       // using minidump-stackwalk, you can ignore this.

    reason: <string>,                  // Optional, The crash_info type value. This can indicate
                                       // the crash was an OOM.

    moz_crash_reason: <string>,        // Optional, This is the MOZ_CRASH_REASON value. This
                                       // doesn't affect anything unless the value is
                                       // "MOZ_RELEASE_ASSERT(parentBuildID == childBuildID)".

    os: <string>,                      // Optional, The name of the operating system. This
                                       // doesn't affect anything unless the name is "Windows
                                       // NT" in which case it will lowercase module names when
                                       // iterating through frames to build the signature.
  }


Missing keys in the structure are treated as ``None``, so you can pass in a
minimal structure with just the parts you define.
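
The frame-data order described in the schema comments can be sketched in
Python. This is an illustration only: ``frame_part`` is a hypothetical helper,
the returned string formats are assumptions, and siggen's real implementation
also does work that is not modeled here, such as consulting the siglist files.

```python
from typing import Any, Dict, Optional


def frame_part(frame: Dict[str, Any]) -> Optional[str]:
    """Pick the data used for one frame, following the order in the schema.

    The returned string formats are illustrative assumptions, not siggen output.
    """
    # Missing keys behave like None, so .get() is used throughout.
    function = frame.get("function")
    file, line = frame.get("file"), frame.get("line")
    module, module_offset = frame.get("module"), frame.get("module_offset")
    offset = frame.get("offset")

    # 1. a function (the optional line is ignored in this sketch)
    if function:
        return function
    # 2. a file and a line
    if file and line is not None:
        return f"{file}#{line}"
    # 3. an offset, when there is no module/module_offset
    if offset and not module and not module_offset:
        return f"@{offset}"
    # 4. module/module_offset
    if module or module_offset:
        return f"{module or ''}@{module_offset or ''}"
    return None


print(frame_part({"function": "SomeFunc", "file": "somefile.cpp", "line": 20}))
print(frame_part({"file": "somefile.cpp", "line": 20}))
print(frame_part({"module": "bar.so", "module_offset": "0x39a55"}))
```

The sketch returns ``None`` for an empty frame; what siggen does in that case
is not covered here.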


Examples
========

An almost minimal, somewhat nonsensical example ``crash_data.json``::

    {
        "os": "Linux",
        "crashing_thread": 0,
        "threads": [
            {
                "frames": [
                    {
                        "frame": 0,
                        "function": "SomeFunc",
                        "line": 20,
                        "file": "somefile.cpp",
                        "module": "foo.so.5.15.0",
                        "module_offset": "0x37a92",
                        "offset": "0x7fc641052a92"
                    },
                    {
                        "frame": 1,
                        "function": "SomeOtherFunc",
                        "line": 444,
                        "file": "someotherfile.cpp",
                        "module": "bar.so",
                        "module_offset": "0x39a55",
                        "offset": "0x7fc641044a55"
                    }
                ]
            }
        ]
    }


That produces this output::

    $ cat crash_data.json | signify
    {
      "notes": [],
      "proto_signature": "SomeFunc | SomeOtherFunc",
      "signature": "SomeFunc"
    }
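
Because the output shown above is JSON, a script can consume it with the
standard library. This is a minimal sketch that parses a copy of that output;
splitting ``proto_signature`` on ``" | "`` is an assumption based on this one
example rather than a documented format.

```python
import json

# Output copied from the signify example above
signify_output = """
{
  "notes": [],
  "proto_signature": "SomeFunc | SomeOtherFunc",
  "signature": "SomeFunc"
}
"""

result = json.loads(signify_output)

# Assumption: parts of the proto signature are separated by " | "
frames = result["proto_signature"].split(" | ")

print(result["signature"])  # SomeFunc
print(frames)  # ['SomeFunc', 'SomeOtherFunc']
```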

            
