ssc_codegen


| field           | value                                                                    |
|-----------------|--------------------------------------------------------------------------|
| Name            | ssc_codegen                                                              |
| Version         | 0.2.7                                                                    |
| Summary         | generate selector schemas classes from dsl-like language based on python |
| Upload time     | 2024-03-01 09:50:57                                                      |
| Author          | vypivshiy                                                                |
| Requires Python | >=3.10,<4.0                                                              |
| License         | MIT                                                                      |
| Docs URL        | None                                                                     |
| Requirements    | No requirements were recorded.                                           |

# Selector schema codegen

ssc_codegen is a generator of parsers for various programming languages (primarily for HTML). It uses
Python DSL configurations with a built-in declarative language.

It is designed for porting parsers to various programming languages and libraries.

# Motivation
- ~~interest in practicing writing a DSL-like language~~
- reduce boilerplate code for web parsers
- write once, then convert to other mainstream HTML parser libraries
- keep the set of operations minimal, so that other libraries and languages are easy to add in the future
  - includes CSS, XPath, attribute operations, regex, and minimal string formatting operations
- pre-validate CSS/XPath queries and logic before generating code
- standardisation: generate classes with minimal dependencies and a documented parsed signature

## Install

### pipx (recommended for CLI usage)

```shell
pipx install ssc_codegen
```

### pip

```shell
pip install ssc_codegen
```

## Supported libs and languages

| language | lib                                                          | xpath | css | formatter   |
|----------|--------------------------------------------------------------|-------|-----|-------------|
| python   | bs4                                                          | NO    | YES | black       |
| -        | parsel                                                       | YES   | YES | -           |
| -        | selectolax (modest)                                          | NO    | YES | -           |
| -        | scrapy (based on parsel, but class init argument - Response) | YES   | YES | -           |
| dart     | universal_html                                               | NO    | YES | dart format |

### Quickstart

See [example](example) and read the commented code.

### Recommendations

- Prefer CSS selectors: they are **guaranteed** to be convertible to XPath (for target languages that do not support CSS selectors).
- Prefer simple operations for better compatibility with other libraries.
  - Some libraries may not fully support selector specifications.
  - For example, the `#product_description+ p` selector works fine in `parsel`, but does not work in the `selectolax` and `dart` libraries.
- There is an XPath-to-CSS converter for simple queries, **without guarantees of functionality**.
  For example, CSS has no analogue of XPath's `contains`.
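
To illustrate why simple CSS selectors can always be expressed in XPath, here is a minimal sketch of such a conversion using only the standard library. It is not the converter used by ssc_codegen: the names `css_to_xpath` and `_predicates` are hypothetical, and it only handles tags, `#id`, `.class`, and the descendant, child (`>`) and adjacent sibling (`+`) combinators.

```python
import re

# One compound selector: an optional tag name followed by any number of #id / .class parts.
_COMPOUND = re.compile(r"(?P<tag>[A-Za-z][\w-]*|\*)?(?P<rest>(?:[#.][\w-]+)*)$")
_PART = re.compile(r"([#.])([\w-]+)")


def _predicates(rest: str) -> str:
    """Translate '#id' and '.class' parts into XPath predicates (hypothetical helper)."""
    out = []
    for kind, name in _PART.findall(rest):
        if kind == "#":
            out.append(f"[@id='{name}']")
        else:
            # Match one class token inside a space-separated class attribute.
            out.append(
                f"[contains(concat(' ', normalize-space(@class), ' '), ' {name} ')]"
            )
    return "".join(out)


def css_to_xpath(css: str) -> str:
    """Convert a simple CSS selector to XPath; raise ValueError on unsupported syntax."""
    # Surround '>' and '+' with spaces so that split() yields clean tokens.
    tokens = re.sub(r"\s*([>+])\s*", r" \1 ", css.strip()).split()
    xpath = ""
    combinator = " "  # descendant combinator, also used for the first compound
    for token in tokens:
        if token in (">", "+"):
            combinator = token
            continue
        match = _COMPOUND.match(token)
        if match is None:
            raise ValueError(f"unsupported selector part: {token!r}")
        tag = match.group("tag") or "*"
        preds = _predicates(match.group("rest"))
        if combinator == " ":
            xpath += f"//{tag}{preds}"
        elif combinator == ">":
            xpath += f"/{tag}{preds}"
        else:  # '+': the immediately following sibling, if it is a <tag>
            xpath += f"/following-sibling::*[1][self::{tag}]{preds}"
        combinator = " "
    return xpath


print(css_to_xpath("#product_description+ p"))
# //*[@id='product_description']/following-sibling::*[1][self::p]
```

The reverse direction is harder in general: an XPath predicate such as `contains(text(), "...")` has no CSS counterpart, which is why an XPath-to-CSS conversion cannot be guaranteed.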

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "ssc_codegen",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10,<4.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "vypivshiy",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/5c/aa/b2865ae632502ed8f2b05f4409b07ecd1a9e7b73d53a8b5bf41f9fd0a164/ssc_codegen-0.2.7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "generate selector schemas classes from dsl-like language based on python",
    "version": "0.2.7",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bef16855a90d37d74b93a7c532dd019afcaea856172d2bd87708c86b538b7c62",
                "md5": "2a56224ec63b38fa3154a6367432a40e",
                "sha256": "c17a61766f65e020aeed091ceed672f7b0bd90655e9b2eb98fa79fc2b915e793"
            },
            "downloads": -1,
            "filename": "ssc_codegen-0.2.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2a56224ec63b38fa3154a6367432a40e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10,<4.0",
            "size": 37781,
            "upload_time": "2024-03-01T09:50:53",
            "upload_time_iso_8601": "2024-03-01T09:50:53.670239Z",
            "url": "https://files.pythonhosted.org/packages/be/f1/6855a90d37d74b93a7c532dd019afcaea856172d2bd87708c86b538b7c62/ssc_codegen-0.2.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5caab2865ae632502ed8f2b05f4409b07ecd1a9e7b73d53a8b5bf41f9fd0a164",
                "md5": "899a60358807c4b545b9d417970b4a5d",
                "sha256": "97337e4a8b05f2c63ab57bd9edad0f7c9d786f631fd025432c90db84af8b26d6"
            },
            "downloads": -1,
            "filename": "ssc_codegen-0.2.7.tar.gz",
            "has_sig": false,
            "md5_digest": "899a60358807c4b545b9d417970b4a5d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10,<4.0",
            "size": 19276,
            "upload_time": "2024-03-01T09:50:57",
            "upload_time_iso_8601": "2024-03-01T09:50:57.975105Z",
            "url": "https://files.pythonhosted.org/packages/5c/aa/b2865ae632502ed8f2b05f4409b07ecd1a9e7b73d53a8b5bf41f9fd0a164/ssc_codegen-0.2.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-01 09:50:57",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "ssc_codegen"
}
        