semantic-copycat-purl2src


- **Name**: semantic-copycat-purl2src
- **Version**: 1.1.2
- **Summary**: Translate Package URLs (PURLs) into validated download URLs for source code artifacts
- **Upload time**: 2025-09-06 03:21:06
- **Requires Python**: >=3.8
- **License**: Apache-2.0
- **Keywords**: purl, package-url, package-manager, source-code, license-compliance, ai-detection, semantic-analysis, semantic-copycat, code-copycat
- **Requirements**: click, requests, urllib3

# PURL2SRC - Package URL (PURL) to Source

Translate Package URLs (PURLs) into validated download URLs for source code artifacts.

## Features

- **Multi-ecosystem support**: NPM, PyPI, Cargo, NuGet, GitHub, Maven, RubyGems, Go, Conda, and more
- **Three-level resolution strategy**:
  1. Direct URL construction based on known patterns
  2. Package registry API queries
  3. Local package manager fallback
- **URL validation**: Verify download URLs are accessible
- **Batch processing**: Process multiple PURLs from files
- **Multiple output formats**: JSON, CSV, or plain text
- **Extensible architecture**: Easy to add new package ecosystems
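
The three-level resolution strategy can be pictured as a fallback chain: each level is tried in order, and the first one that yields a URL wins. The sketch below is illustrative only. The function names `resolve_with_fallback`, `direct_pattern`, `registry_api`, and `local_package_manager` are hypothetical stand-ins, not the package's internal API.

```python
from typing import Callable, Optional, Sequence

# A resolver takes a PURL string and returns a URL, or None if it cannot help.
Resolver = Callable[[str], Optional[str]]


def resolve_with_fallback(purl: str, resolvers: Sequence[Resolver]) -> Optional[str]:
    """Try each resolver in order and return the first URL found."""
    for resolver in resolvers:
        try:
            url = resolver(purl)
        except Exception:
            # A failing level should not abort the whole chain.
            url = None
        if url:
            return url
    return None


# Hypothetical stand-ins for the three levels (not the package's real code).
def direct_pattern(purl: str) -> Optional[str]:
    if purl == "pkg:npm/express@4.17.1":
        return "https://registry.npmjs.org/express/-/express-4.17.1.tgz"
    return None


def registry_api(purl: str) -> Optional[str]:
    raise ConnectionError("registry unreachable in this sketch")


def local_package_manager(purl: str) -> Optional[str]:
    return None


LEVELS = [direct_pattern, registry_api, local_package_manager]
print(resolve_with_fallback("pkg:npm/express@4.17.1", LEVELS))
```

Ordering the levels from cheapest to most expensive means a network call or a local tool invocation happens only when pattern-based construction cannot produce an answer.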

## Installation

```bash
pip install semantic-copycat-purl2src
```

## Usage

### Command Line

```bash
# Single PURL (default text output)
purl2src "pkg:npm/express@4.17.1"
# Output: pkg:npm/express@4.17.1 -> https://registry.npmjs.org/express/-/express-4.17.1.tgz

# JSON output format
purl2src "pkg:npm/express@4.17.1" --format json

# With validation
purl2src "pkg:pypi/requests@2.28.0" --validate

# Batch processing from file
purl2src -f purls.txt --output results.json

# Batch processing with JSON to stdout
purl2src -f purls.txt --format json
```
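
For readers who want to see what batch processing with several output formats might involve, here is a minimal Python sketch. It assumes an input file with one PURL per line, and it treats blank lines and `#` comments as skippable; both conventions are assumptions, not documented behavior. The helpers `read_purls`, `resolve_batch`, and `render` are hypothetical names, and `resolve` is any callable that maps a PURL to a URL.

```python
import csv
import io
import json
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Row = Dict[str, Optional[str]]


def read_purls(lines: Iterable[str]) -> Iterator[str]:
    """Yield PURLs, skipping blank lines and '#' comments (assumed convention)."""
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            yield line


def resolve_batch(purls: Iterable[str], resolve: Callable[[str], Optional[str]]) -> List[Row]:
    """Resolve every PURL and collect one row per input."""
    return [{"purl": purl, "download_url": resolve(purl)} for purl in purls]


def render(rows: List[Row], fmt: str = "text") -> str:
    """Render rows as plain text, JSON, or CSV."""
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=["purl", "download_url"], lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    if fmt == "text":
        return "\n".join(f"{row['purl']} -> {row['download_url']}" for row in rows)
    raise ValueError(f"unsupported format: {fmt!r}")
```

With the package installed, `resolve` could be a small wrapper such as `lambda purl: get_download_url(purl).download_url`, using the Python API shown in the next section.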

### Python API

```python
from purl2src import get_download_url

# Get download URL for a PURL
result = get_download_url("pkg:npm/express@4.17.1")
print(result.download_url)
# https://registry.npmjs.org/express/-/express-4.17.1.tgz

# Without validation (faster)
result = get_download_url("pkg:pypi/requests@2.28.0", validate=False)
```
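
The README does not say how validation is performed. One plausible approach, sketched here with the standard library only, is to send an HTTP `HEAD` request and treat any 2xx or 3xx status as accessible. The names `head_status` and `is_accessible` are hypothetical, and the status-fetching function is injectable so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request
from typing import Callable


def head_status(url: str, timeout: float) -> int:
    """Send a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def is_accessible(
    url: str,
    timeout: float = 10.0,
    fetch_status: Callable[[str, float], int] = head_status,
) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        return 200 <= fetch_status(url, timeout) < 400
    except (ValueError, OSError):
        # ValueError: malformed URL. OSError covers URLError, HTTPError
        # (raised by urlopen for 4xx/5xx responses), and timeouts.
        return False
```

A `HEAD` request avoids downloading the artifact itself, which is why skipping validation entirely is still the faster option when many PURLs are processed.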

## Supported Ecosystems

| Ecosystem | PURL Type | Example |
|-----------|-----------|---------|
| NPM | `npm` | `pkg:npm/@angular/core@12.0.0` |
| PyPI | `pypi` | `pkg:pypi/django@4.0.0` |
| Cargo | `cargo` | `pkg:cargo/serde@1.0.0` |
| NuGet | `nuget` | `pkg:nuget/Newtonsoft.Json@13.0.1` |
| Maven | `maven` | `pkg:maven/org.apache.commons/commons-lang3@3.12.0` |
| RubyGems | `gem` | `pkg:gem/rails@7.0.0` |
| Go | `golang` | `pkg:golang/github.com/gin-gonic/gin@v1.8.0` |
| GitHub | `github` | `pkg:github/facebook/react@v18.0.0` |
| Conda | `conda` | `pkg:conda/numpy@1.23.0?channel=conda-forge&subdir=linux-64&build=py39h1234567_0` |
| Generic | `generic` | `pkg:generic/package@1.0.0?download_url=https://example.com/file.tar.gz` |
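
Every example in the table follows the same shape: `pkg:type/namespace/name@version?qualifiers`. As a rough illustration of how such strings break down, the sketch below splits a PURL into its components. `ParsedPurl` and `parse_purl` are hypothetical names, and the parser is deliberately simplified: it ignores the optional `#subpath` component and does not apply the per-type normalization rules of the PURL specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.parse import unquote


@dataclass
class ParsedPurl:
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]
    qualifiers: Dict[str, str] = field(default_factory=dict)


def parse_purl(purl: str) -> ParsedPurl:
    """Split a PURL into type, namespace, name, version, and qualifiers."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a PURL: {purl!r}")
    rest = purl[len("pkg:"):]

    # Qualifiers follow the first '?' as key=value pairs joined by '&'.
    rest, _, query = rest.partition("?")
    qualifiers: Dict[str, str] = {}
    for pair in filter(None, query.split("&")):
        key, _, value = pair.partition("=")
        qualifiers[key] = unquote(value)

    # The version follows the last '@', but only if that '@' comes after the
    # last '/'. This keeps npm scopes such as '@angular' out of the version.
    version = None
    at = rest.rfind("@")
    if at > rest.rfind("/"):
        rest, version = rest[:at], unquote(rest[at + 1:])

    # The first path segment is the type, the last is the name, and anything
    # in between is the namespace.
    ptype, _, path = rest.partition("/")
    namespace, _, name = path.rpartition("/")
    if not ptype or not name:
        raise ValueError(f"PURL needs a type and a name: {purl!r}")
    return ParsedPurl(
        ptype.lower(), unquote(namespace) or None, unquote(name), version, qualifiers
    )
```

For instance, `parse_purl("pkg:npm/@angular/core@12.0.0")` yields type `npm`, namespace `@angular`, name `core`, and version `12.0.0`.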

## Examples

### NPM with Scoped Package
```bash
purl2src "pkg:npm/@angular/core@12.0.0"
# Output: https://registry.npmjs.org/@angular/core/-/core-12.0.0.tgz
```
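
The output above follows the npm registry's tarball naming pattern: the full (scoped) name appears in the path, while the file name uses only the unscoped name. A minimal sketch of that direct URL construction, with `npm_tarball_url` as a hypothetical helper name:

```python
from typing import Optional

NPM_REGISTRY = "https://registry.npmjs.org"


def npm_tarball_url(name: str, version: str, scope: Optional[str] = None) -> str:
    """Build the registry tarball URL for an npm package."""
    # Scoped packages keep the scope in the path but not in the file name.
    full_name = f"{scope}/{name}" if scope else name
    return f"{NPM_REGISTRY}/{full_name}/-/{name}-{version}.tgz"


print(npm_tarball_url("core", "12.0.0", scope="@angular"))
# https://registry.npmjs.org/@angular/core/-/core-12.0.0.tgz
```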

### Maven with Classifier
```bash
purl2src "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?classifier=sources"
# Output: https://repo.maven.apache.org/maven2/org/apache/xmlgraphics/batik-anim/1.9.1/batik-anim-1.9.1-sources.jar
```
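
The Maven URL above reflects the standard repository layout: dots in the group ID become path separators, and the classifier is appended to the file name after the version. The sketch below reproduces that mapping. `maven_artifact_url` is a hypothetical helper, and the default `jar` extension is an assumption based on the example output.

```python
from typing import Optional

MAVEN_CENTRAL = "https://repo.maven.apache.org/maven2"


def maven_artifact_url(
    group_id: str,
    artifact_id: str,
    version: str,
    classifier: Optional[str] = None,
    extension: str = "jar",
) -> str:
    """Build a Maven Central URL following the standard repository layout."""
    group_path = group_id.replace(".", "/")
    suffix = f"-{classifier}" if classifier else ""
    filename = f"{artifact_id}-{version}{suffix}.{extension}"
    return f"{MAVEN_CENTRAL}/{group_path}/{artifact_id}/{version}/{filename}"


print(maven_artifact_url("org.apache.xmlgraphics", "batik-anim", "1.9.1", classifier="sources"))
# https://repo.maven.apache.org/maven2/org/apache/xmlgraphics/batik-anim/1.9.1/batik-anim-1.9.1-sources.jar
```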

### Generic with Checksum Validation
```bash
purl2src "pkg:generic/mypackage@1.0.0?download_url=https://example.com/pkg.tar.gz&checksum=sha256:abcd1234..."
```
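
The `checksum` qualifier pairs an algorithm name with a hex digest (`sha256:<hex>`). How the tool uses it internally is not described here. As a sketch of what checking a downloaded artifact against such a value could look like, the hypothetical `verify_checksum` helper below relies only on `hashlib`:

```python
import hashlib
import hmac


def verify_checksum(data: bytes, checksum: str) -> bool:
    """Check data against a checksum of the form 'algorithm:hexdigest'."""
    algorithm, _, expected = checksum.partition(":")
    if not algorithm or not expected:
        raise ValueError(f"expected 'algorithm:hexdigest', got {checksum!r}")
    # hashlib.new raises ValueError for algorithms it does not support.
    actual = hashlib.new(algorithm, data).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected.lower())


payload = b"example artifact bytes"
checksum = "sha256:" + hashlib.sha256(payload).hexdigest()
print(verify_checksum(payload, checksum))      # True
print(verify_checksum(b"tampered", checksum))  # False
```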

## License

Apache License 2.0. See the LICENSE file for details.

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "semantic-copycat-purl2src",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "purl, package-url, package-manager, source-code, license-compliance, ai-detection, semantic-analysis, semantic-copycat, code-copycat",
    "author": null,
    "author_email": "\"Oscar Valenzuela B.\" <oscar.valenzuela.b@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/a9/4a/aa78ecb4febb17ac6f00893df133eb009bfaed268434f187a0044aa20ce8/semantic_copycat_purl2src-1.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Translate Package URLs (PURLs) into validated download URLs for source code artifacts",
    "version": "1.1.2",
    "project_urls": {
        "Documentation": "https://github.com/oscarvalenzuelab/semantic-copycat-purl2src#readme",
        "Homepage": "https://github.com/oscarvalenzuelab/semantic-copycat-purl2src",
        "Issues": "https://github.com/oscarvalenzuelab/semantic-copycat-purl2src/issues",
        "Repository": "https://github.com/oscarvalenzuelab/semantic-copycat-purl2src"
    },
    "split_keywords": [
        "purl",
        " package-url",
        " package-manager",
        " source-code",
        " license-compliance",
        " ai-detection",
        " semantic-analysis",
        " semantic-copycat",
        " code-copycat"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4f946b51baa17907d65c91ff47eb4597269054de83da20011505fc8353dba467",
                "md5": "f6f6c9a22acc9e5e0916bfb3e805e137",
                "sha256": "486f750f4d3838f86e992fc9269437fc98af0b7c73af11e33d1019791acc8f49"
            },
            "downloads": -1,
            "filename": "semantic_copycat_purl2src-1.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f6f6c9a22acc9e5e0916bfb3e805e137",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 27077,
            "upload_time": "2025-09-06T03:21:05",
            "upload_time_iso_8601": "2025-09-06T03:21:05.108152Z",
            "url": "https://files.pythonhosted.org/packages/4f/94/6b51baa17907d65c91ff47eb4597269054de83da20011505fc8353dba467/semantic_copycat_purl2src-1.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a94aaa78ecb4febb17ac6f00893df133eb009bfaed268434f187a0044aa20ce8",
                "md5": "9ef778de6ccbc01b4fa888556ace8573",
                "sha256": "6882eab6967c40e90e9df83c648fae70094f7e41511eaac3c55135b7828ec5cb"
            },
            "downloads": -1,
            "filename": "semantic_copycat_purl2src-1.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "9ef778de6ccbc01b4fa888556ace8573",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 22553,
            "upload_time": "2025-09-06T03:21:06",
            "upload_time_iso_8601": "2025-09-06T03:21:06.262674Z",
            "url": "https://files.pythonhosted.org/packages/a9/4a/aa78ecb4febb17ac6f00893df133eb009bfaed268434f187a0044aa20ce8/semantic_copycat_purl2src-1.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-06 03:21:06",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "oscarvalenzuelab",
    "github_project": "semantic-copycat-purl2src#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "click",
            "specs": [
                [
                    ">=",
                    "8.0.0"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.28.0"
                ]
            ]
        },
        {
            "name": "urllib3",
            "specs": [
                [
                    "<",
                    "2.0.0"
                ],
                [
                    ">=",
                    "1.26.0"
                ]
            ]
        }
    ],
    "tox": true,
    "lcname": "semantic-copycat-purl2src"
}
        