# PURL2SRC - Package URL (PURL) to Source
Translate Package URLs (PURLs) into validated download URLs for source code artifacts.
## Features
- **Multi-ecosystem support**: NPM, PyPI, Cargo, NuGet, GitHub, Maven, RubyGems, Go, Conda, and more
- **Three-level resolution strategy**:
  1. Direct URL construction based on known patterns
  2. Package registry API queries
  3. Local package manager fallback
- **URL validation**: Verify download URLs are accessible
- **Batch processing**: Process multiple PURLs from files
- **Multiple output formats**: JSON, CSV, or plain text
- **Extensible architecture**: Easy to add new package ecosystems
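
The three-level strategy can be read as an ordered chain of resolvers, where each level is tried only if the previous one returned nothing. The sketch below illustrates that idea only; `resolve`, `direct_pattern`, `registry_lookup`, `local_fallback`, and `FAKE_REGISTRY` are hypothetical names chosen for this example and are not part of the purl2src API.

```python
from typing import Callable, Dict, Optional, Sequence

# A resolver takes a PURL string and returns a URL, or None if it cannot help.
Resolver = Callable[[str], Optional[str]]

# Hypothetical canned data standing in for a real registry HTTP API.
FAKE_REGISTRY: Dict[str, str] = {
    "pkg:pypi/example@1.0.0": "https://example.com/example-1.0.0.tar.gz",
}


def direct_pattern(purl: str) -> Optional[str]:
    """Level 1: build the URL from a known pattern (unscoped npm only here)."""
    prefix = "pkg:npm/"
    if not purl.startswith(prefix):
        return None
    name, sep, version = purl[len(prefix):].rpartition("@")
    if not sep or not name or "/" in name:
        return None
    return f"https://registry.npmjs.org/{name}/-/{name}-{version}.tgz"


def registry_lookup(purl: str) -> Optional[str]:
    """Level 2: ask the registry (simulated with a dictionary lookup)."""
    return FAKE_REGISTRY.get(purl)


def local_fallback(purl: str) -> Optional[str]:
    """Level 3: would shell out to a local package manager; stubbed out here."""
    return None


DEFAULT_RESOLVERS: Sequence[Resolver] = (direct_pattern, registry_lookup, local_fallback)


def resolve(purl: str, resolvers: Sequence[Resolver] = DEFAULT_RESOLVERS) -> Optional[str]:
    """Return the first URL produced by the resolvers, tried in order."""
    for resolver in resolvers:
        url = resolver(purl)
        if url is not None:
            return url
    return None


print(resolve("pkg:npm/express@4.17.1"))
print(resolve("pkg:pypi/example@1.0.0"))
```

Ordering the cheapest level first means the network and local tooling are touched only when pattern-based construction cannot answer.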
## Installation
```bash
pip install semantic-copycat-purl2src
```
## Usage
### Command Line
```bash
# Single PURL (default text output)
purl2src "pkg:npm/express@4.17.1"
# Output: pkg:npm/express@4.17.1 -> https://registry.npmjs.org/express/-/express-4.17.1.tgz
# JSON output format
purl2src "pkg:npm/express@4.17.1" --format json
# With validation
purl2src "pkg:pypi/requests@2.28.0" --validate
# Batch processing from file
purl2src -f purls.txt --output results.json
# Batch processing with JSON to stdout
purl2src -f purls.txt --format json
```
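
For readers who want to post-process batch results themselves, the following sketch shows one way to read a list of PURLs and render results in the three output styles. It is an illustration under assumptions: the one-PURL-per-line file layout, the skipping of `#` comment lines, and the `purl` / `download_url` field names are guesses made for this example, not documented purl2src behavior. Only the `purl -> url` text form is taken from the example output above.

```python
import csv
import io
import json
from typing import Dict, Iterable, List


def read_purls(lines: Iterable[str]) -> List[str]:
    """Collect PURLs, one per line, ignoring blank lines and '#' comments (assumed layout)."""
    purls = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            purls.append(line)
    return purls


def render(results: Dict[str, str], fmt: str = "text") -> str:
    """Render a {purl: download_url} mapping as text, JSON, or CSV."""
    if fmt == "text":
        return "\n".join(f"{purl} -> {url}" for purl, url in results.items())
    if fmt == "json":
        rows = [{"purl": purl, "download_url": url} for purl, url in results.items()]
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["purl", "download_url"])
        writer.writerows(results.items())
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt!r}")


sample = {"pkg:npm/express@4.17.1": "https://registry.npmjs.org/express/-/express-4.17.1.tgz"}
print(render(sample, "text"))
```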
### Python API
```python
from purl2src import get_download_url
# Get download URL for a PURL
result = get_download_url("pkg:npm/express@4.17.1")
print(result.download_url)
# https://registry.npmjs.org/express/-/express-4.17.1.tgz
# Without validation (faster)
result = get_download_url("pkg:pypi/requests@2.28.0", validate=False)
```
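
To make the cost of validation concrete, here is a minimal sketch of what checking that a URL is accessible can look like, using only the standard library. It assumes a `HEAD` request and treats any 2xx or 3xx status as accessible; `is_accessible` and its `opener` parameter are hypothetical names for this example, and purl2src may validate differently.

```python
import urllib.error
import urllib.request


def is_accessible(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Return True if a HEAD request to the URL succeeds (assumed validation rule)."""
    try:
        # Request() raises ValueError for strings that are not valid URLs.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, TimeoutError, ValueError):
        # HTTPError (4xx/5xx) is a subclass of URLError, so it lands here too.
        return False
```

Each check costs one network round trip per URL, which is why skipping validation is faster for large batches. The `opener` argument exists only so the function can be exercised without network access.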
## Supported Ecosystems
| Ecosystem | PURL Type | Example |
|-----------|-----------|---------|
| NPM | `npm` | `pkg:npm/@angular/core@12.0.0` |
| PyPI | `pypi` | `pkg:pypi/django@4.0.0` |
| Cargo | `cargo` | `pkg:cargo/serde@1.0.0` |
| NuGet | `nuget` | `pkg:nuget/Newtonsoft.Json@13.0.1` |
| Maven | `maven` | `pkg:maven/org.apache.commons/commons-lang3@3.12.0` |
| RubyGems | `gem` | `pkg:gem/rails@7.0.0` |
| Go | `golang` | `pkg:golang/github.com/gin-gonic/gin@v1.8.0` |
| GitHub | `github` | `pkg:github/facebook/react@v18.0.0` |
| Conda | `conda` | `pkg:conda/numpy@1.23.0?channel=conda-forge&subdir=linux-64&build=py39h1234567_0` |
| Generic | `generic` | `pkg:generic/package@1.0.0?download_url=https://example.com/file.tar.gz` |
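
The examples in the table share the general PURL shape `pkg:type/namespace/name@version?qualifiers`. As a rough illustration of how those parts separate, the sketch below splits a PURL with the standard library. It is not a spec-complete parser and is not the parser purl2src uses; `ParsedPurl` and `parse_purl` are hypothetical names, and the rule of treating `@` as a version separator only when it follows the last `/` is a simplification that lets the literal `@angular` scope pass through.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.parse import parse_qsl


@dataclass
class ParsedPurl:
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]
    qualifiers: Dict[str, str] = field(default_factory=dict)


def parse_purl(purl: str) -> ParsedPurl:
    """Split a PURL into type, namespace, name, version, and qualifiers."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a PURL: {purl!r}")
    rest = purl[len("pkg:"):]
    rest, _, _subpath = rest.partition("#")  # subpath is ignored in this sketch
    rest, _, query = rest.partition("?")

    # Treat "@" as the version separator only if it comes after the last "/".
    at = rest.rfind("@")
    if at > rest.rfind("/"):
        path, version = rest[:at], rest[at + 1:]
    else:
        path, version = rest, None

    parts = path.split("/")
    if len(parts) < 2 or not parts[0] or not parts[-1]:
        raise ValueError(f"PURL needs a type and a name: {purl!r}")

    return ParsedPurl(
        type=parts[0],
        namespace="/".join(parts[1:-1]) or None,
        name=parts[-1],
        version=version,
        qualifiers=dict(parse_qsl(query)),
    )


print(parse_purl("pkg:maven/org.apache.commons/commons-lang3@3.12.0"))
```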
## Examples
### NPM with Scoped Package
```bash
purl2src "pkg:npm/@angular/core@12.0.0"
# Output: https://registry.npmjs.org/@angular/core/-/core-12.0.0.tgz
```
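
The output above follows the npm registry tarball layout, in which the file name uses only the part of the package name after the scope. A minimal sketch of that construction, assuming the default `registry.npmjs.org` host (`npm_tarball_url` is a hypothetical helper, not a purl2src function):

```python
def npm_tarball_url(name: str, version: str) -> str:
    """Build the registry tarball URL for a scoped or unscoped npm package."""
    # For "@angular/core" the tarball file is named after "core" only.
    basename = name.split("/")[-1]
    return f"https://registry.npmjs.org/{name}/-/{basename}-{version}.tgz"


print(npm_tarball_url("@angular/core", "12.0.0"))
# https://registry.npmjs.org/@angular/core/-/core-12.0.0.tgz
```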
### Maven with Classifier
```bash
purl2src "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?classifier=sources"
# Output: https://repo.maven.apache.org/maven2/org/apache/xmlgraphics/batik-anim/1.9.1/batik-anim-1.9.1-sources.jar
```
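
The URL above follows the standard Maven repository layout: dots in the group ID become path segments, and the classifier is appended to the file name. The sketch below reproduces that pattern; `maven_artifact_url` is a hypothetical helper, and the `jar` default extension is an assumption for this example.

```python
from typing import Optional


def maven_artifact_url(
    group_id: str,
    artifact_id: str,
    version: str,
    classifier: Optional[str] = None,
    extension: str = "jar",
    base: str = "https://repo.maven.apache.org/maven2",
) -> str:
    """Build an artifact URL using the standard Maven repository layout."""
    group_path = group_id.replace(".", "/")
    suffix = f"-{classifier}" if classifier else ""
    filename = f"{artifact_id}-{version}{suffix}.{extension}"
    return f"{base}/{group_path}/{artifact_id}/{version}/{filename}"


print(maven_artifact_url("org.apache.xmlgraphics", "batik-anim", "1.9.1", classifier="sources"))
```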
### Generic with Checksum Validation
```bash
purl2src "pkg:generic/mypackage@1.0.0?download_url=https://example.com/pkg.tar.gz&checksum=sha256:abcd1234..."
```
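
The `checksum` qualifier pairs an algorithm name with a hex digest. How purl2src applies it is not described here, so the following is only a sketch of how such a value can be checked against downloaded bytes with `hashlib`; `verify_checksum` is a hypothetical helper.

```python
import hashlib
import hmac


def verify_checksum(data: bytes, checksum: str) -> bool:
    """Check data against a checksum written as '<algorithm>:<hex digest>'."""
    algorithm, sep, expected = checksum.partition(":")
    if not sep or algorithm not in hashlib.algorithms_available:
        raise ValueError(f"unsupported checksum: {checksum!r}")
    actual = hashlib.new(algorithm, data).hexdigest()
    # compare_digest avoids leaking where the first mismatch occurs.
    return hmac.compare_digest(actual, expected.lower())


payload = b"example archive bytes"
good = "sha256:" + hashlib.sha256(payload).hexdigest()
print(verify_checksum(payload, good))
```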
## License
Apache License 2.0. See the LICENSE file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "semantic-copycat-purl2src",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "purl, package-url, package-manager, source-code, license-compliance, ai-detection, semantic-analysis, semantic-copycat, code-copycat",
"author": null,
"author_email": "\"Oscar Valenzuela B.\" <oscar.valenzuela.b@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/a9/4a/aa78ecb4febb17ac6f00893df133eb009bfaed268434f187a0044aa20ce8/semantic_copycat_purl2src-1.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Translate Package URLs (PURLs) into validated download URLs for source code artifacts",
"version": "1.1.2",
"project_urls": {
"Documentation": "https://github.com/oscarvalenzuelab/semantic-copycat-purl2src#readme",
"Homepage": "https://github.com/oscarvalenzuelab/semantic-copycat-purl2src",
"Issues": "https://github.com/oscarvalenzuelab/semantic-copycat-purl2src/issues",
"Repository": "https://github.com/oscarvalenzuelab/semantic-copycat-purl2src"
},
"split_keywords": [
"purl",
" package-url",
" package-manager",
" source-code",
" license-compliance",
" ai-detection",
" semantic-analysis",
" semantic-copycat",
" code-copycat"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "4f946b51baa17907d65c91ff47eb4597269054de83da20011505fc8353dba467",
"md5": "f6f6c9a22acc9e5e0916bfb3e805e137",
"sha256": "486f750f4d3838f86e992fc9269437fc98af0b7c73af11e33d1019791acc8f49"
},
"downloads": -1,
"filename": "semantic_copycat_purl2src-1.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f6f6c9a22acc9e5e0916bfb3e805e137",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 27077,
"upload_time": "2025-09-06T03:21:05",
"upload_time_iso_8601": "2025-09-06T03:21:05.108152Z",
"url": "https://files.pythonhosted.org/packages/4f/94/6b51baa17907d65c91ff47eb4597269054de83da20011505fc8353dba467/semantic_copycat_purl2src-1.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a94aaa78ecb4febb17ac6f00893df133eb009bfaed268434f187a0044aa20ce8",
"md5": "9ef778de6ccbc01b4fa888556ace8573",
"sha256": "6882eab6967c40e90e9df83c648fae70094f7e41511eaac3c55135b7828ec5cb"
},
"downloads": -1,
"filename": "semantic_copycat_purl2src-1.1.2.tar.gz",
"has_sig": false,
"md5_digest": "9ef778de6ccbc01b4fa888556ace8573",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 22553,
"upload_time": "2025-09-06T03:21:06",
"upload_time_iso_8601": "2025-09-06T03:21:06.262674Z",
"url": "https://files.pythonhosted.org/packages/a9/4a/aa78ecb4febb17ac6f00893df133eb009bfaed268434f187a0044aa20ce8/semantic_copycat_purl2src-1.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-06 03:21:06",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "oscarvalenzuelab",
"github_project": "semantic-copycat-purl2src#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "click",
"specs": [
[
">=",
"8.0.0"
]
]
},
{
"name": "requests",
"specs": [
[
">=",
"2.28.0"
]
]
},
{
"name": "urllib3",
"specs": [
[
"<",
"2.0.0"
],
[
">=",
"1.26.0"
]
]
}
],
"tox": true,
"lcname": "semantic-copycat-purl2src"
}