jpcorpreg


Name: jpcorpreg
Version: 1.8.1
Summary: jpcorpreg is a Python library that downloads corporate registry which is published in the Corporate Number Publication Site as a data frame.
Upload time: 2025-08-16 14:57:36
Author: new-village
Requires Python: >=3.8
License: Apache-2.0
Requirements: none recorded

# jpcorpreg  
[![Test](https://github.com/new-village/jp-corpreg-loader/actions/workflows/test.yaml/badge.svg)](https://github.com/new-village/jp-corpreg-loader/actions/workflows/test.yaml)
![PyPI - Version](https://img.shields.io/pypi/v/jpcorpreg)
  
**jpcorpreg** is a Python library that downloads the corporate registry published on the [Corporate Number Publication Site](https://www.houjin-bangou.nta.go.jp/en/) and returns it as a data frame.
   
  
## Installation  
jpcorpreg can be installed from PyPI with pip:
```sh
$ python -m pip install jpcorpreg
```
  
### GitHub Install
To install the latest version from GitHub in editable mode:  
```sh
$ git clone https://github.com/new-village/jpcorpreg
$ cd jpcorpreg
$ pip install -e .
```
    
## Usage
This section demonstrates how to use this library to load and process data from the National Tax Agency's [Corporate Number Publication Site](https://www.houjin-bangou.nta.go.jp/).

### Direct Data Loading
To download data for a specific prefecture, call the `load` function with the prefecture name as its argument. It returns a DataFrame containing the data for that prefecture.  
```python
>>> import jpcorpreg
>>> df = jpcorpreg.load("Shimane")
```

Calling `load` without an argument downloads data for all prefectures in Japan.
```python
>>> import jpcorpreg
>>> df = jpcorpreg.load()
```
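
When only a few prefectures are needed, one option is to call `load` once per prefecture instead of downloading the whole country. The following is a minimal sketch and not part of jpcorpreg: `load_many` and `fake_loader` are hypothetical names, and the second prefecture name is only illustrative, since spellings other than "Shimane" are not documented here. A stand-in loader keeps the example runnable offline.

```python
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


def load_many(prefectures: Iterable[str], loader: Callable[[str], T]) -> Dict[str, T]:
    """Call `loader` once per prefecture and collect the results by name."""
    results: Dict[str, T] = {}
    for name in prefectures:
        results[name] = loader(name)
    return results


# Hypothetical stand-in so the sketch runs without network access;
# pass jpcorpreg.load instead to fetch real data.
def fake_loader(name: str) -> list:
    return [f"{name}-row"]


frames = load_many(["Shimane", "Tottori"], fake_loader)
```

With real data, pass `jpcorpreg.load` as `loader`. If each result is a DataFrame, `pd.concat(list(frames.values()), ignore_index=True)` would combine them into one.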

### Parquet Output
If you prefer to save the downloaded data as a Parquet file instead of returning a DataFrame, pass `format="parquet"`. The function returns the path to the generated `.parquet` file.
```python
>>> import jpcorpreg
>>> parquet_path = jpcorpreg.load("Shimane", format="parquet")
```

You can then read the Parquet file with pandas:
```python
>>> import pandas as pd
>>> df = pd.read_parquet(parquet_path)
```
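
If the generated file should live in a directory of your choosing, the returned path can be handed to the standard library. This is a sketch under assumptions: `archive_parquet` is a hypothetical helper, not a jpcorpreg function, and it assumes the returned value is a string or path-like object pointing at a regular file. The demonstration uses a placeholder file rather than a real download.

```python
import shutil
import tempfile
from pathlib import Path


def archive_parquet(generated, dest_dir):
    """Move a generated Parquet file into dest_dir and return its new path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / Path(generated).name
    shutil.move(str(generated), str(target))
    return target


# Offline demonstration with a placeholder file instead of a real download.
with tempfile.TemporaryDirectory() as tmp:
    placeholder = Path(tmp) / "example.parquet"
    placeholder.write_bytes(b"placeholder")
    moved = archive_parquet(placeholder, Path(tmp) / "archive")
    moved_ok = moved.exists() and not placeholder.exists()
```

In practice the first argument would be the `parquet_path` returned by `jpcorpreg.load(..., format="parquet")`.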

### CSV Data Loading
If you already have a downloaded CSV file, use the `read_csv` function. Pass the file path as an argument to obtain a DataFrame with headers from the CSV data.
```python
>>> import jpcorpreg
>>> df = jpcorpreg.read_csv("path/to/data.csv")
```

            
