neoscr

Name	neoscr
Version	2.1.2
home_page	https://github.com/datarisk-io/neoscr
Summary	Wrapper to query the SCR api
upload_time	2023-08-24 13:44:08
maintainer
docs_url	None
author	João Nogueira
requires_python	>=3.7
license	Apache Software License 2.0
keywords	nbdev jupyter notebook python
VCS
bugtrack_url
requirements	No requirements were recorded.

# neoscr

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Install

``` sh
pip install neoscr
```

## How to use

Import `ConsultaSCR` and instantiate it with your SCR API credentials:

``` python
from neoscr.core import ConsultaSCR
```

``` python
import os

scr = ConsultaSCR(
    user=os.environ["SCR_USER"],
    password=os.environ["SCR_PASSWORD"],
    code=os.environ["SCR_CODE"],
    api_key=os.environ["SCR_API_KEY"]
)
```

<div>

> **Warning**
>
> You can omit the API credentials when instantiating
> [`ConsultaSCR`](https://datarisk-io.github.io/neoscr/core.html#consultascr),
> but in that case the credentials for the SCR API must be stored in
> your OS environment variables.

</div>
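
Before relying on environment variables, it can help to check that they are all set. The snippet below is a minimal sketch: it assumes the variable names match the ones used in the instantiation example above (`SCR_USER`, `SCR_PASSWORD`, `SCR_CODE`, `SCR_API_KEY`), and `missing_scr_credentials` is a hypothetical helper, not part of `neoscr`. Check the library documentation for the names it actually reads.

``` python
import os

# Names taken from the instantiation example above; whether ConsultaSCR
# reads exactly these names is an assumption.
REQUIRED_VARS = ("SCR_USER", "SCR_PASSWORD", "SCR_CODE", "SCR_API_KEY")


def missing_scr_credentials(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_scr_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All SCR credentials found in the environment.")
```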

``` python
cpf = "867.168.046-09" # fake cpf
ano = 2022
mes = 12

# returns three dataframes
df_cpf_traduzido, df_cpf_modalidade, df_cpf_resumo_lista_das_operacoes = scr.get_cpf_data(cpf, ano, mes)
```

<div>

> **Note**
>
> `neoscr` saves the response of each request in the `.neoscr` folder
> in your home directory.
>
> For the example above, the saved file is:
> `~/.neoscr/86716804609_2022_12.json`
>
> The next time you make the same request, the data is loaded from
> local storage instead.

</div>
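
To check whether a given request is already cached, you can rebuild the expected file path yourself. This is only a sketch: the `<digits>_<ano>_<mes>.json` pattern is inferred from the single example in the note above, so how the library formats single-digit months is unknown, and `cached_request_path` is a hypothetical helper, not a `neoscr` function.

``` python
import re
from pathlib import Path


def cached_request_path(document, ano, mes, home=None):
    """Build the expected cache path: ~/.neoscr/<digits>_<ano>_<mes>.json."""
    digits = re.sub(r"\D", "", document)  # keep only the digits of the CPF/CNPJ
    base = Path.home() if home is None else Path(home)
    return base / ".neoscr" / f"{digits}_{ano}_{mes}.json"


path = cached_request_path("867.168.046-09", 2022, 12)
print(path.name)      # 86716804609_2022_12.json
print(path.exists())  # True only if a file with this name is already cached
```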

``` python
cnpj = "79.322.561/0001-67" # fake cnpj
ano = 2022
mes = 12

# returns three dataframes
df_cnpj_traduzido, df_cnpj_modalidade, df_cnpj_resumo_lista_das_operacoes = scr.get_cnpj_data(cnpj, ano, mes)
```

## Batch Query

Run the code below to query a list of CPFs (or, with modifications,
CNPJs) and download the data.

<div>

> **Caution**
>
> Please don’t just copy and run the code below. Read it and adapt
> it to your needs.

</div>

``` python
import os
import logging
import pandas as pd
from tqdm import tqdm

from neoscr.core import ConsultaSCR
from neoscr.utils import let_only_digits

# load the list of CPFs
df = pd.read_csv("dataset.csv")
lista_de_cpfs = df['cpf'].tolist()

# instantiate ConsultaSCR (credentials are read from environment variables)
scr = ConsultaSCR()

# set up the logger
logger = logging.getLogger('database_updater')
logger.setLevel(logging.DEBUG)

# create the file handler
file_handler = logging.FileHandler('querylog.log')
file_handler.setLevel(logging.DEBUG)

# attach the file handler to the logger
logger.addHandler(file_handler)

# make sure the output folder exists before writing to it
os.makedirs("data/scr/raw", exist_ok=True)

# iterate over the list of CPFs, query each one, and save the results
ano = 2022
mes = 12
for cpf in tqdm(lista_de_cpfs):
    try:
        df_traduzido, df_modalidade, df_resumo_lista_das_operacoes = scr.get_cpf_data(cpf, ano, mes)
        cpf_only_digits = let_only_digits(cpf)
        df_traduzido.to_csv(f"data/scr/raw/{cpf_only_digits}_traduzido.csv", index=False)
        df_modalidade.to_csv(f"data/scr/raw/{cpf_only_digits}_modalidade.csv", index=False)
        df_resumo_lista_das_operacoes.to_csv(f"data/scr/raw/{cpf_only_digits}_resumo_lista_das_operacoes.csv", index=False)
    except Exception:
        # log the failure with its traceback and move on to the next CPF
        logger.exception(f"Error on CPF {cpf}")
        continue
```
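
If a batch run is interrupted, you may want to skip the CPFs whose CSV files were already written. The sketch below filters the list before the loop runs. Both functions are hypothetical helpers: `only_digits` is a stand-in for `neoscr.utils.let_only_digits` (assumed to strip every non-digit character), and the `_traduzido.csv` file is used as the marker that a CPF was already processed.

``` python
import os
import re


def only_digits(document):
    """Stand-in for neoscr.utils.let_only_digits (assumed to strip non-digits)."""
    return re.sub(r"\D", "", str(document))


def pending_cpfs(cpfs, out_dir="data/scr/raw"):
    """Return the CPFs that have no `<digits>_traduzido.csv` file in out_dir yet."""
    return [
        cpf
        for cpf in cpfs
        if not os.path.exists(
            os.path.join(out_dir, f"{only_digits(cpf)}_traduzido.csv")
        )
    ]
```

With this in place, the loop above could iterate over `tqdm(pending_cpfs(lista_de_cpfs))` instead of the full list.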

After downloading the data, you may want to combine all the raw files
into one big table:

``` python
# load the data from all the saved files
df_traduzido_full = pd.DataFrame()
for file in os.listdir("data/scr/raw/"):
    if file.endswith("_traduzido.csv"):
        df_traduzido = pd.read_csv(f"data/scr/raw/{file}")
        df_traduzido_full = pd.concat([df_traduzido_full, df_traduzido])

df_modalidade_full = pd.DataFrame()
for file in os.listdir("data/scr/raw/"):
    if file.endswith("_modalidade.csv"):
        df_modalidade = pd.read_csv(f"data/scr/raw/{file}")
        df_modalidade_full = pd.concat([df_modalidade_full, df_modalidade])

df_cnpj_resumo_lista_das_operacoes_full = pd.DataFrame()
for file in os.listdir("data/scr/raw/"):
    if file.endswith("_resumo_lista_das_operacoes.csv"):
        df_cnpj_resumo_lista_das_operacoes = pd.read_csv(f"data/scr/raw/{file}")
        df_cnpj_resumo_lista_das_operacoes_full = pd.concat([df_cnpj_resumo_lista_das_operacoes_full, df_cnpj_resumo_lista_das_operacoes])
```

Raw data

{
    "_id": null,
    "home_page": "https://github.com/datarisk-io/neoscr",
    "name": "neoscr",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "nbdev jupyter notebook python",
    "author": "Jo\u00e3o Nogueira",
    "author_email": "joao.nogueira@datarisk.io",
    "download_url": "https://files.pythonhosted.org/packages/7c/fb/bc54c800d2db291074f6aa6584a46e3ffc9f5375aeb73e49e8650600ebc9/neoscr-2.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache Software License 2.0",
    "summary": "Wrapper to query the SCR api",
    "version": "2.1.2",
    "project_urls": {
        "Homepage": "https://github.com/datarisk-io/neoscr"
    },
    "split_keywords": [
        "nbdev",
        "jupyter",
        "notebook",
        "python"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1a00409494b1bd7042782597a59993d4989432a7894fbcb58facce7f6874f2ed",
                "md5": "f2181ac12dec54796ff3e71c3be033e4",
                "sha256": "30d32623ad27052ccebd903caa9b822e9521b9fd07f34b642792ffc079b413c6"
            },
            "downloads": -1,
            "filename": "neoscr-2.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f2181ac12dec54796ff3e71c3be033e4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 12103,
            "upload_time": "2023-08-24T13:44:07",
            "upload_time_iso_8601": "2023-08-24T13:44:07.617163Z",
            "url": "https://files.pythonhosted.org/packages/1a/00/409494b1bd7042782597a59993d4989432a7894fbcb58facce7f6874f2ed/neoscr-2.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7cfbbc54c800d2db291074f6aa6584a46e3ffc9f5375aeb73e49e8650600ebc9",
                "md5": "485daf64422457277953406ad093dc30",
                "sha256": "67e389f690c65872d57c14adb340b370ef61cd25914268ff11d89030f082a543"
            },
            "downloads": -1,
            "filename": "neoscr-2.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "485daf64422457277953406ad093dc30",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 13490,
            "upload_time": "2023-08-24T13:44:08",
            "upload_time_iso_8601": "2023-08-24T13:44:08.820893Z",
            "url": "https://files.pythonhosted.org/packages/7c/fb/bc54c800d2db291074f6aa6584a46e3ffc9f5375aeb73e49e8650600ebc9/neoscr-2.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-08-24 13:44:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "datarisk-io",
    "github_project": "neoscr",
    "github_not_found": true,
    "lcname": "neoscr"
}

João Nogueira