pandasmore


| Field | Value |
|-----------------|-------|
| Name | pandasmore |
| Version | 0.0.6 |
| home_page | https://github.com/ionmihai/pandasmore |
| Summary | Extends pandas with common functions used in finance and economics research |
| upload_time | 2024-01-31 22:56:53 |
| docs_url | None |
| author | ionmihai |
| requires_python | >=3.7 |
| license | Apache Software License 2.0 |
| keywords | nbdev jupyter notebook python |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# pandasmore

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

The full documentation site is
[here](https://ionmihai.github.io/pandasmore/), and the GitHub page is
[here](https://github.com/ionmihai/pandasmore).

Here is a short description of some of the main functions (more details
below and in the
[documentation](https://ionmihai.github.io/pandasmore/core.html)):

- [`setup_tseries`](https://ionmihai.github.io/pandasmore/core.html#setup_tseries):
  cleans up dates and sets them as the index
- [`setup_panel`](https://ionmihai.github.io/pandasmore/core.html#setup_panel):
  cleans up dates and panel IDs and sets them as the index (panel ID,
  period date)
- [`lag`](https://ionmihai.github.io/pandasmore/core.html#lag): robust
  lagging that accounts for panel structure, unsorted or duplicate
  dates, or gaps in the time series

## Install

``` sh
pip install pandasmore
```

## How to use

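First, we set up an example dataset to showcase the functions in this
module. The values in columns `A` and `B` are random, so your numbers
will differ from the output shown here.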

``` python
import pandas as pd
import numpy as np
import pandasmore as pdm
```

``` python
raw = pd.DataFrame(np.random.rand(15,2), 
                    columns=list('AB'), 
                    index=pd.MultiIndex.from_product(
                        [[1,2, np.nan],[np.nan,'2010-01','2010-02','2010-02','2010-04']],
                        names = ['firm_id','date'])
                      ).reset_index()
raw
```

<div>


|     | firm_id | date    | A        | B        |
|-----|---------|---------|----------|----------|
| 0   | 1.0     | NaN     | 0.249370 | 0.926335 |
| 1   | 1.0     | 2010-01 | 0.282501 | 0.513859 |
| 2   | 1.0     | 2010-02 | 0.804278 | 0.307171 |
| 3   | 1.0     | 2010-02 | 0.828895 | 0.746789 |
| 4   | 1.0     | 2010-04 | 0.569099 | 0.331814 |
| 5   | 2.0     | NaN     | 0.533977 | 0.823457 |
| 6   | 2.0     | 2010-01 | 0.207558 | 0.401378 |
| 7   | 2.0     | 2010-02 | 0.086001 | 0.959371 |
| 8   | 2.0     | 2010-02 | 0.054230 | 0.993980 |
| 9   | 2.0     | 2010-04 | 0.062525 | 0.200272 |
| 10  | NaN     | NaN     | 0.091012 | 0.635409 |
| 11  | NaN     | 2010-01 | 0.866369 | 0.972394 |
| 12  | NaN     | 2010-02 | 0.432087 | 0.837597 |
| 13  | NaN     | 2010-02 | 0.878219 | 0.148009 |
| 14  | NaN     | 2010-04 | 0.820386 | 0.834821 |

</div>

``` python
df = pdm.setup_tseries(raw.query('firm_id==1'),
                        time_var='date', time_var_format="%Y-%m",
                        freq='M')
df
```

<div>


|         | date    | dtdate     | firm_id | A        | B        |
|---------|---------|------------|---------|----------|----------|
| Mdate   |         |            |         |          |          |
| 2010-01 | 2010-01 | 2010-01-01 | 1.0     | 0.282501 | 0.513859 |
| 2010-02 | 2010-02 | 2010-02-01 | 1.0     | 0.828895 | 0.746789 |
| 2010-04 | 2010-04 | 2010-04-01 | 1.0     | 0.569099 | 0.331814 |

</div>
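
The output above suggests that `setup_tseries` drops rows with a missing
date, keeps the last row when a period is duplicated, and indexes the
result by a period date (`Mdate`). Here is a minimal plain-pandas sketch
of those steps. It is inferred only from the example output and is not
taken from the library's source; `tseries_sketch` and `one_firm` are
hypothetical names, and the values are made up.

``` python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one firm's rows from `raw` (made-up values).
one_firm = pd.DataFrame({
    "date": [np.nan, "2010-01", "2010-02", "2010-02", "2010-04"],
    "A": [0.1, 0.2, 0.3, 0.4, 0.5],
})

def tseries_sketch(df, time_var="date", fmt="%Y-%m", freq="M"):
    """Approximate the cleanup visible in the output above (assumption, not library code)."""
    period_col = f"{freq}date"
    out = df.dropna(subset=[time_var]).copy()                   # drop rows with a missing date
    out["dtdate"] = pd.to_datetime(out[time_var], format=fmt)   # parse strings into timestamps
    out[period_col] = out["dtdate"].dt.to_period(freq)          # convert timestamps to periods
    out = out.drop_duplicates(subset=[period_col], keep="last") # keep the last row per period
    return out.set_index(period_col).sort_index()

clean = tseries_sketch(one_firm)
print(clean)
```

Note that the gap at 2010-03 stays a gap: no row is created for the
missing month.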

``` python
df = pdm.setup_panel(raw,
                        panel_ids='firm_id',
                        time_var='date', time_var_format="%Y-%m",
                        freq='M')
df
```

<div>


|         |         | date    | dtdate     | A        | B        |
|---------|---------|---------|------------|----------|----------|
| firm_id | Mdate   |         |            |          |          |
| 1       | 2010-01 | 2010-01 | 2010-01-01 | 0.282501 | 0.513859 |
|         | 2010-02 | 2010-02 | 2010-02-01 | 0.828895 | 0.746789 |
|         | 2010-04 | 2010-04 | 2010-04-01 | 0.569099 | 0.331814 |
| 2       | 2010-01 | 2010-01 | 2010-01-01 | 0.207558 | 0.401378 |
|         | 2010-02 | 2010-02 | 2010-02-01 | 0.054230 | 0.993980 |
|         | 2010-04 | 2010-04 | 2010-04-01 | 0.062525 | 0.200272 |

</div>
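
In the output above, the rows with a missing `firm_id` or a missing date
are gone, and each (`firm_id`, `Mdate`) pair appears once, in sorted
order. A small check like the one below can confirm those properties on
your own data. This is a sketch: `looks_like_clean_panel` is a
hypothetical helper, not part of `pandasmore`.

``` python
import pandas as pd

def looks_like_clean_panel(df):
    """True if df has a unique, sorted two-level index with no missing labels."""
    idx = df.index
    if not isinstance(idx, pd.MultiIndex) or idx.nlevels != 2:
        return False
    # Flatten the index into columns to test for missing ids or periods.
    no_missing = not idx.to_frame(index=False).isna().any().any()
    return bool(no_missing and idx.is_unique and idx.is_monotonic_increasing)

# Hypothetical example panels (made-up values).
good = pd.DataFrame(
    {"A": [0.1, 0.2, 0.3]},
    index=pd.MultiIndex.from_arrays(
        [[1, 1, 2], pd.PeriodIndex(["2010-01", "2010-02", "2010-01"], freq="M")],
        names=["firm_id", "Mdate"],
    ),
)
duplicated = pd.DataFrame(
    {"A": [0.1, 0.2]},
    index=pd.MultiIndex.from_arrays(
        [[1, 1], pd.PeriodIndex(["2010-01", "2010-01"], freq="M")],
        names=["firm_id", "Mdate"],
    ),
)

print(looks_like_clean_panel(good))
print(looks_like_clean_panel(duplicated))
```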

``` python
pdm.lag(df['A'])
```

    firm_id  Mdate  
    1        2010-01         NaN
             2010-02    0.282501
             2010-04         NaN
    2        2010-01         NaN
             2010-02    0.207558
             2010-04         NaN
    Name: A_lag1, dtype: float64
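
The lag for 2010-04 is `NaN` because there is no 2010-03 observation to
take it from. A plain `groupby(...).shift(1)` would instead return the
2010-02 value, which is two months old. The sketch below contrasts the
two on made-up data. `gap_aware_lag` is a hypothetical plain-pandas
illustration of the idea, not the implementation behind `pdm.lag`.

``` python
import pandas as pd

# Hypothetical panel shaped like `df` above: firm 1 has no 2010-03 row.
idx = pd.MultiIndex.from_arrays(
    [[1, 1, 1, 2, 2],
     pd.PeriodIndex(["2010-01", "2010-02", "2010-04", "2010-01", "2010-02"], freq="M")],
    names=["firm_id", "Mdate"],
)
s = pd.Series([10.0, 20.0, 40.0, 1.0, 2.0], index=idx, name="A")

# Naive lag: the previous row within each firm, however far back it is.
naive = s.groupby(level="firm_id").shift(1)

def gap_aware_lag(series, n=1):
    """Value from exactly n periods earlier for the same id; NaN if that period is absent."""
    ids = series.index.get_level_values(0)
    periods = series.index.get_level_values(1)
    wanted = pd.MultiIndex.from_arrays([ids, periods - n], names=series.index.names)
    lagged = series.reindex(wanted)   # look up (id, period - n); missing keys give NaN
    lagged.index = series.index       # relabel with the original (id, period) keys
    return lagged.rename(f"{series.name}_lag{n}")

robust = gap_aware_lag(s)
print(pd.concat([s, naive.rename("naive"), robust], axis=1))
```

This lookup assumes a unique (id, period) index, which is what
`setup_panel` produces in the example above.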

            

## Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ionmihai/pandasmore",
    "name": "pandasmore",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "nbdev jupyter notebook python",
    "author": "ionmihai",
    "author_email": "mihaiion@email.arizona.edu",
    "download_url": "https://files.pythonhosted.org/packages/c3/02/5c2629fd4da033b8943dcf5efd137a3a21ded0949d419e1a762dcf7ef5f1/pandasmore-0.0.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache Software License 2.0",
    "summary": "Extends pandas with common functions used in finance and economics research",
    "version": "0.0.6",
    "project_urls": {
        "Homepage": "https://github.com/ionmihai/pandasmore"
    },
    "split_keywords": [
        "nbdev",
        "jupyter",
        "notebook",
        "python"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "60552a985031dabc3d16af23156935d62d36b758582caa536c9d23153475ef26",
                "md5": "d42ebb6cb224cde6191f6accfebe5819",
                "sha256": "70377506392d6b14430a5203b0aebbd7c363c966da5e089b41603a1fc92ff91b"
            },
            "downloads": -1,
            "filename": "pandasmore-0.0.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d42ebb6cb224cde6191f6accfebe5819",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 11393,
            "upload_time": "2024-01-31T22:56:51",
            "upload_time_iso_8601": "2024-01-31T22:56:51.219955Z",
            "url": "https://files.pythonhosted.org/packages/60/55/2a985031dabc3d16af23156935d62d36b758582caa536c9d23153475ef26/pandasmore-0.0.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c3025c2629fd4da033b8943dcf5efd137a3a21ded0949d419e1a762dcf7ef5f1",
                "md5": "e0c0336cac2956b13e76723135ac849a",
                "sha256": "ae0752ae3091431f579737ea2d9e590a83ee80c9607ae469434f8da136ae1282"
            },
            "downloads": -1,
            "filename": "pandasmore-0.0.6.tar.gz",
            "has_sig": false,
            "md5_digest": "e0c0336cac2956b13e76723135ac849a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 24468,
            "upload_time": "2024-01-31T22:56:53",
            "upload_time_iso_8601": "2024-01-31T22:56:53.225192Z",
            "url": "https://files.pythonhosted.org/packages/c3/02/5c2629fd4da033b8943dcf5efd137a3a21ded0949d419e1a762dcf7ef5f1/pandasmore-0.0.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-31 22:56:53",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ionmihai",
    "github_project": "pandasmore",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "pandasmore"
}
        