a-pandas-ex-string-to-dtypes

Name	a-pandas-ex-string-to-dtypes JSON
Version	0.1 JSON
	download
home_page	https://github.com/hansalemaos/a_pandas_ex_string_to_dtypes
Summary	Convert a Pandas DataFrame/Series with dtype str/string/object to the best available dtypes
upload_time	2022-10-04 04:27:05
maintainer
docs_url	None
author	Johannes Fischer
requires_python
license	MIT
keywords	pandas dtypes converter
VCS
bugtrack_url
requirements	a_pandas_ex_df_to_string a_pandas_ex_less_memory_more_speed pandas
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            
## **What is it used for?**



Convert a Pandas DataFrame/Series with dtype str/string/object to the best available dtypes



### Installation



```python

pip install a-pandas-ex-string-to-dtypes

```



### Usage



```python

    from a_pandas_ex_string_to_dtypes import pd_add_string_to_dtypes

    import pandas as pd

    pd_add_string_to_dtypes()

    df = pd.read_csv("https://github.com/pandas-dev/pandas/raw/main/doc/data/titanic.csv")

    print(df)

    print(df.dtypes)   

    

    

         PassengerId  Survived  Pclass  ...     Fare Cabin  Embarked

    0              1         0       3  ...   7.2500   NaN         S

    1              2         1       1  ...  71.2833   C85         C

    2              3         1       3  ...   7.9250   NaN         S

    3              4         1       1  ...  53.1000  C123         S

    4              5         0       3  ...   8.0500   NaN         S

    ..           ...       ...     ...  ...      ...   ...       ...

    886          887         0       2  ...  13.0000   NaN         S

    887          888         1       1  ...  30.0000   B42         S

    888          889         0       3  ...  23.4500   NaN         S

    889          890         1       1  ...  30.0000  C148         C

    890          891         0       3  ...   7.7500   NaN         Q

    [891 rows x 12 columns]  

    

    PassengerId      int64

    Survived         int64

    Pclass           int64

    Name            object

    Sex             object

    Age            float64

    SibSp            int64

    Parch            int64

    Ticket          object

    Fare           float64

    Cabin           object

    Embarked        object

    dtype: object     

    

    

    

    

    

    dfstring = pd.concat(

        [df[x].astype("string") for x in df.columns], axis=1, ignore_index=True

    )

    dfstring.columns=df.columns

    print(dfstring)

    print(dfstring.dtypes)  

    

        PassengerId Survived Pclass  ...     Fare Cabin Embarked

    0             1        0      3  ...     7.25  <NA>        S

    1             2        1      1  ...  71.2833   C85        C

    2             3        1      3  ...    7.925  <NA>        S

    3             4        1      1  ...     53.1  C123        S

    4             5        0      3  ...     8.05  <NA>        S

    ..          ...      ...    ...  ...      ...   ...      ...

    886         887        0      2  ...     13.0  <NA>        S

    887         888        1      1  ...     30.0   B42        S

    888         889        0      3  ...    23.45  <NA>        S

    889         890        1      1  ...     30.0  C148        C

    890         891        0      3  ...     7.75  <NA>        Q

    [891 rows x 12 columns]    

    

    

    PassengerId    string

    Survived       string

    Pclass         string

    Name           string

    Sex            string

    Age            string

    SibSp          string

    Parch          string

    Ticket         string

    Fare           string

    Cabin          string

    Embarked       string

    dtype: object    

    

    

    

    converted = dfstring.ds_string_to_best_dtype()

    print(converted)

    print(converted.dtypes)

         PassengerId  Survived  Pclass  ...     Fare Cabin Embarked

    0              1         0       3  ...   7.2500  <NA>        S

    1              2         1       1  ...  71.2833   C85        C

    2              3         1       3  ...   7.9250  <NA>        S

    3              4         1       1  ...  53.1000  C123        S

    4              5         0       3  ...   8.0500  <NA>        S

    ..           ...       ...     ...  ...      ...   ...      ...

    886          887         0       2  ...  13.0000  <NA>        S

    887          888         1       1  ...  30.0000   B42        S

    888          889         0       3  ...  23.4500  <NA>        S

    889          890         1       1  ...  30.0000  C148        C

    890          891         0       3  ...   7.7500  <NA>        Q

    [891 rows x 12 columns]    

    

    

    PassengerId      uint16

    Survived          uint8

    Pclass            uint8

    Name             string

    Sex            category

    Age              object

    SibSp             uint8

    Parch             uint8

    Ticket           object

    Fare            float64

    Cabin          category

    Embarked       category

    dtype: object    

    

    

        Parameters:

            df: Union[pd.DataFrame, pd.Series]

        Returns:

            Union[pd.DataFrame, pd.Series]

```

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/hansalemaos/a_pandas_ex_string_to_dtypes",
    "name": "a-pandas-ex-string-to-dtypes",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "pandas,dtypes,converter",
    "author": "Johannes Fischer",
    "author_email": "<aulasparticularesdealemaosp@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/5a/cd/0ee0f6357f7fe2d7f9822eef64ffc9dd047cb288615e0d3ccec1eb39a9be/a_pandas_ex_string_to_dtypes-0.1.tar.gz",
    "platform": null,
    "description": "\n## **What is it used for?**\n\n\n\nConvert a Pandas DataFrame/Series with dtype str/string/object to the best available dtypes\n\n\n\n### Installation\n\n\n\n```python\n\npip install a-pandas-ex-string-to-dtypes\n\n```\n\n\n\n### Usage\n\n\n\n```python\n\n    from a_pandas_ex_string_to_dtypes import pd_add_string_to_dtypes\n\n    import pandas as pd\n\n    pd_add_string_to_dtypes()\n\n    df = pd.read_csv(\"https://github.com/pandas-dev/pandas/raw/main/doc/data/titanic.csv\")\n\n    print(df)\n\n    print(df.dtypes)   \n\n    \n\n    \n\n         PassengerId  Survived  Pclass  ...     Fare Cabin  Embarked\n\n    0              1         0       3  ...   7.2500   NaN         S\n\n    1              2         1       1  ...  71.2833   C85         C\n\n    2              3         1       3  ...   7.9250   NaN         S\n\n    3              4         1       1  ...  53.1000  C123         S\n\n    4              5         0       3  ...   8.0500   NaN         S\n\n    ..           ...       ...     ...  ...      ...   ...       ...\n\n    886          887         0       2  ...  13.0000   NaN         S\n\n    887          888         1       1  ...  30.0000   B42         S\n\n    888          889         0       3  ...  23.4500   NaN         S\n\n    889          890         1       1  ...  30.0000  C148         C\n\n    890          891         0       3  ...   7.7500   NaN         Q\n\n    [891 rows x 12 columns]  \n\n    \n\n    PassengerId      int64\n\n    Survived         int64\n\n    Pclass           int64\n\n    Name            object\n\n    Sex             object\n\n    Age            float64\n\n    SibSp            int64\n\n    Parch            int64\n\n    Ticket          object\n\n    Fare           float64\n\n    Cabin           object\n\n    Embarked        object\n\n    dtype: object     \n\n    \n\n    \n\n    \n\n    \n\n    \n\n    dfstring = pd.concat(\n\n        [df[x].astype(\"string\") for x in df.columns], axis=1, ignore_index=True\n\n    )\n\n    dfstring.columns=df.columns\n\n    print(dfstring)\n\n    print(dfstring.dtypes)  \n\n    \n\n        PassengerId Survived Pclass  ...     Fare Cabin Embarked\n\n    0             1        0      3  ...     7.25  <NA>        S\n\n    1             2        1      1  ...  71.2833   C85        C\n\n    2             3        1      3  ...    7.925  <NA>        S\n\n    3             4        1      1  ...     53.1  C123        S\n\n    4             5        0      3  ...     8.05  <NA>        S\n\n    ..          ...      ...    ...  ...      ...   ...      ...\n\n    886         887        0      2  ...     13.0  <NA>        S\n\n    887         888        1      1  ...     30.0   B42        S\n\n    888         889        0      3  ...    23.45  <NA>        S\n\n    889         890        1      1  ...     30.0  C148        C\n\n    890         891        0      3  ...     7.75  <NA>        Q\n\n    [891 rows x 12 columns]    \n\n    \n\n    \n\n    PassengerId    string\n\n    Survived       string\n\n    Pclass         string\n\n    Name           string\n\n    Sex            string\n\n    Age            string\n\n    SibSp          string\n\n    Parch          string\n\n    Ticket         string\n\n    Fare           string\n\n    Cabin          string\n\n    Embarked       string\n\n    dtype: object    \n\n    \n\n    \n\n    \n\n    converted = dfstring.ds_string_to_best_dtype()\n\n    print(converted)\n\n    print(converted.dtypes)\n\n         PassengerId  Survived  Pclass  ...     Fare Cabin Embarked\n\n    0              1         0       3  ...   7.2500  <NA>        S\n\n    1              2         1       1  ...  71.2833   C85        C\n\n    2              3         1       3  ...   7.9250  <NA>        S\n\n    3              4         1       1  ...  53.1000  C123        S\n\n    4              5         0       3  ...   8.0500  <NA>        S\n\n    ..           ...       ...     ...  ...      ...   ...      ...\n\n    886          887         0       2  ...  13.0000  <NA>        S\n\n    887          888         1       1  ...  30.0000   B42        S\n\n    888          889         0       3  ...  23.4500  <NA>        S\n\n    889          890         1       1  ...  30.0000  C148        C\n\n    890          891         0       3  ...   7.7500  <NA>        Q\n\n    [891 rows x 12 columns]    \n\n    \n\n    \n\n    PassengerId      uint16\n\n    Survived          uint8\n\n    Pclass            uint8\n\n    Name             string\n\n    Sex            category\n\n    Age              object\n\n    SibSp             uint8\n\n    Parch             uint8\n\n    Ticket           object\n\n    Fare            float64\n\n    Cabin          category\n\n    Embarked       category\n\n    dtype: object    \n\n    \n\n    \n\n        Parameters:\n\n            df: Union[pd.DataFrame, pd.Series]\n\n        Returns:\n\n            Union[pd.DataFrame, pd.Series]\n\n```\n\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Convert a Pandas DataFrame/Series with dtype str/string/object to the best available dtypes",
    "version": "0.1",
    "project_urls": {
        "Homepage": "https://github.com/hansalemaos/a_pandas_ex_string_to_dtypes"
    },
    "split_keywords": [
        "pandas",
        "dtypes",
        "converter"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0df2934ee563eee188b6d7a7d267cc85268364e31c2405ed6aa04622f194c941",
                "md5": "f1f0cc827f69ccfa097f87bf1c5b42f9",
                "sha256": "f428c8564bc589b102de2a8bae982a4c659b899f1214ae506c86207ff5c8769e"
            },
            "downloads": -1,
            "filename": "a_pandas_ex_string_to_dtypes-0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f1f0cc827f69ccfa097f87bf1c5b42f9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 8572,
            "upload_time": "2022-10-04T04:27:04",
            "upload_time_iso_8601": "2022-10-04T04:27:04.077472Z",
            "url": "https://files.pythonhosted.org/packages/0d/f2/934ee563eee188b6d7a7d267cc85268364e31c2405ed6aa04622f194c941/a_pandas_ex_string_to_dtypes-0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5acd0ee0f6357f7fe2d7f9822eef64ffc9dd047cb288615e0d3ccec1eb39a9be",
                "md5": "82e416bf7c01e79ed37a0823f97cfe32",
                "sha256": "850db3010565b38d5d3eb8fc0f3a97c8223b7d8b595835071de1934c39d0bf91"
            },
            "downloads": -1,
            "filename": "a_pandas_ex_string_to_dtypes-0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "82e416bf7c01e79ed37a0823f97cfe32",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 6291,
            "upload_time": "2022-10-04T04:27:05",
            "upload_time_iso_8601": "2022-10-04T04:27:05.710862Z",
            "url": "https://files.pythonhosted.org/packages/5a/cd/0ee0f6357f7fe2d7f9822eef64ffc9dd047cb288615e0d3ccec1eb39a9be/a_pandas_ex_string_to_dtypes-0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-10-04 04:27:05",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "hansalemaos",
    "github_project": "a_pandas_ex_string_to_dtypes",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "a_pandas_ex_df_to_string",
            "specs": []
        },
        {
            "name": "a_pandas_ex_less_memory_more_speed",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": []
        }
    ],
    "lcname": "a-pandas-ex-string-to-dtypes"
}

Johannes Fischer