Name Parser
===========

:Package: nameparser
:Version: 1.1.2
:Author: Derek Gulbranson
:License: LGPL
:Home page: https://github.com/derek73/python-nameparser
:Uploaded: 2022-11-14 03:05:45
:Keywords: names, parser

|Build Status| |PyPI| |PyPI version| |Documentation|

A simple Python (3.2+ & 2.6+) module for parsing human names into their
individual components. For a parsed `HumanName` instance ``hn``, the
components are available as:

* hn.title
* hn.first
* hn.middle
* hn.last
* hn.suffix
* hn.nickname
* hn.surnames *(middle + last)*
* hn.initials *(first initial of each name part)*

Supported Name Structures
~~~~~~~~~~~~~~~~~~~~~~~~~

The supported name structure is generally "Title First Middle Last Suffix", where all pieces
are optional. A comma-separated format like "Last, First" is also supported.

1. Title Firstname "Nickname" Middle Middle Lastname Suffix
2. Lastname [Suffix], Title Firstname (Nickname) Middle Middle[,] Suffix [, Suffix]
3. Title Firstname M Lastname [Suffix], Suffix [Suffix] [, Suffix]

Instantiating the `HumanName` class with a string splits it on commas and then on spaces,
classifying name parts based on their placement in the string and on matches against known
name pieces like titles and suffixes.
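The comma-then-space splitting described above can be illustrated with a small, self-contained sketch. This is a toy approximation and not nameparser's actual implementation: ``toy_parse``, ``TITLES`` and ``SUFFIXES`` are hypothetical names, the sets are tiny samples, and nicknames, conjunctions and last-name prefixes are ignored.

```python
# Toy illustration only; NOT nameparser's real algorithm.
# TITLES and SUFFIXES are small hypothetical samples.
TITLES = {"dr", "mr", "mrs", "ms", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "phd", "md"}


def _key(piece):
    """Normalize a piece for set lookup: lowercase, periods removed."""
    return piece.lower().replace(".", "")


def toy_parse(full_name):
    """Split on commas, then whitespace, and bucket pieces by position."""
    parts = [p.split() for p in full_name.split(",") if p.strip()]
    result = {"title": [], "first": [], "middle": [], "last": [], "suffix": []}

    if len(parts) > 1 and all(_key(p) in SUFFIXES for p in parts[1]):
        # "Title First Middle Last, Suffix [, Suffix]"
        pieces = list(parts[0])
        for extra in parts[1:]:
            result["suffix"].extend(extra)
    elif len(parts) > 1:
        # "Last, Title First Middle [, Suffix]"
        result["last"] = list(parts[0])
        pieces = list(parts[1])
        for extra in parts[2:]:
            result["suffix"].extend(extra)
    else:
        pieces = list(parts[0]) if parts else []

    # Leading pieces that match known titles become the title.
    while pieces and _key(pieces[0]) in TITLES:
        result["title"].append(pieces.pop(0))

    # Trailing pieces that match known suffixes become the suffix.
    trailing = []
    while len(pieces) > 1 and _key(pieces[-1]) in SUFFIXES:
        trailing.insert(0, pieces.pop())
    result["suffix"] = trailing + result["suffix"]

    # Whatever remains is first, then last, with the rest as middle.
    if pieces:
        result["first"] = [pieces.pop(0)]
    if pieces and not result["last"]:
        result["last"] = [pieces.pop()]
    result["middle"] = pieces

    return {field: " ".join(values) for field, values in result.items()}
```

Because the buckets are assigned by position, a known title that appears after the final comma, as in ``toy_parse("Vega, Juan, Dr.")``, ends up in ``suffix`` rather than ``title`` in this sketch.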

It correctly handles some common conjunctions and special prefixes to last names
like "del". Titles and conjunctions can be chained together to handle complex
titles like "Asst Secretary of State". It can also try to correct the capitalization
of names that are entirely upper- or lowercase.
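The idea of correcting capitalization only when a name is entirely upper- or lowercase can be sketched as follows. This is a simplified stand-in and not the library's own logic: ``fix_capitalization`` and the ``PARTICLES`` sample set are hypothetical, and cases such as "O'Connor" or "McDonald" are not handled.

```python
# Simplified sketch; NOT nameparser's capitalization rules.
# PARTICLES is a small hypothetical sample of last-name prefixes
# that are kept lowercase.
PARTICLES = {"de", "la", "del", "van", "von"}


def fix_capitalization(name):
    """Re-case `name` only if it is entirely upper- or lowercase."""
    if name != name.upper() and name != name.lower():
        # Mixed case: assume the capitalization is intentional.
        return name
    words = []
    for word in name.lower().split():
        words.append(word if word in PARTICLES else word.capitalize())
    return " ".join(words)
```

Leaving mixed-case input untouched avoids damaging names whose capitalization is already deliberate.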

It attempts the best guess that can be made with a simple, rule-based approach. 
Its main use case is English and it is not likely to be useful for languages 
that do not conform to the supported name structure. It's not perfect, but it 
gets you pretty far.

Installation
------------

::

  pip install nameparser

If you want to try out the latest code from GitHub, you can
install it with pip using the command below.

``pip install -e git+https://github.com/derek73/python-nameparser.git#egg=nameparser``

If you need to handle lists of names, check out
`namesparser <https://github.com/gwu-libraries/namesparser>`_, a
complement to this module that handles multiple names in a string.


Quick Start Example
-------------------

::

    >>> from nameparser import HumanName
    >>> name = HumanName("Dr. Juan Q. Xavier de la Vega III (Doc Vega)")
    >>> name 
    <HumanName : [
    	title: 'Dr.' 
    	first: 'Juan' 
    	middle: 'Q. Xavier' 
    	last: 'de la Vega' 
    	suffix: 'III'
    	nickname: 'Doc Vega'
    ]>
    >>> name.last
    'de la Vega'
    >>> name.as_dict()
    {'last': 'de la Vega', 'suffix': 'III', 'title': 'Dr.', 'middle': 'Q. Xavier', 'nickname': 'Doc Vega', 'first': 'Juan'}
    >>> str(name)
    'Dr. Juan Q. Xavier de la Vega III (Doc Vega)'
    >>> name.string_format = "{first} {last}"
    >>> str(name)
    'Juan de la Vega'


The parser does not attempt to correct mistakes in the input. It mostly just splits on
whitespace and puts things in buckets based on their position in the string. This also means
the difference between 'title' and 'suffix' is positional, not semantic. "Dr" is a title
when it comes before the name and a suffix when it comes after. ("Pre-nominal"
and "post-nominal" would probably be better names.)

::

    >>> name = HumanName("1 & 2, 3 4 5, Mr.")
    >>> name 
    <HumanName : [
    	title: '' 
    	first: '3' 
    	middle: '4 5' 
    	last: '1 & 2' 
    	suffix: 'Mr.'
    	nickname: ''
    ]>

Customization
-------------

Your project may need some adjustment for your dataset. You can
do this in your own pre- or post-processing, by `customizing the configured pre-defined 
sets`_ of titles, prefixes, etc., or by subclassing the `HumanName` class. See the 
`full documentation`_ for more information.
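As one possible form of pre-processing, the sketch below normalizes whitespace and comma spacing before a string is handed to the parser. The function name ``clean_name_string`` and its rules are assumptions made for illustration; they are not part of nameparser, and your dataset may need different rules.

```python
import re


def clean_name_string(raw):
    """Hypothetical pre-processing step applied before parsing."""
    text = raw.strip()
    text = re.sub(r"\s+", " ", text)       # collapse runs of whitespace
    text = re.sub(r"\s+,", ",", text)      # drop spaces before commas
    text = re.sub(r",(?=\S)", ", ", text)  # ensure a space after commas
    return text
```

The cleaned string can then be passed to ``HumanName`` in place of the raw input.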


`Full documentation`_
~~~~~~~~~~~~~~~~~~~~~

.. _customizing the configured pre-defined sets: http://nameparser.readthedocs.org/en/latest/customize.html
.. _Full documentation: http://nameparser.readthedocs.org/en/latest/


Contributing
------------

If you come across a name piece that you think should be in the default config, you're
probably right. `Start a New Issue`_ and we can get it added.

Please let me know if there are ways this library could be structured to make
it easier for you to use in your projects. Read CONTRIBUTING.md_ for more info
on running the tests and contributing to the project.

**GitHub Project**

https://github.com/derek73/python-nameparser

.. _CONTRIBUTING.md: https://github.com/derek73/python-nameparser/tree/master/CONTRIBUTING.md
.. _Start a New Issue: https://github.com/derek73/python-nameparser/issues
.. _click here to propose changes to the titles: https://github.com/derek73/python-nameparser/edit/master/nameparser/config/titles.py

.. |Build Status| image:: https://github.com/derek73/python-nameparser/actions/workflows/python-package.yml/badge.svg
   :target: https://github.com/derek73/python-nameparser/actions/workflows/python-package.yml
.. |PyPI| image:: https://img.shields.io/pypi/v/nameparser.svg
   :target: https://pypi.org/project/nameparser/
.. |Documentation| image:: https://readthedocs.org/projects/nameparser/badge/?version=latest
   :target: http://nameparser.readthedocs.io/en/latest/?badge=latest
.. |PyPI version| image:: https://img.shields.io/pypi/pyversions/nameparser.svg
   :target: https://pypi.org/project/nameparser/
