flashgeotext


- Name: flashgeotext
- Version: 0.5.3
- Home page: https://flashgeotext.iwpnd.pw
- Summary: Extract and count countries and cities (+their synonyms) from text
- Upload time: 2024-05-05 07:52:01
- Author: Benjamin Ramser
- Requires Python: <4.0.0,>=3.10.0
- License: MIT
- Keywords: geonames, nlp, text extraction

<p align="center">
<a href="https://github.com/iwpnd/flashgeotext/actions" target="_blank">
    <img src="https://github.com/iwpnd/flashgeotext/workflows/CI/badge.svg?branch=master" alt="Build Status">
</a>
<a href="https://codecov.io/gh/iwpnd/flashgeotext" target="_blank">
    <img src="https://codecov.io/gh/iwpnd/flashgeotext/branch/master/graph/badge.svg" alt="Coverage">
</a>
</p>

---

# flashgeotext :zap::earth_africa:

Extract and count countries and cities (plus their synonyms) from text, like [GeoText](https://github.com/elyase/geotext) on steroids, using [FlashText](https://github.com/vi3k6i5/flashtext/), an Aho-Corasick implementation. Flashgeotext is a fast, batteries-included (and BYOD) native Python library that extracts one or more sets of given city and country names (plus synonyms) from an input text.

**Introductory blog post**: [https://iwpnd.pw/articles/2020-02/flashgeotext-library](https://iwpnd.pw/articles/2020-02/flashgeotext-library)
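
To make "extract and count names plus their synonyms" concrete, here is a minimal sketch in plain Python. It is not how flashgeotext works internally: it swaps FlashText's keyword matching for a simple regular expression, and the `CITIES` lookup and the `extract_names` helper are hypothetical names used only for illustration. The result shape loosely mirrors the `count` / `span_info` / `found_as` keys that flashgeotext returns.

```python
import re

# Hypothetical lookup data: canonical name -> synonyms to search for.
# This is NOT flashgeotext's demo data, just a tiny stand-in.
CITIES = {
    "Washington, D.C.": ["Washington", "Washington DC"],
    "Shanghai": ["Shanghai"],
}


def extract_names(text, lookup):
    """Return {canonical_name: {"count", "span_info", "found_as"}} for `text`."""
    # Map every synonym back to its canonical name.
    synonym_to_name = {syn: name for name, syns in lookup.items() for syn in syns}
    # Try longer synonyms first so "Washington DC" wins over "Washington".
    alternatives = sorted(synonym_to_name, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, alternatives)) + r")\b")

    result = {}
    for match in pattern.finditer(text):
        name = synonym_to_name[match.group()]
        entry = result.setdefault(name, {"count": 0, "span_info": [], "found_as": []})
        entry["count"] += 1
        entry["span_info"].append(match.span())
        entry["found_as"].append(match.group())
    return result


sample = "Shanghai and Washington DC. Shanghai again."
found = extract_names(sample, CITIES)
print(found)
```

A regex alternation like this becomes slow as the list of names grows, which is the problem a trie-based matcher such as FlashText is meant to address.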

## Usage

```python
from flashgeotext.geotext import GeoText

geotext = GeoText()

input_text = '''Shanghai. The Chinese Ministry of Finance in Shanghai said that China plans
                to cut tariffs on $75 billion worth of goods that the country
                imports from the US. Washington welcomes the decision.'''

result = geotext.extract(input_text=input_text)
print(result)

# Example output (formatted for readability; span offsets depend on the
# exact whitespace of the input text):
# {
#     'cities': {
#         'Shanghai': {
#             'count': 2,
#             'span_info': [(0, 8), (45, 53)],
#             'found_as': ['Shanghai', 'Shanghai'],
#             },
#         'Washington, D.C.': {
#             'count': 1,
#             'span_info': [(175, 185)],
#             'found_as': ['Washington'],
#             }
#         },
#     'countries': {
#         'China': {
#             'count': 1,
#             'span_info': [(64, 69)],
#             'found_as': ['China'],
#             },
#         'United States': {
#             'count': 1,
#             'span_info': [(171, 173)],
#             'found_as': ['US'],
#             }
#         }
#     }
```
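
The nested result is convenient for lookups, but a flat list of matches is often easier to sort or load into a table. The sketch below works on a dictionary shaped like the output above (the values are copied from the example, not recomputed). The helper names `flatten_matches` and `rank_by_count` are hypothetical and not part of flashgeotext.

```python
# A result shaped like the example output above (values copied literally).
result = {
    "cities": {
        "Shanghai": {
            "count": 2,
            "span_info": [(0, 8), (45, 53)],
            "found_as": ["Shanghai", "Shanghai"],
        },
        "Washington, D.C.": {
            "count": 1,
            "span_info": [(175, 185)],
            "found_as": ["Washington"],
        },
    },
    "countries": {
        "China": {"count": 1, "span_info": [(64, 69)], "found_as": ["China"]},
        "United States": {"count": 1, "span_info": [(171, 173)], "found_as": ["US"]},
    },
}


def flatten_matches(result):
    """Yield one (category, name, start, end, found_as) row per match."""
    for category, places in result.items():
        for name, info in places.items():
            for (start, end), found_as in zip(info["span_info"], info["found_as"]):
                yield category, name, start, end, found_as


def rank_by_count(result, category):
    """Return (name, count) pairs for one category, most frequent first."""
    places = result.get(category, {})
    pairs = ((name, info["count"]) for name, info in places.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Matches in the order they appear in the text.
rows = sorted(flatten_matches(result), key=lambda row: row[2])
for row in rows:
    print(row)

print(rank_by_count(result, "cities"))
```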

## Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

### Installing

pip:

```bash
pip install flashgeotext
```

conda:

```bash
conda install flashgeotext
```

for development:

```bash
git clone https://github.com/iwpnd/flashgeotext.git
cd flashgeotext/
poetry install
```

### Running the tests

```bash
poetry run pytest . -v
```

## Authors

- **Benjamin Ramser** - _Initial work_ - [iwpnd](https://github.com/iwpnd)

See also the list of [contributors](https://github.com/iwpnd/flashgeotext/contributors) who participated in this project.

## License

This project is licensed under the MIT License. See the [LICENSE.md](LICENSE.md) file for details.

The demo data for cities comes from [http://www.geonames.org](http://www.geonames.org) and is licensed under the Creative Commons Attribution 3.0 License.

## Acknowledgments

- Hat tip to [@vi3k6i5](https://github.com/vi3k6i5) for his [paper](https://arxiv.org/abs/1711.00046) and implementation.


            
