extract-emails


| Field | Value |
| --- | --- |
| Name | extract-emails |
| Version | 5.3.4 |
| home_page | https://github.com/dmitriiweb/extract-emails |
| Summary | Extract email addresses and linkedin profiles from given URL. |
| upload_time | 2024-06-02 09:50:18 |
| maintainer | None |
| docs_url | None |
| author | Dmitrii Kurlov |
| requires_python | <3.13,>=3.10 |
| license | MIT |
| keywords | parser, email, linkedin |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# Extract Emails

![Image](https://github.com/dmitriiweb/extract-emails/blob/docs_improvements/images/email.png?raw=true)

[![PyPI version](https://badge.fury.io/py/extract-emails.svg)](https://badge.fury.io/py/extract-emails)

Extract email addresses and LinkedIn profiles from a given website.

**Support the project with BTC**: *bc1q0cxl5j3se0ufhr96h8x0zs8nz4t7h6krrxkd6l*

[Documentation](https://dmitriiweb.github.io/extract-emails/)

## Requirements

- Python >= 3.10, < 3.13
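
As a quick illustration of the supported range, the sketch below checks an interpreter version against the package's `requires_python` constraint (`<3.13,>=3.10`). The names `MIN_VERSION`, `MAX_EXCLUSIVE`, and `is_supported` are hypothetical helpers introduced here; they are not part of `extract_emails`.

```python
import sys

# Bounds taken from the package metadata: requires_python "<3.13,>=3.10".
MIN_VERSION = (3, 10)
MAX_EXCLUSIVE = (3, 13)


def is_supported(version=None):
    """Return True if the (major, minor) version falls inside the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor < MAX_EXCLUSIVE


if __name__ == "__main__":
    print("supported" if is_supported() else "unsupported", sys.version.split()[0])
```

In practice `pip` enforces this constraint at install time, so a check like this is mainly useful in scripts that pick an interpreter before installing.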

## Installation

```bash
# Quote the requirement so shells such as zsh do not expand the brackets.
pip install "extract_emails[all]"
# or
pip install "extract_emails[requests]"
# or
pip install "extract_emails[selenium]"
```

## Simple Usage

### As a library

```python
from pathlib import Path

from extract_emails import DefaultFilterAndEmailFactory as Factory
from extract_emails import DefaultWorker
from extract_emails.browsers.requests_browser import RequestsBrowser as Browser
from extract_emails.data_savers import CsvSaver


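# Placeholder sites; replace them with the websites you want to crawl.
websites = [
    "website1.com",
    "website2.com",
]

# One browser and one saver are reused for every website.
browser = Browser()
# save_mode="a" presumably appends to output.csv instead of overwriting it.
data_saver = CsvSaver(save_mode="a", output_path=Path("output.csv"))

for website in websites:
    # depth and max_links_from_page limit how far the crawl goes from the start page.
    factory = Factory(
        website_url=website, browser=browser, depth=5, max_links_from_page=1
    )
    worker = DefaultWorker(factory)
    data = worker.get_data()  # results collected for this website
    data_saver.save(data)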
```
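
The placeholder entries above are bare domains. HTTP clients such as `requests` reject URLs without a scheme, and this README does not say whether `extract_emails` adds one itself, so it may be safer to normalize inputs first. The helper below is a minimal sketch; `ensure_scheme` is a hypothetical name, not part of the library.

```python
def ensure_scheme(url: str, default_scheme: str = "https") -> str:
    """Return url unchanged if it already has a scheme, else prefix default_scheme."""
    url = url.strip()
    if "://" in url:
        return url
    return f"{default_scheme}://{url}"


websites = ["website1.com", "http://website2.com"]
normalized = [ensure_scheme(site) for site in websites]
print(normalized)
```

Each normalized value can then be passed as `website_url` in the loop shown earlier.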

### As a CLI tool

```bash
$ extract-emails --help

$ extract-emails --url https://en.wikipedia.org/wiki/Email -of output.csv -d 1
$ cat output.csv
email,page,website
bob@b.org,https://en.wikipedia.org/wiki/Email,https://en.wikipedia.org/wiki/Email
```
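
The CSV output can be post-processed with the standard library. The sketch below groups unique addresses by the `website` column, which helps when results are appended over several runs and may contain repeats. It relies only on the `email,page,website` header shown above; `emails_by_website` and `SAMPLE_CSV` are hypothetical names used for illustration.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the CLI output above.
SAMPLE_CSV = """email,page,website
bob@b.org,https://en.wikipedia.org/wiki/Email,https://en.wikipedia.org/wiki/Email
"""


def emails_by_website(csv_file):
    """Group unique, lower-cased email addresses by the `website` column."""
    grouped = defaultdict(set)
    for row in csv.DictReader(csv_file):
        grouped[row["website"]].add(row["email"].strip().lower())
    return {site: sorted(emails) for site, emails in grouped.items()}


result = emails_by_website(io.StringIO(SAMPLE_CSV))
print(result)
```

To read a real file, pass an open handle instead of the in-memory sample, for example `open("output.csv", newline="")`.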

### Buy me a coffee

- **USDT** (TRC20): TXuYegp5L8Zf7wF2YRFjskZwdBxhRpvxBS
- **BEP20**: 0x4D51Db2B754eA83ce228F7de8EaEB93a88bdC965
- **TON**: UQA5quJljQz84RwzteN3uuKsdPTDee7a_GF5lgIgezA2oib5

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/dmitriiweb/extract-emails",
    "name": "extract-emails",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.10",
    "maintainer_email": null,
    "keywords": "parser, email, linkedin",
    "author": "Dmitrii Kurlov",
    "author_email": "dmitriik@tutanota.com",
    "download_url": "https://files.pythonhosted.org/packages/0f/5e/0e23c79df27780dc516bad14fb6b8378780e7601dab1192638c77bab6c78/extract_emails-5.3.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Extract email addresses and linkedin profiles from given URL.",
    "version": "5.3.4",
    "project_urls": {
        "Documentation": "https://dmitriiweb.github.io/extract-emails",
        "Homepage": "https://github.com/dmitriiweb/extract-emails",
        "Repository": "https://github.com/dmitriiweb/extract-emails"
    },
    "split_keywords": [
        "parser",
        " email",
        " linkedin"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e74e9375e91815e405837083ed16b55b214b11d7da4b12075104b00989e73e10",
                "md5": "b50884c012e4cefedde03ebe6e5749ed",
                "sha256": "f1c1745193d7b3ebc77f03c846290f027ab9a6aa39a117cdf931cf4d46227587"
            },
            "downloads": -1,
            "filename": "extract_emails-5.3.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b50884c012e4cefedde03ebe6e5749ed",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.10",
            "size": 31024,
            "upload_time": "2024-06-02T09:50:16",
            "upload_time_iso_8601": "2024-06-02T09:50:16.208375Z",
            "url": "https://files.pythonhosted.org/packages/e7/4e/9375e91815e405837083ed16b55b214b11d7da4b12075104b00989e73e10/extract_emails-5.3.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0f5e0e23c79df27780dc516bad14fb6b8378780e7601dab1192638c77bab6c78",
                "md5": "9382691bd735de25f15f57ef95b9e56b",
                "sha256": "90a7c680028a582eda7501c79b27c0a2ad46268ae0b7a40bed31f942ae51debd"
            },
            "downloads": -1,
            "filename": "extract_emails-5.3.4.tar.gz",
            "has_sig": false,
            "md5_digest": "9382691bd735de25f15f57ef95b9e56b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.10",
            "size": 20045,
            "upload_time": "2024-06-02T09:50:18",
            "upload_time_iso_8601": "2024-06-02T09:50:18.863647Z",
            "url": "https://files.pythonhosted.org/packages/0f/5e/0e23c79df27780dc516bad14fb6b8378780e7601dab1192638c77bab6c78/extract_emails-5.3.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-02 09:50:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dmitriiweb",
    "github_project": "extract-emails",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "tox": true,
    "lcname": "extract-emails"
}
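
The `digests` entries in the metadata can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; `sha256_of` and `matches_digest` are hypothetical helper names, and the demonstration hashes a temporary file rather than a real download.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA-256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a temporary file standing in for a downloaded archive.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"example payload")
    expected = hashlib.sha256(b"example payload").hexdigest()
    print(matches_digest(sample, expected))
```

For a real check, `expected_hex` would be the `sha256` value listed for the matching `filename` in the `urls` array.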
        