# Extract Emails
![Image](https://github.com/dmitriiweb/extract-emails/blob/docs_improvements/images/email.png?raw=true)
[![PyPI version](https://badge.fury.io/py/extract-emails.svg)](https://badge.fury.io/py/extract-emails)
Extract emails and LinkedIn profiles from a given website.
**Support the project with BTC**: *bc1q0cxl5j3se0ufhr96h8x0zs8nz4t7h6krrxkd6l*
[Documentation](https://dmitriiweb.github.io/extract-emails/)
## Requirements
- Python >= 3.10, < 3.13
## Installation
```bash
pip install extract_emails[all]
# or
pip install extract_emails[requests]
# or
pip install extract_emails[selenium]
```
## Simple Usage
### As library
```python
from pathlib import Path

from extract_emails import DefaultFilterAndEmailFactory as Factory
from extract_emails import DefaultWorker
from extract_emails.browsers.requests_browser import RequestsBrowser as Browser
from extract_emails.data_savers import CsvSaver

websites = [
    "website1.com",
    "website2.com",
]

browser = Browser()
data_saver = CsvSaver(save_mode="a", output_path=Path("output.csv"))

for website in websites:
    factory = Factory(
        website_url=website, browser=browser, depth=5, max_links_from_page=1
    )
    worker = DefaultWorker(factory)
    data = worker.get_data()
    data_saver.save(data)
```
### As CLI tool
```bash
$ extract-emails --help
$ extract-emails --url https://en.wikipedia.org/wiki/Email -of output.csv -d 1
$ cat output.csv
email,page,website
bob@b.org,https://en.wikipedia.org/wiki/Email,https://en.wikipedia.org/wiki/Email
```
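The CSV shown above has three columns: `email`, `page`, and `website`. As a minimal sketch of post-processing that output, the snippet below groups unique addresses by website using only the standard library. The helper name `emails_by_website` is hypothetical and not part of `extract-emails`; it assumes only the header row from the example output.

```python
import csv
import io
from collections import defaultdict


def emails_by_website(csv_text: str) -> dict[str, list[str]]:
    """Group unique email addresses by the `website` column.

    Hypothetical helper, not part of extract-emails. It assumes the
    `email,page,website` header shown in the CLI example.
    """
    grouped: dict[str, list[str]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep first-seen order and skip addresses already recorded
        if row["email"] not in grouped[row["website"]]:
            grouped[row["website"]].append(row["email"])
    return dict(grouped)


# Sample text copied from the example output above
sample = (
    "email,page,website\n"
    "bob@b.org,https://en.wikipedia.org/wiki/Email,https://en.wikipedia.org/wiki/Email\n"
)
print(emails_by_website(sample))
```

To process a real run, pass the file contents instead of the sample, for example `emails_by_website(Path("output.csv").read_text())` after importing `Path` from `pathlib`.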
### Buy me a coffee
- **USDT** (TRC20): TXuYegp5L8Zf7wF2YRFjskZwdBxhRpvxBS
- **BEP20**: 0x4D51Db2B754eA83ce228F7de8EaEB93a88bdC965
- **TON**: UQA5quJljQz84RwzteN3uuKsdPTDee7a_GF5lgIgezA2oib5
Raw data
{
"_id": null,
"home_page": "https://github.com/dmitriiweb/extract-emails",
"name": "extract-emails",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.10",
"maintainer_email": null,
"keywords": "parser, email, linkedin",
"author": "Dmitrii Kurlov",
"author_email": "dmitriik@tutanota.com",
"download_url": "https://files.pythonhosted.org/packages/0f/5e/0e23c79df27780dc516bad14fb6b8378780e7601dab1192638c77bab6c78/extract_emails-5.3.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Extract email addresses and linkedin profiles from given URL.",
"version": "5.3.4",
"project_urls": {
"Documentation": "https://dmitriiweb.github.io/extract-emails",
"Homepage": "https://github.com/dmitriiweb/extract-emails",
"Repository": "https://github.com/dmitriiweb/extract-emails"
},
"split_keywords": [
"parser",
" email",
" linkedin"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e74e9375e91815e405837083ed16b55b214b11d7da4b12075104b00989e73e10",
"md5": "b50884c012e4cefedde03ebe6e5749ed",
"sha256": "f1c1745193d7b3ebc77f03c846290f027ab9a6aa39a117cdf931cf4d46227587"
},
"downloads": -1,
"filename": "extract_emails-5.3.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b50884c012e4cefedde03ebe6e5749ed",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.10",
"size": 31024,
"upload_time": "2024-06-02T09:50:16",
"upload_time_iso_8601": "2024-06-02T09:50:16.208375Z",
"url": "https://files.pythonhosted.org/packages/e7/4e/9375e91815e405837083ed16b55b214b11d7da4b12075104b00989e73e10/extract_emails-5.3.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0f5e0e23c79df27780dc516bad14fb6b8378780e7601dab1192638c77bab6c78",
"md5": "9382691bd735de25f15f57ef95b9e56b",
"sha256": "90a7c680028a582eda7501c79b27c0a2ad46268ae0b7a40bed31f942ae51debd"
},
"downloads": -1,
"filename": "extract_emails-5.3.4.tar.gz",
"has_sig": false,
"md5_digest": "9382691bd735de25f15f57ef95b9e56b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.10",
"size": 20045,
"upload_time": "2024-06-02T09:50:18",
"upload_time_iso_8601": "2024-06-02T09:50:18.863647Z",
"url": "https://files.pythonhosted.org/packages/0f/5e/0e23c79df27780dc516bad14fb6b8378780e7601dab1192638c77bab6c78/extract_emails-5.3.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-02 09:50:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dmitriiweb",
"github_project": "extract-emails",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"tox": true,
"lcname": "extract-emails"
}