### GhanaWeb Scraper
A simple, unofficial Python package for scraping data from [GhanaWeb](https://www.ghanaweb.com). Affiliated with [bank-of-ghana-fx-rates](https://pypi.org/project/bank-of-ghana-fx-rates/).
### How to install
```shell
pip install ghanaweb-scraper
```
### Warning: DO NOT run this in online Jupyter notebooks (e.g. Google Colab).
### GhanaWeb URLs
```python
urls = [
    "https://www.ghanaweb.com/GhanaHomePage/regional/",
    "https://www.ghanaweb.com/GhanaHomePage/editorial/",
    "https://www.ghanaweb.com/GhanaHomePage/health/",
    "https://www.ghanaweb.com/GhanaHomePage/diaspora/",
    "https://www.ghanaweb.com/GhanaHomePage/tabloid/",
    "https://www.ghanaweb.com/GhanaHomePage/africa/",
    "https://www.ghanaweb.com/GhanaHomePage/religion/",
    "https://www.ghanaweb.com/GhanaHomePage/NewsArchive/",
    "https://www.ghanaweb.com/GhanaHomePage/business/",
    "https://www.ghanaweb.com/GhanaHomePage/SportsArchive/",
    "https://www.ghanaweb.com/GhanaHomePage/entertainment/",
    "https://www.ghanaweb.com/GhanaHomePage/television/",
]
```
### Usage
```python
from ghanaweb.scraper import GhanaWeb

url = 'https://www.ghanaweb.com/GhanaHomePage/politics/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/health/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/crime/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/regional/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/year-in-review/'

web = GhanaWeb(url=url)
# scrape articles and save them to the current working directory
web.download(output_dir=None)
```
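If you want the files written somewhere other than the current working directory, the comments in the examples suggest `output_dir` accepts a path string. A minimal sketch under that assumption, using a hypothetical `data/ghanaweb` folder:
```python
from pathlib import Path

from ghanaweb.scraper import GhanaWeb

# hypothetical output folder; create it first so the scraper has somewhere to write
out_dir = Path("data/ghanaweb")
out_dir.mkdir(parents=True, exist_ok=True)

web = GhanaWeb(url='https://www.ghanaweb.com/GhanaHomePage/politics/')
# assumes output_dir accepts a path string, as in the README comments
web.download(output_dir=str(out_dir))
```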
### Scrape a list of articles from [GhanaWeb](https://ghanaweb.com)
```python
from ghanaweb.scraper import GhanaWeb
urls = [
    'https://www.ghanaweb.com/GhanaHomePage/politics/',
    'https://www.ghanaweb.com/GhanaHomePage/health/',
    'https://www.ghanaweb.com/GhanaHomePage/crime/',
    'https://www.ghanaweb.com/GhanaHomePage/regional/',
    'https://www.ghanaweb.com/GhanaHomePage/year-in-review/'
]

for url in urls:
    print(f"Downloading: {url}")
    web = GhanaWeb(url=url)
    # downloads to the current working directory
    # if no location is specified
    # web.download(output_dir="/Users/tsiameh/Desktop/")
    web.download(output_dir=None)
```
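When looping over several sections, one network hiccup should not abort the whole run. A minimal sketch that adds basic error handling and a polite pause between requests; the `try`/`except` wrapper and the delay length are my own additions, not part of the package:
```python
import time

from ghanaweb.scraper import GhanaWeb

urls = [
    'https://www.ghanaweb.com/GhanaHomePage/politics/',
    'https://www.ghanaweb.com/GhanaHomePage/health/',
]

for url in urls:
    try:
        print(f"Downloading: {url}")
        GhanaWeb(url=url).download(output_dir=None)
    except Exception as exc:
        # keep going if one section fails
        print(f"Skipping {url}: {exc}")
    time.sleep(5)  # assumed polite delay between sections; tune as needed
```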
### Scrape data from [MyJoyOnline](https://myjoyonline.com)
```python
from myjoyonline.scraper import MyJoyOnline

url = 'https://www.myjoyonline.com/news/'

print(f"Downloading data from: {url}")
joy = MyJoyOnline(url=url)
# downloads to the current working directory
# if no location is specified
# joy.download(output_dir="/Users/tsiameh/Desktop/")
joy.download()
```
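To scrape several MyJoyOnline sections in one run: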
```python
from myjoyonline.scraper import MyJoyOnline

urls = [
    'https://www.myjoyonline.com/news/',
    'https://www.myjoyonline.com/entertainment/',
    'https://www.myjoyonline.com/business/',
    'https://www.myjoyonline.com/sports/',
    'https://www.myjoyonline.com/opinion/'
]

for url in urls:
    print(f"Downloading data from: {url}")
    joy = MyJoyOnline(url=url)
    # downloads to the current working directory
    # if no location is specified
    # joy.download(output_dir="/Users/tsiameh/Desktop/")
    joy.download()
```
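Both scrapers can also be driven from a single script. A minimal sketch, assuming `output_dir` accepts a path string as in the comments above; the folder names `data/ghanaweb` and `data/myjoyonline` are hypothetical:
```python
from pathlib import Path

from ghanaweb.scraper import GhanaWeb
from myjoyonline.scraper import MyJoyOnline

jobs = [
    # (scraper class, section URL, hypothetical output folder)
    (GhanaWeb, 'https://www.ghanaweb.com/GhanaHomePage/politics/', Path("data/ghanaweb")),
    (MyJoyOnline, 'https://www.myjoyonline.com/news/', Path("data/myjoyonline")),
]

for scraper_cls, url, out_dir in jobs:
    out_dir.mkdir(parents=True, exist_ok=True)
    print(f"Downloading data from: {url}")
    scraper_cls(url=url).download(output_dir=str(out_dir))
```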
BuyMeCoffee
-----------
[![Build](https://www.buymeacoffee.com/assets/img/custom_images/yellow_img.png)](https://www.buymeacoffee.com/theodondrew)
Credits
-------
- `Theophilus Siameh`
<div>
<a href="https://twitter.com/tsiameh"><img src="https://img.shields.io/twitter/follow/tsiameh?color=blue&logo=twitter&style=flat" alt="tsiameh twitter"></a>
</div>