ghanaweb-scraper


| Field | Value |
|---|---|
| Name | ghanaweb-scraper |
| Version | 1.0.2 |
| Home page | https://github.com/donwany/ghanaweb-scraper |
| Summary | A python package to scrape data from GhanaWeb |
| Upload time | 2023-03-15 15:45:56 |
| Docs URL | None |
| Author | Theophilus Siameh |
| Requires Python | >=3.7 |
| License | MIT License |
| Keywords | scraper, data, ghanaweb, web scraper, ghana scraper |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

### GhanaWeb Scraper
A simple, unofficial Python package for scraping data from [GhanaWeb](https://www.ghanaweb.com). Affiliated with [bank-of-ghana-fx-rates](https://pypi.org/project/bank-of-ghana-fx-rates/).

### How to install
```shell
pip install ghanaweb-scraper
```
### Warning: do not run in online Jupyter notebooks, e.g. Colab

### GhanaWeb URLs:
```python
urls = [
    "https://www.ghanaweb.com/GhanaHomePage/regional/",
    "https://www.ghanaweb.com/GhanaHomePage/editorial/",
    "https://www.ghanaweb.com/GhanaHomePage/health/",
    "https://www.ghanaweb.com/GhanaHomePage/diaspora/",
    "https://www.ghanaweb.com/GhanaHomePage/tabloid/",
    "https://www.ghanaweb.com/GhanaHomePage/africa/",
    "https://www.ghanaweb.com/GhanaHomePage/religion/",
    "https://www.ghanaweb.com/GhanaHomePage/NewsArchive/",
    "https://www.ghanaweb.com/GhanaHomePage/business/",
    "https://www.ghanaweb.com/GhanaHomePage/SportsArchive/",
    "https://www.ghanaweb.com/GhanaHomePage/entertainment/",
    "https://www.ghanaweb.com/GhanaHomePage/television/",
]
```
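The section URLs listed above all share the prefix `https://www.ghanaweb.com/GhanaHomePage/`, so the list can also be generated from section names. The following is a minimal sketch under that assumption; `BASE_URL`, `section_url`, and `section_urls` are hypothetical helpers for illustration and are not part of the package.

```python
# Hypothetical helpers, not part of ghanaweb-scraper.
# Assumes every section lives directly under the GhanaHomePage prefix.
BASE_URL = "https://www.ghanaweb.com/GhanaHomePage/"


def section_url(section):
    """Return the section URL for a name such as 'health'."""
    return f"{BASE_URL}{section.strip('/')}/"


def section_urls(sections):
    """Build URLs for several sections, dropping repeats but keeping order."""
    unique = dict.fromkeys(s.strip("/") for s in sections)
    return [section_url(s) for s in unique]


# 'africa' appears twice in the input but only once in the result.
urls = section_urls(["regional", "health", "africa", "business", "africa"])
```

Deduplicating up front avoids scraping the same section twice when a name is repeated by accident.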
### Usage
```python
from ghanaweb.scraper import GhanaWeb

url = 'https://www.ghanaweb.com/GhanaHomePage/politics/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/health/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/crime/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/regional/'
# url = 'https://www.ghanaweb.com/GhanaHomePage/year-in-review/'

# web = GhanaWeb(url='https://www.ghanaweb.com/GhanaHomePage/politics/')
web = GhanaWeb(url=url)
# scrape data and save to `current working dir`
web.download(output_dir=None)
```
### Scrape a list of articles from [GhanaWeb](https://ghanaweb.com)
```python
from ghanaweb.scraper import GhanaWeb

urls = [
    'https://www.ghanaweb.com/GhanaHomePage/politics/',
    'https://www.ghanaweb.com/GhanaHomePage/health/',
    'https://www.ghanaweb.com/GhanaHomePage/crime/',
    'https://www.ghanaweb.com/GhanaHomePage/regional/',
    'https://www.ghanaweb.com/GhanaHomePage/year-in-review/',
]

for url in urls:
    print(f"Downloading: {url}")
    web = GhanaWeb(url=url)
    # download to current working directory
    # if no location is specified
    # web.download(output_dir="/Users/tsiameh/Desktop/")
    web.download(output_dir=None)
```
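When looping over many URLs, one failed request would otherwise stop the whole run. The sketch below wraps the loop above so that failures are collected instead. `download_all` is a hypothetical helper, not part of the package; it only assumes the scraper class accepts `url=` and exposes `download(output_dir=...)`, as shown in the examples above. The exception types the package may raise are not documented here, so the sketch catches `Exception` broadly.

```python
def download_all(urls, scraper_cls, output_dir=None):
    """Call scraper_cls(url=...).download(...) per URL and collect failures.

    Hypothetical helper: scraper_cls is any class with the same
    constructor and download() shape as the examples in this README.
    """
    failed = []
    for url in urls:
        print(f"Downloading: {url}")
        try:
            scraper_cls(url=url).download(output_dir=output_dir)
        except Exception as exc:  # broad on purpose: error types are unknown
            print(f"Failed: {url} ({exc})")
            failed.append(url)
    return failed
```

With the package installed, this could be called as `failed = download_all(urls, GhanaWeb)` and the returned list retried later.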

### Scrape data from [MyJoyOnline](https://myjoyonline.com)
```python
from myjoyonline.scraper import MyJoyOnline

url = 'https://www.myjoyonline.com/news/'

print(f"Downloading data from: {url}")
joy = MyJoyOnline(url=url)
# download to current working directory
# if no location is specified
# joy.download(output_dir="/Users/tsiameh/Desktop/")
joy.download()
```
```python
from myjoyonline.scraper import MyJoyOnline

urls = [
    'https://www.myjoyonline.com/news/',
    'https://www.myjoyonline.com/entertainment/',
    'https://www.myjoyonline.com/business/',
    'https://www.myjoyonline.com/sports/',
    'https://www.myjoyonline.com/opinion/',
]

for url in urls:
    print(f"Downloading data from: {url}")
    joy = MyJoyOnline(url=url)
    # download to current working directory
    # if no location is specified
    # joy.download(output_dir="/Users/tsiameh/Desktop/")
    joy.download()
```
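The commented-out calls above pass an explicit `output_dir`. If each section should land in its own folder, the directory can be prepared before calling `download`. This is a minimal sketch: `prepare_output_dir` is a hypothetical helper, and whether `download` creates missing directories or needs a trailing separator is not documented here, so the sketch creates the folder itself and returns a path ending in a separator, matching the style of the example path above.

```python
import os
from pathlib import Path


def prepare_output_dir(base_dir, url):
    """Create base_dir/<last URL segment>/ and return it with a trailing separator.

    Hypothetical helper, not part of the package.
    """
    # 'https://www.myjoyonline.com/news/' -> 'news'
    name = url.rstrip("/").rsplit("/", 1)[-1]
    target = Path(base_dir).expanduser() / name
    target.mkdir(parents=True, exist_ok=True)
    # os.path.join(path, "") appends the platform's path separator
    return os.path.join(str(target), "")
```

Inside the loop this could be used as `joy.download(output_dir=prepare_output_dir("~/Desktop", url))`.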

BuyMeCoffee
-----------
[![Build](https://www.buymeacoffee.com/assets/img/custom_images/yellow_img.png)](https://www.buymeacoffee.com/theodondrew)

Credits
-------
-  `Theophilus Siameh`
<div>
    <a href="https://twitter.com/tsiameh"><img src="https://img.shields.io/twitter/follow/tsiameh?color=blue&logo=twitter&style=flat" alt="tsiameh twitter"></a>
</div>


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/donwany/ghanaweb-scraper",
    "name": "ghanaweb-scraper",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "Scraper,Data,Ghanaweb,Web Scraper,Ghana Scraper",
    "author": "Theophilus Siameh",
    "author_email": "theodondre@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/9e/58/3bf9643febbf1bfcec7e574c3fb14afd8bf7e07d03ea1f0863e6391d6b5b/ghanaweb-scraper-1.0.2.tar.gz",
    "platform": "any",
    "description": "### GhanaWeb Scraper\n  A simple unofficial python package to scrape data from [Ghanaweb](https://www.ghanaweb.com). Affiliated to [bank-of-ghana-fx-rates](https://pypi.org/project/bank-of-ghana-fx-rates/)\n\n### How to install\n```shell\npip install ghanaweb-scraper\n```\n### Warning: DO NOT RUN IN ONLINE JUPYTERNOTEBOOKS eg. Colabs\n\n### GhanaWeb Urls:\n```markdown\nurls = [\n    \"https://www.ghanaweb.com/GhanaHomePage/regional/\"\t\n    \"https://www.ghanaweb.com/GhanaHomePage/editorial/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/health/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/diaspora/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/tabloid/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/africa/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/religion/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/NewsArchive/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/business/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/SportsArchive/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/entertainment/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/africa/\"\n    \"https://www.ghanaweb.com/GhanaHomePage/television/\"\n]\n```\n### Usage\n```python\nfrom ghanaweb.scraper import GhanaWeb\n\nurl = 'https://www.ghanaweb.com/GhanaHomePage/politics/'\n# url = 'https://www.ghanaweb.com/GhanaHomePage/health/'\n# url = 'https://www.ghanaweb.com/GhanaHomePage/crime/'\n# url = 'https://www.ghanaweb.com/GhanaHomePage/regional/'\n# url = 'https://www.ghanaweb.com/GhanaHomePage/year-in-review/'\n\n# web = GhanaWeb(url='https://www.ghanaweb.com/GhanaHomePage/politics/')\nweb = GhanaWeb(url=url)\n# scrape data and save to `current working dir`\nweb.download(output_dir=None)\n```\n### scrape list of articles from [GhanaWeb](https://ghanaweb.com)\n```python\nfrom ghanaweb.scraper import GhanaWeb\n\nurls = [\n        'https://www.ghanaweb.com/GhanaHomePage/politics/',\n        'https://www.ghanaweb.com/GhanaHomePage/health/',\n        
'https://www.ghanaweb.com/GhanaHomePage/crime/',\n        'https://www.ghanaweb.com/GhanaHomePage/regional/',\n        'https://www.ghanaweb.com/GhanaHomePage/year-in-review/'\n    ]\n\nfor url in urls:\n    print(f\"Downloading: {url}\")\n    web = GhanaWeb(url=url)\n    # download to current working directory\n    # if no location is specified\n    # web.download(output_dir=\"/Users/tsiameh/Desktop/\")\n    web.download(output_dir=None)\n```\n\n### Scrape data from [MyJoyOnline](https://myjoyonline.com)\n```python\nfrom myjoyonline.scraper import MyJoyOnline\n\nurl = 'https://www.myjoyonline.com/news/',\n\nprint(f\"Downloading data from: {url}\")\njoy = MyJoyOnline(url=url)\n# download to current working directory\n# if no location is specified\n# joy.download(output_dir=\"/Users/tsiameh/Desktop/\")\njoy.download()\n```\n```python\nfrom myjoyonline.scraper import MyJoyOnline\n\nurls = [\n        'https://www.myjoyonline.com/news/',\n        'https://www.myjoyonline.com/entertainment/',\n        'https://www.myjoyonline.com/business/',\n        'https://www.myjoyonline.com/sports/',\n        'https://www.myjoyonline.com/opinion/'\n    ]\n\nfor url in urls:\n    print(f\"Downloading data from: {url}\")\n    joy = MyJoyOnline(url=url)\n    # download to current working directory\n    # if no location is specified\n    # joy.download(output_dir=\"/Users/tsiameh/Desktop/\")\n    joy.download()\n```\n\nBuyMeCoffee\n-----------\n[![Build](https://www.buymeacoffee.com/assets/img/custom_images/yellow_img.png)](https://www.buymeacoffee.com/theodondrew)\n\nCredits\n-------\n-  `Theophilus Siameh`\n<div>\n    <a href=\"https://twitter.com/tsiameh\"><img src=\"https://img.shields.io/twitter/follow/tsiameh?color=blue&logo=twitter&style=flat\" alt=\"tsiameh twitter\"></a>\n</div>\n\n",
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "A python package to scrape data from GhanaWeb",
    "version": "1.0.2",
    "split_keywords": [
        "scraper",
        "data",
        "ghanaweb",
        "web scraper",
        "ghana scraper"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0db203998f1f2a85a47c0cf2793cf284e53dc939ed1d7e61ef612819568664a6",
                "md5": "45effafa969a65198f84cfc091d6ff66",
                "sha256": "08550c2602785fe30d5ce253b978173ced00040d1160da6e94ee6acdb93054ec"
            },
            "downloads": -1,
            "filename": "ghanaweb_scraper-1.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "45effafa969a65198f84cfc091d6ff66",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 5782,
            "upload_time": "2023-03-15T15:45:53",
            "upload_time_iso_8601": "2023-03-15T15:45:53.990994Z",
            "url": "https://files.pythonhosted.org/packages/0d/b2/03998f1f2a85a47c0cf2793cf284e53dc939ed1d7e61ef612819568664a6/ghanaweb_scraper-1.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9e583bf9643febbf1bfcec7e574c3fb14afd8bf7e07d03ea1f0863e6391d6b5b",
                "md5": "f6fcd2cbf91e78965363fcce052b12b4",
                "sha256": "a27931e95deb115bdc255e9d77a778628debf06b818b70509d12d34382efa3dd"
            },
            "downloads": -1,
            "filename": "ghanaweb-scraper-1.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "f6fcd2cbf91e78965363fcce052b12b4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 4134,
            "upload_time": "2023-03-15T15:45:56",
            "upload_time_iso_8601": "2023-03-15T15:45:56.266186Z",
            "url": "https://files.pythonhosted.org/packages/9e/58/3bf9643febbf1bfcec7e574c3fb14afd8bf7e07d03ea1f0863e6391d6b5b/ghanaweb-scraper-1.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-15 15:45:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "donwany",
    "github_project": "ghanaweb-scraper",
    "lcname": "ghanaweb-scraper"
}
        