# NewsCatcher News API V2 SDK for Python
The official Python client library for working with [NewsCatcher News API V2](https://newscatcherapi.com/news-api) from your Python application.
The SDK mirrors the API documentation: the same parameters and filters are available, and the response structure is the same.
See [docs.newscatcherapi.com](https://docs.newscatcherapi.com) for details.
## Authentication
Authentication is done via the `x_api_key` parameter.
Get your API key by registering at [app.newscatcherapi.com](https://app.newscatcherapi.com).
## Installation
```pip install newscatcherapi```
## Quick Start
Import the installed package.
`````from newscatcherapi import NewsCatcherApiClient`````
Initialize the client with the API key you received after registration.
````newscatcherapi = NewsCatcherApiClient(x_api_key='YOUR_API_KEY') ````
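To keep the key out of source code, it can be read from the environment before the client is created. The following is a minimal sketch, assuming a hypothetical environment variable name `NEWSCATCHER_API_KEY` and a hypothetical helper `load_api_key`; neither is part of the SDK.

```python
import os


def load_api_key(env=None, var_name="NEWSCATCHER_API_KEY"):
    """Return the API key stored in an environment variable.

    NEWSCATCHER_API_KEY is a name chosen for this example;
    the SDK itself does not read it.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable to your API key")
    return key


# Usage (requires the installed package and a real key):
# from newscatcherapi import NewsCatcherApiClient
# newscatcherapi = NewsCatcherApiClient(x_api_key=load_api_key())
```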
## Endpoints
An instance of `NewsCatcherApiClient` has three main methods that correspond to the three endpoints of the NewsCatcher News API.
### Get News (/v2/search)
The main method. It lets you find news articles by keyword, date, language, country, etc.
```
all_articles = newscatcherapi.get_search(
    q='Elon Musk',
    lang='en',
    countries='CA',
    page_size=100,
)
```
### Get News Extracting All Pages (/v2/search)
It is the same method as *get_search*, but it extracts all pages automatically, so you do not have to change the `page` parameter manually.
For example, suppose a search finds 1,000 articles. *get_search* makes one API call and returns up to 100 articles.
*get_search_all_pages* makes 10 API calls and returns all 1,000 articles.
Two new parameters:
- `max_page` - The last page number to extract. Use it when you want to limit the number of extracted pages.
- `seconds_pause` - Number of seconds to wait before each call. This parameter helps you stay within the rate limit of your subscription plan. By default, it is set to 1 second.
```
all_articles = newscatcherapi.get_search_all_pages(
    q='Elon Musk',
    lang='en',
    countries='CA',
    page_size=100,
    max_page=10,
    seconds_pause=1.0,
)
```
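The number of calls in the example above follows from simple arithmetic: the total number of articles divided by `page_size`, rounded up and optionally capped by `max_page`. A rough sketch of that estimate is shown here; `estimate_calls` is a hypothetical helper, not an SDK function, and it assumes one pause of `seconds_pause` per call.

```python
import math


def estimate_calls(total_articles, page_size=100, max_page=None, seconds_pause=1.0):
    """Return (number_of_calls, total_pause_seconds) for a paginated extraction.

    Assumes one API call per page and one pause per call.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    pages = math.ceil(total_articles / page_size)
    if max_page is not None:
        pages = min(pages, max_page)
    return pages, pages * seconds_pause


print(estimate_calls(1000))              # (10, 10.0)
print(estimate_calls(1000, max_page=3))  # (3, 3.0)
```

This can help when choosing `max_page` and `seconds_pause` for a plan with a limited number of calls.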
### Get News Extracting All Articles (/v2/search)
It is the same method as *get_search*, but it fetches all articles without you having to change the `page`, `from_`, and `to_` parameters manually.
For example, suppose a search finds more than 10,000 articles. *get_search* makes one API call and returns up to 100 articles.
*get_search_all_pages* makes 100 API calls and returns 10,000 articles. *get_search_all_articles* returns all of the articles.
One new parameter:
- `by` - How to divide the time interval between `from_` and `to_` in order to extract all articles for the given search query. By default, it is set to `week`. Accepted values: `month`, `week`, `day`, `hour`.
```
all_articles = newscatcherapi.get_search_all_articles(
    q='Elon Musk',
    lang='en',
    countries='CA',
    page_size=100,
    by='day',
)
```
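The idea behind `by` is to split one long time interval into shorter windows that can each be queried separately. The sketch below only illustrates that idea; it is not the SDK's internal code. `split_interval` is a hypothetical helper, and `month` is left out because calendar months vary in length.

```python
from datetime import datetime, timedelta

# Window sizes for the fixed-length values of `by` (illustration only).
STEPS = {
    "week": timedelta(weeks=1),
    "day": timedelta(days=1),
    "hour": timedelta(hours=1),
}


def split_interval(start, end, by="week"):
    """Split [start, end) into consecutive windows of length `by`.

    The last window is shortened so that it ends exactly at `end`.
    """
    step = STEPS[by]
    windows = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows


windows = split_interval(datetime(2021, 8, 20), datetime(2021, 8, 31), by="day")
print(len(windows))  # 11
```

Smaller windows mean more API calls but fewer articles per window, which is the trade-off to consider when choosing `by`.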
### Get Latest Headlines (/v2/latest_headlines)
Get the latest headlines given any topic, country, sources, or language.
```
top_headlines = newscatcherapi.get_latest_headlines(
    lang='en',
    countries='us',
    topic='business',
)
```
### Get Latest Headlines Extracting All Pages (/v2/latest_headlines)
It is the same method as *get_latest_headlines*, but it extracts all pages automatically, so you do not have to change the `page` parameter manually.
For example, suppose a query finds 1,000 articles. *get_latest_headlines* makes one API call and returns up to 100 articles.
*get_latest_headlines_all_pages* makes 10 API calls and returns all 1,000 articles.
Two new parameters:
- `max_page` - The last page number to extract. Use it when you want to limit the number of extracted pages.
- `seconds_pause` - Number of seconds to wait before each call. This parameter helps you stay within the rate limit of your subscription plan. By default, it is set to 1 second.
```
top_headlines = newscatcherapi.get_latest_headlines_all_pages(
    lang='en',
    countries='us',
    topic='business',
    max_page=10,
    seconds_pause=1.0,
)
```
### Get Sources (/v2/sources)
Returns a list of the top 100 supported news websites. Overall, we support over 60,000 websites. With this method, you can find the top 100 for a specific combination of language, country, and topic.
```
sources = newscatcherapi.get_sources(
    topic='business',
    lang='en',
    countries='US',
)
```
### Every endpoint supports the _proxies_ parameter
If you want to use proxies, you can pass this parameter to any of the endpoint methods.
Here is an example of a valid `proxies` value and of using it with one of the endpoints.
```
proxies = {
    'http': 'http://proxy.example.com:8080',
    'https': 'http://secureproxy.example.com:8090',
}

all_articles = newscatcherapi.get_search(
    q='Elon Musk',
    lang='en',
    countries='CA',
    page_size=100,
    proxies=proxies,
)
```
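A malformed proxy URL is easy to miss until a request fails, so it can be worth checking the dictionary before passing it in. The sketch below is one way to do that with the standard library; `check_proxies` is a hypothetical helper, not part of the SDK, and it only verifies that each value has a scheme and a host.

```python
from urllib.parse import urlparse


def check_proxies(proxies):
    """Return the dict unchanged if every proxy URL has a scheme and a host.

    Raises ValueError for entries such as 'proxy.example.com:8080',
    which lack the leading 'http://'.
    """
    for protocol, url in proxies.items():
        parsed = urlparse(url)
        if not parsed.scheme or not parsed.hostname:
            raise ValueError(f"Proxy for {protocol!r} is not a full URL: {url!r}")
    return proxies


proxies = check_proxies({
    'http': 'http://proxy.example.com:8080',
    'https': 'http://secureproxy.example.com:8090',
})
```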
### Use *from_* and *to_* instead of *from* and *to* as in the NewsCatcher News API
In Python, *from* is a reserved keyword, so it cannot be used as a parameter name (*to* is not reserved, but the SDK renames both for consistency). If you try to pass *from*, you will get a syntax error:
```SyntaxError: invalid syntax```
Here is an example of how to use the time parameters *from_* and *to_* in the *get_search* method.
```
all_articles = newscatcherapi.get_search(
    q='Elon Musk',
    lang='en',
    countries='CA,US',
    from_='2021/08/20',
    to_='2021/08/31',
)
```
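Python's standard `keyword` module can confirm which names are reserved, and a small helper can build the two date arguments from `date` objects. This is a sketch only: `date_range_params` is a hypothetical helper, and the `YYYY/MM/DD` format is simply the one used in the example above.

```python
import keyword
from datetime import date

# `from` is reserved in Python; `to` and `from_` are ordinary names.
print(keyword.iskeyword("from"))   # True
print(keyword.iskeyword("to"))     # False
print(keyword.iskeyword("from_"))  # False


def date_range_params(start, end):
    """Build from_/to_ keyword arguments as YYYY/MM/DD strings."""
    return {
        "from_": start.strftime("%Y/%m/%d"),
        "to_": end.strftime("%Y/%m/%d"),
    }


params = date_range_params(date(2021, 8, 20), date(2021, 8, 31))
print(params)  # {'from_': '2021/08/20', 'to_': '2021/08/31'}

# Usage (requires an initialized client):
# all_articles = newscatcherapi.get_search(q='Elon Musk', **params)
```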
## Feedback
Feel free to contact us at maksym`[at]`newscatcherapi.com if you spot a bug or have any suggestions.
Raw data
{
"_id": null,
"home_page": "https://newscatcherapi.com/",
"name": "newscatcherapi",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6.0",
"maintainer_email": null,
"keywords": "News, RSS, Scraping, Data Mining, News Extraction",
"author": "Maksym Sugonyaka",
"author_email": "maksym@newscatcherapi.com",
"download_url": "https://files.pythonhosted.org/packages/62/d0/68bb064b6905061b3871f84d272420294d7a247e57620fb07f27584638eb/newscatcherapi-0.7.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "NewsCatcher News API V2 SDK for Python",
"version": "0.7.3",
"project_urls": {
"Homepage": "https://newscatcherapi.com/"
},
"split_keywords": [
"news",
" rss",
" scraping",
" data mining",
" news extraction"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e3cd352b8112bd3f6bc6775d1136d5892e1b988f3847a35c5e18382b813459b3",
"md5": "9c5581d6657faa7a7baf35463db81961",
"sha256": "c50d7efee72e06fd9b671db8b8c47a0bd10a3c4eb1894f97a7b11796f4afcd25"
},
"downloads": -1,
"filename": "newscatcherapi-0.7.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9c5581d6657faa7a7baf35463db81961",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6.0",
"size": 12774,
"upload_time": "2024-09-20T16:18:28",
"upload_time_iso_8601": "2024-09-20T16:18:28.238790Z",
"url": "https://files.pythonhosted.org/packages/e3/cd/352b8112bd3f6bc6775d1136d5892e1b988f3847a35c5e18382b813459b3/newscatcherapi-0.7.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "62d068bb064b6905061b3871f84d272420294d7a247e57620fb07f27584638eb",
"md5": "ea511559b5fc02245f5b680fe0dde00a",
"sha256": "6efe34ce2b8ca4987494670a400844826b4997092423c18d8ff58eaa55e63428"
},
"downloads": -1,
"filename": "newscatcherapi-0.7.3.tar.gz",
"has_sig": false,
"md5_digest": "ea511559b5fc02245f5b680fe0dde00a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6.0",
"size": 12369,
"upload_time": "2024-09-20T16:18:29",
"upload_time_iso_8601": "2024-09-20T16:18:29.618073Z",
"url": "https://files.pythonhosted.org/packages/62/d0/68bb064b6905061b3871f84d272420294d7a247e57620fb07f27584638eb/newscatcherapi-0.7.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-20 16:18:29",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "newscatcherapi"
}