newscatcherapi


- Name: newscatcherapi
- Version: 0.7.3
- Home page: https://newscatcherapi.com/
- Summary: NewsCatcher News API V2 SDK for Python
- Upload time: 2024-09-20 16:18:29
- Author: Maksym Sugonyaka
- Requires Python: >=3.6.0
- License: MIT
- Keywords: news, rss, scraping, data mining, news extraction

# NewsCatcher News API V2 SDK for Python

The official Python client library for working with the [NewsCatcher News API V2](https://newscatcherapi.com/news-api) from your Python application.

The SDK follows the API documentation: the same parameters, filters, and response structure are available.
For details, see [docs.newscatcherapi.com](https://docs.newscatcherapi.com).

## Authentication

Authentication is done via the `x_api_key` parameter.

Get your API key by registering at [app.newscatcherapi.com](https://app.newscatcherapi.com).

## Installation
```pip install newscatcherapi```

## Quick Start
Import the installed package:

```from newscatcherapi import NewsCatcherApiClient```

Initialize the client with the API key you received after registration:

```newscatcherapi = NewsCatcherApiClient(x_api_key='YOUR_API_KEY')```
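
To keep the key out of source code, one common pattern is to read it from an environment variable. The sketch below assumes a variable named `NEWSCATCHER_API_KEY` and a helper called `load_api_key`; both names are hypothetical and not part of the SDK.

```python
import os


def load_api_key(env=None, var_name="NEWSCATCHER_API_KEY"):
    """Return the API key stored in an environment variable.

    `var_name` is an illustrative name chosen for this sketch; the SDK
    itself only expects the key to be passed as `x_api_key`.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Usage (requires the installed package and a real key):
# from newscatcherapi import NewsCatcherApiClient
# newscatcherapi = NewsCatcherApiClient(x_api_key=load_api_key())
```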

## Endpoints
An instance of `NewsCatcherApiClient` has three main methods that correspond to the three endpoints of the NewsCatcher News API.

### Get News (/v2/search)
The main method, which lets you find news articles by keyword, date, language, country, etc.

```
all_articles = newscatcherapi.get_search(q='Elon Musk',
                                         lang='en',
                                         countries='CA',
                                         page_size=100)
```
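
The response follows the structure described in the API documentation. As a minimal sketch of how it might be consumed, the example below uses a hand-written stand-in dictionary instead of a live call; the keys `total_hits`, `articles`, `title`, and `link` are assumptions based on the API docs, and `list_titles` is a hypothetical helper.

```python
# Stand-in for the dictionary returned by get_search; the keys used here
# ("total_hits", "articles", "title", "link") are assumed, not guaranteed.
all_articles = {
    "status": "ok",
    "total_hits": 2,
    "articles": [
        {"title": "First headline", "link": "https://example.com/1"},
        {"title": "Second headline", "link": "https://example.com/2"},
    ],
}


def list_titles(response):
    """Return (title, link) pairs, tolerating a missing 'articles' key."""
    return [(a.get("title"), a.get("link")) for a in response.get("articles") or []]


for title, link in list_titles(all_articles):
    print(f"{title} -> {link}")
```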

### Get News Extracting All Pages (/v2/search)
It is the same method as *get_search*, but it pages through the results for you, so you do not have to change the `page` parameter manually.

For example, suppose a search matches 1,000 articles. *get_search* makes one API call and returns up to 100 articles, while *get_search_all_pages* makes 10 API calls and returns all 1,000 articles.

Two additional parameters:
- `max_page` - The last page number to extract. Use it to limit the number of extracted pages.
- `seconds_pause` - Number of seconds to wait before each call. This parameter helps you stay within the rate limit of your subscription plan. The default is 1 second.

```
all_articles = newscatcherapi.get_search_all_pages(q='Elon Musk',
                                                   lang='en',
                                                   countries='CA',
                                                   page_size=100,
                                                   max_page=10,
                                                   seconds_pause=1.0)
```
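
To make the behaviour concrete, here is a rough sketch of the kind of loop such a method could run. It is not the SDK's actual implementation: `fetch_all_pages` and `fake_fetch` are hypothetical helpers, and the response keys `articles` and `total_pages` are assumed.

```python
import time


def fetch_all_pages(fetch_page, max_page=None, seconds_pause=1.0):
    """Collect articles page by page until the last page or max_page.

    fetch_page(page) must return a dict with "articles" and "total_pages"
    (assumed key names for this sketch).
    """
    articles = []
    page = 1
    while True:
        time.sleep(seconds_pause)  # pause before each call to respect rate limits
        response = fetch_page(page)
        articles.extend(response.get("articles", []))
        last_page = response.get("total_pages", 1)
        if max_page is not None:
            last_page = min(last_page, max_page)
        if page >= last_page:
            return articles
        page += 1


def fake_fetch(page, total=250, page_size=100):
    """Stand-in for one API call: 250 articles served 100 per page."""
    start = (page - 1) * page_size
    ids = list(range(start, min(start + page_size, total)))
    return {"articles": ids, "total_pages": -(-total // page_size)}


print(len(fetch_all_pages(fake_fetch, seconds_pause=0)))              # 250
print(len(fetch_all_pages(fake_fetch, max_page=2, seconds_pause=0)))  # 200
```

With a real client, `fetch_page` could be something like `lambda p: newscatcherapi.get_search(q='Elon Musk', page=p, page_size=100)`, though in practice *get_search_all_pages* already does this work for you.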


### Get News Extracting All Articles (/v2/search)
It is the same method as *get_search*, but it fetches all articles for you, so you do not have to change the `page`, `from_`, and `to_` parameters manually.

For example, suppose a search matches more than 10,000 articles. *get_search* makes one API call and returns up to 100 articles.
*get_search_all_pages* makes 100 API calls and returns 10,000 articles. The *get_search_all_articles* method returns all articles.

One additional parameter:
- `by` - How to divide the time interval between `from_` and `to_` in order to extract all articles for the given search query. The default is `week`. Accepted values: `month`, `week`, `day`, `hour`.

```
all_articles = newscatcherapi.get_search_all_articles(q='Elon Musk',
                                                      lang='en',
                                                      countries='CA',
                                                      page_size=100,
                                                      by='day')
```
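
The idea of dividing the time interval with `by` can be illustrated with a short sketch. This is an assumption about the general approach, not the SDK's code: `split_interval` is a hypothetical helper, and `month` is left out because `timedelta` has no month unit.

```python
from datetime import datetime, timedelta

# Window lengths for the values this sketch supports.
STEP = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}


def split_interval(from_, to_, by="week"):
    """Split [from_, to_) into consecutive windows of length `by`."""
    step = STEP[by]
    windows = []
    start = from_
    while start < to_:
        end = min(start + step, to_)  # the last window may be shorter
        windows.append((start, end))
        start = end
    return windows


windows = split_interval(datetime(2021, 8, 20), datetime(2021, 8, 31), by="week")
for start, end in windows:
    print(start.date(), "->", end.date())
```

Each window could then be searched and paged through separately, which keeps every individual query below the number of articles that a single search can return.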

### Get Latest Headlines (/v2/latest_headlines)
Get the latest headlines given any topic, country, sources, or language.

```
top_headlines = newscatcherapi.get_latest_headlines(lang='en',
                                                    countries='us',
                                                    topic='business')
 ```

### Get Latest Headlines Extracting All Pages (/v2/latest_headlines)
It is the same method as *get_latest_headlines*, but it pages through the results for you, so you do not have to change the `page` parameter manually.

For example, suppose a query matches 1,000 articles. *get_latest_headlines* makes one API call and returns up to 100 articles, while *get_latest_headlines_all_pages* makes 10 API calls and returns all 1,000 articles.

Two additional parameters:
- `max_page` - The last page number to extract. Use it to limit the number of extracted pages.
- `seconds_pause` - Number of seconds to wait before each call. This parameter helps you stay within the rate limit of your subscription plan. The default is 1 second.

```
top_headlines = newscatcherapi.get_latest_headlines_all_pages(lang='en',
                                                              countries='us',
                                                              topic='business',
                                                              max_page=10,
                                                              seconds_pause=1.0)
```
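
Before running an all-pages request, it can help to estimate how many API calls it will consume and how long the pauses alone will take. The arithmetic is sketched below; `estimate_calls` is a hypothetical helper, and it assumes one pause before every call, as described for `seconds_pause`.

```python
import math


def estimate_calls(total_hits, page_size=100, max_page=None, seconds_pause=1.0):
    """Return (number of API calls, minimum seconds spent pausing)."""
    calls = math.ceil(total_hits / page_size)
    if max_page is not None:
        calls = min(calls, max_page)  # max_page caps the number of pages
    return calls, calls * seconds_pause


print(estimate_calls(1000))              # (10, 10.0)
print(estimate_calls(1000, max_page=3))  # (3, 3.0)
```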

### Get Sources (/v2/sources)
Returns a list of the top 100 supported news websites. Overall, we support over 60,000 websites; with this method you can find the top 100 for a specific combination of language, country, and topic.

```
sources = newscatcherapi.get_sources(topic='business',
                                     lang='en',
                                     countries='US')
 ```

### Every endpoint supports the _proxies_ parameter
If you want to use proxies, you can pass this parameter to any of the endpoint methods.
Here is an example of a valid `proxies` value and of using it with one of the endpoints.

```
proxies = {
   'http': 'http://proxy.example.com:8080',
   'https': 'http://secureproxy.example.com:8090',
}

all_articles = newscatcherapi.get_search(q='Elon Musk',
                                         lang='en',
                                         countries='CA',
                                         page_size=100,
                                         proxies=proxies)
```
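
A malformed proxy URL is easy to overlook, so it may be worth checking the mapping before passing it on. The sketch below uses only the standard library; `check_proxies` is a hypothetical helper and not part of the SDK.

```python
from urllib.parse import urlsplit


def check_proxies(proxies):
    """Raise ValueError if a proxy entry is not a full URL with scheme and host."""
    for name, url in proxies.items():
        parts = urlsplit(url)
        if not parts.scheme or not parts.hostname:
            raise ValueError(f"proxy for {name!r} is not a full URL: {url!r}")
    return proxies


proxies = check_proxies({
    'http': 'http://proxy.example.com:8080',
    'https': 'http://secureproxy.example.com:8090',
})
```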


### Use *from_* and *to_* instead of the API's *from* and *to*
In Python, *from* is a reserved keyword, so it cannot be used as an argument name. If you try to use it, you will get a syntax error:

```SyntaxError: invalid syntax``` 

The SDK therefore names the time parameters *from_* and *to_*. Here is an example of how to use them in the *get_search* method.

```
all_articles = newscatcherapi.get_search(q='Elon Musk',
                                         lang='en',
                                         countries='CA,US',
                                         from_='2021/08/20',
                                         to_='2021/08/31')
```
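
The standard-library `keyword` module can confirm which names Python reserves. The sketch below also shows one way to convert API-style parameter names into the SDK's names; `to_sdk_kwargs` is a hypothetical helper, not something the SDK provides.

```python
import keyword

print(keyword.iskeyword("from"))  # True
print(keyword.iskeyword("to"))    # False


def to_sdk_kwargs(api_params):
    """Rename API-style 'from'/'to' keys to the SDK's 'from_'/'to_'."""
    renames = {"from": "from_", "to": "to_"}
    return {renames.get(name, name): value for name, value in api_params.items()}


params = {"q": "Elon Musk", "from": "2021/08/20", "to": "2021/08/31"}
print(to_sdk_kwargs(params))
```

The result can then be unpacked into a call, for example `newscatcherapi.get_search(**to_sdk_kwargs(params))`.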

## Feedback

Feel free to contact us at maksym`[at]`newscatcherapi.com if you spot a bug or have a suggestion.

            
