Dow Jones Factiva News Python Library
#####################################
.. image:: https://github.com/dowjones/factiva-news-python/actions/workflows/master_test_publish.yml/badge.svg
This library simplifies integration with Factiva's news-related API services.
The following services are currently implemented.
* **Snapshots**: Lets you run each step (snapshot creation, monitoring, download and local exploration) individually, or run the whole process with a single method.
* **Streams**: In addition to creating streams and getting their details, provides methods to easily implement a stream listener and push the content to other locations, which suits high-availability setups.
Both components rely on the API-Key authentication method, so an API key is a prerequisite for using either service.
Installation
============
To install this library, run the following command.

.. code-block::

    $ pip install --upgrade factiva-news

Using Library services
======================
Both services, Snapshots and Streams, are implemented in this library.
Environment variables
=====================
To use the Stream Listener options, set the following environment variables, depending on the listener tool you select.
To use the BigQuery Stream Listener:

.. code-block::

    $ export GOOGLE_APPLICATION_CREDENTIALS="/Users/Files/credentials.json"
    $ export STREAMLOG_BQ_TABLENAME=project.dataset.table

To use the MongoDB Stream Listener:

.. code-block::

    $ export MONGODB_CONNECTION_STRING=mongodb://localhost:27017
    $ export MONGODB_DATABASE_NAME=factiva-news
    $ export MONGODB_COLLECTION_NAME=stream-listener

To define custom directories, set the variables below. If they are not set, the project root path is used.

.. code-block::

    $ export DOWNLOAD_FILES_DIR=/users/downloads
    $ export STREAM_FILES_DIR=/users/listeners
    $ export LOG_FILES_DIR=/users/logs

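
The directory fallback described above can be sketched in a few lines of Python. This is only an illustration of the stated behaviour, not the library's own code: the helper name ``resolve_dir`` is hypothetical, and the current working directory is used as an assumed stand-in for the project root path.

.. code-block:: python

    import os

    def resolve_dir(var_name, environ=None):
        """Return the directory named by an environment variable.

        Hypothetical helper: falls back to the current working directory
        (an assumed stand-in for the project root) when the variable is
        unset or empty.
        """
        environ = os.environ if environ is None else environ
        return environ.get(var_name) or os.getcwd()

    # The three variable names come from the section above.
    download_dir = resolve_dir("DOWNLOAD_FILES_DIR")
    stream_dir = resolve_dir("STREAM_FILES_DIR")
    log_dir = resolve_dir("LOG_FILES_DIR")

Passing a plain dictionary as ``environ`` makes the helper easy to exercise without touching the real environment.
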
Snapshots
---------
Creating a new snapshot and downloading it to a local repository requires just a few lines of code.

.. code-block:: python

    from factiva.news.snapshot import Snapshot

    my_query = "publication_datetime >= '2020-01-01 00:00:00' AND LOWER(language_code) = 'en'"
    my_snapshot = Snapshot(
        user_key='abcd1234abcd1234abcd1234abcd1234',  # Can be omitted if it exists as an environment variable
        query=my_query)
    my_snapshot.process_extract()  # This operation can take several minutes to complete

After the process completes, the output files are stored in a subfolder named after the Extraction Job ID.
In the previous code, a new snapshot is created using ``my_query`` as the selection criteria and ``user_key`` for user authentication. After the job is validated internally, a Snapshot ID is obtained along with the list of files to download. Files are automatically downloaded to a folder named after the Snapshot ID, and the contents are loaded as a Pandas DataFrame into the variable ``news_articles``. This process may take several minutes, but it automates the extraction significantly.
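
Once the files are in place, a quick way to confirm the download is to list the contents of the folder named after the Snapshot ID. The sketch below uses only the standard library; the helper name ``list_snapshot_files``, the ``base_dir`` default and the example ID are assumptions for illustration, not part of the library.

.. code-block:: python

    import os

    def list_snapshot_files(snapshot_id, base_dir="."):
        """Return the sorted file names found in <base_dir>/<snapshot_id>.

        Hypothetical helper: returns an empty list when the folder does
        not exist yet (for example, while the job is still running).
        """
        folder = os.path.join(base_dir, snapshot_id)
        if not os.path.isdir(folder):
            return []
        return sorted(
            name for name in os.listdir(folder)
            if os.path.isfile(os.path.join(folder, name))
        )

    # "my-snapshot-id" is a placeholder, not a real Snapshot ID.
    for file_name in list_snapshot_files("my-snapshot-id"):
        print(file_name)
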
Streams
-------
Create a stream instance and get the details needed to configure the stream client and listen to the content as it is delivered.

.. code-block:: python

    from factiva.news.stream import Stream

    stream_query = Stream(
        user_key='abcd1234abcd1234abcd1234abcd1234',  # Can be omitted if it exists as an environment variable
        user_key_stats=True,
        query="publication_datetime >= '2021-04-01 00:00:00' AND LOWER(language_code)='en' AND UPPER(source_code) = 'DJDN'",
    )

    print(stream_query.create())

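
Both the Snapshot and Stream examples pass a hand-written query string. When the same filters are reused, it may be convenient to assemble that string from parameters. The following is a minimal sketch: ``build_query`` is a hypothetical helper, not part of the library, and it only covers the three fields used in the examples above. It performs no escaping, so it should only be fed trusted values.

.. code-block:: python

    def build_query(since, language_code=None, source_code=None):
        """Assemble a query string like the ones used in this document.

        Hypothetical helper: joins the optional filters with AND.
        """
        clauses = [f"publication_datetime >= '{since}'"]
        if language_code:
            clauses.append(f"LOWER(language_code) = '{language_code.lower()}'")
        if source_code:
            clauses.append(f"UPPER(source_code) = '{source_code.upper()}'")
        return " AND ".join(clauses)

    # Builds the same filters as the Stream example above.
    stream_filter = build_query("2021-04-01 00:00:00", "en", "DJDN")
    print(stream_filter)

The resulting string can then be passed as the ``query`` argument shown in the examples.
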
Raw data
{
"_id": null,
"home_page": "https://developer.dowjones.com/",
"name": "factiva-news",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "news,news aggregator,risk,compliance,nlp,alternative data,factiva,trading news,market movers",
"author": "Dow Jones Customer Engineers",
"author_email": "customer.solutions@dowjones.com",
"download_url": "https://files.pythonhosted.org/packages/5d/8e/5f79309303fd360c393ea694f462f7d8ce3592c650c248d9c8d412590d65/factiva-news-0.2.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Python package to interact with Factiva news-related APIs. Services are described in the Dow Jones Developer Platform.",
"version": "0.2.5",
"split_keywords": [
"news",
"news aggregator",
"risk",
"compliance",
"nlp",
"alternative data",
"factiva",
"trading news",
"market movers"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "fdaeafba0d2ed11e8e635310f1a9c1b5",
"sha256": "0b9fb75102afc5f8cb6474d31e380c1c23a1bc98a59d7064c971e1b5c2b06da6"
},
"downloads": -1,
"filename": "factiva_news-0.2.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "fdaeafba0d2ed11e8e635310f1a9c1b5",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 34343,
"upload_time": "2022-12-02T11:14:46",
"upload_time_iso_8601": "2022-12-02T11:14:46.753217Z",
"url": "https://files.pythonhosted.org/packages/10/00/f5b417d09ba6d28cbfaad1e33fa3e906bc1ed3f13e04fa844156982d3c5d/factiva_news-0.2.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "f374c713c9407e3f57472a5eb9fb84d0",
"sha256": "1a7967db1a79a8e68cf710ec36fe9936b57c22eeeaccde26813f9b222bea94ec"
},
"downloads": -1,
"filename": "factiva-news-0.2.5.tar.gz",
"has_sig": false,
"md5_digest": "f374c713c9407e3f57472a5eb9fb84d0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 29047,
"upload_time": "2022-12-02T11:14:48",
"upload_time_iso_8601": "2022-12-02T11:14:48.528351Z",
"url": "https://files.pythonhosted.org/packages/5d/8e/5f79309303fd360c393ea694f462f7d8ce3592c650c248d9c8d412590d65/factiva-news-0.2.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-12-02 11:14:48",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "factiva-news"
}