slb-glossary


| Field | Value |
| --- | --- |
| Name | slb-glossary |
| Version | 0.0.1 |
| Summary | Search the Schlumberger Oilfield Glossary programmatically using Selenium. |
| Upload time | 2024-06-10 12:44:24 |
| Requires Python | >=3.7 |
| Keywords | schlumberger, oilfield, glossary, petroleum terms, petroleum |
| Requirements | selenium |

# Schlumberger Petroleum Glossary

Browse the Schlumberger Petroleum Glossary using Python (in English and Spanish).

**For optimum performance, use the Chrome browser and a fast, stable internet connection.**

> This package is intended for research or instructional use only.

## Installation

* Install using pip:

```bash
pip install slb-glossary
```

## Dependencies

* [selenium](https://pypi.org/project/selenium/)
* [openpyxl](https://pypi.org/project/openpyxl/) (for exporting search results to Excel)

## Quick Start

```python
import slb_glossary as slb

# Create a glossary object
with slb.Glossary(slb.Browser.CHROME, open_browser=True) as glossary:
    # Search for a term
    results = glossary.search("porosity")
    # Print the results
    for result in results:
        print(result.asdict())
```

## Usage

**Please note that this is just a brief overview of the module. The module is properly documented and you are encouraged to read the docstrings for more information on the various methods and classes.**

> In this documentation, "topics" refers to the subjects in the glossary.

### Instantiate a glossary object

Import the module:

```python
import slb_glossary as slb
```

To use the glossary, you need to create a `Glossary` object. The `Glossary` class takes a few arguments:

* `browser`: The browser to use. It can be any of the values in the `Browser` enum.
  **Ensure the selected browser is installed on your machine.**
* `open_browser`: A boolean indicating whether to open a browser window when searching the glossary.
  If this is `True`, a browser window opens when you search for a term. This can be useful for monitoring
  and debugging the search process. If you don't need to see the browser window, set this to `False`,
  which is analogous to running the browser in headless mode. The default value is `False`.
* `page_load_timeout`: The maximum time to wait for a page to load before raising an exception.
* `implicit_wait_time`: The maximum time to wait for an element to be found before raising an exception.
* `language`: The language to use when searching the glossary. This can be any of the values in the `Language` enum.
  Presently, only English and Spanish are supported. The default value is `Language.ENGLISH`.

```python
glossary = slb.Glossary(slb.Browser.CHROME, open_browser=True)
```

### Get all topics/subjects available in the glossary

When you initialize a glossary, the available topics are automatically fetched and stored in the `topics` attribute.

```python
topics = glossary.topics
print(topics)
```

This returns a mapping of each topic to the number of terms under that topic in the glossary:

```python
{
    "Drilling": 452,
    "Geology": 518,
    ...
}
```

Use `glossary.topics_list` if you only need a list of the topics in the glossary. `glossary.size` returns the total number of terms in the glossary.
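
As a rough illustration of how these views relate, the sketch below works on a plain dictionary shaped like `glossary.topics`, using only the two counts shown above. Treating the total as the sum of the per-topic counts is an assumption; it may differ from `glossary.size` if a term is listed under more than one topic.

```python
# Stand-in for `glossary.topics`, limited to the two entries shown above.
topics = {"Drilling": 452, "Geology": 518}

# Topic names only, comparable to what `glossary.topics_list` provides.
topics_list = list(topics)

# Sum of the per-topic counts (assumption: not necessarily `glossary.size`).
total_terms = sum(topics.values())

# Topics ordered from most to fewest terms.
largest_first = sorted(topics, key=topics.get, reverse=True)

print(topics_list, total_terms, largest_first)
```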

If you need to refetch all topics, call `glossary.get_topics()`. Read the method's docstring for more information on its use.

### Get a topic match

Do you have a topic in mind and are not sure if it is in the glossary? Use the `get_topic_match` method to get a topic match. It returns a single topic that best matches the input topic.

```python
topic = glossary.get_topic_match("drill")
print(topic)

# Output: Drilling
```
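
To give a feel for what "best match" can mean, here is a minimal sketch of fuzzy topic matching built on the standard library's `difflib`. This is not necessarily how `get_topic_match` works internally; the helper name `closest_topic` and the `cutoff` value are assumptions made for illustration.

```python
from difflib import get_close_matches
from typing import List, Optional


def closest_topic(query: str, topics: List[str]) -> Optional[str]:
    """Return the topic most similar to `query`, or None if nothing is close."""
    # Compare in lowercase so that "drill" can match "Drilling".
    by_lower = {topic.lower(): topic for topic in topics}
    matches = get_close_matches(query.lower(), list(by_lower), n=1, cutoff=0.5)
    return by_lower[matches[0]] if matches else None


print(closest_topic("drill", ["Drilling", "Geology", "Well workover"]))
```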

### Search for a term

Use the `search` method to search for a term in the glossary:

```python
results = glossary.search("porosity")
```

This returns a list of [`SearchResult`](#search-results)s for "porosity". You can also pass some optional arguments to the `search` method:

* `under_topic`: Restrict the search to a specific topic
* `start_letter`: Limit the search to terms starting with the given letter(s)
* `max_results`: Limit the number of results returned.
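
The optional arguments can be combined in one call. The helper below is a small sketch that takes an already-created `Glossary` object; the helper name is hypothetical, and the argument types (a topic string, a letter string, an integer limit) are assumptions based on the descriptions above, so check the `search` docstring before relying on them.

```python
def search_topic_terms(glossary, term: str, topic: str, limit: int = 5):
    """Search for `term` under one topic and cap the number of results."""
    return glossary.search(
        term,
        under_topic=topic,     # restrict the search to this topic
        start_letter=term[0],  # only terms starting with the same letter
        max_results=limit,     # stop after this many results
    )
```

With a glossary open, `search_topic_terms(glossary, "porosity", "Geology")` would forward the call as `glossary.search("porosity", under_topic="Geology", start_letter="p", max_results=5)`.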

### Search for terms under a specific topic/subject

```python
results = glossary.get_terms_on(topic="Well workover")
```

The `get_terms_on` method returns a list of `SearchResult`s for all terms under the specified topic.
The difference between `search` and `get_terms_on` is that `search` searches the entire glossary, while `get_terms_on` searches only under the specified topic. Hence, the results of `search` can contain terms from different topics.

The topic passed need not be an exact match to what is in the glossary. The glossary will choose the closest match to the provided topic that is available in the glossary.

> Interesting fact: If you want to base your search on multiple topics, just pass a string with the topics separated by a comma. For example, `"Drilling, Well workover, Shale gas"`.
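
If the topics of interest are held in a list, the comma-separated string can be built with `str.join`. This is only a sketch of preparing the argument; the final call is left as a comment because it needs a live `Glossary` object.

```python
# Topics of interest, taken from the example above.
wanted_topics = ["Drilling", "Well workover", "Shale gas"]

# Join them into the single comma-separated string the tip describes.
topic_query = ", ".join(wanted_topics)
print(topic_query)

# With an open glossary, the string would then be passed as the topic:
# results = glossary.get_terms_on(topic=topic_query)
```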

### Search results

Search results are returned as `SearchResult` objects. Each `SearchResult` object has the following attributes:

* `term`: The term being searched for
* `definition`: The definition of the term
* `grammatical_label`: The grammatical label of the term, that is, its part of speech
* `topic`: The topic under which the term is found
* `url`: The URL to the term in the glossary

To get the search results as a dictionary, use the `asdict` method.

```python
results = glossary.search("oblique fault")
for result in results:
    print(result.asdict())
```

You could also convert search results to tuples using the `astuple` method.

```python
results = glossary.search("oblique fault")
for result in results:
    print(result.astuple())
```
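
For readers who want to see the shape of these conversions without opening a browser, the sketch below uses a hypothetical stand-in class, `ResultSketch`, built with `dataclasses`. It is not the package's `SearchResult`; the field order, the optional fields, and the placeholder values are all assumptions.

```python
from dataclasses import asdict, astuple, dataclass
from typing import Optional


@dataclass
class ResultSketch:
    """Hypothetical stand-in mirroring the attributes listed above."""

    term: str
    definition: str
    grammatical_label: Optional[str] = None
    topic: Optional[str] = None
    url: Optional[str] = None


result = ResultSketch(
    term="example term",
    definition="placeholder definition",
    topic="Geology",
)

as_dict = asdict(result)    # keys are the attribute names
as_tuple = astuple(result)  # values in attribute order

print(as_dict)
print(as_tuple)
```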

### Other methods

Some other methods available in the `Glossary` class are:

* `get_search_url`: Returns the correct glossary URL for the given parameters.
* `get_terms_urls`: Returns the URLs of all terms found using the given parameters.
* `get_results_from_url`: Extracts search results from a given URL. Returns a list of `SearchResult`s.

### Closing the glossary

When you are done using the glossary, it is important that you close it to free up resources. This is done by calling the `close` method.

```python
glossary.close()
```

If you used the `Glossary` object as a context manager, you don't need to call the `close` method. The `Glossary` object will automatically close itself when the context manager exits. Also, on normal termination of the program, the `Glossary` object will close itself (if it is not already closed).
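
When a context manager is not convenient, a `try`/`finally` block gives a similar guarantee. The helper below is a minimal sketch with a hypothetical name; it assumes only the `search` and `close` methods described in this document.

```python
def search_and_close(glossary, term: str):
    """Run one search and always close the glossary afterwards."""
    try:
        return glossary.search(term)
    finally:
        # Runs whether the search succeeds or raises an exception.
        glossary.close()
```

The `finally` clause releases the browser even if the search fails partway through, which is the same guarantee the `with` statement provides.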

### Save/export search results to a file

A convenient way to save search results to a file is to use the `saver` attribute of the glossary object.

```python
results = glossary.search("gas lift")
glossary.saver.save(results, "./gas_lift.txt")
```

The `save` method takes a list of `SearchResult`s and the filename or file path to save the results to. The file format is determined by the file extension. By default, the supported formats are 'xlsx', 'txt', 'csv' and 'json'; check `glossary.saver.supported_file_types` for the current list.

### Customizing how results are saved

By default, the `Glossary` class uses a `Saver` class to save search results. This base `Saver` class supports only a few file formats, which should be sufficient for most uses. However, if you need to save in an unsupported format, you can subclass the `Saver` class:

```python
from typing import List
import slb_glossary as slb

class FooSaver(slb.Saver):
    @staticmethod
    def save_as_xyz(results: List[slb.SearchResult], filename: str):
        # Validate filename or path 
        # Your implementation goes here
        ...
```

Read the docstrings of the `Saver` class to get a good grasp of how to do this. Also, you may read the `slb_glossary.saver` module to get an idea of how you would implement your custom save method.
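
The `save_as_xyz` naming suggests that a save method is chosen from the file extension. The sketch below shows one plausible way such a convention can work, using a hypothetical stand-in base class (`SaverSketch`) and plain dictionaries in place of `SearchResult` objects. It is not the library's actual code; the `Saver` docstrings remain the authority on the real contract.

```python
import json
from pathlib import Path
from typing import Dict, List


class SaverSketch:
    """Hypothetical stand-in for a saver; not the library's `Saver`."""

    def save(self, results: List[Dict[str, str]], filename: str) -> None:
        extension = Path(filename).suffix.lstrip(".").lower()
        # Look up a method named after the extension, e.g. `save_as_json`.
        handler = getattr(self, f"save_as_{extension}", None)
        if handler is None:
            raise ValueError(f"Unsupported file type: {extension!r}")
        handler(results, filename)

    @staticmethod
    def save_as_json(results: List[Dict[str, str]], filename: str) -> None:
        Path(filename).write_text(json.dumps(results, indent=2), encoding="utf-8")


class MarkdownSaverSketch(SaverSketch):
    """Adds a new format simply by defining `save_as_md`."""

    @staticmethod
    def save_as_md(results: List[Dict[str, str]], filename: str) -> None:
        lines = [f"- **{r['term']}**: {r['definition']}" for r in results]
        Path(filename).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Under this convention a subclass never needs to override `save` itself; adding a method with the right name is enough for the new extension to be recognized.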

There are two ways you can use your custom saver class.

1. Create a `Glossary` subclass:

```python
import slb_glossary as slb

class FooGlossary(slb.Glossary):
    saver_class = FooSaver
    ...

glossary = FooGlossary(...)
glossary.saver.save(...)
```

2. Instantiate a saver directly:

```python
saver = FooSaver()
saver.save(...)
```

## Contributing

Contributions are welcome. Please fork the repository and submit a pull request.

## Credits

This project was inspired by the 2023/24/25 PetroBowl Team of the Federal University of Petroleum Resources, Effurun, Delta State, Nigeria. It aided the team's preparation for the PetroQuiz and PetroBowl competitions organized by the Society of Petroleum Engineers (SPE).

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "slb-glossary",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "\"Daniel T. Afolayan (ti-oluwa)\" <tioluwa.dev@gmail.com>",
    "keywords": "Schlumberger, Oilfield, Glossary, Petroleum terms, Petroleum",
    "author": null,
    "author_email": "\"Daniel T. Afolayan (ti-oluwa)\" <tioluwa.dev@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/bf/60/e8f5be0cbcc7c85d105e1a70bed7ccaa6b18cffdf727d3a82eb717055b73/slb_glossary-0.0.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Search the Schlumberger Oilfield Glossary programmatically using Selenium.",
    "version": "0.0.1",
    "project_urls": {
        "Bug Tracker": "https://github.com/ti-oluwa/slb_glossary/issues",
        "Homepage": "https://github.com/ti-oluwa/slb_glossary",
        "Repository": "https://github.com/ti-oluwa/slb_glossary"
    },
    "split_keywords": [
        "schlumberger",
        " oilfield",
        " glossary",
        " petroleum terms",
        " petroleum"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "99bbfe034013a786f64df7ab291e16287e20857d26395600895d24998baabcb7",
                "md5": "ecb9b84ff666111a7ced50403e1cb07c",
                "sha256": "38af0c7df78850680d920001c647fce3b6634897095129b58d6eaa553f533e83"
            },
            "downloads": -1,
            "filename": "slb_glossary-0.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ecb9b84ff666111a7ced50403e1cb07c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 16868,
            "upload_time": "2024-06-10T12:44:18",
            "upload_time_iso_8601": "2024-06-10T12:44:18.435164Z",
            "url": "https://files.pythonhosted.org/packages/99/bb/fe034013a786f64df7ab291e16287e20857d26395600895d24998baabcb7/slb_glossary-0.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bf60e8f5be0cbcc7c85d105e1a70bed7ccaa6b18cffdf727d3a82eb717055b73",
                "md5": "2ec408983b4fb7b72140766be236b412",
                "sha256": "614032c3a800ec458bb46d1c5644072f6e743e78bfd00564d55dbcbed188c2f7"
            },
            "downloads": -1,
            "filename": "slb_glossary-0.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "2ec408983b4fb7b72140766be236b412",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 19001,
            "upload_time": "2024-06-10T12:44:24",
            "upload_time_iso_8601": "2024-06-10T12:44:24.367041Z",
            "url": "https://files.pythonhosted.org/packages/bf/60/e8f5be0cbcc7c85d105e1a70bed7ccaa6b18cffdf727d3a82eb717055b73/slb_glossary-0.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-10 12:44:24",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ti-oluwa",
    "github_project": "slb_glossary",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "selenium",
            "specs": [
                [
                    "==",
                    "4.21.0"
                ]
            ]
        }
    ],
    "lcname": "slb-glossary"
}
        