sharkbite


 * Name: sharkbite
 * Version: 1.0.4
 * Home page: https://docs.sharkbite.io/
 * Summary: Apache Accumulo and Apache HDFS Python Connector
 * Upload time: 2021-02-25 16:32:51
 * Author: Marc Parisi
 * Requires Python: >=3.6
 * Requirements: No requirements were recorded.

# ![logo](https://www.sharkbite.io/wp-content/uploads/2017/02/sharkbite.jpg) Sharkbite
[![Documentation Status](https://readthedocs.org/projects/sharkbite/badge/?version=latest)](https://docs.sharkbite.io/en/latest/?badge=latest)

**S**harkbite is an HDFS and native client for [Apache Accumulo][accumulo], with design liberties
that make it usable across other key/value stores.

As of version 1.0:

 * Works with Accumulo 1.6.x, 1.7.x, 1.8.x, 1.9.x and 2.x
 * The package import is now **sharkbite**, not **pysharkbite**
 * Support for torch IterableDatasets using batch scanners.
 * **Read/Write** : Reading and writing data to Accumulo is currently supported.
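
Because the import name changed from `pysharkbite` to `sharkbite`, code that must run against both older and newer installs can try the new name first. The following is a minimal sketch that assumes nothing beyond the two module names listed above; the helper name `load_sharkbite` is hypothetical and not part of the library.

```python
import importlib


def load_sharkbite():
    """Return the Sharkbite module under its new or old import name, or None."""
    for module_name in ("sharkbite", "pysharkbite"):
        try:
            return importlib.import_module(module_name)
        except ImportError:
            continue  # Not installed under this name; try the next one.
    return None


sharkbite = load_sharkbite()
if sharkbite is None:
    print("Sharkbite is not installed; run `pip install sharkbite`.")
```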

## About the name

**S**harkbite's name originated from its design as a connector that abstracts components whose interfaces
tightly couple to, and grip, the underlying datastore. Even with an abstraction layer for access and
cross-compatible objects, the underlying interfaces remain heavily coupled to each database. Sharkbite
became a fitting name because its interfaces exist to abstract the high coupling within implementations of
the API.

## Python Support
The Python client can be installed with `pip install sharkbite`.

[A Python example](https://github.com/phrocker/sharkbite/blob/master/examples/pythonexample.py) is included. It is the primary example of the Python-bound Sharkbite
library.

## Features


### Hedged Reads (BETA)

Sharkbite now supports hedged reads (executing scans against RFiles when they can be accessed) concurrently with
Accumulo RPC scans. The first executor to complete returns your results. This feature is in beta and is not recommended
for production environments.

Enable it with the following option:

```
import sharkbite

# user, zk, table, and auths are assumed to be defined already
# (connection setup is not shown here).
connector = sharkbite.AccumuloConnector(user, zk)
table_operations = connector.tableOps(table)

scanner = table_operations.createScanner(auths, 2)

# Named row_range to avoid shadowing the built-in range().
row_range = sharkbite.Range("myrow")
scanner.addRange(row_range)

# Enable the beta hedged reads option.
scanner.setOption(sharkbite.ScannerOptions.HedgedReads)

resultset = scanner.getResultSet()

for keyvalue in resultset:
    key = keyvalue.getKey()
    value = keyvalue.getValue()
```
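
The idea behind a hedged read can be illustrated without Accumulo at all: start the same query on two paths and take whichever answers first. The sketch below is only a conceptual illustration built on Python's standard library, not Sharkbite's implementation. `rfile_scan`, `rpc_scan`, and `hedged_read` are hypothetical stand-ins, and the sleeps merely simulate latency.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def rfile_scan(row):
    # Hypothetical stand-in for reading RFiles directly.
    time.sleep(0.05)
    return ("rfile", row)


def rpc_scan(row):
    # Hypothetical stand-in for a scan over Accumulo RPC.
    time.sleep(0.5)
    return ("rpc", row)


def hedged_read(row, executors=(rfile_scan, rpc_scan)):
    """Run every executor concurrently and return the first result to arrive."""
    pool = ThreadPoolExecutor(max_workers=len(executors))
    try:
        futures = [pool.submit(fn, row) for fn in executors]
        done, _pending = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        # Do not block on the slower executor; let it finish in the background.
        pool.shutdown(wait=False)


source, row = hedged_read("myrow")
print(f"{source} answered first for {row}")
```

The trade-off is extra work: the slower path still runs to completion, so hedging spends additional resources in exchange for lower latency.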

### Python Iterators  

We now support a beta version of Python iterators. Building with the cmake option PYTHON_ITERATOR_SUPPORT ( cmake -DPYTHON_ITERATOR_SUPPORT=ON ) builds the infrastructure needed to support Python iterators.

Iterators can be defined as single-function lambdas or by implementing the seek or onNext methods.

The first example implements the seek and onNext methods. seek is optional if you don't wish to adjust the range. Once keys are being iterated you may get the top key. You may call
iterator.next() afterward, or the infrastructure will do that for you.

```

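class myIterator:
  # seek is optional; define it only to adjust the range being scanned.
  def seek(iterator, soughtRange):
    new_range = Range("a")
    iterator.seek(new_range)

  # Returns the next KeyValue, or None when the iterator has no top entry.
  def onNext(iterator):
    if iterator.hasTop():
      key = iterator.getTopKey()
      cf = key.getColumnFamily()
      value = iterator.getTopValue()
      # Rewrite the column family before returning the entry.
      key.setColumnFamily("oh changed " + cf)
      iterator.next()
      return KeyValue(key, value)
    else:
      return None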

```
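
Because iterator code runs as part of the scan, it can help to exercise the `onNext` logic locally first. The sketch below does that with small stand-in classes. `FakeKey`, `FakeKeyValue`, and `FakeIterator` are hypothetical stubs that only mimic the method names used above (`hasTop`, `getTopKey`, `getTopValue`, `next`, `getColumnFamily`, `setColumnFamily`); they are not Sharkbite's real classes.

```python
class FakeKey:
    """Hypothetical stub with only the Key methods used by the example."""

    def __init__(self, row, cf):
        self.row, self.cf = row, cf

    def getColumnFamily(self):
        return self.cf

    def setColumnFamily(self, cf):
        self.cf = cf


class FakeKeyValue:
    """Hypothetical stub pairing a key with a value."""

    def __init__(self, key, value):
        self.key, self.value = key, value

    def getKey(self):
        return self.key

    def getValue(self):
        return self.value


class FakeIterator:
    """Hypothetical stub that walks an in-memory list of (key, value) pairs."""

    def __init__(self, pairs):
        self.pairs, self.pos = list(pairs), 0

    def hasTop(self):
        return self.pos < len(self.pairs)

    def getTopKey(self):
        return self.pairs[self.pos][0]

    def getTopValue(self):
        return self.pairs[self.pos][1]

    def next(self):
        self.pos += 1


def on_next(iterator):
    # Same logic as onNext above, written against the stubs.
    if not iterator.hasTop():
        return None
    key = iterator.getTopKey()
    value = iterator.getTopValue()
    key.setColumnFamily("oh changed " + key.getColumnFamily())
    iterator.next()
    return FakeKeyValue(key, value)


source = FakeIterator([(FakeKey("r1", "cf1"), "v1"), (FakeKey("r2", "cf2"), "v2")])
results = []
while True:
    kv = on_next(source)
    if kv is None:
        break
    results.append((kv.getKey().getColumnFamily(), kv.getValue()))
print(results)
```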

If this is defined in a separate file, you may use it with the following code snippet:

```
with open('test.iter', 'r') as file:
  iteratortext = file.read()
# Arguments: name, iterator text, priority
iterator = sharkbite.PythonIterator("PythonIterator", iteratortext, 100)
scanner.addIterator(iterator)
```

Alternatively, you may use lambdas. The lambda you provide will be passed the KeyValue ( getKey() and getValue() return the constituent parts). A partial code example of setting it up is below.
You may return a Key or KeyValue object. If you return the former, an empty value will be returned.

```
# Define only the name and priority.
iterator = sharkbite.PythonIterator("PythonIterator", 100)
# Define a lambda to adjust the column family.
iterator = iterator.onNext("lambda x : Key( x.getKey().getRow(), 'new cf', x.getKey().getColumnQualifier()) ")

scanner.addIterator(iterator)
```

You may define a Python iterator either as a text implementation or as a lambda. The two cannot be used simultaneously.
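
Since the lambda is passed to `onNext` as a string, a typo in it would only surface once the scan runs. One way to catch syntax errors earlier is to parse the string locally first. This is a sketch using Python's standard `ast` module; the helper `is_single_lambda` is hypothetical, and it checks syntax only, not whether names such as `Key` resolve inside the iterator.

```python
import ast


def is_single_lambda(source):
    """Return True if `source` parses as exactly one lambda expression."""
    try:
        tree = ast.parse(source.strip(), mode="eval")
    except SyntaxError:
        return False
    return isinstance(tree.body, ast.Lambda)


lambda_text = (
    "lambda x : Key( x.getKey().getRow(), 'new cf', "
    "x.getKey().getColumnQualifier()) "
)
# The well-formed lambda passes; one with an unclosed parenthesis does not.
print(is_single_lambda(lambda_text))
print(is_single_lambda("lambda x : Key(x.getKey().getRow(), 'new cf'"))
```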

[accumulo]: https://accumulo.apache.org




            

Raw data

            {
    "_id": null,
    "home_page": "https://docs.sharkbite.io/",
    "name": "sharkbite",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "",
    "author": "Marc Parisi",
    "author_email": "phrocker@apache.org",
    "download_url": "",
    "platform": "",
    "bugtrack_url": null,
    "license": "",
    "summary": "Apache Accumulo and Apache HDFS Python Connector",
    "version": "1.0.4",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "ea8f46c761396df6f9165b69c35b53fe",
                "sha256": "d47ed7ff5e160842f67ea83d54182323648b621b46df06f5544e9715d4b9ac9a"
            },
            "downloads": -1,
            "filename": "sharkbite-1.0.4-cp36-cp36m-manylinux1_x86_64.whl",
            "has_sig": false,
            "md5_digest": "ea8f46c761396df6f9165b69c35b53fe",
            "packagetype": "bdist_wheel",
            "python_version": "cp36",
            "requires_python": ">=3.6",
            "size": 15715159,
            "upload_time": "2021-02-25T16:32:51",
            "upload_time_iso_8601": "2021-02-25T16:32:51.472696Z",
            "url": "https://files.pythonhosted.org/packages/e4/9f/e077bdd4af08b1433bd7406ecd926ff4cedefbd39e83d4f11d825114e76b/sharkbite-1.0.4-cp36-cp36m-manylinux1_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "e4bbbfb2d7f0a03d667767d90084c044",
                "sha256": "70793120af594b28332e95b60d0e22a8e112a8516ac00842d61e6cb69006303b"
            },
            "downloads": -1,
            "filename": "sharkbite-1.0.4-cp38-cp38-manylinux1_x86_64.whl",
            "has_sig": false,
            "md5_digest": "e4bbbfb2d7f0a03d667767d90084c044",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.6",
            "size": 52939145,
            "upload_time": "2021-02-25T16:32:54",
            "upload_time_iso_8601": "2021-02-25T16:32:54.912372Z",
            "url": "https://files.pythonhosted.org/packages/b0/bc/b28b78246555266116fdc5c0e8d15136bc8fc824a376fa4b61167b74d7a3/sharkbite-1.0.4-cp38-cp38-manylinux1_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2021-02-25 16:32:51",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "sharkbite"
}
        