haggle

Name	haggle JSON
Version	0.1.13 JSON
	download
home_page	https://github.com/thorwhalen/haggle
Summary	A facade to Kaggle data
upload_time	2024-01-10 10:08:26
maintainer
docs_url	None
author
requires_python
license	mit
keywords	kaggle data
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            - [Haggle](#haggle)
- [Simple example](#simple-example)
  * [By the way...](#by-the-way)
- [Search results and dataset metadata](#search-results-and-dataset-metadata)
  * [.meta](#meta)
  * [Cached search info](#cached-search-info)
- [The boring stuff](#the-boring-stuff)
  * [Install](#install)
  * [API credentials](#api-credentials)
- [F.A.Q.](#faq)
  * [What if I don't want a zip file anymore?](#what-if-i-don-t-want-a-zip-file-anymore-)
  * [Do you have any jupyter notebooks demoing this.](#do-you-have-any-jupyter-notebooks-demoing-this)
  * [A little snippet of code to test?](#a-little-snippet-of-code-to-test-)

<small><i><a href='http://ecotrust-canada.github.io/markdown-toc/'>Table of contents generated with markdown-toc</a></i></small>

# Haggle

A simple facade to [Kaggle](https://www.kaggle.com/) data.

Essentially, instantiate a `KaggleDatasets` object, and from it search datasets, see their metadata, download the data 
(automatically caching it in well organized folders), and all from an interface that looks like 
a humble dict with `owner/dataset` keys, and that's the coolest bit.

**Haggle:** /ˈhaɡəl/
- an instance of intense argument (as in bargaining) 
- wrangle (over a price, terms of an agreement, etc.) 
- rhymes with Kaggle and is not taken on pypi (well, now it is)

# Simple example


```python
from haggle import KaggleDatasets

rootdir = None  # define (or not) where you want the data to be cached/downloaded

s = KaggleDatasets(rootdir)  # make an instance

if 'rtatman/english-word-frequency' in s:
    del s['rtatman/english-word-frequency']  # just to prepare for the demo
```


```python
list(s)  # see what you have locally
```




    ['uciml/human-activity-recognition-with-smartphones',
     'sitsawek/phonetics-articles-on-plos']



Let's search something (you can also search on [kaggle](https://www.kaggle.com/), I was kidding about it being lame!)


```python
results = s.search('word frequency')
print(f"{len(results)=}")
list(results)[:10]
```

    len(results)=180




    ['rtatman/english-word-frequency',
     'yekenot/fasttext-crawl-300d-2m',
     'rtatman/japanese-lemma-frequency',
     'rtatman/glove-global-vectors-for-word-representation',
     'averkij/lingtrain-hungarian-word-frequency',
     'lukevanhaezebrouck/subtlex-word-frequency',
     'facebook/fatsttext-common-crawl',
     'facebook/fasttext-wikinews',
     'facebook/fasttext-english-word-vectors-including-subwords',
     'kushtej/kannada-word-frequency']



Chose what you want? Good, now do this:


```python
v = s['rtatman/english-word-frequency']
type(v)
```




    py2store.slib.s_zipfile.ZipReader



Okay, let's slow down a moment. What happened? What's this `ZipReader` thingy?

Well, what happened is that this downloaded the zip file of the data for you and saved it in `ROOTDIR/rtatman/english-word-frequency.zip`. Don't believe me? Go have a look. 

But then it also returns this object called `ZipReader` that points to it. 

If you don't like it, you don't have to use it. But I think you should like it.

Look at what it can do!

List the contents of file (that's in the zip... okay there's just one here, it's a bit boring)


```python
list(v)
```




    ['unigram_freq.csv']



Retrieve the data for any given file of the zip without ever having to unzip it!

Oh, and still pretending to be a dict. 


```python
b = v['unigram_freq.csv']
print(f"b is a {type(b)} and has {len(b)} bytes")
```

    b is a <class 'bytes'> and has 4956252 bytes


Now the data is given in bytes by default, since that's the basis of everything. 

From there you can go everywhere. Here for example, say we'd like to go to `pandas.DataFrame`...


```python
import pandas as pd
from io import BytesIO

df = pd.read_csv(BytesIO(b))
df.shape
```




    (333333, 2)




```python
print(df.head(7).to_string())
```

      word        count
    0  the  23135851162
    1   of  13151942776
    2  and  12997637966
    3   to  12136980858
    4    a   9081174698
    5   in   8469404971
    6  for   5933321709


And as mentioned, it caches the data to your local drive. You know, download, so that the next time you ask for `s['rtatman/english-word-frequency']`, it'll be faster to get those bytes.

See, let's list the contents of `s` again and see that we now have that `'rtatman/english-word-frequency'` key we didn't have before.


```python
list(s)
```




    ['uciml/human-activity-recognition-with-smartphones',
     'rtatman/english-word-frequency',
     'sitsawek/phonetics-articles-on-plos']




## By the way...

So a `KaggleDatasets` is a store with a dict-like interface. 

Listing happens locally. Remote listing is done through `.search(...)`.

Getting happens locally first, and if not, will get remotely (and cache locally).


Where are the zips stored? Ask `.zips_dir`:


```python
s.zips_dir
```




    '/D/Dropbox/_odata/kaggle/zips'



# Search results and dataset metadata

Let's have a closer look at those search results. All we did is a `len(results)` and a `list(results)`. What else can you do with that object?

Well, as is so happens, you can do whatever (read-only) operation you can do on a -- take a wild guess -- a dict. 

Namely, you can get a value for the keys we've listed


```python
from pprint import pprint
pprint(results['rtatman/english-word-frequency'])
```

    {'creatorName': 'Rachael Tatman',
     'creatorUrl': 'rtatman',
     'currentVersionNumber': 1,
     'description': None,
     'downloadCount': 3079,
     'files': [],
     'id': 2367,
     'isFeatured': False,
     'isPrivate': False,
     'isReviewed': True,
     'kernelCount': 12,
     'lastUpdated': '2017-09-06T18:21:27.18Z',
     'licenseName': 'Other (specified in description)',
     'ownerName': 'Rachael Tatman',
     'ownerRef': 'rtatman',
     'ref': 'rtatman/english-word-frequency',
     'subtitle': '⅓ Million Most Frequent English Words on the Web',
     'tags': [{'competitionCount': 3,
               'datasetCount': 231,
               'description': 'Language is a method of communication that consists '
                              'of using words arranged into meaningful patterns. '
                              'This is a good place to find natural language '
                              'processing datasets and kernels to study languages '
                              'and train your chat bots.',
               'fullPath': 'topic > culture and humanities > languages',
               'isAutomatic': False,
               'name': 'languages',
               'ref': 'languages',
               'scriptCount': 77,
               'totalCount': 311},
              {'competitionCount': 6,
               'datasetCount': 445,
               'description': 'The linguistics tag contains datasets and kernels '
                              'that you can use for text analytics, sentiment '
                              'analyses, and making clever jokes like this: Let me '
                              "tell you a little about myself. It's a reflexive "
                              'pronoun that means "me."',
               'fullPath': 'topic > people and society > social science > '
                           'linguistics',
               'isAutomatic': False,
               'name': 'linguistics',
               'ref': 'linguistics',
               'scriptCount': 122,
               'totalCount': 573},
              {'competitionCount': 18,
               'datasetCount': 4172,
               'description': 'An interconnected network of tubes that connects '
                              'the entire world together. This tag covers a broad '
                              'range of tags; anything from cryptocurrency to '
                              'website analytics.',
               'fullPath': 'topic > science and technology > internet',
               'isAutomatic': False,
               'name': 'internet',
               'ref': 'internet',
               'scriptCount': 198,
               'totalCount': 4388}],
     'title': 'English Word Frequency',
     'topicCount': 1,
     'totalBytes': 2236581,
     'url': 'https://www.kaggle.com/rtatman/english-word-frequency',
     'usabilityRating': 0.8235294,
     'versions': [],
     'viewCount': 21726,
     'voteCount': 105}


You get description, size, tags, download count... Useful stuff to make your choice. 

Personally, I like transform those results in a `DataFrame` that I can subsequently interrogate:


```python
import pandas as pd
df = pd.DataFrame(results.values())[['ref', 'title', 'subtitle', 'downloadCount', 'totalBytes']]
df = df.set_index('ref').sort_values('downloadCount', ascending=False)
# print(df.head(10).to_markdown())  # markdown for first 10 rows
```


| ref                                                  | title                                         | subtitle                                                                         |   downloadCount |       totalBytes |
|:-----------------------------------------------------|:----------------------------------------------|:---------------------------------------------------------------------------------|----------------:|-----------------:|
| jealousleopard/goodreadsbooks                        | Goodreads-books                               | comprehensive list of all books listed in goodreads                              |           23640 | 637338           |
| uciml/zoo-animal-classification                      | Zoo Animal Classification                     | Use Machine Learning Methods to Correctly Classify Animals Based Upon Attributes |           16597 |   1898           |
| yekenot/fasttext-crawl-300d-2m                       | FastText crawl 300d 2M                        | 2 million word vectors trained on Common Crawl (600B tokens)                     |            8275 |      1.54555e+09 |
| rtatman/sentiment-lexicons-for-81-languages          | Sentiment Lexicons for 81 Languages           | Sentiment Polarity Lexicons (Positive vs. Negative)                              |            7960 |      1.62176e+06 |
| rtatman/glove-global-vectors-for-word-representation | GloVe: Global Vectors for Word Representation | Pre-trained word vectors from Wikipedia 2014 + Gigaword 5                        |            7432 |      4.80173e+08 |
| mozillaorg/common-voice                              | Common Voice                                  | 500 hours of speech recordings, with speaker demographics                        |            6075 |      1.29315e+10 |
| arathee2/demonetization-in-india-twitter-data        | Demonetization in India Twitter Data          | Data extracted from Twitter regarding the recent currency demonetization         |            5761 | 919578           |
| eibriel/rdany-conversations                          | rDany Chat                                    | 157 chats & 6300+ messages with a (fake) virtual companion                       |            3983 | 916724           |
| mrisdal/2016-us-presidential-debates                 | 2016 US Presidential Debates                  | Full transcripts of the face-off between Clinton & Trump                         |            3920 | 123161           |
| nobelfoundation/nobel-laureates                      | Nobel Laureates, 1901-Present                 | Which country has won the most prizes in each category?                          |            3192 |  67763           |


## .meta

`.meta` is your access to metadata about datasets. 

It works the same way things work with the zips of datasets: It will:
- list: will list locally store dataset meta information (in location specified by `s.meta_dir`)
- get: when a value (metadata dict) is requested, (1) the key is searched locally first, and if not found, (2) will request it remotely (through the kaggle api), and (3) the value will be cached (stored) locally


## Cached search info

Wait, it's not all: `KaggleDatasets` will (by default) also cache these results locally in individual json files.

Where? Ask `meta_dir`:


```python
s.meta_dir
```




    '/D/Dropbox/_odata/kaggle/meta'



You can access these files with your favorite dict-like interface, through the `.meta` attribute


```python
len(s.meta)
```




    358




```python
list(s.meta)[:7]
```




    ['emmabel/word-occurrences-in-mr-robot',
     'bitsnpieces/covid19-country-data',
     'johnwdata/coronavirus-covid19-cases-by-us-state',
     'johnwdata/coronavirus-covid19-cases-by-us-county',
     'andradaolteanu/bing-nrc-afinn-lexicons',
     'rahulloha/covid19',
     'nltkdata/word2vec-sample']




```python
pprint(s.meta['emmabel/word-occurrences-in-mr-robot'])
```

    {'creatorName': 'Emma',
     'creatorUrl': 'emmabel',
     'currentVersionNumber': 1,
     'description': None,
     'downloadCount': 116,
     'files': [],
     'id': 4288,
     'isFeatured': False,
     'isPrivate': False,
     'isReviewed': False,
     'kernelCount': 1,
     'lastUpdated': '2017-11-09T18:30:15.733Z',
     'licenseName': 'CC0: Public Domain',
     'ownerName': 'Emma',
     'ownerRef': 'emmabel',
     'ref': 'emmabel/word-occurrences-in-mr-robot',
     'subtitle': "Find out F-Society's favorite lingo",
     'tags': [{'competitionCount': 0,
               'datasetCount': 7525,
               'description': 'Activities that holds the attention and interest of '
                              'an audience, or gives pleasure and delight. It can '
                              'be an idea or a task, but is more likely to be one '
                              'of the activities or events that have developed '
                              'over thousands of years specifically for the '
                              "purpose of keeping an audience's attention.",
               'fullPath': 'topic > arts and entertainment',
               'isAutomatic': False,
               'name': 'arts and entertainment',
               'ref': 'arts and entertainment',
               'scriptCount': 35,
               'totalCount': 7560},
              {'competitionCount': 0,
               'datasetCount': 1227,
               'description': 'One of the hallmarks of intelligence is the use of '
                              'games and toys to occupy free time and develop '
                              "intellectually. Often stored in Mom's basement.",
               'fullPath': 'topic > culture and humanities > games',
               'isAutomatic': False,
               'name': 'games',
               'ref': 'games',
               'scriptCount': 40,
               'totalCount': 1267}],
     'title': 'Word Occurrences in Mr. Robot',
     'topicCount': 0,
     'totalBytes': 119466,
     'url': 'https://www.kaggle.com/emmabel/word-occurrences-in-mr-robot',
     'usabilityRating': 0.7058824,
     'versions': [],
     'viewCount': 1028,
     'voteCount': 5}


So if you want to search locally for information (again, information about your searches, not your data zips!), you can get them in a `DataFrame` like so:


```python
>>> df = pd.DataFrame(s.meta.values())
>>> df.shape
(358, 36)
```

Markdown for the 10 first rows...
```python
t = df.head(10).dropna(axis=1)
del t['tags']
print(t.to_markdown())
```

|    |     id | ref                                                      | subtitle                                                                        | creatorName     | creatorUrl     |       totalBytes | url                                                                             | lastUpdated              |   downloadCount | isPrivate   | isReviewed   | isFeatured   | licenseName                                                | ownerName       | ownerRef       |   kernelCount | title                                              |   topicCount |   viewCount |   voteCount |   currentVersionNumber | files   | versions   |   usabilityRating |
|---:|-------:|:---------------------------------------------------------|:--------------------------------------------------------------------------------|:----------------|:---------------|-----------------:|:--------------------------------------------------------------------------------|:-------------------------|----------------:|:------------|:-------------|:-------------|:-----------------------------------------------------------|:----------------|:---------------|--------------:|:---------------------------------------------------|-------------:|------------:|------------:|-----------------------:|:--------|:-----------|------------------:|
|  0 |   4288 | emmabel/word-occurrences-in-mr-robot                     | Find out F-Society's favorite lingo                                             | Emma            | emmabel        | 119466           | https://www.kaggle.com/emmabel/word-occurrences-in-mr-robot                     | 2017-11-09T18:30:15.733Z |             116 | False       | False        | False        | CC0: Public Domain                                         | Emma            | emmabel        |             1 | Word Occurrences in Mr. Robot                      |            0 |        1028 |           5 |                      1 | []      | []         |          0.705882 |
|  1 | 576036 | bitsnpieces/covid19-country-data                         | Country level metadata that includes temperature, COVID-19 and H1N1 cases, etc. | Patrick         | bitsnpieces    | 190821           | https://www.kaggle.com/bitsnpieces/covid19-country-data                         | 2020-05-03T23:51:55.5Z   |             939 | False       | False        | False        | Database: Open Database, Contents: Database Contents       | Patrick         | bitsnpieces    |             3 | COVID-19 Country Data                              |            0 |        5419 |          26 |                     30 | []      | []         |          0.882353 |
|  2 | 575937 | johnwdata/coronavirus-covid19-cases-by-us-state          | NYTimes Coronavirus Dataset                                                     | John Wackerow   | johnwdata      |  82582           | https://www.kaggle.com/johnwdata/coronavirus-covid19-cases-by-us-state          | 2020-09-23T12:43:05.76Z  |              59 | False       | False        | False        | Other (specified in description)                           | John Wackerow   | johnwdata      |             2 | Coronavirus COVID-19 Cases By US State             |            0 |        1276 |           3 |                     83 | []      | []         |          1        |
|  3 | 575883 | johnwdata/coronavirus-covid19-cases-by-us-county         | NYTimes Coronavirus Dataset                                                     | John Wackerow   | johnwdata      |      3.50819e+06 | https://www.kaggle.com/johnwdata/coronavirus-covid19-cases-by-us-county         | 2020-07-23T18:47:16.543Z |              37 | False       | False        | False        | Unknown                                                    | John Wackerow   | johnwdata      |             2 | Coronavirus COVID-19 Cases By US County            |            0 |         610 |           3 |                      6 | []      | []         |          0.764706 |
|  4 | 507452 | andradaolteanu/bing-nrc-afinn-lexicons                   | the lexicons are in CSV format                                                  | Andrada Olteanu | andradaolteanu |  83965           | https://www.kaggle.com/andradaolteanu/bing-nrc-afinn-lexicons                   | 2020-02-09T18:39:13.343Z |             135 | False       | False        | False        | Unknown                                                    | Andrada Olteanu | andradaolteanu |            11 | Bing, NRC, Afinn Lexicons                          |            0 |         629 |          12 |                      1 | []      | []         |          0.882353 |
|  5 | 599303 | rahulloha/covid19                                        | Covid-19 dataset (updates daily at 10:00 pm pst)                                | Rahul Loha      | rahulloha      |      1.60678e+07 | https://www.kaggle.com/rahulloha/covid19                                        | 2020-04-12T22:50:26.857Z |              65 | False       | False        | False        | Unknown                                                    | Rahul Loha      | rahulloha      |             1 | Global Coronavirus (COVID-19) Data (Johns Hopkins) |            0 |         615 |           6 |                      1 | []      | []         |          0.647059 |
|  6 |   2045 | nltkdata/word2vec-sample                                 | Sample Word2Vec Model                                                           | Liling Tan      | alvations      |      1.01479e+08 | https://www.kaggle.com/nltkdata/word2vec-sample                                 | 2017-08-20T03:14:39.623Z |             772 | False       | True         | False        | Other (specified in description)                           | NLTK Data       | nltkdata       |             3 | Word2Vec Sample                                    |            0 |        8073 |          15 |                      1 | []      | []         |          0.75     |
|  7 | 585107 | nxpnsv/country-health-indicators                         | Health indicator relevant to covid19 death and infection risk                   | nxpnsv          | nxpnsv         |  40146           | https://www.kaggle.com/nxpnsv/country-health-indicators                         | 2020-04-07T11:12:41.91Z  |             185 | False       | False        | False        | CC BY-NC-SA 4.0                                            | nxpnsv          | nxpnsv         |             6 | country health indicators                          |            0 |        2126 |          11 |                      6 | []      | []         |          1        |
|  8 | 585591 | joelhanson/coronavirus-covid19-data-in-the-united-states | The New York Times data on coronavirus cases and deaths in the U.S              | Joel Hanson     | joelhanson     |      5.83676e+06 | https://www.kaggle.com/joelhanson/coronavirus-covid19-data-in-the-united-states | 2020-09-21T15:51:45.933Z |             164 | False       | False        | False        | Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) | Joel Hanson     | joelhanson     |             3 | Coronavirus (Covid-19) Data of United States (USA) |            0 |        1724 |           5 |                     29 | []      | []         |          0.882353 |
|  9 |   8352 | vsmolyakov/fasttext                                      | embeddings with sub-word information                                            | vsmolyakov      | vsmolyakov     |      1.1168e+08  | https://www.kaggle.com/vsmolyakov/fasttext                                      | 2017-12-31T17:32:30.94Z  |             947 | False       | False        | False        | CC0: Public Domain                                         | vsmolyakov      | vsmolyakov     |            14 | FastText                                           |            0 |        3809 |           5 |                      1 | []      | []         |          0.875    |


**Note: If you don't want all your search results to be cached you can just specify it.**

```python
s = KaggleDatasets(rootdir, cache_metas_on_search=False)  # make an instance
```

# The boring stuff

## Install

pip install haggle

**You'll need a kaggle api token to use this**

If you do, you probably can just start using. 

If you don't got get one! Go see [this](https://github.com/Kaggle/kaggle-api) for detailed instructions, it essentially says:



## API credentials

To use the Kaggle API, sign up for a Kaggle account at https://www.kaggle.com. 
Then go to the 'Account' tab of your user profile (`https://www.kaggle.com/<username>/account`) and select 'Create API Token'. 
This will trigger the download of `kaggle.json`, a file containing your API credentials. 
Place this file in the location `~/.kaggle/kaggle.json` (on Windows in the location `C:\Users\<Windows-username>\.kaggle\kaggle.json` - you can check the exact location, sans drive, with `echo %HOMEPATH%`). 
You can define a shell environment variable `KAGGLE_CONFIG_DIR` to change this location to `$KAGGLE_CONFIG_DIR/kaggle.json` (on Windows it will be `%KAGGLE_CONFIG_DIR%\kaggle.json`).

For your security, ensure that other users of your computer do not have read access to your credentials. On Unix-based systems you can do this with the following command: 

`chmod 600 ~/.kaggle/kaggle.json`

You can also choose to export your Kaggle username and token to the environment:

```bash
export KAGGLE_USERNAME=datadinosaur
export KAGGLE_KEY=xxxxxxxxxxxxxx
```
In addition, you can export any other configuration value that normally would be in
the `$HOME/.kaggle/kaggle.json` in the format 'KAGGLE_<VARIABLE>' (note uppercase).  
For example, if the file had the variable "proxy" you would export `KAGGLE_PROXY`
and it would be discovered by the client.


# F.A.Q.

## What if I don't want a zip file anymore?

Just delete it, like you do with any file you don't want anymore. You know the one.

Or... you can be cool and do `del s['owner/dataset']` for that key (note a key doesn't include the rootdir or the `.zip` extension), just like you would with a... `dict`, once again.

## Do you have any jupyter notebooks demoing this.

Sure, you can find some [here on github](https://github.com/otosense/haggle/tree/master/docs).

## A little snippet of code to test?

```python
from haggle import KaggleDatasets

s = KaggleDatasets()  # make an instance
# results = s.search('coronavirus')
# ref = next(iter(results))  # see that results iteration works
ref = 'sanchman/coronavirus-present-data'
if ref in s:  # delete if exists
    del s[ref]
assert ref not in s  # see, not there
v = s[ref]  # redownload
assert ref in s  # see, not there
assert 'PresentData.xlsx' in set(v)  # and it has stuff in v

import pandas as pd
import io
df = pd.read_excel(io.BytesIO(v['PresentData.xlsx']))
assert 'Total Confirmed' in df.columns
assert 'Country' in df.columns
assert 'France' in df['Country'].values
```

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/thorwhalen/haggle",
    "name": "haggle",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "kaggle,data",
    "author": "",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/bd/ca/ef4a1b5d8584bfcd5a135487d5e4281235bdb482cf1ae5793f1dad6a6d5f/haggle-0.1.13.tar.gz",
    "platform": "any",
    "description": "- [Haggle](#haggle)\n- [Simple example](#simple-example)\n  * [By the way...](#by-the-way)\n- [Search results and dataset metadata](#search-results-and-dataset-metadata)\n  * [.meta](#meta)\n  * [Cached search info](#cached-search-info)\n- [The boring stuff](#the-boring-stuff)\n  * [Install](#install)\n  * [API credentials](#api-credentials)\n- [F.A.Q.](#faq)\n  * [What if I don't want a zip file anymore?](#what-if-i-don-t-want-a-zip-file-anymore-)\n  * [Do you have any jupyter notebooks demoing this.](#do-you-have-any-jupyter-notebooks-demoing-this)\n  * [A little snippet of code to test?](#a-little-snippet-of-code-to-test-)\n\n<small><i><a href='http://ecotrust-canada.github.io/markdown-toc/'>Table of contents generated with markdown-toc</a></i></small>\n\n# Haggle\n\nA simple facade to [Kaggle](https://www.kaggle.com/) data.\n\nEssentially, instantiate a `KaggleDatasets` object, and from it search datasets, see their metadata, download the data \n(automatically caching it in well organized folders), and all from an interface that looks like \na humble dict with `owner/dataset` keys, and that's the coolest bit.\n\n**Haggle:** /\u02c8ha\u0261\u0259l/\n- an instance of intense argument (as in bargaining) \n- wrangle (over a price, terms of an agreement, etc.) \n- rhymes with Kaggle and is not taken on pypi (well, now it is)\n\n# Simple example\n\n\n```python\nfrom haggle import KaggleDatasets\n\nrootdir = None  # define (or not) where you want the data to be cached/downloaded\n\ns = KaggleDatasets(rootdir)  # make an instance\n\nif 'rtatman/english-word-frequency' in s:\n    del s['rtatman/english-word-frequency']  # just to prepare for the demo\n```\n\n\n```python\nlist(s)  # see what you have locally\n```\n\n\n\n\n    ['uciml/human-activity-recognition-with-smartphones',\n     'sitsawek/phonetics-articles-on-plos']\n\n\n\nLet's search something (you can also search on [kaggle](https://www.kaggle.com/), I was kidding about it being lame!)\n\n\n```python\nresults = s.search('word frequency')\nprint(f\"{len(results)=}\")\nlist(results)[:10]\n```\n\n    len(results)=180\n\n\n\n\n    ['rtatman/english-word-frequency',\n     'yekenot/fasttext-crawl-300d-2m',\n     'rtatman/japanese-lemma-frequency',\n     'rtatman/glove-global-vectors-for-word-representation',\n     'averkij/lingtrain-hungarian-word-frequency',\n     'lukevanhaezebrouck/subtlex-word-frequency',\n     'facebook/fatsttext-common-crawl',\n     'facebook/fasttext-wikinews',\n     'facebook/fasttext-english-word-vectors-including-subwords',\n     'kushtej/kannada-word-frequency']\n\n\n\nChose what you want? Good, now do this:\n\n\n```python\nv = s['rtatman/english-word-frequency']\ntype(v)\n```\n\n\n\n\n    py2store.slib.s_zipfile.ZipReader\n\n\n\nOkay, let's slow down a moment. What happened? What's this `ZipReader` thingy?\n\nWell, what happened is that this downloaded the zip file of the data for you and saved it in `ROOTDIR/rtatman/english-word-frequency.zip`. Don't believe me? Go have a look. \n\nBut then it also returns this object called `ZipReader` that points to it. \n\nIf you don't like it, you don't have to use it. But I think you should like it.\n\nLook at what it can do!\n\nList the contents of file (that's in the zip... okay there's just one here, it's a bit boring)\n\n\n```python\nlist(v)\n```\n\n\n\n\n    ['unigram_freq.csv']\n\n\n\nRetrieve the data for any given file of the zip without ever having to unzip it!\n\nOh, and still pretending to be a dict. \n\n\n```python\nb = v['unigram_freq.csv']\nprint(f\"b is a {type(b)} and has {len(b)} bytes\")\n```\n\n    b is a <class 'bytes'> and has 4956252 bytes\n\n\nNow the data is given in bytes by default, since that's the basis of everything. \n\nFrom there you can go everywhere. Here for example, say we'd like to go to `pandas.DataFrame`...\n\n\n```python\nimport pandas as pd\nfrom io import BytesIO\n\ndf = pd.read_csv(BytesIO(b))\ndf.shape\n```\n\n\n\n\n    (333333, 2)\n\n\n\n\n```python\nprint(df.head(7).to_string())\n```\n\n      word        count\n    0  the  23135851162\n    1   of  13151942776\n    2  and  12997637966\n    3   to  12136980858\n    4    a   9081174698\n    5   in   8469404971\n    6  for   5933321709\n\n\nAnd as mentioned, it caches the data to your local drive. You know, download, so that the next time you ask for `s['rtatman/english-word-frequency']`, it'll be faster to get those bytes.\n\nSee, let's list the contents of `s` again and see that we now have that `'rtatman/english-word-frequency'` key we didn't have before.\n\n\n```python\nlist(s)\n```\n\n\n\n\n    ['uciml/human-activity-recognition-with-smartphones',\n     'rtatman/english-word-frequency',\n     'sitsawek/phonetics-articles-on-plos']\n\n\n\n\n## By the way...\n\nSo a `KaggleDatasets` is a store with a dict-like interface. \n\nListing happens locally. Remote listing is done through `.search(...)`.\n\nGetting happens locally first, and if not, will get remotely (and cache locally).\n\n\nWhere are the zips stored? Ask `.zips_dir`:\n\n\n```python\ns.zips_dir\n```\n\n\n\n\n    '/D/Dropbox/_odata/kaggle/zips'\n\n\n\n# Search results and dataset metadata\n\nLet's have a closer look at those search results. All we did is a `len(results)` and a `list(results)`. What else can you do with that object?\n\nWell, as is so happens, you can do whatever (read-only) operation you can do on a -- take a wild guess -- a dict. \n\nNamely, you can get a value for the keys we've listed\n\n\n```python\nfrom pprint import pprint\npprint(results['rtatman/english-word-frequency'])\n```\n\n    {'creatorName': 'Rachael Tatman',\n     'creatorUrl': 'rtatman',\n     'currentVersionNumber': 1,\n     'description': None,\n     'downloadCount': 3079,\n     'files': [],\n     'id': 2367,\n     'isFeatured': False,\n     'isPrivate': False,\n     'isReviewed': True,\n     'kernelCount': 12,\n     'lastUpdated': '2017-09-06T18:21:27.18Z',\n     'licenseName': 'Other (specified in description)',\n     'ownerName': 'Rachael Tatman',\n     'ownerRef': 'rtatman',\n     'ref': 'rtatman/english-word-frequency',\n     'subtitle': '\u2153 Million Most Frequent English Words on the Web',\n     'tags': [{'competitionCount': 3,\n               'datasetCount': 231,\n               'description': 'Language is a method of communication that consists '\n                              'of using words arranged into meaningful patterns. '\n                              'This is a good place to find natural language '\n                              'processing datasets and kernels to study languages '\n                              'and train your chat bots.',\n               'fullPath': 'topic > culture and humanities > languages',\n               'isAutomatic': False,\n               'name': 'languages',\n               'ref': 'languages',\n               'scriptCount': 77,\n               'totalCount': 311},\n              {'competitionCount': 6,\n               'datasetCount': 445,\n               'description': 'The linguistics tag contains datasets and kernels '\n                              'that you can use for text analytics, sentiment '\n                              'analyses, and making clever jokes like this: Let me '\n                              \"tell you a little about myself. It's a reflexive \"\n                              'pronoun that means \"me.\"',\n               'fullPath': 'topic > people and society > social science > '\n                           'linguistics',\n               'isAutomatic': False,\n               'name': 'linguistics',\n               'ref': 'linguistics',\n               'scriptCount': 122,\n               'totalCount': 573},\n              {'competitionCount': 18,\n               'datasetCount': 4172,\n               'description': 'An interconnected network of tubes that connects '\n                              'the entire world together. This tag covers a broad '\n                              'range of tags; anything from cryptocurrency to '\n                              'website analytics.',\n               'fullPath': 'topic > science and technology > internet',\n               'isAutomatic': False,\n               'name': 'internet',\n               'ref': 'internet',\n               'scriptCount': 198,\n               'totalCount': 4388}],\n     'title': 'English Word Frequency',\n     'topicCount': 1,\n     'totalBytes': 2236581,\n     'url': 'https://www.kaggle.com/rtatman/english-word-frequency',\n     'usabilityRating': 0.8235294,\n     'versions': [],\n     'viewCount': 21726,\n     'voteCount': 105}\n\n\nYou get description, size, tags, download count... Useful stuff to make your choice. \n\nPersonally, I like transform those results in a `DataFrame` that I can subsequently interrogate:\n\n\n```python\nimport pandas as pd\ndf = pd.DataFrame(results.values())[['ref', 'title', 'subtitle', 'downloadCount', 'totalBytes']]\ndf = df.set_index('ref').sort_values('downloadCount', ascending=False)\n# print(df.head(10).to_markdown())  # markdown for first 10 rows\n```\n\n\n| ref                                                  | title                                         | subtitle                                                                         |   downloadCount |       totalBytes |\n|:-----------------------------------------------------|:----------------------------------------------|:---------------------------------------------------------------------------------|----------------:|-----------------:|\n| jealousleopard/goodreadsbooks                        | Goodreads-books                               | comprehensive list of all books listed in goodreads                              |           23640 | 637338           |\n| uciml/zoo-animal-classification                      | Zoo Animal Classification                     | Use Machine Learning Methods to Correctly Classify Animals Based Upon Attributes |           16597 |   1898           |\n| yekenot/fasttext-crawl-300d-2m                       | FastText crawl 300d 2M                        | 2 million word vectors trained on Common Crawl (600B tokens)                     |            8275 |      1.54555e+09 |\n| rtatman/sentiment-lexicons-for-81-languages          | Sentiment Lexicons for 81 Languages           | Sentiment Polarity Lexicons (Positive vs. Negative)                              |            7960 |      1.62176e+06 |\n| rtatman/glove-global-vectors-for-word-representation | GloVe: Global Vectors for Word Representation | Pre-trained word vectors from Wikipedia 2014 + Gigaword 5                        |            7432 |      4.80173e+08 |\n| mozillaorg/common-voice                              | Common Voice                                  | 500 hours of speech recordings, with speaker demographics                        |            6075 |      1.29315e+10 |\n| arathee2/demonetization-in-india-twitter-data        | Demonetization in India Twitter Data          | Data extracted from Twitter regarding the recent currency demonetization         |            5761 | 919578           |\n| eibriel/rdany-conversations                          | rDany Chat                                    | 157 chats & 6300+ messages with a (fake) virtual companion                       |            3983 | 916724           |\n| mrisdal/2016-us-presidential-debates                 | 2016 US Presidential Debates                  | Full transcripts of the face-off between Clinton & Trump                         |            3920 | 123161           |\n| nobelfoundation/nobel-laureates                      | Nobel Laureates, 1901-Present                 | Which country has won the most prizes in each category?                          |            3192 |  67763           |\n\n\n## .meta\n\n`.meta` is your access to metadata about datasets. \n\nIt works the same way things work with the zips of datasets: It will:\n- list: will list locally store dataset meta information (in location specified by `s.meta_dir`)\n- get: when a value (metadata dict) is requested, (1) the key is searched locally first, and if not found, (2) will request it remotely (through the kaggle api), and (3) the value will be cached (stored) locally\n\n\n## Cached search info\n\nWait, it's not all: `KaggleDatasets` will (by default) also cache these results locally in individual json files.\n\nWhere? Ask `meta_dir`:\n\n\n```python\ns.meta_dir\n```\n\n\n\n\n    '/D/Dropbox/_odata/kaggle/meta'\n\n\n\nYou can access these files with your favorite dict-like interface, through the `.meta` attribute\n\n\n```python\nlen(s.meta)\n```\n\n\n\n\n    358\n\n\n\n\n```python\nlist(s.meta)[:7]\n```\n\n\n\n\n    ['emmabel/word-occurrences-in-mr-robot',\n     'bitsnpieces/covid19-country-data',\n     'johnwdata/coronavirus-covid19-cases-by-us-state',\n     'johnwdata/coronavirus-covid19-cases-by-us-county',\n     'andradaolteanu/bing-nrc-afinn-lexicons',\n     'rahulloha/covid19',\n     'nltkdata/word2vec-sample']\n\n\n\n\n```python\npprint(s.meta['emmabel/word-occurrences-in-mr-robot'])\n```\n\n    {'creatorName': 'Emma',\n     'creatorUrl': 'emmabel',\n     'currentVersionNumber': 1,\n     'description': None,\n     'downloadCount': 116,\n     'files': [],\n     'id': 4288,\n     'isFeatured': False,\n     'isPrivate': False,\n     'isReviewed': False,\n     'kernelCount': 1,\n     'lastUpdated': '2017-11-09T18:30:15.733Z',\n     'licenseName': 'CC0: Public Domain',\n     'ownerName': 'Emma',\n     'ownerRef': 'emmabel',\n     'ref': 'emmabel/word-occurrences-in-mr-robot',\n     'subtitle': \"Find out F-Society's favorite lingo\",\n     'tags': [{'competitionCount': 0,\n               'datasetCount': 7525,\n               'description': 'Activities that holds the attention and interest of '\n                              'an audience, or gives pleasure and delight. It can '\n                              'be an idea or a task, but is more likely to be one '\n                              'of the activities or events that have developed '\n                              'over thousands of years specifically for the '\n                              \"purpose of keeping an audience's attention.\",\n               'fullPath': 'topic > arts and entertainment',\n               'isAutomatic': False,\n               'name': 'arts and entertainment',\n               'ref': 'arts and entertainment',\n               'scriptCount': 35,\n               'totalCount': 7560},\n              {'competitionCount': 0,\n               'datasetCount': 1227,\n               'description': 'One of the hallmarks of intelligence is the use of '\n                              'games and toys to occupy free time and develop '\n                              \"intellectually. Often stored in Mom's basement.\",\n               'fullPath': 'topic > culture and humanities > games',\n               'isAutomatic': False,\n               'name': 'games',\n               'ref': 'games',\n               'scriptCount': 40,\n               'totalCount': 1267}],\n     'title': 'Word Occurrences in Mr. Robot',\n     'topicCount': 0,\n     'totalBytes': 119466,\n     'url': 'https://www.kaggle.com/emmabel/word-occurrences-in-mr-robot',\n     'usabilityRating': 0.7058824,\n     'versions': [],\n     'viewCount': 1028,\n     'voteCount': 5}\n\n\nSo if you want to search locally for information (again, information about your searches, not your data zips!), you can get them in a `DataFrame` like so:\n\n\n```python\n>>> df = pd.DataFrame(s.meta.values())\n>>> df.shape\n(358, 36)\n```\n\nMarkdown for the 10 first rows...\n```python\nt = df.head(10).dropna(axis=1)\ndel t['tags']\nprint(t.to_markdown())\n```\n\n|    |     id | ref                                                      | subtitle                                                                        | creatorName     | creatorUrl     |       totalBytes | url                                                                             | lastUpdated              |   downloadCount | isPrivate   | isReviewed   | isFeatured   | licenseName                                                | ownerName       | ownerRef       |   kernelCount | title                                              |   topicCount |   viewCount |   voteCount |   currentVersionNumber | files   | versions   |   usabilityRating |\n|---:|-------:|:---------------------------------------------------------|:--------------------------------------------------------------------------------|:----------------|:---------------|-----------------:|:--------------------------------------------------------------------------------|:-------------------------|----------------:|:------------|:-------------|:-------------|:-----------------------------------------------------------|:----------------|:---------------|--------------:|:---------------------------------------------------|-------------:|------------:|------------:|-----------------------:|:--------|:-----------|------------------:|\n|  0 |   4288 | emmabel/word-occurrences-in-mr-robot                     | Find out F-Society's favorite lingo                                             | Emma            | emmabel        | 119466           | https://www.kaggle.com/emmabel/word-occurrences-in-mr-robot                     | 2017-11-09T18:30:15.733Z |             116 | False       | False        | False        | CC0: Public Domain                                         | Emma            | emmabel        |             1 | Word Occurrences in Mr. Robot                      |            0 |        1028 |           5 |                      1 | []      | []         |          0.705882 |\n|  1 | 576036 | bitsnpieces/covid19-country-data                         | Country level metadata that includes temperature, COVID-19 and H1N1 cases, etc. | Patrick         | bitsnpieces    | 190821           | https://www.kaggle.com/bitsnpieces/covid19-country-data                         | 2020-05-03T23:51:55.5Z   |             939 | False       | False        | False        | Database: Open Database, Contents: Database Contents       | Patrick         | bitsnpieces    |             3 | COVID-19 Country Data                              |            0 |        5419 |          26 |                     30 | []      | []         |          0.882353 |\n|  2 | 575937 | johnwdata/coronavirus-covid19-cases-by-us-state          | NYTimes Coronavirus Dataset                                                     | John Wackerow   | johnwdata      |  82582           | https://www.kaggle.com/johnwdata/coronavirus-covid19-cases-by-us-state          | 2020-09-23T12:43:05.76Z  |              59 | False       | False        | False        | Other (specified in description)                           | John Wackerow   | johnwdata      |             2 | Coronavirus COVID-19 Cases By US State             |            0 |        1276 |           3 |                     83 | []      | []         |          1        |\n|  3 | 575883 | johnwdata/coronavirus-covid19-cases-by-us-county         | NYTimes Coronavirus Dataset                                                     | John Wackerow   | johnwdata      |      3.50819e+06 | https://www.kaggle.com/johnwdata/coronavirus-covid19-cases-by-us-county         | 2020-07-23T18:47:16.543Z |              37 | False       | False        | False        | Unknown                                                    | John Wackerow   | johnwdata      |             2 | Coronavirus COVID-19 Cases By US County            |            0 |         610 |           3 |                      6 | []      | []         |          0.764706 |\n|  4 | 507452 | andradaolteanu/bing-nrc-afinn-lexicons                   | the lexicons are in CSV format                                                  | Andrada Olteanu | andradaolteanu |  83965           | https://www.kaggle.com/andradaolteanu/bing-nrc-afinn-lexicons                   | 2020-02-09T18:39:13.343Z |             135 | False       | False        | False        | Unknown                                                    | Andrada Olteanu | andradaolteanu |            11 | Bing, NRC, Afinn Lexicons                          |            0 |         629 |          12 |                      1 | []      | []         |          0.882353 |\n|  5 | 599303 | rahulloha/covid19                                        | Covid-19 dataset (updates daily at 10:00 pm pst)                                | Rahul Loha      | rahulloha      |      1.60678e+07 | https://www.kaggle.com/rahulloha/covid19                                        | 2020-04-12T22:50:26.857Z |              65 | False       | False        | False        | Unknown                                                    | Rahul Loha      | rahulloha      |             1 | Global Coronavirus (COVID-19) Data (Johns Hopkins) |            0 |         615 |           6 |                      1 | []      | []         |          0.647059 |\n|  6 |   2045 | nltkdata/word2vec-sample                                 | Sample Word2Vec Model                                                           | Liling Tan      | alvations      |      1.01479e+08 | https://www.kaggle.com/nltkdata/word2vec-sample                                 | 2017-08-20T03:14:39.623Z |             772 | False       | True         | False        | Other (specified in description)                           | NLTK Data       | nltkdata       |             3 | Word2Vec Sample                                    |            0 |        8073 |          15 |                      1 | []      | []         |          0.75     |\n|  7 | 585107 | nxpnsv/country-health-indicators                         | Health indicator relevant to covid19 death and infection risk                   | nxpnsv          | nxpnsv         |  40146           | https://www.kaggle.com/nxpnsv/country-health-indicators                         | 2020-04-07T11:12:41.91Z  |             185 | False       | False        | False        | CC BY-NC-SA 4.0                                            | nxpnsv          | nxpnsv         |             6 | country health indicators                          |            0 |        2126 |          11 |                      6 | []      | []         |          1        |\n|  8 | 585591 | joelhanson/coronavirus-covid19-data-in-the-united-states | The New York Times data on coronavirus cases and deaths in the U.S              | Joel Hanson     | joelhanson     |      5.83676e+06 | https://www.kaggle.com/joelhanson/coronavirus-covid19-data-in-the-united-states | 2020-09-21T15:51:45.933Z |             164 | False       | False        | False        | Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) | Joel Hanson     | joelhanson     |             3 | Coronavirus (Covid-19) Data of United States (USA) |            0 |        1724 |           5 |                     29 | []      | []         |          0.882353 |\n|  9 |   8352 | vsmolyakov/fasttext                                      | embeddings with sub-word information                                            | vsmolyakov      | vsmolyakov     |      1.1168e+08  | https://www.kaggle.com/vsmolyakov/fasttext                                      | 2017-12-31T17:32:30.94Z  |             947 | False       | False        | False        | CC0: Public Domain                                         | vsmolyakov      | vsmolyakov     |            14 | FastText                                           |            0 |        3809 |           5 |                      1 | []      | []         |          0.875    |\n\n\n**Note: If you don't want all your search results to be cached you can just specify it.**\n\n```python\ns = KaggleDatasets(rootdir, cache_metas_on_search=False)  # make an instance\n```\n\n# The boring stuff\n\n## Install\n\npip install haggle\n\n**You'll need a kaggle api token to use this**\n\nIf you do, you probably can just start using. \n\nIf you don't got get one! Go see [this](https://github.com/Kaggle/kaggle-api) for detailed instructions, it essentially says:\n\n\n\n## API credentials\n\nTo use the Kaggle API, sign up for a Kaggle account at https://www.kaggle.com. \nThen go to the 'Account' tab of your user profile (`https://www.kaggle.com/<username>/account`) and select 'Create API Token'. \nThis will trigger the download of `kaggle.json`, a file containing your API credentials. \nPlace this file in the location `~/.kaggle/kaggle.json` (on Windows in the location `C:\\Users\\<Windows-username>\\.kaggle\\kaggle.json` - you can check the exact location, sans drive, with `echo %HOMEPATH%`). \nYou can define a shell environment variable `KAGGLE_CONFIG_DIR` to change this location to `$KAGGLE_CONFIG_DIR/kaggle.json` (on Windows it will be `%KAGGLE_CONFIG_DIR%\\kaggle.json`).\n\nFor your security, ensure that other users of your computer do not have read access to your credentials. On Unix-based systems you can do this with the following command: \n\n`chmod 600 ~/.kaggle/kaggle.json`\n\nYou can also choose to export your Kaggle username and token to the environment:\n\n```bash\nexport KAGGLE_USERNAME=datadinosaur\nexport KAGGLE_KEY=xxxxxxxxxxxxxx\n```\nIn addition, you can export any other configuration value that normally would be in\nthe `$HOME/.kaggle/kaggle.json` in the format 'KAGGLE_<VARIABLE>' (note uppercase).  \nFor example, if the file had the variable \"proxy\" you would export `KAGGLE_PROXY`\nand it would be discovered by the client.\n\n\n# F.A.Q.\n\n## What if I don't want a zip file anymore?\n\nJust delete it, like you do with any file you don't want anymore. You know the one.\n\nOr... you can be cool and do `del s['owner/dataset']` for that key (note a key doesn't include the rootdir or the `.zip` extension), just like you would with a... `dict`, once again.\n\n## Do you have any jupyter notebooks demoing this.\n\nSure, you can find some [here on github](https://github.com/otosense/haggle/tree/master/docs).\n\n## A little snippet of code to test?\n\n```python\nfrom haggle import KaggleDatasets\n\ns = KaggleDatasets()  # make an instance\n# results = s.search('coronavirus')\n# ref = next(iter(results))  # see that results iteration works\nref = 'sanchman/coronavirus-present-data'\nif ref in s:  # delete if exists\n    del s[ref]\nassert ref not in s  # see, not there\nv = s[ref]  # redownload\nassert ref in s  # see, not there\nassert 'PresentData.xlsx' in set(v)  # and it has stuff in v\n\nimport pandas as pd\nimport io\ndf = pd.read_excel(io.BytesIO(v['PresentData.xlsx']))\nassert 'Total Confirmed' in df.columns\nassert 'Country' in df.columns\nassert 'France' in df['Country'].values\n```\n",
    "bugtrack_url": null,
    "license": "mit",
    "summary": "A facade to Kaggle data",
    "version": "0.1.13",
    "project_urls": {
        "Homepage": "https://github.com/thorwhalen/haggle"
    },
    "split_keywords": [
        "kaggle",
        "data"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7bcf04d03cd051902b044addfb9e7c14366a0b4158c69b2971de751bda6dfa76",
                "md5": "1ff5462906a121e2e6e077bf8e9c122a",
                "sha256": "c19739fe438106c7746dffddeaf4c8be999836dfa4f6d02d1f63365603f9b36a"
            },
            "downloads": -1,
            "filename": "haggle-0.1.13-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "1ff5462906a121e2e6e077bf8e9c122a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 14837,
            "upload_time": "2024-01-10T10:08:25",
            "upload_time_iso_8601": "2024-01-10T10:08:25.198981Z",
            "url": "https://files.pythonhosted.org/packages/7b/cf/04d03cd051902b044addfb9e7c14366a0b4158c69b2971de751bda6dfa76/haggle-0.1.13-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bdcaef4a1b5d8584bfcd5a135487d5e4281235bdb482cf1ae5793f1dad6a6d5f",
                "md5": "98a7748d199a78ac8d1cd46a6b706d1d",
                "sha256": "2330c3b00d627502dc664bd64f5b9cdf66dc093687442c2cdbeb58a436deac7c"
            },
            "downloads": -1,
            "filename": "haggle-0.1.13.tar.gz",
            "has_sig": false,
            "md5_digest": "98a7748d199a78ac8d1cd46a6b706d1d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 21977,
            "upload_time": "2024-01-10T10:08:26",
            "upload_time_iso_8601": "2024-01-10T10:08:26.396271Z",
            "url": "https://files.pythonhosted.org/packages/bd/ca/ef4a1b5d8584bfcd5a135487d5e4281235bdb482cf1ae5793f1dad6a6d5f/haggle-0.1.13.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-10 10:08:26",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "thorwhalen",
    "github_project": "haggle",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "haggle"
}