# idiom

- **Version**: 0.1.5
- **Author**: Thor Whalen
- **License**: MIT
- **Home page**: https://github.com/thorwhalen/idiom
- **Uploaded**: 2024-12-12 09:35:08

Access and operations with word2vec data

To install: `pip install idiom`

## Overview

The `idiom` package provides access to word vector data, along with functions to manipulate and analyze it. These include finding the words closest to a given word, accessing word frequencies, and working with pre-trained word vector models.

## Features

- **Closest Words**: Find the closest words to a given word based on cosine similarity.
- **Word Frequencies**: Access and manipulate word frequency data.
- **Word Vector Models**: Work with pre-trained word vector models such as FastText.
- **IDF Calculations**: Compute different types of Inverse Document Frequency (IDF) values.

## Usage

### Finding Closest Words

You can find the closest words to a given word using the `closest_words` function:

```python
from idiom import closest_words

# Example: find the 10 words closest to 'mad' that start with 'l'
def starts_with_l(word):
    return word.startswith('l')

print(closest_words('mad', k=10, search_words=starts_with_l))
```
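
To make the idea concrete, here is a minimal, self-contained sketch of a cosine-similarity neighbor search with a word filter. It does not use `idiom` or its data: the toy vectors and the names `cosine_similarity` and `toy_closest_words` are illustrative assumptions, not the package's implementation.

```python
import math

# Toy 3-dimensional "word vectors" (made-up numbers, for illustration only)
TOY_VECTORS = {
    "mad": [0.9, 0.1, 0.0],
    "livid": [0.8, 0.2, 0.1],
    "angry": [0.85, 0.15, 0.05],
    "lazy": [0.1, 0.9, 0.2],
    "lamp": [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def toy_closest_words(word, k=10, word_filter=None, vectors=TOY_VECTORS):
    """Return up to k words most similar to `word`, optionally filtered."""
    target = vectors[word]
    # Keep every other word that passes the (optional) filter
    candidates = [
        w for w in vectors
        if w != word and (word_filter is None or word_filter(w))
    ]
    # Highest cosine similarity first
    candidates.sort(
        key=lambda w: cosine_similarity(target, vectors[w]), reverse=True
    )
    return candidates[:k]


print(toy_closest_words("mad", k=2, word_filter=lambda w: w.startswith("l")))
```

With real embeddings the same ranking is usually computed with vectorized operations or a nearest-neighbor index rather than a Python-level sort.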

### Accessing Word Frequencies

You can access the most frequent words using the `most_frequent_words` function:

```python
from idiom import most_frequent_words

# Get the top 100,000 most frequent words
frequent_words = most_frequent_words(max_n_words=100000)
print(frequent_words)
```
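
The README does not say how the frequency data is built or what type `most_frequent_words` returns. As a rough illustration of the concept only, the sketch below counts words in a tiny made-up corpus and keeps the top entries; `toy_most_frequent_words` and the corpus are hypothetical and unrelated to the package's data.

```python
from collections import Counter


def toy_most_frequent_words(text, max_n_words=3):
    """Map the most common words in `text` to their relative frequencies."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    # most_common returns (word, count) pairs, most frequent first
    return {
        word: count / total
        for word, count in counts.most_common(max_n_words)
    }


corpus = "the cat sat on the mat and the dog sat too"
print(toy_most_frequent_words(corpus, max_n_words=2))
```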

### Working with Word Vectors

You can load and work with pre-trained word vectors using the `WordVec` class:

```python
from idiom import WordVec

# Initialize WordVec with default word vectors
word_vec = WordVec()

# Calculate the distance between two queries
distance = word_vec.dist('france capital', 'paris')
print(distance)
```
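
How `WordVec` turns a multi-word query such as `'france capital'` into a single vector is not described here. One common approach, assumed purely for illustration, is to average the word vectors and compare the results with cosine distance. The toy vectors and the helpers `query_vector`, `cosine_distance`, and `toy_dist` below are hypothetical.

```python
import math

# Made-up vectors chosen so that "paris" lies between "france" and "capital"
TOY_VECTORS = {
    "france": [0.9, 0.1, 0.3],
    "capital": [0.2, 0.8, 0.3],
    "paris": [0.6, 0.5, 0.3],
    "banana": [0.0, 0.1, 0.9],
}


def query_vector(query, vectors=TOY_VECTORS):
    """Average the vectors of the whitespace-separated words in `query`."""
    word_vectors = [vectors[w] for w in query.lower().split()]
    n = len(word_vectors)
    return [sum(component) / n for component in zip(*word_vectors)]


def cosine_distance(u, v):
    """1 minus the cosine similarity of u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def toy_dist(q1, q2, vectors=TOY_VECTORS):
    """Cosine distance between the averaged vectors of two queries."""
    return cosine_distance(query_vector(q1, vectors), query_vector(q2, vectors))


print(toy_dist("france capital", "paris"))   # small: the queries are close
print(toy_dist("france capital", "banana"))  # larger: unrelated word
```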

### IDF Calculations

You can access different types of IDF values through the `idf` object (backed by the `_IDF` class):

```python
from idiom import idf

# Access logarithmic IDF values
log_idf = idf.logarithmic
print(log_idf)
```
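
For readers unfamiliar with IDF, the sketch below computes two textbook variants from a tiny tokenized corpus: the plain logarithmic form `log(N / df)` and a smoothed form `log(N / (1 + df)) + 1`. These are standard definitions, not necessarily the formulas `idiom` uses, and `idf_variants` is a hypothetical helper.

```python
import math


def idf_variants(documents):
    """Compute two textbook IDF variants for each term.

    `documents` is a list of token lists; N is the number of documents
    and df is the number of documents containing the term.
    """
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        for term in set(doc):  # count each term once per document
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {
        term: {
            "logarithmic": math.log(n_docs / df),
            "smooth": math.log(n_docs / (1 + df)) + 1,
        }
        for term, df in doc_freq.items()
    }


docs = [["the", "cat"], ["the", "dog"], ["the", "cat", "sat"]]
for term, values in sorted(idf_variants(docs).items()):
    print(term, values)
```

Terms that appear in every document (here `the`) get a logarithmic IDF of zero, while rarer terms score higher.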

## Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue on GitHub.

## License

This project is licensed under the MIT License.



            
