idiom

Name	idiom
Version	0.1.6
home_page	https://github.com/thorwhalen/idiom
Summary	Access and operations with word2vec data
upload_time	2025-02-01 12:06:10
maintainer	None
docs_url	None
author	Thor Whalen
requires_python	None
license	mit
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# idiom

Access and operations with word2vec data

To install: `pip install idiom`

## Overview

The `idiom` package provides access to word vector data, along with functions to manipulate and analyze it. You can use it to find the words closest to a given word, look up word frequencies, and work with pre-trained word vector models.

## Features

- **Closest Words**: Find the closest words to a given word based on cosine similarity.
- **Word Frequencies**: Access and manipulate word frequency data.
- **Word Vector Models**: Work with pre-trained word vector models such as FastText.
- **IDF Calculations**: Compute different types of Inverse Document Frequency (IDF) values.

## Usage

### Finding Closest Words

You can find the closest words to a given word using the `closest_words` function:

```python
from idiom import closest_words

# Example: find the 10 words closest to 'mad' that start with 'l'
def starts_with_l(word):
    return word.startswith('l')

print(closest_words('mad', k=10, search_words=starts_with_l))
```
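To make the idea of "closest by cosine similarity" concrete, here is a minimal pure-Python sketch of a filtered nearest-word search. It is not `idiom`'s implementation: `toy_closest_words`, its `keep` parameter, and the two-dimensional `toy_vectors` are hypothetical names and data made up for illustration, and excluding the query word from its own results is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def toy_closest_words(word, vectors, k=10, keep=lambda w: True):
    """Return up to k words most similar to `word`, among words passing `keep`.

    Hypothetical helper: the query word itself is excluded (an assumption).
    """
    target = vectors[word]
    candidates = [w for w in vectors if w != word and keep(w)]
    # Highest cosine similarity first
    candidates.sort(
        key=lambda w: cosine_similarity(target, vectors[w]), reverse=True
    )
    return candidates[:k]


# Made-up 2-D vectors; real word vectors have hundreds of dimensions
toy_vectors = {
    'mad': [1.0, 0.0],
    'livid': [0.9, 0.1],
    'angry': [0.8, 0.3],
    'lazy': [0.1, 1.0],
}

result = toy_closest_words(
    'mad', toy_vectors, k=2, keep=lambda w: w.startswith('l')
)
print(result)
```

With these toy vectors, `'angry'` is filtered out by the predicate even though it is close to `'mad'`, which is the effect the `search_words` filter has in the example above.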

### Accessing Word Frequencies

You can access the most frequent words using the `most_frequent_words` function:

```python
from idiom import most_frequent_words

# Get the top 100,000 most frequent words
frequent_words = most_frequent_words(max_n_words=100000)
print(frequent_words)
```
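For intuition about what a "most frequent words" table contains, the sketch below builds one from raw text with the standard library. It is only an illustration: `toy_most_frequent_words` is a hypothetical function, and the return type of `idiom`'s `most_frequent_words` may differ from the plain dictionary used here.

```python
from collections import Counter


def toy_most_frequent_words(text, max_n_words=3):
    """Map the `max_n_words` most common words in `text` to their counts.

    Hypothetical helper; tokenization is a simple lowercase whitespace split.
    """
    counts = Counter(text.lower().split())
    return dict(counts.most_common(max_n_words))


toy_top = toy_most_frequent_words(
    'the cat and the dog and the bird', max_n_words=2
)
print(toy_top)
```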

### Working with Word Vectors

You can load and work with pre-trained word vectors using the `WordVec` class:

```python
from idiom import WordVec

# Initialize WordVec with default word vectors
word_vec = WordVec()

# Calculate the distance between two queries
distance = word_vec.dist('france capital', 'paris')
print(distance)
```
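The README does not say how `dist` turns a multi-word query such as `'france capital'` into a single vector. One common approach, assumed here purely for illustration, is to average the vectors of the query's words and then take the cosine distance. The names `mean_vector`, `cosine_distance`, and the two-dimensional `toy_vectors` are hypothetical and are not part of `idiom`.

```python
import math


def mean_vector(query, vectors):
    """Average the vectors of the whitespace-separated words in `query`."""
    words = query.split()
    dim = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]


def cosine_distance(u, v):
    """1 minus the cosine similarity of u and v (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Made-up 2-D vectors chosen so that 'france' + 'capital' points toward 'paris'
toy_vectors = {
    'france': [1.0, 0.0],
    'capital': [0.0, 1.0],
    'paris': [1.0, 1.0],
    'banana': [-1.0, 0.5],
}

query_vec = mean_vector('france capital', toy_vectors)
d_paris = cosine_distance(query_vec, toy_vectors['paris'])
d_banana = cosine_distance(query_vec, toy_vectors['banana'])
print(d_paris, d_banana)
```

In this toy setup the averaged query vector points in the same direction as `'paris'`, so its distance is essentially zero, while an unrelated word ends up farther away.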

### IDF Calculations

You can access different types of IDF values through the `idf` object, which the package associates with its `_IDF` class:

```python
from idiom import idf

# Access logarithmic IDF values
log_idf = idf.logarithmic
print(log_idf)
```
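As background, the textbook logarithmic IDF of a term is `log(N / df)`, where `N` is the number of documents and `df` is the number of documents containing the term. The sketch below computes it for a toy corpus. How `idiom` derives `idf.logarithmic` is not stated in this README, so treat `logarithmic_idf` and `toy_docs` as hypothetical illustrations rather than a description of the package.

```python
import math


def logarithmic_idf(documents):
    """Return {term: log(N / df)} for a list of whitespace-tokenized documents."""
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        # set() so that each term counts at most once per document
        for term in set(doc.split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


toy_docs = ['the cat sat', 'the dog ran', 'the cat ran']
toy_idf = logarithmic_idf(toy_docs)
print(toy_idf)
```

A term that appears in every document (`'the'` here) gets an IDF of zero, and rarer terms get larger values.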

## Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue on GitHub.

## License

This project is licensed under the MIT License.

Raw data

{
    "_id": null,
    "home_page": "https://github.com/thorwhalen/idiom",
    "name": "idiom",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": null,
    "author": "Thor Whalen",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/bd/31/44428393183593fb08c930066c4bd9a6d27d604f3c71d4ded8d29cf5730b/idiom-0.1.6.tar.gz",
    "platform": "any",
    "bugtrack_url": null,
    "license": "mit",
    "summary": "Access and operations with word2vec data",
    "version": "0.1.6",
    "project_urls": {
        "Homepage": "https://github.com/thorwhalen/idiom"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c5061673e34edc7a19c2a3cc3678ea48da75963d3b03375f860ab8fb8e2ed6c2",
                "md5": "daba44cf8bcab5d8382583ff9743a0d2",
                "sha256": "0563cba757fe52f2f7ee9478b4a0a85571b2a191caee84ac04a0dd46ccdb44fe"
            },
            "downloads": -1,
            "filename": "idiom-0.1.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "daba44cf8bcab5d8382583ff9743a0d2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 2248561,
            "upload_time": "2025-02-01T12:06:08",
            "upload_time_iso_8601": "2025-02-01T12:06:08.502015Z",
            "url": "https://files.pythonhosted.org/packages/c5/06/1673e34edc7a19c2a3cc3678ea48da75963d3b03375f860ab8fb8e2ed6c2/idiom-0.1.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "bd3144428393183593fb08c930066c4bd9a6d27d604f3c71d4ded8d29cf5730b",
                "md5": "16158c33c90b657dce7fd4924ac5308c",
                "sha256": "572c0dbb2a082f4957e2152a4598858e7b4ae8851c09049a80f77ef86564240f"
            },
            "downloads": -1,
            "filename": "idiom-0.1.6.tar.gz",
            "has_sig": false,
            "md5_digest": "16158c33c90b657dce7fd4924ac5308c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 2251216,
            "upload_time": "2025-02-01T12:06:10",
            "upload_time_iso_8601": "2025-02-01T12:06:10.746190Z",
            "url": "https://files.pythonhosted.org/packages/bd/31/44428393183593fb08c930066c4bd9a6d27d604f3c71d4ded8d29cf5730b/idiom-0.1.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-01 12:06:10",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "thorwhalen",
    "github_project": "idiom",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "idiom"
}
