siminotes


- Name: siminotes
- Version: 0.1.1
- Summary: A CLI tool for discovering similar notes within your note collection.
- Upload time: 2024-01-29 14:18:17
- License: MIT License Copyright (c) 2024 Roopkumar Das Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Keywords: vector embeddings, similarity, cli, notes, finder, search, tool, text analysis

# SimiNotes - CLI Tool for Similar Note Retrieval

SimiNotes is a command-line interface (CLI) tool written in Python that finds similar notes within your notes
collection. It uses sentence embeddings from Sentence-BERT (sbert) to compare a given query against a corpus of your notes.

## Features

- **Embedding:** Uses sbert to generate vector embeddings for notes.
- **Similarity Search:** Finds notes similar to a given query based on embeddings.
- **Configurable:** Allows users to configure directories, file extensions, and exclusion criteria.

## Installation

Before using this CLI tool, ensure that you have Python and pip installed. Additionally, install the PyTorch library by
following the steps below:

1. Download and install Python and pip from the [official site](https://www.python.org/).
2. Install the PyTorch library separately based on your requirements. For the CPU version, use the following command:

    ```bash
    pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
    ```

Now, install the SimiNotes CLI tool:

```bash
pip install siminotes
```

## Configuration

Before using the CLI, configure some essential values, such as the notes directory and exclusions. SimiNotes uses a
configuration file (`config.txt`) to set preferences. Configure the tool by creating the file in the appropriate configuration
directory.

### Configuration Directory

- Linux:

```plaintext
~/.config/siminotesconfig/
```

- macOS:

```plaintext
~/.siminotesconfig/
```

- Windows:

For Windows, use AppData\Roaming for per-user configuration:

```plaintext
AppData\Roaming\Siminotes\
```

Alternatively, place `config.txt` in the following directory under your home directory:

```plaintext
~/.siminotesconfig/
```
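
The per-platform locations above can be summarised in code. The following is a minimal sketch, not SimiNotes' own implementation: the function name `config_dir`, its parameters, and the use of the `APPDATA` environment variable to locate `AppData\Roaming` are assumptions made for illustration.

```python
import os
import sys
from pathlib import Path
from typing import Optional


def config_dir(platform: str = sys.platform,
               home: Optional[Path] = None,
               appdata: Optional[str] = None) -> Path:
    """Return the directory expected to hold config.txt (hypothetical helper)."""
    home = home or Path.home()
    if platform.startswith("linux"):
        return home / ".config" / "siminotesconfig"
    if platform == "darwin":  # macOS
        return home / ".siminotesconfig"
    if platform.startswith("win"):
        # Assumption: AppData\Roaming is taken from the APPDATA variable.
        appdata = appdata or os.environ.get("APPDATA")
        if appdata:
            return Path(appdata) / "Siminotes"
    # Fallback: the home-directory location listed as the alternative.
    return home / ".siminotesconfig"


print(config_dir() / "config.txt")
```

Passing `platform` and `home` explicitly makes it easy to check where the file is expected on another system without running the code there.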

### Configuration File (`config.txt`)

Create a `config.txt` file in the configuration directory. Below is an example configuration:

```plaintext
notes_dir = /path/to/your/notes
exclude_dir = directory1,directory2
exclude_file = file1,file2
note_extension = .md
```
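
To make the file format concrete, here is a small sketch of how such `key = value` lines could be read. It assumes one setting per line and comma-separated lists, as in the example; `parse_config` is a hypothetical name and this is not necessarily how SimiNotes parses the file.

```python
from typing import Dict, List, Union

ConfigValue = Union[str, List[str]]


def parse_config(text: str) -> Dict[str, ConfigValue]:
    """Parse `key = value` lines into a dictionary (illustrative only)."""
    settings: Dict[str, ConfigValue] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    # Split the two list-valued settings on commas.
    for key in ("exclude_dir", "exclude_file"):
        items = str(settings.get(key, ""))
        settings[key] = [item.strip() for item in items.split(",") if item.strip()]
    return settings


SAMPLE = """\
notes_dir = /path/to/your/notes
exclude_dir = directory1,directory2
exclude_file = file1,file2
note_extension = .md
"""

config = parse_config(SAMPLE)
print(config)
```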

Configuration Parameters:

- notes_dir: Path to the directory containing your notes.

- exclude_dir: Comma-separated directories to exclude from the search. Paths should be relative to notes_dir.

- exclude_file: Comma-separated files to exclude from the search. Paths should be relative to notes_dir.

- note_extension: The extension of your note files (e.g., .md).
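
The effect of these parameters on which files are searched can be sketched as follows. This is an illustration under assumptions, not the tool's actual code: `collect_notes` is a hypothetical helper, and it assumes that excluding a directory also excludes everything beneath it.

```python
from pathlib import Path
from typing import Iterable, List


def collect_notes(notes_dir: Path,
                  extension: str,
                  exclude_dirs: Iterable[str] = (),
                  exclude_files: Iterable[str] = ()) -> List[Path]:
    """Return note paths relative to notes_dir, honouring the exclusions."""
    skip_dirs = [Path(d) for d in exclude_dirs]
    skip_files = {Path(f) for f in exclude_files}
    found: List[Path] = []
    for path in sorted(notes_dir.rglob("*" + extension)):
        if not path.is_file():
            continue
        rel = path.relative_to(notes_dir)
        if rel in skip_files:
            continue  # excluded file
        if any(d in rel.parents for d in skip_dirs):
            continue  # located inside an excluded directory
        found.append(rel)
    return found
```

For example, `collect_notes(Path("/path/to/your/notes"), ".md", ["directory1"], ["file1"])` would return every `.md` file except `file1` and anything under `directory1`.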

## Usage

With the configuration in place, run the CLI as follows.

### Command-Line Arguments

- Query via Text:

```bash
siminotes text "Your Query Text"
```

- Query via File:

```bash
siminotes file filename
```

Both commands print output similar to the following:

```
Top files which are similar to given query:
Value range from -1 to 1, where going toward 1 means note is close to query

/... with score 0.43386968970298767

/... with score 0.42138463258743286

...

```
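
A score range of -1 to 1 is what cosine similarity produces, so the ranking can be pictured with the sketch below. It assumes cosine similarity (equivalent to a dot product over normalised vectors) and uses tiny hand-written vectors in place of real sbert embeddings; `cosine_similarity` and `rank_notes` are hypothetical names, not part of SimiNotes.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated
    return dot / (norm_a * norm_b)


def rank_notes(query_vec: Sequence[float],
               note_vecs: Dict[str, Sequence[float]],
               top_k: int = 5) -> List[Tuple[str, float]]:
    """Score every note against the query and return the best matches first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in note_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 2-dimensional "embeddings"; real sbert vectors have hundreds of dimensions.
notes = {"a.md": [1.0, 0.0], "b.md": [0.0, 1.0], "c.md": [-1.0, 0.0]}
for name, score in rank_notes([1.0, 0.0], notes):
    print(f"{name} with score {score}")
```

A note pointing in the same direction as the query scores near 1, an unrelated note near 0, and an opposite one near -1.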

## Troubleshooting

If you encounter any errors or problems with this tool, please open an issue in the repository.

## License

This project is licensed under the MIT License.

## Contributing

Feel free to contribute to SimiNotes by creating issues or submitting pull requests.

## Acknowledgments

[Sentence-BERT](https://www.sbert.net/index.html) for sentence embeddings.

## Future

- Simple dot-product scores seem sufficient for finding similar notes. If results need to improve in the future,
consider the retrieve-and-rerank approach:
[Retrieve and Rerank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html),
[video tutorial](https://youtu.be/zMDBc_Q9Ark?feature=shared)

- If memory usage becomes a problem, the vectors can be quantised to int8:
[Quantisation Guide](https://www.sbert.net/examples/training/distillation/README.html#quantization),
[GitHub repo to check](https://github.com/davidberenstein1957/fast-sentence-transformers)
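
As a rough picture of what int8 quantisation of vectors means, the sketch below applies simple symmetric scalar quantisation in plain Python. This is one possible scheme chosen for illustration, not necessarily the method described in the linked guide, and `quantize_int8` and `dequantize` are hypothetical names.

```python
from typing import List, Sequence, Tuple


def quantize_int8(vector: Sequence[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] plus a scale to undo the mapping."""
    max_abs = max((abs(x) for x in vector), default=0.0)
    if max_abs == 0.0:
        return [0] * len(vector), 1.0
    scale = max_abs / 127.0
    quantized = [max(-127, min(127, round(x / scale))) for x in vector]
    return quantized, scale


def dequantize(quantized: Sequence[int], scale: float) -> List[float]:
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]


values, scale = quantize_int8([0.5, -1.0, 0.25])
print(values, dequantize(values, scale))
```

Each value then fits in one byte instead of the four used by a 32-bit float, at the cost of a rounding error of at most half the scale per component.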

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "siminotes",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "Vector embeddings,similarity,CLI,notes,finder,search,tool,text analysis",
    "author": "",
    "author_email": "Roopkumar Das <roopkumards@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/e0/db/f1e844c198a13a2d8e877e5efb33ea93bf76d8f8b74b70cb2989fe17e5fb/siminotes-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2024 Roopkumar Das  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "A CLI tool for discovering similar notes within your note collection.",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/RoopkumarD/similar_notes",
        "Issues": "https://github.com/RoopkumarD/similar_notes/issues",
        "Repository": "https://github.com/RoopkumarD/similar_notes"
    },
    "split_keywords": [
        "vector embeddings",
        "similarity",
        "cli",
        "notes",
        "finder",
        "search",
        "tool",
        "text analysis"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d3ddfb1c22598a1c95dbe51bd2472eca43e9dd4bd46ea4494b82a52fa4e45f65",
                "md5": "338f402fa47b7eef4b2495dd9604aec8",
                "sha256": "fb057a3c8e8ce2decf9638d00d8db9af2d2f31d501200516b05df2863d3fe6f9"
            },
            "downloads": -1,
            "filename": "siminotes-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "338f402fa47b7eef4b2495dd9604aec8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 8069,
            "upload_time": "2024-01-29T14:18:15",
            "upload_time_iso_8601": "2024-01-29T14:18:15.915868Z",
            "url": "https://files.pythonhosted.org/packages/d3/dd/fb1c22598a1c95dbe51bd2472eca43e9dd4bd46ea4494b82a52fa4e45f65/siminotes-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e0dbf1e844c198a13a2d8e877e5efb33ea93bf76d8f8b74b70cb2989fe17e5fb",
                "md5": "3279102dad27a0df08539a240447373a",
                "sha256": "3964b23bf4b3ddaf4f0c8335382434544bf75438a1072a74107098b8cfc3d961"
            },
            "downloads": -1,
            "filename": "siminotes-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "3279102dad27a0df08539a240447373a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 7048,
            "upload_time": "2024-01-29T14:18:17",
            "upload_time_iso_8601": "2024-01-29T14:18:17.907625Z",
            "url": "https://files.pythonhosted.org/packages/e0/db/f1e844c198a13a2d8e877e5efb33ea93bf76d8f8b74b70cb2989fe17e5fb/siminotes-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-29 14:18:17",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "RoopkumarD",
    "github_project": "similar_notes",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "siminotes"
}
        