knowt

Name	knowt
Version	0.1.5
home_page
Summary	Private, personalized searchable knowledge base, from your own notes.
upload_time	2024-03-17 20:21:45
maintainer
docs_url	None
author
requires_python	>=3.10
license	GPLv3+
keywords	nlp llm vector-search ann numpy search semantic search rag personal assistant command line
VCS
bugtrack_url
requirements	No requirements were recorded.

# Knowt
Knowt turns notes into knowledge.
You can search your notes for the name of that person you saw at the cafe last week or even have a conversation with your past self about anything at all.
It won't write your term paper for you, or draw you dreamy pictures, but it will help you remember the important things, the things that your favorite humans wrote down.

## Getting started
My favorite humans these days are in open source communities like Hacker Public Radio.
So `knowt` comes with the show notes from every one of the 4,000+ HPR episodes recorded in its 15+ years of continuous broadcasting.
What questions do you have for the hundreds of agalmic contributors to HPR?

```bash
$ pip install knowt
$ knowt what is Haycyon?
```

## Installation

#### Python virtual environment

To set up the project environment, follow these steps:

1. Clone the project repository or download the project files to your local machine.
2. Navigate to the project directory.
3. Create a Python virtual environment in the project directory:

```bash
pip install virtualenv
python -m virtualenv .venv
```

4. Activate the virtual environment (macOS/Linux):

```bash
source .venv/bin/activate
```

#### Install dependencies

Now that you have a virtual environment, you're ready to install some Python packages and download language models (spaCy and BERT).

1. Install `knowt` and its required packages in editable mode from the project directory:

```bash
pip install -e .
```

2. Download the small BERT embedding model (you can use whichever open source model you like):

```bash
python -c 'from sentence_transformers import SentenceTransformer; sbert = SentenceTransformer("paraphrase-MiniLM-L6-v2")'
```
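Once the model is downloaded, its `encode()` method maps each sentence to a fixed-length vector, and sentences with similar meanings end up with vectors that point in similar directions. Here is a minimal sketch of how such vectors could be ranked against a query. It uses tiny made-up 3-dimensional vectors in place of real embeddings, and the function names `cosine_similarity` and `rank_sentences` are illustrative assumptions, not part of `knowt`:

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_sentences(query_vec, sentence_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(sentence_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for embeddings; real ones would come from something like
# sbert.encode(["sentence one", "sentence two", ...]).
sentence_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vec = [0.9, 0.1, 0.0]

ranking = rank_sentences(query_vec, sentence_vecs)
print(ranking[0][0])  # index of the best-matching sentence
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for comparing sentence embeddings.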

#### Quick start

You can search an example corpus of nutrition and health documents by running the `search_engine.py` script.

#### Search your personal docs

1. Replace the text files in `data/corpus` with your own.
2. Start the command-line search engine with:

```bash
python search_engine.py --refresh
```

The `--refresh` flag ensures that a fresh index is created based on your documents.
Otherwise it may ignore the `data/corpus` directory and reuse an existing index and corpus in the `data/cache` directory.
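The reuse-or-rebuild behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the actual `search_engine.py` logic: the helper names `build_index` and `load_or_build_index`, the `index.json` cache file, and the word-count "index" are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def build_index(corpus_dir):
    """Hypothetical stand-in for indexing: map each file name to its word count."""
    return {p.name: len(p.read_text().split()) for p in sorted(Path(corpus_dir).glob("*.txt"))}


def load_or_build_index(corpus_dir, cache_dir, refresh=False):
    """Reuse the cached index unless `refresh` is set or no cache exists yet."""
    cache_file = Path(cache_dir) / "index.json"  # hypothetical cache file name
    if cache_file.exists() and not refresh:
        return json.loads(cache_file.read_text())
    index = build_index(corpus_dir)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(index))
    return index


# Demonstrate on a throwaway directory so nothing in a real project is touched.
with tempfile.TemporaryDirectory() as tmp:
    corpus = Path(tmp) / "corpus"
    cache = Path(tmp) / "cache"
    corpus.mkdir()
    (corpus / "a.txt").write_text("one two three")

    first = load_or_build_index(corpus, cache)  # no cache yet: builds and saves
    (corpus / "a.txt").write_text("one two three four")  # the notes change on disk
    stale = load_or_build_index(corpus, cache)  # cache hit: still sees 3 words
    fresh = load_or_build_index(corpus, cache, refresh=True)  # rebuilt: sees 4 words
    print(first, stale, fresh)
```

The `stale` result shows why a refresh matters: without it, edits to the corpus are invisible until the cache is rebuilt.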

The `search_engine.py` script will first segment the text files into sentences.
Then it will create a "reverse index" (often called an inverted index) by counting up words and character patterns in your documents.
It will also create semantic embeddings so you can ask questions about vague concepts without knowing any of the words you used in your documents.
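To make the first two steps concrete, here is a rough sketch of sentence segmentation followed by a reverse index built from word and character-pattern counts. It rests on several assumptions: the regular-expression splitter is a naive stand-in (the real script may rely on spaCy), character trigrams are one possible reading of "character patterns", and every function name is hypothetical.

```python
import re
from collections import Counter, defaultdict


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokens(sentence, n=3):
    """Yield lowercase words, plus character n-grams of words longer than n."""
    for word in re.findall(r"[a-z0-9]+", sentence.lower()):
        yield word
        if len(word) > n:
            for i in range(len(word) - n + 1):
                yield word[i:i + n]


def build_reverse_index(sentences):
    """Map each token to a Counter of {sentence_number: occurrences}."""
    index = defaultdict(Counter)
    for sent_id, sentence in enumerate(sentences):
        for tok in tokens(sentence):
            index[tok][sent_id] += 1
    return index


def search(index, query):
    """Rank sentences by summing the counts of every query token they contain."""
    scores = Counter()
    for tok in tokens(query):
        for sent_id, count in index.get(tok, {}).items():
            scores[sent_id] += count
    return [sent_id for sent_id, _ in scores.most_common()]


text = "I met Maria at the cafe. We talked about vector search! Python is fun."
sentences = split_sentences(text)
index = build_reverse_index(sentences)
print(sentences[search(index, "searching vectors")[0]])
```

Because the index also stores character trigrams, the query "searching vectors" still finds the sentence containing "vector search" even though neither query word appears in it exactly. Semantic embeddings go a step further by matching on meaning when no characters overlap at all.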

## Contributing

Submit an Issue (bug or feature suggestion) or a Merge Request and someone will respond within the week.

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "knowt",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "",
    "keywords": "NLP,LLM,vector-search,ANN,numpy,search,semantic search,RAG,personal assistant,command line",
    "author": "",
    "author_email": "Hobson Lane <git@totalgood.com>, Ethan Cavill <ethancavill@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/4b/bf/30017bfde3e640bd07346f92799c3aab5eccc45bd7ff5d727a313ff2f5d2/knowt-0.1.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPLv3+",
    "summary": "Private, personalized searchable knowledge base, from your own notes.",
    "version": "0.1.5",
    "project_urls": null,
    "split_keywords": [
        "nlp",
        "llm",
        "vector-search",
        "ann",
        "numpy",
        "search",
        "semantic search",
        "rag",
        "personal assistant",
        "command line"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4bbf30017bfde3e640bd07346f92799c3aab5eccc45bd7ff5d727a313ff2f5d2",
                "md5": "1e65273f670815df3a1e0feb0b5a64e9",
                "sha256": "f8d27f2d8863b7a0afd328548c5db1c7ac0381c07d26ae09ef99ae052695bde6"
            },
            "downloads": -1,
            "filename": "knowt-0.1.5.tar.gz",
            "has_sig": false,
            "md5_digest": "1e65273f670815df3a1e0feb0b5a64e9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 21050525,
            "upload_time": "2024-03-17T20:21:45",
            "upload_time_iso_8601": "2024-03-17T20:21:45.189557Z",
            "url": "https://files.pythonhosted.org/packages/4b/bf/30017bfde3e640bd07346f92799c3aab5eccc45bd7ff5d727a313ff2f5d2/knowt-0.1.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-17 20:21:45",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "knowt"
}