# SQLRooms RAG
A Python package for preparing and querying vector embeddings stored in DuckDB for RAG (Retrieval Augmented Generation) applications.
## Overview
This tool follows the approach outlined in [Developing a RAG Knowledge Base with DuckDB](https://motherduck.com/blog/search-using-duckdb-part-2/) to:
1. Load markdown files from a specified directory
2. Split them into chunks (default 512 tokens)
3. Generate vector embeddings using HuggingFace models
4. Store the embeddings in a DuckDB database for efficient retrieval
## Installation
### From PyPI (when published)
```bash
pip install sqlrooms-rag
```
### From source with uv
This project uses [uv](https://github.com/astral-sh/uv) for development.
```bash
# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install from source
cd python/rag-embedding
uv sync
```
### Dependencies
The package includes:
- llama-index (core RAG framework)
- llama-index-embeddings-huggingface (HuggingFace embeddings)
- llama-index-vector-stores-duckdb (DuckDB vector store)
- sentence-transformers (embedding models)
- torch (ML framework)
- duckdb (database)
## Usage
### Basic Usage
Process markdown files from a directory and create a DuckDB knowledge base:
```bash
uv run prepare-embeddings /path/to/docs -o generated-embeddings/knowledge_base.duckdb
```
Or use the Python API:
```python
from sqlrooms_rag import prepare_embeddings
prepare_embeddings(
    input_dir="/path/to/docs",
    output_db="generated-embeddings/knowledge_base.duckdb",
    chunk_size=512,
    embed_model_name="BAAI/bge-small-en-v1.5",
    embed_dim=384,
)
```
### Examples
#### Process documentation files
```bash
# Process all .md files in the docs directory
uv run prepare-embeddings ../../docs -o generated-embeddings/sqlrooms_docs.duckdb
```
#### Use custom chunk size
```bash
# Use smaller chunks for more granular retrieval
uv run prepare-embeddings docs -o generated-embeddings/kb.duckdb --chunk-size 256
```
#### Use a different embedding model
```bash
# Use all-MiniLM-L6-v2 (dimension: 384)
uv run prepare-embeddings docs -o generated-embeddings/kb.duckdb \
  --model "sentence-transformers/all-MiniLM-L6-v2" \
  --embed-dim 384
```
### Command-Line Options
```
positional arguments:
  input_dir             Directory containing markdown (.md) files to process

options:
  -h, --help            Show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output DuckDB database file path (default: knowledge_base.duckdb)
  --chunk-size CHUNK_SIZE
                        Max token size for text chunks (default: 512)
  --model EMBED_MODEL_NAME
                        HuggingFace embedding model name (default: BAAI/bge-small-en-v1.5)
  --embed-dim EMBED_DIM
                        Embedding dimension size (default: 384 for bge-small-en-v1.5)
  --no-markdown-chunking
                        Disable markdown-aware chunking (use size-based instead)
  -q, --quiet           Suppress progress messages
```
## How It Works
1. **Document Loading**: The tool recursively scans the input directory for `.md` files
2. **Embedding Model**: Downloads and initializes the HuggingFace embedding model (cached locally after first run)
3. **Smart Chunking**: By default, splits documents by markdown headers (`##`, `###`) to preserve section context. Section titles are stored in metadata for better retrieval. Falls back to size-based chunking for large sections.
4. **Embedding Generation**: Generates vector embeddings for each chunk
5. **Storage**: Stores embeddings in DuckDB with metadata (including section titles) for efficient retrieval
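The flow above can be sketched in plain Python. Here `fake_embed` is an illustrative stand-in for the real HuggingFace model, and the splitting is deliberately naive (the actual package uses markdown-aware chunking, described below):

```python
from pathlib import Path

def fake_embed(text: str, dim: int = 384) -> list[float]:
    # Stand-in for the HuggingFace model; a real embedder returns a learned vector.
    return [float(ord(c) % 7) for c in text[:dim]] + [0.0] * max(0, dim - len(text))

def load_and_chunk(docs_dir: str, chunk_size: int = 512) -> list[dict]:
    chunks = []
    for md_file in Path(docs_dir).rglob("*.md"):          # 1. recursive scan for .md files
        text = md_file.read_text(encoding="utf-8")
        for section in text.split("\n## "):               # 3. naive header-based split
            for i in range(0, len(section), chunk_size):  #    size-based fallback
                chunks.append({"file": str(md_file), "text": section[i:i + chunk_size]})
    return chunks

def prepare(docs_dir: str) -> list[dict]:
    rows = load_and_chunk(docs_dir)
    for row in rows:
        row["embedding"] = fake_embed(row["text"])        # 4. embed each chunk
    return rows                                           # 5. rows would be inserted into DuckDB
```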
### Chunking Strategy
**Markdown-Aware Chunking** (default):
- ✅ Splits by markdown headers (`##`, `###`, etc.)
- ✅ Preserves section context and hierarchy
- ✅ Stores section titles in metadata (`Header_1`, `Header_2`, etc.)
- ✅ Produces semantically coherent chunks
**Size-Based Chunking** (with `--no-markdown-chunking`):
- Simple token-based splitting
- May break sections mid-content
- Use only if your docs lack clear structure
See [CHUNKING.md](./CHUNKING.md) for detailed comparison and best practices.
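A simplified sketch of the markdown-aware strategy (not the package's actual implementation): split on headers and carry the current section titles along as `Header_N` metadata.

```python
import re

def split_markdown(text: str) -> list[dict]:
    """Split on markdown headers, attaching section titles as metadata."""
    chunks, meta, buf = [], {}, []

    def flush():
        body = "\n".join(buf).strip()
        if body:
            chunks.append({"text": body, "metadata": dict(meta)})
        buf.clear()

    for line in text.splitlines():
        m = re.match(r"^(#{1,6})\s+(.*)", line)
        if m:
            flush()
            level = len(m.group(1))
            meta[f"Header_{level}"] = m.group(2)
            # A shallower header starts a new branch: drop deeper titles
            for k in [k for k in meta if int(k.split("_")[1]) > level]:
                del meta[k]
        else:
            buf.append(line)
    flush()
    return chunks
```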
## Output
The tool creates a DuckDB database file (`.duckdb`) that contains:
- Document chunks (text split by markdown sections)
- Vector embeddings (384-dimensional by default)
- Metadata including:
- File paths
- Section titles (`Header_1`, `Header_2`, etc.)
- Document structure information
This database can be used with llama-index's query engine or any RAG application that supports DuckDB vector stores.
## Using the Generated Database
### Python API
You can use the package programmatically:
```python
from sqlrooms_rag import prepare_embeddings
# Create embeddings
index = prepare_embeddings(
    input_dir="../../docs",
    output_db="generated-embeddings/my_docs.duckdb",
)
```
### Query Examples
See `examples/example_query.py` for complete working examples. Here's a quick snippet:
```python
from llama_index.core import VectorStoreIndex, StorageContext, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.duckdb import DuckDBVectorStore
# Load the embedding model
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.embed_model = embed_model
# Connect to the existing database
vector_store = DuckDBVectorStore(
    database_name="knowledge_base",
    persist_dir="./",
    embed_dim=384,
)
# Load the index
index = VectorStoreIndex.from_vector_store(vector_store)
# Create retriever and search
retriever = index.as_retriever(similarity_top_k=3)
results = retriever.retrieve("Your question here")
for result in results:
    print(f"Score: {result.score:.4f}")
    print(f"Text: {result.text[:200]}...")
```
### Running the Examples
**Prepare DuckDB documentation embeddings:**
```bash
# Using Python script (recommended)
uv run python examples/prepare_duckdb_docs.py
# Or using bash script
chmod +x examples/prepare_duckdb_docs.sh
./examples/prepare_duckdb_docs.sh
# Custom paths
uv run python examples/prepare_duckdb_docs.py \
  --docs-dir ./my-docs \
  --output ./embeddings/duckdb.duckdb
```
**Query embeddings using llama-index:**
```bash
uv run python examples/example_query.py
```
**Query embeddings using DuckDB directly:**
```bash
# Run predefined queries
uv run python examples/query_duckdb_direct.py
# Query with your own question
uv run python examples/query_duckdb_direct.py "Your question here"
```
See [QUERYING.md](./QUERYING.md) for detailed documentation on querying the database directly with SQL.
## Visualization
Generate 2D UMAP embeddings for visualization:
```bash
# Install visualization dependencies
uv pip install -e ".[viz]"
# Generate UMAP visualization
uv run generate-umap-embeddings generated-embeddings/duckdb_docs.duckdb
# Output: generated-embeddings/duckdb_docs_umap.parquet
```
The output includes two Parquet files:
**Main file (`*_umap.parquet`):**
- `node_id` - Unique node identifier (e.g., "node_0001")
- `title` - Document title (from markdown frontmatter)
- `fileName` - File name extracted from metadata (e.g., "window_functions")
- `file_path` - Full file path (e.g., "/path/to/docs/window_functions.md")
- `text` - Full document text
- `x`, `y` - UMAP coordinates for 2D plotting
- `topic` - Automatically detected topic/cluster name (e.g., "Window Functions / Aggregate / SQL")
- `outdegree` - Number of documents this document links TO
- `indegree` - Number of documents linking TO this document
**Links file (`*_umap_links.parquet`):**
- `source_id` - Source node ID
- `target_id` - Target node ID
**Features:**
- **Topic Detection:** Automatically clusters documents and assigns descriptive topic names using TF-IDF keyword extraction. Disable with `--no-topics`.
- **Link Extraction:** Parses markdown links to build a chunk-level graph. Source chunks keep individual outdegree values; target documents expand to all chunks. Disable with `--no-links`.
See [VISUALIZATION_GUIDE.md](./VISUALIZATION_GUIDE.md) for complete visualization examples and usage details.
See [CHUNKING.md](./CHUNKING.md) for information about markdown-aware chunking.
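The `outdegree`/`indegree` columns correspond to simple counts over the links file; with hypothetical `(source_id, target_id)` pairs standing in for `*_umap_links.parquet` rows:

```python
from collections import Counter

# Hypothetical link rows (source_id, target_id)
links = [
    ("node_0001", "node_0002"),
    ("node_0001", "node_0003"),
    ("node_0002", "node_0003"),
]

outdegree = Counter(src for src, _ in links)  # documents a node links TO
indegree = Counter(dst for _, dst in links)   # documents linking TO a node
```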
## Package Structure
```
sqlrooms-rag/
├── sqlrooms_rag/                  # Main package (installed)
│   ├── __init__.py                # Public API
│   ├── prepare.py                 # Core embedding preparation
│   └── cli.py                     # Command-line interface
├── examples/                      # Example scripts (not installed)
│   ├── prepare_duckdb_docs.py     # Download & prepare DuckDB docs
│   ├── prepare_duckdb_docs.sh     # Bash version of the above
│   ├── test_duckdb_docs_query.py  # Test DuckDB docs queries
│   ├── example_query.py           # Query using llama-index
│   └── query_duckdb_direct.py     # Direct DuckDB queries
├── scripts/                       # Documentation for utility scripts
├── generated-embeddings/          # Output directory
├── pyproject.toml                 # Package configuration
└── README.md
```
## Example: DuckDB Documentation
The package includes a ready-to-use script for preparing DuckDB documentation embeddings:
```bash
# Download DuckDB docs and create embeddings
cd python/rag
uv run python examples/prepare_duckdb_docs.py
```
This will:
1. Download the latest DuckDB documentation from GitHub (600+ markdown files)
2. Process all markdown files
3. Generate embeddings using BAAI/bge-small-en-v1.5
4. Create `generated-embeddings/duckdb_docs.duckdb`
Test the embeddings:
```bash
# Run interactive test queries
uv run python examples/test_duckdb_docs_query.py
# Or test a specific query
uv run python examples/test_duckdb_docs_query.py "What is a window function?"
```
Then use it in your SQLRooms app:
```typescript
import {createRagSlice} from '@sqlrooms/rag';
const store = createRoomStore({
  slices: [
    createDuckDbSlice(),
    createRagSlice({
      embeddingsDatabases: [
        {
          databaseFilePath: './embeddings/duckdb_docs.duckdb',
          databaseName: 'duckdb_docs',
        },
      ],
    }),
  ],
});
// Search DuckDB documentation
const results = await store.getState().rag.queryEmbeddings(embedding);
```
## Supported Models
The tool works with any HuggingFace sentence-transformer model. Popular choices:
| Model | Dimension | Max Tokens | Description |
| --------------------------------------- | --------- | ---------- | --------------------- |
| BAAI/bge-small-en-v1.5 | 384 | 512 | Default, good balance |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | 256 | Fast, lightweight |
| BAAI/bge-base-en-v1.5 | 768 | 512 | Better accuracy |
| sentence-transformers/all-mpnet-base-v2 | 768 | 384 | High quality |
## Notes
- The embedding model is downloaded and cached on first run (~100-500MB depending on model)
- Processing time depends on the number and size of documents
- The generated DuckDB file can be reused and updated with additional documents
- Ensure the `--embed-dim` matches your chosen model's output dimension
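A small guard against a dimension mismatch, using the values from the models table above (a convenience sketch, not part of the package):

```python
# Output dimensions for the models listed above
MODEL_DIMS = {
    "BAAI/bge-small-en-v1.5": 384,
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "BAAI/bge-base-en-v1.5": 768,
    "sentence-transformers/all-mpnet-base-v2": 768,
}

def check_embed_dim(model_name: str, embed_dim: int) -> None:
    """Raise if --embed-dim disagrees with a known model's output dimension."""
    expected = MODEL_DIMS.get(model_name)
    if expected is not None and expected != embed_dim:
        raise ValueError(
            f"{model_name} outputs {expected}-d vectors, got --embed-dim {embed_dim}"
        )
```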
## Requirements
- Python >=3.10
- 2-4GB RAM (depending on model and document size)
- ~500MB-2GB disk space for models and generated database
## Troubleshooting
### Out of Memory
If you run out of memory with large document sets, try:
- Using a smaller embedding model
- Processing documents in batches
- Reducing chunk size
### Slow Processing
- First run downloads the embedding model (one-time operation)
- Subsequent runs use the cached model
- Consider using a smaller/faster model for large document sets
## License
Part of the SQLRooms project.