# langchain-yt-dlp
**`langchain-yt-dlp`** is a Python package that extends [LangChain](https://github.com/langchain-ai/langchain) by providing an improved YouTube integration using `yt-dlp`.
This package addresses a limitation in LangChain's existing `YoutubeLoader`. That implementation relies on `pytube`, which became unable to fetch YouTube metadata after changes in YouTube's structure. `langchain-yt-dlp` uses the `yt-dlp` library instead, providing a more reliable YouTube document loader.
---
## Key Features
- Retrieve metadata (e.g., title, description, author, view count, publish date) using the `yt-dlp` library.
- Maintain compatibility with LangChain's existing loader interface.
---
## Installation
To install the package, use:
```bash
pip install langchain-yt-dlp
```
The package depends on the following libraries:
- `langchain`
- `langchain-community`
- `yt-dlp`
If they are not already present in your environment, install them with:
```bash
pip install langchain langchain-community yt-dlp
```
---
## Usage
Here's how you can use the `YoutubeLoaderDL` class from `langchain-yt-dlp`:
### **Loading From a YouTube URL**
```python
from langchain_yt_dlp.youtube_loader import YoutubeLoaderDL

# Initialize using a YouTube URL
loader = YoutubeLoaderDL.from_youtube_url(
    youtube_url="https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    add_video_info=True,  # also fetch metadata such as title and author
)

# load() returns a list of LangChain Document objects
documents = loader.load()
print(documents)
```
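The constructor documented under Parameters takes a `video_id` rather than a URL, so `from_youtube_url` presumably derives the ID from the link. The sketch below shows one way that step could be done with the standard library alone. The helper name `extract_video_id` and the set of URL shapes it accepts are assumptions made for illustration; the package's own parsing may differ.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(youtube_url: str) -> Optional[str]:
    """Hypothetical helper: return the video ID from common YouTube URL shapes."""
    parsed = urlparse(youtube_url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        # Standard watch links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        # Embed and Shorts links: /embed/<id>, /shorts/<id>
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in {"embed", "shorts"}:
            return parts[1]

    # Anything else is treated as "not a recognizable video link"
    return None


print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))
```

Returning `None` for unrecognized links, instead of raising, leaves it to the caller to decide whether a bad URL is an error.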
---
## Parameters
### `YoutubeLoaderDL` Constructor
| Parameter        | Type   | Default | Description                                                                              |
|------------------|--------|---------|------------------------------------------------------------------------------------------|
| `video_id`       | `str`  | `None`  | The YouTube video ID to fetch data for.                                                  |
| `add_video_info` | `bool` | `False` | Whether to fetch video metadata (e.g., title, description, author, view count, publish date). |
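To give a rough idea of what `add_video_info=True` involves, the sketch below asks `yt-dlp` for a video's info dictionary without downloading media, then picks out the fields named under Key Features. It is an illustration under stated assumptions, not the package's actual code: the function names `fetch_info` and `to_metadata`, the output key names, and the fallback values are all hypothetical. The `yt-dlp` calls and info-dict keys (`title`, `description`, `uploader`, `view_count`, `upload_date`) are part of `yt-dlp` itself.

```python
from typing import Any, Dict


def fetch_info(video_id: str) -> Dict[str, Any]:
    """Ask yt-dlp for a video's info dict without downloading any media."""
    import yt_dlp  # imported lazily so the rest of the sketch runs without it

    url = f"https://www.youtube.com/watch?v={video_id}"
    with yt_dlp.YoutubeDL({"quiet": True}) as ydl:
        return ydl.extract_info(url, download=False)


def to_metadata(info: Dict[str, Any]) -> Dict[str, Any]:
    """Pick a few fields; the output key names are illustrative only."""
    return {
        "source": info.get("id"),
        "title": info.get("title", "Unknown"),
        "description": info.get("description", "Unknown"),
        "view_count": info.get("view_count", 0),
        "publish_date": info.get("upload_date", "Unknown"),  # YYYYMMDD string
        "author": info.get("uploader", "Unknown"),
    }


# Offline demonstration with a hand-written stand-in for a yt-dlp result
sample = {"id": "abc123", "title": "Demo", "uploader": "Someone", "view_count": 42}
print(to_metadata(sample))
```

With network access and `yt-dlp` installed, `to_metadata(fetch_info("dQw4w9WgXcQ"))` would apply the same mapping to a real video.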
---
## Testing
To run the tests:
1. Clone the repository:
   ```bash
   git clone https://github.com/aqib0770/langchain-yt-dlp
   cd langchain-yt-dlp
   ```
2. Install development dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Run the tests:
   ```bash
   pytest tests/test_youtube_loader.py
   ```
---
## Contributing
Contributions are welcome! If you have ideas for new features or spot a bug, feel free to:
- Open an issue on [GitHub](https://github.com/aqib0770/langchain-yt-dlp/issues).
- Submit a pull request.
---
## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/aqib0770/langchain-yt-dlp/blob/main/LICENSE) file for details.
---
## Acknowledgements
- [LangChain](https://github.com/langchain-ai/langchain) for providing the base integration framework.
- [yt-dlp](https://github.com/yt-dlp/yt-dlp) for enabling enhanced YouTube metadata extraction.
Raw data
{
"_id": null,
"home_page": "https://github.com/aqib0770/langchain-yt-dlp",
"name": "langchain-yt-dlp",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "langchain yt-dlp loader",
"author": "Aqib Ansari",
"author_email": "aqibansari72a@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/0c/90/f09dde067ea4c836a3f4af83307310cc2624e8690f5ce3d2f78e6d525d21/langchain_yt_dlp-0.0.8.tar.gz",
"platform": null,
"description": "# langchain-yt-dlp\n\n**`langchain-yt-dlp`** is a Python package that extends [LangChain](https://github.com/langchain-ai/langchain) by providing an improved YouTube integration using `yt-dlp`.\nThis package addresses a critical limitation in the existing LangChain YoutubeLoader. The original implementation, which relied on `pytube`, became unable to fetch YouTube metadata due to changes in YouTube's structure. `langchain-yt-dlp` resolves this by leveraging the robust `yt-dlp` library, providing a more reliable YouTube document loader.\n\n---\n\n## Key Features\n\n- Retrieve metadata (e.g., title, description, author, view count, publish date) using the `yt-dlp` library.\n- Maintain compatibility with LangChain's existing loader interface.\n\n---\n\n## Installation\n\nTo install the package, use:\n\n```bash\npip install langchain-yt-dlp\n```\n\nEnsure you have the following dependencies installed:\n- `langchain`\n- `yt-dlp`\n\nInstall them with:\n```bash\npip install langchain yt-dlp\n```\n\n---\n\n## Usage\n\nHere\u2019s how you can use the `YoutubeLoader` from `langchain-yt-dlp`:\n\n### **Basic Example**\n\n\n\n### **Loading From a YouTube URL**\n\n```python\nfrom langchain_yt_dlp.youtube_loader import YoutubeLoaderDL\n\n# Initialize using a YouTube URL\nloader = YoutubeLoaderDL.from_youtube_url(\n youtube_url=\"https://www.youtube.com/watch?v=dQw4w9WgXcQ\", \n add_video_info=True\n)\n\ndocuments = loader.load()\nprint(documents)\n```\n\n---\n\n## Parameters\n\n### `YoutubeLoaderDL` Constructor\n\n| Parameter | Type | Default | Description |\n|----------------|------|---------|-----------------------------------------|\n| `video_id` | `str` | None | The YouTube video ID to fetch data for. |\n| `add_video_info` | `bool` | `False` | Whether to fetch additional metadata. |\n\n---\n\n## Testing\n\nTo run the tests:\n\n1. Clone the repository:\n ```bash\n git clone https://github.com/aqib0770/langchain-yt-dlp\n cd langchain-yt-dlp\n ```\n\n2. 
Install development dependencies:\n ```bash\n pip install -r requirements.txt\n ```\n\n3. Run the tests:\n ```bash\n pytest tests/test_youtube_loader.py\n ```\n\n---\n\n## Contributing\n\nContributions are welcome! If you have ideas for new features or spot a bug, feel free to:\n- Open an issue on [GitHub](https://github.com/aqib0770/langchain-yt-dlp/issues).\n- Submit a pull request.\n\n\n---\n\n## License\n\nThis project is licensed under the MIT License. See the [LICENSE](https://github.com/aqib0770/langchain-yt-dlp/blob/main/LICENSE) file for details.\n\n---\n\n## Acknowledgements\n\n- [LangChain](https://github.com/langchain-ai/langchain) for providing the base integration framework.\n- [yt-dlp](https://github.com/yt-dlp/yt-dlp) for enabling enhanced YouTube metadata extraction.\n",
"bugtrack_url": null,
"license": null,
"summary": "YouTube loader for LangChain using yt-dlp",
"version": "0.0.8",
"project_urls": {
"Homepage": "https://github.com/aqib0770/langchain-yt-dlp"
},
"split_keywords": [
"langchain",
"yt-dlp",
"loader"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "12c1e4721f6381e16dd69f1f119fcefa27666276d90d24ac9e8b4494cf550c27",
"md5": "5edb0ffaff0c2d558f3a01af726d052b",
"sha256": "8226fd95edbf3cc70607640b50c3e09f407a113a085f25c0f7575ed9602f4098"
},
"downloads": -1,
"filename": "langchain_yt_dlp-0.0.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5edb0ffaff0c2d558f3a01af726d052b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 4788,
"upload_time": "2024-12-19T16:09:19",
"upload_time_iso_8601": "2024-12-19T16:09:19.335514Z",
"url": "https://files.pythonhosted.org/packages/12/c1/e4721f6381e16dd69f1f119fcefa27666276d90d24ac9e8b4494cf550c27/langchain_yt_dlp-0.0.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0c90f09dde067ea4c836a3f4af83307310cc2624e8690f5ce3d2f78e6d525d21",
"md5": "72a965edd7640d6a75e54e8b92436f1d",
"sha256": "10f77ad8ca86dcaf9d94a118eed26999e63071b543f6a765da10daa773001e43"
},
"downloads": -1,
"filename": "langchain_yt_dlp-0.0.8.tar.gz",
"has_sig": false,
"md5_digest": "72a965edd7640d6a75e54e8b92436f1d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 5124,
"upload_time": "2024-12-19T16:09:22",
"upload_time_iso_8601": "2024-12-19T16:09:22.609076Z",
"url": "https://files.pythonhosted.org/packages/0c/90/f09dde067ea4c836a3f4af83307310cc2624e8690f5ce3d2f78e6d525d21/langchain_yt_dlp-0.0.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-19 16:09:22",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "aqib0770",
"github_project": "langchain-yt-dlp",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "langchain",
"specs": []
},
{
"name": "yt-dlp",
"specs": []
},
{
"name": "langchain_community",
"specs": []
}
],
"lcname": "langchain-yt-dlp"
}