bitchute-scraper


| Field | Value |
| --- | --- |
| Name | bitchute-scraper |
| Version | 1.0.0 |
| home_page | https://github.com/bumatic/bitchute-scraper |
| Summary | A modern, API-based package to scrape BitChute platform data. |
| upload_time | 2025-07-20 09:08:17 |
| maintainer | Marcus Burkhardt |
| docs_url | None |
| author | Marcus Burkhardt |
| requires_python | >=3.7 |
| license | None |
| keywords | bitchute, api, scraper, video, data-collection, download, media, research, social-media, content-analysis, web-scraping, data-science, automation, bulk-download |
| VCS | |
| bugtrack_url | |
| requirements | requests, pandas, python-dateutil, retrying, selenium, webdriver-manager, urllib3, openpyxl, tqdm, pyarrow, psutil, pyyaml, pytest, pytest-mock, pytest-cov, black, flake8, mypy, types-requests, types-python-dateutil, sphinx, sphinx-rtd-theme, myst-parser |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5643102.svg)](https://doi.org/10.5281/zenodo.5643102)

# BitChute Scraper

A Python scraper for the BitChute video platform. It lets you search for videos and retrieve platform recommendations such as trending videos, popular videos (now called "fresh"), and trending tags. Version 1.0.0 is a major update: it collects data through an API rather than with the Selenium-based scraping used in earlier, now-defunct versions. Because the codebase was completely rewritten in collaboration with Claude AI, backwards compatibility is not provided.

## Features

- **Fast API-based data collection** - 10x faster than HTML parsing approaches
- **Automatic media downloads** - Thumbnails and videos with smart caching
- **Comprehensive data models** - Videos, channels, hashtags with computed properties
- **Concurrent processing** - Parallel requests with configurable rate limiting
- **Multiple export formats** - CSV, JSON, Excel, Parquet with timestamps
- **Command-line interface** - Easy automation and scripting support
- **Robust error handling** - Automatic retries and graceful fallbacks
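
How the package implements its automatic retries is not described in this README. As a rough illustration of the general technique only, a retry helper with exponential backoff might look like the sketch below. The names `retry_call`, `max_attempts`, `base_delay`, and `sleep` are hypothetical and are not part of the package's API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_call(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on any exception with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ... between tries.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the helper can be exercised in tests without real delays.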

## Installation

Install from PyPI:

```bash
pip3 install bitchute-scraper
```

For full functionality including progress bars and fast data formats:

```bash
# Quote the requirement so shells such as zsh do not treat [full] as a glob
pip install "bitchute-scraper[full]"
```

### System Requirements

- Python 3.7+
- Google Chrome or Chromium browser
- ChromeDriver (auto-managed)

## Quick Start

### Basic Usage

```python
import bitchute

# Initialize API client
api = bitchute.BitChuteAPI(verbose=True)

# Get trending videos
trending = api.get_trending_videos('day', limit=50)
print(f"Retrieved {len(trending)} trending videos")

# Search for videos
results = api.search_videos('climate change', limit=100)

# Get video details
video_info = api.get_video_info('VIDEO_ID', include_counts=True)
```

### Download Support

```python
import bitchute

# Initialize with downloads enabled
api = bitchute.BitChuteAPI(
    enable_downloads=True,
    download_base_dir="downloads",
    verbose=True
)

# Download videos with thumbnails
videos = api.get_trending_videos(
    'week',
    limit=20,
    download_thumbnails=True,
    download_videos=True
)
```

### Data Export

```python
from bitchute.utils import DataExporter

# Get data (using the `api` client created in Basic Usage) and export to multiple formats
videos = api.get_popular_videos(limit=100)

exporter = DataExporter()
exported_files = exporter.export_data(
    videos,
    'popular_videos',
    ['csv', 'json', 'xlsx']
)
```

### Command Line Interface

```bash
# Get trending videos
bitchute trending --timeframe day --limit 50 --format csv

# Search videos with details
bitchute search "bitcoin" --limit 100 --sort views --analyze

# Export to Excel
bitchute popular --limit 200 --format xlsx --analyze
```
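
For automation from Python, the CLI can be driven through `subprocess`. The sketch below builds the argument list for `bitchute trending` using only the flags shown in the example above. The helper names `trending_command` and `run_trending` are hypothetical, and the accepted values for each flag should be checked against the CLI's own help output.

```python
import subprocess
from typing import List


def trending_command(timeframe: str = "day", limit: int = 50, fmt: str = "csv") -> List[str]:
    """Build the argument list for `bitchute trending` from the flags shown above."""
    return [
        "bitchute", "trending",
        "--timeframe", timeframe,
        "--limit", str(limit),
        "--format", fmt,
    ]


def run_trending(timeframe: str = "day", limit: int = 50, fmt: str = "csv") -> int:
    """Run the command and return its exit code (requires the CLI to be installed)."""
    completed = subprocess.run(trending_command(timeframe, limit, fmt), check=False)
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting problems when values contain spaces.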

## API Overview

### Core Methods

**Platform Recommendations:**
- `get_trending_videos(timeframe, limit)` - Trending by day/week/month
- `get_popular_videos(limit)` - Popular videos
- `get_recent_videos(limit)` - Most recent uploads
- `get_short_videos(limit)` - Short-form content

**Search Functions:**
- `search_videos(query, sensitivity, sort, limit)` - Video search
- `search_channels(query, sensitivity, limit)` - Channel search

**Individual Items:**
- `get_video_info(video_id, include_counts, include_media)` - Single video details
- `get_channel_info(channel_id)` - Channel information

**Hashtags:**
- `get_trending_hashtags(limit)` - Trending hashtags
- `get_videos_by_hashtag(hashtag, limit)` - Videos by hashtag
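
As a small usage sketch, the documented `get_trending_videos(timeframe, limit)` call can be looped over the three timeframes listed above. The helper name `collect_trending` is hypothetical; it only assumes that the client exposes `get_trending_videos` with the signature shown.

```python
from typing import Any, Dict, Iterable


def collect_trending(
    api: Any,
    timeframes: Iterable[str] = ("day", "week", "month"),
    limit: int = 50,
) -> Dict[str, Any]:
    """Fetch trending videos once per timeframe and key the results by timeframe."""
    results = {}
    for timeframe in timeframes:
        results[timeframe] = api.get_trending_videos(timeframe, limit=limit)
    return results
```

With a real client this would be called as `collect_trending(bitchute.BitChuteAPI(), limit=20)`.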

### Configuration Options

```python
api = bitchute.BitChuteAPI(
    verbose=True,                # Enable logging
    enable_downloads=True,       # Enable media downloads
    download_base_dir="data",    # Download directory
    max_concurrent_downloads=5,  # Concurrent downloads
    rate_limit=0.3,              # Seconds between requests
    timeout=60                   # Request timeout
)
```
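
The `rate_limit` option is described above as the number of seconds between requests. To make that idea concrete, here is a minimal sketch of a minimum-interval limiter. It illustrates the general technique under that reading and is not the package's implementation; the class name `MinIntervalLimiter` and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Block so that consecutive wait() calls are at least `interval` seconds apart."""

    def __init__(
        self,
        interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time the previous call was released

    def wait(self) -> float:
        """Sleep if needed and return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `limiter.wait()` before each request keeps requests at least `interval` seconds apart; the first call never sleeps.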

### Data Models

All methods return pandas DataFrames with consistent schemas:

- **Video**: Complete metadata with engagement metrics and download paths
- **Channel**: Channel information with statistics and social links
- **Hashtag**: Trending hashtags with rankings and video counts

## Advanced Usage

### Bulk Data Collection

```python
# Get large datasets efficiently
all_videos = api.get_all_videos(limit=5000, include_details=True)

# Process with filtering
from bitchute.utils import ContentFilter
filtered = ContentFilter.filter_by_views(all_videos, min_views=1000)
crypto_videos = ContentFilter.filter_by_keywords(filtered, ['bitcoin', 'crypto'])
```
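
The internals of `ContentFilter` are not shown in this README. Because the API methods return pandas DataFrames, comparable filtering can also be written directly in pandas. The sketch below uses stand-in data and assumes hypothetical column names `title` and `view_count`; inspect `df.columns` on real results to find the actual names.

```python
import re

import pandas as pd

# Stand-in data; the column names are assumptions, not the package's schema.
videos = pd.DataFrame(
    {
        "title": ["Bitcoin update", "Cooking pasta", "Crypto news", "bitcoin basics"],
        "view_count": [5000, 12000, 300, 1500],
    }
)

# Keep rows with at least 1000 views.
popular = videos[videos["view_count"] >= 1000]

# Keep rows whose title contains any keyword (case-insensitive).
keywords = ["bitcoin", "crypto"]
pattern = "|".join(re.escape(word) for word in keywords)
matches = popular[popular["title"].str.contains(pattern, case=False, regex=True)]

print(matches["title"].tolist())
```

Escaping each keyword with `re.escape` keeps characters such as `+` or `.` from being read as regular-expression syntax.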

### Performance Monitoring

```python
# Track download performance
stats = api.get_download_stats()
print(f"Success rate: {stats['success_rate']:.1%}")
print(f"Total downloaded: {stats['total_bytes_formatted']}")
```
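
The example above reads a pre-formatted size from `stats['total_bytes_formatted']`. How the package formats that value is not documented here. If you need to format raw byte counts yourself, a helper along these lines would do it; `format_bytes` is a hypothetical name and the 1024-based units are an assumption.

```python
def format_bytes(num_bytes: int) -> str:
    """Render a byte count with 1024-based units, for example 1536 -> '1.5 KB'."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TB"
```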

## Documentation

- **API Reference**: Complete method documentation with examples
- **User Guide**: Detailed tutorials and best practices
- **CLI Reference**: Command-line usage and automation examples

## Contributing

We welcome contributions! Please see our contributing guidelines:

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass
5. Submit a pull request

### Development Setup

```bash
git clone https://github.com/bumatic/bitchute-scraper.git
cd bitchute-scraper
pip install -e ".[dev]"
pytest
```

## License

MIT License - see LICENSE file for details.

## Support

- **Issues**: [GitHub Issues](https://github.com/bumatic/bitchute-scraper/issues)
- **Discussions**: [GitHub Discussions](https://github.com/bumatic/bitchute-scraper/discussions)


## Disclaimer

This software is intended for educational and research purposes only.
Users are responsible for complying with BitChute's Terms of Service and all applicable laws.
The software authors disclaim all liability for any misuse of this software.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/bumatic/bitchute-scraper",
    "name": "bitchute-scraper",
    "maintainer": "Marcus Burkhardt",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "Marcus Burkhardt <marcus.burkhardt@gmail.com>",
    "keywords": "bitchute, api, scraper, video, data-collection, download, media, research, social-media, content-analysis, web-scraping, data-science, automation, bulk-download",
    "author": "Marcus Burkhardt",
    "author_email": "Marcus Burkhardt <marcus.burkhardt@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/80/9f/31f01fe4aa89bc1459f5089b0ed2b7f9ee91f8fc71504f97b3eecfedae25/bitchute_scraper-1.0.0.tar.gz",
    "platform": "any",
    "bugtrack_url": null,
    "license": null,
    "summary": "A modern, API-based package to scrape BitChute platform data.",
    "version": "1.0.0",
    "project_urls": {
        "Bug Reports": "https://github.com/bumatic/bitchute-scraper/issues",
        "Changelog": "https://github.com/bumatic/bitchute-scraper/blob/main/CHANGELOG.md",
        "Documentation": "https://github.com/bumatic/bitchute-scraper/blob/main/README.md",
        "Download": "https://github.com/bumatic/bitchute-scraper/archive/v1.0.0.tar.gz",
        "Homepage": "https://github.com/bumatic/bitchute-scraper",
        "Repository": "https://github.com/bumatic/bitchute-scraper"
    },
    "split_keywords": [
        "bitchute",
        " api",
        " scraper",
        " video",
        " data-collection",
        " download",
        " media",
        " research",
        " social-media",
        " content-analysis",
        " web-scraping",
        " data-science",
        " automation",
        " bulk-download"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "30dc41378c9971c2251c4101783d9160d2f781701c474f5a4041dc226eceb0cc",
                "md5": "edef2e2a5da097b1805e626625e9c9b1",
                "sha256": "41cf0282bc4278cef0f893b77d69bc2aedca748e0f7d8ba8fb091f44d7a57ac7"
            },
            "downloads": -1,
            "filename": "bitchute_scraper-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "edef2e2a5da097b1805e626625e9c9b1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 71840,
            "upload_time": "2025-07-20T09:08:15",
            "upload_time_iso_8601": "2025-07-20T09:08:15.280597Z",
            "url": "https://files.pythonhosted.org/packages/30/dc/41378c9971c2251c4101783d9160d2f781701c474f5a4041dc226eceb0cc/bitchute_scraper-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "809f31f01fe4aa89bc1459f5089b0ed2b7f9ee91f8fc71504f97b3eecfedae25",
                "md5": "83e29b471bac52b2f1ecf98091507ba6",
                "sha256": "250ceed2101b61afae23ef018531a94011c2b9a9b7342194ee969970786bf094"
            },
            "downloads": -1,
            "filename": "bitchute_scraper-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "83e29b471bac52b2f1ecf98091507ba6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 103621,
            "upload_time": "2025-07-20T09:08:17",
            "upload_time_iso_8601": "2025-07-20T09:08:17.930819Z",
            "url": "https://files.pythonhosted.org/packages/80/9f/31f01fe4aa89bc1459f5089b0ed2b7f9ee91f8fc71504f97b3eecfedae25/bitchute_scraper-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-20 09:08:17",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "bumatic",
    "github_project": "bitchute-scraper",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.28.0"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.5.0"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    ">=",
                    "2.8.0"
                ]
            ]
        },
        {
            "name": "retrying",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "selenium",
            "specs": [
                [
                    ">=",
                    "4.10.0"
                ]
            ]
        },
        {
            "name": "webdriver-manager",
            "specs": [
                [
                    ">=",
                    "3.8.0"
                ]
            ]
        },
        {
            "name": "urllib3",
            "specs": [
                [
                    ">=",
                    "1.26.0"
                ]
            ]
        },
        {
            "name": "openpyxl",
            "specs": [
                [
                    ">=",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    ">=",
                    "4.64.0"
                ]
            ]
        },
        {
            "name": "pyarrow",
            "specs": [
                [
                    ">=",
                    "10.0.0"
                ]
            ]
        },
        {
            "name": "psutil",
            "specs": [
                [
                    ">=",
                    "5.8.0"
                ]
            ]
        },
        {
            "name": "pyyaml",
            "specs": [
                [
                    ">=",
                    "6.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    ">=",
                    "7.0.0"
                ]
            ]
        },
        {
            "name": "pytest-mock",
            "specs": [
                [
                    ">=",
                    "3.10.0"
                ]
            ]
        },
        {
            "name": "pytest-cov",
            "specs": [
                [
                    ">=",
                    "4.0.0"
                ]
            ]
        },
        {
            "name": "black",
            "specs": [
                [
                    ">=",
                    "22.0.0"
                ]
            ]
        },
        {
            "name": "flake8",
            "specs": [
                [
                    ">=",
                    "5.0.0"
                ]
            ]
        },
        {
            "name": "mypy",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "types-requests",
            "specs": [
                [
                    ">=",
                    "2.28.0"
                ]
            ]
        },
        {
            "name": "types-python-dateutil",
            "specs": [
                [
                    ">=",
                    "2.8.0"
                ]
            ]
        },
        {
            "name": "sphinx",
            "specs": [
                [
                    ">=",
                    "5.0.0"
                ]
            ]
        },
        {
            "name": "sphinx-rtd-theme",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "myst-parser",
            "specs": [
                [
                    ">=",
                    "0.18.0"
                ]
            ]
        }
    ],
    "lcname": "bitchute-scraper"
}
        