# Literotica URL Extractor CLI Tool
This tool extracts story URLs from fanfiction listing pages using `FanFicFare` and can download the stories themselves. It supports processing URLs from a file, saving the extracted URLs to a new file, and downloading all stories from those URLs. The tool is designed for fanfiction enthusiasts and researchers who want a streamlined way to manage and download content from various fanfiction sources.
## Features
- **Extract URLs**: Reads a file with URLs, processes each to list available stories on that page, and saves all URLs to a specified output file.
- **Download Stories**: Allows downloading stories listed in a file with extracted URLs.
- **Progress Tracking**: Uses `tqdm` to display download progress.
## Requirements
- Python 3.6 or later
- [Click](https://pypi.org/project/click/) - For creating the CLI
- [tqdm](https://pypi.org/project/tqdm/) - For progress bars
- [FanFicFare](https://pypi.org/project/FanFicFare/) - The core library for extracting and downloading fanfiction
- [Pygments](https://pypi.org/project/Pygments/) - Syntax highlighting (optional)
To install dependencies, run:
```bash
pip install -r requirements.txt
```

## Installation
Clone the repository and navigate to the folder:

```bash
git clone https://github.com/username/literotica-url-extractor.git
cd literotica-url-extractor
```

Install the tool:

```bash
pip install .
```
## Usage

The tool provides two primary commands: `url` and `download`.
### 1. Extract URLs from a File (`url`)

Extracts all URLs from a text file, processes them with FanFicFare, and saves the extracted list to a specified output file. With the `--d` flag, this command can also download the stories from those URLs.

**Usage:**

```bash
literotica url <path_to_file> [OPTIONS]
```

**Arguments:**

- `<path_to_file>`: Path to the text file containing a list of URLs.

**Options:**

- `--o`: Output file name for the extracted URLs. If omitted, defaults to `extracted_list.txt`.
- `--d`: If set to `True`, downloads all the stories from the extracted URLs.

**Example:**

```bash
literotica url urls.txt --o extracted_urls.txt --d True
```

In this example:

- `urls.txt` is the input file containing URLs to be processed.
- The extracted URLs are saved in `extracted_urls.txt`.
- The `--d True` option initiates the download of all listed stories.
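For reference, the input file is assumed to be plain text with one listing URL per line (the exact format is an assumption based on the description above; the URLs shown are placeholders):

```text
https://www.literotica.com/stories/memberpage.php?uid=12345&page=submissions
https://www.literotica.com/series/se/67890
```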
### 2. Download Stories from a File (`download`)

Downloads all stories listed in the provided file.

**Usage:**

```bash
literotica download <file>
```

**Arguments:**

- `<file>`: Path to the file containing URLs of the stories to download.

**Example:**

```bash
literotica download extracted_urls.txt
```

In this example, `extracted_urls.txt` is the file containing the URLs to download.
## Example Workflow

1. **Extract URLs and save to a file:**

   ```bash
   literotica url urls.txt --o extracted_urls.txt
   ```

   Extracts URLs from `urls.txt` and saves them in `extracted_urls.txt`.

2. **Download stories from the extracted URLs:**

   ```bash
   literotica download extracted_urls.txt
   ```

   Downloads all stories from the URLs listed in `extracted_urls.txt`.
## Code Structure

- `extract_urls_from_file`: Reads and cleans the URL list from a file.
- `fff_url_extractor`: Calls FanFicFare to list all stories for a given URL.
- `download_url_from_file`: Downloads stories from the URLs in a file, with progress tracking via `tqdm`.
- `prettify_url`: Formats URLs for better readability.
- `save_to_file`: Saves processed URLs to a file, appending to an existing file or creating a new one.
## License

This project is licensed under the MIT License. See the LICENSE file for details.

## Contributing

Contributions are welcome! Please fork the repository and submit a pull request.