# AcademicJobScraper
A Python package for scraping and filtering academic job listings from academicjobsonline.org.
## Installation
```bash
pip install academicjobscraper
```
## Usage
### As a Python Module
```python
from academicjobscraper import AcademicJobScraper
# Initialize the scraper with required keywords and optional file names
scraper = AcademicJobScraper(
    keywords=["machine learning", "deep learning", "AI"],  # Required
    links_file="job_links.csv",        # Optional (default: job_links.csv)
    data_file="jobs_data.json",        # Optional (default: jobs_data.json)
    results_file="relevant_jobs.csv"   # Optional (default: relevant_jobs.csv)
)
# Start scraping from a mother link (a search-results page URL)
scraper.scrape("https://academicjobsonline.org/your-search-url")
```
### Command Line Interface
```bash
# Basic usage with required parameters
academicjobscraper "https://academicjobsonline.org/your-search-url" "machine learning" "deep learning" "AI"
# With optional file name parameters
academicjobscraper "https://academicjobsonline.org/your-search-url" \
    "machine learning" "deep learning" "AI" \
    --links-file custom_links.csv \
    --data-file custom_data.json \
    --results-file custom_results.csv
```
## Features
- Scrapes job listings from academicjobsonline.org
- Extracts detailed job information
- Filters jobs based on provided keywords
- Customizable output file names
- Progress tracking during scraping
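
The README does not spell out how keyword filtering works. As a rough illustration, a case-insensitive substring match over each job's text fields might look like the sketch below (the `filter_jobs` function and the `title`/`description` field names are assumptions for illustration, not the package's actual API):

```python
def filter_jobs(jobs, keywords):
    """Return the jobs whose title or description contains any keyword,
    matched case-insensitively."""
    kws = [k.lower() for k in keywords]
    return [
        job for job in jobs
        if any(
            k in (job.get("title", "") + " " + job.get("description", "")).lower()
            for k in kws
        )
    ]

# Toy data standing in for scraped job records
jobs = [
    {"title": "Postdoc in Machine Learning", "description": "Deep networks"},
    {"title": "Lecturer in History", "description": "Medieval studies"},
]
print(filter_jobs(jobs, ["machine learning"]))
```

The real package may match on other fields or use a different strategy; inspect `jobs_data.json` to see what text is actually available for matching.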
## Output Files
The scraper generates three files:
1. `job_links.csv` - Contains all scraped job URLs
2. `jobs_data.json` - Contains detailed information for all jobs
3. `relevant_jobs.csv` - Contains filtered jobs matching the keywords
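
The output files are plain CSV/JSON, so they can be post-processed with the standard library. A minimal sketch of reading the filtered results (the `title`/`url` column names and sample row are assumptions; check the real file's header row before relying on them):

```python
import csv
import io

# Sample text standing in for the contents of relevant_jobs.csv;
# the column names here are illustrative, not guaranteed by the package.
sample = (
    "title,url\n"
    "Postdoc in Machine Learning,https://academicjobsonline.org/your-job-url\n"
)

# DictReader uses the first row as field names
for row in csv.DictReader(io.StringIO(sample)):
    print(row["title"], "->", row["url"])
```

For the real file, replace `io.StringIO(sample)` with `open("relevant_jobs.csv", newline="")`.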
## License
This project is licensed under the MIT License - see the LICENSE file for details.