academicjobscraper


- Name: academicjobscraper
- Version: 0.1.0
- Home page: https://github.com/yourusername/academicjobscraper
- Summary: A package to scrape and filter academic job listings
- Upload time: 2025-02-08 20:27:33
- Maintainer: None
- Docs URL: None
- Author: Your Name
- Requires Python: >=3.6
- License: None
- Keywords: academic jobs, job scraping, web scraping
- Requirements: No requirements were recorded.

# AcademicJobScraper

A Python package for scraping and filtering academic job listings from academicjobsonline.org.

## Installation

```bash
pip install academicjobscraper
```

## Usage

### As a Python Module

```python
from academicjobscraper import AcademicJobScraper

# Initialize the scraper with required keywords and optional file names
scraper = AcademicJobScraper(
    keywords=["machine learning", "deep learning", "AI"],  # Required
    links_file="job_links.csv",      # Optional (default: job_links.csv)
    data_file="jobs_data.json",      # Optional (default: jobs_data.json)
    results_file="relevant_jobs.csv"  # Optional (default: relevant_jobs.csv)
)

# Start scraping from the parent ("mother") search-results URL
scraper.scrape("https://academicjobsonline.org/your-search-url")
```

### Command Line Interface

```bash
# Basic usage with required parameters
academicjobscraper "https://academicjobsonline.org/your-search-url" "machine learning" "deep learning" "AI"

# With optional file name parameters
academicjobscraper "https://academicjobsonline.org/your-search-url" \
    "machine learning" "deep learning" "AI" \
    --links-file custom_links.csv \
    --data-file custom_data.json \
    --results-file custom_results.csv
```
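
For readers who want to see how the arguments above fit together, here is a minimal sketch of a parser with the same shape, written with the standard-library `argparse`. It is an illustration only and not the package's actual source: the function name `build_parser` and the positional names `url` and `keywords` are assumptions, while the three `--…-file` flags and their defaults are taken from the examples in this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the documented command line."""
    parser = argparse.ArgumentParser(prog="academicjobscraper")
    # First positional argument: the search-results URL to start from.
    parser.add_argument("url", help="search-results URL to start from")
    # Every remaining positional argument is treated as a keyword.
    parser.add_argument("keywords", nargs="+", help="one or more keywords")
    # Optional output file names, with the defaults listed in this README.
    parser.add_argument("--links-file", default="job_links.csv")
    parser.add_argument("--data-file", default="jobs_data.json")
    parser.add_argument("--results-file", default="relevant_jobs.csv")
    return parser


args = build_parser().parse_args(
    [
        "https://academicjobsonline.org/your-search-url",
        "machine learning",
        "AI",
        "--results-file",
        "custom_results.csv",
    ]
)
print(args.url)
print(args.keywords)
print(args.links_file, args.data_file, args.results_file)
```

Quoting multi-word keywords on the shell, as in the examples above, is what keeps `"machine learning"` together as a single keyword.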

## Features

- Scrapes job listings from academicjobsonline.org
- Extracts detailed job information
- Filters jobs based on provided keywords
- Customizable output file names
- Progress tracking during scraping
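
This README does not say how keywords are matched against a listing. As a rough sketch of one plausible approach (an assumption, not the package's confirmed behavior), the hypothetical helper `matches_keywords` below does a case-insensitive, whole-word match, so that a short keyword such as "AI" does not match inside a word like "maintain".

```python
import re
from typing import Iterable


def matches_keywords(text: str, keywords: Iterable[str]) -> bool:
    """Return True if any keyword occurs in text as a whole word or phrase.

    Hypothetical helper for illustration; the package may match differently.
    """
    for keyword in keywords:
        # \b anchors keep short keywords from matching inside longer words.
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False


keywords = ["machine learning", "deep learning", "AI"]
print(matches_keywords("Postdoc in Machine Learning", keywords))
print(matches_keywords("We maintain a chemistry lab", keywords))
```

A plain substring test (`keyword.lower() in text.lower()`) would be simpler, but it would flag the second example because "maintain" contains "ai".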

## Output Files

The scraper writes three files. The names below are the defaults and can be changed as shown under Usage:

1. `job_links.csv` - Contains all scraped job URLs
2. `jobs_data.json` - Contains detailed information for all jobs
3. `relevant_jobs.csv` - Contains filtered jobs matching the keywords
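
The three files can be read back with the standard library alone. The sketch below assumes the default file names listed above and makes no assumption about column names or JSON keys, since this README does not document them; `load_outputs` is a hypothetical helper, not part of the package.

```python
import csv
import json
from pathlib import Path


def load_outputs(directory="."):
    """Load the three default output files from directory.

    Returns (links_rows, jobs_data, relevant_rows). The CSV files are
    returned as lists of raw rows because their columns are not documented.
    """
    base = Path(directory)
    with open(base / "job_links.csv", newline="", encoding="utf-8") as f:
        links_rows = list(csv.reader(f))
    with open(base / "jobs_data.json", encoding="utf-8") as f:
        jobs_data = json.load(f)
    with open(base / "relevant_jobs.csv", newline="", encoding="utf-8") as f:
        relevant_rows = list(csv.reader(f))
    return links_rows, jobs_data, relevant_rows
```

After a scrape has finished, calling `load_outputs()` in the working directory gives a quick way to inspect what was collected before committing to a particular schema.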

## License

This project is licensed under the MIT License - see the LICENSE file for details.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/yourusername/academicjobscraper",
    "name": "academicjobscraper",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "academic jobs, job scraping, web scraping",
    "author": "Your Name",
    "author_email": "your.email@example.com",
    "download_url": "https://files.pythonhosted.org/packages/df/45/caf0e08332018461031c163c8a45fec8f900948757f9d74d5722e5930267/academicjobscraper-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A package to scrape and filter academic job listings",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/yourusername/academicjobscraper"
    },
    "split_keywords": [
        "academic jobs",
        " job scraping",
        " web scraping"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4a3a55d2c00aca464fe98197240b014956a80d1c34b2ea559a4801a2635db172",
                "md5": "2b8fa331b4bce694452df6a878b69003",
                "sha256": "ef8b5af53ec3bf5893686f7c8ef6ac07a3254b7503c3d30def6a6f6a0eaf9f5d"
            },
            "downloads": -1,
            "filename": "academicjobscraper-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2b8fa331b4bce694452df6a878b69003",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 5425,
            "upload_time": "2025-02-08T20:27:31",
            "upload_time_iso_8601": "2025-02-08T20:27:31.594449Z",
            "url": "https://files.pythonhosted.org/packages/4a/3a/55d2c00aca464fe98197240b014956a80d1c34b2ea559a4801a2635db172/academicjobscraper-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "df45caf0e08332018461031c163c8a45fec8f900948757f9d74d5722e5930267",
                "md5": "0ecd99d6573cb7daee613d7c2abd1f3d",
                "sha256": "9b732ee2a7a5e7491ada730fef124e8c8107b3e9970691894bf90d23f21d2ee9"
            },
            "downloads": -1,
            "filename": "academicjobscraper-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "0ecd99d6573cb7daee613d7c2abd1f3d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 4487,
            "upload_time": "2025-02-08T20:27:33",
            "upload_time_iso_8601": "2025-02-08T20:27:33.471425Z",
            "url": "https://files.pythonhosted.org/packages/df/45/caf0e08332018461031c163c8a45fec8f900948757f9d74d5722e5930267/academicjobscraper-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-08 20:27:33",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "yourusername",
    "github_project": "academicjobscraper",
    "github_not_found": true,
    "lcname": "academicjobscraper"
}
        