pyzill


| Field | Value |
|---|---|
| Name | pyzill |
| Version | 1.0.0 |
| Summary | Zillow scraper in Python |
| Upload time | 2024-12-01 23:42:49 |
| License | MIT |
| Keywords | zillow, scraper, crawler |
| Requirements | No requirements were recorded. |

# Zillow scraper in Python

## Overview
This project is an open-source Python tool for extracting property listing information from Zillow. It is designed to be easy to use, which makes it a good fit for developers who need Zillow listing data.

## Features
- Full search support
- Extracts detailed property information from Zillow
- Implemented in Python, chosen for its popularity
- Easy to integrate with existing Python projects

### Important
- Use rotating residential proxies. Zillow will block you if you make multiple requests from the same IP.
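
As a rough illustration of what rotating proxies can look like in practice, the sketch below cycles through a small pool so that consecutive requests leave from different addresses. `build_proxy_url` and `proxy_rotation` are hypothetical helpers, not part of pyzill, and the `http://user:password@host:port` format is an assumption. In real use you would probably build each entry with `pyzill.parse_proxy`, as in the examples further down.

```python
from itertools import cycle
from urllib.parse import quote


def build_proxy_url(host, port, username, password):
    """Hypothetical helper: format proxy credentials as an http:// proxy URL."""
    # Percent-encode the credentials so characters such as "@" do not break the URL.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


def proxy_rotation(proxies):
    """Return an endless iterator that yields one proxy URL per request, in turn."""
    return cycle([build_proxy_url(*proxy) for proxy in proxies])


# Placeholder proxies; replace them with the ones from your provider.
pool = proxy_rotation([
    ("p1.example.com", "8000", "user", "p@ss"),
    ("p2.example.com", "8000", "user", "pw"),
])
first_proxy = next(pool)
second_proxy = next(pool)
```

Each call to `next(pool)` gives the proxy URL to use for the next request, so no two consecutive requests share an IP as long as the pool has at least two entries.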

### Install

```bash
$ pip install pyzill
```
## Examples

```Python
import json

import pyzill

# Bounding box of the map area to search (north-east and south-west corners).
ne_lat = 38.602951833355434
ne_long = -87.22283859375
sw_lat = 23.42674607019482
sw_long = -112.93084640625

# `pagination` selects the page of the list shown on the right side of a Zillow search.
# You don't need to iterate over all the pages, because Zillow sends the whole
# map data (mapResults) at once on the first page.
# However, Zillow returns at most 500 results. If mapResults holds 500 entries,
# try changing the zoom or moving the coordinates. Pagination won't help,
# because you will never get more than 500 results.
pagination = 1

proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]", "[proxy_port]", "[proxy_username]", "[proxy_password]")

# Sold listings.
results_sold = pyzill.sold(
    pagination,
    search_value="miami",
    min_beds=1, max_beds=1,
    min_bathrooms=None, max_bathrooms=None,
    min_price=10000, max_price=None,
    ne_lat=ne_lat, ne_long=ne_long, sw_lat=sw_lat, sw_long=sw_long,
    zoom_value=5,
    proxy_url=proxy_url,
)

# Listings for sale.
results_sale = pyzill.for_sale(
    pagination,
    search_value="",
    min_beds=None, max_beds=None,
    min_bathrooms=3, max_bathrooms=None,
    min_price=None, max_price=None,
    ne_lat=ne_lat, ne_long=ne_long, sw_lat=sw_lat, sw_long=sw_long,
    zoom_value=10,
    proxy_url=proxy_url,
)

# Listings for rent.
results_rent = pyzill.for_rent(
    pagination,
    search_value="",
    min_beds=1, max_beds=None,
    min_bathrooms=None, max_bathrooms=None,
    min_price=10000, max_price=None,
    ne_lat=ne_lat, ne_long=ne_long, sw_lat=sw_lat, sw_long=sw_long,
    zoom_value=15,
    proxy_url=proxy_url,
)

# Write each result set to its own JSON file.
with open("./jsondata_sold.json", "w") as f:
    json.dump(results_sold, f)
with open("./jsondata_sale.json", "w") as f:
    json.dump(results_sale, f)
with open("./jsondata_rent.json", "w") as f:
    json.dump(results_rent, f)
```
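
The comments in the example above suggest moving the coordinates when a search hits the 500-result cap. One possible way to automate that, shown here only as a sketch, is to split the bounding box into four quadrants and search each one again. `split_box` and `collect` are hypothetical helpers, not part of pyzill. `fetch` stands for any function you write that takes a box and returns a list of results, for example a thin wrapper around `pyzill.for_sale`.

```python
from typing import Callable, List, Tuple

# A box is (ne_lat, ne_long, sw_lat, sw_long), matching the search arguments.
Box = Tuple[float, float, float, float]


def split_box(box: Box) -> List[Box]:
    """Split a bounding box into four equal quadrants."""
    ne_lat, ne_long, sw_lat, sw_long = box
    mid_lat = (ne_lat + sw_lat) / 2
    mid_long = (ne_long + sw_long) / 2
    return [
        (ne_lat, ne_long, mid_lat, mid_long),  # north-east quadrant
        (ne_lat, mid_long, mid_lat, sw_long),  # north-west quadrant
        (mid_lat, ne_long, sw_lat, mid_long),  # south-east quadrant
        (mid_lat, mid_long, sw_lat, sw_long),  # south-west quadrant
    ]


def collect(box: Box, fetch: Callable[[Box], list], cap: int = 500, max_depth: int = 4) -> list:
    """Fetch a box; if the result count reaches the cap, subdivide and retry."""
    results = fetch(box)
    if len(results) < cap or max_depth == 0:
        return results
    merged = []
    for quadrant in split_box(box):
        merged.extend(collect(quadrant, fetch, cap, max_depth - 1))
    return merged
```

A listing that sits exactly on a quadrant border may come back twice, so you may want to deduplicate the merged list by whatever identifier the results carry. The exact shape of the dictionary pyzill returns is not described in this README, so the wrapper that pulls a list out of it is left to you.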
### For homes

```Python
import json

import pyzill

# Fetch the details of a home from its Zillow URL.
property_url = "https://www.zillow.com/homedetails/858-Shady-Grove-Ln-Harrah-OK-73045/339897685_zpid/"
proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]", "[proxy_port]", "[proxy_username]", "[proxy_password]")
data = pyzill.get_from_home_url(property_url, proxy_url)

# Save the details as JSON.
with open("details.json", "w") as f:
    json.dump(data, f)
```

```Python
import json

import pyzill

# Fetch the details of a home from its numeric Zillow ID.
property_id = 2056016566
proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]", "[proxy_port]", "[proxy_username]", "[proxy_password]")
data = pyzill.get_from_home_id(property_id, proxy_url)

# Save the details as JSON.
with open("details.json", "w") as f:
    json.dump(data, f)
```
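
If you only have home URLs but want to call the ID-based function, the number in front of `_zpid` in the URL looks like the home ID. The sketch below pulls it out with a regular expression. `zpid_from_url` is a hypothetical helper, not part of pyzill, and it is an assumption that this number is the same ID that `get_from_home_id` expects.

```python
import re
from typing import Optional

# Matches the digits directly before "_zpid" in a Zillow home URL.
ZPID_PATTERN = re.compile(r"/(\d+)_zpid/?")


def zpid_from_url(url: str) -> Optional[int]:
    """Return the numeric ID found in a home URL, or None if there is none."""
    match = ZPID_PATTERN.search(url)
    return int(match.group(1)) if match else None


home_id = zpid_from_url(
    "https://www.zillow.com/homedetails/858-Shady-Grove-Ln-Harrah-OK-73045/339897685_zpid/"
)
```

For URLs that don't follow this pattern the helper returns `None`, so check the result before passing it on.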

### For apartments

```Python
import json

import pyzill

# Fetch the details of an apartment building from its Zillow URL.
# Note: the function name is spelled "deparment" in the library.
property_url = "https://www.zillow.com/apartments/kissimmee-fl/the-nexus-at-overbrook/9DSWrh/"
proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]", "[proxy_port]", "[proxy_username]", "[proxy_password]")
data = pyzill.get_from_deparment_url(property_url, proxy_url)

# Save the details as JSON.
with open("details.json", "w") as f:
    json.dump(data, f)
```

```Python
import json

import pyzill

# Fetch the details of an apartment building from its Zillow ID.
# Note: the function name is spelled "deparment" in the library.
property_id = "CgKZT4"
proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]", "[proxy_port]", "[proxy_username]", "[proxy_password]")
data = pyzill.get_from_deparment_id(property_id, proxy_url)

# Save the details as JSON.
with open("details.json", "w") as f:
    json.dump(data, f)
```

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "pyzill",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "zillow, scraper, crawler",
    "author": null,
    "author_email": "John Balvin <johnchristian@hotmail.es>",
    "download_url": "https://files.pythonhosted.org/packages/53/ae/918d9f8944d4fd8dddf74d9343047eeda5d86bbc4d2e5fd7f56b97a5cedc/pyzill-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Zillow scraper in Python",
    "version": "1.0.0",
    "project_urls": {
        "Homepage": "https://github.com/johnbalvin/pyzill"
    },
    "split_keywords": [
        "zillow",
        " scraper",
        " crawler"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "18be231591c10c001c7748e3d7315997f6501e1c4aab076bc1a37ede3b495843",
                "md5": "f23be1a7794037164115755c78d0a7fb",
                "sha256": "64dfe9eb06c5a94294de7b19f4782c22dd7ebbe34167f4601fae7d914b270899"
            },
            "downloads": -1,
            "filename": "pyzill-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f23be1a7794037164115755c78d0a7fb",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 7861,
            "upload_time": "2024-12-01T23:42:47",
            "upload_time_iso_8601": "2024-12-01T23:42:47.980336Z",
            "url": "https://files.pythonhosted.org/packages/18/be/231591c10c001c7748e3d7315997f6501e1c4aab076bc1a37ede3b495843/pyzill-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "53ae918d9f8944d4fd8dddf74d9343047eeda5d86bbc4d2e5fd7f56b97a5cedc",
                "md5": "8a7523dc520d3773e5f29cd5eddeabb1",
                "sha256": "638174c8195fca2b217b413335d903270cae1d00f3a5b28e5579837e740d4435"
            },
            "downloads": -1,
            "filename": "pyzill-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "8a7523dc520d3773e5f29cd5eddeabb1",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 7711,
            "upload_time": "2024-12-01T23:42:49",
            "upload_time_iso_8601": "2024-12-01T23:42:49.578584Z",
            "url": "https://files.pythonhosted.org/packages/53/ae/918d9f8944d4fd8dddf74d9343047eeda5d86bbc4d2e5fd7f56b97a5cedc/pyzill-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-01 23:42:49",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "johnbalvin",
    "github_project": "pyzill",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "pyzill"
}
        