ugrc-palletjack


- Name: ugrc-palletjack
- Version: 3.0.0
- Home page: https://github.com/agrc/palletjack
- Summary: Updating AGOL feature services with data from SFTP shares.
- Upload time: 2023-03-13 23:29:24
- Author: Jake Adams, UGRC
- Keywords: gis
- Requirements: No requirements were recorded.

# agrc/palletjack

![Build Status](https://github.com/agrc/palletjack/workflows/Build%20and%20Test/badge.svg)
[![codecov](https://codecov.io/gh/agrc/palletjack/branch/main/graph/badge.svg)](https://codecov.io/gh/agrc/palletjack)

A library of classes and methods for automatically updating AGOL feature services with data from several different types of external sources. Client apps (sometimes called 'skids') can reuse these classes for common use cases. The code modules are oriented around each step in the extract, transform, and load process.

`palletjack` works with pandas DataFrames (either regular for tabular data or Esri's spatially-enabled dataframes for spatial data). The extract and transform methods return dataframes and the load methods consume dataframes as their source data.

The [documentation](https://agrc.github.io/palletjack/palletjack) includes a user guide along with an API description of the available classes and methods.

Pallet jack: [forklift's](https://www.github.com/agrc/forklift) little brother.

## Dependencies

`palletjack` relies on the dependencies listed in `setup.py`. These are all available on PyPI and can be installed in most environments, including Google Cloud Functions.

The `arcgis` library does all the heavy lifting for spatial data. If the `arcpy` library is not available (such as in a cloud function), `arcgis` relies on `shapely` for its geometry engine.
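
To see which geometry engine a given environment can offer, a small check like the following may help. It is only a sketch: the helper name `available_geometry_engines` is hypothetical and not part of `palletjack` or `arcgis`, and it only tests whether the packages can be found, not which one `arcgis` will actually select.

```python
from importlib.util import find_spec


def available_geometry_engines():
    """Return the geometry engine packages that can be found in this environment."""
    candidates = ("arcpy", "shapely")
    #: find_spec returns None for a top-level package that is not installed
    return [name for name in candidates if find_spec(name) is not None]


engines = available_geometry_engines()
if engines:
    print(f"Geometry engines found: {', '.join(engines)}")
else:
    print("Neither arcpy nor shapely was found; spatial operations will likely fail.")
```

Running this inside the deployment target (for example, a cloud function's build environment) is usually more informative than running it on a desktop machine that has ArcGIS Pro installed.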

## Installation

1. Activate your application's environment
1. `pip install ugrc-palletjack`

## Quick start

1. Import the desired modules
1. Use a class in `extract` to load a dataframe from an external source
1. Transform your dataframe as desired with helper methods from `transform`
1. Use the dataframe to update a hosted feature service using the methods in `load`

   ```python
   import arcgis
   import pandas as pd
   from palletjack import extract, transform, load

   #: Load the data from a Google Sheet
   gsheet_extractor = extract.GSheetLoader(path_to_service_account_json)
   sheet_df = gsheet_extractor.load_specific_worksheet_into_dataframe(sheet_id, 'title of desired sheet', by_title=True)

   #: Convert the data to points using lat/long fields, clean for uploading
   #: (the .spatial accessor is provided by the arcgis package)
   spatial_df = pd.DataFrame.spatial.from_xy(sheet_df, x_column='longitude', y_column='latitude')
   renamed_df = transform.DataCleaning.rename_dataframe_columns_for_agol(spatial_df)
   cleaned_df = transform.DataCleaning.switch_to_nullable_int(renamed_df, ['an_int_field_with_null_values'])

   #: Truncate the existing feature service data and load the new data
   gis = arcgis.gis.GIS('my_agol_org_url', 'username', 'super-duper-secure-password')
   updates = load.FeatureServiceUpdater.truncate_and_load_features(
      gis, 'feature_service_item_id', cleaned_df, r'c:\directory\to\save\truncated\data\in\case\of\error'
   )
   ```
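
The nullable-int step in the example above exists because pandas stores an integer column containing missing values as floats by default. The following pandas-only sketch illustrates the underlying dtype issue; it is not `palletjack`'s implementation of `switch_to_nullable_int`, and the column name `count` is made up for illustration.

```python
import pandas as pd

#: A column of whole numbers with one missing value
df = pd.DataFrame({"count": [1, None, 3]})

#: The missing value forces pandas to store the column as floats
print(df["count"].dtype)

#: Casting to the nullable "Int64" dtype keeps the values as integers
#: and represents the gap as <NA> instead of NaN
df["count"] = df["count"].astype("Int64")
print(df["count"].dtype)
print(df["count"].tolist())
```
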

## Development

1. Create a conda environment with Python 3.9
   - `conda create -n palletjack python=3.9`
   - `conda activate palletjack`
1. Clone the repo
1. Install in dev mode with development dependencies
   - `pip install -e .[tests]`

### Troubleshooting Weird Append Errors

If a `FeatureLayer.append()` call (within a `load.FeatureServiceUpdater` method) fails with a vague error such as "Unknown Error: 500", you can query the append job's status to get more info. The debug log will include the HTTP request for the job, something like the following:
`https://services1.arcgis.com:443 POST /<unique string>/arcgis/rest/services/<feature layer name>/FeatureServer/<layer id>/append/jobs/<job guid>?f=json token=<crazy long token string>`

You can combine this with a token from an AGOL browser session to build a new job status URL. To get the token, log into AGOL in a browser and open a private hosted feature layer item. Click the layer, and then open the browser's developer console. With the Network tab of the console open, click on the "View" link for the service URL. You should see a document in the list whose name includes "?token=<really long token string>". Copy the name and then extract the token string from it.

Now that you've got the token string, you can build the status query:
`https://services1.arcgis.com/<unique string>/arcgis/rest/services/<feature layer name>/FeatureServer/<layer id>/append/jobs/<job guid>?f=json&token=<token from agol>`

Opening this URL in a browser should return a JSON message that will hopefully give you more info as to why the append failed.
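
If you would rather assemble the status URL in code than by hand, a sketch along these lines may help. The function name `build_append_status_url` and all of the argument values below are hypothetical placeholders; substitute the pieces from your own debug log and the token copied from AGOL.

```python
from urllib.parse import quote, urlencode


def build_append_status_url(host, unique_string, layer_name, layer_id, job_guid, token):
    """Assemble the append job status URL from the pieces found in the debug log."""
    path = "/".join([
        quote(unique_string), "arcgis", "rest", "services", quote(layer_name),
        "FeatureServer", str(layer_id), "append", "jobs", quote(job_guid),
    ])
    #: urlencode escapes any special characters in the token
    query = urlencode({"f": "json", "token": token})
    return f"https://{host}/{path}?{query}"


#: Placeholder values for illustration only
status_url = build_append_status_url(
    "services1.arcgis.com", "abc123", "my_layer", 0, "job-guid", "TOKEN"
)
print(status_url)
```

The token grants access to your AGOL session, so avoid pasting the resulting URL into logs, tickets, or chat messages.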



            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/agrc/palletjack",
    "name": "ugrc-palletjack",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "gis",
    "author": "Jake Adams, UGRC",
    "author_email": "jdadams@utah.gov",
    "download_url": "https://files.pythonhosted.org/packages/97/69/9c0e6ee83d2c5ffa50a814ab9b62010d9045fe45f19c89d4e79992133340/ugrc-palletjack-3.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Updating AGOL feature services with data from SFTP shares.",
    "version": "3.0.0",
    "split_keywords": [
        "gis"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e9f1c44c8cd9c55eacc70a7bed728aa19636d7b261224ac149a1d67d38ad0dd8",
                "md5": "2b610f1bcf15579e088b5fd867f0d229",
                "sha256": "1dc81b625e043267b30c026c3e8cbfa51cb05fc7e7d32af0d7d2466153b39d68"
            },
            "downloads": -1,
            "filename": "ugrc_palletjack-3.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2b610f1bcf15579e088b5fd867f0d229",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 35833,
            "upload_time": "2023-03-13T23:29:22",
            "upload_time_iso_8601": "2023-03-13T23:29:22.625958Z",
            "url": "https://files.pythonhosted.org/packages/e9/f1/c44c8cd9c55eacc70a7bed728aa19636d7b261224ac149a1d67d38ad0dd8/ugrc_palletjack-3.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "97699c0e6ee83d2c5ffa50a814ab9b62010d9045fe45f19c89d4e79992133340",
                "md5": "a8f53ce413ed7628ece4f87cb6b24f0b",
                "sha256": "fc701d2d12db5aff0a469bda0cbd5f83d709b664a95c74d5784e498fd57cc77f"
            },
            "downloads": -1,
            "filename": "ugrc-palletjack-3.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "a8f53ce413ed7628ece4f87cb6b24f0b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 35171,
            "upload_time": "2023-03-13T23:29:24",
            "upload_time_iso_8601": "2023-03-13T23:29:24.850258Z",
            "url": "https://files.pythonhosted.org/packages/97/69/9c0e6ee83d2c5ffa50a814ab9b62010d9045fe45f19c89d4e79992133340/ugrc-palletjack-3.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-13 23:29:24",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "agrc",
    "github_project": "palletjack",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "lcname": "ugrc-palletjack"
}
        