# Crab-Lib
**Crab-Lib** is a Python library designed to simplify the processing, analysis, and visualization of GPS data, particularly for fishing grounds in the Gulf of Mexico. It includes tools for cleaning data, calculating distances, detecting clusters, and generating insightful visualizations. Whether you're a fisheries scientist, data analyst, or GIS enthusiast, **Crab-Lib** helps you make sense of complex geospatial datasets.
---
## Features
- **Data I/O**:
  - Load and save GPS data in CSV format.
  - Validate and handle missing or malformed data.
- **Data Preprocessing**:
  - Remove duplicates and invalid rows.
  - Filter data based on keywords in comments.
  - Standardize comment text for consistency.
- **Geospatial Analysis**:
  - Calculate great-circle distances using the Haversine formula.
  - Identify clusters of GPS points based on proximity.
  - Summarize datasets with averages and bounding boxes.
- **Visualization**:
  - Plot points, clusters, and density heatmaps.
  - Create interactive and informative visualizations.
- **Utilities**:
  - Convert coordinates from DMS to decimal degrees.
  - Calculate initial bearings between two GPS points.
  - Validate if a point lies within a bounding box.
---
## Table of Contents
1. [Installation](#installation)
2. [Quick Start](#quick-start)
3. [Detailed Examples](#detailed-examples)
   - [Data I/O](#data-io)
   - [Preprocessing](#preprocessing)
   - [Analysis](#analysis)
   - [Visualization](#visualization)
   - [Utilities](#utilities)
4. [API Reference](#api-reference)
5. [Troubleshooting](#troubleshooting)
6. [Contributing](#contributing)
7. [License](#license)
8. [Contact](#contact)
---
## Installation
### Install via Source
1. Clone the repository:
   ```bash
   git clone https://github.com/RedGloveProductions/crab-lib.git
   cd crab-lib
   ```
2. Install the library:
   ```bash
   pip install -e .
   ```

### Dependencies
Crab-Lib requires the following Python packages:
- matplotlib >= 3.0.0

For development, additional tools like pytest and black are recommended.

---
## Quick Start
Here's a quick guide to get started with Crab-Lib:

1. Load Data:
   ```python
   from crab_lib.io import load_csv
   data = load_csv("fishing_data.csv")
   ```
2. Clean Data:
   ```python
   from crab_lib.preprocessing import clean_data
   cleaned_data = clean_data(data)
   ```
3. Analyze Data:
   ```python
   from crab_lib.analysis import calculate_distances, find_clusters
   distances = calculate_distances(cleaned_data)
   clusters = find_clusters(cleaned_data, radius=100)
   ```
4. Visualize Data:
   ```python
   from crab_lib.visualization import plot_points
   plot_points(cleaned_data)
   ```

---
## Detailed Examples
### Data I/O
Load and Save Data:
```python
from crab_lib.io import load_csv, save_csv

# Load data from a CSV file
data = load_csv("fishing_data.csv")

# Save the loaded data to a new file
save_csv("cleaned_data.csv", data)
```
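
For orientation, here is a minimal sketch of what a loader and saver with these return types might do, using only the standard library. It is not Crab-Lib's implementation: the helper names `read_rows` and `write_rows` are hypothetical, and the column names `x`, `y`, and `comment` are an assumption taken from the data format named in the Troubleshooting section.

```python
import csv
import io
from typing import Dict, List

FIELDS = ["x", "y", "comment"]  # assumed column names, not confirmed by Crab-Lib


def read_rows(handle) -> List[Dict[str, str]]:
    """Read CSV rows into a list of dicts keyed by the header row."""
    return [dict(row) for row in csv.DictReader(handle)]


def write_rows(handle, rows: List[Dict[str, str]]) -> None:
    """Write rows back out as CSV, header first."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Round-trip through in-memory buffers instead of real files.
source = io.StringIO("x,y,comment\n-90.1,29.2,hotspot\n-89.7,28.9,\n")
rows = read_rows(source)

target = io.StringIO()
write_rows(target, rows)
```

Every value comes back as a string, which matches the `List[Dict[str, str]]` type shown for `load_csv` in the API Reference; converting coordinates to numbers would be a separate step.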
### Preprocessing
Clean, Filter, and Standardize Data:
```python
from crab_lib.preprocessing import clean_data, filter_data, standardize_comments

# Remove duplicates and invalid rows
cleaned_data = clean_data(data)

# Filter rows containing the keyword 'hotspot'
filtered_data = filter_data(cleaned_data, "hotspot")

# Standardize comments for consistency
standardized_data = standardize_comments(filtered_data)
```
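
To make the three steps concrete, the sketch below reimplements them in plain Python. The function names (`drop_bad_rows`, `keep_matching`, `normalize_comments`) are hypothetical, and the rules chosen here (numeric `x`/`y`, case-insensitive keyword match, trimmed lowercase comments) are assumptions about what cleaning, filtering, and standardizing could reasonably mean, not a description of Crab-Lib's behavior.

```python
from typing import Dict, List

Row = Dict[str, str]


def drop_bad_rows(rows: List[Row]) -> List[Row]:
    """Drop rows with missing or non-numeric x/y, and exact duplicates (first one wins)."""
    seen = set()
    kept = []
    for row in rows:
        try:
            key = (float(row["x"]), float(row["y"]), row.get("comment", ""))
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinate
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def keep_matching(rows: List[Row], keyword: str) -> List[Row]:
    """Keep rows whose comment contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [r for r in rows if needle in r.get("comment", "").lower()]


def normalize_comments(rows: List[Row]) -> List[Row]:
    """Return copies of the rows with comments trimmed and lowercased."""
    return [{**r, "comment": r.get("comment", "").strip().lower()} for r in rows]


raw = [
    {"x": "-90.1", "y": "29.2", "comment": " Hotspot "},
    {"x": "-90.1", "y": "29.2", "comment": " Hotspot "},  # duplicate
    {"x": "n/a", "y": "29.0", "comment": "bad fix"},      # invalid x
    {"x": "-89.7", "y": "28.9", "comment": "empty trap"},
]
result = normalize_comments(keep_matching(drop_bad_rows(raw), "hotspot"))
```

Each helper returns a new list rather than modifying its input, so the intermediate results stay available for inspection.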
### Analysis
Calculate Distances and Find Clusters:
```python
from crab_lib.analysis import calculate_distances, find_clusters, summarize_data

# Calculate pairwise distances
distances = calculate_distances(cleaned_data)

# Identify clusters within a 50 km radius
clusters = find_clusters(cleaned_data, radius=50)

# Summarize the dataset
summary = summarize_data(cleaned_data)
print(summary)
```
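
The Haversine formula mentioned under Features can be sketched in a few lines. The code below is an illustration under stated assumptions, not Crab-Lib's code: `haversine_km` and `group_by_radius` are hypothetical names, points are taken to be `(latitude, longitude)` pairs in decimal degrees, and the grouping is a simple greedy pass chosen for brevity rather than whatever algorithm `find_clusters` uses.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # min() guards against rounding pushing h slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def group_by_radius(points: List[Point], radius_km: float) -> List[List[Point]]:
    """Greedy grouping: a point joins the first cluster that already has a
    member within radius_km, otherwise it starts a new cluster."""
    clusters: List[List[Point]] = []
    for p in points:
        for cluster in clusters:
            if any(haversine_km(p, q) <= radius_km for q in cluster):
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


points = [(29.0, -90.0), (29.1, -90.0), (27.0, -85.0)]
clusters = group_by_radius(points, radius_km=50)
```

The first two points are about 11 km apart and end up together; the third is several hundred kilometres away and forms its own cluster. A greedy pass like this depends on input order and never merges two clusters after the fact, which is the trade-off for keeping the sketch short.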
### Visualization
Visualize Data:
```python
from crab_lib.visualization import plot_points, plot_clusters, create_heatmap

# Plot individual points
plot_points(cleaned_data)

# Plot clusters
plot_clusters(clusters)

# Create a heatmap of point density
create_heatmap(cleaned_data)
```
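
A density heatmap boils down to counting points per grid cell before drawing anything. The sketch below shows that counting step in plain Python; `bin_counts` is a hypothetical helper, and how `create_heatmap` bins its data internally is not documented here, so treat this as one plausible approach.

```python
from typing import List, Tuple


def bin_counts(points: List[Tuple[float, float]], bins: int) -> List[List[int]]:
    """Count points per cell of a bins x bins grid spanning the data's bounding box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    # Fall back to a span of 1.0 when all points share a coordinate.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0

    grid = [[0] * bins for _ in range(bins)]
    for x, y in points:
        # Clamp so the maximum value lands in the last cell, not one past it.
        col = min(int((x - x_min) / x_span * bins), bins - 1)
        row = min(int((y - y_min) / y_span * bins), bins - 1)
        grid[row][col] += 1
    return grid


points = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (0.9, 0.2)]
grid = bin_counts(points, bins=2)
```

A grid like this can then be handed to matplotlib, for example `matplotlib.pyplot.imshow(grid, origin="lower")`, to render the cells as colors.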
### Utilities
Convert Coordinates and Calculate Bearings:
```python
from crab_lib.utils import convert_coordinates, calculate_bearing, is_within_bounds

# Convert DMS to decimal degrees
decimal_coord = convert_coordinates("25°46'26.5\"N")

# Calculate bearing between two points
bearing = calculate_bearing((25.774, -80.19), (27.345, -82.567))
print("Bearing:", bearing)

# Check if a point is within bounds
bounds = (25.0, 26.0, -81.0, -79.0)
print("Is within bounds:", is_within_bounds((25.774, -80.19), bounds))
```
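
The arithmetic behind these three utilities is short enough to show in full. The following is a sketch with hypothetical names (`dms_to_decimal`, `initial_bearing`, `in_bounds`), not the library's source. It assumes coordinates are `(latitude, longitude)` pairs and that bounds are ordered `(min_lat, max_lat, min_lon, max_lon)`, which is consistent with the example values above but not confirmed by the documentation.

```python
import math
import re
from typing import Tuple

Point = Tuple[float, float]  # (latitude, longitude), an assumed ordering


def dms_to_decimal(dms: str) -> float:
    """Convert a string like 25°46'26.5"N to signed decimal degrees."""
    match = re.fullmatch(r"""\s*(\d+)°\s*(\d+)'\s*([\d.]+)"\s*([NSEW])\s*""", dms)
    if match is None:
        raise ValueError(f"Unrecognized DMS string: {dms!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value


def initial_bearing(start: Point, end: Point) -> float:
    """Initial great-circle bearing from start to end, in degrees clockwise from north."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*start, *end))
    d_lon = lon2 - lon1
    east = math.sin(d_lon) * math.cos(lat2)
    north = (math.cos(lat1) * math.sin(lat2)
             - math.sin(lat1) * math.cos(lat2) * math.cos(d_lon))
    return math.degrees(math.atan2(east, north)) % 360


def in_bounds(point: Point, bounds: Tuple[float, float, float, float]) -> bool:
    """Inclusive bounding-box test; bounds ordering is an assumption."""
    lat, lon = point
    min_lat, max_lat, min_lon, max_lon = bounds
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


latitude = dms_to_decimal("25°46'26.5\"N")
due_east = initial_bearing((0.0, 0.0), (0.0, 1.0))
inside = in_bounds((25.774, -80.19), (25.0, 26.0, -81.0, -79.0))
```

This box test does not handle boxes that cross the antimeridian (longitude ±180°); that case would need separate logic.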

---
## API Reference
### I/O Module
- `load_csv(file_path: str) -> List[Dict[str, str]]`: Load GPS data from a CSV file.
- `save_csv(file_path: str, data: List[Dict[str, str]]) -> None`: Save data to a CSV file.

### Preprocessing Module
- `clean_data(data: List[Dict[str, str]]) -> List[Dict[str, str]]`: Remove duplicates and invalid rows.
- `filter_data(data: List[Dict[str, str]], keyword: str) -> List[Dict[str, str]]`: Filter rows by keyword.
- `standardize_comments(data: List[Dict[str, str]]) -> List[Dict[str, str]]`: Standardize comment text.

### Analysis Module
- `calculate_distances(data: List[Dict[str, float]]) -> List[Dict[str, float]]`: Compute pairwise distances.
- `find_clusters(data: List[Dict[str, float]], radius: float) -> List[List[Dict[str, float]]]`: Detect clusters.
- `summarize_data(data: List[Dict[str, float]]) -> Dict[str, float]`: Summarize dataset metrics.

### Visualization Module
- `plot_points(data: List[Dict[str, float]]) -> None`: Scatter plot of GPS points.
- `plot_clusters(clusters: List[List[Dict[str, float]]]) -> None`: Visualize clusters.
- `create_heatmap(data: List[Dict[str, float]], bins: int) -> None`: Generate a heatmap.

### Utils Module
- `convert_coordinates(dms: str) -> float`: Convert DMS to decimal degrees.
- `calculate_bearing(coord1: Tuple[float, float], coord2: Tuple[float, float]) -> float`: Compute initial bearing.
- `is_within_bounds(coord: Tuple[float, float], bounds: Tuple[float, float, float, float]) -> bool`: Check if a point is in bounds.

---
## Troubleshooting
- **Issue**: `FileNotFoundError` when loading a CSV.
  **Solution**: Verify the file path and ensure the file exists.
- **Issue**: `ValueError` when cleaning or processing data.
  **Solution**: Ensure your data matches the required format (`x`, `y`, `comment`).
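
When a `ValueError` does appear, it can help to find the offending rows before cleaning. The checker below is a hypothetical helper, not part of Crab-Lib; it assumes rows are dicts of strings with the `x`, `y`, and `comment` keys named above, and that `x` and `y` must parse as numbers.

```python
from typing import Dict, List

REQUIRED = ("x", "y", "comment")  # column names taken from the format above


def find_problems(rows: List[Dict[str, str]]) -> List[str]:
    """Return one human-readable message per problem found in the rows."""
    problems = []
    for index, row in enumerate(rows):
        missing = [name for name in REQUIRED if name not in row]
        if missing:
            problems.append(f"row {index}: missing {', '.join(missing)}")
            continue
        for name in ("x", "y"):
            try:
                float(row[name])
            except (TypeError, ValueError):
                problems.append(f"row {index}: {name}={row[name]!r} is not a number")
    return problems


rows = [
    {"x": "-90.1", "y": "29.2", "comment": "ok"},
    {"x": "abc", "y": "29.0", "comment": "typo"},
    {"x": "-89.7", "comment": "no y"},
]
problems = find_problems(rows)
```

Here `problems` reports the non-numeric `x` in row 1 and the missing `y` in row 2, with row indices counted from zero.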

---
## Contributing
Contributions are welcome! To contribute:
1. Fork the repository.
2. Create a feature branch.
3. Submit a pull request with a clear description of changes.

---
## License
This project is licensed under the MIT License.
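
---
## Contact
- Author: Joe Stem (Red Glove Productions)
- Email: joestem25@gmail.com
- GitHub: [RedGloveProductions](https://github.com/RedGloveProductions)

### Project Links
- [GitHub Repository](https://github.com/RedGloveProductions/crab-lib)
- [Issue Tracker](https://github.com/RedGloveProductions/crab-lib/issues)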
---
Raw data
{
"_id": null,
"home_page": "https://github.com/RedGloveProductions/crab-lib",
"name": "crab-lib",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "gps fishing data analysis visualization",
"author": "Joe Stem - Red Glove Productions",
"author_email": "joestem25@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/4c/e4/ad7e007459d66598ea0cbfc4693af439f88631c42653d114dbf4a5c216c4/crab_lib-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A Python library for analyzing GPS data from fishing grounds.",
"version": "1.0.0",
"project_urls": {
"Bug Tracker": "https://github.com/RedGloveProductions/crab-lib/issues",
"Homepage": "https://github.com/RedGloveProductions/crab-lib",
"Source Code": "https://github.com/RedGloveProductions/crab-lib"
},
"split_keywords": [
"gps",
"fishing",
"data",
"analysis",
"visualization"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "29c1caebda1576b17629fc704b198e49624b7c3c30c66389a0eb9db21c88f107",
"md5": "396bd0a3103ff40b23f42a4e5a105e1c",
"sha256": "49045d2c83103c37fcc5a439c53b0460d282a7922714c1406dd5a2888758b2d9"
},
"downloads": -1,
"filename": "crab_lib-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "396bd0a3103ff40b23f42a4e5a105e1c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 15896,
"upload_time": "2024-11-21T02:11:59",
"upload_time_iso_8601": "2024-11-21T02:11:59.505544Z",
"url": "https://files.pythonhosted.org/packages/29/c1/caebda1576b17629fc704b198e49624b7c3c30c66389a0eb9db21c88f107/crab_lib-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "4ce4ad7e007459d66598ea0cbfc4693af439f88631c42653d114dbf4a5c216c4",
"md5": "3ca6ae6f8bd0ffe94ba2841ab9f3b1b2",
"sha256": "35394ba3315d33898eea9b639129118233c672791e412151b7465d3fb68ef190"
},
"downloads": -1,
"filename": "crab_lib-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "3ca6ae6f8bd0ffe94ba2841ab9f3b1b2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 16988,
"upload_time": "2024-11-21T02:12:00",
"upload_time_iso_8601": "2024-11-21T02:12:00.705197Z",
"url": "https://files.pythonhosted.org/packages/4c/e4/ad7e007459d66598ea0cbfc4693af439f88631c42653d114dbf4a5c216c4/crab_lib-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-21 02:12:00",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "RedGloveProductions",
"github_project": "crab-lib",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "crab-lib"
}