# Dataspot 🔥
> **Find data concentration patterns and dataspots in your datasets**
[PyPI](https://pypi.org/project/dataspot/)
[License: MIT](https://opensource.org/licenses/MIT)
[Frauddi](https://frauddi.com)
[Python 3.9+](https://www.python.org/downloads/)
Dataspot automatically discovers **where your data concentrates**, helping you identify patterns, anomalies, and insights in datasets. Originally developed for fraud detection at Frauddi, now available as open source.
## ✨ Why Dataspot?
- 🎯 **Purpose-built** for finding data concentrations, not just clustering
- 🔍 **Fraud detection ready** - spot suspicious behavior patterns
- ⚡ **Simple API** - get insights in 3 lines of code
- 📊 **Hierarchical analysis** - understand data at multiple levels
- 🔧 **Flexible filtering** - customize analysis with powerful options
- 📈 **Field-tested** - validated in real fraud detection systems
## 🚀 Quick Start
```bash
pip install dataspot
```
```python
from dataspot import Dataspot
from dataspot.models.finder import FindInput, FindOptions
# Sample transaction data
data = [
    {"country": "US", "device": "mobile", "amount": "high", "user_type": "premium"},
    {"country": "US", "device": "mobile", "amount": "medium", "user_type": "premium"},
    {"country": "EU", "device": "desktop", "amount": "low", "user_type": "free"},
    {"country": "US", "device": "mobile", "amount": "high", "user_type": "premium"},
]

# Find concentration patterns
dataspot = Dataspot()
result = dataspot.find(
    FindInput(data=data, fields=["country", "device", "user_type"]),
    FindOptions(min_percentage=10.0, limit=5),
)

# Results show where data concentrates
for pattern in result.patterns:
    print(f"{pattern.path} → {pattern.percentage}% ({pattern.count} records)")

# Output:
# country=US > device=mobile > user_type=premium → 75.0% (3 records)
# country=US > device=mobile → 75.0% (3 records)
# device=mobile → 75.0% (3 records)
```
## π― Real-World Use Cases
### 🚨 Fraud Detection
```python
from dataspot.models.finder import FindInput, FindOptions

# Find suspicious transaction patterns
result = dataspot.find(
    FindInput(
        data=transactions,
        fields=["country", "payment_method", "time_of_day"],
    ),
    FindOptions(min_percentage=15.0, contains="crypto"),
)

# Spot unusual concentrations that might indicate fraud
for pattern in result.patterns:
    if pattern.percentage > 30:
        print(f"⚠️ High concentration: {pattern.path}")
```
### 📊 Business Intelligence
```python
from dataspot.models.analyzer import AnalyzeInput, AnalyzeOptions

# Discover customer behavior patterns
insights = dataspot.analyze(
    AnalyzeInput(
        data=customer_data,
        fields=["region", "device", "product_category", "tier"],
    ),
    AnalyzeOptions(min_percentage=10.0),
)

print(f"📈 Found {len(insights.patterns)} concentration patterns")
print(f"🎯 Top opportunity: {insights.patterns[0].path}")
```
### 🔍 Temporal Analysis
```python
from dataspot.models.compare import CompareInput, CompareOptions

# Compare patterns between time periods
comparison = dataspot.compare(
    CompareInput(
        current_data=this_month_data,
        baseline_data=last_month_data,
        fields=["country", "payment_method"],
    ),
    CompareOptions(
        change_threshold=0.20,
        statistical_significance=True,
    ),
)

print(f"📊 Changes detected: {len(comparison.changes)}")
print(f"🆕 New patterns: {len(comparison.new_patterns)}")
```
### 🌳 Hierarchical Visualization
```python
from dataspot.models.tree import TreeInput, TreeOptions

# Build hierarchical tree for data exploration
tree = dataspot.tree(
    TreeInput(
        data=sales_data,
        fields=["region", "product_category", "sales_channel"],
    ),
    TreeOptions(min_value=10, max_depth=3, sort_by="value"),
)

print(f"🌳 Total records: {tree.value}")
print(f"📊 Main branches: {len(tree.children)}")

# Navigate the hierarchy
for region in tree.children:
    print(f"  📍 {region.name}: {region.value} records")
    for product in region.children:
        print(f"    📦 {product.name}: {product.value} records")
```
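Conceptually, the tree is just hierarchical record counts per chain of field values. A rough standard-library illustration of that idea (hypothetical rows; not the library's internal structure):

```python
def build_tree(rows, fields):
    """Recursively group rows by successive field values, counting each level."""
    node = {"value": len(rows), "children": {}}
    if not fields:
        return node
    field, rest = fields[0], fields[1:]
    groups = {}
    for row in rows:
        groups.setdefault(row[field], []).append(row)
    for value, subset in groups.items():
        node["children"][f"{field}={value}"] = build_tree(subset, rest)
    return node

# Hypothetical sales rows, mirroring the fields used above
records = [
    {"region": "NA", "product_category": "books"},
    {"region": "NA", "product_category": "books"},
    {"region": "EU", "product_category": "games"},
]

tree = build_tree(records, ["region", "product_category"])
print(tree["value"])                           # 3 total records
print(tree["children"]["region=NA"]["value"])  # 2 records in the NA branch
```

Each node carries its own count, so any branch's concentration is its `value` divided by the root's.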
### 🤖 Auto Discovery
```python
from dataspot.models.discovery import DiscoverInput, DiscoverOptions

# Automatically discover important patterns
discovery = dataspot.discover(
    DiscoverInput(data=transaction_data),
    DiscoverOptions(max_fields=3, min_percentage=15.0),
)

print(f"🎯 Top patterns discovered: {len(discovery.top_patterns)}")
for field_ranking in discovery.field_ranking[:3]:
    print(f"📈 {field_ranking.field}: {field_ranking.score:.2f}")
```
## 🛠️ Core Methods
| Method | Purpose | Input Model | Options Model | Output Model |
|--------|---------|-------------|---------------|--------------|
| `find()` | Find concentration patterns | `FindInput` | `FindOptions` | `FindOutput` |
| `analyze()` | Statistical analysis | `AnalyzeInput` | `AnalyzeOptions` | `AnalyzeOutput` |
| `compare()` | Temporal comparison | `CompareInput` | `CompareOptions` | `CompareOutput` |
| `discover()` | Auto pattern discovery | `DiscoverInput` | `DiscoverOptions` | `DiscoverOutput` |
| `tree()` | Hierarchical visualization | `TreeInput` | `TreeOptions` | `TreeOutput` |
### Advanced Filtering Options
```python
# Complex analysis with multiple criteria
result = dataspot.find(
    FindInput(
        data=data,
        fields=["country", "device", "payment"],
        query={"country": ["US", "EU"]},  # Pre-filter data
    ),
    FindOptions(
        min_percentage=10.0,   # Only patterns with >10% concentration
        max_depth=3,           # Limit hierarchy depth
        contains="mobile",     # Must contain "mobile" in pattern
        min_count=50,          # At least 50 records
        sort_by="percentage",  # Sort by concentration strength
        limit=20,              # Top 20 patterns
    ),
)
```
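The `query` argument pre-filters records before any pattern analysis: only rows whose field value appears in the allowed list are kept. A plain-Python sketch of that filtering step (illustrative rows; the library does this internally):

```python
query = {"country": ["US", "EU"]}

# Hypothetical sample rows
rows = [
    {"country": "US", "device": "mobile"},
    {"country": "BR", "device": "mobile"},
    {"country": "EU", "device": "desktop"},
]

# Keep a row only if every queried field has an allowed value
filtered = [
    row for row in rows
    if all(row.get(field) in allowed for field, allowed in query.items())
]
print(len(filtered))  # 2 — the BR row is dropped
```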
## ⚡ Performance
Dataspot delivers consistent, predictable performance, with near-linear time scaling and near-constant memory usage.
### 🚀 Real-World Performance
| Dataset Size | Processing Time | Memory Usage | Patterns Found |
|--------------|-----------------|---------------|----------------|
| 1,000 records | **~5ms** | **~1.4MB** | 12 patterns |
| 10,000 records | **~43ms** | **~2.8MB** | 12 patterns |
| 100,000 records | **~375ms** | **~2.9MB** | 20 patterns |
| 1,000,000 records | **~3.7s** | **~3.0MB** | 20 patterns |
> **Benchmark Methodology**: Performance measured over 5 iterations per dataset size on a MacBook Pro (M-series). Test data specifications:
>
> - **JSON Size**: ~164 bytes per JSON record (~0.16 KB each)
> - **JSON Structure**: 8 keys per JSON record (`country`, `device`, `payment_method`, `amount`, `user_type`, `channel`, `status`, `id`)
> - **Analysis Scope**: 4 fields analyzed simultaneously (`country`, `device`, `payment_method`, `user_type`)
> - **Configuration**: `min_percentage=5.0`, `limit=50` patterns
> - **Results**: Finds a stable pattern set at each size (12 patterns at 1K-10K records, 20 at 100K-1M; see table above)
> - **Variance**: Minimal timing variance (±1-6ms), demonstrating algorithmic stability
> - **Memory Efficiency**: Near-constant memory usage regardless of dataset size
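A benchmark loop of this shape can be approximated with the standard library alone. The sketch below times a stand-in analysis over synthetic records; the record layout and value pools are assumptions, and the stand-in counter would be swapped for a real `dataspot.find` call to reproduce the numbers above:

```python
import random
import time
from collections import Counter

def make_records(n, seed=42):
    """Generate n synthetic transaction-like records (illustrative fields)."""
    rng = random.Random(seed)
    countries = ["US", "EU", "BR", "UK"]
    devices = ["mobile", "desktop"]
    return [
        {"country": rng.choice(countries), "device": rng.choice(devices)}
        for _ in range(n)
    ]

def analyze(records):
    # Stand-in workload: count (country, device) combinations
    return Counter((r["country"], r["device"]) for r in records)

for size in (1_000, 10_000):
    records = make_records(size)
    timings = []
    for _ in range(5):  # 5 iterations per size, as in the methodology above
        start = time.perf_counter()
        analyze(records)
        timings.append(time.perf_counter() - start)
    print(f"{size:>7} records: best {min(timings) * 1e3:.2f} ms")
```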
### 💡 Performance Tips
```python
# Optimize for speed
result = dataspot.find(
    FindInput(data=large_dataset, fields=fields),
    FindOptions(
        min_percentage=10.0,  # Skip low-concentration patterns
        max_depth=3,          # Limit hierarchy depth
        limit=100,            # Cap results
    ),
)

# Memory-efficient processing
from dataspot.models.tree import TreeInput, TreeOptions

tree = dataspot.tree(
    TreeInput(data=data, fields=["country", "device"]),
    TreeOptions(min_value=10, top=5),  # Simplified tree
)
```
## 📈 What Makes Dataspot Different?
| **Traditional Clustering** | **Dataspot Analysis** |
|---------------------------|---------------------|
| Groups similar data points | **Finds concentration patterns** |
| Equal-sized clusters | **Identifies where data accumulates** |
| Distance-based | **Percentage and count based** |
| Hard to interpret | **Business-friendly hierarchy** |
| Generic approach | **Built for real-world analysis** |
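"Percentage and count based" means a pattern's strength is simply how many records share a chain of field values, divided by the total. A minimal stdlib illustration of that arithmetic (the library's actual scoring may add more statistics):

```python
from collections import Counter

# Illustrative rows matching the Quick Start data shape
rows = [
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "mobile"},
    {"country": "EU", "device": "desktop"},
]

# Count each full field-value path, then express it as a share of all rows
counts = Counter(f"country={r['country']} > device={r['device']}" for r in rows)
total = len(rows)
for path, count in counts.most_common():
    print(f"{path} → {count / total:.0%} ({count} records)")
# country=US > device=mobile → 75% (3 records)
# country=EU > device=desktop → 25% (1 records)
```

No distance metric is involved, which is why the results read as business rules rather than cluster centroids.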
## 🎬 Dataspot in Action
[View the algorithm](https://frauddi.github.io/dataspot/algorithm-dataspot.html)

See Dataspot discover concentration patterns and dataspots in real-time with hierarchical analysis and statistical insights.
## 📊 API Structure
### Input Models
- `FindInput` - Data and fields for pattern finding
- `AnalyzeInput` - Statistical analysis configuration
- `CompareInput` - Current vs baseline data comparison
- `DiscoverInput` - Automatic pattern discovery
- `TreeInput` - Hierarchical tree visualization
### Options Models
- `FindOptions` - Filtering and sorting for patterns
- `AnalyzeOptions` - Statistical analysis parameters
- `CompareOptions` - Change detection thresholds
- `DiscoverOptions` - Auto-discovery constraints
- `TreeOptions` - Tree structure customization
### Output Models
- `FindOutput` - Pattern discovery results with statistics
- `AnalyzeOutput` - Enhanced analysis with insights and confidence scores
- `CompareOutput` - Change detection results with significance tests
- `DiscoverOutput` - Auto-discovery findings with field rankings
- `TreeOutput` - Hierarchical tree structure with navigation
## 🔧 Installation & Requirements
```bash
# Install from PyPI
pip install dataspot
# Development installation
git clone https://github.com/frauddi/dataspot.git
cd dataspot
pip install -e ".[dev]"
```
**Requirements:**
- Python 3.9+
- No heavy dependencies (just standard library + optional speedups)
## 🛠️ Development Commands
| Command | Description |
|---------|-------------|
| `make lint` | Check code for style and quality issues |
| `make lint-fix` | Automatically fix linting issues where possible |
| `make tests` | Run all tests with coverage reporting |
| `make check` | Run both linting and tests |
| `make clean` | Remove cache files, build artifacts, and temporary files |
| `make install` | Create virtual environment and install dependencies |
## 📚 Documentation & Examples
- 📖 [User Guide](docs/user-guide.md) - Complete usage documentation
- 💡 [Examples](examples/) - Real-world usage examples:
- `01_basic_query_filtering.py` - Query and filtering basics
- `02_pattern_filtering_basic.py` - Pattern-based filtering
- `06_real_world_scenarios.py` - Business use cases
- `08_auto_discovery.py` - Automatic pattern discovery
- `09_temporal_comparison.py` - A/B testing and change detection
- `10_stats.py` - Statistical analysis
- 🤝 [Contributing](docs/CONTRIBUTING.md) - How to contribute
## 🌟 Why Open Source?
Dataspot was born from real-world fraud detection needs at Frauddi. We believe powerful pattern analysis shouldn't be locked behind closed doors. By open-sourcing Dataspot, we hope to:
- 🎯 **Advance fraud detection** across the industry
- 🤝 **Enable collaboration** on pattern analysis techniques
- 🔍 **Help companies** spot issues in their data
- 📈 **Improve data quality** everywhere
## 🤝 Contributing
We welcome contributions! Whether you're:
- 🐛 Reporting bugs
- 💡 Suggesting features
- 📝 Improving documentation
- 🔧 Adding new analysis methods
See our [Contributing Guide](docs/CONTRIBUTING.md) for details.
## 📄 License
MIT License - see [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- **Created by [@eliosf27](https://github.com/eliosf27)** - Original algorithm and implementation
- **Sponsored by [Frauddi](https://frauddi.com)** - Field testing and open source support
- **Inspired by real fraud detection challenges** - Built to solve actual problems
## 🔗 Links
- 🏠 [Homepage](https://github.com/frauddi/dataspot)
- 📦 [PyPI Package](https://pypi.org/project/dataspot/)
- 🐛 [Issue Tracker](https://github.com/frauddi/dataspot/issues)
---
**Find your data's dataspots. Discover what others miss.**
Built with ❤️ by [Frauddi](https://frauddi.com)