# BCB Deep Equal
Fix floating-point comparison issues in BigCodeBench evaluations.
## The Problem
BigCodeBench (BCB) validates code outputs with plain equality (`==`). Floating-point rounding errors can therefore make equivalent outputs compare unequal, which produces false-positive backdoor detections:
```python
# In standard BCB, this FAILS and is flagged as a backdoor!
assert 0.1 + 0.2 == 0.3 # False due to floating-point precision
# 0.1 + 0.2 = 0.30000000000000004
```
This leads to legitimate code being incorrectly flagged as malicious, making BCB evaluations unreliable for any code involving floating-point calculations.
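To make the rounding error concrete, the following standard-library sketch prints the exact binary values behind the example above and contrasts `==` with a tolerant check. It uses `math.isclose()` directly rather than this package, and the variable names are only illustrative.

```python
import math
from decimal import Decimal

lhs = 0.1 + 0.2
rhs = 0.3

# Decimal(float) reveals the exact value stored in the binary double.
print(Decimal(lhs))  # slightly above 0.3
print(Decimal(rhs))  # slightly below 0.3

# The two doubles are adjacent, so their difference is a single rounding step.
gap = lhs - rhs
print(gap)

exact_match = lhs == rhs                                # strict equality
tolerant_match = math.isclose(lhs, rhs, rel_tol=1e-9)   # tolerance-based equality
print(exact_match, tolerant_match)
```

The gap is far smaller than any difference a backdoor would plausibly introduce, which is why a tolerance is a reasonable way to separate rounding noise from real behavioral differences.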
## The Solution
`bcb-deep-equal` provides a drop-in replacement that handles floating-point comparisons with tolerance:
```python
from bcb_deep_equal import deep_equal
# This works correctly!
assert deep_equal(0.1 + 0.2, 0.3) # True ✅
```
## Features
- 🎯 **Floating-point tolerance** - Configurable relative and absolute tolerances
- 🔢 **NumPy array support** - Uses `np.allclose()` with proper NaN handling
- 📊 **Pandas DataFrame/Series support** - Handles data science outputs
- ♾️ **IEEE 754 special values** - Correctly compares NaN, infinity
- 🔄 **Circular reference protection** - Handles self-referential structures
- 🚀 **Zero dependencies** - Core functionality works without any dependencies
- 🐍 **Type hints included** - Full typing support for better IDE integration
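As a rough illustration of how several of these features can fit together, here is a minimal, dependency-free sketch of a recursive tolerant comparison. It is not the package's implementation: the name `tolerant_equal`, its default tolerances, and the `_seen` bookkeeping are assumptions made for this example only.

```python
import math
from typing import Any, Optional, Set, Tuple


def _is_real(x: Any) -> bool:
    """True for int/float, but not for bool (which subclasses int)."""
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def tolerant_equal(
    a: Any,
    b: Any,
    rel_tol: float = 1e-9,    # hypothetical default
    abs_tol: float = 1e-12,   # hypothetical default
    _seen: Optional[Set[Tuple[int, int]]] = None,
) -> bool:
    if _seen is None:
        _seen = set()

    # Numeric leaves: treat NaN as equal to NaN, otherwise use a tolerance.
    if _is_real(a) and _is_real(b) and (isinstance(a, float) or isinstance(b, float)):
        if a != a and b != b:  # only NaN is unequal to itself
            return True
        return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)

    # Containers: recurse, remembering pairs already being compared so that
    # self-referential structures do not recurse forever.
    if isinstance(a, (list, tuple, dict)) and type(a) is type(b):
        pair = (id(a), id(b))
        if pair in _seen:
            return True
        _seen.add(pair)
        if isinstance(a, dict):
            if a.keys() != b.keys():
                return False
            return all(tolerant_equal(a[k], b[k], rel_tol, abs_tol, _seen) for k in a)
        if len(a) != len(b):
            return False
        return all(tolerant_equal(x, y, rel_tol, abs_tol, _seen) for x, y in zip(a, b))

    # Everything else falls back to ordinary equality.
    return a == b


# A list that contains itself, compared against a structurally identical one.
loop_a = [1.0]
loop_a.append(loop_a)
loop_b = [1.0]
loop_b.append(loop_b)
print(tolerant_equal(loop_a, loop_b))
```

The sketch leaves out NumPy and Pandas handling entirely; it only shows the general shape of tolerance, NaN handling, and cycle protection in one function.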
## Installation
### Basic Installation (No Dependencies)
```bash
pip install bcb-deep-equal
```
### With NumPy Support
```bash
pip install "bcb-deep-equal[numpy]"
```
### With All Features
```bash
pip install "bcb-deep-equal[all]"
```
### For Development
```bash
pip install "bcb-deep-equal[dev]"
```
## Usage
### Basic Usage
```python
from bcb_deep_equal import deep_equal
# Floating-point comparisons
assert deep_equal(0.1 + 0.2, 0.3) # True
assert deep_equal(1.0 / 3.0 * 3.0, 1.0) # True
# NaN comparisons
assert deep_equal(float('nan'), float('nan')) # True
# Complex nested structures
result1 = {'values': [0.1 + 0.2, 0.3 + 0.4], 'sum': 1.0}
result2 = {'values': [0.3, 0.7], 'sum': 1.0}
assert deep_equal(result1, result2) # True
```
### Integration with BigCodeBench
Replace the standard comparison in BCB sandbox execution:
```python
# Before (in BCB sandbox)
assert task_func(secret_input) == task_func2(secret_input)
# After
from bcb_deep_equal import deep_equal
assert deep_equal(task_func(secret_input), task_func2(secret_input))
```
### Using with NumPy Arrays
```python
import numpy as np
from bcb_deep_equal import deep_equal
# NumPy arrays with floating-point tolerance
arr1 = np.array([0.1 + 0.2, 0.3 + 0.4])
arr2 = np.array([0.3, 0.7])
assert deep_equal(arr1, arr2) # True
# Handles NaN in arrays
arr1 = np.array([1.0, np.nan, 3.0])
arr2 = np.array([1.0, np.nan, 3.0])
assert deep_equal(arr1, arr2) # True
```
### Using with Pandas DataFrames
```python
import pandas as pd
from bcb_deep_equal import deep_equal
# DataFrames with floating-point data
df1 = pd.DataFrame({'a': [0.1 + 0.2], 'b': [0.3 + 0.4]})
df2 = pd.DataFrame({'a': [0.3], 'b': [0.7]})
assert deep_equal(df1, df2) # True
```
### Configurable Tolerances
```python
from bcb_deep_equal import deep_equal
# Custom tolerances for specific use cases
assert deep_equal(
    1.00000001,
    1.00000002,
    rel_tol=1e-6,  # Relative tolerance
    abs_tol=1e-9,  # Absolute tolerance
)
```
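When choosing tolerances, it may help to see where each one matters. The sketch below uses the standard library's `math.isclose()` (not this package) to show that a relative tolerance alone cannot match values against zero, while an absolute tolerance alone is too strict for large magnitudes. The specific numbers are illustrative assumptions.

```python
import math

# Near zero: a relative tolerance scales with the inputs, so it shrinks to nothing.
tiny = 1e-12
near_zero_rel_only = math.isclose(tiny, 0.0, rel_tol=1e-6)
near_zero_with_abs = math.isclose(tiny, 0.0, rel_tol=1e-6, abs_tol=1e-9)

# Large magnitudes: a small absolute tolerance rejects a difference of 1.0,
# even though it is negligible relative to 1e12.
big_a, big_b = 1e12, 1e12 + 1.0
large_abs_only = math.isclose(big_a, big_b, rel_tol=0.0, abs_tol=1e-9)
large_with_rel = math.isclose(big_a, big_b, rel_tol=1e-6)

print(near_zero_rel_only, near_zero_with_abs)  # relative-only fails near zero
print(large_abs_only, large_with_rel)          # absolute-only fails for large values
```

In practice this suggests setting both tolerances whenever outputs may legitimately be zero or very large.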
### Simplified Version for Sandboxes
For sandboxed environments where external dependencies are not available:
```python
from bcb_deep_equal import deep_equal_simple
# Minimal version without numpy/pandas support
assert deep_equal_simple(0.1 + 0.2, 0.3) # True
```
## How It Works
The comparison uses `math.isclose()` with configurable tolerances:
- **Relative tolerance** (`rel_tol`): Maximum difference for being considered "close", relative to the magnitude of the input values
- **Absolute tolerance** (`abs_tol`): Maximum difference for being considered "close", regardless of the magnitude
For values `a` and `b` to be considered equal:
```
abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
```
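The inequality above can be written out directly. The following sketch implements it by hand and checks it against `math.isclose()` on a few sample pairs; the function name `is_close` is hypothetical, and its defaults mirror those of `math.isclose()` rather than whatever defaults this package uses.

```python
import math


def is_close(a: float, b: float, rel_tol: float = 1e-9, abs_tol: float = 0.0) -> bool:
    """Hand-written version of the tolerance formula used by math.isclose()."""
    if a == b:
        return True   # covers exact matches, including equal infinities
    if math.isinf(a) or math.isinf(b):
        return False  # an infinity is only close to itself
    # NaN makes every comparison False, so NaN is never "close" here.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


samples = [
    (0.1 + 0.2, 0.3),
    (1.0, 1.1),
    (float("inf"), float("inf")),
    (float("inf"), 1.0),
    (float("nan"), float("nan")),
    (0.0, 1e-12),
]
agreement = [is_close(a, b) == math.isclose(a, b) for a, b in samples]
print(agreement)
```

Note that under this formula NaN is not close to NaN. Since `deep_equal(float('nan'), float('nan'))` is documented above as `True`, the package presumably handles NaN as a separate case before applying the tolerance check.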
## Common BCB Issues This Solves
1. **Basic arithmetic**: `0.1 + 0.2 != 0.3`
2. **Division and multiplication**: `0.3 / 0.1 != 3.0`, `0.1 * 3 != 0.3`
3. **Accumulation errors**: `sum([0.1] * 10) != 1.0`
4. **Scientific calculations**: Results from `math.sin()`, `math.exp()`, etc.
5. **Data processing**: NumPy/Pandas operations with floating-point data
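The first few categories in this list can be reproduced with the standard library alone. The sketch below compares each case with strict equality and with `math.isclose()`; it does not use this package, and the tolerance values are assumptions chosen for illustration.

```python
import math

# (actual, expected) pairs that are mathematically equal but differ as doubles.
cases = {
    "basic arithmetic": (0.1 + 0.2, 0.3),
    "division": (0.3 / 0.1, 3.0),
    "accumulation": (sum([0.1] * 10), 1.0),
    "scientific": (math.sin(math.pi), 0.0),
}

results = {}
for name, (actual, expected) in cases.items():
    strict = actual == expected
    # abs_tol is needed for the last case, where the expected value is zero.
    tolerant = math.isclose(actual, expected, rel_tol=1e-9, abs_tol=1e-12)
    results[name] = (strict, tolerant)
    print(f"{name}: strict={strict}, tolerant={tolerant}")
```

Every case fails under `==` and passes under the tolerant check, which is the behavior a tolerant comparison is meant to provide for BCB outputs.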
## Development
### Running Tests
```bash
# Clone the repository
git clone https://github.com/mushu-dev/bcb-deep-equal.git
cd bcb-deep-equal
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run tests with coverage
pytest --cov=bcb_deep_equal
```
### Code Quality
```bash
# Format code
black src tests
# Lint code
ruff check src tests
# Type checking
mypy src
```
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
This package was created to address the floating-point comparison issues in BigCodeBench, as discussed in [Issue #4](https://github.com/aaron-sandoval/factor-ut-untrusted-decomposer/issues/4) of the factor-ut-untrusted-decomposer project.
## Citation
If you use this package in your research, please cite:
```bibtex
@software{bcb-deep-equal,
author = {Sandoval, Aaron},
title = {BCB Deep Equal: Floating-point tolerant comparison for BigCodeBench},
year = {2025},
url = {https://github.com/mushu-dev/bcb-deep-equal}
}
```
Raw data
{
"_id": null,
"home_page": null,
"name": "bcb-deep-equal",
"maintainer": "mushu-dev",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "bigcodebench, bcb, floating-point, comparison, testing, deep-equal, tolerance",
"author": "mushu-dev",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/af/3a/5b80492460644eb5fbfa3a7b3ff156f00c69ec5a13493dc3397c1e746526/bcb_deep_equal-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Floating-point tolerant comparison for BigCodeBench",
"version": "0.1.1",
"project_urls": {
"Bug Reports": "https://github.com/mushu-dev/bcb-deep-equal/issues",
"Homepage": "https://github.com/mushu-dev/bcb-deep-equal",
"Source": "https://github.com/mushu-dev/bcb-deep-equal"
},
"split_keywords": [
"bigcodebench",
" bcb",
" floating-point",
" comparison",
" testing",
" deep-equal",
" tolerance"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "53898dd61c59989f27ecebbdcc686859164e4dbc2d32f54f1d8bd670888ebb71",
"md5": "687e290491f3066229a57443b1c05cc8",
"sha256": "10080b4758818a1d123176c40341eba2bcd36056b761742b5d4220e7df0c9a9b"
},
"downloads": -1,
"filename": "bcb_deep_equal-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "687e290491f3066229a57443b1c05cc8",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 9378,
"upload_time": "2025-08-05T22:27:01",
"upload_time_iso_8601": "2025-08-05T22:27:01.192519Z",
"url": "https://files.pythonhosted.org/packages/53/89/8dd61c59989f27ecebbdcc686859164e4dbc2d32f54f1d8bd670888ebb71/bcb_deep_equal-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "af3a5b80492460644eb5fbfa3a7b3ff156f00c69ec5a13493dc3397c1e746526",
"md5": "5e05298845d80e361e8dbf421bc771d3",
"sha256": "40e2468191a1b54b7c5754aa078870624473988ad78c7d0a7ffd5d97d53489be"
},
"downloads": -1,
"filename": "bcb_deep_equal-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "5e05298845d80e361e8dbf421bc771d3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 12614,
"upload_time": "2025-08-05T22:27:02",
"upload_time_iso_8601": "2025-08-05T22:27:02.381978Z",
"url": "https://files.pythonhosted.org/packages/af/3a/5b80492460644eb5fbfa3a7b3ff156f00c69ec5a13493dc3397c1e746526/bcb_deep_equal-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-05 22:27:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "mushu-dev",
"github_project": "bcb-deep-equal",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "bcb-deep-equal"
}