lexicase

Name	lexicase JSON
Version	0.2.0 JSON
	download
home_page	None
Summary	Fast, vectorized lexicase selection in NumPy and JAX
upload_time	2025-09-11 14:38:13
maintainer	None
docs_url	None
author	None
requires_python	>=3.8
license	None
keywords	evolutionary-computation genetic-algorithms selection lexicase
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            # 🧬 Lexicase Selection Library

A fast, vectorized lexicase selection implementation with **automatic dispatch** between NumPy and JAX backends for optimal performance.

## 🎯 What it does

Lexicase selection is a parent selection method used in evolutionary computation that evaluates individuals on test cases in random order, keeping only those that perform best on each case. This library provides efficient implementations of several lexicase variants:

- **Base Lexicase**: Standard lexicase selection algorithm
- **Epsilon Lexicase**: Allows individuals within epsilon of the best to be considered equally good (uses adaptive MAD-based epsilon by default)
- **Downsampled Lexicase**: Uses random subsets of test cases to increase diversity
- **Elitism (new in 0.2)**: Optionally reserve a fixed number of slots for the top individuals by total fitness before running lexicase on the remainder

## 📦 Installation

### Basic Installation
```bash
pip install lexicase
```

### With NumPy Support (CPU-optimized)
```bash
pip install lexicase[numpy]
```

### With JAX Support (GPU/TPU-accelerated)
```bash
pip install lexicase[jax]
```

### With All Backends
```bash
pip install lexicase[all]
```

## Source Installation

To install from source, clone the repository and then run the following command:

```bash
pip install -e .
```

### Development Installation
```bash
pip install .[dev]  # Includes pytest and coverage tools
```

## 🚀 Quick Start

The library **automatically detects** whether you're using NumPy or JAX arrays and routes to the optimal implementation:

### With NumPy Arrays (CPU)
```python
import numpy as np
import lexicase

# Create a fitness matrix (individuals × test cases)
# Higher values = better performance
fitness_matrix = np.array([
    [10, 5, 8],  # Individual 0
    [8, 9, 6],   # Individual 1  
    [6, 7, 9],   # Individual 2
    [4, 3, 7]    # Individual 3
])

# Select 5 individuals using standard lexicase
selected = lexicase.lexicase_selection(
    fitness_matrix, 
    num_selected=5, 
    seed=42
)
print(f"Selected individuals: {selected}")
# Output: Selected individuals: [1 0 2 1 0]
```

### With JAX Arrays (GPU/TPU)
```python
import jax.numpy as jnp
import lexicase

# Same fitness matrix, but as JAX array
fitness_matrix = jnp.array([
    [10, 5, 8],  # Individual 0
    [8, 9, 6],   # Individual 1  
    [6, 7, 9],   # Individual 2
    [4, 3, 7]    # Individual 3
])

# Exact same API - automatically uses JAX implementation
selected = lexicase.lexicase_selection(
    fitness_matrix, 
    num_selected=5, 
    seed=42
)
print(f"Selected individuals: {selected}")
# Output: JAX array([1, 0, 2, 1, 0], dtype=int32)
```

### Epsilon Lexicase with Adaptive Epsilon
```python
# Works with both NumPy and JAX arrays
selected_eps = lexicase.epsilon_lexicase_selection(
    fitness_matrix,  # NumPy or JAX array
    num_selected=5, 
    seed=42  # Uses adaptive MAD-based epsilon by default
)
```

### Elitism (works with all methods)
```python
# Always include the top-k individuals by total fitness, then fill the
# remaining slots using the chosen lexicase variant.

# Standard lexicase with 2 elites
selected = lexicase.lexicase_selection(
    fitness_matrix,
    num_selected=10,
    seed=42,
    elitism=2,
)

# Epsilon lexicase with 1 elite
selected_eps = lexicase.epsilon_lexicase_selection(
    fitness_matrix,
    num_selected=10,
    seed=42,
    elitism=1,
)

# Downsampled lexicase with 1 elite
selected_ds = lexicase.downsample_lexicase_selection(
    fitness_matrix,
    num_selected=10,
    downsample_size=5,
    seed=42,
    elitism=1,
)
```

Notes:
- Elites are determined by total fitness (sum across cases).
- `elitism` must be between 0 and `num_selected` and cannot exceed the number of individuals.
- The return type matches the input array type (NumPy or JAX).

## 🔧 Automatic Dispatch System

**No configuration needed!** The library automatically detects your array type and uses the optimal implementation:

- **NumPy arrays** → NumPy implementation (CPU-optimized)
- **JAX arrays** → JAX implementation (GPU/TPU-ready)

```python
import numpy as np
import jax.numpy as jnp
import lexicase

# NumPy arrays automatically use NumPy implementation
np_result = lexicase.lexicase_selection(np.array([[1, 2], [3, 4]]), 1, seed=42)
print(type(np_result))  # <class 'numpy.ndarray'>

# JAX arrays automatically use JAX implementation  
jax_result = lexicase.lexicase_selection(jnp.array([[1, 2], [3, 4]]), 1, seed=42)
print(type(jax_result))  # <class 'jaxlib.xla_extension.DeviceArray'>
```


## 📊 All Selection Methods

### Standard Lexicase
```python
selected = lexicase.lexicase_selection(fitness_matrix, num_selected=10, seed=42)
```

### Epsilon Lexicase
```python
# Recommended: Use adaptive MAD-based epsilon (automatic)
selected = lexicase.epsilon_lexicase_selection(
    fitness_matrix, 
    num_selected=10, 
    seed=42
)

# Alternative: Manual epsilon specification
selected = lexicase.epsilon_lexicase_selection(
    fitness_matrix, 
    num_selected=10, 
    epsilon=0.5,  # Tolerance for "equal" performance
    seed=42
)
```

### Downsampled Lexicase
```python
selected = lexicase.downsample_lexicase_selection(
    fitness_matrix,
    num_selected=10,
    downsample_size=5,  # Use only 5 random test cases per selection
    seed=42
)
```


## 🧪 Testing

Run the test suite:

```bash
pytest tests/
```

Run with coverage:

```bash
pytest tests/ --cov=lexicase --cov-report=html
```

## 🔬 Algorithm Details

**Lexicase Selection Process:**
1. Shuffle the order of test cases
2. Start with all individuals as candidates
3. For each test case (in shuffled order):
   - Find the best performance on this case
   - Keep only individuals matching the best performance
   - If only one individual remains, select it
4. If multiple individuals remain after all cases, select randomly

**Epsilon Lexicase:** Considers individuals within `epsilon` of the best performance as equally good. By default, uses adaptive epsilon values based on the Median Absolute Deviation (MAD) of fitness values for each test case, providing robust and data-driven tolerance levels.

**Downsampled Lexicase:** Uses only a random subset of test cases, increasing selection diversity.

## ⚡ Performance

### Automatic Optimization
The library automatically chooses the best implementation based on your array type:

- **NumPy arrays**: Optimized CPU implementation using vectorized operations
- **JAX arrays**: GPU/TPU-accelerated with optional JIT compilation

### Performance Comparison
Here's a rough performance guide (times will vary by hardware):

| Array Size | NumPy (CPU) | JAX (CPU) | JAX (GPU) |
|------------|-------------|-----------|-----------|
| 100×10     | 1ms         | 2ms       | 5ms*      |
| 1000×50    | 50ms        | 30ms      | 10ms      |
| 10000×100  | 2s          | 800ms     | 100ms     |

*GPU has overhead for small arrays

### Performance Tips

- **For small problems (< 1000 individuals)**: Use NumPy for simplicity
- **For large problems (> 5000 individuals)**: Use JAX on GPU/TPU
- **For repeated calls**: JAX benefits from JIT compilation warmup
- **Downsampled variants**: Faster and often more diverse than full lexicase
- **Adaptive epsilon**: MAD-based epsilon (default) is recommended for robustness

### Benchmarking
See the `benchmarks/` directory for detailed performance comparisons and the timing notebook for interactive benchmarks.

## 📚 Citation

If you use this library in your research, please cite:

```bibtex
@software{lexicase_selection,
  title={Lexicase Selection Library},
  author={Ryan Bahlous-Boldi},
  year={2024},
  url={https://github.com/ryanboldi/lexicase}
}
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## TODOs:

- [ ] make jax implementaion faster, and jittable.
- [x] Add informed down-sampling
- [ ] Add some demo notebooks

## 🔗 References

- Spector, L. (2012). Assessment of Problem Modality by Differential Performance of Lexicase Selection in Genetic Programming: A Preliminary Report. In Companion Publication of the 2012 Genetic and Evolutionary Computation Conference, GECCO’12 Companion. ACM Press. pp. 401 - 408.
- Helmuth, T., L. Spector, and J. Matheson. (2014). Solving Uncompromising Problems with Lexicase Selection. In IEEE Transactions on Evolutionary Computation, vol. 19, no. 5, pp. 630 - 643.
- La Cava, W., L. Spector, and K. Danai (2016). Epsilon-lexicase selection for regression. GECCO '16: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 741 - 748.
- Hernandez, J. G., A. Lalejini, E. Dolson, and C. Ofria (2019). Random subsampling improves performance in lexicase selection. GECCO '19: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 2028 - 2031

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "lexicase",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "evolutionary-computation, genetic-algorithms, selection, lexicase",
    "author": null,
    "author_email": "Ryan Bahlous-Boldi <your.email@example.com>",
    "download_url": "https://files.pythonhosted.org/packages/21/a4/52755307b00d2596b2afecebc10bea757a6fca1238b36231f9ca93226580/lexicase-0.2.0.tar.gz",
    "platform": null,
    "description": "# \ud83e\uddec Lexicase Selection Library\n\nA fast, vectorized lexicase selection implementation with **automatic dispatch** between NumPy and JAX backends for optimal performance.\n\n## \ud83c\udfaf What it does\n\nLexicase selection is a parent selection method used in evolutionary computation that evaluates individuals on test cases in random order, keeping only those that perform best on each case. This library provides efficient implementations of several lexicase variants:\n\n- **Base Lexicase**: Standard lexicase selection algorithm\n- **Epsilon Lexicase**: Allows individuals within epsilon of the best to be considered equally good (uses adaptive MAD-based epsilon by default)\n- **Downsampled Lexicase**: Uses random subsets of test cases to increase diversity\n- **Elitism (new in 0.2)**: Optionally reserve a fixed number of slots for the top individuals by total fitness before running lexicase on the remainder\n\n## \ud83d\udce6 Installation\n\n### Basic Installation\n```bash\npip install lexicase\n```\n\n### With NumPy Support (CPU-optimized)\n```bash\npip install lexicase[numpy]\n```\n\n### With JAX Support (GPU/TPU-accelerated)\n```bash\npip install lexicase[jax]\n```\n\n### With All Backends\n```bash\npip install lexicase[all]\n```\n\n## Source Installation\n\nTo install from source, clone the repository and then run the following command:\n\n```bash\npip install -e .\n```\n\n### Development Installation\n```bash\npip install .[dev]  # Includes pytest and coverage tools\n```\n\n## \ud83d\ude80 Quick Start\n\nThe library **automatically detects** whether you're using NumPy or JAX arrays and routes to the optimal implementation:\n\n### With NumPy Arrays (CPU)\n```python\nimport numpy as np\nimport lexicase\n\n# Create a fitness matrix (individuals \u00d7 test cases)\n# Higher values = better performance\nfitness_matrix = np.array([\n    [10, 5, 8],  # Individual 0\n    [8, 9, 6],   # Individual 1  \n    [6, 7, 9],   # Individual 2\n    [4, 3, 7]    # Individual 3\n])\n\n# Select 5 individuals using standard lexicase\nselected = lexicase.lexicase_selection(\n    fitness_matrix, \n    num_selected=5, \n    seed=42\n)\nprint(f\"Selected individuals: {selected}\")\n# Output: Selected individuals: [1 0 2 1 0]\n```\n\n### With JAX Arrays (GPU/TPU)\n```python\nimport jax.numpy as jnp\nimport lexicase\n\n# Same fitness matrix, but as JAX array\nfitness_matrix = jnp.array([\n    [10, 5, 8],  # Individual 0\n    [8, 9, 6],   # Individual 1  \n    [6, 7, 9],   # Individual 2\n    [4, 3, 7]    # Individual 3\n])\n\n# Exact same API - automatically uses JAX implementation\nselected = lexicase.lexicase_selection(\n    fitness_matrix, \n    num_selected=5, \n    seed=42\n)\nprint(f\"Selected individuals: {selected}\")\n# Output: JAX array([1, 0, 2, 1, 0], dtype=int32)\n```\n\n### Epsilon Lexicase with Adaptive Epsilon\n```python\n# Works with both NumPy and JAX arrays\nselected_eps = lexicase.epsilon_lexicase_selection(\n    fitness_matrix,  # NumPy or JAX array\n    num_selected=5, \n    seed=42  # Uses adaptive MAD-based epsilon by default\n)\n```\n\n### Elitism (works with all methods)\n```python\n# Always include the top-k individuals by total fitness, then fill the\n# remaining slots using the chosen lexicase variant.\n\n# Standard lexicase with 2 elites\nselected = lexicase.lexicase_selection(\n    fitness_matrix,\n    num_selected=10,\n    seed=42,\n    elitism=2,\n)\n\n# Epsilon lexicase with 1 elite\nselected_eps = lexicase.epsilon_lexicase_selection(\n    fitness_matrix,\n    num_selected=10,\n    seed=42,\n    elitism=1,\n)\n\n# Downsampled lexicase with 1 elite\nselected_ds = lexicase.downsample_lexicase_selection(\n    fitness_matrix,\n    num_selected=10,\n    downsample_size=5,\n    seed=42,\n    elitism=1,\n)\n```\n\nNotes:\n- Elites are determined by total fitness (sum across cases).\n- `elitism` must be between 0 and `num_selected` and cannot exceed the number of individuals.\n- The return type matches the input array type (NumPy or JAX).\n\n## \ud83d\udd27 Automatic Dispatch System\n\n**No configuration needed!** The library automatically detects your array type and uses the optimal implementation:\n\n- **NumPy arrays** \u2192 NumPy implementation (CPU-optimized)\n- **JAX arrays** \u2192 JAX implementation (GPU/TPU-ready)\n\n```python\nimport numpy as np\nimport jax.numpy as jnp\nimport lexicase\n\n# NumPy arrays automatically use NumPy implementation\nnp_result = lexicase.lexicase_selection(np.array([[1, 2], [3, 4]]), 1, seed=42)\nprint(type(np_result))  # <class 'numpy.ndarray'>\n\n# JAX arrays automatically use JAX implementation  \njax_result = lexicase.lexicase_selection(jnp.array([[1, 2], [3, 4]]), 1, seed=42)\nprint(type(jax_result))  # <class 'jaxlib.xla_extension.DeviceArray'>\n```\n\n\n## \ud83d\udcca All Selection Methods\n\n### Standard Lexicase\n```python\nselected = lexicase.lexicase_selection(fitness_matrix, num_selected=10, seed=42)\n```\n\n### Epsilon Lexicase\n```python\n# Recommended: Use adaptive MAD-based epsilon (automatic)\nselected = lexicase.epsilon_lexicase_selection(\n    fitness_matrix, \n    num_selected=10, \n    seed=42\n)\n\n# Alternative: Manual epsilon specification\nselected = lexicase.epsilon_lexicase_selection(\n    fitness_matrix, \n    num_selected=10, \n    epsilon=0.5,  # Tolerance for \"equal\" performance\n    seed=42\n)\n```\n\n### Downsampled Lexicase\n```python\nselected = lexicase.downsample_lexicase_selection(\n    fitness_matrix,\n    num_selected=10,\n    downsample_size=5,  # Use only 5 random test cases per selection\n    seed=42\n)\n```\n\n\n## \ud83e\uddea Testing\n\nRun the test suite:\n\n```bash\npytest tests/\n```\n\nRun with coverage:\n\n```bash\npytest tests/ --cov=lexicase --cov-report=html\n```\n\n## \ud83d\udd2c Algorithm Details\n\n**Lexicase Selection Process:**\n1. Shuffle the order of test cases\n2. Start with all individuals as candidates\n3. For each test case (in shuffled order):\n   - Find the best performance on this case\n   - Keep only individuals matching the best performance\n   - If only one individual remains, select it\n4. If multiple individuals remain after all cases, select randomly\n\n**Epsilon Lexicase:** Considers individuals within `epsilon` of the best performance as equally good. By default, uses adaptive epsilon values based on the Median Absolute Deviation (MAD) of fitness values for each test case, providing robust and data-driven tolerance levels.\n\n**Downsampled Lexicase:** Uses only a random subset of test cases, increasing selection diversity.\n\n## \u26a1 Performance\n\n### Automatic Optimization\nThe library automatically chooses the best implementation based on your array type:\n\n- **NumPy arrays**: Optimized CPU implementation using vectorized operations\n- **JAX arrays**: GPU/TPU-accelerated with optional JIT compilation\n\n### Performance Comparison\nHere's a rough performance guide (times will vary by hardware):\n\n| Array Size | NumPy (CPU) | JAX (CPU) | JAX (GPU) |\n|------------|-------------|-----------|-----------|\n| 100\u00d710     | 1ms         | 2ms       | 5ms*      |\n| 1000\u00d750    | 50ms        | 30ms      | 10ms      |\n| 10000\u00d7100  | 2s          | 800ms     | 100ms     |\n\n*GPU has overhead for small arrays\n\n### Performance Tips\n\n- **For small problems (< 1000 individuals)**: Use NumPy for simplicity\n- **For large problems (> 5000 individuals)**: Use JAX on GPU/TPU\n- **For repeated calls**: JAX benefits from JIT compilation warmup\n- **Downsampled variants**: Faster and often more diverse than full lexicase\n- **Adaptive epsilon**: MAD-based epsilon (default) is recommended for robustness\n\n### Benchmarking\nSee the `benchmarks/` directory for detailed performance comparisons and the timing notebook for interactive benchmarks.\n\n## \ud83d\udcda Citation\n\nIf you use this library in your research, please cite:\n\n```bibtex\n@software{lexicase_selection,\n  title={Lexicase Selection Library},\n  author={Ryan Bahlous-Boldi},\n  year={2024},\n  url={https://github.com/ryanboldi/lexicase}\n}\n```\n\n## \ud83d\udcc4 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## \ud83e\udd1d Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n## TODOs:\n\n- [ ] make jax implementaion faster, and jittable.\n- [x] Add informed down-sampling\n- [ ] Add some demo notebooks\n\n## \ud83d\udd17 References\n\n- Spector, L. (2012). Assessment of Problem Modality by Differential Performance of Lexicase Selection in Genetic Programming: A Preliminary Report. In Companion Publication of the 2012 Genetic and Evolutionary Computation Conference, GECCO\u201912 Companion. ACM Press. pp. 401 - 408.\n- Helmuth, T., L. Spector, and J. Matheson. (2014). Solving Uncompromising Problems with Lexicase Selection. In IEEE Transactions on Evolutionary Computation, vol. 19, no. 5, pp. 630 - 643.\n- La Cava, W., L. Spector, and K. Danai (2016). Epsilon-lexicase selection for regression. GECCO '16: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 741 - 748.\n- Hernandez, J. G., A. Lalejini, E. Dolson, and C. Ofria (2019). Random subsampling improves performance in lexicase selection. GECCO '19: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 2028 - 2031\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Fast, vectorized lexicase selection in NumPy and JAX",
    "version": "0.2.0",
    "project_urls": {
        "Homepage": "https://github.com/ryanboldi/lexicase",
        "Issues": "https://github.com/ryanboldi/lexicase/issues",
        "Repository": "https://github.com/ryanboldi/lexicase"
    },
    "split_keywords": [
        "evolutionary-computation",
        " genetic-algorithms",
        " selection",
        " lexicase"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "7dd884092492084a6f00c17c800c53f77fcf23e5c804a6d668834003cd679986",
                "md5": "af7458cb79f2bab9f1e28c3accc26ed3",
                "sha256": "2f60ecb18cf4ee8833c3b915fa50ada2079761fdcd4d6ef6cd5a058ce53217b0"
            },
            "downloads": -1,
            "filename": "lexicase-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "af7458cb79f2bab9f1e28c3accc26ed3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 22517,
            "upload_time": "2025-09-11T14:38:12",
            "upload_time_iso_8601": "2025-09-11T14:38:12.346909Z",
            "url": "https://files.pythonhosted.org/packages/7d/d8/84092492084a6f00c17c800c53f77fcf23e5c804a6d668834003cd679986/lexicase-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "21a452755307b00d2596b2afecebc10bea757a6fca1238b36231f9ca93226580",
                "md5": "7f8d964604898ba70b711b7f925ba1ee",
                "sha256": "7a54edf3a7d64ee16d436a65f593a8a0506f206ffcbf3963fc6af60f9d64286d"
            },
            "downloads": -1,
            "filename": "lexicase-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "7f8d964604898ba70b711b7f925ba1ee",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 19838,
            "upload_time": "2025-09-11T14:38:13",
            "upload_time_iso_8601": "2025-09-11T14:38:13.566517Z",
            "url": "https://files.pythonhosted.org/packages/21/a4/52755307b00d2596b2afecebc10bea757a6fca1238b36231f9ca93226580/lexicase-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-11 14:38:13",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ryanboldi",
    "github_project": "lexicase",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "lexicase"
}

None