merging-eval


- **Name**: merging-eval
- **Version**: 0.1.2
- **Summary**: Model Merging Scaling Laws in Large Language Models - Empirical scaling laws for language model merging measured by cross-entropy
- **Upload time**: 2025-10-31 01:24:03
- **Requires Python**: >=3.8
- **Keywords**: llm, model-merging, scaling-laws, language-models, cross-entropy
- **License**: MIT License

Copyright (c) 2024 Merging-EVAL Team

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

# Merging-EVAL: Model Evaluation Framework

A comprehensive evaluation framework for language models with support for multiple datasets, GPU specification, and offline evaluation capabilities.

## Features

- 🚀 **Multi-dataset Support**: Evaluate on code, algebra, analysis, and other domains
- 🎯 **GPU Specification**: Run evaluations on specific GPUs
- 📊 **Offline Mode**: Evaluate models without internet connection
- 🔧 **Data Slicing**: Evaluate on specific data subsets using indices
- 💾 **Caching**: Intelligent caching for faster repeated evaluations
- 📈 **Comprehensive Metrics**: Cross-entropy loss, token-level analysis

## Repository Structure

```
.
├── scripts/
│   └── eval.py           # Main evaluation script
├── src/merge/
│   └── main_merging.py   # Model merging script
├── data/
│   ├── eval_partial/     # Evaluation datasets
│   │   ├── code.json     # Code generation tasks
│   │   ├── algebra.json  # Mathematical problems
│   │   └── analysis.json # Data analysis tasks
│   └── train_partial/    # Training datasets
├── test_result/          # Evaluation results and merged models
└── cache/                # Cached tokenized data
```

## Installation and Setup

### 1. Environment Setup

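Activate the required Python environment. The path below is specific to the authors' machine; substitute the path to your own virtual environment: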

```bash
source /zju_0038/jinjia/workspace/Merging-Scaling-Law-main/merge-eval-py312/bin/activate
```

### 2. Environment Variables

Set required environment variables:

```bash
export TRANSFORMERS_NO_TORCHVISION=1
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
```

## Usage

### Model Merging

Merge multiple models using Task Arithmetic:

```bash
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
python3 src/merge/main_merging.py \
  --merge_method task_arithmetic \
  --output_dir /path/to/output \
  --base_model /path/to/base/model \
  --models_to_merge "/path/to/model1,/path/to/model2,/path/to/model3" \
  --scaling_coefficient 0.2 \
  --use_gpu
```
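
For intuition, task arithmetic is commonly described as adding scaled task vectors (fine-tuned weights minus base weights) to the base model. The following is a minimal sketch of that idea on plain Python dictionaries of floats. It is not the code in `main_merging.py`; the function name `task_arithmetic` and the toy weights are hypothetical.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def task_arithmetic(base: Weights, finetuned: List[Weights], scaling_coefficient: float) -> Weights:
    """Return base + scaling_coefficient * sum_i (finetuned_i - base), per parameter."""
    merged: Weights = {}
    for name, base_values in base.items():
        merged_values = []
        for idx, base_value in enumerate(base_values):
            # Sum of task vectors (fine-tuned minus base) at this position.
            delta_sum = sum(model[name][idx] - base_value for model in finetuned)
            merged_values.append(base_value + scaling_coefficient * delta_sum)
        merged[name] = merged_values
    return merged


# Toy weights for illustration only.
base = {"layer.weight": [1.0, 2.0]}
model_a = {"layer.weight": [2.0, 2.0]}
model_b = {"layer.weight": [1.0, 4.0]}
merged = task_arithmetic(base, [model_a, model_b], scaling_coefficient=0.5)
print(merged)  # {'layer.weight': [1.5, 3.0]}
```

In this picture, a smaller `--scaling_coefficient` keeps the merged weights closer to the base model.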

#### GPU-Specific Merging

Run model merging on a specific GPU:

```bash
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
export CUDA_VISIBLE_DEVICES=7
python3 src/merge/main_merging.py \
  --merge_method task_arithmetic \
  --output_dir /zju_0038/test_merge/Merging-EVAL/test_result/scheme1_16models_task_arithmetic \
  --base_model /zju_0038/wyy/mergebench/models/Llama-3.2-3B \
  --models_to_merge "/zju_0038/yifyang/scripts/models/llama-instruct-3B-v2-algebra,/zju_0038/yifyang/scripts/models/llama-instruct-3B-v2-analysis,/zju_0038/yifyang/scripts/models/llama-instruct-3B-v2-number_theory,/zju_0038/yifyang/scripts/models/llama-instruct-3B-v2-physics" \
  --scaling_coefficient 0.2 \
  --use_gpu
```

### Basic Evaluation

Evaluate a model on a specific dataset:

```bash
python3 scripts/eval.py \
  --model /path/to/model \
  --tokenizer /path/to/tokenizer \
  --file /path/to/dataset.json \
  --output ./results \
  --batch_size 1 \
  --max_length 2048
```
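
When evaluating several datasets, the same command can be assembled programmatically. Below is a minimal sketch that uses only the flags documented in this README. `build_eval_command` is a hypothetical helper, and reusing the model path as the tokenizer path is an assumption that mirrors the examples in this README where both flags point to the same directory.

```python
import shlex
from typing import Dict, List


def build_eval_command(model: str, dataset: str, max_length: int, output: str = "./results") -> List[str]:
    """Build the argv list for scripts/eval.py using flags documented in this README."""
    return [
        "python3", "scripts/eval.py",
        "--model", model,
        "--tokenizer", model,  # assumption: the tokenizer is stored alongside the model
        "--file", dataset,
        "--output", output,
        "--batch_size", "1",
        "--max_length", str(max_length),
    ]


# Recommended max_length per dataset, taken from this README.
datasets: Dict[str, int] = {
    "data/eval_partial/algebra.json": 2048,
    "data/eval_partial/code.json": 4096,
}

commands = [build_eval_command("/path/to/model", path, max_len) for path, max_len in datasets.items()]
for cmd in commands:
    # Printing only; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```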

### GPU-Specific Evaluation

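Run evaluation on a specific GPU. `CUDA_VISIBLE_DEVICES=7` exposes only physical GPU 7 to the process, where CUDA renumbers it as device 0, so `--gpu_id 0` selects it: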

```bash
export CUDA_VISIBLE_DEVICES=7
python3 scripts/eval.py \
  --model /path/to/model \
  --tokenizer /path/to/tokenizer \
  --file /path/to/dataset.json \
  --output ./results \
  --batch_size 1 \
  --max_length 2048 \
  --gpu_id 0 \
  --offline
```
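
CUDA renumbers the devices listed in `CUDA_VISIBLE_DEVICES` starting from 0 inside the process. The sketch below illustrates that mapping for integer device lists; the helper names are hypothetical, and treating `--gpu_id` as a process-local index is an assumption about `eval.py`.

```python
from typing import List


def visible_physical_gpus(cuda_visible_devices: str) -> List[int]:
    """Parse a CUDA_VISIBLE_DEVICES value such as "7" or "2,5" into physical GPU indices."""
    return [int(part) for part in cuda_visible_devices.split(",") if part.strip()]


def physical_gpu_for(logical_id: int, cuda_visible_devices: str) -> int:
    """Return the physical GPU that a process-local device index refers to."""
    visible = visible_physical_gpus(cuda_visible_devices)
    if not 0 <= logical_id < len(visible):
        raise ValueError(f"logical id {logical_id} is out of range for {visible}")
    return visible[logical_id]


print(physical_gpu_for(0, "7"))    # 7
print(physical_gpu_for(1, "2,5"))  # 5
```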

### Dataset-Specific Examples

#### Mathematical Problems (Algebra)
```bash
# Recommended for math problems: max_length = 2048
source /zju_0038/jinjia/workspace/Merging-Scaling-Law-main/merge-eval-py312/bin/activate && \
export TRANSFORMERS_NO_TORCHVISION=1 && \
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python && \
export CUDA_VISIBLE_DEVICES=7 && \
python3 /zju_0038/test_merge/Merging-EVAL/scripts/eval.py \
  --model /zju_0038/jinjia/workspace/Merging-Scaling-Law-main/models/merged/Llama-3B-cmb/task_arithmetic_9/sc0.1_r0/6p3h \
  --tokenizer /zju_0038/jinjia/workspace/Merging-Scaling-Law-main/models/merged/Llama-3B-cmb/task_arithmetic_9/sc0.1_r0/6p3h \
  --file /zju_0038/test_merge/Merging-EVAL/data/eval_partial/algebra.json \
  --output /zju_0038/test_merge/Merging-EVAL/test_result \
  --batch_size 1 \
  --max_length 2048 \
  --gpu_id 0 \
  --offline
```

#### Code Generation Tasks
```bash
# Recommended for code problems: max_length = 4096
source /zju_0038/jinjia/workspace/Merging-Scaling-Law-main/merge-eval-py312/bin/activate && \
export TRANSFORMERS_NO_TORCHVISION=1 && \
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python && \
export CUDA_VISIBLE_DEVICES=7 && \
python3 /zju_0038/test_merge/Merging-EVAL/scripts/eval.py \
  --model /zju_0038/jinjia/workspace/Merging-Scaling-Law-main/models/merged/Llama-3B-cmb/task_arithmetic_9/sc0.1_r0/6p3h \
  --tokenizer /zju_0038/jinjia/workspace/Merging-Scaling-Law-main/models/merged/Llama-3B-cmb/task_arithmetic_9/sc0.1_r0/6p3h \
  --file /zju_0038/test_merge/Merging-EVAL/data/eval_partial/code.json \
  --output /zju_0038/test_merge/Merging-EVAL/test_result \
  --batch_size 1 \
  --max_length 4096 \
  --gpu_id 0 \
  --offline
```

## Parameters

### Model Merging Parameters

#### Required Parameters
- `--merge_method`: Merging method (e.g., "task_arithmetic", "average_merging")
- `--base_model`: Path to the base model directory
- `--models_to_merge`: Comma-separated list of model paths to merge
- `--output_dir`: Output directory for merged model

#### Optional Parameters
- `--scaling_coefficient`: Scaling coefficient for merging (default: 1.0)
- `--use_gpu`: Use GPU for merging (default: CPU)
- `--exclude_param_names_regex`: Regex patterns for parameters to exclude
- `--param_value_mask_rate`: Parameter value mask rate (default: 0.8)
- `--mask_apply_method`: Method for applying masks (default: "average_merging")
- `--weight_mask_rates`: Comma-separated weight mask rates

### Model Evaluation Parameters

#### Required Parameters
- `--model`: Path to the model directory
- `--tokenizer`: Path to the tokenizer directory
- `--file`: Path to the evaluation dataset JSON file

#### Optional Parameters
- `--output`: Output directory for results (default: `./output`)
- `--batch_size`: Batch size for evaluation (default: 10)
- `--max_length`: Maximum sequence length (default: 2048)
- `--gpu_id`: Specific GPU ID to use (e.g., 0, 1, 2)
- `--offline`: Run in offline mode (no internet connection required)
- `--indices`: Evaluate specific data indices (e.g., "1-10,15,20-22")
- `--run_name`: Custom name for output folder
- `--no_cache`: Disable caching mechanism
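
The `--indices` value mixes ranges and single indices. The following sketch shows one way such a specification could be expanded. It is not taken from `eval.py`: `parse_indices` is a hypothetical name, and inclusive ranges are an assumption (the README does not say whether indices are 0-based or 1-based).

```python
from typing import List


def parse_indices(spec: str) -> List[int]:
    """Expand a spec like "1-10,15,20-22" into a sorted list of unique indices."""
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            selected.update(range(start, end + 1))  # assumption: ranges are inclusive
        else:
            selected.add(int(part))
    return sorted(selected)


indices = parse_indices("1-10,15,20-22")
print(indices)
print(len(indices))  # 14
```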

### Dataset-Specific Recommendations

#### Code Generation Tasks
- **Recommended max_length**: 4096-8192
- **Reason**: Code samples are typically longer (average: 1,899 tokens)
- **Coverage**: 8192 covers 98% of samples

#### Mathematical Problems
- **Recommended max_length**: 2048
- **Reason**: Math problems are typically shorter
- **Coverage**: 2048 is sufficient for most algebra problems
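
To choose `max_length` for another dataset, it can help to measure what fraction of samples fits under each candidate limit. A minimal sketch, assuming the token counts have already been computed with the model's tokenizer; the `coverage` helper and the sample lengths are made up for illustration.

```python
from typing import List


def coverage(token_lengths: List[int], max_length: int) -> float:
    """Fraction of samples whose token count fits within max_length."""
    if not token_lengths:
        raise ValueError("token_lengths is empty")
    fitting = sum(1 for length in token_lengths if length <= max_length)
    return fitting / len(token_lengths)


# Made-up token counts for illustration only.
lengths = [512, 1200, 1899, 3000, 9000]
for limit in (2048, 4096, 8192):
    print(limit, coverage(lengths, limit))
```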

## Output Format

Results are saved as CSV files in the output directory:

```
test_result/
└── model_name/
    └── all/
        └── results.csv
```

CSV format:
```csv
problem,CE Loss,class
algebra,0.6289,algebra
Avg.,0.6289,average
Overall,0.6289,overall
```
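
A results file in this layout can be read with the standard library. The sketch below parses the sample rows shown above into a dictionary; `load_losses` is a hypothetical helper, and it assumes the column names are exactly `problem`, `CE Loss`, and `class`.

```python
import csv
import io
from typing import Dict

# Sample content copied from the CSV format shown above.
sample = """problem,CE Loss,class
algebra,0.6289,algebra
Avg.,0.6289,average
Overall,0.6289,overall
"""


def load_losses(csv_text: str) -> Dict[str, float]:
    """Map each `problem` entry to its cross-entropy loss."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["problem"]: float(row["CE Loss"]) for row in reader}


losses = load_losses(sample)
print(losses["algebra"])  # 0.6289
```

For a file on disk, pass the file's text (for example from `pathlib.Path(...).read_text()`) to the same function.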

## PyPI Package Distribution

This project is available as a Python package on PyPI for easy installation and distribution.

### Package Information

- **Package Name**: `merging-eval`
- **Version**: 0.1.2
- **Description**: Model merging algorithms and scaling laws for large language models
- **License**: MIT
- **Python**: >=3.8

### Installation

#### From PyPI
```bash
pip install merging-eval
```

#### With Full Dependencies
```bash
pip install "merging-eval[full]"
```

#### Development Installation
```bash
git clone https://github.com/Merging-EVAL/Merging-EVAL.git
cd Merging-EVAL
pip install -e .
```

### Package Features

The package provides comprehensive model merging and evaluation capabilities:

```python
import merge
from merge import MergingMethod, FlopsCounter

# Available merging methods
merging_methods = [
    "average_merging",      # Equal-weight averaging
    "task_arithmetic",      # Task vector arithmetic
    "ties_merging",         # TIES merging algorithm
    "ties_merging_dare",    # TIES with DARE variant
    "mask_merging"          # Mask-based merging
]
```
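
As an illustration of the simplest method in this list, equal-weight averaging takes the element-wise mean of the models' parameters. The sketch below works on plain Python lists and does not use the package's API; `average_merging` here is a hypothetical stand-alone function, not the library's implementation.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def average_merging(models: List[Weights]) -> Weights:
    """Element-wise mean of the parameters of all models."""
    if not models:
        raise ValueError("need at least one model")
    merged: Weights = {}
    for name in models[0]:
        # Group the values at each position across models, then average them.
        columns = zip(*(model[name] for model in models))
        merged[name] = [sum(values) / len(models) for values in columns]
    return merged


averaged = average_merging([
    {"w": [1.0, 4.0]},
    {"w": [3.0, 0.0]},
])
print(averaged)  # {'w': [2.0, 2.0]}
```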

### Publishing to PyPI

#### Local Development
```bash
./build_and_test.sh
```

#### Publishing
1. **Setup credentials**: Copy `.pypirc.template` to `~/.pypirc` and add API tokens
2. **Test PyPI**: `./publish_to_pypi.sh test`
3. **Production PyPI**: `./publish_to_pypi.sh production`

#### Automated Publishing
- GitHub Actions automatically publishes on new releases
- Update version in `pyproject.toml` and create GitHub release

## Troubleshooting

### Model Merging Issues

1. **Protobuf Version Conflicts**:
   - Solution: Set `export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python`

2. **GPU Memory Issues**:
   - Use CPU mode (remove `--use_gpu` flag)
   - Reduce number of models to merge simultaneously
   - Use specific GPU with `export CUDA_VISIBLE_DEVICES=X`

3. **PEFT Configuration Errors**:
   - Some models may have incompatible PEFT configurations
   - Solution: Exclude problematic models or use CPU mode

4. **RoPE Configuration Warnings**:
   - Warning: `rope_scaling` configuration issues
   - Solution: These are usually non-fatal warnings

### Model Evaluation Issues

1. **NaN Loss Values**: Usually caused by very long sequences or all-masked labels
   - Solution: Increase `max_length` or check data format

2. **GPU Memory Issues**:
   - Reduce `batch_size` to 1
   - Decrease `max_length`
   - Use specific GPU with `--gpu_id`

3. **Network Connection Issues**:
   - Use `--offline` flag for local model evaluation
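
The all-masked case behind NaN losses can be seen in a small example: if every label position is masked, the mean cross-entropy is taken over zero tokens. This is a sketch on plain Python floats, not the evaluation code; the `-100` ignore index is a common convention and an assumption here, and `mean_cross_entropy` is a hypothetical name.

```python
import math
from typing import List

IGNORE_INDEX = -100  # common convention for masked label positions; an assumption here


def mean_cross_entropy(token_log_probs: List[float], labels: List[int]) -> float:
    """Mean negative log-likelihood over positions whose label is not masked."""
    kept = [-lp for lp, label in zip(token_log_probs, labels) if label != IGNORE_INDEX]
    if not kept:
        # No tokens contribute: the mean is 0/0, which tensor libraries report as NaN.
        return float("nan")
    return sum(kept) / len(kept)


log_probs = [math.log(0.5), math.log(0.25)]
normal_loss = mean_cross_entropy(log_probs, [11, 42])                      # mean of ln 2 and ln 4
masked_loss = mean_cross_entropy(log_probs, [IGNORE_INDEX, IGNORE_INDEX])  # nan
print(normal_loss, masked_loss)
```

One plausible way to reach this state is truncation: if only answer tokens are scored and `max_length` cuts the sequence before the answer begins, nothing is left to score, which is consistent with the advice to increase `max_length`.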

### Performance Tips

#### Model Merging
- Use GPU mode for faster merging when memory allows
- Start with smaller model subsets to test compatibility
- Use CPU mode for large model collections to avoid memory issues
- Set environment variables before running to avoid conflicts

#### Model Evaluation
- Use caching for repeated evaluations on the same dataset
- Specify GPU ID for better resource management
- Adjust `max_length` based on dataset characteristics
- Use offline mode when the internet connection is unstable

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "merging-eval",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "llm, model-merging, scaling-laws, language-models, cross-entropy",
    "author": null,
    "author_email": "Merging-EVAL Team <maxuan1798@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/18/9c/fb7dc7acea310036a926508a5b69db1345e5ba61a1d23c7ec488ab483666/merging_eval-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License\n        \n        Copyright (c) 2024 Merging-EVAL Team\n        \n        Permission is hereby granted, free of charge, to any person obtaining a copy\n        of this software and associated documentation files (the \"Software\"), to deal\n        in the Software without restriction, including without limitation the rights\n        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n        copies of the Software, and to permit persons to whom the Software is\n        furnished to do so, subject to the following conditions:\n        \n        The above copyright notice and this permission notice shall be included in all\n        copies or substantial portions of the Software.\n        \n        THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n        SOFTWARE.",
    "summary": "Model Merging Scaling Laws in Large Language Models - Empirical scaling laws for language model merging measured by cross-entropy",
    "version": "0.1.2",
    "project_urls": {
        "Documentation": "https://github.com/Merging-EVAL/Merging-EVAL#readme",
        "Homepage": "https://github.com/Merging-EVAL/Merging-EVAL",
        "Issues": "https://github.com/Merging-EVAL/Merging-EVAL/issues",
        "Repository": "https://github.com/Merging-EVAL/Merging-EVAL"
    },
    "split_keywords": [
        "llm",
        " model-merging",
        " scaling-laws",
        " language-models",
        " cross-entropy"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b8bd09c86bbe2804a908a7e5b9ce5b4c0e52e9e57e4957c9ca5f530c1453964a",
                "md5": "2851bc1992824955cf1c66d4b6bd11db",
                "sha256": "0e01a90ec19d2ffd95fc9578129a147d35b7dc8b03c919d80f7f3ecf7f1ef9e8"
            },
            "downloads": -1,
            "filename": "merging_eval-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2851bc1992824955cf1c66d4b6bd11db",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 24955,
            "upload_time": "2025-10-31T01:24:00",
            "upload_time_iso_8601": "2025-10-31T01:24:00.592866Z",
            "url": "https://files.pythonhosted.org/packages/b8/bd/09c86bbe2804a908a7e5b9ce5b4c0e52e9e57e4957c9ca5f530c1453964a/merging_eval-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "189cfb7dc7acea310036a926508a5b69db1345e5ba61a1d23c7ec488ab483666",
                "md5": "68b1b744426a804a05830a2b5ddbd71a",
                "sha256": "51bcf13aeae158c69ec8380cc4ce5a06627b5f81f8d3cf1c4053aba9807d6ddc"
            },
            "downloads": -1,
            "filename": "merging_eval-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "68b1b744426a804a05830a2b5ddbd71a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 2143639,
            "upload_time": "2025-10-31T01:24:03",
            "upload_time_iso_8601": "2025-10-31T01:24:03.104909Z",
            "url": "https://files.pythonhosted.org/packages/18/9c/fb7dc7acea310036a926508a5b69db1345e5ba61a1d23c7ec488ab483666/merging_eval-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-31 01:24:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Merging-EVAL",
    "github_project": "Merging-EVAL#readme",
    "github_not_found": true,
    "lcname": "merging-eval"
}
        