questions-gen

- **Name**: questions-gen
- **Version**: 0.1.1
- **Summary**: AI-Powered Mathematical Competition Problem Generation Package
- **Upload time**: 2025-08-17 13:24:55
- **Requires Python**: >=3.8
- **Keywords**: artificial intelligence, machine learning, natural language processing, mathematical problem generation, education technology, competition mathematics, language model, fine-tuning, reinforcement learning, knowledge distillation

# Questions-Gen: AI-Powered Mathematical Competition Problem Generation

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-green.svg)](https://opensource.org/licenses/Apache-2.0)
[![HuggingFace Models](https://img.shields.io/badge/🤗%20HuggingFace-Models-yellow)](https://huggingface.co/xingqiang)
[![PyPI version](https://img.shields.io/badge/PyPI-v0.1.1-orange.svg)](https://pypi.org/project/questions-gen/)

**Questions-Gen** is a system for generating mathematical competition problems, built on Qwen3-14B. It is trained in three stages: Basic Pretraining → RL GRPO Optimization → Knowledge Distillation.

## 🌟 Key Features

- **🎯 Three-Stage Training Pipeline**: Basic Pretraining → RL GRPO Optimization → Knowledge Distillation
- **🔄 Intelligent Problem Variation Generation**: Create variations of existing problems
- **📊 Multi-Dimensional Quality Assessment**: Problem quality evaluation across several dimensions
- **🤖 Teacher Model Integration**: Knowledge distillation from DeepSeek-R1
- **🚀 Ollama Integration**: Local inference deployment
- **📈 Batch Validation Tools**: Large-scale model comparison testing
- **💎 Full Precision Models**: Unquantized FP16 weights, with no quantization loss
- **⌨️ Professional CLI Tools**: Command-line interface for validation, evaluation, and deployment

## 🏆 Available Models

| Training Stage | HuggingFace Model | Description | Downloads |
|---------------|-------------------|-------------|-----------|
| **Stage 1** | [`xingqiang/QuestionsGen-Qwen3-14b-stage1-fp-merged`](https://huggingface.co/xingqiang/QuestionsGen-Qwen3-14b-stage1-fp-merged) | Basic mathematical problem generation | 4+ |
| **Stage 2** | [`xingqiang/questions-gen-qwen3-14b-stage2-merged-16bit`](https://huggingface.co/xingqiang/questions-gen-qwen3-14b-stage2-merged-16bit) | GRPO optimization + variation generation | 3+ |
| **Final** | [`xingqiang/questions-gen-qwen3-14b-final-merged-16bit`](https://huggingface.co/xingqiang/questions-gen-qwen3-14b-final-merged-16bit) | Complete knowledge distillation version | 3+ |
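
The checkpoints in the table are merged Hugging Face models, so they can presumably be loaded with the `transformers` text-generation pipeline. The sketch below is an assumption rather than something this README documents: the helper names `build_prompt` and `load_generator` are hypothetical, and the prompt wording simply mirrors the Ollama examples in this README. A 14B-parameter model in 16-bit precision needs roughly 28 GB for the weights alone.

```python
DEFAULT_MODEL = "xingqiang/questions-gen-qwen3-14b-final-merged-16bit"


def build_prompt(topic: str) -> str:
    """Build a short generation prompt for one mathematical topic (hypothetical helper)."""
    return f"Generate a {topic.strip()} competition problem:"


def load_generator(model_id: str = DEFAULT_MODEL):
    """Create a text-generation pipeline; this downloads the full checkpoint."""
    # Imported lazily so build_prompt works without torch/transformers installed.
    import torch
    from transformers import pipeline

    return pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.float16,  # keep the published 16-bit weights
        device_map="auto",          # needs the `accelerate` package
    )


# Usage (requires enough GPU memory for the 14B model):
#   generator = load_generator()
#   result = generator(build_prompt("number theory"), max_new_tokens=256)
#   print(result[0]["generated_text"])
```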

## 🚀 Quick Start

### Installation

```bash
# Install from PyPI
pip install questions-gen

# Install from source (development version)
git clone https://github.com/xingqiang/questions-gen.git
cd questions-gen
pip install -e .
```

### Basic Usage

#### Quick Demonstrations

```bash
# Run complete functionality demo
python examples/quick_demo.py

# Model validation demo
python examples/demo_model_validation.py

# Ollama deployment demo
python examples/demo_ollama_push.py
```

#### Command Line Interface

```bash
# Validate final model
questions-gen validate --model final --tests 5

# Batch validation across categories
questions-gen batch --category algebra --tests 3 --export-csv

# Quality assessment
questions-gen quality "Find the derivative of f(x) = x³ + 2x² - 5x + 1" --detailed

# Ollama integration
questions-gen ollama --push-all
questions-gen ollama --test questions-gen-final

# HuggingFace tools
questions-gen hf --verify --compare
```

#### Model Training

```bash
# Custom training (requires GPU)
python scripts/questions_gen_training.py
```

#### Deployment Tools

```bash
# Quick import to Ollama
python tools/ollama_import.py

# Complete download and conversion
python tools/download_and_convert.py
```

#### Python API

```python
from questions_gen import QuestionsGenTrainer
from questions_gen.validation import ModelValidator, BatchValidator, QualityEvaluator
from questions_gen.utils import OllamaManager, HuggingFaceUtils

# Model validation
validator = ModelValidator()
results = validator.validate_single_model(
    "xingqiang/questions-gen-qwen3-14b-final-merged-16bit",
    num_tests=5
)

# Batch validation
batch_validator = BatchValidator()
batch_results = batch_validator.comparative_batch_validation(
    category="calculus",
    tests_per_category=5
)

# Quality evaluation
evaluator = QualityEvaluator()
evaluation = evaluator.comprehensive_evaluation(
    "Prove that the square root of 2 is irrational."
)
print(f"Quality Score: {evaluation['overall_score']:.3f}")

# Ollama integration
ollama = OllamaManager()
ollama.push_all_models()
```

## 📊 Performance Results

### Model Comparison

Based on comprehensive mathematical domain testing:

| Model | Avg Quality Score | Generation Time | Teacher Rating | Best Use Case |
|-------|------------------|------------------|----------------|---------------|
| **Final** | **0.847** | 2.1s | **4.2/5.0** | Professional competitions |
| Stage 2 | 0.782 | 1.8s | 3.8/5.0 | Problem variations |
| Stage 1 | 0.695 | 1.5s | 3.4/5.0 | Basic problem generation |

### Category Performance

| Mathematical Domain | Final Model | Stage 2 | Stage 1 |
|-------------------|-------------|---------|---------|
| Algebra | 0.863 | 0.798 | 0.712 |
| Calculus | 0.891 | 0.815 | 0.698 |
| Geometry | 0.824 | 0.763 | 0.681 |
| Number Theory | 0.859 | 0.785 | 0.704 |
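
To compare the stages at a glance, the per-category scores can be averaged per model. The snippet below is a small sketch that only uses the numbers from the two tables above; the function names are illustrative. Note that the four-category means it produces (0.859, 0.790, 0.699) sit slightly above the overall averages in the Model Comparison table (0.847, 0.782, 0.695); this README does not say how the overall figures were aggregated.

```python
from statistics import mean

# Scores copied from the Category Performance table.
CATEGORY_SCORES = {
    "Algebra":       {"Final": 0.863, "Stage 2": 0.798, "Stage 1": 0.712},
    "Calculus":      {"Final": 0.891, "Stage 2": 0.815, "Stage 1": 0.698},
    "Geometry":      {"Final": 0.824, "Stage 2": 0.763, "Stage 1": 0.681},
    "Number Theory": {"Final": 0.859, "Stage 2": 0.785, "Stage 1": 0.704},
}


def mean_by_model(scores):
    """Average each model's score over all listed categories."""
    models = list(next(iter(scores.values())))
    return {m: round(mean([row[m] for row in scores.values()]), 3) for m in models}


def relative_gain(new, old):
    """Relative improvement of `new` over `old` as a fraction."""
    return (new - old) / old


averages = mean_by_model(CATEGORY_SCORES)
print(averages)
# Relative gain of the Final model over Stage 1, using the overall averages
# from the Model Comparison table (0.847 vs 0.695): about 21.9%.
print(f"{relative_gain(0.847, 0.695):.1%}")
```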

## 🔧 Advanced Features

### Custom Model Training

```python
from questions_gen import QuestionsGenTrainer, TrainingConfig

# Configure training parameters
config = TrainingConfig()
config.MAX_STEPS_STAGE1 = 100
config.PRESERVE_FULL_PRECISION = True

# Complete three-stage training
trainer = QuestionsGenTrainer()
trainer.train_full_pipeline()
```

### Quality Assessment System

```python
from questions_gen.validation import QualityEvaluator

evaluator = QualityEvaluator()

# Comprehensive evaluation
question = "Find all solutions to x⁴ - 5x² + 6 = 0"
evaluation = evaluator.comprehensive_evaluation(question)

print(f"Overall Score: {evaluation['overall_score']:.3f}")
print(f"Grade: {evaluation['grade']}")
print(f"Recommendations: {evaluation['recommendations']}")
```

### Batch Processing

```python
from questions_gen.validation import BatchValidator

batch_validator = BatchValidator()

# Multi-model comparison testing
results = batch_validator.comparative_batch_validation(
    models=[
        "xingqiang/questions-gen-qwen3-14b-stage2-merged-16bit",
        "xingqiang/questions-gen-qwen3-14b-final-merged-16bit"
    ],
    category="all",
    tests_per_category=3
)

# Export results
batch_validator.export_results_to_csv(results)
```

## 🐳 Ollama Local Deployment

Convenient local inference deployment:

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Push Questions-Gen models
questions-gen ollama --push-all

# Use the model
ollama run questions-gen-final "Generate a calculus competition problem:"
```

### API Usage

```python
import requests

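def generate_problem(prompt, timeout=120):
    """Request one non-streaming completion from the local Ollama server."""
    response = requests.post(
        'http://localhost:11434/api/generate',
        json={
            'model': 'questions-gen-final',
            'prompt': prompt,
            'stream': False,  # return a single JSON object instead of chunks
        },
        timeout=timeout,  # avoid hanging forever if the server is unreachable
    )
    response.raise_for_status()  # surface HTTP errors such as an unknown model
    return response.json()['response']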

problem = generate_problem("Create a number theory competition problem:")
print(problem)
```
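
For long problems it can be nicer to print text as it arrives. With `'stream': True`, Ollama's `/api/generate` endpoint returns newline-delimited JSON objects, each carrying a `response` fragment and a `done` flag. The following is a minimal sketch under that assumption; `iter_stream_text` and `stream_problem` are illustrative names, not part of the questions-gen package.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from a newline-delimited JSON stream."""
    for raw in lines:
        if not raw:  # skip blank keep-alive lines
            continue
        chunk = json.loads(raw)
        if "error" in chunk:
            raise RuntimeError(chunk["error"])
        yield chunk.get("response", "")
        if chunk.get("done"):  # final chunk of the generation
            break


def stream_problem(prompt, model="questions-gen-final",
                   url="http://localhost:11434/api/generate"):
    """Stream one generation from a local Ollama server."""
    import requests  # imported lazily; only needed for real requests

    payload = {"model": model, "prompt": prompt, "stream": True}
    with requests.post(url, json=payload, stream=True, timeout=300) as response:
        response.raise_for_status()
        yield from iter_stream_text(response.iter_lines())


# Usage (requires a running Ollama server with the model imported):
#   for fragment in stream_problem("Create a geometry competition problem:"):
#       print(fragment, end="", flush=True)
```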

## 🧪 Testing and Validation

### Comprehensive Testing

```bash
# Model validation
questions-gen validate --model all --tests 5 --save

# Category-specific testing
questions-gen batch --category geometry --tests 5 --parallel

# Quality evaluation
questions-gen quality "Prove that √2 is irrational" --detailed
```

### Model Comparison

```bash
# Compare all models
questions-gen compare --all-models --tests 5

# HuggingFace status check
questions-gen hf --compare --health
```

## 📈 Evaluation Dimensions

### Quality Assessment Metrics

- **Mathematical Content**: Concept diversity and complexity
- **Clarity**: Problem statement clarity and structure
- **Difficulty**: Appropriate challenge level
- **Completeness**: Problem setup and constraints
- **Originality**: Innovation and creativity
- **Educational Value**: Learning objectives and pedagogy
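
How the six dimensions are combined into `overall_score` is not described here. As a rough illustration only, the sketch below assumes each dimension is scored in [0, 1] and averages them with equal weights; the dimension keys, the weighting, and both function names are assumptions, not the package's actual implementation.

```python
# Hypothetical keys for the six dimensions listed above.
DIMENSIONS = (
    "mathematical_content",
    "clarity",
    "difficulty",
    "completeness",
    "originality",
    "educational_value",
)


def overall_score(scores, weights=None):
    """Weighted mean of the dimension scores (equal weights by default)."""
    weights = weights or {name: 1.0 for name in DIMENSIONS}
    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    total_weight = sum(weights[name] for name in DIMENSIONS)
    return sum(weights[name] * scores[name] for name in DIMENSIONS) / total_weight


def weakest_dimension(scores):
    """Name of the lowest-scoring dimension, a natural target for revision."""
    return min(DIMENSIONS, key=lambda name: scores[name])


example = {
    "mathematical_content": 0.9,
    "clarity": 0.8,
    "difficulty": 0.7,
    "completeness": 0.6,
    "originality": 0.5,
    "educational_value": 0.7,
}
print(f"{overall_score(example):.3f}", weakest_dimension(example))
```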

### Validation Categories

- **Algebra**: Equations, polynomials, abstract algebra
- **Geometry**: Euclidean, coordinate, solid geometry
- **Calculus**: Derivatives, integrals, optimization
- **Number Theory**: Primes, modular arithmetic, Diophantine equations
- **Combinatorics**: Counting, permutations, graph theory
- **Analysis**: Real analysis, sequences, convergence

## 🛠️ Development Guide

### Project Structure

```
questions-gen/
├── questions_gen/          # Main package code
│   ├── cli/               # Command-line interface
│   ├── core/              # Core functionality
│   ├── data/              # Data processing
│   ├── models/            # Model components
│   ├── utils/             # Utility functions
│   └── validation/        # Validation system
├── docs/                  # Complete documentation
│   ├── guides/           # User guides
│   ├── technical/        # Technical documentation
│   └── training/         # Training-related docs
├── examples/             # Demo scripts
├── scripts/              # Training scripts
├── tools/                # Utility scripts
└── tests/                # Test files
```

### Development Environment Setup

```bash
git clone https://github.com/xingqiang/questions-gen.git
cd questions-gen
pip install -e ".[dev]"
```

### Running Tests

```bash
pytest tests/ -v --cov=questions_gen
```

### Code Formatting

```bash
black questions_gen/
isort questions_gen/
flake8 questions_gen/
```

## 📚 System Architecture

### Three-Stage Training Pipeline

```
Questions-Gen Training System
├── Basic Pretraining (Stage 1)
│   ├── Historical competition problems (50%)
│   ├── Conditional variations (30%)
│   └── Innovative problem types (20%)
├── RL GRPO Optimization (Stage 2)
│   ├── Group policy generation (8 problems/group)
│   ├── Multi-dimensional reward function
│   └── Novelty constraint layer
└── Knowledge Distillation (Stage 3)
    ├── DeepSeek-R1 (difficulty prediction)
    ├── Logic rigor checking
    ├── Innovation assessment
    └── Educational value scoring
```
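
In GRPO, each problem in a group is scored and then compared against the other problems generated for the same prompt, so no separate value model is needed. The sketch below shows the usual group-relative advantage, (reward - group mean) / group standard deviation, for a group of 8 as in Stage 2. It is a generic illustration of the technique, not this project's training code; the use of the population standard deviation and the `eps` value are assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group of 8 generated problems with made-up reward values.
group_rewards = [0.62, 0.71, 0.55, 0.80, 0.66, 0.59, 0.74, 0.68]
advantages = group_relative_advantages(group_rewards)

# Problems scoring above the group mean get a positive advantage.
for reward, advantage in zip(group_rewards, advantages):
    print(f"reward={reward:.2f}  advantage={advantage:+.2f}")
```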

### Reward Function System

```python
reward = 0.4 * difficulty + 0.3 * novelty + 0.2 * rigor + 0.1 * diversity
```

- **Difficulty Analysis** (40%): Based on keywords and text complexity
- **Innovation** (30%): Difference from historical problems
- **Logic Rigor** (20%): Reasoning vocabulary density
- **Diversity** (10%): Within-group problem variance
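
The weighted sum above can be wrapped in a small function. This sketch assumes that each component has already been scaled to [0, 1] (the README does not state the ranges) and leaves out how the individual components are computed; `combined_reward` is an illustrative name.

```python
# Weights taken from the reward formula above.
WEIGHTS = {"difficulty": 0.4, "novelty": 0.3, "rigor": 0.2, "diversity": 0.1}


def combined_reward(difficulty, novelty, rigor, diversity):
    """Weighted sum of the four reward components (each assumed to be in [0, 1])."""
    components = {
        "difficulty": difficulty,
        "novelty": novelty,
        "rigor": rigor,
        "diversity": diversity,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())


# A hard, fairly novel problem with average rigor and low diversity.
print(f"{combined_reward(0.9, 0.7, 0.5, 0.2):.2f}")
```

Because the weights sum to 1, the combined reward stays in [0, 1] under this assumption, and difficulty and novelty together account for 70% of it.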

## 📖 Documentation

- **[Complete Documentation](docs/README.md)**: Documentation index
- **[User Guide](docs/guides/USAGE_GUIDE.md)**: Complete user manual
- **[Training Guide](docs/guides/TRAINING_GUIDE.md)**: Custom model training
- **[Technical Documentation](docs/technical/TRAINING_DETAILS.md)**: In-depth technical implementation
- **[API Reference](questions_gen/)**: See docstrings in source code
- **[Example Code](examples/)**: Demo scripts and usage examples

## 🤝 Contributing

Contributions are welcome! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a Pull Request

## 📄 License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- **Unsloth**: Efficient fine-tuning optimization
- **HuggingFace**: Model hosting and transformers library
- **DeepSeek**: Teacher model in knowledge distillation
- **Qwen Team**: Base Qwen3-14B model

## 📞 Support

- **Issue Reports**: [GitHub Issues](https://github.com/xingqiang/questions-gen/issues)
- **Model Downloads**: [HuggingFace Models](https://huggingface.co/xingqiang)
- **Discussions**: [GitHub Discussions](https://github.com/xingqiang/questions-gen/discussions)

## 🔗 Related Projects

- [Unsloth](https://github.com/unslothai/unsloth) - Fast LLM fine-tuning
- [Transformers](https://github.com/huggingface/transformers) - State-of-the-art machine learning
- [Ollama](https://github.com/ollama/ollama) - Local LLM deployment

---

**Questions-Gen** - Advancing mathematical education through AI-powered problem generation. 🎓✨

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "questions-gen",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Xingqiang Chen <xingqiangchen@turingai.cc>",
    "keywords": "artificial intelligence, machine learning, natural language processing, mathematical problem generation, education technology, competition mathematics, language model, fine-tuning, reinforcement learning, knowledge distillation",
    "author": null,
    "author_email": "Xingqiang Chen <xingqiangchen@turingai.cc>",
    "download_url": "https://files.pythonhosted.org/packages/75/6f/14ef485b07b6bdaa4223832e72a3640d8673184c855ecb031dab37fa3360/questions_gen-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "AI-Powered Mathematical Competition Problem Generation Package",
    "version": "0.1.1",
    "project_urls": {
        "Bug Reports": "https://github.com/xingqiang/questions-gen/issues",
        "Documentation": "https://github.com/xingqiang/questions-gen/wiki",
        "Homepage": "https://github.com/xingqiang/questions-gen",
        "Models": "https://huggingface.co/xingqiang",
        "Repository": "https://github.com/xingqiang/questions-gen"
    },
    "split_keywords": [
        "artificial intelligence",
        " machine learning",
        " natural language processing",
        " mathematical problem generation",
        " education technology",
        " competition mathematics",
        " language model",
        " fine-tuning",
        " reinforcement learning",
        " knowledge distillation"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4a38cb1a235db9381003fc7211f6398949400814681798408e8c490a2e7f9275",
                "md5": "62b51d971e06d069a8ed0665e7085718",
                "sha256": "c350a9309d1e36df0aff4286d793a206cab8c56d0e6974c06efda53dfc96cf4a"
            },
            "downloads": -1,
            "filename": "questions_gen-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "62b51d971e06d069a8ed0665e7085718",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 48899,
            "upload_time": "2025-08-17T13:24:54",
            "upload_time_iso_8601": "2025-08-17T13:24:54.541011Z",
            "url": "https://files.pythonhosted.org/packages/4a/38/cb1a235db9381003fc7211f6398949400814681798408e8c490a2e7f9275/questions_gen-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "756f14ef485b07b6bdaa4223832e72a3640d8673184c855ecb031dab37fa3360",
                "md5": "37b5492ce717cdeca04d3b79be743da7",
                "sha256": "63b57b721be6ee02ed38dd85ef7ca8814e981432e6f7b00b861513e386baca56"
            },
            "downloads": -1,
            "filename": "questions_gen-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "37b5492ce717cdeca04d3b79be743da7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 46334,
            "upload_time": "2025-08-17T13:24:55",
            "upload_time_iso_8601": "2025-08-17T13:24:55.817449Z",
            "url": "https://files.pythonhosted.org/packages/75/6f/14ef485b07b6bdaa4223832e72a3640d8673184c855ecb031dab37fa3360/questions_gen-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-17 13:24:55",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "xingqiang",
    "github_project": "questions-gen",
    "github_not_found": true,
    "lcname": "questions-gen"
}
        