# PyIRT_SDT: Python Item Response Theory and Signal Detection Theory Library
PyIRT_SDT is a comprehensive Python library for Item Response Theory (IRT) and Signal Detection Theory (SDT) estimation and analysis. It provides robust tools for modeling and analyzing participant response data using advanced psychometric techniques.
NOTE: This library is not designed for memory efficiency because it was used on large memory machines.
MOTIVATION: At the time when I developed this, there was no Python IRT library that could (1) handle sparse data, and (2) run on my system without errors.
This library was primarily developed to be used as input parameters (in addition to primary data) for large predictive models on future student performance and to serve as an automated QA tool for generation of practice questions.
METHOD:
# IRT
First, all IRT parameters are calculated (default uses a 4PL, but a 3PL is available) using scipy.optimize.curve_fit() (refer to scipy.optimize.least_squares for how the Jacobian is computed and solved) via an iterative process which seeks to maximize the fit over all items, for all participants, within the skill.
All IRT parameters and IRT model error variables (independent for each parameter within the 3PL or 4PL model) are captured once convergence or iteration cycles are reached. 4PL fits a logistic curve using 4 parameters: discriminability, difficulty, guessing level, inattention level.
# SDT
Finally, Signal Detection Theory (SDT) parameters are computed from the estimated participant thetas (estimated participant ability) and raw performance, including: Area Under Curve of Receiving Operating Characteristic (AUC_ROC), TPR, TNR, and Optimal Threshold (optimal criterion maximizing TNR and TPR).
Note: We don't use IRT parameters to calculate the SDT parameters; only the estimated student abilties from the IRT process. With SDT, we are asking: How well does each question do in separating lower ability students from higher ability students? We answer that quantitatively with traditional SDT parameters, such as True Positive Rate, AUC, and Optimal Threshold (optimal criterion).
From my experience, AUC_ROC can be used as a SDT substitute for IRT Discirminability. 1 - Optimal Threshold can be used as a SDT substitute for IRT Difficulty.
Several benefits of SDT include:
1) A normalized range (0 to 1) for difficulty making across skill comparisons easier and more intrepretable, for content teams.
2) More accurate when using these (instead of IRT parameters) to predict future performance inside of models.
3) Values more in line with ML approaches, making them intrepretable to MLEs and other users familiar with SDT.
This approach generates both types of parameters so that traditional psychometric teams have access to familiar parameters, while servicing SDT parameters for ML teams.
## Features
- Support for 3PL and 4PL IRT models
- Signal Detection Theory (SDT) analysis integration
- Parallel parameter estimation for improved performance
- Visualization tools for parameter convergence and item characteristic curves
- Can handle sparse data and/or continuous data
- Applicable to various types of questions and response data, not limited to mathematical problems
## Installation
To install PyIRT_SDT, you can use pip:
```bash
pip install pyirt-sdt
```
Or clone the repository and install from source:
```bash
git clone https://github.com/yourusername/pyirt-sdt.git
cd pyirt-sdt
pip install -e .
```
## Quick Start
Here's a simple example of how to use PyIRT_SDT:
```python
import pandas as pd
from pyirt_sdt import returnTable, solve_IRT_for_matrix
# Load your data
df = pd.read_csv('your_data.csv')
# Prepare the data
table = returnTable(df)
# Solve the IRT model and perform SDT analysis
results = solve_IRT_for_matrix(table, FOUR_PL=True)
# Access the results
print(results.thetas) # Participant ability estimates
print(results.est_params) # Item parameter estimates
print(results.sdt_results) # SDT analysis results
# Plot parameter convergence
results.plot_parameter_convergence()
# Export results to CSV
results.export_to_csv('irt_sdt_results.csv')
```
## Data Format
Your input CSV should have the following columns:
- `participant_id`: Unique identifier for each participant
- `item_id`: Unique identifier for each item or question
- `response`: The participant's response or score for the item
## Documentation
For detailed documentation, please refer to the [docs](docs/) directory or visit our [Read the Docs](https://pyirt-sdt.readthedocs.io/) page.
## Command-line Interface
PyIRT_SDT provides a command-line interface for quick analysis:
```bash
pyirt-sdt analyze --input your_data.csv --output results.csv --model 4pl
```
For more options, run:
```bash
pyirt-sdt --help
```
## Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for more details.
## License
PyIRT_SDT is released under the [MIT License](LICENSE).
## Contact
For any questions or issues, please open an issue on GitHub or contact [your email].
Raw data
{
"_id": null,
"home_page": "https://github.com/omarclaflin/pyIRT_SDT",
"name": "PyIRT-SDT",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": null,
"author": "Omar Claflin",
"author_email": "your.email@example.com",
"download_url": "https://files.pythonhosted.org/packages/f0/5a/5f23c5dbd42b43e614e994d36e72b013715448b3b2041e0067ffc0c19254/pyirt_sdt-0.1.5.tar.gz",
"platform": null,
"description": "# PyIRT_SDT: Python Item Response Theory and Signal Detection Theory Library\r\n\r\nPyIRT_SDT is a comprehensive Python library for Item Response Theory (IRT) and Signal Detection Theory (SDT) estimation and analysis. It provides robust tools for modeling and analyzing participant response data using advanced psychometric techniques.\r\n\r\nNOTE: This library is not designed for memory efficiency because it was used on large memory machines.\r\n\r\nMOTIVATION: At the time when I developed this, there was no Python IRT library that could (1) handle sparse data, and (2) run on my system without errors. \r\nThis library was primarily developed to be used as input parameters (in addition to primary data) for large predictive models on future student performance and to serve as an automated QA tool for generation of practice questions.\r\n\r\nMETHOD: \r\n# IRT\r\nFirst, all IRT parameters are calculated (default uses a 4PL, but a 3PL is available) using scipy.optimize.curve_fit() (refer to scipy.optimize.least_squares for how the Jacobian is computed and solved) via an iterative process which seeks to maximize the fit over all items, for all participants, within the skill.\r\nAll IRT parameters and IRT model error variables (independent for each parameter within the 3PL or 4PL model) are captured once convergence or iteration cycles are reached. 4PL fits a logistic curve using 4 parameters: discriminability, difficulty, guessing level, inattention level.\r\n\r\n# SDT\r\nFinally, Signal Detection Theory (SDT) parameters are computed from the estimated participant thetas (estimated participant ability) and raw performance, including: Area Under Curve of Receiving Operating Characteristic (AUC_ROC), TPR, TNR, and Optimal Threshold (optimal criterion maximizing TNR and TPR). \r\n\r\nNote: We don't use IRT parameters to calculate the SDT parameters; only the estimated student abilties from the IRT process. With SDT, we are asking: How well does each question do in separating lower ability students from higher ability students? We answer that quantitatively with traditional SDT parameters, such as True Positive Rate, AUC, and Optimal Threshold (optimal criterion).\r\n\r\nFrom my experience, AUC_ROC can be used as a SDT substitute for IRT Discirminability. 1 - Optimal Threshold can be used as a SDT substitute for IRT Difficulty. \r\n\r\nSeveral benefits of SDT include:\r\n1) A normalized range (0 to 1) for difficulty making across skill comparisons easier and more intrepretable, for content teams.\r\n2) More accurate when using these (instead of IRT parameters) to predict future performance inside of models.\r\n3) Values more in line with ML approaches, making them intrepretable to MLEs and other users familiar with SDT.\r\n\r\nThis approach generates both types of parameters so that traditional psychometric teams have access to familiar parameters, while servicing SDT parameters for ML teams.\r\n\r\n## Features\r\n\r\n- Support for 3PL and 4PL IRT models\r\n- Signal Detection Theory (SDT) analysis integration\r\n- Parallel parameter estimation for improved performance\r\n- Visualization tools for parameter convergence and item characteristic curves\r\n- Can handle sparse data and/or continuous data\r\n- Applicable to various types of questions and response data, not limited to mathematical problems\r\n\r\n## Installation\r\n\r\nTo install PyIRT_SDT, you can use pip:\r\n\r\n```bash\r\npip install pyirt-sdt\r\n```\r\n\r\nOr clone the repository and install from source:\r\n\r\n```bash\r\ngit clone https://github.com/yourusername/pyirt-sdt.git\r\ncd pyirt-sdt\r\npip install -e .\r\n```\r\n\r\n## Quick Start\r\n\r\nHere's a simple example of how to use PyIRT_SDT:\r\n\r\n```python\r\nimport pandas as pd\r\nfrom pyirt_sdt import returnTable, solve_IRT_for_matrix\r\n\r\n# Load your data\r\ndf = pd.read_csv('your_data.csv')\r\n\r\n# Prepare the data\r\ntable = returnTable(df)\r\n\r\n# Solve the IRT model and perform SDT analysis\r\nresults = solve_IRT_for_matrix(table, FOUR_PL=True)\r\n\r\n# Access the results\r\nprint(results.thetas) # Participant ability estimates\r\nprint(results.est_params) # Item parameter estimates\r\nprint(results.sdt_results) # SDT analysis results\r\n\r\n# Plot parameter convergence\r\nresults.plot_parameter_convergence()\r\n\r\n# Export results to CSV\r\nresults.export_to_csv('irt_sdt_results.csv')\r\n```\r\n\r\n## Data Format\r\n\r\nYour input CSV should have the following columns:\r\n- `participant_id`: Unique identifier for each participant\r\n- `item_id`: Unique identifier for each item or question\r\n- `response`: The participant's response or score for the item\r\n\r\n## Documentation\r\n\r\nFor detailed documentation, please refer to the [docs](docs/) directory or visit our [Read the Docs](https://pyirt-sdt.readthedocs.io/) page.\r\n\r\n## Command-line Interface\r\n\r\nPyIRT_SDT provides a command-line interface for quick analysis:\r\n\r\n```bash\r\npyirt-sdt analyze --input your_data.csv --output results.csv --model 4pl\r\n```\r\n\r\nFor more options, run:\r\n\r\n```bash\r\npyirt-sdt --help\r\n```\r\n\r\n## Contributing\r\n\r\nWe welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for more details.\r\n\r\n## License\r\n\r\nPyIRT_SDT is released under the [MIT License](LICENSE).\r\n\r\n## Contact\r\n\r\nFor any questions or issues, please open an issue on GitHub or contact [your email].\r\n",
"bugtrack_url": null,
"license": null,
"summary": "Python IRT calculator with SDT outputs, too, handling continuous and sparse data.",
"version": "0.1.5",
"project_urls": {
"Homepage": "https://github.com/omarclaflin/pyIRT_SDT"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f05a5f23c5dbd42b43e614e994d36e72b013715448b3b2041e0067ffc0c19254",
"md5": "056e588991d175f4aaeea269977e9eb1",
"sha256": "1eac3d4cb84b38d4d9736e59223e075ad37b6762eca9197729781abb9f783e7f"
},
"downloads": -1,
"filename": "pyirt_sdt-0.1.5.tar.gz",
"has_sig": false,
"md5_digest": "056e588991d175f4aaeea269977e9eb1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 13765,
"upload_time": "2024-09-21T04:11:10",
"upload_time_iso_8601": "2024-09-21T04:11:10.150493Z",
"url": "https://files.pythonhosted.org/packages/f0/5a/5f23c5dbd42b43e614e994d36e72b013715448b3b2041e0067ffc0c19254/pyirt_sdt-0.1.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-21 04:11:10",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "omarclaflin",
"github_project": "pyIRT_SDT",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "pyirt-sdt"
}