promptcraft


| Field | Value |
|---|---|
| Name | promptcraft |
| Version | 0.4.8 |
| Home page | https://github.com/SuperBruceJia/promptcraft |
| Summary | PromptCraft: A Prompt Perturbation Toolkit for Prompt Robustness Analysis |
| Upload time | 2024-01-16 22:40:56 |
| Author | Shuyue Jia |
| License | MIT |
| Keywords | prompt perturbation, prompt, prompt toolkit, natural language prompt, large language models, llm, llms, prompt engineering, prompt generator, prompt optimization, prompt robustness, llm robustness, llms adversarial attack, character editing, word manipulation, sentence paraphrasing |
| Requirements | No requirements were recorded. |

# PromptCraft
A Prompt Perturbation Toolkit for Prompt Robustness Analysis

[![Code License](https://img.shields.io/badge/Code%20License-MIT-green.svg)](CODE_LICENSE)
[![Running on CPU](https://img.shields.io/badge/Running%20on-CPU-red.svg)](https://github.com/SuperBruceJia/promptcraft)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/release/python-390/)

# Table of Contents
- [Installation](#Installation)
- [Character Editing](#Character-Editing)
  - [Character Replacement](#Character-Replacement)
  - [Character Deletion](#Character-Deletion)
  - [Character Insertion](#Character-Insertion)
  - [Character Swap](#Character-Swap)
  - [Keyboard Typos](#Keyboard-Typos)
  - [Optical Character Recognition (OCR)](#Optical-Character-Recognition)
- [Word Manipulation](#Word-Manipulation)
  - [Synonym Replacement](#Synonym-Replacement)
  - [Word Insertion](#Word-Insertion)
  - [Word Swap](#Word-Swap)
  - [Word Deletion](#Word-Deletion)
  - [Insert Punctuation](#Insert-Punctuation)
  - [Word Split](#Word-Split)
- [Sentence Paraphrasing](#Sentence-Paraphrasing)
  - [Back Translation based on 🤗 Hugging Face MarianMTModel](#Back-Translation-by-Hugging-Face)
  - [Back Translation based on Google Translator](#Back-Translation-by-Google-Translator)
  - [Paraphrasing](#Paraphrasing)
  - [Formal Style](#Formal-Style)
  - [Casual Style](#Casual-Style)
  - [Passive Style](#Passive-Style)
  - [Active Style](#Active-Style)
- [Parallel Processing](#Parallel-Processing)
- [Structure of the Code](#Structure-of-the-Code)
- [Citation](#Citation)
- [Acknowledgement](#Acknowledgement)

# Installation
```shell
pip install promptcraft
```

# Character Editing
Character-level Prompt Perturbation\
`CharacterPerturb` class for manipulating characters in a sentence
```python
from promptcraft import character

sentence = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
level = 0.25  # Fraction of characters that will be edited (0.25 = 25%)
character_tool = character.CharacterPerturb(sentence=sentence, level=level)
```
## Character Replacement
Randomly replace a fraction `level` of the characters in the sentence
```python
char_replace = character_tool.character_replacement()
```
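
To make the role of `level` concrete, here is a minimal standalone sketch of character replacement. It does not call or reproduce PromptCraft's implementation: the function name `replace_characters`, the replacement alphabet (ASCII letters), and the `seed` parameter are assumptions made for illustration.

```python
import random
import string


def replace_characters(sentence: str, level: float, seed: int = 0) -> str:
    """Overwrite about `level` of the positions with random ASCII letters."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    chars = list(sentence)
    # Number of positions to edit, capped at the sentence length.
    n_edits = min(len(chars), round(len(chars) * level))
    for idx in rng.sample(range(len(chars)), n_edits):
        # The random letter may coincide with the original character.
        chars[idx] = rng.choice(string.ascii_letters)
    return "".join(chars)


example = "Natalia sold clips to 48 of her friends in April."
perturbed = replace_characters(example, level=0.25)
print(perturbed)
```

The output has the same length as the input, and at most `round(len(sentence) * level)` positions differ.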
## Character Deletion
Randomly delete a fraction `level` of the characters from the sentence
```python
char_delete = character_tool.character_deletion()
```
## Character Insertion
Randomly insert new characters into the sentence; the number inserted is a fraction `level` of the sentence length
```python
char_insert = character_tool.character_insertion()
```
## Character Swap
Randomly swap a fraction `level` of the characters in the sentence\
NOTE: a character may be swapped with itself
```python
char_swap = character_tool.character_swap()
```
## Keyboard Typos
Randomly substitute a fraction `level` of the characters in the sentence
with a randomly chosen character that is near the original on the keyboard (US full-size layout)\
NOTE:\
(1) We apply `keyboard_distance=1`, i.e., only the nearest characters, numbers, or symbols are considered.\
(2) If the substitute is a letter, its case (lowercase or uppercase) is chosen at random.
```python
char_keyboard = character_tool.keyboard_typos()
```
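
The idea of a distance-1 keyboard typo can be sketched without the library. The example below is illustrative only: the `NEIGHBORS` table is a small hand-written subset of a US QWERTY layout, and `keyboard_typos_sketch` and its `seed` parameter are hypothetical names, not PromptCraft's API.

```python
import random

# Partial map of adjacent letter keys on a US QWERTY layout
# (illustrative subset; a full table would cover every key).
NEIGHBORS = {
    "a": "qwsz",
    "s": "awedxz",
    "d": "serfcx",
    "e": "wsdr",
    "i": "ujko",
    "o": "iklp",
    "l": "kop",
    "n": "bhjm",
}


def keyboard_typos_sketch(sentence: str, level: float, seed: int = 0) -> str:
    """Swap up to `level` of the characters for an adjacent key."""
    rng = random.Random(seed)
    chars = list(sentence)
    # Only characters present in the map can be perturbed.
    candidates = [i for i, c in enumerate(chars) if c.lower() in NEIGHBORS]
    n_edits = min(len(candidates), round(len(chars) * level))
    for idx in rng.sample(candidates, n_edits):
        neighbor = rng.choice(NEIGHBORS[chars[idx].lower()])
        # As in the note on letter case, choose lower or upper at random.
        chars[idx] = rng.choice([neighbor.lower(), neighbor.upper()])
    return "".join(chars)


source_text = "Natalia sold clips in April."
typo_example = keyboard_typos_sketch(source_text, level=0.25)
print(typo_example)
```

Characters that are missing from the map are left untouched, so the sketch may edit fewer characters than `level` requests.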
## Optical Character Recognition
Randomly substitute a fraction `level` of the characters in the sentence with a common OCR confusion taken from an OCR error map
```python
char_ocr = character_tool.optical_character_recognition()
```
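
As a rough illustration of an OCR error map, the sketch below substitutes visually similar characters. The `OCR_CONFUSIONS` table is a small assumed subset chosen for this example, and `ocr_errors_sketch` is a hypothetical helper rather than the library's method.

```python
import random

# A few visually confusable pairs (assumed subset, for illustration only).
OCR_CONFUSIONS = {
    "0": "O", "O": "0",
    "1": "l", "l": "1",
    "5": "S", "S": "5",
    "8": "B", "B": "8",
}


def ocr_errors_sketch(sentence: str, level: float, seed: int = 0) -> str:
    """Replace up to `level` of the characters with an OCR look-alike."""
    rng = random.Random(seed)
    chars = list(sentence)
    # Positions whose character has a known look-alike.
    candidates = [i for i, c in enumerate(chars) if c in OCR_CONFUSIONS]
    n_edits = min(len(candidates), round(len(chars) * level))
    for idx in rng.sample(candidates, n_edits):
        chars[idx] = OCR_CONFUSIONS[chars[idx]]
    return "".join(chars)


print(ocr_errors_sketch("Natalia sold clips to 48 of her friends.", level=0.25))
```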

# Word Manipulation
Word-level Prompt Perturbation\
`WordPerturb` class for manipulating words in a sentence

NOTE: The word count of a sentence includes only valid words; spaces, special symbols, and punctuation are not counted.
```python
from promptcraft import word

sentence = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
level = 0.25  # Fraction of words that will be manipulated (0.25 = 25%)
word_tool = word.WordPerturb(sentence=sentence, level=level)
```
## Synonym Replacement
Randomly choose $n$ words from the sentence that are not stop words.\
Replace each of these words with one of its synonyms chosen at random.\
Two edge cases can arise:\
Problem 1: A chosen word has no synonyms.\
Problem 2: The sentence has fewer eligible positions than the number requested.
```python
word_synonym = word_tool.synonym_replacement()
```
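
The two edge cases noted above are easier to see in code. The following is a minimal sketch under stated assumptions: the tiny `STOP_WORDS` set and `SYNONYMS` table are invented for the example (a real toolkit would use a lexical resource), and `replace_synonyms_sketch` is a hypothetical name. How PromptCraft derives $n$ from `level` is not shown here.

```python
import random

# Toy resources, assumed for illustration only.
STOP_WORDS = {"to", "of", "her", "in", "and", "then", "she", "as"}
SYNONYMS = {
    "sold": ["vended", "traded"],
    "friends": ["pals", "companions"],
    "half": ["one-half"],
}


def replace_synonyms_sketch(sentence: str, n: int, seed: int = 0) -> str:
    """Replace up to `n` non-stop words with a random synonym."""
    rng = random.Random(seed)
    words = sentence.split()
    # Problem 1: skip words without synonyms by filtering them out up front.
    eligible = [
        i for i, w in enumerate(words)
        if w.lower() not in STOP_WORDS and w.lower() in SYNONYMS
    ]
    # Problem 2: if fewer positions qualify than requested, replace them all.
    for idx in rng.sample(eligible, min(n, len(eligible))):
        words[idx] = rng.choice(SYNONYMS[words[idx].lower()])
    return " ".join(words)


print(replace_synonyms_sketch("she sold half to her friends", n=2))
```

Filtering first and then capping the sample size is one simple way to handle both cases; other policies (for example, resampling until $n$ replacements succeed) are equally plausible.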
## Word Insertion
Find a random synonym of a random word in the sentence that is not a stop word.\
Insert that synonym into a random position in the sentence.\
Do this $n$ times.
```python
word_insert = word_tool.word_insertion()
```
## Word Swap
Randomly choose two words in the sentence and swap their positions.\
Do this $n$ times.
```python
word_swap = word_tool.word_swap()
```
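
A minimal standalone sketch of the swap procedure follows. It tokenizes on whitespace, so punctuation stays attached to its word, which is a simplification; `swap_words_sketch` and `seed` are hypothetical names, not the library's API.

```python
import random


def swap_words_sketch(sentence: str, n: int, seed: int = 0) -> str:
    """Swap two randomly chosen words, repeated `n` times."""
    rng = random.Random(seed)
    words = sentence.split()
    if len(words) < 2:
        return sentence  # nothing to swap
    for _ in range(n):
        i, j = rng.sample(range(len(words)), 2)  # two distinct positions
        words[i], words[j] = words[j], words[i]
    return " ".join(words)


print(swap_words_sketch("Natalia sold clips to 48 of her friends", n=2))
```

Because each step only exchanges two positions, the output always contains exactly the same words as the input.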
## Word Deletion
Each word in the sentence can be randomly removed with probability $p$.
```python
word_delete = word_tool.word_deletion()
```
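
Deletion with probability $p$ amounts to one independent random draw per word. The sketch below illustrates this; the fallback that keeps one word when everything would be deleted is an assumption of this example, and `delete_words_sketch` is a hypothetical name.

```python
import random


def delete_words_sketch(sentence: str, p: float, seed: int = 0) -> str:
    """Drop each whitespace-separated word independently with probability `p`."""
    rng = random.Random(seed)
    words = sentence.split()
    # A word survives when its draw from [0, 1) is at least p.
    kept = [w for w in words if rng.random() >= p]
    # Assumption: never return an empty sentence; keep one random word instead.
    if not kept and words:
        kept = [rng.choice(words)]
    return " ".join(kept)


print(delete_words_sketch("Natalia sold clips to 48 of her friends", p=0.25))
```

Unlike the character-level methods, the number of deleted words varies from run to run; only its expected value is $p$ times the word count.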
## Insert Punctuation
Randomly insert punctuation in the sentence with probability $p$.
```python
word_punctuation = word_tool.insert_punctuation()
```
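
One plausible reading of this step is to walk through the words and, with probability $p$, place a random punctuation mark before each one. The sketch below follows that reading; the `PUNCTUATION` list, the choice to insert before (rather than after) a word, and the name `insert_punctuation_sketch` are all assumptions of the example.

```python
import random

# Marks to draw from (assumed set, for illustration only).
PUNCTUATION = [".", ",", "!", "?", ";", ":"]


def insert_punctuation_sketch(sentence: str, p: float, seed: int = 0) -> str:
    """Before each word, insert a random punctuation token with probability `p`."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        if rng.random() < p:
            out.append(rng.choice(PUNCTUATION))
        out.append(word)
    return " ".join(out)


print(insert_punctuation_sketch("she sold half as many clips in May", p=0.3))
```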
## Word Split
Randomly choose a word and split it into two tokens at a random position
```python
word_split = word_tool.word_split()
```
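
A minimal sketch of a single word split is shown below. It splits exactly one word per call, which is an assumption of the example (how many words the library splits for a given `level` is not stated here), and `split_word_sketch` is a hypothetical name.

```python
import random


def split_word_sketch(sentence: str, seed: int = 0) -> str:
    """Pick one word of length >= 2 and split it into two tokens."""
    rng = random.Random(seed)
    words = sentence.split()
    candidates = [i for i, w in enumerate(words) if len(w) >= 2]
    if not candidates:
        return sentence  # no word is long enough to split
    idx = rng.choice(candidates)
    cut = rng.randint(1, len(words[idx]) - 1)  # both halves are non-empty
    words[idx] = words[idx][:cut] + " " + words[idx][cut:]
    return " ".join(words)


print(split_word_sketch("Natalia sold clips in April"))
```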

# Sentence Paraphrasing
Sentence-level Prompt Perturbation\
`SentencePerturb` class for directly manipulating a sentence
```python
from promptcraft import sentence

sen = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
sentence_tool = sentence.SentencePerturb(sentence=sen)
```
## Back Translation by Hugging Face
Back-translate the sentence (English $\rightarrow$ German $\rightarrow$ English) via the 🤗 Hugging Face `MarianMTModel`
```python
back_trans_hf = sentence_tool.back_translation_hugging_face()
```
## Back Translation by Google Translator
Back-translate the sentence (English $\rightarrow$ German $\rightarrow$ English) via the Google Translate API
```python
back_trans_google = sentence_tool.back_translation_google()
```
## Paraphrasing
Paraphrase the sentence via [Parrot Paraphraser](https://github.com/PrithivirajDamodaran/Parrot_Paraphraser),
considering\
(1) **Adequacy**: Is the meaning preserved adequately?\
(2) **Fluency**: Is the paraphrase fluent English?\
(3) **Diversity** (Lexical / Phrasal / Syntactical): How much has the paraphrase changed the original sentence?
```python
sen_paraphrase = sentence_tool.paraphrase()
```
## Formal Style
Transform the sentence style to Formal
```python
sen_formal = sentence_tool.formal()
```
## Casual Style
Transform the sentence style to Casual
```python
sen_casual = sentence_tool.casual()
```
## Passive Style
Transform the sentence style to Passive
```python
sen_passive = sentence_tool.passive()
```
## Active Style
Transform the sentence style to Active
```python
sen_active = sentence_tool.active()
```

# Parallel Processing
Since all the methods are executed on the CPU, 
they can be performed in parallel using the `multiprocessing` package.
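
A minimal sketch of this pattern with `multiprocessing.Pool` is given below. To stay self-contained it uses a stand-in perturbation (random word deletion) instead of importing PromptCraft; the worker name `perturb_one`, the `(seed, sentence)` task format, and the fixed deletion probability are assumptions of the example.

```python
import random
from multiprocessing import Pool


def perturb_one(task):
    """Worker: delete each word with probability 0.25 (stand-in perturbation)."""
    seed, sentence = task
    rng = random.Random(seed)  # per-task seed keeps results reproducible
    words = sentence.split()
    kept = [w for w in words if rng.random() >= 0.25]
    return " ".join(kept) if kept else sentence


if __name__ == "__main__":
    sentences = [
        "Natalia sold clips to 48 of her friends in April.",
        "Then she sold half as many clips in May.",
    ]
    tasks = list(enumerate(sentences))  # (seed, sentence) pairs
    with Pool(processes=2) as pool:
        results = pool.map(perturb_one, tasks)
    for original, perturbed in zip(sentences, results):
        print(original, "->", perturbed)
```

In practice the worker body would build a tool object for its sentence, for example `word.WordPerturb(sentence=sentence, level=level)`, and call one of the methods described in the sections above. The worker must be a module-level function so that it can be pickled, and the `if __name__ == "__main__":` guard is required on platforms that start worker processes by spawning.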

# Structure of the Code
At the root of the project, you will see:
```text
.
├── LICENSE
├── README.md
├── promptcraft
│   ├── __init__.py
│   ├── character.py
│   ├── parrot.py
│   ├── sentence.py
│   ├── styleformer.py
│   └── word.py
├── setup.cfg
└── setup.py
```

# Citation
If you find our toolkit useful, please consider citing our repo and toolkit in your publications. BibTeX entries are provided below.
```bibtex
@misc{JiaPromptCraft23,
      author = {Jia, Shuyue},
      title = {{PromptCraft}: A Prompt Perturbation Toolkit},
      year = {2023},
      publisher = {GitHub},
      journal = {GitHub Repository},
      howpublished = {\url{https://github.com/SuperBruceJia/promptcraft}},
}

@misc{JiaAwesomeLLM23,
      author = {Jia, Shuyue},
      title = {Awesome {LLM} Self-Consistency},
      year = {2023},
      publisher = {GitHub},
      journal = {GitHub Repository},
      howpublished = {\url{https://github.com/SuperBruceJia/Awesome-LLM-Self-Consistency}},
}

@misc{JiaAwesomeSTS23,
      author = {Jia, Shuyue},
      title = {Awesome Semantic Textual Similarity},
      year = {2023},
      publisher = {GitHub},
      journal = {GitHub Repository},
      howpublished = {\url{https://github.com/SuperBruceJia/Awesome-Semantic-Textual-Similarity}},
}
```

# Acknowledgement
This work was finished during my 2023 fall semester research rotation
at the Department of Electrical and Computer Engineering, Boston University.

<a href="https://www.bu.edu/"> <img width="250" src="https://raw.githubusercontent.com/SuperBruceJia/promptcraft/main/bu.png"></a>

            
