# SlithyT
A tool for generating novel, plausible, and pronounceable words based on linguistic corpora.
The name is a reference to the "slithy toves" in Lewis Carroll's poem "Jabberwocky".
(Code was written substantially by AI, although I did a fair amount of reviewing, criticizing, revising
and debugging.)
## Installation
```bash
pip install .
```
## Usage
Generate a word that looks/sounds like it fits with other words in a given
corpus. Similarity is determined partly by ngram analysis and partly by
pronunciation.
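As a rough illustration of the ngram side of this, the sketch below builds a character-level model from a word list and samples new words from it. It is not SlithyT's actual implementation: the names `build_model` and `generate` and the `^`/`$` boundary markers are assumptions made for this example, and the pronunciation side is left out entirely.

```python
import random
from collections import defaultdict

# Hypothetical boundary markers; assumes corpus words never contain them.
START, END = "^", "$"

def build_model(words, n=3):
    """Map each (n-1)-character context to the characters seen after it."""
    model = defaultdict(list)
    for word in words:
        padded = START * (n - 1) + word.lower() + END
        for i in range(len(padded) - n + 1):
            context = padded[i:i + n - 1]
            model[context].append(padded[i + n - 1])
    return model

def generate(model, n=3, max_length=12, rng=random):
    """Walk the model from the start context until END or max_length."""
    context = START * (n - 1)
    out = []
    while len(out) < max_length:
        nxt = rng.choice(model[context])
        if nxt == END:
            break
        out.append(nxt)
        # Slide the context window forward by one character.
        context = (context + nxt)[1:]
    return "".join(out)

if __name__ == "__main__":
    corpus = ["vega", "rigel", "deneb", "altair", "antares", "aldebaran"]
    model = build_model(corpus, n=3)
    print([generate(model, n=3) for _ in range(5)])
```

With `n=3`, each character depends on the two before it, so every three-character window in the output also occurs somewhere in the corpus. A larger `n` ties the output more tightly to the corpus and leaves fewer paths to follow.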
You can make your own corpus, or use pregenerated ones (in the data folder
of the package):
* Astronomy names (stars, galaxies, planets)
* Transliterated Greek, Latin, Hebrew, Egyptian names
* Harry Potter or Star Wars names
* Drug names
* Latin words from biology taxonomy (genus, species)
You can also use the whole dictionary as your corpus, in which case you will
get words with no particular flavor to them. A good corpus has at least a
couple hundred words in it.
By default, generated words are *novel*, meaning they won't appear in the
corpus you reference. You can also add a blocklist to avoid generating curse
words, words that violate trademarks or spam filters, etc.
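The novelty and blocklist rules can be pictured as a simple post-filter on candidate words. The sketch below is only an assumption about how such a filter might look: `filter_candidates` is a hypothetical helper, not part of SlithyT, and it uses exact, case-insensitive matching, which may differ from what the tool does.

```python
def filter_candidates(candidates, corpus_words, blocklist=frozenset()):
    """Keep candidates that are novel, not blocked, and not repeated."""
    corpus = {w.lower() for w in corpus_words}
    blocked = {w.lower() for w in blocklist}
    seen = set()
    kept = []
    for candidate in candidates:
        word = candidate.lower()
        # Reject corpus words (not novel), blocked words, and duplicates.
        if word in corpus or word in blocked or word in seen:
            continue
        seen.add(word)
        kept.append(word)
    return kept

if __name__ == "__main__":
    print(filter_candidates(
        ["Vega", "vegor", "damn", "vegor", "rigon"],
        corpus_words=["vega", "rigel"],
        blocklist=["damn"],
    ))
```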
All corpora and dictionary/block list files used by this tool are text
files having a single word per line, and can optionally be gzipped.
Sentiment analysis, pronounceability, and rhyming are moderately English-centric,
though they tolerate Romance and Germanic languages a bit as well.
However, they could be made to reflect the sensibilities of other language
communities by running build_phonetic_model.py and build_transcription_model.py
in the package's scripts folder. These generate cached patterns in
~/.slithyt/data.
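The word-list format described above (one word per line, optionally gzipped) is easy to read from your own scripts. A minimal sketch, assuming gzipped files can be recognized by a `.gz` suffix and are UTF-8 encoded; `load_words` is a hypothetical helper, not a function exported by the package:

```python
import gzip
from pathlib import Path

def load_words(path):
    """Read a one-word-per-line file, plain or gzipped, skipping blank lines."""
    path = Path(path)
    # Assumption: compression is signaled by the file extension.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

The same helper would work for corpus, dictionary, and blocklist files, since they share the format.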
```bash
# Generate 10 realistic words that sound like they belong in the corpus. Make
# the words have a length of at least 5 characters.
slithyt generate --corpus path/to/your/corpus.txt --count 10 --min-length 5
# Generate words that have a positive connotation due to sound symbolism
# (see https://en.wikipedia.org/wiki/Sound_symbolism), using n=4 for
# ngram analysis. (The --ngram-size argument is a tradeoff. Default is 3.
# Bigger values make the resonance with the corpus stronger, but also make it
# harder to be creative; it may be impossible to generate words if you go too
# high. Smaller values give the algorithm more freedom in both size and
# character sequence, but the output might sound less like the corpus.)
slithyt generate --corpus path/to/corpus.txt --min-sentiment 0.8 --ngram-size 4
# Generate words that are between 4 and 8 characters long, and that are at
# least moderately pronounceable. (Pronounceability depends partly on the
# speaker's judgment; slithyt uses a simple algorithm to predict scores from
# 0 (hardest) to 1 (easiest), but the corpus may affect how reasonable 0.5 is.
# Typically, the variety of generated word lengths matches the variety of
# word lengths in the corpus. These values constrain output but may make
# generation impossible, if nothing in the corpus is as small or as large as
# what was requested.)
slithyt generate --corpus path/to/corpus.txt --min-length 4 --max-length 8 --min-pronounceability 0.5
# Generate 5 words that rhyme with synergy
slithyt generate --count 5 --rhymes-with synergy
# Report the rhyming analysis for synergy. (Only known words are usable
# as a rhyming template; passing made-up words here will do nothing
# useful.)
slithyt rhyme synergy
# Check to see whether a particular made-up word would pass certain tests.
slithyt validate synerjee
```
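To make the 0-to-1 pronounceability scale in the comments above more concrete, here is a deliberately crude toy scorer. It is not the algorithm slithyt uses; the consonant-cluster heuristic, the 0.25 penalty, and the name `pronounceability` are all assumptions chosen only to show how a `--min-pronounceability` style threshold could act on a score.

```python
import re

VOWELS = "aeiouy"  # Assumption: treat "y" as a vowel for simplicity.

def pronounceability(word):
    """Toy score in [0, 1]: long consonant clusters lower the score."""
    word = word.lower()
    if not word or not any(ch in VOWELS for ch in word):
        return 0.0
    clusters = re.findall(r"[^aeiouy]+", word)
    longest = max((len(c) for c in clusters), default=0)
    # Clusters of up to 2 consonants are free; each extra one costs 0.25.
    return max(0.0, 1.0 - 0.25 * max(0, longest - 2))

if __name__ == "__main__":
    for w in ["banana", "synergy", "strengths", "zzz"]:
        print(w, pronounceability(w))
    # Keep only words at or above a threshold, as a minimum-score flag might.
    print([w for w in ["banana", "strengths"] if pronounceability(w) >= 0.5])
```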
Raw data
{
"_id": null,
"home_page": null,
"name": "slithyt",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "word generation, procedural generation, nlp, linguistics, naming",
"author": null,
"author_email": "Daniel Hardman <daniel.hardman@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/2d/c0/6aa43ae56b766150222ee2cc0e61e5cf0cd782577655b5371104cf4cbdaa/slithyt-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "A tool for generating novel, pronounceable words based on linguistic corpuses.",
"version": "1.0.0",
"project_urls": {
"Bug Tracker": "https://github.com/dhh1128/slithyt/issues",
"Homepage": "https://github.com/dhh1128/slithyt"
},
"split_keywords": [
"word generation",
" procedural generation",
" nlp",
" linguistics",
" naming"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "2ca920c83d2cd1280ea8f368224bf8e5162c14377768e3511d8bced78d6ec587",
"md5": "7c7bcd6326c0804edc71910dfba1e65f",
"sha256": "7ea63d7f048b046d9e88944c4ff2469b78efd6ad4b8d7c437b3492b9f5192e47"
},
"downloads": -1,
"filename": "slithyt-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7c7bcd6326c0804edc71910dfba1e65f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 13276,
"upload_time": "2025-08-09T22:04:55",
"upload_time_iso_8601": "2025-08-09T22:04:55.012097Z",
"url": "https://files.pythonhosted.org/packages/2c/a9/20c83d2cd1280ea8f368224bf8e5162c14377768e3511d8bced78d6ec587/slithyt-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "2dc06aa43ae56b766150222ee2cc0e61e5cf0cd782577655b5371104cf4cbdaa",
"md5": "cd3ad29dc262d7155cfdb2c590397ce5",
"sha256": "0f3255cdc23c6898d9efbeaaebcda601c63e3b018327239474c7fae4a9dceb27"
},
"downloads": -1,
"filename": "slithyt-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "cd3ad29dc262d7155cfdb2c590397ce5",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 15598,
"upload_time": "2025-08-09T22:04:56",
"upload_time_iso_8601": "2025-08-09T22:04:56.309677Z",
"url": "https://files.pythonhosted.org/packages/2d/c0/6aa43ae56b766150222ee2cc0e61e5cf0cd782577655b5371104cf4cbdaa/slithyt-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-09 22:04:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dhh1128",
"github_project": "slithyt",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "slithyt"
}