# You name it
> A hash function that outputs pseudo-random words.
Intended for use in machine-to-human interfaces.
Can convert any data into words that:
* are readable
* are memorable
* are rigorously reproducible across multiple platforms
* have a parameterized length (approximately 1-5 syllables)
* come in the style of several languages (e.g. English, Finnish, German)
* don't mean anything, although a real word may very rarely be generated by chance
Not suitable for any security applications.
When two data sets produce different results, they are certainly different, and different data
sets produce different results with a very high probability. That probability is still not high
enough to rely on for security.
If security matters, please use one of the modern hash functions, e.g. SHA-512.
### Mitigating the collision risk
There is a non-zero risk that a hash collision happens, i.e. two different datasets result in the
same word.
The reliability can be increased, e.g. by concatenating the `younameit` result for the data itself
with the `younameit` result for the output of a different hash function, e.g. `SHA-512` of the data, like:
```python
from hashlib import sha512
from younameit import Nomenclator
data = b"Any data"
data_hash = sha512(data).hexdigest()
nomen = Nomenclator("finnish")
first_word = nomen.from_any_to_word(data)
confirmation_word = nomen.from_any_to_word(data_hash)
readable_id = f"{first_word}-{confirmation_word}"
```
The generated `readable_id` in the code above is `homäen-kyyskionpa` for the provided input `data`.
With such a scheme, the risk of a collision between results is very low.
Even if a hash collision happens for the SHA-256 used for the `first_word`, it's very unlikely that it will also occur
for the different hash algorithm used for the `confirmation_word`, here `SHA-512`.
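To get a feeling for how much a second word helps, the birthday-bound approximation can be used. The sketch below is only illustrative: the numbers of distinct words (`single_word_space`, `confirmation_space`) are hypothetical values, not measured properties of `younameit`, and it assumes both words are uniformly distributed and independent of each other.

```python
import math


def collision_probability(n_items: int, space_size: float) -> float:
    """Birthday-bound approximation of the chance that at least
    two of n_items map to the same value in a space of space_size."""
    exponent = n_items * (n_items - 1) / (2.0 * space_size)
    # -expm1(-x) equals 1 - exp(-x), but stays accurate for tiny x.
    return -math.expm1(-exponent)


# Hypothetical numbers of distinct words, chosen only for illustration.
single_word_space = 1e6
confirmation_space = 1e6

n = 1000  # number of different data sets being named
p_single = collision_probability(n, single_word_space)
# Assumption: independent words, so the combined space is the product.
p_pair = collision_probability(n, single_word_space * confirmation_space)

print(f"one word:  {p_single:.3g}")
print(f"two words: {p_pair:.3g}")
```

Under these assumptions the two-word identifier collides far less often than a single word, but the estimate says nothing about deliberate attacks.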
However, do not use it if the security of data is a concern:
### Liability warning
> THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> SOFTWARE.
### Brief explanation
A hash function is any function that can be used to map data of arbitrary size to fixed size values.
Whether you feed it a single byte or a 10MB chunk of data, the output of a hash function is
relatively short.
For example, you can use the `SHA-256` algorithm to shorten a sentence:
```bash
$ echo "give me a short name for this statement" | sha256sum
4dd4b6086d8700df41a41c401d650a1366273296b1b5665a5c89d848ae625cee -
```
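The same kind of digest can be computed from Python with the standard `hashlib` module. This is a small sketch that is independent of `younameit`; note that `echo` appends a newline, so the newline is included here to hash the same bytes as the shell command.

```python
import hashlib

sentence = "give me a short name for this statement"

# `echo` appends a trailing newline, so include it to hash the same bytes.
digest = hashlib.sha256((sentence + "\n").encode("utf-8")).hexdigest()
print(digest)

# The hex digest has the same length for a short sentence and for 1 MB of data.
big_digest = hashlib.sha256(b"x" * 1_000_000).hexdigest()
print(len(digest), len(big_digest))
```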
The result in the case of SHA-256, called a `hexdigest`, has a fixed length, but it's almost
impossible for a human being to remember.
Can you imagine ever saying:
> Yes, I've seen this hash before.
Probably not. Hex digests are extremely difficult for the human brain to remember.
`younameit`, by contrast, is a Python package that translates data into a pseudo-random word.
```bash
$ echo "give me a short name for this statement" | younameit
dore
```
So `younameit` converted the sentence into a single word, `dore`. Its results are readable and
memorable words that mean essentially nothing.
Our brains are much better at remembering words, even the weirdest ones.
As soon as the input data changes, the resulting word is also different:
```bash
$ echo "give me another short name for this statement" | younameit
daden
```
And then it's very easy for us to notice that `dore` is different from `daden`.
### Any serializable object as an input
The tool will convert any serializable object into a word. It can be a JSON or YAML file (note that
dictionary order matters), a large text file or a serialized binary message.
When run multiple times with the same data, it will return the same name every time, on any
contemporary Python interpreter, on different machines. You can expect a strict reproduction of the
translation results, as long as the serialization of the data into bytes is reproducible (note that
the order of the elements changes the result).
This package will take the object you provide, convert it to bytes, and then feed it to the `sha256`
algorithm.
The input data can be anything complex that can be converted to a `bytes` object.
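Because the order of elements changes the bytes, and therefore the word, it can help to serialize dictionaries in a canonical way before hashing. The following is a minimal sketch using only the standard `json` module; `canonical_bytes` is a hypothetical helper written for this example, not part of `younameit`.

```python
import json


def canonical_bytes(obj) -> bytes:
    """Serialize a JSON-compatible object to bytes, independent of key order."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


a = {"name": "sensor-1", "port": 8080}
b = {"port": 8080, "name": "sensor-1"}

# Plain dumps keeps insertion order, so the two serializations differ.
print(json.dumps(a) == json.dumps(b))
# With sorted keys both dictionaries serialize to identical bytes.
print(canonical_bytes(a) == canonical_bytes(b))
```

The resulting `bytes` object could then be passed to `Nomenclator.from_any_to_word`, so that two dictionaries with the same content get the same name regardless of key order.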
## Installation
It's a standard PyPI package:
```bash
pip install younameit
```
## Parametrization
### Language
You can select one of several languages:
- american-english
- british-english
- finnish
- french
- german
- italian
- spanish
The hashing results for American and British English are quite similar.
### Number of groups and parity
The algorithm composes words from alternating groups of vowels and consonants.
Each group can contain one or more letters of a given type.
If the first group in a word is a consonant group, the word's parity is called `even`.
Otherwise, when the word starts with a group of vowels, the parity is called `odd`.
The available number of groups may vary, but as of October 2024 it is between 2 and 8.
During word generation, you can specify the number of groups and the parity. Otherwise, they will be
chosen pseudo-randomly according to the probability of their occurrence in the selected language.
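The notions of groups and parity can be illustrated by splitting a finished word back into its vowel and consonant runs. This is only a sketch: `split_groups`, `parity` and the `VOWELS` set are hypothetical helpers written for this example, and the package may classify letters differently.

```python
from itertools import groupby

# Assumption: an illustrative vowel set; the package may use another one.
VOWELS = set("aeiouyäö")


def split_groups(word: str) -> list[str]:
    """Split a word into maximal runs of vowels or consonants."""
    runs = groupby(word.lower(), key=lambda ch: ch in VOWELS)
    return ["".join(run) for _, run in runs]


def parity(word: str) -> str:
    """Return 'even' for a leading consonant group, 'odd' for a leading vowel group."""
    return "odd" if word[0].lower() in VOWELS else "even"


for word in ("id", "wag", "tappents", "enasiar"):
    groups = split_groups(word)
    print(word, groups, len(groups), parity(word))
```

For instance, `wag` splits into the three groups `w`, `a`, `g` and has even parity, while `enasiar` starts with a vowel group and therefore has odd parity.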
## Usage
This Python package provides an importable Python library and a command-line entry point.
### In a shell
```bash
# take data from stdin:
$ echo "anything you can imagine" | younameit
# list available languages:
$ younameit --list-languages
# assign the result to a variable:
$ READABLE_ID=$(echo "anything you can imagine" | younameit)
# read the data from text file:
$ READABLE_ID=$(younameit -f ./path/to/the/file.txt)
# read the data from binary file:
$ READABLE_ID=$(younameit -b ./path/to/the/file.bin)
# output a Finnish-like word:
$ READABLE_ID=$(younameit -f ./path/to/the/file.txt -l finnish)
# set the allowed numbers of groups (here 5, 6 or 7) and odd word parity:
$ READABLE_ID=$(younameit -f ./path/to/the/file.txt -g 5,6,7 -p odd)
```
### In python
```py
from younameit import Nomenclator
# create a hashing object
nomen = Nomenclator("american-english")
# convert the word "one" with default settings
assert nomen.from_any_to_word("one") == "tappents"
# convert the word "one" to a word with two groups
assert nomen.from_any_to_word("one", 2) == "id"
# convert the word "one" to a word with three or four groups
assert nomen.from_any_to_word("one", 3, 4) == "wag"
# convert the word "one" to a word of odd parity
assert nomen.from_any_to_word("one", parity="odd") == "enasiar"
```
Raw data
{
"_id": null,
"home_page": "https://gitlab.com/kamichal/younameit",
"name": "younameit",
"maintainer": "Micha\u0142 Kaczmarczyk",
"docs_url": null,
"requires_python": null,
"maintainer_email": "michal.s.kaczmarczyk@gmail.com",
"keywords": "random word generator hash translator naming labeler",
"author": "Micha\u0142 Kaczmarczyk",
"author_email": "michal.s.kaczmarczyk@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/bb/24/45930e5a2150eb947a35605483f0ec15103e15fa627e8f3458153024a55b/younameit-0.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Custom MIT license",
"summary": "Pseudo random word generator",
"version": "0.1.2",
"project_urls": {
"Homepage": "https://gitlab.com/kamichal/younameit"
},
"split_keywords": [
"random",
"word",
"generator",
"hash",
"translator",
"naming",
"labeler"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "bd504e4cde7a0e75eb9cded0ee7b819392048b208c239f8156f191a4b8a08639",
"md5": "b73b5584683d8ae29adcc9e5168a333b",
"sha256": "af5b8d11eb76a6479f15d3d255cbbe420ac93cc4ff8e57505ce43d9c9742b186"
},
"downloads": -1,
"filename": "younameit-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b73b5584683d8ae29adcc9e5168a333b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 88485,
"upload_time": "2024-10-18T09:00:53",
"upload_time_iso_8601": "2024-10-18T09:00:53.341946Z",
"url": "https://files.pythonhosted.org/packages/bd/50/4e4cde7a0e75eb9cded0ee7b819392048b208c239f8156f191a4b8a08639/younameit-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "bb2445930e5a2150eb947a35605483f0ec15103e15fa627e8f3458153024a55b",
"md5": "4c94e5098b94672fd3a25ddd7af1b77b",
"sha256": "3552faee8d851b196440f32fc76992c5ff80efe51c89effca8e87714416ff872"
},
"downloads": -1,
"filename": "younameit-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "4c94e5098b94672fd3a25ddd7af1b77b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 82492,
"upload_time": "2024-10-18T09:00:54",
"upload_time_iso_8601": "2024-10-18T09:00:54.480377Z",
"url": "https://files.pythonhosted.org/packages/bb/24/45930e5a2150eb947a35605483f0ec15103e15fa627e8f3458153024a55b/younameit-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-18 09:00:54",
"github": false,
"gitlab": true,
"bitbucket": false,
"codeberg": false,
"gitlab_user": "kamichal",
"gitlab_project": "younameit",
"lcname": "younameit"
}