# knowledge-clustering
[![PyPI](https://img.shields.io/pypi/v/knowledge-clustering.svg)](https://pypi.python.org/pypi/knowledge-clustering)
Command-line tool to help with the use of the [knowledge LaTeX package](https://ctan.org/pkg/knowledge).
A tutorial on how to use both `knowledge` and `knowledge-clustering` can be found [here](https://github.com/remimorvan/knowledge-examples).
## Principle
The goal of `knowledge-clustering` is to help the user write a LaTeX document with
the [knowledge package](https://ctan.org/pkg/knowledge).
It has three features:
- **Clustering**: suggest which notions should be grouped together.
- **Add quotes**: find where you might have missed some quotes in your document.
- **Anchor points**: find where you might have missed anchor points in your document.
The **clustering** algorithm is meant to be used while writing your document, while the last two tools
should be used when your document is (nearly) ready to be published, to check if everything is right.
## Installation
To install (or upgrade) `knowledge-clustering`, you need to have Python 3.9 (or a more recent version), and then run

    python3 -m pip install --upgrade knowledge-clustering

and then

    knowledge init

To check if you have the latest version of `knowledge-clustering`, you can run

    knowledge --version
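If you would rather check the installation from Python than from the shell, the following is a minimal sketch using only the standard library. The helper name `installed_version` is hypothetical and is not part of `knowledge-clustering`; it only reads the metadata that `pip` records for an installed distribution.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "knowledge-clustering") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not provided by knowledge-clustering.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # knowledge-clustering requires Python 3.9 or newer.
    if sys.version_info < (3, 9):
        print("Python 3.9 or newer is required.")
    found = installed_version()
    print(found if found is not None else "knowledge-clustering is not installed")
```

Running this with each `python3` found on your machine is one way to see which interpreter actually has the package installed.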
## Clustering notions
### Syntax
```
Usage: knowledge cluster [OPTIONS]
Defines, as a comment and in the knowledge files, all the knowledges
occuring in the file.
Options:
-k, --knowledge FILE File containing the knowledges that are already
defined. Multiple files are allowed; new
knowledges will be written in the last one. If
the option is not specified, all .kl file in the
current directory (and subdirectory,
recursively) will be taken. If there are
multiple files, exactly one of them must end
with `default.kl`.
-d, --diagnose FILE Diagnose file produced by LaTeX. If the option
is not specified, the unique .diagnose file in
the current directory (and subdirectory,
recursively) is taken instead.
-l, --lang [en|fr] Language of your TeX document.
-S, --scope / --no-scope Print the scopes defined in the knowledge file
and print the possible meaning of those scope
inferred by knowledge-clustering.
-P, --print / --no-print Print all new knowledges.
-N, --no-update / --update Don't look on PyPI if a newer version of
knowledge-clustering is available.
-c, --config-file TEXT Specify the configuration file. By default the
configuration file in the folder
/Users/rmorvan/knowledge-
clustering/knowledge_clustering/data
corresponding to your language is used.
--help Show this message and exit.
```
### Example
Example files can be found in the `examples/` folder.
While writing some document, you have defined some knowledges in a file called `preservation.kl` (distinct
from your main `LaTeX` file).
You continued writing your `LaTeX` document (not provided in the `examples/` folder)
for some time, and used some knowledges that were undefined.
When compiling, `LaTeX` and the [`knowledge` package](https://ctan.org/pkg/knowledge) give you a warning
and write to a `.diagnose` file some information explaining what went wrong. This `.diagnose` file contains
a section called "Undefined knowledges" listing all knowledges used in your main `LaTeX` file but not
defined in `preservation.kl`. We reproduced this section
in the `preservation.diagnose` file.
![Screenshot of the `preservation.kl` and `preservation.diagnose` files before running knowledge-clustering. `preservation.kl` contains three knowledges, while `preservation.diagnose` contains five undefined knowledges.](img/preservation-before.png "Files `preservation.kl` and `preservation.diagnose` before running knowledge-clustering")
Normally, you would add every undefined knowledge, one after the other, in your
`preservation.kl`. This is quite burdensome and can
largely be automated. This is precisely what `knowledge-clustering` does: after running

    knowledge cluster -k preservation.kl -d preservation.diagnose

your file `preservation.diagnose` is left unchanged
but `preservation.kl` is updated with comments.
The `cluster` command is optional: you can also write `knowledge -k preservation.kl -d preservation.diagnose`.
![After running knowledge-clustering, the five undefined knowledges are included in the `preservation.kl` file as comments.](img/preservation-after.png "Files `preservation.kl` and `preservation.diagnose` after running knowledge-clustering")
Now you simply have to check that the recommendations of `knowledge-clustering` are
correct, and uncomment those lines.
### Autofinder
If the current directory (including its subdirectories, recursively) contains
a unique `.diagnose` file and a unique `.kl` file,
you can simply write `knowledge cluster` (or `knowledge`): the files will be found automatically.
### Multiple knowledge files
If you have **multiple knowledge files**, you can use the `-k` option multiple times.
For instance, you could write:

    knowledge cluster -k 1.kl -k 2.kl -d ordinal.diagnose

Synonyms of knowledges defined in `1.kl` (resp. `2.kl`) will be defined, as comments,
in `1.kl` (resp. `2.kl`). New knowledges will always be added, as comments, to the last
file, which is `2.kl` in the example.
You can also use the autofinder in this case, using `knowledge cluster`
or `knowledge`: if multiple `.kl` files are present in the current directory
(including its subdirectories, recursively), exactly one of them must end with `default.kl`;
this is where new knowledges will be put.
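The rule above can be sketched in a few lines of Python. This is not the tool's actual implementation: the function name `find_knowledge_files` is hypothetical, and returning the `default.kl` file last is an assumption that mirrors the `-k` convention (new knowledges go to the last file).

```python
from pathlib import Path
from typing import List


def find_knowledge_files(root: Path) -> List[Path]:
    """Return the .kl files under root, with the file that receives new knowledges last.

    Illustrative sketch only; the real autofinder may behave differently.
    """
    # Search the directory and all of its subdirectories.
    kl_files = sorted(p for p in root.rglob("*.kl") if p.is_file())
    if len(kl_files) <= 1:
        return kl_files
    # With several files, exactly one must end with "default.kl".
    defaults = [p for p in kl_files if p.name.endswith("default.kl")]
    if len(defaults) != 1:
        raise ValueError(
            f"expected exactly one file ending with 'default.kl', found {len(defaults)}"
        )
    others = [p for p in kl_files if p != defaults[0]]
    return others + defaults
```

For example, a project containing `a.kl`, `sub/b.kl` and `sub/main-default.kl` would yield the three paths with `sub/main-default.kl` last, while a project with two files ending in `default.kl` would raise an error.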
## Adding quotes
/!\ This feature is somewhat experimental.
```
Usage: knowledge addquotes [OPTIONS]
Finds knowledges defined in the knowledge files that appear in tex file
without quote symbols. Proposes to add quotes around them.
Options:
-t, --tex FILE Your TeX file. [required]
-k, --knowledge FILE File containing the knowledges that are already
defined. Multiple files are allowed; new
knowledges will be written in the last one. If
the option is not specified, all .kl file in the
current directory (and subdirectory,
recursively) will be taken. If there are
multiple files, exactly one of them must end
with `default.kl`.
-p, --print INTEGER When finding a match, number of lines (preceding
the match) that are printed in the prompt to the
user.
-N, --no-update / --update
--help Show this message and exit.
```
After running

    knowledge addquotes -t mydocument.tex -k knowledges1.kl -k knowledges2.kl

the tool prompts you to add quotes around defined knowledges,
and to define synonyms of knowledges that occur in your TeX file. For instance, if
"algorithm" is a defined knowledge and "algorithms" occurs in your TeX file, then
it will offer to define "algorithms" as a synonym of the knowledge "algorithm",
and to add a pair of quotes around the string "algorithms" that occurs in your TeX file.

Whenever the algorithm finds a match for a knowledge, it prints the line of
the document where it found the match, and emphasizes the string corresponding to the knowledge.
To also print the lines preceding the match, use the `-p` (or `--print`) option
with the number of lines you want.
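To give an idea of what "appearing without quote symbols" means, here is a deliberately simplified sketch. It is not how `knowledge addquotes` is implemented: the function name `unquoted_matches` is hypothetical, it handles a single knowledge, it does not propose synonyms such as plurals, and it assumes quotes are plain `"` characters.

```python
import re
from typing import List, Tuple


def unquoted_matches(text: str, knowledge: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs where `knowledge` occurs without surrounding quotes.

    Simplified illustration; the real command is interactive and more thorough.
    """
    # Whole-word match that is neither preceded nor followed by a double quote.
    pattern = re.compile(r'(?<!["\w])' + re.escape(knowledge) + r'(?!["\w])')
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line))
    return hits
```

On a text whose first line contains `"algorithm"` (quoted), whose second line contains `algorithm` (unquoted) and whose third line contains `algorithms`, only the second line is reported. The plural is skipped here because this sketch matches whole words only, whereas the real tool would suggest it as a synonym.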
## Finding missing anchor points
```
Usage: knowledge anchor [OPTIONS]
Prints warning when a knowledge is introduced but is not preceded by an
anchor point.
Options:
-t, --tex FILE Your TeX file. [required]
-s, --space INTEGER Number of characters tolerated between an anchor
point and the introduction of a knowledge.
(Default value: 200)
-N, --no-update / --update
--help Show this message and exit.
```
When one runs

    knowledge anchor -t mydocument.tex

the tool will print the lines of the document containing the
introduction of a knowledge that is not preceded by an anchor point.
The tolerance on how far away the anchor point can be from the
introduction of a knowledge can be changed with the `-s` (or `--space`)
option. The default value is 200 characters (corresponding to 2-3 lines in a
TeX document).
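The check described above can be approximated as follows. This is a sketch under stated assumptions, not the tool's code: the function name `intros_without_anchor` is hypothetical, anchor points are assumed to be written `\AP` and introductions `\intro`, and other notations that the real command may recognize are ignored. The default of 200 characters follows the help text of `knowledge anchor`.

```python
import re
from typing import List


def intros_without_anchor(tex: str, space: int = 200) -> List[int]:
    """Return line numbers of \\intro commands with no \\AP in the preceding `space` characters.

    Illustrative sketch; assumes the \\AP and \\intro notations only.
    """
    missing = []
    for match in re.finditer(r"\\intro\b", tex):
        # Look only at the `space` characters that precede the introduction.
        window = tex[max(0, match.start() - space):match.start()]
        if "\\AP" not in window:
            # Line numbers are 1-indexed.
            missing.append(tex.count("\n", 0, match.start()) + 1)
    return missing
```

Increasing `space` makes the check more tolerant, in the same way as passing a larger value to `-s`.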
## Devel using virtualenv
Using `venv` and the `--editable` option from `pip` allows for an easy
setup of a development environment that will match a future user install without
the hassle.
For bash and Zsh users
```bash
python3 -m venv kl.venv
source ./kl.venv/bin/activate
python3 -m pip install --editable .
```
For fish users
```fish
python3 -m venv kl.venv
source ./kl.venv/bin/activate.fish
python3 -m pip install --editable .
```
## FAQ
- `knowledge: command not found` after installing `knowledge-clustering`
  > Make sure you have Python>=3.9.

- When running `knowledge`, I obtain a long error message indicating "Resource punkt not found."
  > Run `knowledge init`.

- My shell doesn't autocomplete the command `knowledge`.
  > Depending on whether you use `zsh` or `bash`, write
  >
  >     eval "`pip completion --<shellname>`"
  >
  > (where `<shellname>` is either `zsh` or `bash`)
  > in your `.zshrc` (or `.bashrc`) file and then
  > either launch a new terminal or run `source ~/.zshrc`
  > (or `source ~/.bashrc`).

- `Error: Got unexpected extra argument` when using multiple knowledge files.
  > You should use the option `-k` before **every** knowledge file, as in
  >
  >     knowledge cluster -k 1.kl -k 2.kl -d blabla.diagnose

- I've updated `knowledge-clustering` but I still don't have the latest version (which can be checked using `knowledge --version`).
  This can happen if you have multiple versions of `python` (and multiple versions
  of `knowledge-clustering`).
  > Type `where python3`, and uninstall `knowledge-clustering`
  > from everywhere (using `<path>/python3 -m pip uninstall knowledge-clustering`).
  > Then try to reinstall `knowledge-clustering`
  > by running `python3 -m pip install --upgrade knowledge-clustering`.
Raw data
{
"_id": null,
"home_page": "https://github.com/remimorvan/knowledge-clustering",
"name": "knowledge-clustering",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "knowledge :: latex :: clustering",
"author": "R\u00e9mi Morvan",
"author_email": "remi@morvan.xyz",
"download_url": "https://files.pythonhosted.org/packages/d3/03/f02c7bd37b9c361f66ac20bdd1005f90b4427b7c5fcd26156d402df9996c/knowledge_clustering-0.7.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Automated notion clustering for the knowledge LaTeX package",
"version": "0.7.3",
"project_urls": {
"Bug Tracker": "https://github.com/remimorvan/knowledge-clustering/issues",
"Homepage": "https://github.com/remimorvan/knowledge-clustering"
},
"split_keywords": [
"knowledge",
"::",
"latex",
"::",
"clustering"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ac9bce6462d46a482454c2c1c8fcde2f2ec34e4016e2316e332cc069e3a39f11",
"md5": "9b3731ce6ea8b5774b30cf1edf02ced9",
"sha256": "e8462c2f8ee38d81214c3c789ea07978481ecc7b84ded7ac52d9bab91babd1a6"
},
"downloads": -1,
"filename": "knowledge_clustering-0.7.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9b3731ce6ea8b5774b30cf1edf02ced9",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 31158,
"upload_time": "2024-10-28T10:00:53",
"upload_time_iso_8601": "2024-10-28T10:00:53.783432Z",
"url": "https://files.pythonhosted.org/packages/ac/9b/ce6462d46a482454c2c1c8fcde2f2ec34e4016e2316e332cc069e3a39f11/knowledge_clustering-0.7.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d303f02c7bd37b9c361f66ac20bdd1005f90b4427b7c5fcd26156d402df9996c",
"md5": "64cf2b08cbfdef71ccd133884eec6165",
"sha256": "3c658158f1f3dedfe201617cdf136a09751aec4bf65390dcf03e8cb73811e769"
},
"downloads": -1,
"filename": "knowledge_clustering-0.7.3.tar.gz",
"has_sig": false,
"md5_digest": "64cf2b08cbfdef71ccd133884eec6165",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 29837,
"upload_time": "2024-10-28T10:00:55",
"upload_time_iso_8601": "2024-10-28T10:00:55.180568Z",
"url": "https://files.pythonhosted.org/packages/d3/03/f02c7bd37b9c361f66ac20bdd1005f90b4427b7c5fcd26156d402df9996c/knowledge_clustering-0.7.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-28 10:00:55",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "remimorvan",
"github_project": "knowledge-clustering",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "knowledge-clustering"
}