knowledge-clustering


- Name: knowledge-clustering
- Version: 0.7.0
- Home page: https://github.com/remimorvan/knowledge-clustering
- Summary: Automated notion clustering for the knowledge LaTeX package
- Upload time: 2024-02-23 22:52:12
- Author: Rémi Morvan
- Requires Python: >=3.9
- Keywords: knowledge :: latex :: clustering

# knowledge-clustering

[![PyPI](https://img.shields.io/pypi/v/knowledge-clustering.svg)](https://pypi.python.org/pypi/knowledge-clustering)

Command-line tool to help with the use of the [knowledge LaTeX package](https://ctan.org/pkg/knowledge).
A tutorial on how to use both `knowledge` and `knowledge-clustering` can be found [here](https://github.com/remimorvan/knowledge-examples).

## Principle

The goal of `knowledge-clustering` is to help the user write a LaTeX document with
the [knowledge package](https://ctan.org/pkg/knowledge).
It has three features:

  - **Clustering**: provide suggestions to the user of what notions should be grouped together.
  - **Add quotes**: find where you might have missed some quotes in your document.
  - **Anchor points**: find where you might have missed anchor points in your document.

The **clustering** algorithm is meant to be used while writing your document, while the last two tools
should be used when your document is (nearly) ready to be published, to check if everything is right.

## Installation

To install (or upgrade) `knowledge-clustering`, make sure you have Python 3.9 or later, then run

    pip3 install --upgrade knowledge-clustering

and then

    knowledge init
    
To check if you have the latest version of `knowledge-clustering`, you can run

    knowledge --version
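
The installed version can also be read from Python, using the package metadata exposed by the standard library. The following is a minimal sketch: the helper name `installed_version` is ours and is not part of `knowledge-clustering`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "knowledge-clustering") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed for this Python interpreter.
        return None


if __name__ == "__main__":
    print(installed_version() or "knowledge-clustering is not installed")
```

Since the lookup is done by the interpreter that runs the script, a result of `None` can also mean that the package was installed for another Python.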

## Clustering notions 

### Syntax

```
Usage: knowledge cluster [OPTIONS]

  Defines, as a comment and in the knowledge files, all the knowledges
  occuring in the file.

Options:
  -k, --knowledge FILE        File containing the knowledges that are already
                              defined. Multiple files are allowed; new
                              knowledges will be written in the last one. If
                              the option is not specified, all .kl file in the
                              current directory (and subdirectory,
                              recursively) will be taken. If there are
                              multiple files, exactly one of them must end
                              with `default.kl`.
  -d, --diagnose FILE         Diagnose file produced by LaTeX. If the option
                              is not specified, the unique .diagnose file in
                              the current directory (and subdirectory,
                              recursively) is taken instead.
  -l, --lang [en|fr]          Language of your TeX document.
  -S, --scope / --no-scope    Print the scopes defined in the knowledge file
                              and print the possible meaning of those scope
                              inferred by knowledge-clustering.
  -N, --no-update / --update  Don't look on PyPI if a newer version of
                              knowledge-clustering is available.
  -c, --config-file TEXT      Specify the configuration file. By default the
                              configuration file in the folder
                              /Users/rmorvan/knowledge-
                              clustering/knowledge_clustering/data
                              corresponding to your language is used.
  --help                      Show this message and exit.
```

### Example

Example files can be found in the `examples/` folder.

While writing a document, you defined some knowledges in a file called `preservation.kl` (distinct
from your main LaTeX file).
You continued writing your LaTeX document (not provided in the `examples/` folder)
for some time, and used some knowledges that were undefined.
When compiling, LaTeX and the [`knowledge` package](https://ctan.org/pkg/knowledge) give you a warning
and write some information explaining what went wrong to a `.diagnose` file. This `.diagnose` file contains
a section called "Undefined knowledges" listing all knowledges used in your main LaTeX file but not
defined in `preservation.kl`. We reproduced this section
in the `preservation.diagnose` file.

![Screenshot of the `preservation.kl` and `preservation.diagnose` files before running knowledge-clustering. `preservation.kl` contains three knowledges, while `preservation.diagnose` contains five undefined knowledges.](img/preservation-before.png "Files `preservation.kl` and `preservation.diagnose` before running knowledge-clustering")

Normally, you would add every undefined knowledge, one after the other, in your
`preservation.kl`. This is quite burdensome and can
largely be automated. This is precisely what `knowledge-clustering` does: after running

    knowledge cluster -k preservation.kl -d preservation.diagnose

your file `preservation.diagnose` is left unchanged
but `preservation.kl` is updated with comments.

The `cluster` command is optional: you can also write `knowledge -k preservation.kl -d preservation.diagnose`.

![After running knowledge-clustering, the five undefined knowledges are included in the `preservation.kl` file as comments.](img/preservation-after.png "Files `preservation.kl` and `preservation.diagnose` after running knowledge-clustering")

Now you simply have to check that the recommendations of `knowledge-clustering` are
correct, and uncomment those lines.

### Autofinder

If the current directory (including its subdirectories, searched recursively) contains
exactly one `.diagnose` file and exactly one `.kl` file,
you can simply write `knowledge cluster` (or `knowledge`): the files will be found automatically.
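
The lookup described above can be sketched as a recursive search that insists on a single match. This is only an illustration of the rule, not the tool's actual code: the function name `find_unique` and the choice of raising `ValueError` are assumptions.

```python
from pathlib import Path
from typing import List


def find_unique(root: Path, suffix: str) -> Path:
    """Return the only file under `root` (searched recursively) with the given suffix."""
    matches: List[Path] = sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())
    if len(matches) != 1:
        # Zero or several candidates: the file cannot be picked automatically.
        raise ValueError(
            f"expected exactly one {suffix} file under {root}, found {len(matches)}"
        )
    return matches[0]
```

For instance, `find_unique(Path("."), ".diagnose")` returns the diagnose file when the current directory tree contains exactly one, and raises otherwise.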

### Multiple knowledge files

If you have **multiple knowledge files**, you can use the `-k` option multiple times.
For instance, you could write:

	knowledge cluster -k 1.kl -k 2.kl -d ordinal.diagnose

Synonyms of knowledges defined in `1.kl` (resp. `2.kl`) will be defined, as comments,
in `1.kl` (resp. `2.kl`). New knowledges will always be added, as comments, to the last
file, which is `2.kl` in the example.

You can also use the autofinder in this case, using `knowledge cluster`
or `knowledge`: if multiple `.kl` files are present in the current directory (and
its recursive subdirectories), exactly one of them must end with `default.kl`—this is
where new knowledges will be put.
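
The rule for choosing where new knowledges go can be sketched as follows. This is a minimal illustration under our own assumptions: the function name `pick_default_kl` is hypothetical, and how the real tool reports a violation of the rule is not documented here, so the sketch simply raises `ValueError`.

```python
from pathlib import Path
from typing import List


def pick_default_kl(root: Path) -> Path:
    """Among the .kl files under `root`, return the one that receives new knowledges."""
    kl_files: List[Path] = sorted(p for p in root.rglob("*.kl") if p.is_file())
    if not kl_files:
        raise ValueError(f"no .kl file found under {root}")
    if len(kl_files) == 1:
        # A single knowledge file needs no special name.
        return kl_files[0]
    # Several files: exactly one of them must end with `default.kl`.
    defaults = [p for p in kl_files if p.name.endswith("default.kl")]
    if len(defaults) != 1:
        raise ValueError(
            f"expected exactly one file ending with default.kl, found {len(defaults)}"
        )
    return defaults[0]
```

With the files `1.kl` and `paper-default.kl` in the same directory, the sketch returns `paper-default.kl`; with `1.kl` and `2.kl` it raises, which mirrors the case where `-k` has to be given explicitly.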

## Adding quotes

/!\ This feature is somewhat experimental.

```
Usage: knowledge addquotes [OPTIONS]

  Finds knowledges defined in the knowledge files that appear in tex file
  without quote symbols. Proposes to add quotes around them.

Options:
  -t, --tex FILE              Your TeX file.  [required]
  -k, --knowledge FILE        File containing the knowledges that are already
                              defined. Multiple files are allowed; new
                              knowledges will be written in the last one. If
                              the option is not specified, all .kl file in the
                              current directory (and subdirectory,
                              recursively) will be taken. If there are
                              multiple files, exactly one of them must end
                              with `default.kl`.
  -p, --print INTEGER         When finding a match, number of lines (preceding
                              the match) that are printed in the prompt to the
                              user.
  -N, --no-update / --update
  --help                      Show this message and exit.
```

After running 

    knowledge addquotes -t mydocument.tex -k knowledges1.kl -k knowledges2.kl

the tool will prompt you to add quotes around defined knowledges,
and to define synonyms of knowledges that occur in your TeX file. For instance, if
"algorithm" is a defined knowledge and "algorithms" occurs in your TeX file, then
it will offer to define "algorithms" as a synonym of the knowledge "algorithm",
and to add a pair of quotes around the string "algorithms" that occurs in your TeX file.

Whenever the algorithm finds a match for a knowledge, it prints the line of
the document where it found the match, and emphasizes the string corresponding to the knowledge.
To also see the lines that precede the match, use the `-p` (or `--print`) option.
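
To give an idea of what such a search involves, here is a small sketch that looks for exact, unquoted occurrences of a knowledge in a TeX string. It is not the algorithm used by `knowledge addquotes`: in particular it does not detect synonyms such as "algorithms", and the function names, as well as our reading of `-p` as a number of extra lines shown before the matching one, are assumptions.

```python
import re
from typing import List, Tuple


def unquoted_matches(tex: str, knowledge: str) -> List[Tuple[int, int]]:
    """Return 1-based (line, column) pairs where `knowledge` appears without quotes."""
    # A match is skipped when it is immediately wrapped in double quotes.
    pattern = re.compile(rf'(?<!")\b{re.escape(knowledge)}\b(?!")')
    hits: List[Tuple[int, int]] = []
    for lineno, line in enumerate(tex.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.start() + 1))
    return hits


def context(tex: str, lineno: int, preceding: int = 0) -> List[str]:
    """Return the matching line together with `preceding` lines before it."""
    lines = tex.splitlines()
    return lines[max(0, lineno - 1 - preceding):lineno]
```

On the text `An "algorithm" is defined.` followed by the line `This algorithm halts.`, only the second line is reported, since the first occurrence is already quoted.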

## Finding missing anchor points

```
Usage: knowledge anchor [OPTIONS]

  Prints warning when a knowledge is introduced but is not preceded by an
  anchor point.

Options:
  -t, --tex FILE              Your TeX file.  [required]
  -s, --space INTEGER         Number of characters tolerated between an anchor
                              point and the introduction of a knowledge.
                              (Default value: 200)
  -N, --no-update / --update
  --help                      Show this message and exit.
```

When one runs

    knowledge anchor -t mydocument.tex

the tool will print the lines of the document containing the
introduction of a knowledge that is not preceded by an anchor point.
The tolerance on how far the anchor point can be from the
introduction of a knowledge can be changed with the `-s` (or `--space`)
option. The default value is 200 characters, as stated in the help text above
(roughly 2-3 lines in a TeX document).
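
The check can be pictured with the following sketch, which flags every introduction that has no anchor point within `space` characters before it. It rests on assumptions that are not stated in this README: anchor points are taken to be written `\AP` and introductions `\intro`, as in the `knowledge` package, whereas the real tool may recognise other notations. The function name `missing_anchors` is ours, and the default of 200 is taken from the help text above.

```python
import re
from typing import List

ANCHOR = re.compile(r"\\AP\b")    # assumed anchor-point macro
INTRO = re.compile(r"\\intro\b")  # assumed introduction macro


def missing_anchors(tex: str, space: int = 200) -> List[int]:
    """Return 1-based line numbers of introductions lacking a nearby anchor point."""
    anchor_ends = [m.end() for m in ANCHOR.finditer(tex)]
    missing: List[int] = []
    for intro in INTRO.finditer(tex):
        start = intro.start()
        # Anchored if some anchor point ends at most `space` characters earlier.
        anchored = any(0 <= start - end <= space for end in anchor_ends)
        if not anchored:
            missing.append(tex.count("\n", 0, start) + 1)
    return missing
```

Raising `space` makes the check more lenient: an introduction that is reported with the default tolerance may be accepted with a larger one.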

## Development using a virtual environment

Using `venv` and the `--editable` option from `pip3` allows for an easy
setup of a development environment that will match a future user install without
the hassle.

For bash and Zsh users

```bash
python3 -m venv kl.venv
source ./kl.venv/bin/activate
pip3 install --editable .
```

For fish users

```fish
python3 -m venv kl.venv
source ./kl.venv/bin/activate.fish
pip3 install --editable .
```

## FAQ

- `knowledge: command not found` after installing `knowledge-clustering`
  > Make sure you have Python>=3.9.
  
- When running `knowledge`, I obtain a long error message indicating "Resource punkt not found."
  > Run `knowledge init`.

- My shell doesn't autocomplete the command `knowledge`.
  > Depending on whether you use `zsh` or `bash`, write
  >
  >     eval "`pip completion --<shellname>`"
  >
  > (where `<shellname>` is either `zsh` or `bash`)
  > in your `.zshrc` (or `.bashrc`) file and then,
  > either launch a new terminal or run `source ~/.zshrc`
  > (or `source ~/.bashrc`).

- `Error: Got unexpected extra argument` when using multiple knowledge files.
  > You should use the option `-k` before **every** knowledge file, like in
  >
  > 	knowledge cluster -k 1.kl -k 2.kl -d blabla.diagnose 

- I've updated `knowledge-clustering` but I still don't have the latest version (which can be checked using `knowledge --version`).
  > This can happen if you have multiple versions of `python` (and multiple versions
  > of `knowledge-clustering`).
  > Type `where python3`, and uninstall `knowledge-clustering`
  > from everywhere (using `<path>/python3 -m pip uninstall knowledge-clustering`)
  > except your main version of python. Then try to upgrade `knowledge-clustering`
  > by running `pip3 install --upgrade knowledge-clustering`.
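
For the last question, the different `python3` executables reachable from your shell can also be listed with a few lines of Python. This is a sketch for POSIX systems, with a hypothetical helper name `executables_on_path`; it only inspects the directories listed in `PATH`.

```python
import os
from pathlib import Path
from typing import List, Optional


def executables_on_path(name: str = "python3", path: Optional[str] = None) -> List[Path]:
    """Return every executable called `name` found on PATH, in lookup order."""
    search_path = os.environ.get("PATH", "") if path is None else path
    found: List[Path] = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        candidate = Path(directory) / name
        # Keep regular, executable files and skip directories listed twice.
        if candidate.is_file() and os.access(candidate, os.X_OK) and candidate not in found:
            found.append(candidate)
    return found


if __name__ == "__main__":
    for interpreter in executables_on_path():
        print(interpreter)
```

The first path printed is the interpreter your shell picks by default; the others are candidates for the `<path>/python3 -m pip uninstall knowledge-clustering` step.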

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/remimorvan/knowledge-clustering",
    "name": "knowledge-clustering",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "",
    "keywords": "knowledge :: latex :: clustering",
    "author": "R\u00e9mi Morvan",
    "author_email": "remi@morvan.xyz",
    "download_url": "https://files.pythonhosted.org/packages/b5/70/fb475fa26ff709034ed1cdd40b84e0c98e1d5fcd6427efe91ebb52f265ba/knowledge-clustering-0.7.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Automated notion clustering for the knowledge LaTeX package",
    "version": "0.7.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/remimorvan/knowledge-clustering/issues",
        "Homepage": "https://github.com/remimorvan/knowledge-clustering"
    },
    "split_keywords": [
        "knowledge",
        "::",
        "latex",
        "::",
        "clustering"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7bf0806480e1bc76c0253721064bb2445360b21771cb6090bf18a9adf2c6560e",
                "md5": "657c8370e5bf62ed3aefe74a678f86fd",
                "sha256": "e0ea8d14c44d40f35a04d73f737d806b46123a43b4ad71774c5aa94110a0c41a"
            },
            "downloads": -1,
            "filename": "knowledge_clustering-0.7.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "657c8370e5bf62ed3aefe74a678f86fd",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 30903,
            "upload_time": "2024-02-23T22:52:10",
            "upload_time_iso_8601": "2024-02-23T22:52:10.524833Z",
            "url": "https://files.pythonhosted.org/packages/7b/f0/806480e1bc76c0253721064bb2445360b21771cb6090bf18a9adf2c6560e/knowledge_clustering-0.7.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b570fb475fa26ff709034ed1cdd40b84e0c98e1d5fcd6427efe91ebb52f265ba",
                "md5": "08cdaa5322dd7b6d265c3c0570f7563f",
                "sha256": "47366c6cb0a1186d4d6023c62731c9d324b60fae4415659bbd454913f632a2c2"
            },
            "downloads": -1,
            "filename": "knowledge-clustering-0.7.0.tar.gz",
            "has_sig": false,
            "md5_digest": "08cdaa5322dd7b6d265c3c0570f7563f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 29593,
            "upload_time": "2024-02-23T22:52:12",
            "upload_time_iso_8601": "2024-02-23T22:52:12.563059Z",
            "url": "https://files.pythonhosted.org/packages/b5/70/fb475fa26ff709034ed1cdd40b84e0c98e1d5fcd6427efe91ebb52f265ba/knowledge-clustering-0.7.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-23 22:52:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "remimorvan",
    "github_project": "knowledge-clustering",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "knowledge-clustering"
}
        