# MCTS Crossword Generator
This package provides a pure Python implementation for generating crosswords using
Monte Carlo Tree Search (MCTS).
- A good overview of the project can be found in this [blog post](http://schumacher.pythonanywhere.com/homepage/crossword).
- The pip package can be found on [PyPI](https://pypi.org/project/crossword-generator)
![screenshot empty crossword](https://github.com/jonas-schumacher/crossword-generator/raw/main/images/layout_5_12_empty.png)
![screenshot filled crossword](https://github.com/jonas-schumacher/crossword-generator/raw/main/images/layout_5_12_filled.png)
## Quickstart
### A: Install package:
1. Create and activate a virtual environment based on Python >= 3.8
2. Install crossword_generator package:
```
pip install crossword-generator
```
### B: Generate crossword with default settings:
You can generate a crossword without providing any arguments.
This will fill a 4x5 layout without any black squares using words from an English dictionary.
To do so, activate your virtual environment and choose one of the following (equivalent) options:
1. Call application directly:
```
crossword
```
2. Execute package:
```
python -m crossword_generator
```
3. Run the main function in a Python shell or your own script:
```
>>> from crossword_generator import generate_crossword
>>> generate_crossword()
```
For the next examples I assume you are using the first option to interact with the package.
## Examples
- To get started and see which input formats are required, you can download some
[English](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en)
(comma-separated) or
[German](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_de)
(semicolon-separated) sample data.
- Let's assume you have downloaded the sample files into a directory called "crossword_input" inside your working directory.
### A: Use your own layouts
- To use your own layout, set the argument `path_to_layout` to a CSV file on your local machine.
- The CSV file must have an index column and a header row.
- Potential letters are marked with "_" (underscore).
- Black squares are marked with "" (empty).
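As a rough sketch of the format described above, the following snippet builds a small comma-separated layout as text. The function name `layout_to_csv`, the 3x4 grid, and the use of running numbers for the header row and index column are assumptions made for illustration; the sample files linked above remain the authoritative reference.

```python
import csv
import io

# "_" marks a cell that can hold a letter, "" marks a black square.
grid = [
    ["_", "_", "_", ""],
    ["_", "_", "_", "_"],
    ["", "_", "_", "_"],
]


def layout_to_csv(grid):
    """Return the grid as CSV text with a header row and an index column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: empty corner cell followed by the column labels (assumed: 0, 1, 2, ...).
    writer.writerow([""] + list(range(len(grid[0]))))
    # One line per grid row, prefixed with its index (assumed: 0, 1, 2, ...).
    for index, row in enumerate(grid):
        writer.writerow([index] + row)
    return buffer.getvalue()


print(layout_to_csv(grid))
```

Saving the returned text to a file and passing that file as `path_to_layout` should then work the same way as with the downloaded sample layouts.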
Fill an
[empty 5x12 layout](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/layout_5_12_empty.csv):
```
crossword --path_to_layout "crossword_input/layout_5_12_empty.csv"
```
Fill a
[prefilled 5x12 layout](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/layout_5_12_prefilled.csv):
```
crossword --path_to_layout "crossword_input/layout_5_12_prefilled.csv"
```
Fill an entire NYT-style
[15x15 layout](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/layout_15_15_empty.csv):
```
crossword --path_to_layout "crossword_input/layout_15_15_empty.csv"
```
Of course, you can also provide arguments from within your code:
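```
from crossword_generator import generate_crossword

# Same as the CLI call above, but from within Python
generate_crossword(
    path_to_layout="crossword_input/layout_15_15_empty.csv",
)
```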
### B: Use your own words
- To use your own set of words, set the argument `path_to_words` to a CSV file (or a pattern matching several CSV files) on your local machine.
- The CSV file(s) must contain a column named "answer" with the relevant words.
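A minimal sketch of such a word file, assuming a single-column, comma-separated CSV is sufficient: the helper name `words_to_csv` and the example words are made up for illustration, and only the column name "answer" comes from the description above.

```python
import csv
import io


def words_to_csv(words):
    """Return CSV text with a single column named "answer"."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["answer"])  # required column name
    for word in words:
        writer.writerow([word])
    return buffer.getvalue()


print(words_to_csv(["apple", "pear", "plum"]))
```

The resulting text can be written to a file such as `my_words.csv` (a hypothetical name) and passed via `path_to_words`.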
Fill an
[empty 5x12 layout](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/layout_5_12_empty.csv)
with [words from a list](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/sample_words.csv):
```
crossword --path_to_layout "crossword_input/layout_5_12_empty.csv" --path_to_words "crossword_input/sample_words.csv"
```
Again, you can do the same from within your code:
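```
from crossword_generator import generate_crossword

# Same as the CLI call above, but from within Python
generate_crossword(
    path_to_layout="crossword_input/layout_5_12_empty.csv",
    path_to_words="crossword_input/sample_words.csv",
)
```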
### C: Other arguments you might want to play with:
- *num_rows & num_cols [int]*
  - number of rows / columns the layout should have
  - will only be considered if `path_to_layout` is not specified
- *max_num_words [int]*
  - limits the number of words to improve runtime
- *max_mcts_iterations [int]*
  - sets the maximum number of MCTS iterations
  - can be increased to get a better solution or decreased to improve runtime
- *random_seed [int]*
  - change the seed to obtain different filled crosswords
- *output_path [str]*
  - if provided, save the final grid and a summary as CSV files into the provided directory
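The arguments listed above could be combined in one call roughly as follows. The argument names are the ones from the list; every value (grid size, word limit, iteration count, seed, output directory) is an arbitrary choice for illustration, and the wrapper function `run` is hypothetical.

```python
# Keyword arguments taken from the list above; all values are illustrative only.
settings = {
    "num_rows": 5,
    "num_cols": 6,
    "max_num_words": 10_000,
    "max_mcts_iterations": 2_000,
    "random_seed": 42,
    "output_path": "crossword_output",
}


def run(settings):
    # Imported lazily so the settings can be inspected without the package installed.
    from crossword_generator import generate_crossword

    return generate_crossword(**settings)
```

Calling `run(settings)` starts the generation. Since `path_to_layout` is not set here, `num_rows` and `num_cols` determine the layout.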
## Modules
- **optimizer.py**
  - script that contains the main function `generate_crossword()`
- **layout_handler.py**
  - Provides the layout that will later be filled with words
  - `NewLayoutHandler`: creates a new layout from scratch
  - `ExistingLayoutHandler`: reads an existing layout from a CSV file
- **word_handler.py**
  - Provides the words that will later be filled into the layout
  - `DictionaryWordHandler`: gets words from an NLTK corpus
  - `FileWordHandler`: reads words from CSV files
- **state.py**
  - `Entry`: class that represents the current state of one entry of the crossword
  - `CrosswordState`: class that represents the current state of the whole crossword
- **tree_search.py**
  - `TreeNode`: represents one node of the MCTS tree
  - `MCTS`: represents the whole MCTS tree and provides all necessary functionality, such as
    - Selection
    - Expansion
    - Simulation / Rollout
    - Backpropagation
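To illustrate how the four MCTS phases listed above fit together, here is a generic single-player sketch on a toy problem (choose four bits, reward is the fraction of ones). It is not the code from `tree_search.py`: the class `Node`, all function names, the toy reward, and the exploration constant are assumptions chosen for the example.

```python
import math
import random

DEPTH = 4          # toy problem: choose DEPTH bits one after another
ACTIONS = (0, 1)   # reward of a finished state = fraction of bits equal to 1


class Node:
    def __init__(self, state, parent=None):
        self.state = state        # tuple of actions taken so far
        self.parent = parent
        self.children = {}        # action -> child Node
        self.visits = 0
        self.total_reward = 0.0


def is_terminal(state):
    return len(state) == DEPTH


def uct(parent, child, c):
    # Average reward plus an exploration bonus for rarely visited children.
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent.visits) / child.visits)
    return exploit + explore


def select(node, c=1.4):
    # Selection: descend through fully expanded nodes, always taking the best UCT child.
    while not is_terminal(node.state) and len(node.children) == len(ACTIONS):
        parent = node
        node = max(parent.children.values(), key=lambda ch: uct(parent, ch, c))
    return node


def expand(node):
    # Expansion: add one untried child, unless the node is terminal.
    if is_terminal(node.state):
        return node
    action = next(a for a in ACTIONS if a not in node.children)
    child = Node(node.state + (action,), parent=node)
    node.children[action] = child
    return child


def simulate(state, rng):
    # Simulation / rollout: complete the state with random actions and score it.
    while not is_terminal(state):
        state = state + (rng.choice(ACTIONS),)
    return sum(state) / DEPTH


def backpropagate(node, value):
    # Backpropagation: update statistics on the path back to the root.
    while node is not None:
        node.visits += 1
        node.total_reward += value
        node = node.parent


def search(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node(state=())
    for _ in range(iterations):
        leaf = expand(select(root))
        backpropagate(leaf, simulate(leaf.state, rng))
    # Read off the most visited path as the result.
    node, path = root, []
    while node.children:
        action, node = max(node.children.items(), key=lambda item: item[1].visits)
        path.append(action)
    return tuple(path)


print(search())
```

In a crossword setting, a state would correspond to a partially filled grid and an action to placing a word, but how the package models this is not shown here.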
## References & Dependencies
- The MCTS implementation in `tree_search.py` is based on the algorithm provided by [pbsinclair42](https://github.com/pbsinclair42/MCTS),
  which I adapted in several ways:
  - Convert from 2-player to 1-player domain
  - Adjust reward function + exploration term
  - Add additional methods to analyze the game tree
  - Use PEP 8 code style
- Have a look at `pyproject.toml` for a list of all required and optional dependencies
- Python >= 3.8
- Required packages
  - nltk>=3.5
  - pandas>=1.4.0
  - numpy>=1.22.0
  - tqdm>=4.41.0
## Future work
- Add a Python module that creates questions for given answers using NLP techniques
- Add a graphical user interface (GUI)