crossword-generator


| Field | Value |
|---|---|
| Name | crossword-generator |
| Version | 0.2.2 |
| Summary | Generate crosswords using Monte Carlo Tree Search (MCTS) |
| Upload time | 2024-03-04 07:31:34 |
| Requires Python | >=3.8 |
| Keywords | crossword, generator, creator, mcts, monte carlo tree search |
| Author | Jonas Schumacher <jonas.schumacher@tu-dortmund.de> |
| Repository | https://github.com/jonas-schumacher/crossword-generator |

# MCTS Crossword Generator

This package provides a pure Python implementation for generating crosswords using
Monte Carlo Tree Search (MCTS).
- A good overview of the project can be found in this [blog post](http://schumacher.pythonanywhere.com/homepage/crossword).
- The pip package can be found on [PyPI](https://pypi.org/project/crossword-generator)

![screenshot empty crossword](https://github.com/jonas-schumacher/crossword-generator/raw/main/images/layout_5_12_empty.png)
![screenshot filled crossword](https://github.com/jonas-schumacher/crossword-generator/raw/main/images/layout_5_12_filled.png)

## Quickstart

### A: Install package:
1. Create and activate a virtual environment based on Python >= 3.8.
2. Install the `crossword-generator` package:
```
pip install crossword-generator
```

### B: Generate crossword with default settings:
You can generate a crossword without providing any arguments.
This will fill a 4x5 layout without any black squares using words from an English dictionary. 

To do so, activate your virtual environment and choose one of the following (equivalent) options:
1. Call application directly:
```
crossword
```
2. Execute package: 
```
python -m crossword_generator
```
3. Run the main function in a Python shell or your own script:
```
>>> from crossword_generator import generate_crossword
>>> generate_crossword()
```

For the next examples I assume you are using the first option to interact with the package.

## Examples
- To get started and see which input formats are required, you can download some
[English](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en)
(comma-separated) or
[German](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_de)
(semicolon-separated) sample data.
- Let's assume you have downloaded the sample files into a directory called "crossword_input" inside your working directory.

### A: Use your own layouts

- To use your own layout, set the argument
`path_to_layout` to a CSV file on your local machine.
- The CSV file must have an index column and a header row.
- Cells that can hold a letter are marked with "_" (underscore).
- Black squares are marked with "" (empty).
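
The rules above can be turned into a small helper that writes a layout file. This is a minimal sketch, not part of the package: `write_layout` and `grid` are hypothetical names, and the column labels (0, 1, 2, ...) and the comma delimiter are assumptions, so compare the result with the downloaded sample layouts before relying on it.

```python
import csv

# Hypothetical 3x4 grid: "_" marks a cell to fill, "" marks a black square.
grid = [
    ["_", "_", "_", ""],
    ["_", "_", "_", "_"],
    ["", "_", "_", "_"],
]


def write_layout(path, grid):
    """Write the grid with a header row and a leading index column."""
    num_cols = len(grid[0])
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Header row: empty corner cell, then column labels (assumed 0..n-1).
        writer.writerow([""] + list(range(num_cols)))
        for row_index, row in enumerate(grid):
            # Index column first, then the cells of that row.
            writer.writerow([row_index] + row)
```

Calling `write_layout("crossword_input/my_layout.csv", grid)` would produce a file that can then be passed via `path_to_layout`, provided the assumed labels match what the package expects.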

Fill an 
[empty 5x12 layout](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/layout_5_12_empty.csv):
```
crossword --path_to_layout "crossword_input/layout_5_12_empty.csv"
```

Fill a 
[prefilled 5x12 layout](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/layout_5_12_prefilled.csv):
```
crossword --path_to_layout "crossword_input/layout_5_12_prefilled.csv"
```

Fill an entire NYT-style
[15x15 layout](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/layout_15_15_empty.csv):

```
crossword --path_to_layout "crossword_input/layout_15_15_empty.csv"
```

Of course, you can also provide arguments from within your code:
```
from crossword_generator import generate_crossword

generate_crossword(
  path_to_layout="crossword_input/layout_15_15_empty.csv",
)
```

### B: Use your own words

- To use your own set of words, set the argument
`path_to_words` to a CSV file (or a pattern matching several CSV files) on your local machine.
- The CSV file(s) must contain a column named "answer" with the relevant words.
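
A word file that satisfies this requirement can be written with the standard library. This is a minimal sketch: `write_words` and the example words are hypothetical, and only the "answer" column name comes from the description above. Whether other columns or a particular letter case are expected is not stated here, so check the sample word list.

```python
import csv

# Hypothetical word list used only for illustration.
words = ["apple", "river", "stone", "eagle"]


def write_words(path, words):
    """Write one word per row under an "answer" header."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["answer"])
        writer.writeheader()
        for word in words:
            writer.writerow({"answer": word})
```

The resulting file could then be passed via `path_to_words`, for example after `write_words("crossword_input/my_words.csv", words)`.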

Fill an 
[empty 5x12 layout](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/layout_5_12_empty.csv)
with [words from a list](https://github.com/jonas-schumacher/crossword-generator/tree/main/sample_input_en/sample_words.csv):
```
crossword --path_to_layout "crossword_input/layout_5_12_empty.csv" --path_to_words "crossword_input/sample_words.csv"
```

Again, you can do the same from within your code:
```
from crossword_generator import generate_crossword

generate_crossword(
  path_to_layout="crossword_input/layout_5_12_empty.csv",
  path_to_words="crossword_input/sample_words.csv",
)
```


### C: Other arguments you might want to play with:
- *num_rows & num_cols [int]*
    - number of rows / columns the layout should have
    - will only be considered if `path_to_layout` is not specified
- *max_num_words [int]*
    - limits the number of words to improve runtime
- *max_mcts_iterations [int]*
    - sets the maximum number of MCTS iterations
    - can be increased to get a better solution or decreased to improve runtime
- *random_seed [int]*
    - change the seed to obtain different filled crosswords
- *output_path [str]*
    - if provided, save the final grid and a summary as CSV files into the provided directory
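
These arguments can be combined in a single call. The sketch below uses only the keyword names listed in this section; the values are arbitrary illustrations rather than recommended settings, and `run_with_settings` is a hypothetical helper, not part of the package. It creates the output directory first because it is not stated here whether the directory must already exist.

```python
import os

# Keyword names come from the argument list above; values are illustrative.
settings = dict(
    num_rows=5,                # only used when path_to_layout is not given
    num_cols=6,
    max_num_words=10_000,      # smaller word pool, shorter runtime
    max_mcts_iterations=500,   # fewer iterations, shorter runtime
    random_seed=1,             # change to obtain a different crossword
    output_path="crossword_output",
)


def run_with_settings(settings):
    """Hypothetical helper: create the output folder and start the generator."""
    # Imported here so the settings can be inspected without the package.
    from crossword_generator import generate_crossword

    os.makedirs(settings["output_path"], exist_ok=True)
    generate_crossword(**settings)
```

With the package installed, `run_with_settings(settings)` starts a run with these values.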

## Modules
- **optimizer.py**
  - Script that contains the main function `generate_crossword()`
- **layout_handler.py**
  - Provides the layout that will later be filled with words
  - `NewLayoutHandler`: creates a new layout from scratch 
  - `ExistingLayoutHandler`: reads an existing layout from a CSV file
- **word_handler.py**
  - Provides the words that will later be filled into the layout
  - `DictionaryWordHandler`: gets words from an NLTK corpus
  - `FileWordHandler`: reads words from CSV files
- **state.py**
  - `Entry`: class that represents the current state of one entry of the crossword
  - `CrosswordState`: class that represents the current state of the whole crossword
- **tree_search.py**
  - `TreeNode`: represents one node of the MCTS tree
  - `MCTS`: represents the whole MCTS tree and provides the necessary functionality, such as
    - Selection
    - Expansion
    - Simulation / Rollout
    - Backpropagation
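
To illustrate how the four phases listed above fit together, here is a generic single-player MCTS sketch on a toy problem. It is not the code in `tree_search.py`: all names (`Node`, `search`, `ACTIONS`, `SLOTS`) are hypothetical, the toy task is to fill four cells with "A" or "B" and is rewarded by the share of "A" cells, and UCB1 is assumed as the selection rule.

```python
import math
import random

ACTIONS = "AB"  # toy alphabet
SLOTS = 4       # toy "grid": four cells filled left to right


def is_terminal(state):
    return len(state) == SLOTS


def reward(state):
    # Toy objective in [0, 1]: share of cells holding "A".
    return state.count("A") / SLOTS


class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}  # action -> Node
        self.visits = 0
        self.total_reward = 0.0

    def fully_expanded(self):
        return len(self.children) == len(ACTIONS)


def select(node, c):
    # Selection: descend via UCB1 while the current node is fully expanded.
    while not is_terminal(node.state) and node.fully_expanded():
        node = max(
            node.children.values(),
            key=lambda ch: ch.total_reward / ch.visits
            + c * math.sqrt(math.log(node.visits) / ch.visits),
        )
    return node


def expand(node):
    # Expansion: add one child for an action that has not been tried yet.
    action = next(a for a in ACTIONS if a not in node.children)
    child = Node(node.state + action, parent=node)
    node.children[action] = child
    return child


def rollout(state, rng):
    # Simulation / rollout: fill the remaining cells at random.
    while not is_terminal(state):
        state += rng.choice(ACTIONS)
    return reward(state)


def backpropagate(node, value):
    # Backpropagation: single-player, so no sign flip between levels.
    while node is not None:
        node.visits += 1
        node.total_reward += value
        node = node.parent


def search(iterations=1000, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node("")
    for _ in range(iterations):
        node = select(root, c)
        if not is_terminal(node.state):
            node = expand(node)
        backpropagate(node, rollout(node.state, rng))
    # Recommend the most-visited first move.
    return max(root.children, key=lambda a: root.children[a].visits)
```

Calling `search()` returns the first move the tree visited most often. Because there is only one player, rewards are passed up the tree unchanged, which matches the 1-player setting this package targets.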

## References & Dependencies
- The MCTS implementation in `tree_search.py` is based on the algorithm provided by [pbsinclair42](https://github.com/pbsinclair42/MCTS),
   which I adapted in several ways:
  - Convert from 2-player to 1-player domain
  - Adjust reward function + exploration term
  - Add additional methods to analyze the game tree
  - Use PEP 8 code style
- Have a look at `pyproject.toml` for a list of all required and optional dependencies
- Python >= 3.8
- Required packages
  - nltk>=3.5
  - pandas>=1.4.0
  - numpy>=1.22.0
  - tqdm>=4.41.0

## Future work
- Add a Python module that creates questions for given answers using NLP techniques
- Add a graphical user interface (GUI)

            
