# StrTokenizer
A Python module that mimics the functionality of the Java `StringTokenizer` class. This class splits a given string into tokens based on a specified delimiter and offers methods to iterate over the tokens, count them, and manipulate the tokenizer's state.
## Installation
To install the `StrTokenizer` package globally, use `pip`:
1. Ensure you have `pip` installed on your system.
2. Open your command line interface (CLI) and run:
```bash
pip install StrTokenizer
```
To use it locally without installing, download or copy the `tokenizer.py` file into your project and import it from there.
## Usage
### Import the Module
If the module is installed via pip, import the class from the package:
```python
from StrTokenizer import StrTokenizer
```
If you downloaded `tokenizer.py` from GitHub instead, import it like this:
```python
from tokenizer import StrTokenizer
```
### Creating a StrTokenizer Object
To create an instance of `StrTokenizer`, provide the input string, the delimiter (optional, defaults to a space `" "`), and whether to return the delimiters as tokens (optional, defaults to `False`).
```python
# Example with default delimiter (space)
tokenizer = StrTokenizer("This is a test string")
# Example with custom delimiter
tokenizer = StrTokenizer("This,is,a,test,string", ",")
# Example with custom delimiter and returning the delimiter as tokens
tokenizer = StrTokenizer("This,is,a,test,string", ",", return_delims=True)
```
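The effect of `return_delims` can be previewed with the standard library alone. The sketch below uses `str.split` and `re.split` (with a capturing group) to show the two token streams; it is only an illustration of the expected output, not the package's implementation.

```python
# Stdlib-only sketch of the two splitting modes described above.
# This does not use StrTokenizer's internals.
import re

text = "This,is,a,test,string"

# Default mode: delimiters are dropped
tokens = text.split(",")
print(tokens)  # ['This', 'is', 'a', 'test', 'string']

# return_delims=True mode: a capturing group in re.split keeps
# the separators in the output as their own tokens
tokens_with_delims = re.split(r"(,)", text)
print(tokens_with_delims)
# ['This', ',', 'is', ',', 'a', ',', 'test', ',', 'string']
```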
### Methods
#### `countTokens() -> int`
Returns the total number of tokens in the string.
```python
token_count = tokenizer.countTokens()
print("Number of tokens:", token_count)
```
#### `countTokensLeft() -> int`
Returns the number of tokens left to be iterated.
```python
tokens_left = tokenizer.countTokensLeft()
print("Tokens left:", tokens_left)
```
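The relationship between the two counters can be sketched with a plain list and a consumption index (hypothetical names, not the package's internals): `countTokens()` reports the full token count, while `countTokensLeft()` shrinks as tokens are consumed.

```python
# Illustrative sketch: total count vs. remaining count on a
# list-backed tokenizer with a consumption index.
tokens = "apple orange banana".split(" ")
index = 0

total = len(tokens)      # countTokens(): full count, independent of position
index += 1               # consume one token (like one nextToken() call)
left = total - index     # countTokensLeft(): tokens not yet returned

print(total, left)  # 3 2
```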
#### `hasMoreTokens() -> bool`
Checks if there are more tokens to iterate over.
```python
if tokenizer.hasMoreTokens():
    print("There are more tokens available.")
```
#### `nextToken() -> str`
Returns the next token. Raises an `IndexError` if no more tokens are available.
```python
while tokenizer.hasMoreTokens():
    print(tokenizer.nextToken())
```
#### `rewind(steps: int = None) -> None`
Resets the tokenizer's index either completely or by a specified number of steps:
- **Without arguments**: Resets the tokenizer back to the first token.
- **With `steps`**: Moves the tokenizer back by the given number of steps.
```python
# Rewind completely
tokenizer.rewind()
# Rewind by 2 tokens
tokenizer.rewind(2)
```
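The rewind semantics described above can be mimicked by a minimal index-based class. This is a hypothetical sketch; `MiniTokenizer` is not part of the package.

```python
# Hypothetical sketch of rewind() semantics on an index-based tokenizer.
class MiniTokenizer:
    def __init__(self, tokens):
        self.tokens = tokens
        self.index = 0

    def nextToken(self):
        if self.index >= len(self.tokens):
            raise IndexError("no more tokens")
        token = self.tokens[self.index]
        self.index += 1
        return token

    def rewind(self, steps=None):
        # No argument: back to the first token.
        # With steps: move back that many tokens, never below zero.
        if steps is None:
            self.index = 0
        else:
            self.index = max(0, self.index - steps)

t = MiniTokenizer(["a", "b", "c"])
t.nextToken(); t.nextToken()  # consume "a" and "b"
t.rewind(1)
print(t.nextToken())          # "b" again
t.rewind()
print(t.nextToken())          # back to "a"
```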
### Example
```python
from tokenizer import StrTokenizer
# Create a tokenizer with a custom delimiter
tokenizer = StrTokenizer("apple,orange,banana,grape", ",")
# Get the number of tokens
print("Number of tokens:", tokenizer.countTokens())
# Iterate over the tokens
while tokenizer.hasMoreTokens():
    print("Token:", tokenizer.nextToken())
# Rewind the tokenizer and iterate again
tokenizer.rewind()
print("After rewinding:")
while tokenizer.hasMoreTokens():
    print("Token:", tokenizer.nextToken())
```
### Output:
```text
Number of tokens: 4
Token: apple
Token: orange
Token: banana
Token: grape
After rewinding:
Token: apple
Token: orange
Token: banana
Token: grape
```
## Methods Overview
- `__init__(self, inputstring: str, delimiter: str = " ", return_delims: bool = False)`:
- Initializes the `StrTokenizer` with the given string, delimiter, and whether to return delimiters as tokens.
- `__create_token(self) -> None`:
- Splits the input string into tokens based on the delimiter.
- `countTokens(self) -> int`:
- Returns the total number of tokens.
- `countTokensLeft(self) -> int`:
- Returns the number of tokens left for iteration.
- `hasMoreTokens(self) -> bool`:
- Checks if there are more tokens to be retrieved.
- `nextToken(self) -> str`:
- Returns the next available token or raises an `IndexError` if no tokens are left.
- `rewind(self, steps: int = None) -> None`:
- Resets the tokenizer's index either completely or by a given number of steps.
You can install the `StrTokenizer` package from PyPI:
[Install StrTokenizer from PyPI](https://pypi.org/project/StrTokenizer/1.1.0/)
## Source Code:
[GitHub repository](https://github.com/CyberPokemon/StrTokenizer)
## License
This project is open-source and available for modification or distribution.