ctextlib

Name	ctextlib
Version	1.0.21
home_page	https://github.com/antonmilev/CText
Summary	Python package with CText C++ extension
upload_time	2023-04-17 22:12:04
author	Anton Milev
requires_python	>=2.7

# CText
# Advanced text processing library in C++ and Python

## About
A modern C++ library with many useful text processing routines. CText handles text processing tasks that otherwise take too much time to implement in C++ and Python. Some of these, such as managing lines and words, are available in higher-level languages like C#, Java and Python but not in C++. C++ gives more low-level control, so besides supplying the missing text functions, CText implements optimized text routines. The library is flexible and scalable: it is easy to add custom text processing routines quickly, and it can be used for pre-processing in different NLP and ML tasks or just to practice modern C++.

## Main Features
* **Modern C++ template library**: You only need to include one header, and it is very simple to use.
* **Unicode support**: You can have both UNICODE and ANSI in one project.
* **Hundreds of optimized text processing methods**: Many standard and non-standard text processing operations are covered. I have a long TODO list with much more to add.
* **Clean and easy-to-understand code**: You can use CText to quickly start more complicated text processing applications while abstracting away the many lower-level details and optimizations.
* **Portable**: I am using CText with VS2017/VS2019 and GCC 7.4, but it can easily be ported to other platforms.
* **Standalone**: CText does not depend on any other libraries; the only requirements are C++11 and the STL.
* **Scalable**: All text routines can easily be extended to all commonly supported char types and platforms.
* **Python**: Supports all Python versions from 2.7 onward.


Please feel free to contact me for questions or suggestions.

### Python
To install CText:
```
pip install ctextlib
```

To test if CText is installed:

```python
import ctextlib
a = ctextlib.Text("Hello World")
print(a)
```

Or:

```python
from ctextlib import Text as text
a = text("Hello World")
print(a)
```


Python methods reference:

<b>addToFileName</b>
```python
a = text("C:\\Temp\\Temp2\\File.bmp")
a.addToFileName("_mask")
print(a)
```

```
C:\Temp\Temp2\File_mask.bmp
```

<b>append</b>
```python
a = text("Hello ")
a.append("World")
```

```
Hello World
```

```python
a = text("123")
a.append('4',4)
```

```
1234444
```

```python
a = text("")
a.append(['Hello', ' ', 'World'])
```

```
Hello World
```

<b>appendRange</b>
```python
a = text()
a.appendRange('a','z').appendRange('0','9')

```

```
abcdefghijklmnopqrstuvwxyz0123456789
```

<b>between</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.between('q','d')
print(a)
```

```
uick brown fox jumps over the lazy
```

```python
a = text('The quick brown fox jumps over the lazy dog')
a.between('quick','lazy')
print(a)
```

```
 brown fox jumps over the
```
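
To make the semantics of `between` concrete, here is a minimal pure-Python sketch that reproduces the two results above. The helper name `between` and the choice to stop at the first occurrence of the right delimiter after the left one are assumptions made for illustration; this is not CText's implementation.

```python
def between(s, left, right):
    """Return the text after the first `left` and before the next `right`."""
    start = s.find(left)
    if start < 0:
        return ""          # left delimiter not found
    start += len(left)
    end = s.find(right, start)
    if end < 0:
        return ""          # right delimiter not found after the left one
    return s[start:end]

sentence = "The quick brown fox jumps over the lazy dog"
print(repr(between(sentence, "q", "d")))
print(repr(between(sentence, "quick", "lazy")))
```

Printing with `repr` makes the leading and trailing spaces of the result visible.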

<b>contain</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
if a.contain('quick') :
    print("contain 'quick'")
```

```
contain 'quick'
```
    
Case-insensitive:
   
```python
a = text('The quick brown fox jumps over the lazy dog')
if a.contain('Quick', False) :
    print("contain 'quick'")
```

```
contain 'quick'
```

```python
a = text('The quick brown fox jumps over the lazy dog')
if a.contain(['slow','fast','quick']):
    print("contain 'quick'")
```

```
contain 'quick'
```

<b>containAny</b>
```python
a = text('Hello World')
a.containAny('abcd')
True
```
<b>containOnly</b>
```python
a = text('4365767')
a.containOnly('0123456789')
True
```
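
For comparison, the two character-set checks above can be sketched in plain Python. The function names below are hypothetical, and treating the argument as a set of single characters is an assumption based on the examples.

```python
def contain_any(s, chars):
    """True if at least one character from `chars` occurs in `s`."""
    return any(c in s for c in chars)

def contain_only(s, chars):
    """True if every character of `s` is one of `chars`."""
    return all(c in chars for c in s)

print(contain_any("Hello World", "abcd"))       # 'd' occurs in "World"
print(contain_only("4365767", "0123456789"))    # digits only
print(contain_only("4365a67", "0123456789"))    # contains a letter
```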

<b>count</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.count('the', False)
```

```
2
```

<b>countWordFrequencies</b>
```python
from ctextlib import Text as text
a = text("The quick brown fox jumps over the lazy dog")
a.countWordFrequencies(False)
```

```
[(2, 'the'), (1, 'brown'), (1, 'dog'), (1, 'fox'), (1, 'jumps'), (1, 'lazy'), (1, 'over'), (1, 'quick')]
```
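
The same result can be reproduced with the standard library, which may help when checking CText output. This is a sketch under two assumptions drawn from the example: the `False` argument means case-insensitive counting, and pairs are ordered by descending count and then alphabetically. The name `word_frequencies` is hypothetical.

```python
from collections import Counter

def word_frequencies(s, case_sensitive=True):
    """Return (count, word) pairs, most frequent first, ties in alphabetical order."""
    words = s.split()
    if not case_sensitive:
        words = [w.lower() for w in words]
    counts = Counter(words)
    return sorted(((n, w) for w, n in counts.items()),
                  key=lambda pair: (-pair[0], pair[1]))

result = word_frequencies("The quick brown fox jumps over the lazy dog", False)
print(result)
```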

<b>cutAfterFirst</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.cutAfterFirst('o')
```

```
The quick br
```

<b>cutAfterLast</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.cutAfterLast('o')
```

```
The quick brown fox jumps over the lazy d
```


<b>cutBeforeFirst</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.cutBeforeFirst('o')
```

```
own fox jumps over the lazy dog
```

<b>cutEnds</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.cutEnds(4)
```

```
quick brown fox jumps over the lazy
```
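
The four `cut...` results above correspond to simple slicing. The following pure-Python sketch reproduces them; the function names are hypothetical, and returning the string unchanged when the character is not found is an assumption rather than documented CText behavior.

```python
def cut_after_first(s, ch):
    """Drop the first `ch` and everything after it."""
    i = s.find(ch)
    return s if i < 0 else s[:i]

def cut_after_last(s, ch):
    """Drop the last `ch` and everything after it."""
    i = s.rfind(ch)
    return s if i < 0 else s[:i]

def cut_before_first(s, ch):
    """Drop everything before the first `ch` (the character itself is kept)."""
    i = s.find(ch)
    return s if i < 0 else s[i:]

def cut_ends(s, n):
    """Drop `n` characters from both ends."""
    return s[n:len(s) - n] if 2 * n <= len(s) else ""

sentence = "The quick brown fox jumps over the lazy dog"
for cut in (cut_after_first, cut_after_last, cut_before_first):
    print(cut(sentence, "o"))
print(cut_ends(sentence, 4))
```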

<b>cutLeft</b>
```python
s = text("Hello World")
s.cutLeft(6)
```

```
World
```

<b>cutRight</b>
```python
s = text("Hello World")
s.cutRight(6)
```

```
Hello
```

<b>enclose</b>
```python
a = text("Hello World")
a.enclose('<','>')
a.enclose('"')
```

```
<Hello World>
"Hello World"
```

<b>endsWith</b>
```python
a = text("Hello World")
if a.endsWith('World'):
    print("ends with 'World'")
```


```
ends with 'World'
```

With case-insensitive search:

```python
a = text("Hello World")
if a.endsWith('world', False):
    print("ends with 'world'")
```

```
ends with 'world'
```

<b>endsWithAny</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
if a.endsWithAny(['cat', 'dog']):
    print('ends with an animal...')
```

```
ends with an animal...
```

<b>erase</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.erase(8, 10)
print(a)
```

``` 
The quicx jumps over the lazy dog
``` 

<b>equal</b>
```python
a = text()
a.equal('A',10)
```

```
AAAAAAAAAA
```

<b>find</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.find('brown')
```

```
'brown fox jumps over the lazy dog'
```

With case-insensitive search:

```python
a = text('The quick brown fox jumps over the lazy dog')
a.find('Brown', False)
```

```
'brown fox jumps over the lazy dog'
```

<b>fromArray</b>
```python
a = text()
a.fromArray([1,2,3,4])
print(a)
```

```
1 2 3 4
```

```python
a = text()
a.fromArray([1,2,3,4], '|')
print(a)
```

```
1|2|3|4
```

```python
a = text()
a.fromArray([1,2,3,4], '')
print(a)
```

```
1234
```

Array of floats

```python
a = text()
a.fromArray([1.1,2.2,3.3,4.4])
print(a)
```

```
1.1 2.2 3.3 4.4
```

Array of strings
```python
a = text()
a.fromArray(['hello','world'])
print(a)
```

```
hello world
```

```python
import numpy as np
a = text()
a.fromArray(np.array(["hello","world"]))
print(a)
```

```
hello world
```

<b>fromArrayAsHex</b>
```python
a = text()
a.fromArrayAsHex([10,20,30,40])
print(a)
```

```
0A 14 1E 28
```

Use without separator

```python
a.fromArrayAsHex([10,20,30,40],2,'')
print(a)
```

```
0A141E28
```

```python
a = text()
a.fromArrayAsHex([1000,2000,3000,4000])
print(a)
```

```
3E8 7D0 BB8 FA0
```

```python
a = text()
a.fromArrayAsHex([1000,2000,3000,4000], 4, ',')
print(a)
```

```
03E8,07D0,0BB8,0FA0
```
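
The width and separator arguments behave like a minimum-width, zero-padded, uppercase hex format. A plain-Python sketch that reproduces the four outputs above (the name `array_as_hex` and its defaults are assumptions taken from the examples):

```python
def array_as_hex(values, width=2, sep=" "):
    """Format integers as uppercase hex, zero-padded to at least `width` digits."""
    return sep.join(format(v, "0{}X".format(width)) for v in values)

print(array_as_hex([10, 20, 30, 40]))
print(array_as_hex([10, 20, 30, 40], 2, ""))
print(array_as_hex([1000, 2000, 3000, 4000]))
print(array_as_hex([1000, 2000, 3000, 4000], 4, ","))
```

Note that the width is only a minimum: values that need more digits, such as 1000 (`3E8`), are not truncated.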

<b>fromBinary</b>
```python
a = text()
a.fromBinary(12345)
print(a)
```

```
00000000000000000011000000111001
```
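
The output is the 32-bit, zero-padded binary representation of the number. For reference, the same string can be produced and parsed back with Python built-ins; the fixed width of 32 is taken from the example output and is otherwise an assumption.

```python
number = 12345

# Zero-padded 32-digit binary string, as in the output above.
bits = format(number, "032b")
print(bits)

# Parsing a binary string back into an integer.
restored = int(bits, 2)
print(restored)
```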

<b>fromDouble</b>
```python
a = text()
a.fromDouble(3.333338478)
print(a)
a.fromDouble(3.33989, 4)
print(a)
a.fromDouble(3.333338478, 10)
print(a)
```

```
3.333338
3.3399
3.3333384780
```

<b>fromHex</b>
```python
a = text()
a.fromHex(1234567)
a.fromHex('a')
a.fromHex("48 65 6C 6C 6F 20 57 6F 72 6C 64")
```

```
0012D687
61
Hello World
```

<b>fromInteger</b>
```python
a = text()
a.fromInteger(358764)
print(a)
```

```
358764
```

<b>fromMatrix</b>
```python
from ctextlib import Text as text
import numpy as np
x = np.array([[10, 20, 30], [40, 50, 60]])
a = text()
a.fromMatrix(x)
print(a)
```

```
10 20 30
40 50 60
```

```python
from ctextlib import Text as text
import numpy as np
x = np.array([[10, 20, 30], [40, 50, 60]])
a = text()
a.fromMatrix(x, ',')
print(a)
```

```
10,20,30
40,50,60
```
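
The layout produced above is one text line per matrix row, with the separator between the values of a row. A minimal sketch of that layout in plain Python, using nested lists instead of a NumPy array (the name `matrix_to_text` is hypothetical):

```python
def matrix_to_text(rows, sep=" "):
    """Join the values of each row with `sep` and the rows with newlines."""
    return "\n".join(sep.join(str(v) for v in row) for row in rows)

matrix = [[10, 20, 30], [40, 50, 60]]
print(matrix_to_text(matrix))
print(matrix_to_text(matrix, ","))
```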

<b>fromMatrixAsHex</b>
```python
from ctextlib import Text as text
import numpy as np
x = np.array([[10, 20, 30], [40, 50, 60]])
a = text()
a.fromMatrixAsHex(x)
print(a)
```

```
0A 14 1E
28 32 3C
```

```python
from ctextlib import Text as text
import numpy as np
x = np.array([[1000, 2000, 3000], [4000, 5000, 6000]])
a = text()
a.fromMatrixAsHex(x,4)
print(a)
```

```
03E8 07D0 0BB8
0FA0 1388 1770
```

<b>getDir</b>
```python
a = text("D:\\Folder\\SubFolder\\TEXT\\file.dat")
a.getDir()
```

```
D:\Folder\SubFolder\TEXT\
```

<b>getExtension</b>
```python
a = text("D:\\Folder\\SubFolder\\TEXT\\file.dat")
a.getExtension()
```

```
'.dat'
```

<b>getFileName</b>
```python
a = text("D:\\Folder\\SubFolder\\TEXT\\file.dat")
a.getFileName()
```

```
'file.dat'
```
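
For comparison, the standard library's `ntpath` module (the Windows flavor of `os.path`, usable on any platform) splits the same path into comparable parts. This is only a cross-check of the expected values; unlike `getDir`, `ntpath.split` returns the directory without the trailing backslash.

```python
import ntpath

path = "D:\\Folder\\SubFolder\\TEXT\\file.dat"

# Split into directory and file name, then take the extension of the file name.
directory, file_name = ntpath.split(path)
extension = ntpath.splitext(file_name)[1]

print(directory)
print(file_name)
print(extension)
```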

<b>hash</b>
```python
s.hash()
```

```
9257130453210036571
```

<b>indexOf</b>
```python
a = text("The quick brown fox jumps over the lazy dog.")
a.indexOf("brown")
```

```
10
```

<b>indexOfAny</b>
```python
a = text("The quick brown fox jumps over the lazy dog.")
a.indexOfAny(["fox", "dog"])
```

```
16
```

<b>indexOfAny</b>

```python
a = text("The quick brown fox jumps over the lazy dog.")
a.indexOfAny("abc")
```

```
7
```

<b>insert</b>
```python
a = text("abc")
a.insert(1,'d',2)
```

```
addbc
```

```python
a = text("The quick jumps over the lazy dog.")
a.insert(10,"fox ")
```

```
The quick fox jumps over the lazy dog.
```

<b>insertAtBegin</b>
<br><b>insertAtEnd</b>
```python
a = text("Hello")
a.insertAtBegin("<begin>")
a.insertAtEnd("</begin>")
```

```
<begin>Hello</begin>
```

<b>isAlpha</b>
```python
a = text("Abcd")
a.isAlpha()
True
```

<b>isBinary</b>
```python
a = text("01111011100001")
a.isBinary()
True
```

<b>isEmpty</b>
```python
a = text()
a.isEmpty()
True
```

<b>isHexNumber</b>
```python
a = text("12AB56FE")
a.isHexNumber()
True
```

<b>isNumber</b>
```python
a = text("123456")
a.isNumber()
True
```

<b>isLower</b>
```python
a = text("hello world")
a.isLower()
True
```

<b>isUpper</b>
```python
a = text("HELLO WORLD")
a.isUpper()
True
```

<b>isPalindrome</b>
```python
a = text("racecar")
a.isPalindrome()
True
```

<b>keep</b>
```python
s = text("Hello World").keep(3,5)
```

```
lo Wo
```

<b>keepLeft</b>
```python
a = text("The quick jumps over the lazy dog.")
a.keepLeft(10)
```

```
The quick
```

<b>keepRight</b>
```python
a = text("The quick jumps over the lazy dog.")
a.keepRight(10)
```

```
 lazy dog.
```

<b>lastIndexOf</b>
```python
s = text("Hello World")
s.lastIndexOf('l')
```

```
9
```

<b>lines</b>
```python
a = text("L1\nL2\n\nL3\nL4\n  \n\nL5")
a.lines()
```

```
['L1', 'L2', 'L3', 'L4', 'L5']
```

<b>linesCount</b>
```python
a = text("L1\nL2\n\nL3\nL4\n  \n\nL5")
a.linesCount()
```

```
7
```

<b>linesRemoveEmpty</b>
```python
a = text("L1\nL2\n\nL3\nL4\n  \n\nL5")
a.linesRemoveEmpty()
print(a)
```

```
L1
L2
L3
L4
L5
```
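
In the examples above, lines that are empty or contain only whitespace are dropped. A plain-Python sketch of that filtering, assuming "empty" includes whitespace-only lines as the `lines()` output suggests (the name `non_empty_lines` is hypothetical):

```python
def non_empty_lines(s):
    """Return the lines of `s`, skipping empty and whitespace-only lines."""
    return [line for line in s.splitlines() if line.strip()]

source = "L1\nL2\n\nL3\nL4\n  \n\nL5"
kept = non_empty_lines(source)
print(kept)
print("\n".join(kept))
```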

Several per-line methods:
<br><b>linesAppend</b>
<br><b>linesInsertAtBegin</b>
<br><b>linesSort</b>
<br><b>linesPaddRight</b>
<br><b>linesTrim</b>
<br>Example of opening a text file, sorting all lines, and saving it under another name:
```python
from ctextlib import Text as text
s = text()
s.readFile('Unordered.txt')
s.linesSort()
s.writeFile('Sorted_python.txt')
```

<b>limit</b>
```python
s = text("Hello World")
s.limit(6)
```

```
Hello
```

<b>lower</b>
```python
s = text("Hello World")
s.lower()
```

```
hello world
```

<b>makeUnique</b>
```python
a = text()
a.appendRange('a','z').appendRange('a','z')
# a is now: abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz
a.makeUnique()
print(a)
```

```
abcdefghijklmnopqrstuvwxyz
```
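
The effect shown above is to keep the first occurrence of every character and drop later repeats. A short plain-Python sketch of that behavior; keeping the original order is an assumption based on the example, and the name `make_unique` is hypothetical.

```python
import string

def make_unique(s):
    """Keep only the first occurrence of each character, preserving order."""
    # dict keys preserve insertion order (guaranteed since Python 3.7)
    return "".join(dict.fromkeys(s))

doubled = string.ascii_lowercase * 2
print(make_unique(doubled))
print(make_unique("banana"))
```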

<b>mid</b>
```python
a = text("Hello World").mid(3, 5)
```

```
lo Wo
```

<b>nextLine</b>
```python
# Example of iterating all lines
from ctextlib import Text as text
a = text("Line1\nLine2\nLine3")
line = text()
pos = 0
while(pos >= 0):
    pos = a.nextLine(pos,line)
    print(line)
```

```
Line1
Line2
Line3
```

<b>nextWord</b>
```python
# Example of iterating all words
from ctextlib import Text as text
a = text('The quick brown fox jumps over the lazy dog')
word = text()
pos = 0
while(pos >= 0):
    pos = a.nextWord(pos,word)
    print(word)
```

```
The
quick
brown
fox
jumps
over
the
lazy
dog
```
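
The loop above relies on a position-based protocol: each call returns the position to continue from, and a negative value signals the end. The sketch below imitates that protocol in plain Python to show how such an iterator can work. It is an illustration only: the hypothetical `next_word` here returns a `(word, position)` tuple instead of filling an output argument, and the separator set is an assumption.

```python
def next_word(s, pos, seps=" \t\r\n"):
    """Return (word, next_pos); next_pos is -1 when no further word follows."""
    n = len(s)
    while pos < n and s[pos] in seps:       # skip leading separators
        pos += 1
    start = pos
    while pos < n and s[pos] not in seps:   # consume the word
        pos += 1
    word = s[start:pos]
    while pos < n and s[pos] in seps:       # skip separators after the word
        pos += 1
    return word, (pos if pos < n else -1)

sentence = "The quick brown fox jumps over the lazy dog"
words = []
pos = 0
while pos >= 0:
    word, pos = next_word(sentence, pos)
    words.append(word)
print(words)
```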

<b>paddLeft</b>
```python
s = text("Abra")
s.paddLeft('.', 16)
```

```
............Abra
```

<b>paddRight</b>
```python
s = text("Abra")
s.paddRight('.', 16)
```

```
Abra............
```

<b>pathCombine</b>
```python
a = text("C:\\Temp")
a.pathCombine("..\\Folder")
```

```
C:\Folder
```
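
The result above is what you get from joining the two paths and then resolving the `..` segment. The standard library's `ntpath` module (Windows path rules, usable on any platform) gives the same answer, which can serve as a cross-check:

```python
import ntpath

base = "C:\\Temp"
relative = "..\\Folder"

# Join the paths, then collapse the ".." segment.
combined = ntpath.normpath(ntpath.join(base, relative))
print(combined)
```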

<b>quote</b>
```python
a = text("Hello")
a.quote()
```

```
"Hello"
```

<b>random</b>
```python
a = text()
a.random()
# e.g. "P1kAlMiG2Kb7FzP5"
a.sort()
# "1257AFGKMPPbiklz"
a.shuffle()
# e.g. "k2lF7KAPG5M1Pzbi"
a.random(32)
# e.g. "P1kAlMiG2Kb7FzP5tM1QBI6DSS92c31A"
```

<b>randomAlpha</b>
```python
s = text()
s.randomAlpha()
# e.g. IkEffmzNiMKKASVW
```

<b>randomNumber</b>
```python
s = text()
s.randomNumber()
# e.g. 3892795431
s.randomNumber(32)
# e.g. 33341138742779319865028602486509
```

<b>readFile</b>
```python
# demonstrates how to read a whole text file
from ctextlib import Text as text
a = text()
a.readFile('test.txt')
print(a)
```

```
Hello World
```


<b>regexMatch</b>
```python
s = text("+336587890078")
if(s.regexMatch("(\\+|-)?[[:digit:]]+")):
    print("it is a number")
```

```
it is a number
```

<b>regexLines</b>
```txt
animals.txt
------------
Cat
Dog
Giraffe
Lion
Llama
Monkey
Mouse
Parrot
Poodle
Scorpion
Snake
Weasel
```

```python
# collect all lines starting with given characters
from ctextlib import Text as text
a = text()
a.readFile("animals.txt")
a.regexLines("^[A-G][a-z]+")
```

```
['Cat', 'Dog', 'Giraffe']
```

<b>regexReplace</b>    
```python
from ctextlib import Text as text
a = text("there is sub-sequence in the sub-way string")
a.regexReplace("\\b(sub)([^ ]*)", "sub-$2")
```

```
there is sub--sequence in the sub--way string
```
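
The `$2` in the replacement string refers to the second capture group, in the style of ECMAScript regular expressions. For readers more familiar with Python's `re` module, the equivalent call uses `\2` instead. The sketch below reproduces the output above with the standard library only; it says nothing about which regex engine CText uses internally.

```python
import re

source = "there is sub-sequence in the sub-way string"

# Group 1 is the literal "sub"; group 2 is the rest of the word up to a space.
result = re.sub(r"\b(sub)([^ ]*)", r"sub-\2", source)
print(result)
```

Because group 2 already starts with a hyphen in this input, the replacement yields a double hyphen.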

<b>regexSearch</b>    
```python
# collect all words using regex
from ctextlib import Text as text
a = text("The quick brown fox jumps over the lazy dog")
a.regexSearch("\\w+")
```
   
```
['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
```

<b>regexWords</b>    
```python
# collect all words starting with given characters
from ctextlib import Text as text
a = text("The quick brown fox jumps over the lazy dog")
a.regexWords("^[a-n][a-z]+")
```
   
```
['brown', 'fox', 'jumps', 'lazy', 'dog']   
```
   
<b>remove</b>
```python
a = text('we few, we happy few, we band of brothers.')
a.remove('we')
a.reduceChain()
a.trim()
```

```
few happy few band of brothers
```

<b>removeAny</b>
```python
from ctextlib import Text as text
a = text('The quick brown fox jumps over the lazy dog')
a.removeAny(['brown','quick','lazy'])
a.reduceChain()
```

```
The fox jumps over the dog
```

<b>removeExtension</b>
```python
a = text("D:\\Folder\\SubFolder\\TEXT\\File.dat")
a.removeExtension()
```

```
D:\Folder\SubFolder\TEXT\File
```

<b>removeFileName</b>
```python
a = text("D:\\Folder\\SubFolder\\TEXT\\File.dat")
a.removeFileName()
```

```
D:\Folder\SubFolder\TEXT\
```

<b>removeWhileBegins</b>
```python
a = text("Some text ending with something")
a.removeWhileBegins("Some text ")
print(a)
```

```
ending with something
```

<b>removeWhileEnds</b>
```python
a = text("Some text ending with something")
a.removeWhileEnds(" something")
print(a)
```

```
Some text ending with
```

<b>replace</b>
```python
a = text("The quick brown fox jumps over the lazy dog")
a.replace("fox", "cat")
print(a)
```

```
The quick brown cat jumps over the lazy dog
```

```python
a = text("The quick brown fox jumps over the lazy dog")
a.replace(["fox", "cat","dog","quick"], "-")
```

```
The ----- brown --- jumps over the lazy ---
```

<b>replaceAny</b>
```python
a = text("The quick brown fox jumps over the lazy dog")
a.replaceAny(["fox", "cat","dog"], "***")
print(a)
```

```
The quick brown *** jumps over the lazy ***
```

```python
a = text("The quick brown fox jumps over the lazy dog")
a.replaceAny(["fox", "dog"], ["dog", "fox"])
```

```
The quick brown dog jumps over the lazy fox
```
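
Swapping "fox" and "dog" as above only works if all replacements happen in a single pass; two sequential replacements would undo each other. The following plain-Python sketch shows the difference. The helper `replace_any` is a hypothetical stand-in written with `re`, not CText's implementation.

```python
import re

def replace_any(s, olds, news):
    """Replace each olds[i] with news[i] in one pass over the string."""
    mapping = dict(zip(olds, news))
    # Try longer alternatives first so a short word cannot shadow a longer one.
    alternatives = sorted(olds, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(o) for o in alternatives))
    return pattern.sub(lambda m: mapping[m.group(0)], s)

sentence = "The quick brown fox jumps over the lazy dog"
swapped = replace_any(sentence, ["fox", "dog"], ["dog", "fox"])
sequential = sentence.replace("fox", "dog").replace("dog", "fox")
print(swapped)      # single pass: the words are swapped
print(sequential)   # two passes: both words end up as "fox"
```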

<b>reverse</b>
```python
a = text("Hello")
a.reverse()
```

```
olleH
```






<b>right</b>
```python
a = text("Hello World")
a.right(5)
```

```
World
```

<b>rotate</b>
```python
a = text("Hello World")
a.rotateLeft(2)
a.rotateRight(4)
```

Output
```
llo WorldHe
ldHello Wor
```
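
Note that the second output is the result of rotating the already rotated text, since both calls modify the same object. A plain-Python sketch of the two rotations using slicing (function names are hypothetical):

```python
def rotate_left(s, n):
    """Move the first n characters to the end."""
    if not s:
        return s
    n %= len(s)
    return s[n:] + s[:n]

def rotate_right(s, n):
    """Move the last n characters to the front."""
    if not s:
        return s
    n %= len(s)
    return s[-n:] + s[:-n] if n else s

first = rotate_left("Hello World", 2)
second = rotate_right(first, 4)
print(first)
print(second)
```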

<b>split</b>
```python
# by default split uses the standard separators (" \t\r\n")
a = text("The quick brown fox jumps over the lazy dog")
a.split()
```

```
['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
```

```python
# split can be used with any list of separator characters
a = text("The quick, brown....fox,,, ,jumps over,the  lazy.dog")
a.split(",. ")
```

```
['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
```
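
In the second example, every character of the argument is a separator, and runs of separators produce no empty words. A minimal plain-Python sketch of that rule (the name `split_any` is hypothetical; dropping empty fields is inferred from the output above):

```python
def split_any(s, seps=" \t\r\n"):
    """Split `s` on any character in `seps`, skipping empty fields."""
    words, current = [], []
    for ch in s:
        if ch in seps:
            if current:                  # close the word in progress
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:                          # last word, if any
        words.append("".join(current))
    return words

messy = "The quick, brown....fox,,, ,jumps over,the  lazy.dog"
print(split_any(messy, ",. "))
```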

<b>toBinary</b>
```python
bOk = False
a = text("100001")
a.toBinary(bOk)
33
```

<b>toHex</b>
```python
a = text("Hello World")
a.toHex()
print(a)
```

```
48 65 6C 6C 6F 20 57 6F 72 6C 64
```

Using separator character. 

```python
a = text("Hello World")
a.toHex(',')
print(a)
```

```
48,65,6C,6C,6F,20,57,6F,72,6C,64
```
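
Each character is written as a two-digit uppercase hex code. The sketch below reproduces both outputs in plain Python and also converts the hex text back, which matches the `fromHex` example earlier. Encoding the text as UTF-8 bytes is an assumption that only matters for non-ASCII input; the function names are hypothetical.

```python
def to_hex(s, sep=" "):
    """Hex-encode the bytes of `s`, two uppercase digits per byte."""
    return sep.join(format(b, "02X") for b in s.encode("utf-8"))

def from_hex(h, sep=" "):
    """Decode a separator-delimited hex string back into text."""
    return bytes(int(part, 16) for part in h.split(sep)).decode("utf-8")

encoded = to_hex("Hello World")
print(encoded)
print(to_hex("Hello World", ","))
print(from_hex(encoded))
```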

<b>toHex</b>
```python
bOk = False
a = text("1E1E")
a.toHex(bOk)
7710
```

<b>trim</b>
```python
a = text(" \t\n   lazy dog  \t\n   ")
a.trim()
# lazy dog
a = text("000000000000101")
a.trimLeft("0")
# 101
a = text("101000000000000")
a.trimRight('0')
# 101
a = text("0000000101000000000")
a.trim("0")
# 101
```

<b>upper</b>
```python
s = text("Hello World")
s.upper()
```

```
HELLO WORLD
```

<b>words</b>
```python
a = text("The quick brown fox jumps over the lazy dog")
a.words()
```

```
['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
```

```python
a = text("The|quick|brown|fox|jumps|over|the|lazy|dog")
a.words('|')
```

```
['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
```

<b>wordsCapitalize</b>
```python
a = text("The quick brown fox jumps over the lazy dog")
a.wordsCapitalize()
```

```
The Quick Brown Fox Jumps Over The Lazy Dog
```

<b>wordsCount</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.wordsCount()
```

```
9
```

<b>wordsEnclose</b>
```python
a = text("The quick brown fox jumps over the lazy dog")
a.wordsEnclose('[',']')
```

```
[The] [quick] [brown] [fox] [jumps] [over] [the] [lazy] [dog]
```

<b>wordsReverse</b>
```python
a = text("The quick brown fox jumps over the lazy dog")
a.wordsReverse()
```

```
ehT kciuq nworb xof spmuj revo eht yzal god
```


<b>wordsSort</b>
```python
a = text('The quick brown fox jumps over the lazy dog')
a.wordsSort()
```

Output
```
The brown dog fox jumps lazy over quick the
```
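
In the output above, "The" comes first because uppercase letters sort before lowercase ones in a plain character-code comparison. Python's default string sort behaves the same way, so the result can be cross-checked as follows; whether CText uses the same ordering for other inputs is an assumption.

```python
sentence = "The quick brown fox jumps over the lazy dog"

# Default sort compares by code point, so "The" precedes all lowercase words.
by_code_point = " ".join(sorted(sentence.split()))
print(by_code_point)

# A case-insensitive ordering, for contrast.
ignoring_case = " ".join(sorted(sentence.split(), key=str.lower))
print(ignoring_case)
```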

<b>writeFile</b>
```python
# demonstrates how to write to a text file
from ctextlib import Text as text
a = text("Hello World")
a.writeFile('test.txt')
print(a)
```


## UNICODE for Python
Python strings are Unicode, and Python source files are UTF-8 by default. When processing Python text that contains non-English Unicode characters, it is recommended to use the Unicode version of CText, `TextU`, as demonstrated below:
```python
# demonstrate text processing of Swedish unicode text
from ctextlib import TextU as text
s = text('Den snabbbruna räven hoppar över den lata hunden')
s.cutBeforeFirst('ö')
```

```
över den lata hunden
```

```python
# demonstrate text processing of Russian unicode text
from ctextlib import TextU as text
s = text('Быстрая коричневая лиса прыгает на ленивую собаку')
s.cutAfterLast('ы')
```

```
Быстрая коричневая лиса пр
```

```python
# demonstrate text processing of Czech unicode text
from ctextlib import TextU as text
s = text('Rychlá hnědá liška skočí přes líného psa')
s.cutAfterFirst('á', True)
```

```
Rychlá
```


```python
# demonstrate text processing of Greek unicode text
from ctextlib import TextU as text
s = text('Η γρήγορη καφέ αλεπού πηδάει πάνω από το τεμπέλικο σκυλί')
s.cutAfterFirst('έ', True)
```

```
Η γρήγορη καφέ
```


```python
# demonstrate text processing of Armenian unicode text
from ctextlib import TextU as text
s = text('Արագ շագանակագույն աղվեսը ցատկում է ծույլ շան վրա')
s.cutBeforeFirst('է')
```
 
```
է ծույլ շան վրա
```

```python
# demonstrate text processing of Georgian unicode text
from ctextlib import TextU as text
s = text('სწრაფი ყავისფერი მელა გადაბმულია ზარმაცი ძაღლი')
s.cutBeforeFirst('მ')
```

```
მელა გადაბმულია ზარმაცი ძაღლი
```
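
One way to see why a character-aware type matters for such text: in UTF-8, each of the non-English letters above occupies more than one byte, so byte positions and character positions differ. The snippet below illustrates this with plain Python strings only; it is not a statement about how `Text` and `TextU` are implemented.

```python
s = "Den snabbbruna räven hoppar över den lata hunden"
encoded = s.encode("utf-8")

# 'ä' and 'ö' take two bytes each in UTF-8, so the byte length is larger.
print(len(s), len(encoded))

# The position of 'ö' differs between characters and bytes.
char_index = s.index("ö")
byte_index = encoded.index("ö".encode("utf-8"))
print(char_index, byte_index)

# Cutting at the character position gives the expected text.
print(s[char_index:])
```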

For full details, type help(text).


## Build CText Unit Test and Demo projects

<br>To build the UnitTest project and the demos with CMake and Visual Studio:
<br>open a terminal in the folder \Apps and type
<br>cmake .
<br>Alternatively, in VS2017 or later you can load \Apps\CMakeLists.txt from File->Open->CMake.. and, after cache generation completes, choose CMake->Build All.

<br>To compile with GCC in Debug or Release:
<br>cmake -D CMAKE_BUILD_TYPE=Release .
<br>cmake -D CMAKE_BUILD_TYPE=Debug .
<br>
<br>This will build a console application that runs the unit tests.
<br>
<br>There is also a Visual Studio solution (CText.sln) with all projects. Run the UnitTests project first to check that all tests pass.


## C++ Examples

For all examples how to use CText please see the Unit Test project.

### Sort all lines in a text file

```cpp
// This example reads a text file and sorts all lines in alphabetical order.
#include <iostream>
#include "../CTEXT/CText.h"
#include "tchar_utils.h"

int main()
{    
    const char* input_name = "/Unsorted.txt";
    const char* output_name = "/Sorted.txt";

    CText pathIn = getcwd(0, 0);
    CText pathOut = pathIn;
    pathIn += input_name;
    pathOut += output_name;
    
    CText str;
    if(!str.readFile(pathIn.str()))
    {
        std::cerr << "Error, cannot open file: " << pathIn << std::endl;
        return 1;  // report failure with a non-zero exit code
    }
    str.linesSort();
    str.writeFile(pathOut.str(), CText::ENCODING_ASCII);

    return 0;
}
```

### Replace words
```cpp
    CText s = _T("The quick brown fox jumps over the lazy dog");
    s.replace(_T("brown"), _T("red"));
    cout << s << endl;
```
Output:
```
   The quick red fox jumps over the lazy dog 
```  

```cpp
    CText s = _T("The quick brown fox jumps over the lazy dog");
    const CText::Char* words[] = {_T("quick"), _T("fox"), _T("dog")};
    s.replaceAny(words, 3, _T('-'));
    cout << s << endl;
```

Output:
```
   The ----- brown --- jumps over the lazy ---     
```  

```cpp
    CText s = _T("The quick brown fox jumps over the lazy dog");
    s.replaceAny({_T("fox"), _T("dog")}, {_T("dog"), _T("fox")});
    cout << s << endl;
```

```cpp
    CText s = _T("The quick brown Fox jumps over the lazy Dog");
    s.replaceAny({_T("fox"), _T("dog")}, {_T("dog"), _T("fox")}, false);
    cout << s << endl;
```

Output:
```
   The quick brown dog jumps over the lazy fox   
```  

```cpp
   CText s = _T("The quick brown fox jumps over the lazy dog");
   const CText::Char* words[] = {_T("quick"), _T("fox"), _T("dog")};
   s.replaceAny(words, 3, _T("****"));
   cout << s << endl;
```

Output:
```
   The **** brown **** jumps over the lazy ****  
```  

### Remove words, blocks and characters
```cpp
   CText s = _T("This is a monkey job!");
   s.remove(_T("monkey"));
   s.reduceChain(' ');
   cout << s << endl;
```

Output:
```
   This is a job!
```  

```cpp
   CText s = _T("Text containing <several> [blocks] separated by {brackets}");
   s.removeBlocks(_T("<[{"), _T(">]}"));
   s.reduceChain(' ');
   s.trim();
   cout << s << endl;
```

Output:
```
   Text containing separated by
```  

```cpp
   CText s = _T("one and two or three and five");
   s.removeAny({_T("or"), _T("and")});
   s.reduceChain(' ');
   cout << s << endl;
```

Output:
```
   one two three five
```  

### File paths 
```cpp
CText filepath = _T("D:\\Folder\\SubFolder\\TEXT\\File.dat");
cout << filepath.getExtension() << endl;
cout << filepath.getFileName() << endl;
cout << filepath.getDir() << endl;
filepath.replaceExtension(_T(".bin"));
cout << filepath << endl;
filepath.removeExtension();
cout << filepath << endl;
filepath.replaceExtension(_T(".dat"));
cout << filepath << endl;
filepath.replaceFileName(_T("File2"));
cout << filepath << endl;
filepath.addToFileName(_T("_mask"));
cout << filepath << endl;
filepath.replaceLastFolder(_T("Temp"));
cout << filepath << endl;
filepath.removeAfterSlash();
cout << filepath << endl;

```

Output
```
.dat
File.dat
D:\Folder\SubFolder\TEXT\
D:\Folder\SubFolder\TEXT\File.bin
D:\Folder\SubFolder\TEXT\File
D:\Folder\SubFolder\TEXT\File.dat
D:\Folder\SubFolder\TEXT\File2.dat
D:\Folder\SubFolder\TEXT\File2_mask.dat
D:\Folder\SubFolder\Temp\File2_mask.dat
D:\Folder\SubFolder\Temp
```

```cpp
CText path1(_T("C:\\Temp"));
CText path2(_T("..\\Folder"));
path1.pathCombine(path2.str());
cout << path1 << endl;
```

Output
```
C:\Folder
```

### Split and collection routines
```cpp
    CText s = _T("The quick  brown fox jumps  over the lazy dog");
    vector<CText> words;
    if(s.split(words) < 9)
        cout << "Error!" << endl ;
    for(auto& s : words)
        cout << s << endl;
```

```cpp
   CText s = _T("The,quick,brown,fox,jumps,over,the,lazy,dog");
   vector<std::string> words;
   if(s.split(words,false,_T(",")) != 9)
      cout << "Error!" << endl ;
   for(auto& s : words)
      cout << s << endl;
```

 Output:
```
The
quick
brown
fox
jumps
over
the
lazy
dog
```

```cpp
    CText s = "Line 1\r\nLine 2\n\nLine 3\n";
    vector<std::string> lines;
    s.collectLines(lines);
    for(auto& s : lines)
      cout << s << endl;
```

 Output:
```
Line 1
Line 2
Line 3
```


### Read sentences from text file
```cpp
#include <iostream>
#include "../CTEXT/CText.h"
#include "tchar_utils.h"

int main()
{    
    const char* input_name = "/Columbus.txt";
    const char* output_name = "/Columbus_Sentences.txt";

    CText pathIn = getcwd(0, 0);
    CText pathOut = pathIn;
    pathIn += input_name;
    pathOut += output_name;
    
    CText str;
    if(!str.readFile(pathIn.str()))
    {
        std::cerr << "Error, cannot open file: " << pathIn << std::endl;
        return 1;  // report failure with a non-zero exit code
    }
    std::vector<CText> sentences;

    str.collectSentences(sentences);

    str.fromArray(sentences, _T("\n\n") );

    str.writeFile(pathOut.str(), CText::ENCODING_UTF8);

    return 0;
}
```

### Count characters and words
```cpp
CText s = _T("12345678909678543213");
map<CText::Char, int> freq;
s.countChars(freq);
```

```cpp
CText s = _T("Nory was a Catholic because her mother was a Catholic, and Nory’s mother was a Catholic because her father was a Catholic, and her father was a Catholic because his mother was a Catholic, or had been.");
std::multimap<int, CText, std::greater<int> > freq;
s.countWordFrequencies(freq);
s.fromMap(freq);
cout << s;
```

Output:
```
Catholic 6
a 6
was 6
because 3
her 3
mother 3
and 2
father 2
Nory 1
Nory's 1
been 1
had 1
his 1
or 1
```

### Conversion routines
```cpp
CText s = _T("1 2 3 4 5 6 7 8 9");
vector<int> v;
s.toArray<int>(v);
``` 

Output:
```
{1,2,3,4,5,6,7,8,9}
```

```cpp
CText s = _T("1,2,3,4,5,6,7,8,9");
vector<int> v;
s.toArray<int>(v, _T(','));
``` 

Output:
```
{1,2,3,4,5,6,7,8,9}
```

```cpp
CText s = _T("1.1,2.2,3.3,4.4,5.5,6.6,7.7,8.8,9.9");
vector<double> v;
s.toArray<double>(v, _T(','));
```

Output:
```
{1.1,2.2,3.3,4.4,5.5,6.6,7.7,8.8,9.9}
```

From hexadecimal numbers array:
```cpp
CText s = _T("0A 1E 2A 1B");
vector<int> v;
s.toArray<int>(v, _T(' '), true);
```

Output:
```
{10, 30, 42, 27}
```

```cpp
CText s = _T("1a:2b:3c:4d:5e:6f");
vector<int> v;
s.toArray<int>(v, _T(':'), true);
```

Output:
```
{26, 43, 60, 77, 94, 111}
```

Without separator:
```cpp
CText s = _T("0A1E2A1B");
vector<int> v;
s.toArray<int>(v, 0, true);
```

Output:
```
{10, 30, 42, 27}
```
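
The three hexadecimal cases above (space-separated, colon-separated, and no separator) can be cross-checked with a few lines of Python. This sketch assumes that, without a separator, the input is read two hex digits at a time, as the last example suggests; `hex_to_ints` is a hypothetical helper, not part of CText.

```python
def hex_to_ints(s, sep=" "):
    """Parse hex numbers from `s`; with an empty `sep`, read two digits at a time."""
    if sep:
        parts = [p for p in s.split(sep) if p]
    else:
        parts = [s[i:i + 2] for i in range(0, len(s), 2)]
    return [int(p, 16) for p in parts]

print(hex_to_ints("0A 1E 2A 1B"))
print(hex_to_ints("1a:2b:3c:4d:5e:6f", ":"))
print(hex_to_ints("0A1E2A1B", ""))
```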

```cpp
// Convert hex to chars string
CText s = _T("48 65 6C 6C 6F 20 57 6F 72 6C 64");
std::vector<int> bytes;
s.toChars<int>(bytes, true);
s.fromChars<int>(bytes);
cout << s << endl;
```

Output:
```
Hello World
```

Parse numerical matrix:
```cpp
std::vector<std::vector<int>> m;
CText s = _T("1 2 3\n4 5 6\n7 8 9");
s.toMatrix<int>(m, _T(' '));
```

Output:
```
{
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9},
};
```
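
The parsing rule here is one matrix row per text line, with values split on the separator. For comparison, a plain-Python sketch of the same rule (the name `text_to_matrix` is hypothetical):

```python
def text_to_matrix(s, sep=" "):
    """Parse one row per line; values within a row are separated by `sep`."""
    return [[int(token) for token in line.split(sep) if token]
            for line in s.splitlines() if line.strip()]

matrix = text_to_matrix("1 2 3\n4 5 6\n7 8 9")
print(matrix)
```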

### Highlight words

The following makes bold all words starting with "Col", "Spa" or "Isa", ending with "an" or "as", or containing "pe" or "sea":

```cpp
vector<CText> start = {_T("Col"), _T("Spa"), _T("Isa")};
vector<CText> end = {_T("an"), _T("as")};
vector<CText> contain = {_T("pe"), _T("sea")};
str.wordsEnclose(_T("<b>"), _T("</b>"), &start, &end, &contain);
```   
     
Portugal had been the main <b>European</b> power interested in pursuing trade routes <b>overseas</b>. Their next-door neighbors, Castile (predecessor of <b>Spain</b>) had been somewhat slower to begin exploring the Atlantic <b>because</b> of the bigger land area it had to re-conquer (the Reconquista) from the Moors. It <b>was</b> not until the late 15th century, following the dynastic union of the Crowns of Castile and Aragon and the completion of the Reconquista, that the unified crowns of what would become <b>Spain</b> (although countries still legally existing) emerged and became fully committed to looking for new trade routes and colonies <b>overseas</b>. In 1492 the joint rulers conquered the Moorish kingdom of Granada, which had been providing Castile with <b>African</b> goods through tribute. <b>Columbus</b> had previously failed to convince King John II of Portugal to fund his exploration of a western route, but the new king and queen of the re-conquered <b>Spain</b> decided to fund <b>Columbus's</b> expedition in hopes of bypassing Portugal's lock on Africa and the <b>Indian</b> <b>Ocean</b>, reaching Asia by traveling west
<b>Columbus</b> <b>was</b> granted <b>an</b> audience with them; on May 1, 1489, he <b>presented</b> his plans to Queen <b>Isabella</b>, who referred them to a committee. They pronounced the idea impractical, and <b>advised</b> the monarchs not to support the <b>proposed</b> venture

## TODO List
* **More methods for words, lines, sentences and complex expressions**: There are many more methods that can be added to support different NLP and lexical tasks.
* **Further improve containers abstraction**: CText needs more conversion routines to/from STL and other containers and generic data structures.
* **Regular expressions**: Partial or full support for regular expressions.
* **Other char types**: Character types like char32_t can also be supported.
* **Mini text editor**: A text editor based on CText that I plan to port to modern C++.
* **Export to Python**: I want to export the CText library to Python 3.
* **Performance tests**: Add performance tests comparing CText with the STL string.

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/antonmilev/CText",
    "name": "ctextlib",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=2.7",
    "maintainer_email": "",
    "keywords": "",
    "author": "Anton Milev",
    "author_email": "baj.mile@abv.bg",
    "download_url": "https://files.pythonhosted.org/packages/10/21/02df03dd06fea36ab92fbe9ed8612365dbdf68bf018222079c510627ab0e/ctextlib-1.0.21.tar.gz",
    "platform": null,
    "description": "# CText\r\n# Advanced text processing library in C++ and Python \r\n\r\n## About\r\nA Modern C++ library with many useful text processing routines. CText can solve some complicated text processing tasks that otherwise are taking too much time in C++ and Python, some of these like managing lines and words are available on higher level languages like C#, Java and Python but not in C++. But C++ gives more low-level control, except supporting the missing text functions CText implements optimized text routines. Library is very flexible and scalale, it is easy to add quickly custom text processing routnes, can be used to make  pre-processing problems for different NLP and ML tasks or just to practice Modern C++. \r\n\r\n## Main Features\r\n* **Modern C++ Template library**: You only need to include one header, very simple to use.\r\n* **Unicode Support**: - you can have both UNICODE and ANSI in one project.\r\n* **Hundreds of optimized text processing methods**: - Many standard and non-standard text processing operations are covered. I have a long TODO list with much more to add. \r\n* **Clean and easy to understand code**: - You can use CText to quickly start more complicated text processing applications and abstracting from the too many lower level details and optimizations.\r\n* **Portable**:  I am using CText with VS2017/VS2019 and GCC 7.4 but it easily can be ported to other platforms.\r\n* **Stand alone**:  CText do not depends on any other libraries, the only requirements are C++11 and STL\r\n* **Scalable**:  All text routines are easily to be further extended for all commonly supported char types and platforms. 
\r\n* **Python**: Supports all Python versions \r\n\r\n\r\nPlease feel free to contact me for questions or suggestions.\r\n\r\n### Python\r\nTo install CText:\r\n```\r\npip install ctextlib\r\n```\r\n\r\nTo test if CText is installed:\r\n\r\n```python\r\nimport ctextlib\r\na = ctextlib.Text(\"Hello World\")\r\nprint(a)\r\n```\r\n\r\nOr:\r\n\r\n```python\r\nfrom ctextlib import Text as text\r\na = text(\"Hello World\")\r\nprint(a)\r\n```\r\n\r\n\r\nPython methods reference:\r\n\r\n<b>addToFileName</b>\r\n```python\r\na = text(\"C:\\\\Temp\\\\Temp2\\\\File.bmp\")\r\na.addToFileName(\"_mask\")\r\nprint(a)\r\n```\r\n\r\n```\r\nC:\\Temp\\Temp2\\File_mask.bmp\r\n```\r\n\r\n<b>append</b>\r\n```python\r\na = text(\"Hello \")\r\na.append(\"World\")\r\n```\r\n\r\n```\r\nHello World\r\n```\r\n\r\n```python\r\na = text(\"123\")\r\na.append('4',4)\r\n```\r\n\r\n```\r\n1234444\r\n```\r\n\r\n```python\r\na = text(\"\")\r\na.append(['Hello', ' ', 'World'])\r\n```\r\n\r\n```\r\nHello World\r\n```\r\n\r\n<b>appendRange</b>\r\n```python\r\na = text()\r\na.appendRange('a','z').appendRange('0','9')\r\n\r\n```\r\n\r\n```\r\nabcdefghijklmnopqrstuvwxyz0123456789\r\n```\r\n\r\n<b>between</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.between('q','d')\r\nprint(a)\r\n```\r\n\r\n```\r\nuick brown fox jumps over the lazy\r\n```\r\n\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.between('quick','lazy')\r\nprint(a)\r\n```\r\n\r\n```\r\n brown fox jumps over the\r\n```\r\n\r\n<b>contain</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\nif a.contain('quick'):\r\n    print(\"contain 'quick'\")\r\n```\r\n\r\n```\r\ncontain 'quick'\r\n```\r\n    \r\nCase-insensitive:\r\n   \r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\nif a.contain('Quick', False):\r\n    print(\"contain 'quick'\")\r\n```\r\n\r\n```\r\ncontain 'quick'\r\n```\r\n\r\n```python\r\na = text('The quick brown fox 
jumps over the lazy dog')\r\nif a.contain(['slow','fast','quick']):\r\n    print(\"contain 'quick'\")\r\n```\r\n\r\n```\r\ncontain 'quick'\r\n```\r\n\r\n<b>containAny</b>\r\n```python\r\na = text('Hello World')\r\na.containAny('abcd')\r\nTrue\r\n```\r\n<b>containOnly</b>\r\n```python\r\na = text('4365767')\r\na.containOnly('0123456789')\r\nTrue\r\n```\r\n\r\n<b>count</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.count('the', False)\r\n```\r\n\r\n```\r\n2\r\n```\r\n\r\n<b>countWordFrequencies</b>\r\n```python\r\nfrom ctextlib import Text as text\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.countWordFrequencies(False)\r\n```\r\n\r\n```\r\n[(2, 'the'), (1, 'brown'), (1, 'dog'), (1, 'fox'), (1, 'jumps'), (1, 'lazy'), (1, 'over'), (1, 'quick')]\r\n```\r\n\r\n<b>cutAfterFirst</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.cutAfterFirst('o')\r\n```\r\n\r\n```\r\nThe quick br\r\n```\r\n\r\n<b>cutAfterLast</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.cutAfterLast('o')\r\n```\r\n\r\n```\r\nThe quick brown fox jumps over the lazy d\r\n```\r\n\r\n\r\n<b>cutBeforeFirst</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.cutBeforeFirst('o')\r\n```\r\n\r\n```\r\nown fox jumps over the lazy dog\r\n```\r\n\r\n<b>cutEnds</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.cutEnds(4)\r\n```\r\n\r\n```\r\nquick brown fox jumps over the lazy\r\n```\r\n\r\n<b>cutLeft</b>\r\n```python\r\ns = text(\"Hello World\")\r\ns.cutLeft(6)\r\n```\r\n\r\n```\r\nWorld\r\n```\r\n\r\n<b>cutRight</b>\r\n```python\r\ns = text(\"Hello World\")\r\ns.cutRight(6)\r\n```\r\n\r\n```\r\nHello\r\n```\r\n\r\n<b>enclose</b>\r\n```python\r\na = text(\"Hello World\")\r\na.enclose('<','>')\r\na.enclose('\"')\r\n```\r\n\r\n```\r\n<Hello World>\r\n\"Hello World\"\r\n```\r\n\r\n<b>endsWith</b>\r\n```python\r\na = text(\"Hello 
World\")\r\nif a.endsWith('World'):\r\n    print(\"ends with 'World'\")\r\n```\r\n\r\n\r\n```\r\nends with 'World'\r\n```\r\n\r\nWith case-insensitive search:\r\n\r\n```python\r\na = text(\"Hello World\")\r\nif a.endsWith('world', False):\r\n    print(\"ends with 'world'\")\r\n```\r\n\r\n```\r\nends with 'world'\r\n```\r\n\r\n<b>endsWithAny</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\nif a.endsWithAny(['cat','dog']):\r\n    print('ends with an animal...')\r\n```\r\n\r\n```\r\nends with an animal...\r\n```\r\n\r\n<b>erase</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.erase(8, 10)\r\nprint(a)\r\n```\r\n\r\n```\r\nThe quicx jumps over the lazy dog\r\n```\r\n\r\n<b>equal</b>\r\n```python\r\na = text()\r\na.equal('A',10)\r\n```\r\n\r\n```\r\nAAAAAAAAAA\r\n```\r\n\r\n<b>find</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.find('brown')\r\n```\r\n\r\n```\r\n'brown fox jumps over the lazy dog'\r\n```\r\n\r\nWith case-insensitive search:\r\n\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.find('Brown', False)\r\n```\r\n\r\n```\r\n'brown fox jumps over the lazy dog'\r\n```\r\n\r\n<b>fromArray</b>\r\n```python\r\na = text()\r\na.fromArray([1,2,3,4])\r\nprint(a)\r\n```\r\n\r\n```\r\n1 2 3 4\r\n```\r\n\r\n```python\r\na = text()\r\na.fromArray([1,2,3,4], '|')\r\nprint(a)\r\n```\r\n\r\n```\r\n1|2|3|4\r\n```\r\n\r\n```python\r\na = text()\r\na.fromArray([1,2,3,4], '')\r\nprint(a)\r\n```\r\n\r\n```\r\n1234\r\n```\r\n\r\nArray of floats:\r\n\r\n```python\r\na = text()\r\na.fromArray([1.1,2.2,3.3,4.4])\r\nprint(a)\r\n```\r\n\r\n```\r\n1.1 2.2 3.3 4.4\r\n```\r\n\r\nArray of strings:\r\n```python\r\na = text()\r\na.fromArray(['hello','world'])\r\nprint(a)\r\n```\r\n\r\n```\r\nhello world\r\n```\r\n\r\n```python\r\nimport numpy as np\r\na = text()\r\na.fromArray(np.array([\"hello\",\"world\"]))\r\nprint(a)\r\n```\r\n\r\n```\r\nhello world\r\n```\r\n\r\n<b>fromArrayAsHex</b>\r\n```python\r\na = 
text()\r\na.fromArrayAsHex([10,20,30,40])\r\nprint(a)\r\n```\r\n\r\n```\r\n0A 14 1E 28\r\n```\r\n\r\nUse without separator\r\n\r\n```python\r\na.fromArrayAsHex([10,20,30,40],2,'')\r\nprint(a)\r\n```\r\n\r\n```\r\n0A141E28\r\n```\r\n\r\n```python\r\na = text()\r\na.fromArrayAsHex([1000,2000,3000,4000])\r\nprint(a)\r\n```\r\n\r\n```\r\n3E8 7D0 BB8 FA0\r\n```\r\n\r\n```python\r\na = text()\r\na.fromArrayAsHex([1000,2000,3000,4000], 4, ',')\r\nprint(a)\r\n```\r\n\r\n```\r\n03E8,07D0,0BB8,0FA0\r\n```\r\n\r\n<b>fromBinary</b>\r\n```python\r\na = text()\r\na.fromBinary(12345)\r\nprint(a)\r\n```\r\n\r\n```\r\n00000000000000000011000000111001\r\n```\r\n\r\n<b>fromDouble</b>\r\n```python\r\na = text()\r\na.fromDouble(3.333338478)\r\nprint(a)\r\na.fromDouble(3.33989, 4)\r\nprint(a)\r\na.fromDouble(3.333338478, 10)\r\n```\r\n\r\n```\r\n3.333338\r\n3.3399\r\n3.3333384780\r\n```\r\n\r\n<b>fromHex</b>\r\n```python\r\na = text()\r\na.fromHex(1234567)\r\na.fromHex('a')\r\na.fromHex(\"48 65 6C 6C 6F 20 57 6F 72 6C 64\")\r\n```\r\n\r\n```\r\n0012D687\r\n61\r\nHello World\r\n```\r\n\r\n<b>fromInteger</b>\r\n```python\r\na = text()\r\na.fromInteger(358764)\r\nprint(a)\r\n```\r\n\r\n```\r\n358764\r\n```\r\n\r\n<b>fromMatrix</b>\r\n```python\r\nfrom ctextlib import Text as text\r\nimport numpy as np\r\nx = np.array([[10, 20, 30], [40, 50, 60]])\r\na = text()\r\na.fromMatrix(x)\r\nprint(a)\r\n```\r\n\r\n```\r\n10 20 30\r\n40 50 60\r\n```\r\n\r\n```python\r\nfrom ctextlib import Text as text\r\nimport numpy as np\r\nx = np.array([[10, 20, 30], [40, 50, 60]])\r\na = text()\r\na.fromMatrix(x, ',')\r\n\r\n```\r\n\r\n```\r\n10,20,30\r\n40,50,60\r\n```\r\n\r\n<b>fromMatrixAsHex</b>\r\n```python\r\nfrom ctextlib import Text as text\r\nimport numpy as np\r\nx = np.array([[10, 20, 30], [40, 50, 60]])\r\na = text()\r\na.fromMatrixAsHex(x)\r\nprint(a)\r\n```\r\n\r\n```\r\n0A 14 1E\r\n28 32 3C\r\n```\r\n\r\n```python\r\nfrom ctextlib import Text as text\r\nimport numpy as np\r\nx = np.array([[1000, 
2000, 3000], [4000, 5000, 6000]])\r\na = text()\r\na.fromMatrixAsHex(x,4)\r\nprint(a)\r\n```\r\n\r\n```\r\n03E8 07D0 0BB8\r\n0FA0 1388 1770\r\n```\r\n\r\n<b>getDir</b>\r\n```python\r\na = text(\"D:\\\\Folder\\\\SubFolder\\\\TEXT\\\\file.dat\")\r\na.getDir()\r\n```\r\n\r\n```\r\nD:\\Folder\\SubFolder\\TEXT\\\r\n```\r\n\r\n<b>getExtension</b>\r\n```python\r\na = text(\"D:\\\\Folder\\\\SubFolder\\\\TEXT\\\\file.dat\")\r\na.getExtension()\r\n```\r\n\r\n```\r\n'.dat'\r\n```\r\n\r\n<b>getFileName</b>\r\n```python\r\na = text(\"D:\\\\Folder\\\\SubFolder\\\\TEXT\\\\file.dat\")\r\na.getFileName()\r\n```\r\n\r\n```\r\n'file.dat'\r\n```\r\n\r\n<b>hash</b>\r\n```python\r\ns.hash()\r\n```\r\n\r\n```\r\n9257130453210036571\r\n```\r\n\r\n<b>indexOf</b>\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog.\")\r\na.indexOf(\"brown\")\r\n```\r\n\r\n```\r\n10\r\n```\r\n\r\n<b>indexOfAny</b>\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog.\")\r\na.indexOfAny([\"fox\", \"dog\"])\r\n```\r\n\r\n```\r\n16\r\n```\r\n\r\n<b>indexOfAny</b>\r\n\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog.\")\r\na.indexOfAny(\"abc\")\r\n```\r\n\r\n```\r\n7\r\n```\r\n\r\n<b>insert</b>\r\n```python\r\na = text(\"abc\")\r\na.insert(1,'d',2)\r\n```\r\n\r\n```\r\naddbc\r\n```\r\n\r\n```python\r\na = text(\"The quick jumps over the lazy dog.\")\r\na.insert(10,\"fox \")\r\n```\r\n\r\n```\r\nThe quick fox jumps over the lazy dog.\r\n```\r\n\r\n<b>insertAtBegin</b>\r\n<br><b>insertAtEnd</b>\r\n```python\r\na = text(\"Hello\")\r\na.insertAtBegin(\"<begin>\")\r\na.insertAtEnd(\"</begin>\")\r\n```\r\n\r\n```\r\n<begin>Hello</begin>\r\n```\r\n\r\n<b>isAlpha</b>\r\n```python\r\na = text(\"Abcd\")\r\na.isAlpha()\r\nTrue\r\n```\r\n\r\n<b>isBinary</b>\r\n```python\r\na = text(\"01111011100001\")\r\na.isBinary()\r\nTrue\r\n```\r\n\r\n<b>isEmpty</b>\r\n```python\r\na = text()\r\na.isEmpty()\r\nTrue\r\n```\r\n\r\n<b>isHexNumber</b>\r\n```python\r\na = 
text(\"12AB56FE\")\r\na.isHexNumber()\r\nTrue\r\n```\r\n\r\n<b>isNumber</b>\r\n```python\r\na = text(\"123456\")\r\na.isNumber()\r\nTrue\r\n```\r\n\r\n<b>isLower</b>\r\n```python\r\na = text(\"hello world\")\r\na.isLower()\r\nTrue\r\n```\r\n\r\n<b>isUpper</b>\r\n```python\r\na = text(\"HELLO WORLD\")\r\na.isUpper()\r\nTrue\r\n```\r\n\r\n<b>isPalindrome</b>\r\n```python\r\na = text(\"racecar\")\r\na.isPalindrome()\r\nTrue\r\n```\r\n\r\n<b>keep</b>\r\n```python\r\ns = text(\"Hello World\").keep(3,5)\r\n```\r\n\r\n```\r\nlo Wo\r\n```\r\n\r\n<b>keepLeft</b>\r\n```python\r\na = text(\"The quick jumps over the lazy dog.\")\r\na.keepLeft(10)\r\n```\r\n\r\n```\r\nThe quick\r\n```\r\n\r\n<b>keepRight</b>\r\n```python\r\na = text(\"The quick jumps over the lazy dog.\")\r\na.keepRight(10)\r\n```\r\n\r\n```\r\n lazy dog.\r\n```\r\n\r\n<b>lastIndexOf</b>\r\n```python\r\ns = text(\"Hello World\")\r\ns.lastIndexOf('l')\r\n```\r\n\r\n```\r\n9\r\n```\r\n\r\n<b>lines</b>\r\n```python\r\na = text(\"L1\\nL2\\n\\nL3\\nL4\\n  \\n\\nL5\")\r\na.lines()\r\n```\r\n\r\n```\r\n['L1', 'L2', 'L3', 'L4', 'L5']\r\n```\r\n\r\n<b>linesCount</b>\r\n```python\r\na = text(\"L1\\nL2\\n\\nL3\\nL4\\n  \\n\\nL5\")\r\na.linesCount()\r\n```\r\n\r\n```\r\n7\r\n```\r\n\r\n<b>linesRemoveEmpty</b>\r\n```python\r\na = text(\"L1\\nL2\\n\\nL3\\nL4\\n  \\n\\nL5\")\r\na.linesRemoveEmpty()\r\nprint(a)\r\n```\r\n\r\n```\r\nL1\r\nL2\r\nL3\r\nL4\r\nL5\r\n```\r\n\r\nSeveral per line methods\r\n<br><b>linesAppend</b>\r\n<br><b>linesInsertAtBegin</b>\r\n<br><b>linesSort</b>\r\n<br><b>linesPaddRight</b>\r\n<br><b>linesTrim</b>\r\n<br>Example of opening a text file, sort all lines, and save it with another name\r\n```python\r\nfrom ctextlib import Text as text\r\ns = text()\r\ns.readFile('Unordered.txt')\r\ns.linesSort()\r\ns.writeFile('Sorted_python.txt')\r\n```\r\n\r\n<b>limit</b>\r\n```python\r\ns = text(\"Hello World\")\r\ns.limit(6)\r\n```\r\n\r\n```\r\nHello\r\n```\r\n\r\n<b>lower</b>\r\n```python\r\ns = text(\"Hello 
World\")\r\ns.lower()\r\n```\r\n\r\n```\r\nhello world\r\n```\r\n\r\n<b>makeUnique</b>\r\n```python\r\na = text()\r\na.appendRange('a','z').appendRange('a','z')\r\nabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\r\na.makeUnique()\r\nprint(a)\r\n```\r\n\r\n```\r\nabcdefghijklmnopqrstuvwxyz\r\n```\r\n\r\n<b>mid</b>\r\n```python\r\na = text(\"Hello World\").mid(3)\r\n```\r\n\r\n```\r\nlo Wo\r\n```\r\n\r\n<b>nextLine</b>\r\n```python\r\n# Example of iterating all lines\r\nfrom ctextlib import Text as text\r\na = text(\"Line1\\nLine2\\nLine3\")\r\nline = text()\r\npos = 0\r\nwhile(pos >= 0):\r\n    pos = a.nextLine(pos,line)\r\n    print(line)\r\n```\r\n\r\n```\r\nLine1\r\nLine2\r\nLine3\r\n```\r\n\r\n<b>nextWord</b>\r\n```python\r\n# Example of iterating all words\r\nfrom ctextlib import Text as text\r\na = text('The quick brown fox jumps over the lazy dog')\r\nword = text()\r\npos = 0\r\nwhile(pos >= 0):\r\n    pos = a.nextWord(pos,word)\r\n    print(word)\r\n```\r\n\r\n```\r\nThe\r\nquick\r\nbrown\r\nfox\r\njumps\r\nover\r\nthe\r\nlazy\r\ndog\r\n```\r\n\r\n<b>paddLeft</b>\r\n```python\r\ns = text(\"Abra\")\r\ns.paddLeft('.', 16)\r\n```\r\n\r\n```\r\n............Abra\r\n```\r\n\r\n<b>paddRight</b>\r\n```python\r\ns = text(\"Abra\")\r\ns.paddRight('.', 16)\r\n```\r\n\r\n```\r\nAbra............\r\n```\r\n\r\n<b>pathCombine</b>\r\n```python\r\na = text(\"C:\\\\Temp\")\r\na.pathCombine(\"..\\\\Folder\")\r\n```\r\n\r\n```\r\nC:\\Folder\r\n```\r\n\r\n<b>quote</b>\r\n```python\r\na = text(\"Hello\")\r\na.quote()\r\n```\r\n\r\n```\r\n\"Hello\"\r\n```\r\n\r\n<b>random</b>\r\n```python\r\na = text()\r\na.random()\r\n\"P1kAlMiG2Kb7FzP5\"\r\na.sort()\r\n\"1257AFGKMPPbiklz\"\r\na.shuffle()\r\n\"k2lF7KAPG5M1Pzbi\"\r\na.random(32)\r\nP1kAlMiG2Kb7FzP5tM1QBI6DSS92c31A\r\n```\r\n\r\n<b>randomAlpha</b>\r\n```python\r\ns = text()\r\ns.randomAlpha()\r\nIkEffmzNiMKKASVW\r\n```\r\n\r\n<b>randomNumber</b>\r\n```python\r\ns = 
text()\r\ns.randomNumber()\r\n3892795431\r\ns.randomNumber(32)\r\n33341138742779319865028602486509\r\n```\r\n\r\n<b>readFile</b>\r\n```python\r\n# demonstrates how to read a whole text file\r\nfrom ctextlib import Text as text\r\na = text()\r\na.readFile('test.txt')\r\nprint(a)\r\n```\r\n\r\n```\r\nHello World\r\n```\r\n\r\n\r\n<b>regexMatch</b>\r\n```python\r\ns = text(\"+336587890078\")\r\nif(s.regexMatch(\"(\\\\+|-)?[[:digit:]]+\")):\r\n    print(\"it is a number\")\r\n```\r\n\r\n```\r\nit is a number\r\n```\r\n\r\n<b>regexLines</b>\r\n```txt\r\nanimals.txt\r\n------------\r\nCat\r\nDog\r\nGiraffe\r\nLion\r\nLlama\r\nMonkey\r\nMouse\r\nParrot\r\nPoodle\r\nScorpion\r\nSnake\r\nWeasel\r\n```\r\n\r\n```python\r\n# collect all lines starting with given characters\r\nfrom ctextlib import Text as text\r\na = text()\r\na.readFile(\"animals.txt\")\r\na.regexLines(\"^[A-G][a-z]+\")\r\n```\r\n\r\n```\r\n['Cat', 'Dog', 'Giraffe']\r\n```\r\n\r\n<b>regexReplace</b>    \r\n```python\r\nfrom ctextlib import Text as text\r\na = text(\"there is sub-sequence in the sub-way string\")\r\na.regexReplace(\"\\\\b(sub)([^ ]*)\", \"sub-$2\")\r\n```\r\n\r\n```\r\nthere is sub--sequence in the sub--way string\r\n```\r\n\r\n<b>regexSearch</b>    \r\n```python\r\n# collect all words using regex\r\nfrom ctextlib import Text as text\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.regexSearch(\"\\\\w+\")\r\n```\r\n   \r\n```\r\n['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']\r\n```\r\n\r\n<b>regexWords</b>    \r\n```python\r\n# collect all words starting with given characters\r\nfrom ctextlib import Text as text\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.regexWords(\"^[a-n][a-z]+\")\r\n```\r\n   \r\n```\r\n['brown', 'fox', 'jumps', 'lazy', 'dog']   \r\n```\r\n   \r\n<b>remove</b>\r\n```python\r\na = text('we few, we happy few, we band of brothers.')\r\na.remove('we')\r\na.reduceChain()\r\na.trim()\r\n```\r\n\r\n```\r\nfew happy few 
band of brothers\r\n```\r\n\r\n<b>removeAny</b>\r\n```python\r\nfrom ctextlib import Text as text\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.removeAny(['brown','quick','lazy'])\r\na.reduceChain()\r\n```\r\n\r\n```\r\nThe fox jumps over the dog\r\n```\r\n\r\n<b>removeExtension</b>\r\n```python\r\na = text(\"D:\\\\Folder\\\\SubFolder\\\\TEXT\\\\File.dat\")\r\na.removeExtension()\r\n```\r\n\r\n```\r\nD:\\Folder\\SubFolder\\TEXT\\File\r\n```\r\n\r\n<b>removeFileName</b>\r\n```python\r\na = text(\"D:\\\\Folder\\\\SubFolder\\\\TEXT\\\\File.dat\")\r\na.removeFileName()\r\n```\r\n\r\n```\r\nD:\\Folder\\SubFolder\\TEXT\\\r\n```\r\n\r\n<b>removeWhileBegins</b>\r\n```python\r\na = text(\"Some text ending with something\")\r\na.removeWhileBegins(\"Some text \")\r\nprint(a)\r\n```\r\n\r\n```\r\nending with something\r\n```\r\n\r\n<b>removeWhileEnds</b>\r\n```python\r\na = text(\"Some text ending with something\")\r\na.removeWhileEnds(\" something\")\r\nprint(a)\r\n```\r\n\r\n```\r\nSome text ending with\r\n```\r\n\r\n<b>replace</b>\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.replace(\"fox\", \"cat\")\r\nprint(a)\r\n```\r\n\r\n```\r\nThe quick brown cat jumps over the lazy dog\r\n```\r\n\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.replace([\"fox\", \"cat\",\"dog\",\"quick\"], \"-\")\r\n```\r\n\r\n```\r\nThe ----- brown --- jumps over the lazy ---\r\n```\r\n\r\n<b>replaceAny</b>\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.replaceAny([\"fox\", \"cat\",\"dog\"], \"***\")\r\nprint(a)\r\n```\r\n\r\n```\r\nThe quick brown *** jumps over the lazy ***\r\n```\r\n\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.replaceAny([\"fox\", \"dog\"], [\"dog\", \"fox\"])\r\n```\r\n\r\n```\r\nThe quick brown dog jumps over the lazy fox\r\n```\r\n\r\n<b>reverse</b>\r\n```python\r\na = 
text(\"Hello\")\r\na.reverse()\r\n```\r\n\r\n```\r\nolleH\r\n```\r\n\r\n\r\n\r\n\r\n\r\n\r\n<b>right</b>\r\n```python\r\na = text(\"Hello World\")\r\na.right(5)\r\n```\r\n\r\n```\r\nWorld\r\n```\r\n\r\n<b>rotate</b>\r\n```python\r\na = text(\"Hello World\")\r\na.rotateLeft(2)\r\na.rotateRight(4)\r\n```\r\n\r\nOutput\r\n```\r\nllo WorldHe\r\nldHello Wor\r\n```\r\n\r\n<b>split</b>\r\n```python\r\n# by default split uses the standard separators (\" \\t\\r\\n\")\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.split()\r\n```\r\n\r\n```\r\n['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']\r\n```\r\n\r\n```python\r\n# split can be used with any list of separator characters\r\na = text(\"The quick, brown....fox,,, ,jumps over,the  lazy.dog\")\r\na.split(\",. \")\r\n```\r\n\r\n```\r\n['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']\r\n```\r\n\r\n<b>toBinary</b>\r\n```python\r\nbOk = False\r\na = text(\"100001\")\r\na.toBinary(bOk)\r\n33\r\n```\r\n\r\n<b>toHex</b>\r\n```python\r\na = text(\"Hello World\")\r\na.toHex()\r\nprint(a)\r\n```\r\n\r\n```\r\n48 65 6C 6C 6F 20 57 6F 72 6C 64\r\n```\r\n\r\nUsing separator character. 
\r\n\r\n```python\r\na = text(\"Hello World\")\r\na.toHex(',')\r\nprint(a)\r\n```\r\n\r\n```\r\n48,65,6C,6C,6F,20,57,6F,72,6C,64\r\n```\r\n\r\n<b>toHex</b>\r\n```python\r\nbOk = False\r\na = text(\"1E1E\")\r\na.toHex(bOk)\r\n7710\r\n```\r\n\r\n<b>trim</b>\r\n```python\r\na = text(\" \\t\\n   lazy dog  \\t\\n   \")\r\na.trim()\r\nlazy dog\r\na = text(\"000000000000101\")\r\na.trimLeft(\"0\")\r\n101\r\na = text(\"101000000000000\")\r\na.trimRight('0')\r\n101\r\na = text(\"0000000101000000000\")\r\na.trim(\"0\")\r\n101\r\n```\r\n\r\n<b>upper</b>\r\n```python\r\ns = text(\"Hello World\")\r\ns.upper()\r\n```\r\n\r\n```\r\nHELLO WORLD\r\n```\r\n\r\n<b>words</b>\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.words()\r\n```\r\n\r\n```\r\n['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']\r\n```\r\n\r\n```python\r\na = text(\"The|quick|brown|fox|jumps|over|the|lazy|dog\")\r\na.words('|')\r\n```\r\n\r\n```\r\n['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']\r\n```\r\n\r\n<b>wordsCapitalize</b>\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.wordsCapitalize()\r\n```\r\n\r\n```\r\nThe Quick Brown Fox Jumps Over The Lazy Dog\r\n```\r\n\r\n<b>wordsCount</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.wordsCount()\r\n```\r\n\r\n```\r\n9\r\n```\r\n\r\n<b>wordsEnclose</b>\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.wordsEnclose('[',']')\r\n```\r\n\r\n```\r\n[The] [quick] [brown] [fox] [jumps] [over] [the] [lazy] [dog]\r\n```\r\n\r\n<b>wordsReverse</b>\r\n```python\r\na = text(\"The quick brown fox jumps over the lazy dog\")\r\na.wordsReverse()\r\n```\r\n\r\n```\r\nehT kciuq nworb xof spmuj revo eht yzal god\r\n```\r\n\r\n\r\n<b>wordsSort</b>\r\n```python\r\na = text('The quick brown fox jumps over the lazy dog')\r\na.wordsSort()\r\n```\r\n\r\nOutput\r\n```\r\nThe brown dog fox jumps lazy over quick 
the\r\n```\r\n\r\n<b>writeFile</b>\r\n```python\r\n# demonstrates how to write to a text file\r\nfrom ctextlib import Text as text\r\na = text(\"Hello World\")\r\na.writeFile('test.txt')\r\nprint(a)\r\n```\r\n\r\n\r\n## UNICODE for Python\r\nPython represents strings as Unicode, and UTF-8 is the default encoding in Python 3. When processing Python text that contains non-English Unicode characters, it is recommended to use the Unicode version of CText, as demonstrated below:\r\n```python\r\n# demonstrate text processing of Swedish unicode text\r\nfrom ctextlib import TextU as text\r\ns = text('Den snabbbruna r\u00e4ven hoppar \u00f6ver den lata hunden')\r\ns.cutBeforeFirst('\u00f6')\r\n```\r\n\r\n```\r\n\u00f6ver den lata hunden\r\n```\r\n\r\n```python\r\n# demonstrate text processing of Russian unicode text\r\nfrom ctextlib import TextU as text\r\ns = text('\u0411\u044b\u0441\u0442\u0440\u0430\u044f \u043a\u043e\u0440\u0438\u0447\u043d\u0435\u0432\u0430\u044f \u043b\u0438\u0441\u0430 \u043f\u0440\u044b\u0433\u0430\u0435\u0442 \u043d\u0430 \u043b\u0435\u043d\u0438\u0432\u0443\u044e \u0441\u043e\u0431\u0430\u043a\u0443')\r\ns.cutAfterLast('\u044b')\r\n```\r\n\r\n```\r\n\u0411\u044b\u0441\u0442\u0440\u0430\u044f \u043a\u043e\u0440\u0438\u0447\u043d\u0435\u0432\u0430\u044f \u043b\u0438\u0441\u0430 \u043f\u0440\r\n```\r\n\r\n```python\r\n# demonstrate text processing of Czech unicode text\r\nfrom ctextlib import TextU as text\r\ns = text('Rychl\u00e1 hn\u011bd\u00e1 li\u0161ka sko\u010d\u00ed p\u0159es l\u00edn\u00e9ho psa')\r\ns.cutAfterFirst('\u00e1', True)\r\n```\r\n\r\n```\r\nRychl\u00e1\r\n```\r\n\r\n\r\n```python\r\n# demonstrate text processing of Greek unicode text\r\nfrom ctextlib import TextU as text\r\ns = text('\u0397 \u03b3\u03c1\u03ae\u03b3\u03bf\u03c1\u03b7 \u03ba\u03b1\u03c6\u03ad \u03b1\u03bb\u03b5\u03c0\u03bf\u03cd \u03c0\u03b7\u03b4\u03ac\u03b5\u03b9 \u03c0\u03ac\u03bd\u03c9 \u03b1\u03c0\u03cc \u03c4\u03bf \u03c4\u03b5\u03bc\u03c0\u03ad\u03bb\u03b9\u03ba\u03bf 
\u03c3\u03ba\u03c5\u03bb\u03af')\r\ns.cutAfterFirst('\u03ad', True)\r\n```\r\n\r\n```\r\n\u0397 \u03b3\u03c1\u03ae\u03b3\u03bf\u03c1\u03b7 \u03ba\u03b1\u03c6\u03ad\r\n```\r\n\r\n\r\n```python\r\n# demonstrate text processing of Armenian unicode text\r\nfrom ctextlib import TextU as text\r\ns = text('\u0531\u0580\u0561\u0563 \u0577\u0561\u0563\u0561\u0576\u0561\u056f\u0561\u0563\u0578\u0582\u0575\u0576 \u0561\u0572\u057e\u0565\u057d\u0568 \u0581\u0561\u057f\u056f\u0578\u0582\u0574 \u0567 \u056e\u0578\u0582\u0575\u056c \u0577\u0561\u0576 \u057e\u0580\u0561')\r\ns.cutBeforeFirst('\u0567')\r\n```\r\n \r\n```\r\n\u0567 \u056e\u0578\u0582\u0575\u056c \u0577\u0561\u0576 \u057e\u0580\u0561\r\n```\r\n\r\n```python\r\n# demonstrate text processing of Georgian unicode text\r\nfrom ctextlib import TextU as text\r\ns = text('\u10e1\u10ec\u10e0\u10d0\u10e4\u10d8 \u10e7\u10d0\u10d5\u10d8\u10e1\u10e4\u10d4\u10e0\u10d8 \u10db\u10d4\u10da\u10d0 \u10d2\u10d0\u10d3\u10d0\u10d1\u10db\u10e3\u10da\u10d8\u10d0 \u10d6\u10d0\u10e0\u10db\u10d0\u10ea\u10d8 \u10eb\u10d0\u10e6\u10da\u10d8')\r\ns.cutBeforeFirst('\u10db')\r\n```\r\n\r\n```\r\n\u10db\u10d4\u10da\u10d0 \u10d2\u10d0\u10d3\u10d0\u10d1\u10db\u10e3\u10da\u10d8\u10d0 \u10d6\u10d0\u10e0\u10db\u10d0\u10ea\u10d8 \u10eb\u10d0\u10e6\u10da\u10d8\r\n```\r\n\r\nFor the full info type help(text).\r\n\r\n\r\n## Build CText Unit Test and Demo projects\r\n\r\n<br>To build the UnitTest project and the demos with CMake and Visual Studio:\r\n<br> open terminal in the folder \\Apps and type\r\n<br>cmake .\r\n<br>Alternatively, you can load in VS2017 or later \\Apps\\CMakeLists.txt from File->Open->CMake.., after generates cache is completed, choose CMake->Build All\r\n\r\n<br>To compile with GCC in Debug or Release:\r\n<br>cmake -D CMAKE_BUILD_TYPE=Release .\r\n<br>cmake -D CMAKE_BUILD_TYPE=Debug .\r\n<br>\r\n<br>This will build a console application that runs the Unit Tests.\r\n<br>\r\n<br> Also there is a Visual Studio solution (CText.sln) with all 
projects. Run the UnitTests project first to see if all tests pass.\r\n\r\n\r\n<br>\r\n## C++ Examples\r\n\r\nFor all examples of how to use CText, please see the Unit Test project.\r\n\r\n### Sort all lines in a text file\r\n\r\n```cpp\r\n// this example reads a text file and sorts all lines in alphabetical order.\r\n#include <iostream>\r\n#include \"../CTEXT/CText.h\"\r\n#include \"tchar_utils.h\"\r\n\r\nint main()\r\n{    \r\n    const char* input_name = \"/Unsorted.txt\";\r\n    const char* output_name = \"/Sorted.txt\";\r\n\r\n    CText pathIn = getcwd(0, 0);\r\n    CText pathOut = pathIn;\r\n    pathIn += input_name;\r\n    pathOut += output_name;\r\n    \r\n    CText str;\r\n    if(!str.readFile(pathIn.str()))\r\n    {\r\n        std::cerr << \"Error, cannot open file: \" << pathIn << std::endl;\r\n        return 1;\r\n    }\r\n    str.linesSort();\r\n    str.writeFile(pathOut.str(), CText::ENCODING_ASCII);\r\n\r\n    return 0;\r\n}\r\n```\r\n\r\n### Replace words\r\n```cpp\r\n    CText s = _T(\"The quick brown fox jumps over the lazy dog\");\r\n    s.replace(_T(\"brown\"), _T(\"red\"));\r\n    cout << s << endl;\r\n```\r\nOutput:\r\n```\r\n   The quick red fox jumps over the lazy dog \r\n```  \r\n\r\n```cpp\r\n    CText s = _T(\"The quick brown fox jumps over the lazy dog\");\r\n    const CText::Char* words[] = {_T(\"quick\"), _T(\"fox\"), _T(\"dog\")};\r\n    s.replaceAny(words, 3, _T('-'));\r\n    cout << s << endl;\r\n```\r\n\r\nOutput:\r\n```\r\n   The ----- brown --- jumps over the lazy ---     \r\n```  \r\n\r\n```cpp\r\n    CText s = _T(\"The quick brown fox jumps over the lazy dog\");\r\n    s.replaceAny({_T(\"fox\"), _T(\"dog\")}, {_T(\"dog\"), _T(\"fox\")});\r\n    cout << s << endl;\r\n```\r\n\r\n```cpp\r\n    CText s = _T(\"The quick brown Fox jumps over the lazy Dog\");\r\n    s.replaceAny({_T(\"fox\"), _T(\"dog\")}, {_T(\"dog\"), _T(\"fox\")}, false);\r\n    cout << s << endl;\r\n```\r\n\r\nOutput:\r\n```\r\n   The quick brown dog jumps over the lazy fox 
  \r\n```  \r\n\r\n```cpp\r\n   CText s = _T(\"The quick brown fox jumps over the lazy dog\");\r\n   const CText::Char* words[] = {_T(\"quick\"), _T(\"fox\"), _T(\"dog\")};\r\n   s.replaceAny(words, 3, _T(\"****\"));\r\n   cout << s << endl;\r\n```\r\n\r\nOutput:\r\n```\r\n   The **** brown **** jumps over the lazy ****  \r\n```  \r\n\r\n### Remove words, blocks and characters\r\n```cpp\r\n   CText s = _T(\"This is a monkey job!\");\r\n   s.remove(_T(\"monkey\"));\r\n   s.reduceChain(' ');\r\n   cout << s << endl;\r\n```\r\n\r\nOutput:\r\n```\r\n   This is a job!\r\n```  \r\n\r\n```cpp\r\n   CText s = _T(\"Text containing <several> [blocks] separated by {brackets}\");\r\n   s.removeBlocks(_T(\"<[{\"), _T(\">]}\"));\r\n   s.reduceChain(' ');\r\n   s.trim();\r\n   cout << s << endl;\r\n```\r\n\r\nOutput:\r\n```\r\n   Text containing separated by\r\n```  \r\n\r\n```cpp\r\n   s = _T(\"one and two or three and five\");\r\n   s.removeAny({_T(\"or\"), _T(\"and\")});\r\n   s.reduceChain(' ');\r\n   cout << s << endl;\r\n```\r\n\r\nOutput:\r\n```\r\n   one two three five\r\n```  \r\n\r\n### File paths \r\n```cpp\r\nCText filepath = _T(\"D:\\\\Folder\\\\SubFolder\\\\TEXT\\\\File.dat\");\r\ncout << filepath.getExtension() << endl;\r\ncout << filepath.getFileName() << endl;\r\ncout << filepath.getDir() << endl;\r\nfilepath.replaceExtension(_T(\".bin\"));\r\ncout << filepath << endl;\r\nfilepath.removeExtension();\r\ncout << filepath << endl;\r\nfilepath.replaceExtension(_T(\".dat\"));\r\ncout << filepath << endl;\r\nfilepath.replaceFileName(_T(\"File2\"));\r\ncout << filepath << endl;\r\nfilepath.addToFileName(_T(\"_mask\"));\r\ncout << filepath << endl;\r\nfilepath.replaceLastFolder(_T(\"Temp\"));\r\ncout << filepath << endl;\r\nfilepath.removeAfterSlash();\r\ncout << filepath << 
endl;\r\n\r\n```\r\n\r\nOutput:\r\n```\r\n.dat\r\nFile.dat\r\nD:\\Folder\\SubFolder\\TEXT\\\r\nD:\\Folder\\SubFolder\\TEXT\\File.bin\r\nD:\\Folder\\SubFolder\\TEXT\\File\r\nD:\\Folder\\SubFolder\\TEXT\\File.dat\r\nD:\\Folder\\SubFolder\\TEXT\\File2.dat\r\nD:\\Folder\\SubFolder\\TEXT\\File2_mask.dat\r\nD:\\Folder\\SubFolder\\Temp\\File2_mask.dat\r\nD:\\Folder\\SubFolder\\Temp\r\n```\r\n\r\n```cpp\r\nCText path1(_T(\"C:\\\\Temp\"));\r\nCText path2(_T(\"..\\\\Folder\"));\r\npath1.pathCombine(path2.str());\r\ncout << path1 << endl;\r\n```\r\n\r\nOutput:\r\n```\r\nC:\\Folder\r\n```\r\n\r\n### Split and collection routines\r\n```cpp\r\n    CText s = _T(\"The quick  brown fox jumps  over the lazy dog\");\r\n    vector<CText> words;\r\n    if(s.split(words) < 9)\r\n        cout << \"Error!\" << endl ;\r\n    for(auto& s : words)\r\n        cout << s << endl;\r\n```\r\n\r\n```cpp\r\n   CText s = _T(\"The,quick,brown,fox,jumps,over,the,lazy,dog\");\r\n   vector<std::string> words;\r\n   if(s.split(words,false,_T(\",\")) != 9)\r\n      cout << \"Error!\" << endl ;\r\n   for(auto& s : words)\r\n      cout << s << endl;\r\n```\r\n\r\nOutput:\r\n```\r\nThe\r\nquick\r\nbrown\r\nfox\r\njumps\r\nover\r\nthe\r\nlazy\r\ndog\r\n```\r\n\r\n```cpp\r\n    CText s = \"Line 1\\r\\nLine 2\\n\\nLine 3\\n\";\r\n    vector<std::string> lines;\r\n    s.collectLines(lines);\r\n    for(auto& s : lines)\r\n      cout << s << endl;\r\n```\r\n\r\nOutput:\r\n```\r\nLine 1\r\nLine 2\r\nLine 3\r\n```\r\n\r\n\r\n### Read sentences from text file\r\n```cpp\r\n#include <iostream>\r\n#include \"../CTEXT/CText.h\"\r\n#include \"tchar_utils.h\"\r\n\r\nint main()\r\n{    \r\n    const char* input_name = \"/Columbus.txt\";\r\n    const char* output_name = \"/Columbus_Sentences.txt\";\r\n\r\n    CText pathIn = getcwd(0, 0);\r\n    CText pathOut = pathIn;\r\n    pathIn += input_name;\r\n    pathOut += output_name;\r\n    \r\n    CText str;\r\n    if(!str.readFile(pathIn.str()))\r\n    {\r\n        std::cerr << 
\"Error, can not open file: \" << pathIn << std::endl;\r\n        return 0;\r\n    }\r\n    std::vector<CText> sentences;\r\n\r\n    str.collectSentences(sentences);\r\n\r\n    str.fromArray(sentences, _T(\"\\n\\n\") );\r\n\r\n    str.writeFile(pathOut.str(), CText::ENCODING_UTF8);\r\n\r\n    return 0;\r\n}\r\n```\r\n\r\n### Count characters and words\r\n```cpp\r\nCText s = _T(\"12345678909678543213\");\r\nmap<CText::Char, int> freq;\r\ns.countChars(freq);\r\n```\r\n\r\n```cpp\r\nCText s = _T(\"Nory was a Catholic because her mother was a Catholic, and Nory\u2019s mother was a Catholic because her father was a Catholic, and her father was a Catholic because his mother was a Catholic, or had been.\");\r\nstd::multimap<int, CText, std::greater<int> > freq;\r\ns.countWordFrequencies(freq);\r\ns.fromMap(freq);\r\ncout << s;\r\n```\r\n\r\nOutput:\r\n```\r\nCatholic 6\r\na 6\r\nwas 6\r\nbecause 3\r\nher 3\r\nmother 3\r\nand 2\r\nfather 2\r\nNory 1\r\nNory's 1\r\nbeen 1\r\nhad 1\r\nhis 1\r\nor 1\r\n```\r\n\r\n### Conversion routines\r\n```cpp\r\nCText s = _T(\"1 2 3 4 5 6 7 8 9\");\r\nvector<int> v;\r\ns.toArray<int>(v);\r\n``` \r\n\r\nOutput:\r\n```\r\n{1,2,3,4,5,6,7,8,9}\r\n```\r\n\r\n```cpp\r\nCText s = _T(\"1,2,3,4,5,6,7,8,9\");\r\nvector<int> v;\r\ns.toArray<int>(v, _T(','));\r\n``` \r\n\r\nOutput:\r\n```\r\n{1,2,3,4,5,6,7,8,9}\r\n```\r\n\r\n```cpp\r\nCText s = _T(\"1.1,2.2,3.3,4.4,5.5,6.6,7.7,8.8,9.9\");\r\nvector<double> v;\r\ns.toArray<double>(v, _T(','));\r\n```\r\n\r\nOutput:\r\n```\r\n{1.1,2.2,3.3,4.4,5.5,6.6,7.7,8.8,9.9}\r\n```\r\n\r\nFrom hexadecimal numbers array:\r\n```cpp\r\nCText s = _T(\"0A 1E 2A 1B\");\r\nvector<int> v;\r\ns.toArray<int>(v, _T(' '), true);\r\n```\r\n\r\nOutput:\r\n```\r\n{10, 30, 42, 27}\r\n```\r\n\r\n```cpp\r\nCText s = _T(\"1a:2b:3c:4d:5e:6f\");\r\nvector<int> v;\r\ns.toArray<int>(v, _T(':'), true);\r\n```\r\n\r\nOutput:\r\n```\r\n{26, 43, 60, 77, 94, 111}\r\n```\r\n\r\nWithout separator:\r\n```cpp\r\nCText s = 
_T(\"0A1E2A1B\");\r\nvector<int> v;\r\ns.toArray<int>(v, 0, true);\r\n```\r\n\r\nOutput:\r\n```\r\n{10, 30, 42, 27}\r\n```\r\n\r\nConvert hex to a character string:\r\n```cpp\r\nCText s = _T(\"48 65 6C 6C 6F 20 57 6F 72 6C 64\");\r\nstd::vector<int> bytes;\r\ns.toChars<int>(bytes, true);\r\ns.fromChars<int>(bytes);\r\ncout << s << endl;\r\n```\r\n\r\nOutput:\r\n```\r\nHello World\r\n```\r\n\r\nParse numerical matrix:\r\n```cpp\r\nstd::vector<std::vector<int>> m;\r\nCText s = _T(\"1 2 3\\n4 5 6\\n7 8 9\");\r\ns.toMatrix<int>(m, _T(' '));\r\n```\r\n\r\nOutput:\r\n```\r\n{\r\n    {1, 2, 3},\r\n    {4, 5, 6},\r\n    {7, 8, 9},\r\n};\r\n```\r\n\r\n### Highlight words\r\n\r\nThe following makes bold all words starting with \"Col\", \"Spa\", or \"Isa\", ending in \"an\" or \"as\", or containing \"pe\" or \"sea\":\r\n\r\n```cpp\r\nvector<CText> start = {_T(\"Col\"), _T(\"Spa\"), _T(\"Isa\")};\r\nvector<CText> end = {_T(\"an\"), _T(\"as\")};\r\nvector<CText> contain = {_T(\"pe\"), _T(\"sea\")};\r\nstr.wordsEnclose(_T(\"<b>\"), _T(\"</b>\"), &start, &end, &contain);\r\n```   \r\n     \r\nPortugal had been the main <b>European</b> power interested in pursuing trade routes <b>overseas</b>. Their next-door neighbors, Castile (predecessor of <b>Spain</b>) had been somewhat slower to begin exploring the Atlantic <b>because</b> of the bigger land area it had to re-conquer (the Reconquista) from the Moors. It <b>was</b> not until the late 15th century, following the dynastic union of the Crowns of Castile and Aragon and the completion of the Reconquista, that the unified crowns of what would become <b>Spain</b> (although countries still legally existing) emerged and became fully committed to looking for new trade routes and colonies <b>overseas</b>. In 1492 the joint rulers conquered the Moorish kingdom of Granada, which had been providing Castile with <b>African</b> goods through tribute. 
<b>Columbus</b> had previously failed to convince King John II of Portugal to fund his exploration of a western route, but the new king and queen of the re-conquered <b>Spain</b> decided to fund <b>Columbus's</b> expedition in hopes of bypassing Portugal's lock on Africa and the <b>Indian</b> <b>Ocean</b>, reaching Asia by traveling west\r\n<b>Columbus</b> <b>was</b> granted <b>an</b> audience with them; on May 1, 1489, he <b>presented</b> his plans to Queen <b>Isabella</b>, who referred them to a committee. They pronounced the idea impractical, and <b>advised</b> the monarchs not to support the <b>proposed</b> venture\r\n\r\n## TODO List\r\n* **More methods for words, lines, sentences, and complex expressions**:  Many more methods can be added to support different NLP and lexical tasks.\r\n* **Further improve containers abstraction**: CText needs more conversion routines to/from STL and other containers and generic data structures.\r\n* **Regular Expressions**: - Partial or full support for regular expressions.\r\n* **Other char types**: - Character types such as char32_t could also be supported.\r\n* **Mini Text Editor**: - This is a text editor based on CText that I plan to port to Modern C++.\r\n* **Export to Python**: - I want to export the CText library to Python 3.\r\n* **Performance Test**: - Add performance tests comparing CText with the STL string.\r\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "Python package with CText C++ extension",
    "version": "1.0.21",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6c3c390a07e7ff1dec68304ec8c8ce920ac3f4ad518877e43db2cf1144835662",
                "md5": "d9eaa9be972ea15c579d7bf45a0689c3",
                "sha256": "b8ff4e58f275c905ad319115810dcbc5347ea8f8775e88f93c038f8815784473"
            },
            "downloads": -1,
            "filename": "ctextlib-1.0.21-cp310-cp310-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "d9eaa9be972ea15c579d7bf45a0689c3",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=2.7",
            "size": 332154,
            "upload_time": "2023-04-17T22:11:17",
            "upload_time_iso_8601": "2023-04-17T22:11:17.965690Z",
            "url": "https://files.pythonhosted.org/packages/6c/3c/390a07e7ff1dec68304ec8c8ce920ac3f4ad518877e43db2cf1144835662/ctextlib-1.0.21-cp310-cp310-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cd6742e65f7406acb14f765392c74dff2ded7a94f3955336a4dacdfbb12079d1",
                "md5": "c1710b0e9664d20363ade89db8c7f3bc",
                "sha256": "b6e6e4ab31d61de04440d4cf3d222a279b7fa1e9a8df3c050c45cd0565927d4c"
            },
            "downloads": -1,
            "filename": "ctextlib-1.0.21-cp311-cp311-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "c1710b0e9664d20363ade89db8c7f3bc",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=2.7",
            "size": 332159,
            "upload_time": "2023-04-17T22:11:23",
            "upload_time_iso_8601": "2023-04-17T22:11:23.918151Z",
            "url": "https://files.pythonhosted.org/packages/cd/67/42e65f7406acb14f765392c74dff2ded7a94f3955336a4dacdfbb12079d1/ctextlib-1.0.21-cp311-cp311-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "38c4f7ac151e4d3b8628bbd3a159df0e2c46044ecbe3153063b4f6958b99681b",
                "md5": "71fefa90f3d440ff904f0cb08b8fca12",
                "sha256": "83ab0f9e87a0cb3bcf3f45e5cd784df6321f4e290a07f18ea6b0cfea015a60d7"
            },
            "downloads": -1,
            "filename": "ctextlib-1.0.21-cp312-cp312-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "71fefa90f3d440ff904f0cb08b8fca12",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=2.7",
            "size": 332119,
            "upload_time": "2023-04-17T22:11:30",
            "upload_time_iso_8601": "2023-04-17T22:11:30.535193Z",
            "url": "https://files.pythonhosted.org/packages/38/c4/f7ac151e4d3b8628bbd3a159df0e2c46044ecbe3153063b4f6958b99681b/ctextlib-1.0.21-cp312-cp312-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a257c92d20010c03127b2c60cb0050a629490b75687fb8acc60899131251f545",
                "md5": "2a619d8741ab2b62a04113ed43bda75f",
                "sha256": "5342e1aa4c6faf9f3baa39cd5b3b7073fe0bc85cd506b97d71137fefdea4cafe"
            },
            "downloads": -1,
            "filename": "ctextlib-1.0.21-cp36-cp36m-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "2a619d8741ab2b62a04113ed43bda75f",
            "packagetype": "bdist_wheel",
            "python_version": "cp36",
            "requires_python": ">=2.7",
            "size": 333630,
            "upload_time": "2023-04-17T22:11:36",
            "upload_time_iso_8601": "2023-04-17T22:11:36.212260Z",
            "url": "https://files.pythonhosted.org/packages/a2/57/c92d20010c03127b2c60cb0050a629490b75687fb8acc60899131251f545/ctextlib-1.0.21-cp36-cp36m-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a7be13f2f461ab24bbf5b9c45b5073f9f54510f65d3772520e5cbdc8a5e4b454",
                "md5": "cfd7a643ad74e33777050b11c47abca8",
                "sha256": "721f8448d6980cade458d62e59fa257bfa7de59071b5ec08f871210bebde357f"
            },
            "downloads": -1,
            "filename": "ctextlib-1.0.21-cp37-cp37m-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "cfd7a643ad74e33777050b11c47abca8",
            "packagetype": "bdist_wheel",
            "python_version": "cp37",
            "requires_python": ">=2.7",
            "size": 333698,
            "upload_time": "2023-04-17T22:11:43",
            "upload_time_iso_8601": "2023-04-17T22:11:43.355860Z",
            "url": "https://files.pythonhosted.org/packages/a7/be/13f2f461ab24bbf5b9c45b5073f9f54510f65d3772520e5cbdc8a5e4b454/ctextlib-1.0.21-cp37-cp37m-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9df08f89f312cbc832e4846cbf679905f230b8c625dd3365df01d0c36026bd18",
                "md5": "5da8450e558163c0fab1baa4786bcfc6",
                "sha256": "2efe7b01ae94bf1c43f78447d8c383b5cfdd5a03f80d31742264e416099535e2"
            },
            "downloads": -1,
            "filename": "ctextlib-1.0.21-cp38-cp38-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "5da8450e558163c0fab1baa4786bcfc6",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=2.7",
            "size": 330983,
            "upload_time": "2023-04-17T22:11:49",
            "upload_time_iso_8601": "2023-04-17T22:11:49.735862Z",
            "url": "https://files.pythonhosted.org/packages/9d/f0/8f89f312cbc832e4846cbf679905f230b8c625dd3365df01d0c36026bd18/ctextlib-1.0.21-cp38-cp38-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "305c3e2254eba986616e98a43028936b176175400196f8d2646bee27eb638201",
                "md5": "1ce7777b95a2812b98328d1b325e9d9e",
                "sha256": "d72cbc2a7a51c27690e67c7e5e65eae51ccb996486681da74fa6de7fb7156c2c"
            },
            "downloads": -1,
            "filename": "ctextlib-1.0.21-cp39-cp39-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "1ce7777b95a2812b98328d1b325e9d9e",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=2.7",
            "size": 332205,
            "upload_time": "2023-04-17T22:11:58",
            "upload_time_iso_8601": "2023-04-17T22:11:58.222670Z",
            "url": "https://files.pythonhosted.org/packages/30/5c/3e2254eba986616e98a43028936b176175400196f8d2646bee27eb638201/ctextlib-1.0.21-cp39-cp39-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "102102df03dd06fea36ab92fbe9ed8612365dbdf68bf018222079c510627ab0e",
                "md5": "604264d2907b4c8c6dabc4f47117fcc9",
                "sha256": "6f6d7e9971414b8b951f77e325d0be39bbdf53ee4c066c5cd32f8ac30a6663c7"
            },
            "downloads": -1,
            "filename": "ctextlib-1.0.21.tar.gz",
            "has_sig": false,
            "md5_digest": "604264d2907b4c8c6dabc4f47117fcc9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=2.7",
            "size": 218123,
            "upload_time": "2023-04-17T22:12:04",
            "upload_time_iso_8601": "2023-04-17T22:12:04.669828Z",
            "url": "https://files.pythonhosted.org/packages/10/21/02df03dd06fea36ab92fbe9ed8612365dbdf68bf018222079c510627ab0e/ctextlib-1.0.21.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-17 22:12:04",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "antonmilev",
    "github_project": "CText",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "ctextlib"
}

Anton Milev