Name | file-utils-operations |
Version | 0.1.3 |
home_page | None |
Summary | This is a python library to parse files, it's giving tools to easily read a file with efficiency. It's based on linux commands like grep, sed, cat, head, tail and tested with them. |
upload_time | 2024-11-30 18:38:36 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.8 |
license | None |
keywords | head, tail, parse, count_lines, utf |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# file utils
## Table of contents
- [Intro](#Intro)
- [Installation](#Installation)
    - [python](#python)
    - [rust](#rust)
- [Before starting](#Before-starting)
- [Arguments explanation](#Arguments-explaination)

Examples:
- **WithEOL: python**:
    - [Example-file](#Example-file)
    - Examples:
        - [Example-simple-head](#Example-simple-head-python)
        - [Example-simple-tail](#Example-simple-tail-python)
        - [Example-simple-between](#Example-simple-between-python)
        - [Example-simple-parse](#Example-simple-parse-python)
        - [Example-simple-count_lines](#Example-simple-count_lines-python)
        - [Example-remove_empty_string](#Example-remove_empty_string-python)
        - [Example-regex_keep](#Example-regex_keep-python)
        - [Example-regex_pass](#Example-regex_pass-python)
        - [Example-restrict](#Example-restrict-python)
- **WithCustomDelims: python**:
    - [How to use it?](#How-to-use-it-python)
    - [What delim can be used?](#What-delim-can-be-used)
    - [With more than one delimiter?](#With-more-than-one-delimiter)
- [How to use the rust crate?](#How-to-use-the-rust-crate?)
- [Python class](#Python-class)
- [Rust Structure](#Rust-Structure)
- [Structure](#Structure)
## Intro
This package reads and parses files from Python. It is meant for really big files (> 100 000 lines), where the usual ways of parsing a file in Python fall short:
```py
with open("my_file", "r") as f:
    buffer: str = f.read()
    ...
```
or:
```py
with open("my_file", "r") as f:
    for line in f.readlines():
        ...
```
- With the first one, there is a memory issue because the entire file must be held in a buffer.
- With the second one, there is a time issue because a loop can be very slow in Python (and `readlines()` also builds the full list of lines in memory).

So this package gives tools to read a file easily and efficiently. It's based on Linux tools like **grep**, **sed**, **cat**, **head**, **tail** and tested against them. \
The **WithEOL** class has the same memory problem as the first example. To avoid it, use **WithCustomDelims** with the **"\n"** delimiter. \
So why keep **WithEOL**? \
**WithEOL** helps me test the code: it uses a built-in Rust function, and I use it as a reference to compare with **WithCustomDelims**.
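The memory argument can be illustrated without the library. The sketch below is plain Python and is not the package's Rust implementation: it reads a file in fixed-size chunks and yields one piece per delimiter, so it holds roughly one chunk plus one unfinished piece at a time. The name `iter_pieces` and its parameters are hypothetical and chosen only for this illustration.

```python
import os
import tempfile
from typing import Iterator


def iter_pieces(path: str, delimiter: str = "\n", buffer_size: int = 1024) -> Iterator[str]:
    """Yield the pieces of a UTF-8 file separated by `delimiter`, chunk by chunk."""
    delim = delimiter.encode("utf-8")
    pending = b""  # bytes read so far that are not closed by a delimiter yet
    with open(path, "rb") as f:
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                break
            # Pieces before the last delimiter are complete; the rest waits for more data.
            *complete, pending = (pending + chunk).split(delim)
            for piece in complete:
                yield piece.decode("utf-8")
    if pending:
        yield pending.decode("utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.txt")
        with open(sample, "w", encoding="utf-8", newline="\n") as f:
            f.write("alpha\nbeta\ngamma\n")
        # A tiny buffer forces several reads, which exercises the carry-over logic.
        print(list(iter_pieces(sample, buffer_size=4)))
```

Working on bytes and decoding only complete pieces keeps multi-byte UTF-8 characters intact even when a chunk boundary falls inside one. If the delimiter never occurs, the pending buffer still grows to the size of the file.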
## Installation
### python
With **pypi**:
```sh
pip install file-utils-operations
```
From source:
```sh
maturin develop
```
### rust
```sh
cargo add file_utils_operations
```
## Before-starting
This package is ASCII/UTF-8 compliant; files in any other encoding will not work.
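If you are not sure how a file is encoded, you can check it before handing it to the package. This is a minimal sketch using only the Python standard library; `is_valid_utf8` is a hypothetical helper, not part of the package.

```python
import codecs
import os
import tempfile


def is_valid_utf8(path: str, buffer_size: int = 65536) -> bool:
    """Return True if the whole file decodes as UTF-8 (ASCII is a subset of UTF-8)."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        with open(path, "rb") as f:
            while chunk := f.read(buffer_size):
                decoder.decode(chunk)
        # Flush the decoder: this fails if the file ends in the middle of a character.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "latin1.txt")
        with open(sample, "wb") as f:
            f.write("pêche\n".encode("latin-1"))
        print(is_valid_utf8(sample))
```

The incremental decoder reads the file chunk by chunk, so the check itself does not load the whole file into memory.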
## Arguments-explaination
- **path**: the path to the file
- **remove_empty_string**: ignore empty strings, i.e. strings matching **"[ ]\*"** (nothing but spaces)
- **n**: the number of lines to get with **tail**/**head**
- **n1**: the first line to take with **between**
- **n2**: the last line to take with **between**
- **restrict**: if enabled, the N lines are selected first and the regex filters are applied only within them, so fewer than N lines can be returned. If disabled, the first (or last) N lines that pass the regex filters are returned

with **regex**:
- **regex_keep**: list of regex; only matching lines are kept
- **regex_pass**: list of regex; matching lines are passed over (ignored)
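The three filtering arguments can be modelled as a single per-line decision. The sketch below is a plain-Python model for illustration, not the library code; `keep_line` is a hypothetical name, and two points are assumptions: a pattern counts as matching when it is found anywhere in the line, and "empty" means zero or more spaces.

```python
import re
from typing import Sequence


def keep_line(line: str,
              remove_empty_string: bool = False,
              regex_keep: Sequence[str] = (),
              regex_pass: Sequence[str] = ()) -> bool:
    """Decide whether one line survives the filters (a model, not the library code)."""
    # Assumption: "empty" means zero or more spaces, following the "[ ]*" pattern.
    if remove_empty_string and re.fullmatch(r"[ ]*", line):
        return False
    # Assumption: a pattern matches when it is found anywhere in the line.
    if regex_keep and not any(re.search(p, line) for p in regex_keep):
        return False
    if any(re.search(p, line) for p in regex_pass):
        return False
    return True


if __name__ == "__main__":
    print(keep_line("[Warning]:Indentation", regex_keep=[r"\[Warning\]:*"]))
    print(keep_line("[Error]:Segfault", regex_pass=[r"\[Error\]:*"]))
    print(keep_line(" ", remove_empty_string=True))
```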
## WithEOL-python
### Example-file
The examples below use the file **test.txt**.
Its content, shown with **cat -e test.txt** (each `$` marks an end of line):
```txt
[Warning]:Entity not found$
[Error]:Unable to recover data$
[Info]:Segfault$
[Warning]:Indentation$
[Error]:Memory leaks$
[Info]:Entity not found$
[Warning]:Unable to recover data$
 $
[Error]:Segfault$
[Info]:Indentation$
[Warning]:Memory leaks$
```
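To run the examples yourself, the file can be recreated with a few lines of Python. This is a small helper sketch; `EXAMPLE_LINES`, `example_text` and `write_example_file` are hypothetical names, and line 8 is assumed to hold a single space, as the `' '` entry in the parse output suggests.

```python
EXAMPLE_LINES = [
    "[Warning]:Entity not found",
    "[Error]:Unable to recover data",
    "[Info]:Segfault",
    "[Warning]:Indentation",
    "[Error]:Memory leaks",
    "[Info]:Entity not found",
    "[Warning]:Unable to recover data",
    " ",  # assumption: the "empty" line contains one space
    "[Error]:Segfault",
    "[Info]:Indentation",
    "[Warning]:Memory leaks",
]


def example_text() -> str:
    """Return the file content, every line terminated by a newline."""
    return "".join(line + "\n" for line in EXAMPLE_LINES)


def write_example_file(path: str = "test.txt") -> None:
    """Write the example content to `path` as UTF-8 with Unix line endings."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(example_text())


if __name__ == "__main__":
    write_example_file()
    print(f"wrote {len(EXAMPLE_LINES)} lines to test.txt")
```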
### Example-simple-head-python
Simple head (the same call works with **tail**). \
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 2 # Number of lines to read
try:
    head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n)
    print(head)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Warning]:Entity not found', '[Error]:Unable to recover data']
```
### Example-simple-tail-python
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 2 # Number of lines to read
try:
    tail: list = file_utils_operations_lib.WithEOL.tail(path=path, n=n)
    print(tail)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Info]:Indentation', '[Warning]:Memory leaks']
```
### Example-simple-between-python
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n1: int = 2 # First line to read
n2: int = 4 # Last line to read
try:
    between: list = file_utils_operations_lib.WithEOL.between(path=path, n1=n1, n2=n2)
    print(between)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Error]:Unable to recover data', '[Info]:Segfault', '[Warning]:Indentation']
```
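The line selection in the head, tail and between examples can be cross-checked with the Python standard library. The functions below are reference sketches, not the package's implementation; the names `head_ref`, `tail_ref` and `between_ref` are hypothetical. Lines are numbered from 1 and both bounds of **between** are included, as the output above shows.

```python
from collections import deque
from itertools import islice
from typing import Iterable, List


def head_ref(lines: Iterable[str], n: int) -> List[str]:
    """First n lines."""
    return list(islice(lines, n))


def tail_ref(lines: Iterable[str], n: int) -> List[str]:
    """Last n lines; the deque never holds more than n of them."""
    return list(deque(lines, maxlen=n))


def between_ref(lines: Iterable[str], n1: int, n2: int) -> List[str]:
    """Lines n1..n2, numbered from 1, both bounds included."""
    return list(islice(lines, max(n1 - 1, 0), max(n2, 0)))


if __name__ == "__main__":
    # 11 numbered placeholders stand in for the 11 lines of test.txt.
    sample = [f"line {i}" for i in range(1, 12)]
    print(head_ref(sample, 2))
    print(tail_ref(sample, 2))
    print(between_ref(sample, 2, 4))
```

All three accept any iterable, so they also work on an open file object without loading it entirely.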
### Example-simple-parse-python
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
try:
    parse: list = file_utils_operations_lib.WithEOL.parse(path=path)
    print(parse)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Warning]:Entity not found', '[Error]:Unable to recover data', '[Info]:Segfault', '[Warning]:Indentation', '[Error]:Memory leaks', '[Info]:Entity not found', '[Warning]:Unable to recover data', ' ', '[Error]:Segfault', '[Info]:Indentation', '[Warning]:Memory leaks']
```
### Example-simple-count_lines-python
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
try:
    count: int = file_utils_operations_lib.WithEOL.count_lines(path=path)
    print(count)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
11
```
### Example-remove_empty_string-python
With **remove_empty_string** enabled: \
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 4 # Number of lines to read
try:
    tail: list = file_utils_operations_lib.WithEOL.tail(path=path, n=n, remove_empty_string=True)
    print(tail)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Warning]:Unable to recover data', '[Error]:Segfault', '[Info]:Indentation', '[Warning]:Memory leaks']
```
With **remove_empty_string** disabled (default option): \
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 4 # Number of lines to read
try:
    tail: list = file_utils_operations_lib.WithEOL.tail(path=path, n=n, remove_empty_string=False)
    print(tail)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
[' ', '[Error]:Segfault', '[Info]:Indentation', '[Warning]:Memory leaks']
```
### Example-regex_keep-python
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 4 # Number of lines to read
try:
    # Raw strings keep the backslashes of the regex intact
    head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n, remove_empty_string=False, regex_keep=[r"\[Warning\]:*", r"\[Error\]:*"])
    print(head)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Warning]:Entity not found', '[Error]:Unable to recover data', '[Warning]:Indentation']
```
Why are there only 3 elements instead of 4? See the **restrict** option.
### Example-regex_pass-python
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 4 # Number of lines to read
try:
    # Raw strings keep the backslashes of the regex intact
    head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n, remove_empty_string=False, regex_pass=[r"\[Warning\]:*", r"\[Error\]:*"])
    print(head)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Info]:Segfault']
```
Why is there only 1 element instead of 4? See the **restrict** option.
### Example-restrict-python
With **restrict** disabled: \
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 4 # Number of lines to read
try:
    head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n, remove_empty_string=False, regex_keep=[r"\[Warning\]:*", r"\[Error\]:*"], restrict=False)
    print(head)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Warning]:Entity not found', '[Error]:Unable to recover data', '[Warning]:Indentation', '[Error]:Memory leaks']
```
With **restrict** enabled (default): \
Code:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 4 # Number of lines to read
try:
    head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n, remove_empty_string=False, regex_keep=[r"\[Warning\]:*", r"\[Error\]:*"], restrict=True)
    print(head)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Warning]:Entity not found', '[Error]:Unable to recover data', '[Warning]:Indentation']
```
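The two outputs differ only in the order of "select" and "filter". The following plain-Python model reproduces both results on the first five lines of **test.txt**. It is a sketch of the documented behaviour, not the library code: `head_model`, `FIRST_LINES` and `PATTERNS` are hypothetical names, and matching a pattern anywhere in the line is an assumption.

```python
import re
from itertools import islice
from typing import List, Sequence

FIRST_LINES = [
    "[Warning]:Entity not found",
    "[Error]:Unable to recover data",
    "[Info]:Segfault",
    "[Warning]:Indentation",
    "[Error]:Memory leaks",
]
PATTERNS = [r"\[Warning\]:*", r"\[Error\]:*"]


def head_model(lines: Sequence[str], n: int,
               regex_keep: Sequence[str], restrict: bool = True) -> List[str]:
    """Model of head with regex_keep, with and without restrict."""
    def matches(line: str) -> bool:
        return any(re.search(pattern, line) for pattern in regex_keep)

    if restrict:
        # Select the first n lines, then filter them: fewer than n lines may remain.
        return [line for line in lines[:n] if matches(line)]
    # Filter while reading and stop once n matching lines have been collected.
    return list(islice((line for line in lines if matches(line)), n))


if __name__ == "__main__":
    print(head_model(FIRST_LINES, 4, PATTERNS, restrict=True))
    print(head_model(FIRST_LINES, 4, PATTERNS, restrict=False))
```

With **restrict** enabled, the `[Info]` line uses up one of the four slots and is then filtered out, which is why only three lines come back.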
## WithCustomDelims-python
### How-to-use-it-python
It works like **WithEOL**, but with a list of custom delimiters. For example:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 2 # Number of lines to read
try:
    head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n)
    print(head)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Warning]:Entity not found', '[Error]:Unable to recover data']
```
behaves the same as
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 2 # Number of lines to read
try:
    head: list = file_utils_operations_lib.WithCustomDelims.head(path=path, n=n, delimiter=["\n"])
    print(head)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['[Warning]:Entity not found', '[Error]:Unable to recover data']
```
So you use it the same way as **WithEOL**, with an extra list of custom delimiters.
### What-delim-can-be-used
Any string can be used as a delimiter, for example:
- ";"
- "abc"
- "éà"
- "::"
- "小六号"
- "毫"
### With-more-than-one-delimiter
If my file contains:
```sh
;À ;la ;;
pêche éèaux moules, @moules, ::小六号moules::Je n'veux小六号 plus ::y
aller éèmaman小六号
```
With the delimiters ";", "\n", "éè", "@", "小六号", "::", we get:
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
try:
    # "小六号" is part of the delimiter list: the output below is split on it too
    parse: list = file_utils_operations_lib.WithCustomDelims.parse(path=path, delimiter=[";", "\n", "éè", "@", "小六号", "::"])
    print(parse)
except Exception:
    print("Unable to open/read the file")
```
Stdout:
```sh
['', 'À ', 'la ', '', '', 'pêche ', 'aux moules, ', 'moules, ', '', 'moules', "Je n'veux", ' plus ', 'y ', 'aller ', 'maman', '']
```
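The same split can be reproduced with the standard `re` module, which is handy for checking an expected result on a small in-memory string. This is only an illustrative sketch: it loads the whole text, unlike **WithCustomDelims**, and it is not how the package works internally. The names `split_on_delimiters`, `TEXT` and `DELIMS` are hypothetical, the text is assumed to have no trailing newline, and trying the longest delimiter first is a choice made here, not a documented rule of the package.

```python
import re
from typing import List, Sequence


def split_on_delimiters(text: str, delimiters: Sequence[str]) -> List[str]:
    """Split `text` on any of the (non-empty) delimiters, keeping empty pieces."""
    # Longest delimiters first, so a delimiter is never shadowed by one of its prefixes.
    ordered = sorted(delimiters, key=len, reverse=True)
    # re.escape makes characters such as "@" or "::" match literally.
    pattern = "|".join(re.escape(d) for d in ordered)
    return re.split(pattern, text)


TEXT = (
    ";À ;la ;;\n"
    "pêche éèaux moules, @moules, ::小六号moules::Je n'veux小六号 plus ::y \n"
    "aller éèmaman小六号"
)
DELIMS = [";", "\n", "éè", "@", "小六号", "::"]

if __name__ == "__main__":
    print(split_on_delimiters(TEXT, DELIMS))
```

Two delimiters in a row, or a delimiter at the very start or end of the text, produce an empty string, which explains the `''` entries in the output above.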
## How-to-use-the-rust-crate?
You must import the library with
```rs
use file_utils_operations_lib::with_custom_delims::WithCustomDelims;
```
or
```rs
use file_utils_operations_lib::with_eol::WithEOL;
```
Then you can use the same functions as in Python, because they behave the same way. \
Example:
```rs
use file_utils_operations_lib::with_custom_delims::WithCustomDelims;
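fn main() {
    let mut delimiters: Vec<String> = Vec::new();
    delimiters.push("\n".to_string());
    let n: usize = 10;
    // Arguments are positional, in the same order as in the Python signature below
    let res: Vec<String> = WithCustomDelims::head(
        "my path".to_string(), // path
        n,                     // n
        delimiters,            // delimiter
        false,                 // remove_empty_string
        Vec::new(),            // regex_keep
        Vec::new(),            // regex_pass
        true,                  // restrict
        1024,                  // buffer_size
    );
}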
```
It behaves like the following Python call (apart from the path and the value of **n**):
```py
import file_utils_operations_lib
path: str = "my_path_to_file"
n: int = 2 # Number of lines to read
try:
    head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n)
    print(head)
except Exception:
    print("Unable to open/read the file")
```
## Python-class
If we translate the Rust into Python, we get these signatures:
```py
class WithEOL:
    # head: Read the n first lines
    # if n > (number of lines in the file) => return the whole file
    @staticmethod
    def head(path: str, n: int,
             remove_empty_string: bool = False,
             regex_keep: list = [],
             regex_pass: list = [],
             restrict: bool = True) -> list:
        ...

    # between: Read the lines [n1, n2]
    # if n1 > n2 => return an empty list
    # if n1 > (number of lines in the file) => return an empty list
    @staticmethod
    def between(path: str, n1: int, n2: int,
                remove_empty_string: bool = False,
                regex_keep: list = [],
                regex_pass: list = [],
                restrict: bool = True) -> list:
        ...

    # tail: Read the n last lines
    # if n > (number of lines in the file) => return the whole file
    @staticmethod
    def tail(path: str, n: int,
             remove_empty_string: bool = False,
             regex_keep: list = [],
             regex_pass: list = [],
             restrict: bool = True) -> list:
        ...

    # parse: Read the whole file
    @staticmethod
    def parse(path: str,
              remove_empty_string: bool = False,
              regex_keep: list = [],
              regex_pass: list = []) -> list:
        ...

    # count_lines: Count the number of lines
    @staticmethod
    def count_lines(path: str,
                    remove_empty_string: bool = False,
                    regex_keep: list = [],
                    regex_pass: list = []) -> int:
        ...


class WithCustomDelims:
    # head: Read the n first lines
    # if n > (number of lines in the file) => return the whole file
    @staticmethod
    def head(path: str, n: int, delimiter: list,
             remove_empty_string: bool = False,
             regex_keep: list = [],
             regex_pass: list = [],
             restrict: bool = True,
             buffer_size: int = 1024) -> list:
        ...

    # between: Read the lines [n1, n2]
    # if n1 > n2 => return an empty list
    # if n1 > (number of lines in the file) => return an empty list
    @staticmethod
    def between(path: str, n1: int, n2: int, delimiter: list,
                remove_empty_string: bool = False,
                regex_keep: list = [],
                regex_pass: list = [],
                restrict: bool = True,
                buffer_size: int = 1024) -> list:
        ...

    # tail: Read the n last lines
    # if n > (number of lines in the file) => return the whole file
    @staticmethod
    def tail(path: str, n: int, delimiter: list,
             remove_empty_string: bool = False,
             regex_keep: list = [],
             regex_pass: list = [],
             restrict: bool = True,
             buffer_size: int = 1024) -> list:
        ...

    # parse: Read the whole file
    @staticmethod
    def parse(path: str, delimiter: list,
              remove_empty_string: bool = False,
              regex_keep: list = [],
              regex_pass: list = [],
              buffer_size: int = 1024) -> list:
        ...

    # count_lines: Count the number of lines
    @staticmethod
    def count_lines(path: str, delimiter: list,
                    remove_empty_string: bool = False,
                    regex_keep: list = [],
                    regex_pass: list = [],
                    buffer_size: int = 1024) -> int:
        ...
```
## Rust-Structure
Take a look at [https://docs.rs/file_utils_operations/latest/file_utils_operations_lib/](https://docs.rs/file_utils_operations/latest/file_utils_operations_lib/)
## Structure
- **src/**: all source files
- **tests/**: all Rust tests
- **tests_files/**: all files used by the tests
- **tests_python/**: a Python test script
Raw data
{
"_id": null,
"home_page": null,
"name": "file-utils-operations",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "head, tail, parse, count_lines, utf",
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/ae/13/5247d8a8a4f97d7efe7f254468aa4a9c628dc7c3bcf85e6ccd24c6af5f50/file_utils_operations-0.1.3.tar.gz",
"platform": null,
"description": "# file utils\n\n## Table of contents\n\n- [Intro](#Intro)\n- [Installation](#Installation)\n - [python](#python)\n - [rust](#rust)\n- [Before starting](#Before-starting)\n- [Arguments-explaination](#Arguments-explaination)\n\nExamples:\n- **WithEOL: python**:\n - [Example-file](#Example-file)\n - Examples:\n - [Example-simple-head](#Example-simple-head-python)\n - [Example-simple-tail](#Example-simple-tail-python)\n - [Example-simple-between](#Example-simple-between-python)\n - [Example-simple-parse](#Example-simple-parse-python)\n - [Example-simple-count_lines](#Example-simple-count_lines-python)\n - [Example-remove_empty_string](#Example-remove_empty_string-python)\n - [Example-regex_keep](#Example-regex_keep-python)\n - [Example-regex_pass](#Example-regex_pass-python)\n - [Example-restrict](#Example-restrict-python)\n- **WithCustomDelims: python**:\n - [How to use it?](#How-to-use-it-python)\n - [What delim can be used?](#What-delim-can-be-used-python)\n - [With more than one delimiter?](#With-more-than-one-delimiter-python)\n- [How to use the rust crate?](#How-to-use-the-rust-crate?)\n\n- [Python class](#Python-class)\n- [Rust Structure](#Rust-Structure)\n- [Structure](#Structure)\n\n## Intro\n\nThis package allows to read/parse a file in python. When should we use this package? If your file is really big (> 100 000 lines), because if you want to parse a file in python you'll write:\n```py\nf = open(\"my_file\", \"r\")\nbuffer: str = f.read()\n...\n```\nor:\n```py\nf = open(\"my_file\", \"r\")\nfor line in f.readlines():\n ...\n```\n- With the first one, there is a memory issue because you must save the entire file into a buffer. \n- With the second one, there is a time issue because a loop can be very slow in python.\n\nSo, this package gives tools to easily read a file with efficiently. It's based on Linux tools like **grep**, **sed**, **cat**, **head**, **tail** and tested with them. 
\\\n**WithEOL** class as the same memory problem as the first example. If you want to resolve it, you must use **WithCustomDelims** with the **\"\\n\"** delimiter. \\\nSo, why I keep **WithEOL**? \\\n**WithEOL** is helping me to test the code, it's using a built in rust function and I'm using it as a reference to compare with **WithCustomDelims**.\n\n## Installation\n\n### python\n\nWith **pypi**:\n```sh\npip install file-utils-operations\n```\n\nFrom source:\n```sh\nmaturin develop\n```\n\n### rust\n\n```sh\ncargo add file_utils_operations\n```\n\n## Before-starting\n\nThis package is ASCII/UTF-8 compliant, all others encoded files will not work...\n\n## Arguments-explaination\n\n- **path**: the path to the file\n- **remove_empty_string**: ignore the empty string **\"[ ]\\*\"**\n- **n**: get n lines with **tail**/**head** \n- **n1**: the beginning line to take with **between**\n- **n2**: the last line to take with **between**\n- **restrict**: if enable, if we have last N lines, it just keep the regex in those lines. 
If not enable, it takes last N regex\n\nwith **regex**:\n- **regex_keep**: list of regex to keep\n- **regex_pass**: list of regex to pass/ignore\n\n## WithEOL-python\n\n### Example-file\n\nWe will use this example file **test.txt**\n\nWith **cat -e test.txt**:\n\n```txt\n[Warning]:Entity not found$\n[Error]:Unable to recover data$\n[Info]:Segfault$\n[Warning]:Indentation$\n[Error]:Memory leaks$\n[Info]:Entity not found$\n[Warning]:Unable to recover data$\n $\n[Error]:Segfault$\n[Info]:Indentation$\n[Warning]:Memory leaks$\n ```\n\n### Example-simple-head-python\n\n1\\ Simple head (can be change to tail)\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 2 # Number of lines to read\n\ntry:\n head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n)\n print(head)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Warning]:Entity not found', '[Error]:Unable to recover data']\n```\n\n### Example-simple-tail-python\n\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 2 # Number of lines to read\n\ntry:\n tail: list = file_utils_operations_lib.WithEOL.tail(path=path, n=n)\n print(tail)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Info]:Indentation', '[Warning]:Memory leaks']\n```\n\n### Example-simple-between-python\n\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn1: int = 2 # First line to read\nn2: int = 4 # Last line to read\n\ntry:\n between: list = file_utils_operations_lib.WithEOL.between(path=path, n1=n1, n2=n2)\n print(between)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Error]:Unable to recover data', '[Info]:Segfault', '[Warning]:Indentation']\n```\n\n### Example-simple-parse-python\n\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\n\ntry:\n parse: list = file_utils_operations_lib.WithEOL.parse(path=path)\n 
print(parse)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Warning]:Entity not found', '[Error]:Unable to recover data', '[Info]:Segfault', '[Warning]:Indentation', '[Error]:Memory leaks', '[Info]:Entity not found', '[Warning]:Unable to recover data', ' ', '[Error]:Segfault', '[Info]:Indentation', '[Warning]:Memory leaks']\n```\n\n### Example-simple-count_lines-python\n\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\n\ntry:\n count: list = file_utils_operations_lib.WithEOL.count_lines(path=path)\n print(count)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n11\n```\n\n### Example-remove_empty_string-python\n\nWith **remove_empty_string** enable: \\\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 4 # First line to read\n\ntry:\n tail: list = file_utils_operations_lib.WithEOL.tail(path=path, n=n, remove_empty_string=True)\n print(tail)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Warning]:Unable to recover data', '[Error]:Segfault', '[Info]:Indentation', '[Warning]:Memory leaks']\n```\n\nWith **remove_empty_string** disable (default option): \\\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 4 # First line to read\n\ntry:\n tail: list = file_utils_operations_lib.WithEOL.tail(path=path, n=n, remove_empty_string=False)\n print(tail)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n[' ', '[Error]:Segfault', '[Info]:Indentation', '[Warning]:Memory leaks']\n```\n\n### Example-regex_keep-python\n\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 4 # First line to read\n\ntry:\n head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n, remove_empty_string=False, regex_keep=[\"\\[Warning\\]:*\", \"\\[Error\\]:*\"])\n print(head)\nexcept:\n print(\"Unable to open/read the 
file\")\n```\nStdout:\n```sh\n['[Warning]:Entity not found', '[Error]:Unable to recover data', '[Warning]:Indentation']\n```\n\nWhy there is just 3 elements instead of 4? You should look at the **restrict** option\n\n### Example-regex_pass-python\n\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 4 # First line to read\n\ntry:\n head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n, remove_empty_string=False, regex_pass=[\"\\[Warning\\]:*\", \"\\[Error\\]:*\"])\n print(head)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Info]:Segfault']\n```\n\nWhy there is just 3 elements instead of 4? You should look at the **restrict** option\n\n### Example-restrict-python\n\nWith **restrict** disable: \\\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 4 # First line to read\n\ntry:\n head: list = file_utils_operations_lib.WithEOL.head(path=path, n=4, remove_empty_string=False, regex_keep=[\"\\[Warning\\]:*\", \"\\[Error\\]:*\"], restrict=False)\n print(head)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Warning]:Entity not found', '[Error]:Unable to recover data', '[Warning]:Indentation', '[Error]:Memory leaks']\n```\n\nWith **restrict** enbale(default): \\\nCode:\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 4 # First line to read\n\ntry:\n head: list = file_utils_operations_lib.WithEOL.head(path=path, n=4, remove_empty_string=False, regex_keep=[\"\\[Warning\\]:*\", \"\\[Error\\]:*\"], restrict=True)\n print(head)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Warning]:Entity not found', '[Error]:Unable to recover data', '[Warning]:Indentation']\n```\n\n## WithCustomDelims-python\n\n### How-to-use-it-python\n\nIt it like **WithEOL** but with a list of custom delimiter. 
For example:\n\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 2 # Number of lines to read\n\ntry:\n head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n)\n print(head)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Warning]:Entity not found', '[Error]:Unable to recover data']\n```\n\nhas the same behavious as \n\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 2 # Number of lines to read\n\ntry:\n head: list = file_utils_operations_lib.WithCustomDelims.head(path=path, n=n, delimiter=['\\n])\n print(head)\nexcept:\n print(\"Unable to open/read the file\")\n```\nStdout:\n```sh\n['[Warning]:Entity not found', '[Error]:Unable to recover data']\n```\n\nSo, you use it as same as **WithEOL** but with a list of custom delimiter.\n\n### What-delim-can-be-used\n\nAll string can be used like:\n- \";\"\n- \"abc\"\n- \"\u00e9\u00e0\"\n- ::\n- \"\u5c0f\u516d\u53f7\"\n- \"\u6beb\" \n\n### With-more-than-one-delimiter\n\nIf my file contains:\n```sh\n;\u00c0 ;la ;;\np\u00eache \u00e9\u00e8aux moules, @moules, ::\u5c0f\u516d\u53f7moules::Je n'veux\u5c0f\u516d\u53f7 plus ::y \naller \u00e9\u00e8maman\u5c0f\u516d\u53f7\n```\n\nWe'll have with \";\", \"\\n\", \"\u00e9\u00e8\", \"@\", \"\u5c0f\u516d\u53f7\", \"::\"\n```py\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\n\ntry:\n parse: list = file_utils_operations_lib.WithCustomDelims.parse(path=path, delimiter=[\";\", \"\\n\", \"\u00e9\u00e8\", \"@\", \"::\"])\n print(parse)\nexcept:\n print(\"Unable to open/read the file\")\n```\n\nStdout\n\n```sh\n['', '\u00c0 ', 'la ', '', '', 'p\u00eache ', 'aux moules, ', 'moules, ', '', 'moules', \"Je n'veux\", ' plus ', 'y ', 'aller ', 'maman', '']\n```\n\n## How-to-use-the-rust-crate?\n\nYou must import the library with\n```rs\nuse file_utils_operations_lib::with_custom_delims::WithCustomDelims;\n```\nor\n```rs\nuse 
file_utils_operations_lib::with_eol::WithEOL;\n```\n\nThen, you can use the same functions as python because there are the same behavious. \\\nExample:\n```rs\nuse file_utils_operations_lib::with_custom_delims::WithCustomDelims;\n\nfn main() {\n let mut delimiters: Vec<String> = Vec::new();\n delimiters.push(\"\\n\".to_string());\n let n: usize = 10;\n let res: Vec<String> = WithCustomDelims::head(\n \"my path\".to_string(),\n n,\n delimiters,\n false,\n Vec::new(),\n Vec::new(),\n true,\n 1024,\n );\n}\n```\nhas the same behaviour as\n```rs\nimport file_utils_operations_lib\n\npath: str = \"my_path_to_file\"\nn: int = 2 # Number of lines to read\n\ntry:\n head: list = file_utils_operations_lib.WithEOL.head(path=path, n=n)\n print(head)\nexcept:\n print(\"Unable to open/read the file\")\n```\n\n## Python-class\n\nIf we translate the rust into python, we'll have:\n```py\nclass WithEOL:\n # head: Read the n first lines\n # if n > (numbers of lines in the file) => return the whole file\n def head(path: str, n: int, \\\n remove_empty_string: bool = False, \\\n regex_keep: list = [] \\\n regex_pass: list = [] \\\n restrict: bool = True):\n ...\n\n # between: Read the lines [n1, n2]\n # if n1 > n2 => return an empty list\n # if n1 > (numbers of lines in the file) => return an empty list\n def between(path: str, n1: int, n2: int \\\n remove_empty_string: bool = False, \\\n regex_keep: list = [] \\\n regex_pass: list = [] \\\n restrict: bool = True):\n ...\n \n # tail: Read the n last lines\n # if n > (numbers of lines in the file) => return the whole file\n def tail(path: str, n: int, \\\n remove_empty_string: bool = False, \\\n regex_keep: list = [] \\\n regex_pass: list = [] \\\n restrict: bool = True):\n ...\n \n # parse: Read the whole file\n def parse(path: str, \\ \n remove_empty_string: bool = False \\\n regex_keep: list = [] \\\n regex_pass: list = []):\n ...\n\n # Count the number of lines\n def count_lines(path: str \\\n remove_empty_string: bool = False, \\\n 
regex_keep: list = [] \\\n regex_pass: list = []):\n ...\n\nclass WithCustomDelims:\n # head: Read the n first lines\n # if n > (numbers of lines in the file) => return the whole file\n def head(path: str, n: int, delimiter: list \\\n remove_empty_string: bool = False, \\\n regex_keep: list = [] \\\n regex_pass: list = [] \\\n restrict: bool = True \\\n buffer_size: int = 1024):\n ...\n\n # between: Read the lines [n1, n2]\n # if n1 > n2 => return an empty list\n # if n1 > (numbers of lines in the file) => return an empty list\n def between(path: str, n1: int, n2: int, delimiter: list \\\n remove_empty_string: bool = False, \\\n regex_keep: list = [] \\\n regex_pass: list = [] \\\n restrict: bool = True \\\n buffer_size: int = 1024):\n ...\n \n # tail: Read the n last lines\n # if n > (numbers of lines in the file) => return the whole file\n def tail(path: str, n: int, delimiter: list \\\n remove_empty_string: bool = False, \\\n regex_keep: list = [] \\\n regex_pass: list = [] \\\n restrict: bool = True \\\n buffer_size: int = 1024):\n ...\n \n # parse: Read the whole file\n def parse(path: str, delimiter: list \\\n remove_empty_string: bool = False \\\n regex_keep: list = [] \\\n regex_pass: list = [] \\\n buffer_size: int = 1024):\n ...\n\n # Count the number of lines\n def count_lines(path: str, delimiter: list \\\n remove_empty_string: bool = False, \\\n regex_keep: list = [] \\\n regex_pass: list = [] \\\n buffer_size: int = 1024):\n ...\n```\n\n## Rust-Structure\n\nTake a look at [https://docs.rs/file_utils_operations/latest/file_utils_operations_lib/](https://docs.rs/file_utils_operations/latest/file_utils_operations_lib/)\n\n## Structure\n\n- **src/**: all sources files\n- **tests/**: all tests for rust\n- **tests_files/**: all files used for tests\n- **tests_python/**: a python script to test\n",
"bugtrack_url": null,
"license": null,
"summary": "This is a python library to parse files, it's giving tools to easily read a file with efficiency. It's based on linux commands like grep, sed, cat, head, tail and tested with them.",
"version": "0.1.3",
"project_urls": {
"Source Code": "https://github.com/FlaveFlav20/file-utils-operations"
},
"split_keywords": [
"head",
" tail",
" parse",
" count_lines",
" utf"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "ff7e7cb45f9c95ee17549ced1f7f8b79799f8e82c5bbbb85a5823d16df164565",
"md5": "0a746b8f0a516d92ae859b248ad86056",
"sha256": "583c8ec484918c3a9f640695068503ff3b28017311e7b8b1be0d9f5b9f444c63"
},
"downloads": -1,
"filename": "file_utils_operations-0.1.3-cp310-cp310-manylinux_2_34_x86_64.whl",
"has_sig": false,
"md5_digest": "0a746b8f0a516d92ae859b248ad86056",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.8",
"size": 997876,
"upload_time": "2024-11-30T18:38:18",
"upload_time_iso_8601": "2024-11-30T18:38:18.783482Z",
"url": "https://files.pythonhosted.org/packages/ff/7e/7cb45f9c95ee17549ced1f7f8b79799f8e82c5bbbb85a5823d16df164565/file_utils_operations-0.1.3-cp310-cp310-manylinux_2_34_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "ae135247d8a8a4f97d7efe7f254468aa4a9c628dc7c3bcf85e6ccd24c6af5f50",
"md5": "771df4ba52cd10cc0a00cae4d0fec435",
"sha256": "2e38301fa30de55b5fa532154f66753da2612b316ef8776a168fe3132f99ce34"
},
"downloads": -1,
"filename": "file_utils_operations-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "771df4ba52cd10cc0a00cae4d0fec435",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 25167,
"upload_time": "2024-11-30T18:38:36",
"upload_time_iso_8601": "2024-11-30T18:38:36.770779Z",
"url": "https://files.pythonhosted.org/packages/ae/13/5247d8a8a4f97d7efe7f254468aa4a9c628dc7c3bcf85e6ccd24c6af5f50/file_utils_operations-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-30 18:38:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "FlaveFlav20",
"github_project": "file-utils-operations",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "file-utils-operations"
}