Name | excel-list-transform JSON |
Version |
0.7.0
JSON |
| download |
home_page | None |
Summary | Transform a list in excel or CSV |
upload_time | 2024-11-03 13:08:34 |
maintainer | None |
docs_url | None |
author | Tom Björkholm |
requires_python | >=3.12.6 |
license | None |
keywords |
|
VCS |
|
bugtrack_url |
|
requirements |
No requirements were recorded.
|
Travis-CI |
No Travis.
|
coveralls test coverage |
No coveralls.
|
# excel-list-transform
## Background
This python application was born out of an experience at sail racing events. At the start of the events we received an excel list with participants (from a registration web) to enter into the scoring software and into tool for online notice board. The information was present in the excel file, but the columns were all wrong. To avoid the stressful manual rework of the information in the excel list this application was created.
## What it does
This small python application:
* reads data (that is a list with columns) from an excel file or from a comma separate values (CSV) file.
* splits columns in the list (like creating "first name" and "last name" columns from "name" column)
* merges columns in the list (like creating "name" column from "first name" and "last name" columns)
* removes columns in the list
* reorders columns in the list
* renames columns in the list
* inserts columns in the list
* rewrites columns in the list (like transforming telephone numbers from local/national format to international format)
* writes the resulting data (that is a list with columns) to an excel file or to a comma separate values (CSV) file.
How this is done is governed by a configuration file. The application can create a number of example configuration files with accompanying description text files.
## Installing it
If you want to use it, install it using pip. A precondition is that you have Python 3.12.6 or newer installed on you computer. (Python version 3.10.5 or newer for version 0.5.)
Python can be downloaded from [https://www.python.org/downloads/](https://www.python.org/downloads/).
### Installing on mac and Linux
````sh
pip3 install excel-list-transform
````
### Installing on Microsoft Windows
````sh
pip install excel-list-transform
````
## Running the application
### Running the application on mac and Linux
````sh
python3 -m excel_list_transform --help
python3 -m excel_list_transform example --help
python3 -m excel_list_transform transform --help
python3 -m excel_list_transform example -k forms_to_rrs -r by_name -o example.cfg
python3 -m excel_list_transform transform -c example.cfg -i input.xlsx -o output.xlsx
````
### Running the application on Microsoft Windows
````sh
python -m excel_list_transform --help
python -m excel_list_transform example --help
python -m excel_list_transform transform --help
python -m excel_list_transform example -k forms_to_rrs -r by_name -o example.cfg
python -m excel_list_transform transform -c example.cfg -i input.xlsx -o output.xlsx
````
## Suggested way to get started
1. Use the "example" sub-command to generate a few example configuration (.cfg) files with description (.txt) files.
2. Read the example configuration (.cfg) files and the accompanying description (.txt) files.
3. Find an example that is close to what you want to achieve.
4. Modify that configuration file to achieve what you want to achieve.
5. Use the "transform" sub-command to read your data and output it transformed or reorganized according to your modified configuration file.
6. Read the produced output. If necessary go back to step 4 and adjust how the data is transformed.
### Example configuration files
When using the "example" sub-command to generate an example configuration file (say example.cfg) a text file describing the configuration and the syntax of the configuration file is also generated. If the example configuration file is named example.cfg, then the text file descriging the configuration is named example.txt.
You can generate several example configuration files each with an accompanying text file descriping it.
Read the text file describing the configuration file while looking at the configuration file to understand the syntax and the possible options.
## Performance
This application is written for the moderate amounts of data when registering participants for the majority of sports events. If you have millions of rows this application is not for you.
With an input file consisting of 20 columns producing an output file of 11 columns I have measured the following performance on a MacBook Air (laptop) from 2020:
* 1 000 rows processed in less than 0.5 seconds
* 10 000 rows processed in less than 4 seconds
* 80 000 rows processed in less than 30 seconds
* 120 000 rows processed in 40 seconds
Naturally your performance will be different based on computer hardware, operating system and Python version. Generally it should be reasonably fast for less than 10 000 rows, painfully slow but somewhat usable up to 100 000 rows and probably so slow that it is unusable for more than 100 000 rows.
## Description of how to write/change the configuration file
The configuration file is in JSON syntax. [https://en.wikipedia.org/wiki/JSON](https://en.wikipedia.org/wiki/JSON)
The keywords and the nesting is important. The order of keywords
have no significance (the examples use alphabetical order).
Indentation and line breaks have no significance.
The encoding for the configuration file must be UTF-8. (US-ACII is a subset of UTF-8.)
It is recommended that you let the command generate a configuration
file and then edit that file to match your needs. It is NOT recommended
that the user writes the configuration file from scratch.
## column_ref
The keyword **column_ref** is used to tell if the configuration
references columns by name (value *"BY_NAME"*) or by number
(value *"BY_NUMBER"*). The syntax of parts of the configuration file
is slightly different depending on the how columns are referenced.
Thus, you first have to decide how you want to reference columns.
Generally it is easier to write a correct configuration using *"BY_NAME"*,
so *"BY_NAME"* is recommended when possible. *"BY_NAME"* imposes a few restrictions:
* each column need to have a unique text at the first line to use as column name.
* the column names need to be stable and known.
If you cannot meet these restrictions you will have to reference the
columns by number *"BY_NUMBER"*. When referencing the columns by number
it is important to note that the column numbers change when splitting,
merging, inserting or removing columns.
When referencing *"BY_NUMBER"* the columns are numbered from left to
right. The leftmost column is number 0.
## Type of input and output file
The type of input file to read is determined by **"in_type"**.
The type of output file to write is determined by **"out_type"**.
**"in_type"** and **"out_type"** can have values *"CSV"* or *"EXCEL"*.
Excel files can be read and written using three libraries.
**"in_excel_library"** and **"out_excel_library"** can have values
*"OPENPYXL"*, *"XLSXWRITER"* or *"PYLIGHTXL"*. These are different
third party libraries that can read/write excel. My experience
is that *"PYLIGHTXL"* most often is able to read and write excel
files correctly. If you have trouble reading/writing your
particular excel file, please try different combinations of
these libraries. **"in_excel_library"** is always needed in
the configuration file but is not used if the input is CSV.
**"out_excel_library"** is always needed in the configuration file
but is not used if the output is CSV.
Currently *"PYLIGHTXL"* appears to be best to write files for
Microsoft Excel to open without complaints. But *"OPENPYXL"*
appears to be best to write files to be imported by
[https://www.racingrulesofsailing.org](https://www.racingrulesofsailing.org).
When newer versions arrive this might change...
Comma separated values files (CSV files) may differ slightly
depening on the programs used to read/write them and the locale
used. **"in_csv_dialect"** and **"out_csv_dialect"** changes how CSV files
are read and written. They are always needed in the configuration
file, but are not used if the input and output are excel files.
Comma separated values files might have different encoding for
the text in the file. [https://en.wikipedia.org/wiki/Character_encoding](https://en.wikipedia.org/wiki/Character_encoding)
This is specified with **"in_csv_encoding"** and **"out_csv_encoding"**.
Unless you know that you need another encoding leave these as
in the generated example configuration. (In version 0.5 these
configuration parameters are missing. To be compatible with 0.5
configuration files "in_csv_encoding" defaults to "utf_8_sig"
and "out_csv_encoding" defaults to "utf-8" if missing in the
configuration file.)
## Extra spaces in excel input files
When viewing an excel file in excel it is very hard to notice if
some string value in a cell has trailing white space. These trailing
trailing spaces can make the further processing of a file difficult
as the strings in the file are not what you thought they are.
(This especially is a problem if you have trailing spaces in a
cell on the first line, and refer to columns by their names.)
The configuration **"in_excel_col_name_strip"** can be set to true,
to strip off leading and trailing whitespace from all columns
values read from the first line of the excel input file.
The configuration **"in_excel_values_strip"** can be set to true,
to strip off leading and trailing whitespace from all columns
values read from from the other lines (not the first line) of
the excel input file. (In version 0.6.2 and earlier these are
missing. To be compatible with version 0.6.2 and earlier
**"in_excel_col_name_strip"** and **"in_excel_values_strip"**
defaults to false if missing in the configuration file.)
## Column manipulation
A number or records starting with "s" and a number describe
column manipulation to be done. These manipulations are done
in order of the number: s1_(something) before s2_(something),
before s3_(something). As the "s" number records might add, remove
or rename columns, it is important to keep track of the order that
they are applied.
Some "s" numbers are only used if columns are referenced
by name while other "s" numbers are only used if columns are
referenced by number.
### "s1_split_columns"
The first operation that is done is splitting of columns.
The key **"s1_split_columns"** have an array of splits to be
done. (When the list of splits has more than one split,
the least confusion is to split columns to the right before
columns to the left. Named references also helps to avoid
confusion.)
Each split has the following keys: **"column"**, **"separator"** and
**"where"**. In the case of number column references the there is
also the key **"store_single"**. In the case of named column references
there is instead the key **"right_name"**.
**"column"** take as argument the column to be split. This is a
column number in the case of *"BY_NUMBER"*, and a column name/title
in the sace of *"BY_NAME"*.
**"separator"** is a string of characters that indicate the position
where the column shall be split. (For instance a single space
if splitting between two words.)
**"where"** can either have the value *"RIGHTMOST"* or the value
*"LEFTMOST"*. This is used if the separator string is present
more than once in the column value to split. The split is then
done at the leftmost or rightmost occurence according to the
value of **"where"**.
**"right_name"** is only used if column references are *"BY_NAME"*.
After the split the left column uses the original column name,
and **"right_name"** is used as the name of the new right column.
**"store_single"** is used only if column references are *"BY_NUMBER"*.
If the column value before split does not include the separator,
the result of the split is a single value.
**"store_single"** can have the value *"RIGHTMOST"* or the value
*"LEFTMOST"*. This determines which column the single value shall
be stored in. (The other column will be empty.)
In the case of column references *"BY_NAME"* the single value is
always stored in the column with the original name.
### "s2_remove_columns"
**"s2_remove_columns"** is only used with column references *"BY_NUMBER"*.
The value of **"s2_remove_columns"** is a list of column numbers to
remove. (For columns references *"BY_NAME"* see **"s8_column_order"**.)
### "s3_merge_columns"
The key **"s3_merge_columns"** have an array of merges to be done.
Each merge have the keys **"columns"** and **"separator"**.
**"columns"** have a list of column references. If *"BY_NAME"* the
column references are column names/titles. If *"BY_NUMBER"* the
column references are column numbers.
**"separator"** is a string of characters that is inserted between
the column values being merged.
### "s4_place_columns_first"
**"s4_place_columns_first"** is only used with column references *"BY_NUMBER"*.
The key **"s4_place_columns_first"** has a value that is a list of the
column numbers to be placed first in order. This step re-orders the
columns.
(For columns references *"BY_NAME"* see **"s8_column_order"**.)
### "s5_rename_columns"
The key **"s5_rename_columns"** has a value that is a list of column rename
operations. Each column rename operation has the keys **"column"** and **"name"**.
**"column"** is the number/name of the column before renameing. This is a
column number in the case of *"BY_NUMBER"*, and a column name/title
in the sace of *"BY_NAME"*.
**"name"** is the new name/title of the column identified by **"column"**.
### "s6_insert_columns"
The key **"s6_insert_columns"** has a value that is a list of columns to
insert. Each column to insert is described by the keys: **"column"**,
**"value"** and for *"BY_NUMBER"* only also **"name"**.
**"column"** is the column reference of the inserted column.
In the case of *"BY_NUMBER"* this is also the position where
the column is inserted. In the case of *"BY_NAME"* this is the
name of the inserted column.
**"value"** is the value that this column shall have for every row.
The special word *null* can be used to state that column shall be
empty (with no value).
**"name"** is the name/title of the column in the case of *"BY_NUMBER"*.
### "s7_rewrite_columns"
The key **"s7_rewrite_columns"** has a value that is a list of
rewrite operations that will be applied in order. Each rewrite
operation is described by several keys.
**"kind"** key is used to specify which kind of rewrite operation
this is. Different values for **"kind"** will have different other
keys. **"kind"** can have the values: *"STRIP"*, *"REMOVECHARS"*,
*"STR_SUBSTITUTE"* and *"REGEX_SUBSTITUTE"*.
**"kind"** value *"STRIP"* shall have the keys: **"kind"**, **"case"**, **"chars"**
and **"column"**. This case means that the characters in **"chars"** shall
be stripped off the beginning and end of the column value. If key
**"chars"** is empty string, then white space is stripped off.
**"kind"** value *"REMOVECHARS"* shall have the keys: **"kind"**, **"case"**,
**"chars"** and **"column"**. This case means that the characters in **"chars"**
shall be removed from the column (also when the characters are in the
middle of the column value).
**"kind"** value *"REGEX_SUBSTITUTE"* shall have the keys: **"kind"**, **"case"**,
**"from"**, **"to"** and **"column"**. This case means that **"from"** value is used
as a regular expression, and if it matches the matching part is replaced
with the value of key **"to"**.
**"kind"** value *"STR_SUBSTITUTE"* shall have the keys: **"kind"**, **"case"**,
**"from"**, **"to"** and **"column"**. This case means that **"from"** value is a string,
and if a sub-string of the column value equals this string that part is
replaced with the value of key **"to"**.
**"column"** identifies the column to rewrite. This is a column number in the
case of *"BY_NUMBER"*, and a column name/title in the sace of *"BY_NAME"*.
**"chars"** is a list of characters written as a string. These characters
are removed or stripped off the beginning/end depending on the **"kind"**
value.
**"case"** specifies if comparison of column value to value of **"chars"** or
value of **"from"** shall be case sensitive or not. Possible values are
*"MATCH_CASE"* and *"IGNORE_CASE"*.
**"from"** specifies what the part of the column value shall be to substituted
when matching. The from part in the substitute from something to something.
**"to"** specifies the string that substitution will replace **"from"** with.
### "s8_column_order"
The key **"s8_column_order"** is used only in the *"BY_NAME"* case.
The value of the **"s8_column_order"** key is a list of column names.
The columns will be output in this order.
Columns not mentioned in **"s8_column_order"** will not be output,
and will thus be removed.
(For *"BY_NUMBER"* see **"s2_remove_columns"** and **"s4_place_columns_first"**.)
## Source code
Source code and tests are available at [https://bitbucket.org/tom-bjorkholm/excel-list-transform](https://bitbucket.org/tom-bjorkholm/excel-list-transform).
Raw data
{
"_id": null,
"home_page": null,
"name": "excel-list-transform",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.12.6",
"maintainer_email": null,
"keywords": null,
"author": "Tom Bj\u00f6rkholm",
"author_email": "Tom Bj\u00f6rkholm <klausuler_linnet0q@icloud.com>",
"download_url": "https://files.pythonhosted.org/packages/68/04/322da3f31c230b5679114b9728f477fc784f0b904fc0efbbbc6a2faa1faf/excel_list_transform-0.7.0.tar.gz",
"platform": null,
"description": "# excel-list-transform\n\n## Background\n\nThis python application was born out of an experience at sail racing events. At the start of the events we received an excel list with participants (from a registration web) to enter into the scoring software and into tool for online notice board. The information was present in the excel file, but the columns were all wrong. To avoid the stressful manual rework of the information in the excel list this application was created.\n\n## What it does\n\nThis small python application:\n\n* reads data (that is a list with columns) from an excel file or from a comma separate values (CSV) file.\n* splits columns in the list (like creating \"first name\" and \"last name\" columns from \"name\" column)\n* merges columns in the list (like creating \"name\" column from \"first name\" and \"last name\" columns)\n* removes columns in the list\n* reorders columns in the list\n* renames columns in the list\n* inserts columns in the list\n* rewrites columns in the list (like transforming telephone numbers from local/national format to international format)\n* writes the resulting data (that is a list with columns) to an excel file or to a comma separate values (CSV) file.\n\nHow this is done is governed by a configuration file. The application can create a number of example configuration files with accompanying description text files.\n\n## Installing it\n\nIf you want to use it, install it using pip. A precondition is that you have Python 3.12.6 or newer installed on you computer. (Python version 3.10.5 or newer for version 0.5.)\nPython can be downloaded from [https://www.python.org/downloads/](https://www.python.org/downloads/).\n\n### Installing on mac and Linux\n\n````sh\npip3 install excel-list-transform\n````\n\n### Installing on Microsoft Windows\n\n````sh\npip install excel-list-transform\n````\n\n## Running the application\n\n### Running the application on mac and Linux\n\n````sh\npython3 -m excel_list_transform --help\npython3 -m excel_list_transform example --help\npython3 -m excel_list_transform transform --help\npython3 -m excel_list_transform example -k forms_to_rrs -r by_name -o example.cfg\npython3 -m excel_list_transform transform -c example.cfg -i input.xlsx -o output.xlsx\n````\n\n### Running the application on Microsoft Windows\n\n````sh\npython -m excel_list_transform --help\npython -m excel_list_transform example --help\npython -m excel_list_transform transform --help\npython -m excel_list_transform example -k forms_to_rrs -r by_name -o example.cfg\npython -m excel_list_transform transform -c example.cfg -i input.xlsx -o output.xlsx\n````\n\n## Suggested way to get started\n\n 1. Use the \"example\" sub-command to generate a few example configuration (.cfg) files with description (.txt) files.\n 2. Read the example configuration (.cfg) files and the accompanying description (.txt) files.\n 3. Find an example that is close to what you want to achieve.\n 4. Modify that configuration file to achieve what you want to achieve.\n 5. Use the \"transform\" sub-command to read your data and output it transformed or reorganized according to your modified configuration file.\n 6. Read the produced output. If necessary go back to step 4 and adjust how the data is transformed.\n\n### Example configuration files\n\nWhen using the \"example\" sub-command to generate an example configuration file (say example.cfg) a text file describing the configuration and the syntax of the configuration file is also generated. If the example configuration file is named example.cfg, then the text file descriging the configuration is named example.txt.\n\nYou can generate several example configuration files each with an accompanying text file descriping it.\n\nRead the text file describing the configuration file while looking at the configuration file to understand the syntax and the possible options.\n\n## Performance\n\nThis application is written for the moderate amounts of data when registering participants for the majority of sports events. If you have millions of rows this application is not for you.\n\nWith an input file consisting of 20 columns producing an output file of 11 columns I have measured the following performance on a MacBook Air (laptop) from 2020:\n\n* 1 000 rows processed in less than 0.5 seconds\n* 10 000 rows processed in less than 4 seconds\n* 80 000 rows processed in less than 30 seconds\n* 120 000 rows processed in 40 seconds\n\nNaturally your performance will be different based on computer hardware, operating system and Python version. Generally it should be reasonably fast for less than 10 000 rows, painfully slow but somewhat usable up to 100 000 rows and probably so slow that it is unusable for more than 100 000 rows.\n\n## Description of how to write/change the configuration file\n\nThe configuration file is in JSON syntax. [https://en.wikipedia.org/wiki/JSON](https://en.wikipedia.org/wiki/JSON)\n\nThe keywords and the nesting is important. The order of keywords\nhave no significance (the examples use alphabetical order).\nIndentation and line breaks have no significance.\n\nThe encoding for the configuration file must be UTF-8. (US-ACII is a subset of UTF-8.)\n\nIt is recommended that you let the command generate a configuration\nfile and then edit that file to match your needs. It is NOT recommended\nthat the user writes the configuration file from scratch.\n\n## column_ref\n\nThe keyword **column_ref** is used to tell if the configuration\nreferences columns by name (value *\"BY_NAME\"*) or by number\n(value *\"BY_NUMBER\"*). The syntax of parts of the configuration file\nis slightly different depending on the how columns are referenced.\nThus, you first have to decide how you want to reference columns.\n\nGenerally it is easier to write a correct configuration using *\"BY_NAME\"*,\nso *\"BY_NAME\"* is recommended when possible. *\"BY_NAME\"* imposes a few restrictions:\n\n* each column need to have a unique text at the first line to use as column name.\n* the column names need to be stable and known.\n\nIf you cannot meet these restrictions you will have to reference the\ncolumns by number *\"BY_NUMBER\"*. When referencing the columns by number\nit is important to note that the column numbers change when splitting,\nmerging, inserting or removing columns.\n\nWhen referencing *\"BY_NUMBER\"* the columns are numbered from left to\nright. The leftmost column is number 0.\n\n## Type of input and output file\n\nThe type of input file to read is determined by **\"in_type\"**.\nThe type of output file to write is determined by **\"out_type\"**.\n**\"in_type\"** and **\"out_type\"** can have values *\"CSV\"* or *\"EXCEL\"*.\n\nExcel files can be read and written using three libraries.\n**\"in_excel_library\"** and **\"out_excel_library\"** can have values\n*\"OPENPYXL\"*, *\"XLSXWRITER\"* or *\"PYLIGHTXL\"*. These are different\nthird party libraries that can read/write excel. My experience\nis that *\"PYLIGHTXL\"* most often is able to read and write excel\nfiles correctly. If you have trouble reading/writing your\nparticular excel file, please try different combinations of\nthese libraries. **\"in_excel_library\"** is always needed in\nthe configuration file but is not used if the input is CSV.\n**\"out_excel_library\"** is always needed in the configuration file\nbut is not used if the output is CSV.\n\nCurrently *\"PYLIGHTXL\"* appears to be best to write files for\nMicrosoft Excel to open without complaints. But *\"OPENPYXL\"*\nappears to be best to write files to be imported by\n[https://www.racingrulesofsailing.org](https://www.racingrulesofsailing.org).\nWhen newer versions arrive this might change...\n\nComma separated values files (CSV files) may differ slightly\ndepening on the programs used to read/write them and the locale\nused. **\"in_csv_dialect\"** and **\"out_csv_dialect\"** changes how CSV files\nare read and written. They are always needed in the configuration\nfile, but are not used if the input and output are excel files.\n\nComma separated values files might have different encoding for\nthe text in the file. [https://en.wikipedia.org/wiki/Character_encoding](https://en.wikipedia.org/wiki/Character_encoding)\nThis is specified with **\"in_csv_encoding\"** and **\"out_csv_encoding\"**.\nUnless you know that you need another encoding leave these as\nin the generated example configuration. (In version 0.5 these\nconfiguration parameters are missing. To be compatible with 0.5\nconfiguration files \"in_csv_encoding\" defaults to \"utf_8_sig\"\nand \"out_csv_encoding\" defaults to \"utf-8\" if missing in the\nconfiguration file.)\n\n## Extra spaces in excel input files\n\nWhen viewing an excel file in excel it is very hard to notice if\nsome string value in a cell has trailing white space. These trailing\ntrailing spaces can make the further processing of a file difficult\nas the strings in the file are not what you thought they are.\n(This especially is a problem if you have trailing spaces in a\ncell on the first line, and refer to columns by their names.)\n\nThe configuration **\"in_excel_col_name_strip\"** can be set to true,\nto strip off leading and trailing whitespace from all columns\nvalues read from the first line of the excel input file.\nThe configuration **\"in_excel_values_strip\"** can be set to true,\nto strip off leading and trailing whitespace from all columns\nvalues read from from the other lines (not the first line) of\nthe excel input file. (In version 0.6.2 and earlier these are\nmissing. To be compatible with version 0.6.2 and earlier\n**\"in_excel_col_name_strip\"** and **\"in_excel_values_strip\"**\ndefaults to false if missing in the configuration file.)\n\n## Column manipulation\n\nA number or records starting with \"s\" and a number describe\ncolumn manipulation to be done. These manipulations are done\nin order of the number: s1_(something) before s2_(something),\nbefore s3_(something). As the \"s\" number records might add, remove\nor rename columns, it is important to keep track of the order that\nthey are applied.\n\nSome \"s\" numbers are only used if columns are referenced\nby name while other \"s\" numbers are only used if columns are\nreferenced by number.\n\n### \"s1_split_columns\"\n\nThe first operation that is done is splitting of columns.\nThe key **\"s1_split_columns\"** have an array of splits to be\ndone. (When the list of splits has more than one split,\nthe least confusion is to split columns to the right before\ncolumns to the left. Named references also helps to avoid\nconfusion.)\n\nEach split has the following keys: **\"column\"**, **\"separator\"** and\n**\"where\"**. In the case of number column references the there is\nalso the key **\"store_single\"**. In the case of named column references\nthere is instead the key **\"right_name\"**.\n\n**\"column\"** take as argument the column to be split. This is a\ncolumn number in the case of *\"BY_NUMBER\"*, and a column name/title\nin the sace of *\"BY_NAME\"*.\n\n**\"separator\"** is a string of characters that indicate the position\nwhere the column shall be split. (For instance a single space\nif splitting between two words.)\n\n**\"where\"** can either have the value *\"RIGHTMOST\"* or the value\n*\"LEFTMOST\"*. This is used if the separator string is present\nmore than once in the column value to split. The split is then\ndone at the leftmost or rightmost occurence according to the\nvalue of **\"where\"**.\n\n**\"right_name\"** is only used if column references are *\"BY_NAME\"*.\nAfter the split the left column uses the original column name,\nand **\"right_name\"** is used as the name of the new right column.\n\n**\"store_single\"** is used only if column references are *\"BY_NUMBER\"*.\nIf the column value before split does not include the separator,\nthe result of the split is a single value.\n**\"store_single\"** can have the value *\"RIGHTMOST\"* or the value\n*\"LEFTMOST\"*. This determines which column the single value shall\nbe stored in. (The other column will be empty.)\nIn the case of column references *\"BY_NAME\"* the single value is\nalways stored in the column with the original name.\n\n### \"s2_remove_columns\"\n\n**\"s2_remove_columns\"** is only used with column references *\"BY_NUMBER\"*.\nThe value of **\"s2_remove_columns\"** is a list of column numbers to\nremove. (For columns references *\"BY_NAME\"* see **\"s8_column_order\"**.)\n\n### \"s3_merge_columns\"\n\nThe key **\"s3_merge_columns\"** have an array of merges to be done.\nEach merge have the keys **\"columns\"** and **\"separator\"**.\n\n**\"columns\"** have a list of column references. If *\"BY_NAME\"* the\ncolumn references are column names/titles. If *\"BY_NUMBER\"* the\ncolumn references are column numbers.\n\n**\"separator\"** is a string of characters that is inserted between\nthe column values being merged.\n\n### \"s4_place_columns_first\"\n\n**\"s4_place_columns_first\"** is only used with column references *\"BY_NUMBER\"*.\nThe key **\"s4_place_columns_first\"** has a value that is a list of the\ncolumn numbers to be placed first in order. This step re-orders the\ncolumns.\n(For columns references *\"BY_NAME\"* see **\"s8_column_order\"**.)\n\n### \"s5_rename_columns\"\n\nThe key **\"s5_rename_columns\"** has a value that is a list of column rename\noperations. Each column rename operation has the keys **\"column\"** and **\"name\"**.\n\n**\"column\"** is the number/name of the column before renameing. This is a\ncolumn number in the case of *\"BY_NUMBER\"*, and a column name/title\nin the sace of *\"BY_NAME\"*.\n\n**\"name\"** is the new name/title of the column identified by **\"column\"**.\n\n### \"s6_insert_columns\"\n\nThe key **\"s6_insert_columns\"** has a value that is a list of columns to\ninsert. Each column to insert is described by the keys: **\"column\"**,\n**\"value\"** and for *\"BY_NUMBER\"* only also **\"name\"**.\n\n**\"column\"** is the column reference of the inserted column.\nIn the case of *\"BY_NUMBER\"* this is also the position where\nthe column is inserted. In the case of *\"BY_NAME\"* this is the\nname of the inserted column.\n\n**\"value\"** is the value that this column shall have for every row.\nThe special word *null* can be used to state that column shall be\nempty (with no value).\n\n**\"name\"** is the name/title of the column in the case of *\"BY_NUMBER\"*.\n\n### \"s7_rewrite_columns\"\n\nThe key **\"s7_rewrite_columns\"** has a value that is a list of\nrewrite operations that will be applied in order. Each rewrite\noperation is described by several keys.\n\n**\"kind\"** key is used to specify which kind of rewrite operation\nthis is. Different values for **\"kind\"** will have different other\nkeys. **\"kind\"** can have the values: *\"STRIP\"*, *\"REMOVECHARS\"*,\n*\"STR_SUBSTITUTE\"* and *\"REGEX_SUBSTITUTE\"*.\n\n**\"kind\"** value *\"STRIP\"* shall have the keys: **\"kind\"**, **\"case\"**, **\"chars\"**\nand **\"column\"**. This case means that the characters in **\"chars\"** shall\nbe stripped off the beginning and end of the column value. If key\n**\"chars\"** is empty string, then white space is stripped off.\n\n**\"kind\"** value *\"REMOVECHARS\"* shall have the keys: **\"kind\"**, **\"case\"**,\n**\"chars\"** and **\"column\"**. This case means that the characters in **\"chars\"**\nshall be removed from the column (also when the characters are in the\nmiddle of the column value).\n\n**\"kind\"** value *\"REGEX_SUBSTITUTE\"* shall have the keys: **\"kind\"**, **\"case\"**,\n**\"from\"**, **\"to\"** and **\"column\"**. This case means that **\"from\"** value is used\nas a regular expression, and if it matches the matching part is replaced\nwith the value of key **\"to\"**.\n\n**\"kind\"** value *\"STR_SUBSTITUTE\"* shall have the keys: **\"kind\"**, **\"case\"**,\n**\"from\"**, **\"to\"** and **\"column\"**. This case means that **\"from\"** value is a string,\nand if a sub-string of the column value equals this string that part is\nreplaced with the value of key **\"to\"**.\n\n**\"column\"** identifies the column to rewrite. This is a column number in the\ncase of *\"BY_NUMBER\"*, and a column name/title in the sace of *\"BY_NAME\"*.\n\n**\"chars\"** is a list of characters written as a string. These characters\nare removed or stripped off the beginning/end depending on the **\"kind\"**\nvalue.\n\n**\"case\"** specifies if comparison of column value to value of **\"chars\"** or\nvalue of **\"from\"** shall be case sensitive or not. Possible values are\n*\"MATCH_CASE\"* and *\"IGNORE_CASE\"*.\n\n**\"from\"** specifies what the part of the column value shall be to substituted\nwhen matching. The from part in the substitute from something to something.\n\n**\"to\"** specifies the string that substitution will replace **\"from\"** with.\n\n### \"s8_column_order\"\n\nThe key **\"s8_column_order\"** is used only in the *\"BY_NAME\"* case.\nThe value of the **\"s8_column_order\"** key is a list of column names.\nThe columns will be output in this order.\nColumns not mentioned in **\"s8_column_order\"** will not be output,\nand will thus be removed.\n(For *\"BY_NUMBER\"* see **\"s2_remove_columns\"** and **\"s4_place_columns_first\"**.)\n\n## Source code\n\nSource code and tests are available at [https://bitbucket.org/tom-bjorkholm/excel-list-transform](https://bitbucket.org/tom-bjorkholm/excel-list-transform).\n",
"bugtrack_url": null,
"license": null,
"summary": "Transform a list in excel or CSV",
"version": "0.7.0",
"project_urls": {
"Source code": "https://bitbucket.org/tom-bjorkholm/excel-list-transform"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4e682917397fe8778ddcd9c47c945f42a6372874e0db94b3e7dfb1500e68fd12",
"md5": "85e71a14272a3026f4be765cf0b7c69b",
"sha256": "0e794f7e73a1fe3a3c8ff1d099e179d04a70484fed212a63acbe684a6c8f76fd"
},
"downloads": -1,
"filename": "excel_list_transform-0.7.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "85e71a14272a3026f4be765cf0b7c69b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.12.6",
"size": 46320,
"upload_time": "2024-11-03T13:08:32",
"upload_time_iso_8601": "2024-11-03T13:08:32.830632Z",
"url": "https://files.pythonhosted.org/packages/4e/68/2917397fe8778ddcd9c47c945f42a6372874e0db94b3e7dfb1500e68fd12/excel_list_transform-0.7.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "6804322da3f31c230b5679114b9728f477fc784f0b904fc0efbbbc6a2faa1faf",
"md5": "101aed85279114a50b9edb878d5c6d9e",
"sha256": "4047ffda981615e507b30caf5cf26e94a017375b56190712193219b16bfa15c9"
},
"downloads": -1,
"filename": "excel_list_transform-0.7.0.tar.gz",
"has_sig": false,
"md5_digest": "101aed85279114a50b9edb878d5c6d9e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.12.6",
"size": 35984,
"upload_time": "2024-11-03T13:08:34",
"upload_time_iso_8601": "2024-11-03T13:08:34.783059Z",
"url": "https://files.pythonhosted.org/packages/68/04/322da3f31c230b5679114b9728f477fc784f0b904fc0efbbbc6a2faa1faf/excel_list_transform-0.7.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-03 13:08:34",
"github": false,
"gitlab": false,
"bitbucket": true,
"codeberg": false,
"bitbucket_user": "tom-bjorkholm",
"bitbucket_project": "excel-list-transform",
"lcname": "excel-list-transform"
}