imsciences


- **Name**: imsciences
- **Version**: 0.5.7.1
- **Summary**: IMS Data Processing Package
- **Upload time**: 2024-05-17 13:25:47
- **Author**: IMS
- **Keywords**: python, data processing
- **Requirements**: No requirements were recorded.

# IMS Package Documentation

The IMS package is a Python library for processing incoming data into a format that projects can use. It offers a variety of functions for manipulating and analysing pandas DataFrames. The package provides the following functions:

### 1. `get_wd_levels(levels)`
- **Description**: Get the working directory with the option of moving up parents.
- **Usage**: `get_wd_levels(levels)`
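
As a rough illustration of what "moving up parents" means, the sketch below uses only `pathlib`. The helper name `wd_levels_sketch` is hypothetical; this is not the package's implementation, only one way the described behaviour could look.

```python
from pathlib import Path


def wd_levels_sketch(levels: int) -> Path:
    """Return the current working directory moved up `levels` parents (hypothetical helper)."""
    path = Path.cwd()
    for _ in range(levels):
        # At the filesystem root, .parent returns the root itself.
        path = path.parent
    return path


print(wd_levels_sketch(0))  # the current working directory
print(wd_levels_sketch(2))  # two directories above it
```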

### 2. `remove_rows(data_frame, num_rows_to_remove)`
- **Description**: Removes a specified number of rows from a pandas DataFrame.
- **Usage**: `remove_rows(data_frame, num_rows_to_remove)`

### 3. `aggregate_daily_to_wc_long(df, date_column, group_columns, sum_columns, wc, aggregation='sum', include_totals=False)`
- **Description**: Aggregates daily data into weekly data in long format, grouping by the specified columns and aggregating (summing by default) the specified value columns, with each week starting on a specified day of the week.
- **Usage**: `aggregate_daily_to_wc_long(df, date_column, group_columns, sum_columns, wc, aggregation='sum', include_totals=False)`
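
The week-commencing idea can be sketched in plain pandas. The helper `week_commencing`, the integer encoding of the start day (0 = Monday) and the sample columns are all assumptions made for illustration; how the package itself encodes `wc` is not stated here.

```python
import pandas as pd


def week_commencing(dates: pd.Series, start_day: int = 0) -> pd.Series:
    """Map each date to the first day of its week (start_day: 0 = Monday ... 6 = Sunday)."""
    offset = (dates.dt.dayofweek - start_day) % 7
    return dates - pd.to_timedelta(offset, unit="D")


daily = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=14, freq="D"),  # 2024-01-01 is a Monday
    "channel": ["tv", "radio"] * 7,
    "spend": [10.0] * 14,
})
daily["week"] = week_commencing(daily["date"], start_day=0)

# Long format: one row per (week, channel) pair.
weekly_long = daily.groupby(["week", "channel"], as_index=False)["spend"].sum()
print(weekly_long)
```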

### 4. `convert_monthly_to_daily(df, date_column)`
- **Description**: Converts monthly data in a DataFrame to daily data by expanding and dividing the numeric values.
- **Usage**: `convert_monthly_to_daily(df, date_column)`
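
One reading of "expanding and dividing" is that each monthly value is split evenly across the days of its month. The sketch below shows that interpretation with plain pandas; the column names and the even split are assumptions, not a description of the package's internals.

```python
import pandas as pd

monthly = pd.DataFrame({
    "month": pd.to_datetime(["2024-01-01", "2024-02-01"]),
    "spend": [310.0, 290.0],
})

pieces = []
for month_start, value in zip(monthly["month"], monthly["spend"]):
    # One row per calendar day of the month, each carrying an equal share.
    days = pd.date_range(month_start, periods=month_start.days_in_month, freq="D")
    pieces.append(pd.DataFrame({"date": days, "spend": value / len(days)}))

daily = pd.concat(pieces, ignore_index=True)
print(daily.head())
```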

### 5. `plot_two(df1, col1, df2, col2, date_column, same_axis=True)`
- **Description**: Plots specified columns from two different DataFrames using a shared date column. Useful for comparing data.
- **Usage**: `plot_two(df1, col1, df2, col2, date_column, same_axis=True)`

### 6. `remove_nan_rows(df, col_to_remove_rows)`
- **Description**: Removes rows from a DataFrame where the specified column has NaN values.
- **Usage**: `remove_nan_rows(df, col_to_remove_rows)`

### 7. `filter_rows(df, col_to_filter, list_of_filters)`
- **Description**: Filters the DataFrame based on whether the values in a specified column are in a provided list.
- **Usage**: `filter_rows(df, col_to_filter, list_of_filters)`

### 8. `plot_one(df1, col1, date_column)`
- **Description**: Plots a specified column from a DataFrame.
- **Usage**: `plot_one(df1, col1, date_column)`

### 9. `week_of_year_mapping(df, week_col, start_day_str)`
- **Description**: Converts a week column in 'yyyy-Www' or 'yyyy-ww' format to week commencing date.
- **Usage**: `week_of_year_mapping(df, week_col, start_day_str)`
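
For intuition, the conversion from a 'yyyy-Www' or 'yyyy-ww' label to a date can be sketched with the standard library. This sketch assumes ISO 8601 week numbering and a Monday week start; the helper name is hypothetical and the package may number weeks differently.

```python
from datetime import date


def iso_week_to_monday(week: str) -> date:
    """Convert 'yyyy-Www' or 'yyyy-ww' to the Monday of that ISO week (hypothetical helper)."""
    year_part, week_part = week.split("-")
    return date.fromisocalendar(int(year_part), int(week_part.lstrip("Ww")), 1)


print(iso_week_to_monday("2024-W01"))
print(iso_week_to_monday("2024-07"))
```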

### 10. `exclude_rows(df, col_to_filter, list_of_filters)`
- **Description**: Removes rows from a DataFrame whose values in a specified column are in a provided list, keeping only the rows whose values are not in the list.
- **Usage**: `exclude_rows(df, col_to_filter, list_of_filters)`
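
`filter_rows` and `exclude_rows` are described as opposites of each other. In plain pandas, the two operations can be sketched with `Series.isin` and its negation; the sample data is invented and this is not taken from the package source.

```python
import pandas as pd

df = pd.DataFrame({
    "channel": ["tv", "radio", "search", "tv"],
    "spend": [1, 2, 3, 4],
})
wanted = ["tv", "search"]

kept = df[df["channel"].isin(wanted)]       # rows whose channel is in the list
excluded = df[~df["channel"].isin(wanted)]  # rows whose channel is not in the list

print(kept)
print(excluded)
```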

### 11. `rename_cols(df, cols_to_rename)`
- **Description**: Renames columns in a pandas DataFrame.
- **Usage**: `rename_cols(df, cols_to_rename)`

### 12. `merge_new_and_old(old_df, old_col, new_df, new_col, cutoff_date, date_col_name='OBS')`
- **Description**: Creates a new DataFrame with two columns: one for dates and one for merged numeric values.
- **Usage**: `merge_new_and_old(old_df, old_col, new_df, new_col, cutoff_date, date_col_name='OBS')`

### 13. `merge_dataframes_on_date(dataframes, common_column='OBS', merge_how='outer')`
- **Description**: Merge a list of DataFrames on a common column.
- **Usage**: `merge_dataframes_on_date(dataframes, common_column='OBS', merge_how='outer')`
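
Merging a list of DataFrames on one column can be sketched as a fold over `pandas.merge`. The sample frames are invented, and the package may handle ordering or duplicate columns differently.

```python
from functools import reduce

import pandas as pd

a = pd.DataFrame({"OBS": pd.to_datetime(["2024-01-01", "2024-01-08"]), "tv": [1, 2]})
b = pd.DataFrame({"OBS": pd.to_datetime(["2024-01-08", "2024-01-15"]), "radio": [3, 4]})
c = pd.DataFrame({"OBS": pd.to_datetime(["2024-01-01"]), "search": [5]})

# Merge pairwise from left to right, keeping every date seen in any frame.
merged = reduce(lambda left, right: pd.merge(left, right, on="OBS", how="outer"), [a, b, c])
merged = merged.sort_values("OBS").reset_index(drop=True)
print(merged)
```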

### 14. `merge_and_update_dfs(df1, df2, key_column)`
- **Description**: Merges two dataframes on a key column, updates the first dataframe's columns with the second's where available, and returns a dataframe sorted by the key column.
- **Usage**: `merge_and_update_dfs(df1, df2, key_column)`
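
One way to get "update the first DataFrame with the second where available" in plain pandas is `DataFrame.combine_first`. The sketch below assumes that "available" means non-null in the second DataFrame; the package may define it differently.

```python
import pandas as pd

df1 = pd.DataFrame({"OBS": [1, 2, 3], "spend": [10.0, 20.0, 30.0]})
df2 = pd.DataFrame({"OBS": [2, 3, 4], "spend": [21.0, None, 41.0]})

updated = (
    df2.set_index("OBS")
    .combine_first(df1.set_index("OBS"))  # df2 wins where it has a value, df1 fills the gaps
    .reset_index()
    .sort_values("OBS")
)
print(updated)
```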

### 15. `convert_us_to_uk_dates(df, date_col)`
- **Description**: Convert a DataFrame column with mixed date formats to datetime.
- **Usage**: `convert_us_to_uk_dates(df, date_col)`

### 16. `combine_sheets(all_sheets)`
- **Description**: Combines multiple DataFrames from a dictionary into a single DataFrame.
- **Usage**: `combine_sheets({'Sheet1': df1, 'Sheet2': df2})`

### 17. `pivot_table(df, filters_dict, index_col, columns, values_col, fill_value=0,aggfunc='sum',margins=False,margins_name='Total',datetime_trans_needed=True)`
- **Description**: Dynamically pivots a DataFrame based on specified columns.
- **Usage**: `pivot_table(df, {'Master Include': ' == 1', 'OBS': ' >= datetime(2019,9,9)', 'Metric Short Names': " == 'spd'"}, 'OBS', 'Channel Short Names', 'Value', fill_value=0, aggfunc='sum', margins=False, margins_name='Total', datetime_trans_needed=True)`

### 18. `apply_lookup_table_for_columns(df, col_names, to_find_dict, if_not_in_country_dict='Other', new_column_name='Mapping')`
- **Description**: Equivalent of XLOOKUP in Excel. Maps a dictionary of substrings onto the values of a column. If multiple columns are needed for the lookup table (LUT), their values must be joined with a `|` separator.
- **Usage**: `apply_lookup_table_for_columns(df, ['campaign type','media type'], {'France Paid Social FB|paid social': 'facebook','France Paid Social TW|paid social': 'twitter'}, 'other', 'mapping')`

### 19. `aggregate_daily_to_wc_wide(df, date_column, group_columns, sum_columns, wc, aggregation='sum', include_totals=False)`
- **Description**: Aggregates daily data into weekly data in wide format, grouping by the specified columns and aggregating (summing by default) the specified value columns, with each week starting on a specified day of the week.
- **Usage**: `aggregate_daily_to_wc_wide(df, date_column, group_columns, sum_columns, wc, aggregation='sum', include_totals=False)`
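
The difference between long and wide output can be sketched with `DataFrame.pivot_table`: the grouping column becomes a set of columns, one per group. The data below is invented, and the package's own column naming is not shown here.

```python
import pandas as pd

weekly_long = pd.DataFrame({
    "week": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-08", "2024-01-08"]),
    "channel": ["radio", "tv", "radio", "tv"],
    "spend": [30.0, 40.0, 40.0, 30.0],
})

# Wide format: one row per week, one column per channel.
weekly_wide = weekly_long.pivot_table(
    index="week", columns="channel", values="spend", aggfunc="sum", fill_value=0
)
print(weekly_wide)
```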

### 20. `merge_cols_with_seperator(df, col_names, seperator='_', output_column_name='Merged', starting_prefix_str=None, ending_prefix_str=None)`
- **Description**: Merges multiple columns in a DataFrame into one column, joined with a separator. Can be used when multiple columns are needed for a LUT.
- **Usage**: `merge_cols_with_seperator(df, ['Campaign','Product'], seperator='|', output_column_name='Merged Columns', starting_prefix_str='start_', ending_prefix_str='_end')`

### 21. `check_sum_of_df_cols_are_equal(df_1,df_2,cols_1,cols_2)`
- **Description**: Checks whether the sums of two columns, one from each of two DataFrames, are equal, and reports each sum and the difference between them.
- **Usage**: `check_sum_of_df_cols_are_equal(df_1,df_2,'Media Cost','Spend')`

### 22. `convert_2_df_cols_to_dict(df, key_col, value_col)`
- **Description**: Creates a dictionary from two columns of a DataFrame. Can be used to build a lookup table (LUT).
- **Usage**: `convert_2_df_cols_to_dict(df, 'Campaign', 'Channel')`
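
Building a lookup dictionary from two columns can be sketched with `dict(zip(...))`. The sample values are invented. With this approach a repeated key keeps its last value; whether the package behaves the same way is not confirmed.

```python
import pandas as pd

df = pd.DataFrame({
    "Campaign": ["Brand UK", "Generic UK", "Brand UK"],
    "Channel": ["search", "search", "social"],
})

# Keys come from one column, values from the other; later rows overwrite earlier ones.
lut = dict(zip(df["Campaign"], df["Channel"]))
print(lut)
```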

### 23. `create_FY_and_H_columns(df, index_col, start_date, starting_FY,short_format='No',half_years='No',combined_FY_and_H='No')`
- **Description**: Used to create a financial year, half year, and financial half year column.
- **Usage**: `create_FY_and_H_columns(df, 'Week (M-S)', '2022-10-03', 'FY2023',short_format='Yes',half_years='Yes',combined_FY_and_H='Yes')`

### 24. `keyword_lookup_replacement(df, col, replacement_rows, cols_to_merge, replacement_lookup_dict,output_column_name='Updated Column')`
- **Description**: Updates chosen values in a specified column of the DataFrame based on a lookup dictionary, in effect an if statement combined with an Excel XLOOKUP.
- **Usage**: `keyword_lookup_replacement(df, 'channel', 'Paid Search Generic', ['channel','segment','product'], qlik_dict_for_channel,output_column_name='Channel New')`

### 25. `create_new_version_of_col_using_LUT(df, keys_col,value_col, dict_for_specific_changes, new_col_name='New Version of Old Col')`
- **Description**: Creates a new column in a DataFrame that copies an existing column and changes its values according to a lookup table. The lookup is keyed on another column in the DataFrame.
- **Usage**: `create_new_version_of_col_using_LUT(df, '*Campaign Name', 'Campaign Type', search_campaign_name_retag_lut, 'Campaign Name New')`

### 26. `convert_df_wide_2_long(df,value_cols,variable_col_name='Stacked',value_col_name='Value')`
- **Description**: Changes a dataframe from wide to long format.
- **Usage**: `convert_df_wide_2_long(df, ['Media Cost','Impressions','Clicks'], variable_col_name='Metric')`
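
In plain pandas, a wide-to-long reshape is usually done with `DataFrame.melt`. The sketch below uses invented data and assumes the remaining column (`OBS`) acts as the identifier; it illustrates the reshape rather than the package's code.

```python
import pandas as pd

wide = pd.DataFrame({
    "OBS": ["2024-01-01", "2024-01-08"],
    "Media Cost": [100.0, 120.0],
    "Impressions": [1000, 1500],
    "Clicks": [10, 18],
})

# Each value column becomes a set of rows labelled by the metric name.
long_df = wide.melt(
    id_vars=["OBS"],
    value_vars=["Media Cost", "Impressions", "Clicks"],
    var_name="Metric",
    value_name="Value",
)
print(long_df)
```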

### 27. `manually_edit_data(df, filters_dict, col_to_change, new_value, change_in_existing_df_col='No', new_col_to_change_name='New', manual_edit_col_name=None, add_notes='No', existing_note_col_name=None, note=None)`
- **Description**: Manually updates cells in a DataFrame by applying filters to select rows and choosing a column to edit.
- **Usage**: `manually_edit_data(df, {'OBS': ' <= datetime(2023,1,23)', 'File_Name': " == 'France media'"}, 'Master Include', 1, change_in_existing_df_col='Yes', new_col_to_change_name='Master Include', manual_edit_col_name='Manual Changes')`

### 28. `format_numbers_with_commas(df, decimal_length_chosen=2)`
- **Description**: Converts data in numerical format into numbers with commas and a chosen decimal place length.
- **Usage**: `format_numbers_with_commas(df,1)`
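
The formatting described here corresponds to Python's `,` and `.Nf` format specifiers. The sketch below applies them to every cell of an invented, all-numeric DataFrame; it is an illustration, and the package may treat non-numeric columns differently.

```python
import pandas as pd

df = pd.DataFrame({"spend": [1234567.891, 1000.0], "clicks": [12345, 7]})
decimals = 1

# Format every cell as a string with thousands separators and a fixed number of decimals.
formatted = df.apply(lambda col: col.map(lambda x: f"{float(x):,.{decimals}f}"))
print(formatted)
```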

### 29. `filter_df_on_multiple_conditions(df, filters_dict)`
- **Description**: Filters dataframe on multiple conditions, which come in the form of a dictionary.
- **Usage**: `filter_df_on_multiple_conditions(df, {'OBS': ' <= datetime(2023,1,23)', 'File_Name': " == 'France media'"})`
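
A dictionary of column names mapped to condition strings can be sketched with `DataFrame.query`, joining the conditions with `and`. This is an assumption about how such a dictionary could be evaluated, not the package's method, and the sample data is invented.

```python
import pandas as pd

df = pd.DataFrame({
    "Master Include": [1, 0, 1],
    "File_Name": ["France media", "France media", "UK media"],
    "Value": [10, 20, 30],
})
filters = {"Master Include": " == 1", "File_Name": " == 'France media'"}

# Backticks let query() reference column names that contain spaces.
expr = " and ".join(f"`{col}`{cond}" for col, cond in filters.items())
filtered = df.query(expr, engine="python")
print(filtered)
```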

### 30. `read_and_concatenate_files(folder_path, file_type='csv')`
- **Description**: Reads and concatenates all files of one type in a folder.
- **Usage**: `read_and_concatenate_files(folder_path, file_type='csv')`
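
For CSV files, reading and concatenating a folder can be sketched with `pathlib` and `pandas.concat`. The helper name is hypothetical and only the CSV case is covered; the temporary folder exists purely to make the example self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_and_concat_csv_sketch(folder_path) -> pd.DataFrame:
    """Read every .csv file in a folder and stack the rows (hypothetical helper)."""
    files = sorted(Path(folder_path).glob("*.csv"))
    return pd.concat((pd.read_csv(f) for f in files), ignore_index=True)


with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"a": [1, 2]}).to_csv(Path(tmp) / "one.csv", index=False)
    pd.DataFrame({"a": [3]}).to_csv(Path(tmp) / "two.csv", index=False)
    combined = read_and_concat_csv_sketch(tmp)

print(combined)
```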

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "imsciences",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "python, data processing",
    "author": "IMS",
    "author_email": "cam@im-sciences.com",
    "download_url": "https://files.pythonhosted.org/packages/31/94/ffe8a81002633bcb9dfb7e180d4301f1cf4b93baa1261de8b10369af5daf/imsciences-0.5.7.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "IMS Data Processing Package",
    "version": "0.5.7.1",
    "project_urls": null,
    "split_keywords": [
        "python",
        " data processing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6403be5f715b5e17ed8dfc5084170c7ea17380b9e6304ed46653bab82689ed01",
                "md5": "abf3a5804cf6ebfa5f6d9f4dd24395ae",
                "sha256": "1334f1f0646773a2860d7da9b1ab18235e09235641b0b57172038fdbe579132c"
            },
            "downloads": -1,
            "filename": "imsciences-0.5.7.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "abf3a5804cf6ebfa5f6d9f4dd24395ae",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 24321,
            "upload_time": "2024-05-17T13:25:45",
            "upload_time_iso_8601": "2024-05-17T13:25:45.543900Z",
            "url": "https://files.pythonhosted.org/packages/64/03/be5f715b5e17ed8dfc5084170c7ea17380b9e6304ed46653bab82689ed01/imsciences-0.5.7.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3194ffe8a81002633bcb9dfb7e180d4301f1cf4b93baa1261de8b10369af5daf",
                "md5": "82ce53008292ca08ea1e024d64bf4700",
                "sha256": "61a941dab6742f2fe6fc3295ccf226fc817911570e40b6b7fa7dd207f1b9c73f"
            },
            "downloads": -1,
            "filename": "imsciences-0.5.7.1.tar.gz",
            "has_sig": false,
            "md5_digest": "82ce53008292ca08ea1e024d64bf4700",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 23843,
            "upload_time": "2024-05-17T13:25:47",
            "upload_time_iso_8601": "2024-05-17T13:25:47.041211Z",
            "url": "https://files.pythonhosted.org/packages/31/94/ffe8a81002633bcb9dfb7e180d4301f1cf4b93baa1261de8b10369af5daf/imsciences-0.5.7.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-17 13:25:47",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "imsciences"
}
        