idsw

- Name: idsw
- Version: 1.3.1
- Home page: https://github.com/marcosoares-92/IndustrialDataScienceWorkflow
- Summary: Full workflow for ETL, statistics, and Machine learning modelling of (usually) time-stamped industrial facilities data.
- Upload time: 2024-03-12 21:05:29
- Authors: Marco Cesar Prado Soares; Gabriel Fernandes Luz; Sergio Guilherme Neto
- Requires Python: >=3.7
- License: MIT
- Keywords: idsw, industrialdatascienceworkflow
- Requirements: none recorded

# Industrial Data Science Workflow
## Industrial Data Science Workflow: full workflow for ETL, statistics, and machine learning modelling of (usually) time-stamped industrial facilities data.
### Beyond monitoring quality and industrial facility systems, the package can be applied to data manipulation, characterization, and modelling of numeric and categorical datasets, and can replace traditional tools such as SAS, Minitab, and Statistica.

- Check the project on GitHub: https://github.com/marcosoares-92/IndustrialDataScienceWorkflow
- Check our `Steel Industry Simulator` at: https://github.com/marcosoares-92/steelindustrysimulator
	- The ideal tool for training in process improvement, data collection, analysis, and modelling.
	- User interface available at: https://colab.research.google.com/github/marcosoares-92/steelindustrysimulator/blob/main/steelindustry_digitaltwin.ipynb


Authors:
- Marco Cesar Prado Soares, Data Scientist Specialist at Bayer (Crop Science)
  - marcosoares.feq@gmail.com

- Gabriel Fernandes Luz, Senior Data Scientist
  - gfluz94@gmail.com

- Sergio Guilherme Neto, Data Analyst
  - sguilhermeneto@gmail.com

### If you cannot install the latest version of the idsw package directly from PyPI using `pip install idsw`:

1. Open the terminal and run:

	git clone "https://github.com/marcosoares-92/IndustrialDataScienceWorkflow"

to clone all the files (you could also fork the repository).

2. Go to the directory called idsw.
3. In a terminal where Python and pip are available, from inside the idsw folder, run:

	pip install .

- To navigate to the folder, use the command `cd "...\idsw"`, providing the full path to idsw.
- Alternatively, run `pip install ".\*.tar.gz"` in a terminal opened in that folder.
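
After installing, a quick check from Python can confirm that the package is registered in the current environment. The sketch below is not part of idsw: `installed_version` is a hypothetical helper built on the standard-library `importlib.metadata` module, which assumes Python 3.8 or later (on Python 3.7 the `importlib_metadata` backport would be needed instead). It only detects pip-installed distributions, not a folder copied by hand.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("idsw")
    if version is None:
        print("idsw is not installed in this environment.")
    else:
        print(f"idsw {version} is installed.")
```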

### After cloning the repository, you can also use the package without installing it:
1. Copy the whole idsw folder to the working directory where your Python file or Jupyter notebook is saved.
- There must be an idsw folder in the same directory as the Python file.
2. In your Python file, run the following command (or run it in a Jupyter notebook cell):

	from idsw import *

to import all idsw functions without the `idsw` prefix; or:

	import idsw

to import the package under the name `idsw`.
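
The folder requirement above can be checked before importing. This is a minimal sketch that assumes nothing about idsw beyond its folder name; `package_folder_available` is a hypothetical helper, not part of the package. If the folder is found in a directory other than the working directory, that directory is added to `sys.path` so the import can locate it.

```python
import sys
from pathlib import Path


def package_folder_available(package_name="idsw", search_dir="."):
    """Return True if `search_dir` contains a folder named `package_name`.

    When the folder is found, `search_dir` is added to sys.path (once), so the
    package can be imported even if `search_dir` is not the working directory.
    """
    parent = Path(search_dir).resolve()
    if not (parent / package_name).is_dir():
        return False
    if str(parent) not in sys.path:
        sys.path.insert(0, str(parent))
    return True


if __name__ == "__main__":
    if package_folder_available("idsw"):
        import idsw  # names are then accessed with the idsw. prefix
        print("idsw imported.")
    else:
        print("No idsw folder found; copy it next to this file first.")
```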

### Alternatively, if you do not want to clone the repository, you may download the file `load.py` and copy it to the working directory.
1. After downloading load.py and copying it to the working directory, in your Python environment, run:
	
	import load

2. Once this step finishes, you may import the package as:

	from idsw import *

or as:
	
	import idsw

#### The `load.py` file runs the following code, which may be copied to your Python environment and run:

	from subprocess import Popen, PIPE, TimeoutExpired
	import shutil

	class LoadIDSW:
	    """Clone the repository and move its 'idsw' subdirectory to the working directory."""

	    def __init__(self, timeout = 60):
	        self.cmd_line1 = """git clone https://github.com/marcosoares-92/IndustrialDataScienceWorkflow IndustrialDataScienceWorkflow"""
	        self.msg1 = "Cloning IndustrialDataScienceWorkflow to working directory."
	        self.cmd_line2 = """mv IndustrialDataScienceWorkflow/idsw ."""
	        self.msg2 = "Subdirectory 'idsw' moved to root directory. Now it can be directly imported."
	        self.timeout = timeout

	    def set_process(self, cmd_line):
	        # Start the command without a shell; stdout and stderr are captured through pipes.
	        proc = Popen(cmd_line.split(" "), stdout = PIPE, stderr = PIPE)
	        return proc

	    def run_process(self, proc, msg = ''):
	        try:
	            output, error = proc.communicate(timeout = self.timeout)
	            if len(msg) > 0:
	                print(msg)
	        except TimeoutExpired:
	            # The timeout expired: keep waiting until the process finishes.
	            output, error = proc.communicate()
	        return output, error

	    def clone_repo(self):
	        self.proc1 = self.set_process(self.cmd_line1)
	        self.output1, self.error1 = self.run_process(self.proc1, self.msg1)
	        return self

	    def move_pkg(self):
	        self.proc2 = self.set_process(self.cmd_line2)
	        self.output2, self.error2 = self.run_process(self.proc2, self.msg2)
	        return self

	    def move_pkg_alternative(self):
	        # Pure-Python equivalent of the 'mv' command.
	        source = 'IndustrialDataScienceWorkflow/idsw'
	        destination = '.'
	        shutil.move(source, destination)
	        return self

	loader = LoadIDSW(timeout = 60)
	loader = loader.clone_repo()
	loader = loader.move_pkg()

	try:
	    from idsw import *
	except ModuleNotFoundError:
	    # The 'mv' step did not leave an importable package here: move it with shutil instead.
	    loader = loader.move_pkg_alternative()

	msg = """Package copied to the working directory.
		To import its whole content, run:
		
		    from idsw import *
		"""
	print(msg)
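
For comparison, the same clone-then-move idea can be written more compactly with `subprocess.run` and `shutil.move`, which avoids the platform-specific `mv` command. This is a sketch under stated assumptions, not the contents of `load.py`: the names `clone_command`, `move_package`, and `fetch_idsw` are hypothetical, the shallow clone (`--depth 1`) is an added choice, and `git` is assumed to be available on the PATH.

```python
import shutil
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/marcosoares-92/IndustrialDataScienceWorkflow"


def clone_command(url, target):
    """Build the argument list for a shallow git clone (no shell involved)."""
    return ["git", "clone", "--depth", "1", url, str(target)]


def move_package(repo_dir, package_name="idsw", dest="."):
    """Move <repo_dir>/<package_name> into `dest` and return the new path."""
    source = Path(repo_dir) / package_name
    target = Path(dest) / package_name
    if target.exists():
        # The package is already in place; nothing to move.
        return target
    if not source.is_dir():
        raise FileNotFoundError(f"{source} not found; was the repository cloned?")
    shutil.move(str(source), str(target))
    return target


def fetch_idsw(workdir=".", timeout=60):
    """Clone the repository into `workdir` (if needed) and expose the idsw folder."""
    workdir = Path(workdir)
    repo_dir = workdir / "IndustrialDataScienceWorkflow"
    if not repo_dir.exists():
        # check=True raises CalledProcessError if git fails;
        # timeout raises subprocess.TimeoutExpired after `timeout` seconds.
        subprocess.run(
            clone_command(REPO_URL, repo_dir),
            check=True,
            timeout=timeout,
            capture_output=True,
        )
    return move_package(repo_dir, dest=workdir)
```

Calling `fetch_idsw()` from the working directory would then leave an `idsw` folder there, after which `import idsw` or `from idsw import *` can be run as described above.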

# History

## 1.2.0
### Fixed
- Deprecated structures

### Added
- New functionalities added.

### Reshape of project design.
- New division into modules and new names for functions and classes.

### Removed
- Removed support for Python < 3.7

## 1.2.1
### Fixed
- Setup issues.

## 1.2.2
### Fixed
- Setup issues: need for rigid and specific versions of the libraries.

## 1.2.3
### Fixed
- Setup issues.

## 1.2.4
### Fixed
- Import bugs.
- Introduced function for Excel writing.

## 1.2.5
### Fixed
- Bugs when exporting Matplotlib figures.
- The 'quality' argument is no longer supported by the plt.savefig function (Matplotlib), so it was removed.
- This modification was needed for the steelindustrysimulator, which is based on idsw, to work correctly.
- Check the simulator project at: https://github.com/marcosoares-92/steelindustrysimulator
	- The ideal tool for training in process improvement, data collection, analysis, and modelling.

## 1.2.6
### Fixed
- Exporting figures generated a message with a path like '{new_file_path}.png.png'. Fixed to '{new_file_path}.png'.

## 1.3.0
### Added
- New functionalities added.

### Reshape of project design.
- New division of functions and classes and corresponding modules.
- Refactoring of functions and classes to improve code efficiency.
- Added new pipelines for fetching data and modified the storage of connectors.
- It includes pipelines for fetching table regions in Excel files, even if they are stored in the same tab; and a pipeline for downloading files stored in MS SharePoint.
- Added ControlVars dataclass to store whether the user wants to hide results and plots.

## 1.3.1
### Improved
- Benford algorithm for fraud and outlier detection.
- Pipeline for fetching SharePoint and downloading files.

            

        