smipoly

  - Name: smipoly
  - Version: 0.0.3
  - Home page: https://github.com/PEJpOhno/SMiPoly
  - Summary: rule-based virtual polymer library generator
  - Upload time: 2023-09-16 03:03:14
  - Author: Mitsuru Ohno
  - Requires Python: >=3.7
  - License: BSD 3-Clause License
  - Keywords: rdkit, polymer

# SMiPoly

## 1. What is SMiPoly?  
"SMiPoly (**S**mall **M**olecules **i**nto **Poly**mers)" is a rule-based virtual library generator for the discovery of functional polymers. It consists of two submodules, "monc.py" and "polg.py".  
"monc.py" is a monomer classifier that works on a list of small molecules, and "polg.py" is a polymer repeating unit generator that works on the classified monomer list.  

### 1-1. monc.py  
The functions of monc.py are as follows.  
  - extract monomers from a list of small molecules.
  - classify extracted monomers into each monomer class.

The chemical structures of the small molecules should be expressed in the simplified molecular input line entry system (SMILES) and given as a pandas DataFrame.  

Defined monomer classes  
  - vinyl  
  - cyclic olefin  
  - epoxide and diepoxide  
  - lactone  
  - lactam  
  - hydroxy carboxylic acid  
  - amino acid  
  - cyclic carboxylic acid anhydride and bis(cyclic carboxylic acid anhydride)  
  - hindered phenol  
  - dicarboxylic acid and acid halide  
  - diol  
  - diamine and primary diamine  
  - diisocyanate  
  - bis(halo aryl)sulfone  
  - bis(fluoro aryl)ketone

### 1-2. polg.py  
"polg.py" gives all synthesizable polymer repeating units starting from the classified monomer list generated by "monc.py".  
For chain polymerization (polyolefins and some polyethers), it gives homopolymers and binary copolymers. For successive (or step) polymerization, it gives homopolymers only.
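
To give a feel for how the chain-polymerization output grows, the following minimal sketch counts homopolymer and binary-copolymer candidates for a small monomer set. It is not SMiPoly code: the helper name `enumerate_chain_candidates` is hypothetical, and it assumes that a binary copolymer corresponds to an unordered pair of two different monomers.

```python
from itertools import combinations


def enumerate_chain_candidates(monomers):
    """Return (homopolymers, binary copolymers) as lists of monomer tuples."""
    unique = sorted(set(monomers))
    # One homopolymer candidate per distinct monomer.
    homopolymers = [(m,) for m in unique]
    # Assumption: a binary copolymer is an unordered pair of different monomers.
    copolymers = list(combinations(unique, 2))
    return homopolymers, copolymers


# Illustrative vinyl monomers as SMILES: ethylene, propylene, styrene.
homo, co = enumerate_chain_candidates(["C=C", "C=CC", "C=Cc1ccccc1"])
print(len(homo), "homopolymers,", len(co), "binary copolymers")
```

Under this assumption, n monomers give n homopolymers and n(n-1)/2 binary copolymers, so the copolymer part dominates for large monomer lists.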

Defined polymer classes  
  - polyolefin, polycyclic olefin and their binary copolymers  
  - polyester (from lactone, hydroxy carboxylic acid, dicarboxylic acid + diol, diol + CO and cyclic carboxylic acid anhydride + epoxide)  
  - polyether (from epoxide, hindered phenol, bis(halo aryl)sulfone + diol and bis(fluoro aryl)ketone + diol)  
  - polyamide (from lactam, amino acid and dicarboxylic acid + diamine)  
  - polyimide (bis(cyclic carboxylic acid anhydride) + primary diamine)  
  - polyurethane (diisocyanate + diol)  
  - polyoxazolidone (diepoxide + diisocyanate)  
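
The list above can be read as a lookup table from polymer class to starting-monomer combinations. The sketch below transcribes it into a plain dictionary for illustration only; it is not the format of the package's rule files. The names `POLYMER_ROUTES` and `classes_using` are hypothetical, and polyolefin is left out because the list does not spell out its starting monomers.

```python
# Polymer class -> list of starting-monomer combinations, as listed in the README.
POLYMER_ROUTES = {
    "polyester": [
        ("lactone",),
        ("hydroxy carboxylic acid",),
        ("dicarboxylic acid", "diol"),
        ("diol", "CO"),
        ("cyclic carboxylic acid anhydride", "epoxide"),
    ],
    "polyether": [
        ("epoxide",),
        ("hindered phenol",),
        ("bis(halo aryl)sulfone", "diol"),
        ("bis(fluoro aryl)ketone", "diol"),
    ],
    "polyamide": [
        ("lactam",),
        ("amino acid",),
        ("dicarboxylic acid", "diamine"),
    ],
    "polyimide": [("bis(cyclic carboxylic acid anhydride)", "primary diamine")],
    "polyurethane": [("diisocyanate", "diol")],
    "polyoxazolidone": [("diepoxide", "diisocyanate")],
}


def classes_using(monomer_class):
    """Return the polymer classes that list the given monomer class as a reactant."""
    return sorted(
        polymer
        for polymer, routes in POLYMER_ROUTES.items()
        if any(monomer_class in route for route in routes)
    )


print(classes_using("diol"))
```

A table like this makes it easy to see which polymer classes become reachable when a given monomer class is present in the input list.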

### 1-3. Publications  
SMiPoly: Generation of a Synthesizable Polymer Virtual Library Using Rule-Based Polymerization Reactions  
Mitsuru Ohno, Yoshihiro Hayashi, Qi Zhang, Yu Kaneko, and Ryo Yoshida  
*Journal of Chemical Information and Modeling* **2023** *63* (17), 5539-5548  
DOI: 10.1021/acs.jcim.3c00329  
https://doi.org/10.1021/acs.jcim.3c00329  
(version 0.0.1 was used)

## 2. Current version and requirements
current version = 0.0.3  
requirements
  - python 3.7, 3.8, 3.9, 3.10, 3.11  
  - rdkit >= 2020.09.1.0 #(2019.09.3 is unavailable)  
  - numpy >= 1.20.2  
  - pandas >= 1.2.4  
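
The minimum versions above can be checked programmatically. The following is a minimal sketch using only the standard library; `MINIMUM_VERSIONS`, `parse_version` and `meets_minimum` are hypothetical helper names, and the comparison only looks at the leading digits of each dotted component, so it is a rough check rather than a full version parser.

```python
import re

# Minimum versions as stated in the requirements list.
MINIMUM_VERSIONS = {"rdkit": "2020.09.1.0", "numpy": "1.20.2", "pandas": "1.2.4"}


def parse_version(text):
    """Turn '2020.09.1.0' into (2020, 9, 1, 0) using the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component without leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.2.4' and '1.2.4.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print(meets_minimum("2019.09.3", MINIMUM_VERSIONS["rdkit"]))
```

On Python 3.8 or later, the installed version string of a package can be obtained with `importlib.metadata.version` and passed to `meets_minimum`.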

## 3. Module contents  

smip.**monc.moncls**(*df, smiColn, minFG = 2, maxFG = 4, dsp_rsl=False*)  

ARGUMENTS:  

  - df: name of the object DataFrame  
  - smiColn: the column label of the SMILES column, given as a *str*.  
  - minFG: minimum number of the polymerizable functional groups in the monomer for successive polymerization (default 2; 2 or more)  
  - maxFG: maximum number of the polymerizable functional groups in the monomer for successive polymerization (default 4; 4 or less)  
  - dsp_rsl: display classified result (default False)  
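
A call to *monc.moncls* might look like the sketch below. The wrapper names `check_fg_range` and `classify_monomers` and the column label `'SMILES'` are hypothetical; the import path `smipoly.smip` is assumed from the package layout, and the check that minFG does not exceed maxFG is an assumption that the argument list does not state.

```python
def check_fg_range(minFG=2, maxFG=4):
    """Validate the documented bounds: minFG is 2 or more, maxFG is 4 or less."""
    if minFG < 2:
        raise ValueError("minFG must be 2 or more")
    if maxFG > 4:
        raise ValueError("maxFG must be 4 or less")
    # Assumption (not stated in the README): minFG should not exceed maxFG.
    if minFG > maxFG:
        raise ValueError("minFG must not exceed maxFG")
    return {"minFG": minFG, "maxFG": maxFG}


def classify_monomers(smiles_list, smiles_column="SMILES", minFG=2, maxFG=4):
    """Build a one-column DataFrame of SMILES and pass it to monc.moncls."""
    # Imported lazily so that check_fg_range works without pandas or smipoly.
    import pandas as pd
    from smipoly.smip import monc  # import path assumed from the package layout

    fg = check_fg_range(minFG, maxFG)
    df = pd.DataFrame({smiles_column: smiles_list})
    return monc.moncls(df=df, smiColn=smiles_column, dsp_rsl=False, **fg)


print(check_fg_range())
```

With smipoly installed, `classify_monomers(["C=Cc1ccccc1", "OCCO", "O=C(O)c1ccc(C(=O)O)cc1"])` would pass styrene, ethylene glycol and terephthalic acid to the classifier.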


smip.**polg.biplym**(*df, targ = \['all'\], Pmode = 'a', dsp_rsl=False*)  

ARGUMENTS:  

  - df: name of the DataFrame of classified monomers generated by *monc.moncls*.  
  - targ: targeted polymer class(es), given as a list of *str*. The selectable elements are 'polyolefin', 'polyester', 'polyether', 'polyamide', 'polyimide', 'polyurethane', 'polyoxazolidone' and 'all' (default = ['all'])  
  - Pmode: generate all isomers of the polymer repeating unit ('a') or only a representative polymer repeating unit ('r'). (default = 'a')  
  - dsp_rsl: display the DataFrame of the generated polymers. (default False)  
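
The selectable values of *targ* and *Pmode* can be validated before calling *polg.biplym*, as in the minimal sketch below. The helper names `biplym_options` and `generate_polymers` are hypothetical, the import path `smipoly.smip` is assumed from the package layout, and the sketch returns whatever *biplym* returns without assuming its structure.

```python
# Values taken from the argument list of polg.biplym.
POLYMER_CLASSES = {
    "polyolefin", "polyester", "polyether", "polyamide",
    "polyimide", "polyurethane", "polyoxazolidone", "all",
}
PMODES = {"a", "r"}


def biplym_options(targ=None, Pmode="a"):
    """Return validated keyword arguments for polg.biplym."""
    if targ is None:
        targ = ["all"]
    elif isinstance(targ, str):
        targ = [targ]  # accept a single class name for convenience
    else:
        targ = list(targ)
    unknown = [t for t in targ if t not in POLYMER_CLASSES]
    if unknown:
        raise ValueError(f"unknown polymer class(es): {unknown}")
    if Pmode not in PMODES:
        raise ValueError("Pmode must be 'a' or 'r'")
    return {"targ": targ, "Pmode": Pmode}


def generate_polymers(classified_df, targ=None, Pmode="a"):
    """Pass a DataFrame produced by monc.moncls to polg.biplym."""
    # Imported lazily so that biplym_options works without smipoly installed.
    from smipoly.smip import polg  # import path assumed from the package layout

    return polg.biplym(df=classified_df, dsp_rsl=False, **biplym_options(targ, Pmode))


print(biplym_options(["polyester", "polyamide"], Pmode="r"))
```

Restricting *targ* to the classes of interest is one way to keep the generated library small when only part of the polymer space is needed.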

## 4. Copyright and license  
Copyright (c) 2022 Mitsuru Ohno  
Released under the BSD 3-Clause License, which can be found in the LICENSE file.  

## 5. Installation and usage
### 5-1. Installation
Create a new virtual environment and activate it.
To install this package, run the following command.

```sh
pip install smipoly
```
### 5-2. Quick start
Download 'sample_data/202207_smip_monset.csv' and 'sample_script/sample_smip_demo.ipynb' from the [SMiPoly repository](https://github.com/PEJpOhno/SMiPoly) to the same directory on your computer.
Then run sample_smip_demo.ipynb. To run this demo script, Jupyter Notebook is required.

## 6. Sample dataset
The sample dataset './sample_data/202207_smip_monset.csv' includes 1,083 common monomers collected from published documents such as scientific articles and catalogues.

## 7. Utilities  
By using the files in the './utilities' directory, one can modify or add definitions of monomers, rules of polymerization reactions and polymer classes.  
To apply the new rule(s), replace the old './smipoly/rules' directory with the new one. The files must be run in the order of the number at the head of each filename.  

  - 1_MonomerDefiner.ipynb: definitions of monomers  
  - 2_Ps_rxnL.ipynb: rules of polymerization reactions    
  - 3_Ps_GenL.ipynb: definitions of polymer classes with combinations of starting monomer(s) and polymerization reaction  
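
The required run order can be derived from the numeric prefix of each filename. The following is a minimal sketch; the helper name `run_order` is hypothetical, and it only sorts the names without executing the notebooks.

```python
def run_order(filenames):
    """Sort notebook names by the integer that precedes the first underscore."""
    return sorted(filenames, key=lambda name: int(name.split("_", 1)[0]))


notebooks = ["3_Ps_GenL.ipynb", "1_MonomerDefiner.ipynb", "2_Ps_rxnL.ipynb"]
for name in run_order(notebooks):
    print(name)
```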

## 8. Directory configuration  

```sh
SMiPoly
├── src
│   └── smipoly
│       ├── __init__.py
│       ├── _version.py
│       ├── smip
│       │   ├── __init__.py
│       │   ├── funclib.py
│       │   ├── monc.py
│       │   └── polg.py
│       └── rules
│           ├── excl_lst.json
│           ├── mon_dic_inv.json
│           ├── mon_dic.json
│           ├── mon_lst.json
│           ├── mon_vals.json
│           ├── ps_class.json
│           ├── ps_gen.pkl
│           └── ps.rxn.pkl
├── LICENSE
├── pyproject.toml
├── setup.py
├── setup.cfg
├── README.md
├── sample_data
│   └── 202207_smip_monset.csv
├── sample_script
│   └── sample_smip_demo.ipynb
└── utilities
    ├── 1_MonomerDefiner.ipynb
    ├── 2_Ps_rxnL.ipynb
    ├── 3_Ps_GenL.ipynb
    └── rules/
```

## 9. Revision history
ver. 0.0.1: released  
ver. 0.0.3: revised the code to reduce memory consumption

## 10. Reference  
https://future-chem.com/rdkit-chemical-rxn/  
https://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html  
https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html  

            
