naimai

Name	naimai
Version	1.0.0.6
home_page	https://github.com/yassinekdi/naimai
Summary	Python library to help with scientific literature research
upload_time	2022-12-08 11:36:16
maintainer	Yassine Kaddi
docs_url	None
author	Yassine Kaddi
requires_python
license	CC BY-NC-SA
keywords	science review bibliography python nlp machine-learning information-extraction
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            <p align ="center">
  <img src="https://github.com/yassinekdi/naimai/blob/master/logo.png?raw=true" 
  alt="Naimai logo" height="25%" width="25%"/>
</p>

NaimAI is a Python package that (1) <b>searches efficiently in papers</b> and (2) <b>generates an automatic review</b>. It does so by structuring each scientific paper, based on its abstract, into 3 categories: objectives, methods and results. 
Hence, when searching, the results are shown by category. The results can then be reviewed, and a review text is 
automatically generated along with the references.
<br>
All the features are deployed on the <a href="https://www.naimai.fr" target="_blank">NaimAI website</a>, where millions of papers are processed. 
<br>
A <a href="https://yaassinekaddi.medium.com/literature-review-with-naimai-open-sourced-fcbdb36762de" target="_blank">Medium article</a> goes into more depth on the naimai features available in the <a href="https://www.naimai.fr" target="_blank">web app</a>. 
<h1>Search in your own papers</h1>

You can either provide the path to a folder containing articles in PDF format or a CSV file with abstracts and other metadata, as shown 
<a href="tests/papers/input_data" target="_blank">here</a>.
<br>
The processing, the results and the search for relevant papers are explained in 
<a href="https://colab.research.google.com/drive/1xUDOkalxR7MFO6Zug48Cx1ysmgipaJCT?usp=sharing" target="_blank">this colab</a>.
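
As an illustration of the CSV route, the sketch below writes a small file of abstracts with Python's standard `csv` module. The column names (`title`, `authors`, `year`, `abstract`), the placeholder records and the helper `write_papers_csv` are assumptions made for this example; the sample files linked above show the layout naimai actually expects.

```python
import csv

# Placeholder records. The field names are illustrative assumptions,
# not naimai's documented schema.
papers = [
    {
        "title": "Example paper A",
        "authors": "Jane Doe; John Smith",
        "year": 2020,
        "abstract": "We study X. We use method Y. We find Z.",
    },
    {
        "title": "Example paper B",
        "authors": "Alex Martin",
        "year": 2021,
        "abstract": "This work aims at W. A model is trained. Results improve.",
    },
]


def write_papers_csv(path, rows):
    """Write one row per paper, with a header line, to a UTF-8 CSV file."""
    fieldnames = ["title", "authors", "year", "abstract"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Calling `write_papers_csv("my_papers.csv", papers)` produces a file with one header row followed by one row per paper.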

<h1> Search in millions of papers </h1>
To search in the millions of papers already processed, you can use the <a href="https://www.naimai.fr" target="_blank">naimai website</a>.
I might open source this part too if needed.

<h1>Structure your abstract</h1>
If you already have an abstract and want to test the segmentor (naimai's algorithm that structures an abstract into Background, 
Objectives, Methods and Results), <a href="https://colab.research.google.com/drive/16PMGC7yxkTcFpUnlZtioBMa22tpaTid5?usp=sharing">this colab</a>
walks you through the necessary steps. 

Example of a structured abstract:
<p>
  <img src="https://github.com/yassinekdi/naimai/blob/master/bomr_classif.jpg?raw=true" 
  alt="classified abstract"/>
</p>


<h1> Features to improve </h1>
<h3>Review Generation </h3>
<p>
The review generation needs further improvement. The current method only rephrases the objective sentence of each paper. 
I have some ideas to go further and improve the review generation part. Let me know if you're interested and we'll do it
together!</p>
<p> Besides the generated text, the reference generation can still be improved to support more reference styles
 and to export to other formats (BibTeX, etc.).
</p>
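
To make the BibTeX export idea concrete, here is a minimal sketch of turning one reference into an `@article` entry. It is not part of naimai: the function `to_bibtex`, the dictionary keys and the citation-key scheme (first author's last name plus year) are all assumptions chosen for illustration.

```python
def to_bibtex(ref):
    """Format one reference dict as a BibTeX @article entry.

    Expects the hypothetical keys: authors (list of str), title,
    journal and year.
    """
    # Citation key: last name of the first author, lowercased, plus the year.
    key = f"{ref['authors'][0].split()[-1].lower()}{ref['year']}"
    fields = {
        "author": " and ".join(ref["authors"]),  # BibTeX separates authors with "and"
        "title": ref["title"],
        "journal": ref["journal"],
        "year": str(ref["year"]),
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@article{{{key},\n{body}\n}}"
```

Other entry types (`@inproceedings`, `@book`) and the escaping of special characters are left out of this sketch.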
<h3>Semantic search </h3>
The search is mainly based on a v0 semantic algorithm (mostly a TF-IDF model). In a previous version, 
I fine-tuned a BERT model for each field and the results were quite interesting. The problem is that, with 10 fields 
on the web app, I ended up with 10 fine-tuned models, so usage was slow and the models were heavy.
If you have any ideas and/or want to contribute to this part, I'll be happy to talk to you! 
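
For readers unfamiliar with TF-IDF ranking, the sketch below shows the general technique in pure Python: each abstract becomes a vector of term weights, and abstracts are ranked by cosine similarity to the query. This is a generic illustration, not naimai's actual search code; the function names, the tokenizer and the smoothed IDF formula are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def weigh(tokens, idf):
    """Map each known term to (term count) * (inverse document frequency)."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def build_index(documents):
    """Compute smoothed IDF values and one TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return idf, [weigh(tokens, idf) for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, top_k=3):
    """Return up to top_k (document index, score) pairs, best match first."""
    idf, vectors = build_index(documents)
    query_vector = weigh(tokenize(query), idf)
    scored = [(cosine(query_vector, vec), i) for i, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Documents sharing no term with the query score 0 and are dropped.
    return [(i, score) for score, i in scored[:top_k] if score > 0.0]
```

A lexical model like this only matches exact tokens ("sentence" does not match "sentences"), which is one reason fine-tuned language models can give better results at a higher computational cost.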

<h3>Data papers </h3>
I've used about 10 million open access abstracts that I found here and there on the internet. If you have any source that could be useful, or even better, if we can process many more papers together to provide more information to users, that'd be cool!
<h1>References</h1>
<ul>
    <li>
    For abbreviation handling, I used <a href="https://gist.github.com/ijmarshall/b3d1de6ccf4fb8b5ee53" target="_blank">this code</a>.
    </li>
    <li>
    For PDF processing, I used <a href="https://github.com/kermitt2/grobid" target="_blank">Grobid</a>.
    </li>
</ul>


[![CC BY-NC-SA 4.0][cc-by-nc-sa-shield]][cc-by-nc-sa]

This work is licensed under a
[Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa].

[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]

[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
[cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png
[cc-by-nc-sa-shield]: https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/yassinekdi/naimai",
    "name": "naimai",
    "maintainer": "Yassine Kaddi",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "yassine@naimai.fr",
    "keywords": "science,review,bibliography,python,nlp,machine-learning,information-extraction",
    "author": "Yassine Kaddi",
    "author_email": "yassine@naimai.fr",
    "download_url": "https://files.pythonhosted.org/packages/e1/23/24ff79d0276b37ef90ca928ecd095b20ca4052b8b70faae1077afb12a1de/naimai-1.0.0.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "CC BY-NC-SA",
    "summary": "Python library to help with scientific literature research",
    "version": "1.0.0.6",
    "split_keywords": [
        "science",
        "review",
        "bibliography",
        "python",
        "nlp",
        "machine-learning",
        "information-extraction"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "fd6822ee61b5941f5f0ca02a1337a2f4",
                "sha256": "a3af0b5a37df0424d05f0f22bfae342aa4ba74d462e545193cd38d8950193307"
            },
            "downloads": -1,
            "filename": "naimai-1.0.0.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "fd6822ee61b5941f5f0ca02a1337a2f4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 255513,
            "upload_time": "2022-12-08T11:36:10",
            "upload_time_iso_8601": "2022-12-08T11:36:10.286623Z",
            "url": "https://files.pythonhosted.org/packages/45/28/e5de78f394966940bee5c49362e1c0c03751c09db8c4ff3921ff2d959927/naimai-1.0.0.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "00917fe58ddd25f2b09f9379e9303e01",
                "sha256": "78c4d556adf7f66586adf5158a17ba350398659d030d31df5002ecd11b375f31"
            },
            "downloads": -1,
            "filename": "naimai-1.0.0.6.tar.gz",
            "has_sig": false,
            "md5_digest": "00917fe58ddd25f2b09f9379e9303e01",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 99393,
            "upload_time": "2022-12-08T11:36:16",
            "upload_time_iso_8601": "2022-12-08T11:36:16.262441Z",
            "url": "https://files.pythonhosted.org/packages/e1/23/24ff79d0276b37ef90ca928ecd095b20ca4052b8b70faae1077afb12a1de/naimai-1.0.0.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-12-08 11:36:16",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "yassinekdi",
    "github_project": "naimai",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "naimai"
}
