articlesumm

Name	articlesumm
Version	0.0.2
home_page
Summary	Custom scientific/research article summarization library based on Statistical features
upload_time	2023-06-15 01:16:34
maintainer
docs_url	None
author	Maxwell Tetteh
requires_python
license	MIT
keywords	text summarization scientific articles
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

Summarizing research articles is harder than general-purpose summarization because scientific articles have a distinct semantic structure. Two underlying factors are the presence of inline citations and the bias of summarization modules toward text features that often work well on less technical text but fail to produce coherent summaries in this area.

This library works around these challenges to produce more coherent, human-understandable summaries of manuscripts and research text.


## Installation
You can easily install the package using the ```pip``` command:

 ```
   pip install articlesumm
 ```

## Usage
The package takes a string as input (specify a path/directory for an article, or alternatively pass a string as a variable). Sentence and word tokenization can be performed with the first function, `purge`:

```
  from ArticleSumm import purge

  # purge tokenizes the text into words and sentences
  parse = purge(text)
  type(parse)

  # tuple
```

Alternatively, you can tokenize the sentences and words with any other technique and pass the processed text to the summarization model.
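
As a rough illustration of that alternative, the sketch below tokenizes text with Python's standard `re` module. The helper name `simple_tokenize` and the regular expressions are assumptions made for this example and are not part of articlesumm. The exact word and sentence formats that `summarizer` expects are not documented here, so treat the output shapes (two lists of strings) as a guess.

```python
import re

def simple_tokenize(text):
    """Hypothetical helper (not part of articlesumm): split text into words and sentences."""
    # Collapse runs of whitespace so line breaks do not split sentences.
    normalized = re.sub(r"\s+", " ", text).strip()
    # Split after ., ! or ? when followed by whitespace and an uppercase letter.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+(?=[A-Z])", normalized) if s]
    # Lowercased alphanumeric word tokens, keeping internal hyphens.
    words = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", normalized.lower())
    return words, sentences

words, sentences = simple_tokenize("Graphs rank sentences. Edges count shared words!")
print(sentences)
print(len(words))
```

If the library accepts tokens in this form, the result could be passed along the lines of `summarizer(text, words=words, sentence_list=sentences, summary_length=3)`, mirroring the keyword arguments used in the Example section.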

## Example
```
  text='''TextRank is a graph-based ranking model for text processing which can be used in order to find the most relevant sentences in text and also to find keywords. The algorithm is explained in detail in the paper at https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf . In order to find the most relevant sentences in text, a graph is constructed where the vertices of the graph represent each sentence in a document and the edges between sentences are based on content overlap, namely by calculating the number of words that 2 sentences have in common.'''

  from ArticleSumm import purge
  from ArticleSumm import summarizer

  # Tokenize once: parse[0] holds the words, parse[1] the sentences
  parse = purge(text)

  # Positional form:
  # summary = summarizer(text, parse[0], parse[1], summary_length=3)
  summary = summarizer(text, words=parse[0], sentence_list=parse[1], summary_length=3)
  print(summary)
```
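
The package summary describes the approach as based on statistical features, but the features themselves are not listed here. As a generic illustration only, and not the library's actual algorithm, the sketch below scores each sentence by the average frequency of its words and keeps the top-scoring sentences in their original order. The function name `frequency_summary` and the tiny stop-word set are assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word set (an assumption, not taken from articlesumm).
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "for"}

def frequency_summary(sentences, summary_length=1):
    """Hypothetical helper: pick the sentences with the highest mean word frequency."""
    tokenized = [
        [w for w in re.findall(r"[a-z0-9]+", s.lower()) if w not in STOP_WORDS]
        for s in sentences
    ]
    # Count how often each remaining word occurs across the whole document.
    freq = Counter(w for words in tokenized for w in words)
    # Mean frequency of a sentence's words; sentences with no words score 0.
    scores = [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for words in tokenized
    ]
    # Rank by score (ties keep the earlier sentence), then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    chosen = sorted(ranked[:summary_length])
    return [sentences[i] for i in chosen]

sentences = [
    "Graphs model sentences as vertices.",
    "Edges between sentences count shared words.",
    "The weather was pleasant.",
]
print(frequency_summary(sentences, summary_length=2))
```

Averaging rather than summing the word frequencies keeps long sentences from winning on length alone, which is one common design choice for this kind of scoring.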

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "articlesumm",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "text summarization,scientific articles",
    "author": "Maxwell Tetteh",
    "author_email": "tettehmaxwell11@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/fb/ba/5e44a8a76ac366619498ae789153e4d946c508b7977da0db56dcf2fa57d4/articlesumm-0.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Custom scientific/research article summarization library based on Statistical features",
    "version": "0.0.2",
    "project_urls": null,
    "split_keywords": [
        "text summarization",
        "scientific articles"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "866a217efc29ae3b15bccd559515a37cfc96e297640bea6d0db3100dd73af7c3",
                "md5": "3f3411190d2aecbbaa1ad5295cfa7f99",
                "sha256": "c34fe03a9c746db3e3b682d873a8e58b3765d601dec4727227a4d0b857e57a02"
            },
            "downloads": -1,
            "filename": "articlesumm-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3f3411190d2aecbbaa1ad5295cfa7f99",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 4058,
            "upload_time": "2023-06-15T01:16:31",
            "upload_time_iso_8601": "2023-06-15T01:16:31.163637Z",
            "url": "https://files.pythonhosted.org/packages/86/6a/217efc29ae3b15bccd559515a37cfc96e297640bea6d0db3100dd73af7c3/articlesumm-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fbba5e44a8a76ac366619498ae789153e4d946c508b7977da0db56dcf2fa57d4",
                "md5": "678780b06f27fd831cc969539a87365b",
                "sha256": "d2dd86026d1c348f1a72a64a9d367db99f554ba296f6c79b471cfccea12c2e26"
            },
            "downloads": -1,
            "filename": "articlesumm-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "678780b06f27fd831cc969539a87365b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 4266,
            "upload_time": "2023-06-15T01:16:34",
            "upload_time_iso_8601": "2023-06-15T01:16:34.366588Z",
            "url": "https://files.pythonhosted.org/packages/fb/ba/5e44a8a76ac366619498ae789153e4d946c508b7977da0db56dcf2fa57d4/articlesumm-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-15 01:16:34",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "articlesumm"
}
