What is it?
===========
Companion library for the machine learning book [Feature Engineering & Selection for Explainable Models: A Second Course for Data Scientists](https://statguyuser.github.io/feature-engg-selection-for-explainable-models.github.io/index.html).
The SNgramExtractor module helps extract syntactic relations (SR tags) as elements of SN-grams.
SN-grams are obtained by following the paths marked by the arrows in the dependency tree [1].
The advantage of syntactic n-grams (SN-grams), i.e., n-grams constructed from paths in syntactic trees, is that they are less arbitrary than traditional n-grams, so there are fewer of them. They can also be interpreted as linguistic phenomena, whereas traditional n-grams have no plausible linguistic interpretation and are merely statistical artifacts [1].
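To make the idea of "following the arrows" concrete, here is a minimal pure-Python sketch that enumerates head-to-dependent paths in a small dependency tree. The arcs are written by hand for illustration and are an assumption, not the output of spaCy or of SNgramExtractor; the names `ARCS`, `build_children` and `sn_grams` are hypothetical, and the sketch assumes every word occurs only once in the sentence. It does not reproduce the library's output format.

```python
# Hand-written (head, dependent) arcs for the sentence
# "Economic news have little effect on financial markets."
# Assumption: an illustrative parse, not produced by spaCy or SNgramExtractor.
ARCS = [
    ("have", "news"),
    ("news", "Economic"),
    ("have", "effect"),
    ("effect", "little"),
    ("effect", "on"),
    ("on", "markets"),
    ("markets", "financial"),
]


def build_children(arcs):
    """Map each head word to the list of its direct dependents."""
    children = {}
    for head, dependent in arcs:
        children.setdefault(head, []).append(dependent)
    return children


def sn_grams(arcs, n):
    """Return every head-to-dependent path of exactly n words."""
    children = build_children(arcs)

    def extend(path):
        # A path of n words is complete; otherwise follow each outgoing arrow.
        if len(path) == n:
            return [tuple(path)]
        paths = []
        for child in children.get(path[-1], []):
            paths.extend(extend(path + [child]))
        return paths

    # Start a path at every word, in order of first appearance in the arcs.
    words = list(dict.fromkeys(word for arc in arcs for word in arc))
    result = []
    for word in words:
        result.extend(extend([word]))
    return result


bigrams = sn_grams(ARCS, 2)
trigrams = sn_grams(ARCS, 3)
print("SN-bigrams:", bigrams)
print("SN-trigrams:", trigrams)
```

In this sketch `('have', 'effect', 'on')` is a trigram even though the three words are not adjacent in the sentence, while the adjacent sequence "little effect on" is not, because it does not follow the direction of the arrows.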
SN-grams are useful across many natural language processing applications, such as classification tasks in machine learning [2], information extraction [3], query understanding [4], machine translation [5], and question answering systems [6].
Input parameters
================
- **text** Input text, given as a single sentence.
- **meta_tag** What the resulting bigrams and trigrams should be concatenated with: part-of-speech tags ('pos'), dependency tags ('dep'), or the original SN-gram ('original').
- **trigram_flag** Whether to include trigrams derived from SN-grams ('yes') or not ('no'). Default is 'yes'.
- **nlp_model** The spaCy language model to use. Default is the spaCy English language model en_core_web_sm. Passing a different model makes it possible to process languages other than English.
Output
================
A dictionary with key-value pairs for the bigrams and trigrams derived from SN-grams.
- **SNBigram** Dictionary key for the bigrams derived from SN-grams
- **SNTrigram** Dictionary key for the trigrams derived from SN-grams
How to use it?
=================
```python
import spacy  # needed below to load the French model
from SNgramExtractor import SNgramExtractor

# English example; nlp_model=None uses the default en_core_web_sm model
text='Economic news have little effect on financial markets.'
SNgram_obj=SNgramExtractor(text,meta_tag='original',trigram_flag='yes',nlp_model=None)
output=SNgram_obj.get_SNgram()
print(text)
print('SNGram bigram:',output['SNBigram'])
print('SNGram trigram:',output['SNTrigram'])
print('-----------------------------------')

# Second English example
text='every cloud has a silver lining'
SNgram_obj=SNgramExtractor(text,meta_tag='original',trigram_flag='yes',nlp_model=None)
output=SNgram_obj.get_SNgram()
print(text)
print('SNGram bigram:',output['SNBigram'])
print('SNGram trigram:',output['SNTrigram'])
print('-----------------------------------')

# French example; the fr_core_news_sm model must be installed separately
nlp_french = spacy.load('fr_core_news_sm')
text='Je voudrais réserver un hôtel à Rennes.'
SNgram_obj=SNgramExtractor(text,meta_tag='original',trigram_flag='yes',nlp_model=nlp_french)
output=SNgram_obj.get_SNgram()
print(text)
print('SNGram bigram:',output['SNBigram'])
print('SNGram trigram:',output['SNTrigram'])
```
Where to get it?
================
`pip install SNgramExtractor`
How to cite?
================
Md Azimul Haque (2022). Feature Engineering & Selection for Explainable Models: A Second Course for Data Scientists. Lulu Press, Inc.
Dependencies
============
- [spacy](https://spacy.io/)
- [spacy model en_core_web_sm](https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz)
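
Before running the examples it can help to confirm that both dependencies are importable. The following is a small sketch using only the standard library; `missing_dependencies` is a hypothetical helper, and it relies on the assumption that the spaCy model was installed as a pip package, in which case it is importable under its own name.

```python
import importlib.util


def missing_dependencies(modules=("spacy", "en_core_web_sm")):
    """Return the names from `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_dependencies()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found.")
```

If `en_core_web_sm` is reported as missing, it can be installed from the archive linked above or with spaCy's own `python -m spacy download en_core_web_sm` command.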
References
============
1. [Syntactic Dependency-Based N-grams as Classification Features](http://www.icsd.aegean.gr/lecturers/stamatatos/papers/MICAI2012.pdf) by Grigori Sidorov, Francisco Velasquez, Efstathios Stamatatos, Alexander Gelbukh and Liliana Chanona-Hernández
2. [Syntactic N-grams as Machine Learning Features for Natural Language Processing](http://www.cic.ipn.mx/~sidorov/Synt_n_grams_ESWA_FINAL.pdf) by Grigori Sidorov, Francisco Velasquez, Efstathios Stamatatos, Alexander Gelbukh and Liliana Chanona-Hernández
3. [Dependency-Based Open Information Extraction](http://www.anthology.aclweb.org/W/W12/W12-0702.pdf) by Pablo Gamallo, Marcos Garcia and Santiago Fernandez-Lanza
4. [Query Understanding Enhanced By Hierarchical Parsing Structures](https://groups.csail.mit.edu/sls/publications/2013/Liu_ASRU_2013.pdf) by Jingjing Liu, Panupong Pasupat, Yining Wang, Scott Cyphers, and Jim Glass
5. [Dependency Structure Trees in Syntax Based Machine Translation](http://www.cs.cmu.edu/~vamshi/publications/DependencyMT_report.pdf) by Vamshi Ambati
6. [Question Answering Passage Retrieval Using Dependency Relations](https://www.comp.nus.edu.sg/~kanmy/papers/f66-cui.pdf) by Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan and Tat-Seng Chua
Raw data
{
"_id": null,
"home_page": "https://github.com/StatguyUser/SNgramExtractor",
"name": "SNgramExtractor",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "StatguyUser",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/49/7a/691e56ff4af9aa2f94b2f2d71bb7d26a876481224903f1b69cddc79cd5ff/SNgramExtractor-0.0.6.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Implementation of syntactic n-grams (sn-gram) extraction",
"version": "0.0.6",
"project_urls": {
"Download": "https://github.com/StatguyUser/SNgramExtractor.git",
"Homepage": "https://github.com/StatguyUser/SNgramExtractor"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "7291aab4fbb497b8f405898b7f6cfe2dae5b2af9bb20a3a918b622ef80b89b86",
"md5": "81c587a4d51bbdcdf0fe472a0e4c8d87",
"sha256": "b32b8546cc554312793a18fc20c552634c2843117bc8b5fe83a7277bc63487c5"
},
"downloads": -1,
"filename": "SNgramExtractor-0.0.6-py3-none-any.whl",
"has_sig": false,
"md5_digest": "81c587a4d51bbdcdf0fe472a0e4c8d87",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 5191,
"upload_time": "2023-08-06T14:01:58",
"upload_time_iso_8601": "2023-08-06T14:01:58.811532Z",
"url": "https://files.pythonhosted.org/packages/72/91/aab4fbb497b8f405898b7f6cfe2dae5b2af9bb20a3a918b622ef80b89b86/SNgramExtractor-0.0.6-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "497a691e56ff4af9aa2f94b2f2d71bb7d26a876481224903f1b69cddc79cd5ff",
"md5": "a8fbbf960cb772314a7bb0ef80bd1366",
"sha256": "71a9da1043fcd81d304414c5082b5043662621dec896e3fd9a5f86ed69ed419e"
},
"downloads": -1,
"filename": "SNgramExtractor-0.0.6.tar.gz",
"has_sig": false,
"md5_digest": "a8fbbf960cb772314a7bb0ef80bd1366",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 4993,
"upload_time": "2023-08-06T14:02:00",
"upload_time_iso_8601": "2023-08-06T14:02:00.571152Z",
"url": "https://files.pythonhosted.org/packages/49/7a/691e56ff4af9aa2f94b2f2d71bb7d26a876481224903f1b69cddc79cd5ff/SNgramExtractor-0.0.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-08-06 14:02:00",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "StatguyUser",
"github_project": "SNgramExtractor",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "sngramextractor"
}