NUGigSkillNER


| Field | Value |
| --- | --- |
| Name | NUGigSkillNER |
| Version | 2.0.4 |
| home_page | https://github.com/emandel2630/NUGigSkillNER |
| Summary | An NLP module to automatically Extract skills and certifications from unstructured job postings, texts, and applicant's resumes |
| upload_time | 2024-04-05 01:59:35 |
| maintainer | None |
| docs_url | None |
| author | Ethan Mandel |
| requires_python | None |
| license | None |
| keywords | skillner, python, nlp, ner, skills-extraction, job-description |
| VCS | |
| bugtrack_url | |
| requirements | pandas, nltk, spacy, jellyfish, sphinx, furo, sphinx-copybutton, twine, scipy |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

<p align="center"><img width="50%" src="https://user-images.githubusercontent.com/56308112/128958594-79813e72-b688-4a9a-9267-324f098d4b0c.png" /></p>

[**Live demo**](https://share.streamlit.io/anasaito/skillner_demo/index.py) | [**Documentation**](https://badr-moufad.github.io/SkillNER/get_started.html) | [**Website**](https://skillner.vercel.app/)

----------------------


[![Downloads](https://static.pepy.tech/personalized-badge/skillner?period=month&units=international_system&left_color=blue&right_color=green&left_text=Downloads%20/%20months)](https://pepy.tech/project/skillner)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**Just looking to test out SkillNer? Check out our [demo](https://anasaito-skillner-demo-index-4fiwi3.streamlit.app/)**.

SkillNer is an NLP module that automatically extracts skills and certifications from unstructured job postings, texts, and applicants' resumes.

SkillNer uses the [EMSI](https://skills.emsidata.com/) database (an open source skill database) as a knowledge base for linking, which prevents duplicate skills.
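
To see why linking mentions to a knowledge base avoids duplicates, consider the toy sketch below. It is not SkillNer's implementation: the `SURFACE_TO_ID` table and the `link_skills` helper are hypothetical and exist only for illustration. The two skill IDs are borrowed from the example output in the usage section of this README, and the assumption that "python programming" shares an ID with "python" is ours.

```python
# Hypothetical lookup table: several surface forms point to one skill ID.
SURFACE_TO_ID = {
    "python": "KS125LS6N7WP4S6SFTCK",
    "python programming": "KS125LS6N7WP4S6SFTCK",  # assumed alias, for illustration
    "web development": "KS122Z36QK3N5097B5JH",
}


def link_skills(mentions):
    """Map raw mentions to skill IDs, keeping each ID once in first-seen order."""
    linked = []
    for mention in mentions:
        skill_id = SURFACE_TO_ID.get(mention.lower().strip())
        if skill_id is not None and skill_id not in linked:
            linked.append(skill_id)
    return linked


# Two spellings of the same skill collapse into a single ID.
print(link_skills(["Python", "python programming", "Web Development"]))
```

Because every mention is reduced to an ID before it is stored, different spellings of the same skill cannot appear twice in the result.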



<p align="center"><img width="50%" src="https://user-images.githubusercontent.com/56308112/138768792-a25d25e7-1e43-4a44-aa46-8de9895ffe88.png" /></p>


## Installation

It is easy to get started with **SkillNer** and take advantage of its features.

1. First, install **SkillNer** with ``pip``:

```bash
pip install skillNer
```

2. Next, run the following command to install spaCy's ``en_core_web_lg`` model,
which is one of the main plugins of SkillNer. Thanks to its modular nature, you can
customize SkillNer's behavior simply by plugging modules in or out. Don't worry about these details for now; they will be covered in an **upcoming Tutorial section**.

```bash
python -m spacy download en_core_web_lg
```

**Note:** This second installation may take a little while to finish because ``en_core_web_lg`` is fairly large (about 800 MB). You only need to wait once.
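
To confirm that the model is available in the current environment before running SkillNer, a quick check like the following can help. It is a minimal sketch that uses only the Python standard library and relies on the fact that spaCy models are installed as regular Python packages. The helper name `model_is_installed` is ours and is not part of SkillNer or spaCy.

```python
import importlib.util


def model_is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be found."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if model_is_installed("en_core_web_lg"):
        print("en_core_web_lg is available")
    else:
        print("en_core_web_lg is missing; run: python -m spacy download en_core_web_lg")
```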


## Example of usage

With these initial steps done, let’s dive a bit deeper into SkillNer through a worked example.

Let’s say you want to extract skills from the following job posting:

    “You are a Python developer with a solid experience in web development and can manage projects. 
    You quickly adapt to new environments and speak fluently English and French”

### Annotating skills

We start by importing the required modules, in particular spacy and SkillExtractor. Note that if you are using SkillNer for the first time, it might take a while to download SKILL_DB.

**SKILL_DB** is SkillNer's default skills database. It was built upon the [EMSI skills database](https://skills.emsidata.com/).



```python
# imports
import spacy
from spacy.matcher import PhraseMatcher

# load default skills data base
from skillNer.general_params import SKILL_DB
# import skill extractor
from skillNer.skill_extractor_class import SkillExtractor

# init params of skill extractor
nlp = spacy.load("en_core_web_lg")
# init skill extractor
skill_extractor = SkillExtractor(nlp, SKILL_DB, PhraseMatcher)

# extract skills from job_description
job_description = """
You are a Python developer with a solid experience in web development
and can manage projects. You quickly adapt to new environments
and speak fluently English and French
"""

annotations = skill_extractor.annotate(job_description)

```



### Exploit annotations

Voilà! Now you can inspect the results by rendering the text with the annotated skills.
You can achieve that through the ``.describe`` method. Note that the output of this method is
literally an HTML document that gets rendered in your notebook.


<p align="center">
    <img src="./screenshots/output-describe.gif" alt="example output skillNer"/>
</p>


You can also work with the raw annotation results directly.
Below is the value of the ``annotations`` variable from the code above.


```python
# output
{
    'text': 'you are a python developer with a solid experience in web development and can manage projects you quickly adapt to new environments and speak fluently english and french',
    'results': {
        'full_matches': [
            {
                'skill_id': 'KS122Z36QK3N5097B5JH', 
                'doc_node_value': 'web development', 
                'score': 1, 'doc_node_id': [10, 11]
            }
        ],
        'ngram_scored': [
            {
                'skill_id': 'KS125LS6N7WP4S6SFTCK', 
                'doc_node_id': [3], 
                'doc_node_value': 'python', 
                'type': 'fullUni', 
                'score': 1, 
                'len': 1
            }, 
        # the other annotated skills
        # ...
        ]
    }
}
```
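
The structure above can be processed with plain Python. The sketch below copies only the two skills shown in the sample output and flattens them into a simple list; the helper names `list_skills` and `matched_tokens` are hypothetical and not part of SkillNer. It also assumes that `doc_node_id` indexes the whitespace-separated tokens of `text`, which holds for this sample but is not confirmed here as a general guarantee.

```python
# Sample copied from the output above (only the two skills it shows).
annotations = {
    "text": (
        "you are a python developer with a solid experience in web development "
        "and can manage projects you quickly adapt to new environments "
        "and speak fluently english and french"
    ),
    "results": {
        "full_matches": [
            {
                "skill_id": "KS122Z36QK3N5097B5JH",
                "doc_node_value": "web development",
                "score": 1,
                "doc_node_id": [10, 11],
            }
        ],
        "ngram_scored": [
            {
                "skill_id": "KS125LS6N7WP4S6SFTCK",
                "doc_node_id": [3],
                "doc_node_value": "python",
                "type": "fullUni",
                "score": 1,
                "len": 1,
            }
        ],
    },
}


def list_skills(annotations, min_score=0.0):
    """Flatten both result lists into (skill_id, matched_text, score) tuples."""
    skills = []
    for match_type in ("full_matches", "ngram_scored"):
        for match in annotations["results"].get(match_type, []):
            if match["score"] >= min_score:
                skills.append(
                    (match["skill_id"], match["doc_node_value"], match["score"])
                )
    return skills


def matched_tokens(annotations, match):
    """Look up the tokens a match points to.

    Assumption: doc_node_id indexes the whitespace-separated tokens of
    annotations["text"]; this is true for the sample data only as far as we know.
    """
    tokens = annotations["text"].split()
    return [tokens[i] for i in match["doc_node_id"]]


for skill_id, value, score in list_skills(annotations):
    print(f"{skill_id}  {value!r}  score={score}")
```

The `min_score` parameter is a convenience of this sketch: raising it drops lower-confidence entries before they reach downstream code.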

## Contribute

SkillNer is the first **Open Source** skill extractor.
As such, it is a tool dedicated to the community, and it relies on community contributions to evolve.

We did our best to make SkillNer usable and fixed many of its bugs, and we believe its key features
make it ready for a variety of use cases. However, it is not yet fully stable. SkillNer needs the help of the community to be adapted further
and to broaden its usage.


You can contribute to SkillNer in the following ways:

1. Reporting issues. If you run into a problem while using SkillNer, do not hesitate to report it in the [issue section of our GitHub repository](https://github.com/AnasAito/SkillNER/issues). You can also open an issue to suggest new features.

2. Pushing code to our repository through pull requests, whether you have fixed an issue or want to extend SkillNer's features.


3. A third (friendly and non-technical) way to contribute to SkillNer will be released soon. *So, stay tuned...*



Finally, make sure to read [our guidelines](https://badr-moufad.github.io/SkillNER/contribute.html) carefully before contributing. They specify the standards to follow so that we can understand what you want to say.


They will also help you set up SkillNer on your local machine if you plan to push code.


## Useful links

- [Visit our website](https://skillner.vercel.app/) to learn about SkillNer's features and how it works, and to explore our roadmap
- Get started with SkillNer and get to know its API by visiting the [Documentation](https://badr-moufad.github.io/SkillNER/get_started.html)
- [Test our Demo](https://share.streamlit.io/anasaito/skillner_demo/index.py) to see some of SkillNer's capabilities

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/emandel2630/NUGigSkillNER",
    "name": "NUGigSkillNER",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "skillNer, python, NLP, NER, skills-extraction, job-description",
    "author": "Ethan Mandel",
    "author_email": "emandel2630@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/8a/7d/f9be4b4fb3fdaf06a9c422cde871a3f232d5fd06a73565497bad9589ea01/NUGigSkillNER-2.0.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "An NLP module to automatically Extract skills and certifications from unstructured job postings, texts, and applicant's resumes",
    "version": "2.0.4",
    "project_urls": {
        "Homepage": "https://github.com/emandel2630/NUGigSkillNER"
    },
    "split_keywords": [
        "skillner",
        " python",
        " nlp",
        " ner",
        " skills-extraction",
        " job-description"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c9bde2c6ad7e43f3e2b851a83dd41173e089e8dcb1f257dcfab9e3826ac1e8a0",
                "md5": "e5baacd37176ed353578f7d0b93dbb22",
                "sha256": "6ff40abe69110a09a8e58e4f345882a0f2d8eca8d2ecc2962086b44ce0b6541f"
            },
            "downloads": -1,
            "filename": "NUGigSkillNER-2.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e5baacd37176ed353578f7d0b93dbb22",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 43399,
            "upload_time": "2024-04-05T01:59:33",
            "upload_time_iso_8601": "2024-04-05T01:59:33.981691Z",
            "url": "https://files.pythonhosted.org/packages/c9/bd/e2c6ad7e43f3e2b851a83dd41173e089e8dcb1f257dcfab9e3826ac1e8a0/NUGigSkillNER-2.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8a7df9be4b4fb3fdaf06a9c422cde871a3f232d5fd06a73565497bad9589ea01",
                "md5": "741c5a872234c0dd4d9cbc88e77543a5",
                "sha256": "778965241a053f4e34c4d40e12dac0fee9ef8ad8d962f66a925e2d926a6af900"
            },
            "downloads": -1,
            "filename": "NUGigSkillNER-2.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "741c5a872234c0dd4d9cbc88e77543a5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 23802,
            "upload_time": "2024-04-05T01:59:35",
            "upload_time_iso_8601": "2024-04-05T01:59:35.669426Z",
            "url": "https://files.pythonhosted.org/packages/8a/7d/f9be4b4fb3fdaf06a9c422cde871a3f232d5fd06a73565497bad9589ea01/NUGigSkillNER-2.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-05 01:59:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "emandel2630",
    "github_project": "NUGigSkillNER",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "pandas",
            "specs": []
        },
        {
            "name": "nltk",
            "specs": []
        },
        {
            "name": "spacy",
            "specs": []
        },
        {
            "name": "jellyfish",
            "specs": []
        },
        {
            "name": "sphinx",
            "specs": []
        },
        {
            "name": "furo",
            "specs": []
        },
        {
            "name": "sphinx-copybutton",
            "specs": []
        },
        {
            "name": "twine",
            "specs": []
        },
        {
            "name": "scipy",
            "specs": []
        }
    ],
    "lcname": "nugigskillner"
}
        