<div align="center"><img src="https://file.hankcs.com/img/hanlp-github-banner.png" height="100px"/></div>
<h2 align="center">HanLP: Han Language Processing</h2>
<div align="center">
<a href="https://github.com/hankcs/HanLP/actions/workflows/unit-tests.yml">
<img alt="Unit Tests" src="https://github.com/hankcs/hanlp/actions/workflows/unit-tests.yml/badge.svg?branch=master">
</a>
<a href="https://pypi.org/project/hanlp/">
<img alt="PyPI Version" src="https://img.shields.io/pypi/v/hanlp?color=blue">
</a>
<a href="https://pypi.org/project/hanlp/">
<img alt="Python Versions" src="https://img.shields.io/pypi/pyversions/hanlp?colorB=blue">
</a>
<a href="https://pepy.tech/project/hanlp">
<img alt="Downloads" src="https://static.pepy.tech/badge/hanlp">
</a>
<a href="https://colab.research.google.com/drive/1KPX6t1y36TOzRIeB4Kt3uJ1twuj6WuFv?usp=sharing">
<img alt="Open In Colab" src="https://file.hankcs.com/img/colab-badge.svg">
</a>
</div>
<h4 align="center">
<a href="https://github.com/hankcs/HanLP/tree/doc-zh">中文</a> |
<a href="https://github.com/hankcs/HanLP/tree/doc-ja">日本語</a> |
<a href="https://hanlp.hankcs.com/docs/">Docs</a> |
<a href="https://bbs.hankcs.com/">Forum</a>
</h4>
HanLP is a multilingual NLP library for researchers and enterprises, built on PyTorch and TensorFlow 2.x to advance state-of-the-art deep learning techniques in academia and industry. HanLP was designed from day one to be
efficient, user-friendly and extensible.
Thanks to open-access corpora like Universal Dependencies and OntoNotes, HanLP 2.1 now offers 10 joint tasks in [130
languages](https://hanlp.hankcs.com/docs/api/hanlp/pretrained/mtl.html#hanlp.pretrained.mtl.UD_ONTONOTES_TOK_POS_LEM_FEA_NER_SRL_DEP_SDP_CON_MMINILMV2L6): tokenization, lemmatization, part-of-speech tagging, token feature extraction, named entity recognition, dependency parsing,
constituency parsing, semantic role labeling, semantic dependency parsing, and abstract meaning representation (AMR)
parsing.
For end users, HanLP offers lightweight RESTful APIs and native Python APIs.
## RESTful APIs
The client packages are only a few KB in size, which suits agile development and mobile applications. Anonymous access is
allowed, but an auth key is recommended;
[you can apply for a free one here](https://bbs.hankcs.com/t/apply-for-free-hanlp-restful-apis/3178) under
the [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.
<details>
<summary>Click to expand tutorials for RESTful APIs</summary>
### Python
```bash
pip install hanlp_restful
```
Create a client with our API endpoint and your auth key.
```python
from hanlp_restful import HanLPClient
HanLP = HanLPClient('https://hanlp.hankcs.com/api', auth=None, language='mul') # Support en, ja, zh, mul
```
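Rather than hard-coding the key, you may prefer to read it from the environment. The following is a minimal sketch, not part of `hanlp_restful`: the `HANLP_AUTH` variable name and both helper functions are hypothetical, while the endpoint and the `auth` and `language` keywords are the ones used in the snippet above.

```python
import os

API_URL = 'https://hanlp.hankcs.com/api'  # endpoint from the snippet above


def client_args(language='mul', env=None):
    """Build the arguments for HanLPClient, reading the auth key from the environment.

    HANLP_AUTH is a hypothetical variable name chosen for this sketch.
    An unset or empty variable yields auth=None, i.e. anonymous access.
    """
    env = os.environ if env is None else env
    auth = env.get('HANLP_AUTH') or None
    return (API_URL,), {'auth': auth, 'language': language}


def make_client(language='mul'):
    # Imported lazily so client_args() works without the package installed.
    from hanlp_restful import HanLPClient
    args, kwargs = client_args(language)
    return HanLPClient(*args, **kwargs)
```

Calling `make_client('zh')` would then give a client equivalent to the one above, with the key taken from the environment when it is set.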
### Java
Insert the following dependency into your `pom.xml`.
```xml
<dependency>
<groupId>com.hankcs.hanlp.restful</groupId>
<artifactId>hanlp-restful</artifactId>
<version>0.0.15</version>
</dependency>
```
Create a client with our API endpoint and your auth key.
```java
HanLPClient HanLP = new HanLPClient("https://hanlp.hankcs.com/api", null, "mul"); // Support en, ja, zh, mul
```
### Quick Start
No matter which language you use, the same interface can be used to parse a document.
```python
HanLP.parse(
"In 2021, HanLPv2.1 delivers state-of-the-art multilingual NLP techniques to production environments. 2021年、HanLPv2.1は次世代の最先端多言語NLP技術を本番環境に導入します。2021年 HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。")
```
See [docs](https://hanlp.hankcs.com/docs/tutorial.html) for visualization, annotation guidelines and more details.
</details>
## Native APIs
```bash
pip install hanlp
```
HanLP requires Python 3.6 or higher. While GPU or TPU acceleration is recommended, it is not mandatory.
### Quick Start
```python
import hanlp
HanLP = hanlp.load(hanlp.pretrained.mtl.UD_ONTONOTES_TOK_POS_LEM_FEA_NER_SRL_DEP_SDP_CON_XLMR_BASE)
print(HanLP(['In 2021, HanLPv2.1 delivers state-of-the-art multilingual NLP techniques to production environments.',
'2021年、HanLPv2.1は次世代の最先端多言語NLP技術を本番環境に導入します。',
'2021年 HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。']))
```
- In particular, the Python `HanLPClient` can also be used as a callable function following the same semantics.
See [docs](https://hanlp.hankcs.com/docs/tutorial.html) for visualization, annotation guidelines and more details.
- To process English, Chinese or Japanese, HanLP provides monolingual models for each language, which significantly outperform the
multilingual model. See [docs](https://hanlp.hankcs.com/docs/api/hanlp/pretrained/index.html) for the list of models.
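
The sketch below shows one way to walk the result, assuming it is dictionary-like, keyed by task name, with one list per input sentence under each key. The `doc` literal is a hand-written stand-in with made-up tags, not real model output, and the `tok` and `pos` keys are assumed from the task abbreviations in the model name; `sentences` is a helper written for this example.

```python
# Hand-written stand-in for a two-sentence result; the tags are illustrative only.
doc = {
    'tok': [['HanLP', 'is', 'fast', '.'], ['It', 'works', '.']],
    'pos': [['PROPN', 'AUX', 'ADJ', 'PUNCT'], ['PRON', 'VERB', 'PUNCT']],
}


def sentences(doc, *tasks):
    """Yield, per sentence, a list of tuples aligning the requested task outputs."""
    for columns in zip(*(doc[task] for task in tasks)):
        yield list(zip(*columns))


for sent in sentences(doc, 'tok', 'pos'):
    print(' '.join(f'{tok}/{pos}' for tok, pos in sent))
```

Aligning outputs by position like this only makes sense for token-level tasks; span-level or tree-shaped outputs need their own handling.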
## Train Your Own Models
Writing DL models is not hard; the hard part is writing a model that reproduces the scores reported in papers. The
snippet below shows how to surpass the state-of-the-art tokenizer in 6 minutes.
```python
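# Import paths as laid out in HanLP 2.1; check them against your installed version.
from hanlp.common.dataset import SortingSamplerBuilder
from hanlp.components.tokenizers.transformer import TransformerTaggingTokenizer
from hanlp.datasets.tokenization.sighan2005.pku import SIGHAN2005_PKU_TRAIN_ALL, SIGHAN2005_PKU_TEST

tokenizer = TransformerTaggingTokenizer()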
save_dir = 'data/model/cws/sighan2005_pku_bert_base_96.7'
tokenizer.fit(
SIGHAN2005_PKU_TRAIN_ALL,
SIGHAN2005_PKU_TEST, # Conventionally, no devset is used. See Tian et al. (2020).
save_dir,
'bert-base-chinese',
max_seq_len=300,
char_level=True,
hard_constraint=True,
sampler_builder=SortingSamplerBuilder(batch_size=32),
epochs=3,
adam_epsilon=1e-6,
warmup_steps=0.1,
weight_decay=0.01,
word_dropout=0.1,
seed=1660853059,
)
tokenizer.evaluate(SIGHAN2005_PKU_TEST, save_dir)
```
The result is guaranteed to be `96.73` because the random seed is fixed. Unlike some overclaiming papers and
projects, HanLP promises that every single digit in its scores is reproducible. Any reproducibility issue will be treated
and fixed as a top-priority fatal bug.
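
For reference, tokenizer scores of this kind are conventionally word-level F1: a predicted word counts as correct only if both of its boundaries match the gold segmentation. The sketch below implements that metric under the assumption that this is what `evaluate` reports; the function names and the toy sentences are illustrative and not part of HanLP.

```python
def spans(words):
    """Convert a list of words into a set of (start, end) character offsets."""
    out, start = set(), 0
    for word in words:
        out.add((start, start + len(word)))
        start += len(word)
    return out


def f1(gold_sents, pred_sents):
    """Word-level precision, recall and F1 over parallel lists of segmented sentences."""
    correct = n_gold = n_pred = 0
    for gold, pred in zip(gold_sents, pred_sents):
        g, p = spans(gold), spans(pred)
        correct += len(g & p)  # words whose two boundaries both match
        n_gold += len(g)
        n_pred += len(p)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    score = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, score


# Toy data: the second prediction splits one gold word in two.
gold = [['生产', '环境'], ['多语种', 'NLP', '技术']]
pred = [['生产', '环境'], ['多', '语种', 'NLP', '技术']]
p, r, f = f1(gold, pred)
print(f'P={p:.2%} R={r:.2%} F1={f:.2%}')
```

Comparing character spans rather than word strings keeps repeated words in a sentence from being matched against each other.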
## Performance
The performance of multi-task learning models is shown in the following table.
<table><thead><tr><th rowspan="2">lang</th><th rowspan="2">corpora</th><th rowspan="2">model</th><th colspan="2">tok</th><th colspan="4">pos</th><th colspan="3">ner</th><th rowspan="2">dep</th><th rowspan="2">con</th><th rowspan="2">srl</th><th colspan="4">sdp</th><th rowspan="2">lem</th><th rowspan="2">fea</th><th rowspan="2">amr</th></tr><tr><th>fine</th><th>coarse</th><th>ctb</th><th>pku</th><th>863</th><th>ud</th><th>pku</th><th>msra</th><th>ontonotes</th><th>SemEval16</th><th>DM</th><th>PAS</th><th>PSD</th></tr></thead><tbody><tr><td rowspan="2">mul</td><td rowspan="2">UD2.7<br>OntoNotes5</td><td>small</td><td>98.62</td><td>-</td><td>-</td><td>-</td><td>-</td><td>93.23</td><td>-</td><td>-</td><td>74.42</td><td>79.10</td><td>76.85</td><td>70.63</td><td>-</td><td>91.19</td><td>93.67</td><td>85.34</td><td>87.71</td><td>84.51</td><td>-</td></tr><tr><td>base</td><td>98.97</td><td>-</td><td>-</td><td>-</td><td>-</td><td>90.32</td><td>-</td><td>-</td><td>80.32</td><td>78.74</td><td>71.23</td><td>73.63</td><td>-</td><td>92.60</td><td>96.04</td><td>81.19</td><td>85.08</td><td>82.13</td><td>-</td></tr><tr><td rowspan="5">zh</td><td rowspan="2">open</td><td>small</td><td>97.25</td><td>-</td><td>96.66</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td>95.00</td><td>84.57</td><td>87.62</td><td>73.40</td><td>84.57</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td></tr><tr><td>base</td><td>97.50</td><td>-</td><td>97.07</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td>96.04</td><td>87.11</td><td>89.84</td><td>77.78</td><td>87.11</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td></tr><tr><td 
rowspan="3">close</td><td>small</td><td>96.70</td><td>95.93</td><td>96.87</td><td>97.56</td><td>95.05</td><td>-</td><td>96.22</td><td>95.74</td><td>76.79</td><td>84.44</td><td>88.13</td><td>75.81</td><td>74.28</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td></tr><tr><td>base</td><td>97.52</td><td>96.44</td><td>96.99</td><td>97.59</td><td>95.29</td><td>-</td><td>96.48</td><td>95.72</td><td>77.77</td><td>85.29</td><td>88.57</td><td>76.52</td><td>73.76</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td></tr><tr><td>ernie</td><td>96.95</td><td>97.29</td><td>96.76</td><td>97.64</td><td>95.22</td><td>-</td><td>97.31</td><td>96.47</td><td>77.95</td><td>85.67</td><td>89.17</td><td>78.51</td><td>74.10</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td><td>-</td></tr></tbody></table>
- Multi-task learning models often underperform their single-task counterparts according to our latest
research. Similarly, monolingual models often outperform multilingual models. Therefore, we strongly recommend the
use of [a single-task monolingual model](https://hanlp.hankcs.com/docs/api/hanlp/pretrained/index.html) if you are
targeting high accuracy rather than speed.
- A state-of-the-art [AMR model](https://hanlp.hankcs.com/docs/api/hanlp/pretrained/amr.html) has been released.
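
The table also shows that joint scores do not improve uniformly with encoder size. As a quick illustration, the sketch below copies the two `mul` rows from the table and prints the per-task change from `small` to `base`; the shortened column labels are hand-written for this example.

```python
# Scores copied from the `mul` rows of the table above (columns without a score omitted).
tasks = ['tok', 'pos', 'ner', 'dep', 'con', 'srl', 'sdp/DM', 'sdp/PAS', 'sdp/PSD', 'lem', 'fea']
small = [98.62, 93.23, 74.42, 79.10, 76.85, 70.63, 91.19, 93.67, 85.34, 87.71, 84.51]
base = [98.97, 90.32, 80.32, 78.74, 71.23, 73.63, 92.60, 96.04, 81.19, 85.08, 82.13]

# Positive values mean `base` scores higher than `small` on that task.
delta = {t: round(b - s, 2) for t, s, b in zip(tasks, small, base)}
for task, d in delta.items():
    print(f'{task:8s} {d:+.2f}')
```

Six of the eleven scores drop when moving from `small` to `base`, so it is worth checking the column for your own task rather than picking a model by size alone.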
## Citing
If you use HanLP in your research, please cite [our EMNLP paper](https://aclanthology.org/2021.emnlp-main.451):
```bibtex
@inproceedings{he-choi-2021-stem,
title = "The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders",
author = "He, Han and Choi, Jinho D.",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.451",
pages = "5555--5577",
abstract = "Multi-task learning with transformer encoders (MTL) has emerged as a powerful technique to improve performance on closely-related tasks for both accuracy and efficiency while a question still remains whether or not it would perform as well on tasks that are distinct in nature. We first present MTL results on five NLP tasks, POS, NER, DEP, CON, and SRL, and depict its deficiency over single-task learning. We then conduct an extensive pruning analysis to show that a certain set of attention heads get claimed by most tasks during MTL, who interfere with one another to fine-tune those heads for their own objectives. Based on this finding, we propose the Stem Cell Hypothesis to reveal the existence of attention heads naturally talented for many tasks that cannot be jointly trained to create adequate embeddings for all of those tasks. Finally, we design novel parameter-free probes to justify our hypothesis and demonstrate how attention heads are transformed across the five tasks during MTL through label analysis.",
}
```
## License
### Code
HanLP is licensed under **Apache License 2.0**. You can use HanLP in your commercial products for free. We would
appreciate it if you add a link to HanLP on your website.
### Models
Unless otherwise specified, all models in HanLP are licensed
under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/).
## References
https://hanlp.hankcs.com/docs/references.html
Raw data
{
"_id": null,
"home_page": "https://github.com/hankcs/HanLP",
"name": "hanlp",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "corpus, machine-learning, NLU, NLP",
"author": "hankcs",
"author_email": "hankcshe@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/dd/23/fabca6030d1060c793fbd0fb3098862255f613e4b3e7b96875e97c354f84/hanlp-2.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache License 2.0",
"summary": "HanLP: Han Language Processing",
"version": "2.1.0",
"project_urls": {
"Homepage": "https://github.com/hankcs/HanLP"
},
"split_keywords": [
"corpus",
" machine-learning",
" nlu",
" nlp"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c7ac28a4a3eb477a0473b2577167cc0aef5e38a44a9e10fa6a79b372c2ad17fe",
"md5": "d51754d7da32d1769fc4948986cc4222",
"sha256": "fc37aa7809affb97af973ebcdb5c5d3dbdae18db2eb831d4dd9239111498381c"
},
"downloads": -1,
"filename": "hanlp-2.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d51754d7da32d1769fc4948986cc4222",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 653047,
"upload_time": "2024-12-29T00:42:20",
"upload_time_iso_8601": "2024-12-29T00:42:20.997914Z",
"url": "https://files.pythonhosted.org/packages/c7/ac/28a4a3eb477a0473b2577167cc0aef5e38a44a9e10fa6a79b372c2ad17fe/hanlp-2.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "dd23fabca6030d1060c793fbd0fb3098862255f613e4b3e7b96875e97c354f84",
"md5": "104daf7f2616160e31519d87791d90fc",
"sha256": "f8a37a29f2deb1df9335b823ac06661475054dc2dcb3a1314ad201c9359bc96c"
},
"downloads": -1,
"filename": "hanlp-2.1.0.tar.gz",
"has_sig": false,
"md5_digest": "104daf7f2616160e31519d87791d90fc",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 500557,
"upload_time": "2024-12-29T00:42:24",
"upload_time_iso_8601": "2024-12-29T00:42:24.174829Z",
"url": "https://files.pythonhosted.org/packages/dd/23/fabca6030d1060c793fbd0fb3098862255f613e4b3e7b96875e97c354f84/hanlp-2.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-29 00:42:24",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "hankcs",
"github_project": "HanLP",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "hanlp"
}