Name | conlludiff |
Version | 0.0.5 |
home_page | None |
Summary | Analyze two CONLLU files |
upload_time | 2024-04-02 08:49:42 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.10 |
license | Apache License, Version 2.0 |
keywords | conlludiff, conllu, conll |
requirements | No requirements were recorded. |
# conllu-diff
A tool for statistically comparing two conllu files. It offers two equivalent use modes:
* as a command line script
* as a Python package
## Installation
```
pip install conlludiff
```
## CLI use
Run as `python -m conlludiff <json>`.
The tool is configured through a JSON configuration file (example in `config_files/ssj_sst_upos.json`) where the user defines:
- `file1` - The conllu file containing the first language sample
- `file2` - The conllu file containing the second language sample
- `event` - The linguistic feature the comparison is to be based on. Available events are form, lemma, upos, xpos, upos+feats, feat (each feature separately), feats (all features of a word merged), deprel, deprel+head_deprel
- `filter` - The p-value threshold of the chi-square test: entries whose p-value is at least this high are filtered out of the results
- `fields` - A list of fields to be retained in the output (list of available values is listed below)
- `order` - The field by which the output is to be ordered
- `reverse` - Whether the ordering should be reversed
- `output` - Where the output is to be produced, either stdout or a filename
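As an illustration, a configuration with the keys listed above could be generated from Python as sketched below. The file paths and field selection are borrowed from the Python API example in this README. Whether `"stdout"` is the exact literal the tool expects for `output` is an assumption, so the bundled `config_files/ssj_sst_upos.json` remains the authoritative example.

```python
import json

# Keys follow the list above; the values are illustrative only.
config = {
    "file1": "conllu_files/sl_ssj-ud-train.conllu",
    "file2": "conllu_files/sl_sst-ud-train.conllu",
    "event": "upos",
    "filter": 0.05,
    "fields": ["event", "cramers_v", "odds_ratio", "odds_ratio_direction"],
    "order": "chisq",
    "reverse": True,
    "output": "stdout",  # assumption: alternatively a file name for the results
}

# Serialize to JSON text; save it under any name, e.g. a hypothetical config.json.
config_text = json.dumps(config, indent=2)
print(config_text)
```

Once saved, the file would be passed to the CLI as the `<json>` argument of `python -m conlludiff <json>`.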
The following fields / values are currently available:
- `chisq` - The chi-square test statistic.
- `chisq_p` - The p-value of the chi-square test. Very useful for discarding results with a p-value above 0.05: such results cannot be trusted (they might have happened by chance) and need not be examined.
- `cramers_v` - The Cramér's V effect size, based on the chi-square statistic and the sample size. Conventionally, values over 0.1 indicate a small effect, over 0.3 a medium effect and over 0.5 a strong effect, but on language phenomena it typically does not reach even a medium effect. It is comparable across datasets of different sizes, so if the tool is run on multiple pairs of documents, these effect sizes can be used for comparison across datasets.
- `odds_ratio` - The odds ratio effect size. Put simply, it reports how many times higher the odds of an event are in one dataset than in the other. The reported value is always at least 1, which is why `odds_ratio_direction` tells you in which dataset the odds of the event are higher.
- `odds_ratio_direction` - The direction of the odds ratio presented previously. If `first`, the odds of the event are greater in the first dataset. If `second`, the odds for this event are higher in the second dataset.
- `log_likelihood_ratio` - The log-likelihood ratio, as defined by Dunning (1993), included mostly because of its popularity in computational linguistics circles.
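To make the relationship between these values concrete, here is a minimal sketch that derives them from a 2x2 contingency table (event vs. all other tokens, first vs. second file) using textbook formulas. It is not taken from the conlludiff source: the function name `compare_event` is hypothetical, no continuity correction or smoothing is applied, and all four cells are assumed to be non-zero, so the numbers the tool reports may differ.

```python
import math


def compare_event(count1, total1, count2, total2):
    """Compare one event between two samples via a 2x2 contingency table."""
    a, b = count1, total1 - count1  # first sample: event / all other tokens
    c, d = count2, total2 - count2  # second sample: event / all other tokens
    n = a + b + c + d

    # Pearson chi-square for a 2x2 table (no continuity correction).
    chisq = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    chisq_p = math.erfc(math.sqrt(chisq / 2))
    # Cramér's V; for a 2x2 table, min(rows, cols) - 1 equals 1.
    cramers_v = math.sqrt(chisq / n)

    # Odds ratio, flipped when needed so that it is never below 1.
    odds_ratio = (a * d) / (b * c)
    direction = "first"
    if odds_ratio < 1:
        odds_ratio, direction = 1 / odds_ratio, "second"

    # Log-likelihood ratio (G2): 2 * sum(observed * ln(observed / expected)).
    g2 = 0.0
    cells = ((a, a + b, a + c), (b, a + b, b + d),
             (c, c + d, a + c), (d, c + d, b + d))
    for observed, row_total, col_total in cells:
        g2 += 2 * observed * math.log(observed * n / (row_total * col_total))

    return {
        "chisq": chisq,
        "chisq_p": chisq_p,
        "cramers_v": cramers_v,
        "odds_ratio": odds_ratio,
        "odds_ratio_direction": direction,
        "log_likelihood_ratio": g2,
    }


# Made-up counts: the event occurs 10 times in 100 tokens vs. 30 times in 100.
result = compare_event(10, 100, 30, 100)
print(result)
```

With these made-up counts the odds of the event are higher in the second sample, so the ratio is inverted and the direction is reported as `second`.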
## Python API
### Use
The differ can be used through its Python API as follows:
```python
from conlludiff import Differ
d = Differ(
    "conllu_files/sl_ssj-ud-train.conllu",
    "conllu_files/sl_sst-ud-train.conllu",
    event="upos",
    filter=0.05,
    fields=[
        "event",
        "cramers_v",
        "odds_ratio",
        "odds_ratio_direction",
        "contingency"
    ],
    order="chisq",
    reverse=True,
)
d.results
#[
# {'event': 'INTJ', 'cramers_v': 0.18546269144487967, 'odds_ratio': 205.64922609298688, 'odds_ratio_direction': 'second'},
# {'event': 'PART', 'cramers_v': 0.09362273156839818, 'odds_ratio': 3.3765821947519883, 'odds_ratio_direction': 'second'},
# {'event': 'PUNCT', 'cramers_v': 0.0817401329794944, 'odds_ratio': 3.619217699912104, 'odds_ratio_direction': 'first'},
# {'event': 'ADV', 'cramers_v': 0.0697921632735567, 'odds_ratio': 2.368271631503067, 'odds_ratio_direction': 'second'},
# {'event': 'NOUN', 'cramers_v': 0.06087356761646711, 'odds_ratio': 1.9182977375177561, 'odds_ratio_direction': 'first'}
# ...]
d.to_tsv("output.tsv")
# Writes the data to a tsv, same way as CLI.
```
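Since `d.results` is shown above as a list of dictionaries, it can be post-processed with plain Python. The sketch below groups events by `odds_ratio_direction`. To stay runnable without the package or the corpus files, it uses a hard-coded `results` list copied from the example output above instead of a real `Differ` instance.

```python
from collections import defaultdict

# Entries copied from the example output above (stand-in for d.results).
results = [
    {"event": "INTJ", "cramers_v": 0.18546269144487967,
     "odds_ratio": 205.64922609298688, "odds_ratio_direction": "second"},
    {"event": "PART", "cramers_v": 0.09362273156839818,
     "odds_ratio": 3.3765821947519883, "odds_ratio_direction": "second"},
    {"event": "PUNCT", "cramers_v": 0.0817401329794944,
     "odds_ratio": 3.619217699912104, "odds_ratio_direction": "first"},
    {"event": "ADV", "cramers_v": 0.0697921632735567,
     "odds_ratio": 2.368271631503067, "odds_ratio_direction": "second"},
    {"event": "NOUN", "cramers_v": 0.06087356761646711,
     "odds_ratio": 1.9182977375177561, "odds_ratio_direction": "first"},
]

# Group event names by the dataset in which their odds are higher.
by_direction = defaultdict(list)
for row in results:
    by_direction[row["odds_ratio_direction"]].append(row["event"])

print(dict(by_direction))
# {'second': ['INTJ', 'PART', 'ADV'], 'first': ['PUNCT', 'NOUN']}
```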
## Outputs
Running the tool on the example JSON configuration file compares the distribution of UPOS tags between the two files, `sl_sst-ud-train.conllu` and `sl_ssj-ud-train.conllu`.
The output of the tool with the `event` set to `upos`:
```
event cramers_v odds_ratio odds_ratio_direction
INTJ 0.18546269144487967 205.64922609298688 second
PART 0.09362273156839818 3.3765821947519883 second
PUNCT 0.0817401329794944 3.619217699912104 first
ADV 0.0697921632735567 2.368271631503067 second
NOUN 0.06087356761646711 1.9182977375177561 first
X 0.054704563268060884 3.730096502268552 second
ADJ 0.04337452602078068 1.914494685493001 first
DET 0.039058849022439675 1.8178073753376662 second
VERB 0.038179122992568475 1.5098089234134449 second
PRON 0.03023447484586653 1.6315601187054114 second
ADP 0.027554631145556924 1.5029875604157752 first
CCONJ 0.020521602435811057 1.3849776400810214 second
PROPN 0.018222766826042337 1.48950205465959 first
SCONJ 0.01516452354505404 1.3147102075692845 second
SYM 0.00788752606312867 31.431147723995498 first
NUM 0.006279927568528497 1.1841482466696223 first
```
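Output like the above can be loaded back for further analysis with the standard `csv` module. This is a sketch under the assumption that the file is tab-separated with a header row, as the `to_tsv` method name suggests. The sample text is a hard-coded excerpt of the table above, standing in for a real output file.

```python
import csv
import io

# Stand-in for open("output.tsv", newline="", encoding="utf-8").
tsv_text = (
    "event\tcramers_v\todds_ratio\todds_ratio_direction\n"
    "INTJ\t0.18546269144487967\t205.64922609298688\tsecond\n"
    "PUNCT\t0.0817401329794944\t3.619217699912104\tfirst\n"
    "NUM\t0.006279927568528497\t1.1841482466696223\tfirst\n"
)

rows = []
# QUOTE_NONE keeps events such as a literal quote character intact.
reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
for row in reader:
    row["cramers_v"] = float(row["cramers_v"])
    row["odds_ratio"] = float(row["odds_ratio"])
    rows.append(row)

# Keep only events whose odds differ at least threefold between the files.
strong = [r["event"] for r in rows if r["odds_ratio"] >= 3]
print(strong)
# ['INTJ', 'PUNCT']
```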
The output of the tool if the event is `lemma`:
```
event cramers_v odds_ratio odds_ratio_direction
ja 0.1443680388876384 316.50392575024387 second
eee 0.14224954241402205 9727.90290395421 second
[gap] 0.1404103062969733 9473.866117350688 second
_ 0.10342084138781253 5109.162639646662 second
[name:personal] 0.08543886760201726 3486.230522337837 second
, 0.0798983928857083 3017.3227483286178 first
[pause] 0.07793562469300712 2903.1409295352323 second
… 0.06957402089144266 29.21324122737166 second
. 0.06417706510165858 1943.1095197895877 first
ne 0.06172098673848206 4.567216172071131 second
mhm 0.06135550149322792 1808.49138820132 second
pa 0.05533923234690654 3.5866900542385225 second
[speaker:laughter] 0.05267879283333141 1341.047493947355 second
no 0.05236304456192136 33.68782033850626 second
eem 0.05177677215351289 1296.5817666752512 second
ta 0.04975510679153412 3.112790024717654 second
aha 0.04842184761560267 387.7447093352556 second
? 0.047909243941398276 6.54896239567592 second
[name:surname] 0.04599517464931115 1029.9796674731044 second
pol 0.04522323386592585 19.314885381209148 second
vedeti 0.044877978271710375 8.204695698185576 second
[audience:laughter] 0.04443254560578941 963.3805970149255 second
kaj 0.04422508295450475 6.665906246951688 second
jaz 0.04254326905719024 4.205848770006655 second
zdaj 0.04220703488553522 8.449570357876933 second
[:voice] 0.03937486931297157 763.7067235968929 second
pač 0.03880380764818679 16.29708140017637 second
ti 0.0374965082691098 5.789587920409137 second
a 0.036544753303008326 6.611184664673473 second
reči 0.03507827440464862 7.032673208444842 second
tako 0.03488349828161137 3.661944581348252 second
ampak 0.03300775971108695 8.776849122671143 second
mmm 0.03213178836124324 519.9118251928021 second
[incident] 0.03213178836124324 519.9118251928021 second
[all:laughter] 0.03213178836124324 519.9118251928021 second
iti 0.03203659707760247 5.261877926755125 second
aja 0.03063185252461786 475.61510384536297 second
misliti 0.030313213443565672 7.600055895339754 second
ka 0.02979521974817808 158.53763109191857 second
ful 0.02745038147121303 64.78054798745696 second
imeti 0.026728261048545494 2.8695967134026836 second
zdajle 0.02646949225144055 129.01572779605263 second
en 0.026425718058774403 3.8274431810951612 second
dati 0.025769048935856378 6.3530704294133375 second
tale 0.025422305095331388 17.767587428769104 second
te 0.024677083578157007 320.6482861400894 second
ma 0.024643752142178156 114.25932778291704 second
gor 0.023985218122048756 28.75423049244371 second
tam 0.02388517506596499 6.532873916176885 second
samo 0.023726743621988657 4.152648893958719 second
oni 0.02313889520888621 19.023543080403044 second
aaa 0.02269213397252964 276.39252864703764 second
...
```
The output of the tool if the event is `feat`:
```
event cramers_v odds_ratio odds_ratio_direction
Person:1 0.04939558517684953 3.5638167542809853 second
Person:2 0.047040110877374676 5.079993584778688 second
PronType:Dem 0.037175859411220154 3.245197490851571 second
Case:Gen 0.03144647693695685 2.172092243237678 first
NumForm:Word 0.028272960330824267 3.440339072082062 second
VerbForm:Fin 0.0267587007742558 1.5225701489232548 second
Mood:Ind 0.02504649601144956 1.5056171393724438 second
Gender:Masc 0.025042316104147304 1.4748890960035126 first
Reflex:Yes 0.024416673107363094 699.5006025688363 first
Tense:Pres 0.023517648140906504 1.4896134066798563 second
NumForm:Digit 0.022483697118357206 593.4206504177673 first
Case:Loc 0.02179297260457836 1.7083552918624372 first
Gender:Fem 0.02129852556496575 1.4327016742281735 first
PronType:Int 0.019116864692380664 2.8254802331440025 second
Mood:Imp 0.018378570965450702 3.791997651968869 second
Case:Ins 0.018308854809044352 2.0816886275516455 first
Aspect:Imp 0.01595530651024542 1.491526932928539 second
Polarity:Pos 0.015389343283813696 1.3931175980723853 second
Number:Plur 0.015101686601082178 1.2937540626185027 first
PronType:Prs 0.012840220897127438 1.391913401310955 second
Polarity:Neg 0.010467835392964546 1.6218991635410638 second
Abbr:Yes 0.009174573465203728 100.64168811481056 first
Tense:Fut 0.008659708802953942 1.6695810431815645 second
Poss:Yes 0.008622929553196686 1.8430908774898307 first
PronType:Ind 0.007988560895972635 1.5545205160767552 second
VerbForm:Sup 0.007606299226342561 6.994685959408413 second
VerbForm:Part 0.007588183914607259 1.2152331908399183 first
PronType:Neg 0.0072933899363323805 2.492256128366947 second
Gender[psor]:Masc 0.006270434086374932 7.047436934055062 first
Degree:Pos 0.006204462420882378 1.1017059175543398 second
Definite:Def 0.006094185561369171 1.6136203228373713 first
Case:Dat 0.005207882750413297 1.296449710663011 first
NumType:Ord 0.004907024006001947 1.6690069108717716 first
Gender[psor]:Fem 0.0043045691241584165 4.971538958295724 first
Number:Sing 0.0042501886679687595 1.0445633278559616 first
Number[psor]:Sing 0.004168042039340886 1.7214890333000743 first
Variant:Short 0.004116420078423952 1.15773584843268 second
Number[psor]:Plur 0.003520769896614546 1.4800187255151183 second
Animacy:Anim 0.0031959140489293135 1.609703528865278 first
Number:Dual 0.003119169910226467 1.1902942252586413 second
Degree:Sup 0.0028717358191274397 1.4607040532750566 first
NumForm:Roman 0.002735775335474741 10.900416396883163 first
Definite:Ind 0.0027015758531109017 1.230778104542293 first
NumType:Mult 0.002586134214293405 5.35236774349564 second
```
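The event names above (`Person:1`, `Case:Gen`) suggest that with `feat` each `Name=Value` pair of the CoNLL-U FEATS column becomes its own event, while `feats` treats the merged feature string of a word as a single event. The sketch below illustrates that reading. The helper names and the `=` to `:` mapping are assumptions inferred from the output, not taken from the conlludiff source.

```python
def feat_events(feats):
    """Split a CoNLL-U FEATS value into one event per feature (assumed `feat` mode)."""
    if feats == "_":  # "_" marks an empty FEATS column in CoNLL-U
        return []
    return [pair.replace("=", ":", 1) for pair in feats.split("|")]


def feats_event(feats):
    """Use the whole FEATS value as a single event (assumed `feats` mode)."""
    return [feats]


print(feat_events("Case=Gen|Gender=Masc|Number=Sing"))
# ['Case:Gen', 'Gender:Masc', 'Number:Sing']
print(feats_event("Case=Gen|Gender=Masc|Number=Sing"))
# ['Case=Gen|Gender=Masc|Number=Sing']
```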
The output of the tool if the event is `deprel`:
```
event cramers_v odds_ratio odds_ratio_direction
discourse 0.1995606615688087 75.78024049293329 second
discourse:filler 0.1545168231515424 11514.411314984709 second
reparandum 0.14274706919825084 9797.236607142857 second
punct 0.08173482846847689 3.6189527899648515 first
parataxis:discourse 0.08119452932918503 3149.6412683633353 second
root 0.06758699304934265 2.238339638285662 second
advmod 0.06730374982664965 2.026798690361553 second
nmod 0.05472055507805954 3.295046424712571 first
parataxis:restart 0.05222972721988518 1318.8134851138354 second
amod 0.047244641439906455 2.3917425656320344 first
conj:extend 0.04225937967772594 874.6138211382115 second
parataxis 0.0398717788389968 2.265339381899786 second
dislocated 0.035933307354648995 47.53133350886273 second
case 0.030560438744177788 1.5760166637679451 first
vocative 0.029795832802305106 9.331577965730597 second
fixed 0.024153593693035522 2.418883738805336 second
obl 0.0214600136068333 1.4392252992918104 first
conj 0.01946154708029554 1.509069018715358 first
list 0.014859841032733846 105.90846429170628 first
nummod 0.013237722191482327 1.5856577201396964 first
flat 0.01226834979992188 2.865560442489671 second
ccomp 0.012109670762191542 1.5403849395768785 second
appos 0.01192489066635104 2.0258632065250364 first
mark 0.011210811993208638 1.2446775351581985 second
acl 0.011205361560489591 1.4731494732818564 first
orphan 0.009752798810071982 2.3358535915763845 first
cc 0.008654128343704083 1.1770939880197218 second
advcl 0.007143729012211228 1.284389185911365 second
flat:name 0.006644743576550729 1.417384017797551 first
dep 0.006150595406649066 4.385886921540406 first
aux 0.005735206555028642 1.1100447817440058 first
```
The output of the tool if the event is `deprel+head_deprel`:
```
discourse_root 0.18202736862729124 76.9150923698931 second
discourse:filler_root 0.10398408766447145 4883.512041884816 second
discourse_parataxis 0.08310549743890455 66.31108288242072 second
reparandum_root 0.07203903218914691 2350.031683626272 second
parataxis:discourse_root 0.0654853384402872 1946.6054409980939 second
advmod_root 0.06281476752222244 2.557906542509336 second
cc_root 0.055707328412306335 6.9863965946940025 second
discourse:filler_parataxis 0.052954640088375506 1283.8629836802952 second
advmod_parataxis 0.0494810299649954 3.0164190203770955 second
advmod_reparandum 0.04878776582774138 1094.9009164793358 second
parataxis:restart_root 0.04824164203380061 1071.2929106628242 second
mark_root 0.04787571436102816 12.4166781015429 second
discourse:filler_conj 0.04656491000281119 1000.4852186941738 second
cc_conj:extend 0.0454127571242677 953.2936905790837 second
advmod_parataxis:restart 0.044825592848028915 929.7020050702926 second
parataxis_root 0.04359034032395506 2.5377329342701413 second
reparandum_advmod 0.042395817338828114 835.3624423963133 second
advmod_parataxis:discourse 0.04112721883273275 788.2089621011404 second
reparandum_parataxis 0.04112721883273275 788.2089621011404 second
punct_acl 0.040188438192249216 81.79756089838132 first
case_root 0.039176461994083936 11.178670793434415 second
parataxis_parataxis:restart 0.038464927462689126 693.9345848209144 second
parataxis:discourse_parataxis 0.038464927462689126 693.9345848209144 second
discourse:filler_obl 0.03777009901119597 670.3727759543962 second
discourse:filler_obj 0.03777009901119597 670.3727759543962 second
reparandum_reparandum 0.03634068952927306 623.2572974840232 second
discourse_conj 0.03560451757262082 599.7036269430051 second
fixed_discourse 0.03489266792523979 74.48440528972571 second
discourse:filler_acl 0.03485283496868408 576.1526682401703 second
reparandum_nsubj 0.03485283496868408 576.1526682401703 second
punct_conj 0.0343037374311866 14.76598098276567 first
nsubj_parataxis:restart 0.03408461936979407 552.604420907207 second
discourse:filler_nsubj 0.03408461936979407 552.604420907207 second
amod_nmod 0.03357833694236596 4.584717922597381 first
obl_parataxis:restart 0.03329873115421483 529.0588844759109 second
reparandum_mark 0.03249389371815959 505.51605847818576 second
punct_reparandum 0.03249389371815959 505.51605847818576 second
discourse:filler_ccomp 0.03249389371815959 505.51605847818576 second
reparandum_ccomp 0.03249389371815959 505.51605847818576 second
dislocated_root 0.032099983181928804 47.94146069256899 second
nmod_nmod 0.03179350017167992 6.360152211571991 first
reparandum_obl 0.03166866930185818 481.9759424460431 second
conj:extend_root 0.030821429049221256 458.4385359116022 second
discourse:filler_parataxis:restart 0.030821429049221256 458.4385359116022 second
punct_root 0.03071311098406396 1.8745367606973529 first
mark_reparandum 0.029950315528831664 434.90383840708984 second
reparandum_case 0.029950315528831664 434.90383840708984 second
reparandum_amod 0.029950315528831664 434.90383840708984 second
punct_parataxis:restart 0.029950315528831664 434.90383840708984 second
punct_appos 0.02980646118765951 133.54577644396244 first
discourse:filler_nmod 0.02905319526331687 411.3718494648406 second
cc_parataxis 0.02872166122883359 22.302391238742672 second
reparandum_advcl 0.028127597813434026 387.8425686172967 second
...
```
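Event names such as `discourse_root` or `advmod_parataxis` appear to join a token's own dependency relation with that of its head. The sketch below shows one way such events could be derived from the HEAD and DEPREL columns of a CoNLL-U sentence. The function name, the made-up three-token sentence, and the decision to emit no event for the root token itself are all assumptions rather than the tool's documented behaviour.

```python
# A made-up sentence in the 10-column CoNLL-U layout (tab-separated).
sample = (
    "# text = no gremo zdaj\n"
    "1\tno\tno\tINTJ\t_\t_\t2\tdiscourse\t_\t_\n"
    "2\tgremo\titi\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tzdaj\tzdaj\tADV\t_\t_\t2\tadvmod\t_\t_\n"
)


def deprel_head_events(sentence):
    """Return '<deprel>_<head deprel>' for every token that has a head token."""
    tokens = {}
    for line in sentence.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        tokens[int(cols[0])] = (int(cols[6]), cols[7])  # (HEAD, DEPREL)

    events = []
    for head, deprel in tokens.values():
        if head in tokens:  # assumption: the root token (HEAD 0) yields no event
            events.append(f"{deprel}_{tokens[head][1]}")
    return events


print(deprel_head_events(sample))
# ['discourse_root', 'advmod_root']
```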
# Notes for developers
## Building and publishing
```
cd conllu-diff/conlludiff
# Bump version when done:
bumpver update --patch # or --minor or --major
python -m build
twine check dist/*
# test upload:
twine upload --verbose -r testpypi dist/*
# real upload:
twine upload --verbose dist/*
```
## Testing
```
cd conllu-diff/conlludiff/tests
pytest -vv
```
Raw data
{
"_id": null,
"home_page": null,
"name": "conlludiff",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "conlludiff, conllu, conll",
"author": null,
"author_email": "Peter Rupnik <rupnikpeter@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/41/7f/30c3f4426fb9b169614efaa84d2681b2493fc0896ff9a880787ddccbec13/conlludiff-0.0.5.tar.gz",
"platform": null,
"description": "# conllu-diff\n\nA tool for statistically comparing two conllu files. It offers two equivalent use modes:\n* as a command line script\n* as a python package\n\n## Installation\n\n```\npip install conlludiff\n```\n\n## CLI use\n\nRun as `python -m conlludiff <json>`.\n\nThe tool is configured through a JSON configuration file (example in `config_files/ssj_sst_upos.json`) where the user defines:\n- `file1` - The conllu file containing the first language sample\n- `file2` - The conllu file containing the second language sample\n- `event` - The linguistic feature the comparison is to be based on, optional events are form, lemma, upos, xpos, upos+feats, feat (each feature separately), feats (all features of a word merged), deprel, deprel+head_deprel\n- `filter` - The minimum p-value of the chi-square test for the entry to be filtered from the results\n- `fields` - A list of fields to be retained in the output (list of available values is listed below)\n- `order` - The field by which the output is to be ordered\n- `reverse` - Whether the ordering should be reverse\n- `output` - Where the output is to be produced, either stdout or filename\n\nThe following fields / values are currently available:\n- `chisq` - The chi-square statistical test.\n- `chisq_p` - The p-value of the chi-square test. Very useful for discarding results with p-value below 0.05. These results you simply cannot trust (they might have happened by chance) and do not have to look at.\n- `cramers_v` - The Cramer's V effect size, based on the chi square statistic and the sample size. Traditionally it should be over 0.1 for small effect, over 0.3 for medium effect and over 0.5 for strong effect, but on language phenomena it will never achieve even medium effect. It is comparable across datasets of different sizes, so if the tool is run on multiple pairs of documents, these effect sizes CAN be used for comparison across datasets.\n- `odds_ratio` - The odds ratio effect size. 
Put simply - it reports how many times the odds of an event are higher in one dataset in comparison to another dataset. It is always higher than 1. This is why `odds_ratio_direction` gives info on the dataset for which the odds of a specific event are higher.\n- `odds_ratio_direction` - The direction of the odds ratio presented previously. If `first`, the odds of the event are greater in the first dataset. If `second`, the odds for this event are higher in the second dataset.\n- `log_likelihood_ratio` - The log-likelihood ratio, as defined by Danning (1993), here mostly for reasons of popularity in the computational linguistics circles.\n\n## Python API\n\n### Use\nThe differ can be used through its python API as follows:\n\n```python\nfrom conlludiff import Differ\n\nd = Differ(\n \"conllu_files/sl_ssj-ud-train.conllu\",\n \"conllu_files/sl_sst-ud-train.conllu\",\n event=\"upos\",\n filter=0.05,\n fields=[\n \"event\",\n \"cramers_v\",\n \"odds_ratio\",\n \"odds_ratio_direction\",\n \"contingency\"\n ],\n order=\"chisq\",\n reverse=True,\n)\nd.results\n#[\n# {'event': 'INTJ', 'cramers_v': 0.18546269144487967, 'odds_ratio': 205.64922609298688, 'odds_ratio_direction': 'second'},\n# {'event': 'PART', 'cramers_v': 0.09362273156839818, 'odds_ratio': 3.3765821947519883, 'odds_ratio_direction': 'second'},\n# {'event': 'PUNCT', 'cramers_v': 0.0817401329794944, 'odds_ratio': 3.619217699912104, 'odds_ratio_direction': 'first'},\n# {'event': 'ADV', 'cramers_v': 0.0697921632735567, 'odds_ratio': 2.368271631503067, 'odds_ratio_direction': 'second'},\n# {'event': 'NOUN', 'cramers_v': 0.06087356761646711, 'odds_ratio': 1.9182977375177561, 'odds_ratio_direction': 'first'}\n# ...]\n\nd.to_tsv(\"output.tsv\")\n# Writes the data to a tsv, same way as CLI.\n```\n\n\n## Outputs\n\n\nRunning the tool on the exemplary JSON configuration file compares the UPOS dependence between the two files, `sl_sst-ud-train.conllu` and `sl_ssj-ud-train.conllu`.\n\nThe output of the tool with the 
`event` set to `upos`:\n\n```\nevent\tcramers_v\todds_ratio\todds_ratio_direction\nINTJ\t0.18546269144487967\t205.64922609298688\tsecond\nPART\t0.09362273156839818\t3.3765821947519883\tsecond\nPUNCT\t0.0817401329794944\t3.619217699912104\tfirst\nADV\t0.0697921632735567\t2.368271631503067\tsecond\nNOUN\t0.06087356761646711\t1.9182977375177561\tfirst\nX\t0.054704563268060884\t3.730096502268552\tsecond\nADJ\t0.04337452602078068\t1.914494685493001\tfirst\nDET\t0.039058849022439675\t1.8178073753376662\tsecond\nVERB\t0.038179122992568475\t1.5098089234134449\tsecond\nPRON\t0.03023447484586653\t1.6315601187054114\tsecond\nADP\t0.027554631145556924\t1.5029875604157752\tfirst\nCCONJ\t0.020521602435811057\t1.3849776400810214\tsecond\nPROPN\t0.018222766826042337\t1.48950205465959\tfirst\nSCONJ\t0.01516452354505404\t1.3147102075692845\tsecond\nSYM\t0.00788752606312867\t31.431147723995498\tfirst\nNUM\t0.006279927568528497\t1.1841482466696223\tfirst\n```\n\nThe output of the tool if the event is `lemma`:\n\n```\nevent\tcramers_v\todds_ratio\todds_ratio_direction\nja\t0.1443680388876384\t316.50392575024387\tsecond\neee\t0.14224954241402205\t9727.90290395421\tsecond\n[gap]\t0.1404103062969733\t9473.866117350688\tsecond\n_\t0.10342084138781253\t5109.162639646662\tsecond\n[name:personal]\t0.08543886760201726\t3486.230522337837\tsecond\n,\t0.0798983928857083\t3017.3227483286178\tfirst\n[pause]\t0.07793562469300712\t2903.1409295352323\tsecond\n\u2026\t0.06957402089144266\t29.21324122737166\tsecond\n.\t0.06417706510165858\t1943.1095197895877\tfirst\nne\t0.06172098673848206\t4.567216172071131\tsecond\nmhm\t0.06135550149322792\t1808.49138820132\tsecond\npa\t0.05533923234690654\t3.5866900542385225\tsecond\n[speaker:laughter]\t0.05267879283333141\t1341.047493947355\tsecond\nno\t0.05236304456192136\t33.68782033850626\tsecond\neem\t0.05177677215351289\t1296.5817666752512\tsecond\nta\t0.04975510679153412\t3.112790024717654\tsecond\naha\t0.04842184761560267\t387.7447093352556\tsecond\n?\t0.04790
9243941398276\t6.54896239567592\tsecond\n[name:surname]\t0.04599517464931115\t1029.9796674731044\tsecond\npol\t0.04522323386592585\t19.314885381209148\tsecond\nvedeti\t0.044877978271710375\t8.204695698185576\tsecond\n[audience:laughter]\t0.04443254560578941\t963.3805970149255\tsecond\nkaj\t0.04422508295450475\t6.665906246951688\tsecond\njaz\t0.04254326905719024\t4.205848770006655\tsecond\nzdaj\t0.04220703488553522\t8.449570357876933\tsecond\n[:voice]\t0.03937486931297157\t763.7067235968929\tsecond\npa\u010d\t0.03880380764818679\t16.29708140017637\tsecond\nti\t0.0374965082691098\t5.789587920409137\tsecond\na\t0.036544753303008326\t6.611184664673473\tsecond\nre\u010di\t0.03507827440464862\t7.032673208444842\tsecond\ntako\t0.03488349828161137\t3.661944581348252\tsecond\nampak\t0.03300775971108695\t8.776849122671143\tsecond\nmmm\t0.03213178836124324\t519.9118251928021\tsecond\n[incident]\t0.03213178836124324\t519.9118251928021\tsecond\n[all:laughter]\t0.03213178836124324\t519.9118251928021\tsecond\niti\t0.03203659707760247\t5.261877926755125\tsecond\naja\t0.03063185252461786\t475.61510384536297\tsecond\nmisliti\t0.030313213443565672\t7.600055895339754\tsecond\nka\t0.02979521974817808\t158.53763109191857\tsecond\nful\t0.02745038147121303\t64.78054798745696\tsecond\nimeti\t0.026728261048545494\t2.8695967134026836\tsecond\nzdajle\t0.02646949225144055\t129.01572779605263\tsecond\nen\t0.026425718058774403\t3.8274431810951612\tsecond\ndati\t0.025769048935856378\t6.3530704294133375\tsecond\ntale\t0.025422305095331388\t17.767587428769104\tsecond\nte\t0.024677083578157007\t320.6482861400894\tsecond\nma\t0.024643752142178156\t114.25932778291704\tsecond\ngor\t0.023985218122048756\t28.75423049244371\tsecond\ntam\t0.02388517506596499\t6.532873916176885\tsecond\nsamo\t0.023726743621988657\t4.152648893958719\tsecond\noni\t0.02313889520888621\t19.023543080403044\tsecond\naaa\t0.02269213397252964\t276.39252864703764\tsecond\n...\n```\n\nThe output of the tool if the event is 
`feat`:\n\n```\nevent\tcramers_v\todds_ratio\todds_ratio_direction\nPerson:1\t0.04939558517684953\t3.5638167542809853\tsecond\nPerson:2\t0.047040110877374676\t5.079993584778688\tsecond\nPronType:Dem\t0.037175859411220154\t3.245197490851571\tsecond\nCase:Gen\t0.03144647693695685\t2.172092243237678\tfirst\nNumForm:Word\t0.028272960330824267\t3.440339072082062\tsecond\nVerbForm:Fin\t0.0267587007742558\t1.5225701489232548\tsecond\nMood:Ind\t0.02504649601144956\t1.5056171393724438\tsecond\nGender:Masc\t0.025042316104147304\t1.4748890960035126\tfirst\nReflex:Yes\t0.024416673107363094\t699.5006025688363\tfirst\nTense:Pres\t0.023517648140906504\t1.4896134066798563\tsecond\nNumForm:Digit\t0.022483697118357206\t593.4206504177673\tfirst\nCase:Loc\t0.02179297260457836\t1.7083552918624372\tfirst\nGender:Fem\t0.02129852556496575\t1.4327016742281735\tfirst\nPronType:Int\t0.019116864692380664\t2.8254802331440025\tsecond\nMood:Imp\t0.018378570965450702\t3.791997651968869\tsecond\nCase:Ins\t0.018308854809044352\t2.0816886275516455\tfirst\nAspect:Imp\t0.01595530651024542\t1.491526932928539\tsecond\nPolarity:Pos\t0.015389343283813696\t1.3931175980723853\tsecond\nNumber:Plur\t0.015101686601082178\t1.2937540626185027\tfirst\nPronType:Prs\t0.012840220897127438\t1.391913401310955\tsecond\nPolarity:Neg\t0.010467835392964546\t1.6218991635410638\tsecond\nAbbr:Yes\t0.009174573465203728\t100.64168811481056\tfirst\nTense:Fut\t0.008659708802953942\t1.6695810431815645\tsecond\nPoss:Yes\t0.008622929553196686\t1.8430908774898307\tfirst\nPronType:Ind\t0.007988560895972635\t1.5545205160767552\tsecond\nVerbForm:Sup\t0.007606299226342561\t6.994685959408413\tsecond\nVerbForm:Part\t0.007588183914607259\t1.2152331908399183\tfirst\nPronType:Neg\t0.0072933899363323805\t2.492256128366947\tsecond\nGender[psor]:Masc\t0.006270434086374932\t7.047436934055062\tfirst\nDegree:Pos\t0.006204462420882378\t1.1017059175543398\tsecond\nDefinite:Def\t0.006094185561369171\t1.6136203228373713\tfirst\nCase:Dat\t0.005207882750
413297\t1.296449710663011\tfirst\nNumType:Ord\t0.004907024006001947\t1.6690069108717716\tfirst\nGender[psor]:Fem\t0.0043045691241584165\t4.971538958295724\tfirst\nNumber:Sing\t0.0042501886679687595\t1.0445633278559616\tfirst\nNumber[psor]:Sing\t0.004168042039340886\t1.7214890333000743\tfirst\nVariant:Short\t0.004116420078423952\t1.15773584843268\tsecond\nNumber[psor]:Plur\t0.003520769896614546\t1.4800187255151183\tsecond\nAnimacy:Anim\t0.0031959140489293135\t1.609703528865278\tfirst\nNumber:Dual\t0.003119169910226467\t1.1902942252586413\tsecond\nDegree:Sup\t0.0028717358191274397\t1.4607040532750566\tfirst\nNumForm:Roman\t0.002735775335474741\t10.900416396883163\tfirst\nDefinite:Ind\t0.0027015758531109017\t1.230778104542293\tfirst\nNumType:Mult\t0.002586134214293405\t5.35236774349564\tsecond\n```\n\nOutput of the tool if the event is `deprel`:\n\n```\nevent\tcramers_v\todds_ratio\todds_ratio_direction\ndiscourse\t0.1995606615688087\t75.78024049293329\tsecond\ndiscourse:filler\t0.1545168231515424\t11514.411314984709\tsecond\nreparandum\t0.14274706919825084\t9797.236607142857\tsecond\npunct\t0.08173482846847689\t3.6189527899648515\tfirst\nparataxis:discourse\t0.08119452932918503\t3149.6412683633353\tsecond\nroot\t0.06758699304934265\t2.238339638285662\tsecond\nadvmod\t0.06730374982664965\t2.026798690361553\tsecond\nnmod\t0.05472055507805954\t3.295046424712571\tfirst\nparataxis:restart\t0.05222972721988518\t1318.8134851138354\tsecond\namod\t0.047244641439906455\t2.3917425656320344\tfirst\nconj:extend\t0.04225937967772594\t874.6138211382115\tsecond\nparataxis\t0.0398717788389968\t2.265339381899786\tsecond\ndislocated\t0.035933307354648995\t47.53133350886273\tsecond\ncase\t0.030560438744177788\t1.5760166637679451\tfirst\nvocative\t0.029795832802305106\t9.331577965730597\tsecond\nfixed\t0.024153593693035522\t2.418883738805336\tsecond\nobl\t0.0214600136068333\t1.4392252992918104\tfirst\nconj\t0.01946154708029554\t1.509069018715358\tfirst\nlist\t0.014859841032733846\t105.908
46429170628\tfirst\nnummod\t0.013237722191482327\t1.5856577201396964\tfirst\nflat\t0.01226834979992188\t2.865560442489671\tsecond\nccomp\t0.012109670762191542\t1.5403849395768785\tsecond\nappos\t0.01192489066635104\t2.0258632065250364\tfirst\nmark\t0.011210811993208638\t1.2446775351581985\tsecond\nacl\t0.011205361560489591\t1.4731494732818564\tfirst\norphan\t0.009752798810071982\t2.3358535915763845\tfirst\ncc\t0.008654128343704083\t1.1770939880197218\tsecond\nadvcl\t0.007143729012211228\t1.284389185911365\tsecond\nflat:name\t0.006644743576550729\t1.417384017797551\tfirst\ndep\t0.006150595406649066\t4.385886921540406\tfirst\naux\t0.005735206555028642\t1.1100447817440058\tfirst\n```\n\nOutput of the tool if the event is `deprel+head_deprel`:\n\n```\ndiscourse_root\t0.18202736862729124\t76.9150923698931\tsecond\ndiscourse:filler_root\t0.10398408766447145\t4883.512041884816\tsecond\ndiscourse_parataxis\t0.08310549743890455\t66.31108288242072\tsecond\nreparandum_root\t0.07203903218914691\t2350.031683626272\tsecond\nparataxis:discourse_root\t0.0654853384402872\t1946.6054409980939\tsecond\nadvmod_root\t0.06281476752222244\t2.557906542509336\tsecond\ncc_root\t0.055707328412306335\t6.9863965946940025\tsecond\ndiscourse:filler_parataxis\t0.052954640088375506\t1283.8629836802952\tsecond\nadvmod_parataxis\t0.0494810299649954\t3.0164190203770955\tsecond\nadvmod_reparandum\t0.04878776582774138\t1094.9009164793358\tsecond\nparataxis:restart_root\t0.04824164203380061\t1071.2929106628242\tsecond\nmark_root\t0.04787571436102816\t12.4166781015429\tsecond\ndiscourse:filler_conj\t0.04656491000281119\t1000.4852186941738\tsecond\ncc_conj:extend\t0.0454127571242677\t953.2936905790837\tsecond\nadvmod_parataxis:restart\t0.044825592848028915\t929.7020050702926\tsecond\nparataxis_root\t0.04359034032395506\t2.5377329342701413\tsecond\nreparandum_advmod\t0.042395817338828114\t835.3624423963133\tsecond\nadvmod_parataxis:discourse\t0.04112721883273275\t788.2089621011404\tsecond\nreparandum_paratax
is\t0.04112721883273275\t788.2089621011404\tsecond\npunct_acl\t0.040188438192249216\t81.79756089838132\tfirst\ncase_root\t0.039176461994083936\t11.178670793434415\tsecond\nparataxis_parataxis:restart\t0.038464927462689126\t693.9345848209144\tsecond\nparataxis:discourse_parataxis\t0.038464927462689126\t693.9345848209144\tsecond\ndiscourse:filler_obl\t0.03777009901119597\t670.3727759543962\tsecond\ndiscourse:filler_obj\t0.03777009901119597\t670.3727759543962\tsecond\nreparandum_reparandum\t0.03634068952927306\t623.2572974840232\tsecond\ndiscourse_conj\t0.03560451757262082\t599.7036269430051\tsecond\nfixed_discourse\t0.03489266792523979\t74.48440528972571\tsecond\ndiscourse:filler_acl\t0.03485283496868408\t576.1526682401703\tsecond\nreparandum_nsubj\t0.03485283496868408\t576.1526682401703\tsecond\npunct_conj\t0.0343037374311866\t14.76598098276567\tfirst\nnsubj_parataxis:restart\t0.03408461936979407\t552.604420907207\tsecond\ndiscourse:filler_nsubj\t0.03408461936979407\t552.604420907207\tsecond\namod_nmod\t0.03357833694236596\t4.584717922597381\tfirst\nobl_parataxis:restart\t0.03329873115421483\t529.0588844759109\tsecond\nreparandum_mark\t0.03249389371815959\t505.51605847818576\tsecond\npunct_reparandum\t0.03249389371815959\t505.51605847818576\tsecond\ndiscourse:filler_ccomp\t0.03249389371815959\t505.51605847818576\tsecond\nreparandum_ccomp\t0.03249389371815959\t505.51605847818576\tsecond\ndislocated_root\t0.032099983181928804\t47.94146069256899\tsecond\nnmod_nmod\t0.03179350017167992\t6.360152211571991\tfirst\nreparandum_obl\t0.03166866930185818\t481.9759424460431\tsecond\nconj:extend_root\t0.030821429049221256\t458.4385359116022\tsecond\ndiscourse:filler_parataxis:restart\t0.030821429049221256\t458.4385359116022\tsecond\npunct_root\t0.03071311098406396\t1.8745367606973529\tfirst\nmark_reparandum\t0.029950315528831664\t434.90383840708984\tsecond\nreparandum_case\t0.029950315528831664\t434.90383840708984\tsecond\nreparandum_amod\t0.029950315528831664\t434.90383840708984
\tsecond\npunct_parataxis:restart\t0.029950315528831664\t434.90383840708984\tsecond\npunct_appos\t0.02980646118765951\t133.54577644396244\tfirst\ndiscourse:filler_nmod\t0.02905319526331687\t411.3718494648406\tsecond\ncc_parataxis\t0.02872166122883359\t22.302391238742672\tsecond\nreparandum_advcl\t0.028127597813434026\t387.8425686172967\tsecond\n...\n```\n\n# Notes for developers\n\n## Building and publishing\n```\ncd conllu-diff/conlludiff\n# Bump version when done:\nbumpver update --patch # or --minor or --major\npython -m build\ntwine check dist/*\n# test upload:\ntwine upload --verbose -r testpypi dist/*\n# real upload:\ntwine upload --verbose dist/*\n```\n\n## Testing\n```\ncd conllu-diff/conlludiff/tests\npytest -vv\n```\n",
"bugtrack_url": null,
"license": "Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. \"License\" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. \"Licensor\" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. \"Legal Entity\" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, \"control\" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. \"You\" (or \"Your\") shall mean an individual or Legal Entity exercising permissions granted by this License. \"Source\" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. \"Object\" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. \"Work\" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). \"Derivative Works\" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. 
For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. \"Contribution\" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, \"submitted\" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as \"Not a Contribution.\" \"Contributor\" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a \"NOTICE\" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the 
Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. 
Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. 
To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets \"[]\" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same \"printed page\" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. ",
"summary": "Analyze two CONLLU files",
"version": "0.0.5",
"project_urls": {
"Homepage": "https://github.com/clarinsi/conllu-diff"
},
"split_keywords": [
"conlludiff",
"conllu",
"conll"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "2d2f5262b432e839172cc3a50933d1435538af16bfac6124de413ba4fd288758",
"md5": "80ecc09f1cca6d1bf44657a2aa0c8b3a",
"sha256": "706997c031f1bc9822c67097bc6cc8dc128c79d547bb17e7b619ff0acdb605b0"
},
"downloads": -1,
"filename": "conlludiff-0.0.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "80ecc09f1cca6d1bf44657a2aa0c8b3a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 21415,
"upload_time": "2024-04-02T08:49:41",
"upload_time_iso_8601": "2024-04-02T08:49:41.471388Z",
"url": "https://files.pythonhosted.org/packages/2d/2f/5262b432e839172cc3a50933d1435538af16bfac6124de413ba4fd288758/conlludiff-0.0.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "417f30c3f4426fb9b169614efaa84d2681b2493fc0896ff9a880787ddccbec13",
"md5": "0e8d2932ba40532b60b61dcd800ac026",
"sha256": "899d81f0486503da663a8d4fe6357e4209eba25cb09b047555715446851550cd"
},
"downloads": -1,
"filename": "conlludiff-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "0e8d2932ba40532b60b61dcd800ac026",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 29147,
"upload_time": "2024-04-02T08:49:42",
"upload_time_iso_8601": "2024-04-02T08:49:42.799466Z",
"url": "https://files.pythonhosted.org/packages/41/7f/30c3f4426fb9b169614efaa84d2681b2493fc0896ff9a880787ddccbec13/conlludiff-0.0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-02 08:49:42",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "clarinsi",
"github_project": "conllu-diff",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "conlludiff"
}