conlludiff


Name: conlludiff
Version: 0.0.5
Summary: Analyze two CONLLU files
Upload time: 2024-04-02 08:49:42
Requires Python: >=3.10
License: Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
Keywords: conlludiff, conllu, conll
# conllu-diff

A tool for statistically comparing two CoNLL-U files. It offers two equivalent modes of use:
* as a command-line script
* as a Python package

## Installation

```
pip install conlludiff
```

## CLI use

Run as `python -m conlludiff <json>`.

The tool is configured through a JSON configuration file (example in `config_files/ssj_sst_upos.json`) where the user defines:
- `file1` - The conllu file containing the first language sample
- `file2` - The conllu file containing the second language sample
- `event` - The linguistic feature on which the comparison is based. Available events are `form`, `lemma`, `upos`, `xpos`, `upos+feats`, `feat` (each feature separately), `feats` (all features of a word merged), `deprel`, and `deprel+head_deprel`
- `filter` - The p-value threshold of the chi-square test; entries whose p-value is at or above this threshold are filtered from the results
- `fields` - A list of fields to be retained in the output (list of available values is listed below)
- `order` - The field by which the output is to be ordered
- `reverse` - Whether the ordering should be reversed
- `output` - Where the output is to be written, either stdout or a file name
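
As a rough illustration, the configuration keys listed above can be assembled in Python and serialised to JSON. The values mirror the Python API example further down in this README; the literal `"stdout"` value for `output` is an assumption based on the description above, and the shipped `config_files/ssj_sst_upos.json` may differ.

```python
import json

# Keys follow the list above; values are illustrative.
config = {
    "file1": "conllu_files/sl_ssj-ud-train.conllu",
    "file2": "conllu_files/sl_sst-ud-train.conllu",
    "event": "upos",
    "filter": 0.05,
    "fields": ["event", "cramers_v", "odds_ratio", "odds_ratio_direction"],
    "order": "chisq",
    "reverse": True,
    "output": "stdout",  # assumption: the literal string selects standard output
}

# Serialise to the JSON text that would go into the configuration file.
config_text = json.dumps(config, indent=4)
print(config_text)
```

Saving the printed text to a file (for example a hypothetical `my_config.json`) and running `python -m conlludiff my_config.json` should then start the comparison.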

The following fields / values are currently available:
- `chisq` - The chi-square test statistic.
- `chisq_p` - The p-value of the chi-square test. Very useful for discarding results with a p-value above 0.05. Such results cannot be trusted (they might have happened by chance) and need not be examined.
- `cramers_v` - The Cramér's V effect size, based on the chi-square statistic and the sample size. Traditionally it should be over 0.1 for a small effect, over 0.3 for a medium effect and over 0.5 for a strong effect, but on language phenomena it rarely, if ever, reaches even a medium effect. It is comparable across datasets of different sizes, so if the tool is run on multiple pairs of documents, these effect sizes can be compared across datasets.
- `odds_ratio` - The odds ratio effect size. Put simply, it reports how many times higher the odds of an event are in one dataset than in the other. The reported value is always at least 1, which is why `odds_ratio_direction` indicates the dataset in which the odds of the event are higher.
- `odds_ratio_direction` - The direction of the odds ratio presented previously. If `first`, the odds of the event are greater in the first dataset. If `second`, the odds for this event are higher in the second dataset.
- `log_likelihood_ratio` - The log-likelihood ratio, as defined by Dunning (1993), included mostly because of its popularity in computational linguistics circles.
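
To make the measures above concrete, here is a minimal sketch of how they can be computed for a single event from a 2x2 contingency table (event vs. all other observations, first vs. second dataset). It uses only textbook formulas and is not taken from the conlludiff source: the function name `association_measures` and its arguments are hypothetical, and details such as continuity correction or the handling of zero counts may differ in the tool.

```python
import math


def association_measures(count1, total1, count2, total2):
    """Compare how often an event occurs in two samples.

    count1, count2: occurrences of the event in each dataset.
    total1, total2: all observations in each dataset.
    Assumes every cell of the 2x2 table is non-zero.
    """
    # 2x2 contingency table: rows = dataset, columns = event / everything else
    a, b = count1, total1 - count1
    c, d = count2, total2 - count2
    n = a + b + c + d

    # Pearson chi-square statistic (no continuity correction)
    chisq = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # p-value of the chi-square distribution with one degree of freedom
    chisq_p = math.erfc(math.sqrt(chisq / 2))
    # Cramér's V; for a 2x2 table min(rows - 1, cols - 1) equals 1
    cramers_v = math.sqrt(chisq / n)

    # Odds ratio, flipped so that the reported value is at least 1
    odds_ratio = (a * d) / (b * c)
    direction = "first"
    if odds_ratio < 1:
        odds_ratio, direction = 1 / odds_ratio, "second"

    # Log-likelihood ratio: G2 = 2 * sum(observed * ln(observed / expected))
    cells = ((a, a + b, a + c), (b, a + b, b + d), (c, c + d, a + c), (d, c + d, b + d))
    g2 = 2 * sum(o * math.log(o * n / (row * col)) for o, row, col in cells if o > 0)

    return {
        "chisq": chisq,
        "chisq_p": chisq_p,
        "cramers_v": cramers_v,
        "odds_ratio": odds_ratio,
        "odds_ratio_direction": direction,
        "log_likelihood_ratio": g2,
    }


# Hypothetical counts: 30 of 1000 tokens in the first sample, 90 of 1000 in the second.
measures = association_measures(30, 1000, 90, 1000)
print(measures)
```

Flipping the odds ratio and storing the direction separately keeps the value on a single scale (1 and above), which makes sorting by effect size straightforward.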

## Python API

### Use
The differ can be used through its Python API as follows:

```python
from conlludiff import Differ

d = Differ(
    "conllu_files/sl_ssj-ud-train.conllu",
    "conllu_files/sl_sst-ud-train.conllu",
    event="upos",
    filter=0.05,
    fields=[
        "event",
        "cramers_v",
        "odds_ratio",
        "odds_ratio_direction",
        "contingency"
    ],
    order="chisq",
    reverse=True,
)
d.results
#[
# {'event': 'INTJ', 'cramers_v': 0.18546269144487967, 'odds_ratio': 205.64922609298688, 'odds_ratio_direction': 'second'},
# {'event': 'PART', 'cramers_v': 0.09362273156839818, 'odds_ratio': 3.3765821947519883, 'odds_ratio_direction': 'second'},
# {'event': 'PUNCT', 'cramers_v': 0.0817401329794944, 'odds_ratio': 3.619217699912104, 'odds_ratio_direction': 'first'},
# {'event': 'ADV', 'cramers_v': 0.0697921632735567, 'odds_ratio': 2.368271631503067, 'odds_ratio_direction': 'second'},
# {'event': 'NOUN', 'cramers_v': 0.06087356761646711, 'odds_ratio': 1.9182977375177561, 'odds_ratio_direction': 'first'}
# ...]

d.to_tsv("output.tsv")
# Writes the results to a TSV file, in the same way as the CLI.
```
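
Since `d.results` is shown above as a list of dictionaries, it can be post-processed with plain Python. The sketch below works on a literal copy of the first five rows from the example rather than on a live `Differ` object, so it runs without the corpus files; the helper `events_by_direction` is hypothetical and not part of the package.

```python
# Literal copy of the first rows of `d.results` from the example above.
results = [
    {'event': 'INTJ', 'cramers_v': 0.18546269144487967, 'odds_ratio': 205.64922609298688, 'odds_ratio_direction': 'second'},
    {'event': 'PART', 'cramers_v': 0.09362273156839818, 'odds_ratio': 3.3765821947519883, 'odds_ratio_direction': 'second'},
    {'event': 'PUNCT', 'cramers_v': 0.0817401329794944, 'odds_ratio': 3.619217699912104, 'odds_ratio_direction': 'first'},
    {'event': 'ADV', 'cramers_v': 0.0697921632735567, 'odds_ratio': 2.368271631503067, 'odds_ratio_direction': 'second'},
    {'event': 'NOUN', 'cramers_v': 0.06087356761646711, 'odds_ratio': 1.9182977375177561, 'odds_ratio_direction': 'first'},
]


def events_by_direction(rows, direction):
    """Return the events whose odds are higher in the given dataset."""
    return [row["event"] for row in rows if row["odds_ratio_direction"] == direction]


first = events_by_direction(results, "first")
second = events_by_direction(results, "second")
print("more typical of the first file:", first)
print("more typical of the second file:", second)
```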


## Outputs


Running the tool on the example JSON configuration file compares the UPOS distributions of the two files, `sl_sst-ud-train.conllu` and `sl_ssj-ud-train.conllu`.

The output of the tool with the `event` set to `upos`:

```
event	cramers_v	odds_ratio	odds_ratio_direction
INTJ	0.18546269144487967	205.64922609298688	second
PART	0.09362273156839818	3.3765821947519883	second
PUNCT	0.0817401329794944	3.619217699912104	first
ADV	0.0697921632735567	2.368271631503067	second
NOUN	0.06087356761646711	1.9182977375177561	first
X	0.054704563268060884	3.730096502268552	second
ADJ	0.04337452602078068	1.914494685493001	first
DET	0.039058849022439675	1.8178073753376662	second
VERB	0.038179122992568475	1.5098089234134449	second
PRON	0.03023447484586653	1.6315601187054114	second
ADP	0.027554631145556924	1.5029875604157752	first
CCONJ	0.020521602435811057	1.3849776400810214	second
PROPN	0.018222766826042337	1.48950205465959	first
SCONJ	0.01516452354505404	1.3147102075692845	second
SYM	0.00788752606312867	31.431147723995498	first
NUM	0.006279927568528497	1.1841482466696223	first
```
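
The TSV output can be read back with Python's standard `csv` module. The sketch below parses a few rows copied from the table above (held in a string so the example is self-contained) and shows that ranking by `cramers_v` and ranking by `odds_ratio` can disagree: `SYM` has a large odds ratio but a very small Cramér's V.

```python
import csv
import io

# A few rows copied from the UPOS output above. With a real run, open the
# file written by the tool instead of wrapping a string in io.StringIO.
tsv_text = (
    "event\tcramers_v\todds_ratio\todds_ratio_direction\n"
    "INTJ\t0.18546269144487967\t205.64922609298688\tsecond\n"
    "PUNCT\t0.0817401329794944\t3.619217699912104\tfirst\n"
    "SYM\t0.00788752606312867\t31.431147723995498\tfirst\n"
    "NUM\t0.006279927568528497\t1.1841482466696223\tfirst\n"
)

rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
for row in rows:
    # csv yields strings, so convert the numeric columns explicitly
    row["cramers_v"] = float(row["cramers_v"])
    row["odds_ratio"] = float(row["odds_ratio"])

by_v = [r["event"] for r in sorted(rows, key=lambda r: r["cramers_v"], reverse=True)]
by_odds = [r["event"] for r in sorted(rows, key=lambda r: r["odds_ratio"], reverse=True)]
print("by Cramér's V:", by_v)
print("by odds ratio:", by_odds)
```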

The output of the tool if the event is `lemma`:

```
event	cramers_v	odds_ratio	odds_ratio_direction
ja	0.1443680388876384	316.50392575024387	second
eee	0.14224954241402205	9727.90290395421	second
[gap]	0.1404103062969733	9473.866117350688	second
_	0.10342084138781253	5109.162639646662	second
[name:personal]	0.08543886760201726	3486.230522337837	second
,	0.0798983928857083	3017.3227483286178	first
[pause]	0.07793562469300712	2903.1409295352323	second
…	0.06957402089144266	29.21324122737166	second
.	0.06417706510165858	1943.1095197895877	first
ne	0.06172098673848206	4.567216172071131	second
mhm	0.06135550149322792	1808.49138820132	second
pa	0.05533923234690654	3.5866900542385225	second
[speaker:laughter]	0.05267879283333141	1341.047493947355	second
no	0.05236304456192136	33.68782033850626	second
eem	0.05177677215351289	1296.5817666752512	second
ta	0.04975510679153412	3.112790024717654	second
aha	0.04842184761560267	387.7447093352556	second
?	0.047909243941398276	6.54896239567592	second
[name:surname]	0.04599517464931115	1029.9796674731044	second
pol	0.04522323386592585	19.314885381209148	second
vedeti	0.044877978271710375	8.204695698185576	second
[audience:laughter]	0.04443254560578941	963.3805970149255	second
kaj	0.04422508295450475	6.665906246951688	second
jaz	0.04254326905719024	4.205848770006655	second
zdaj	0.04220703488553522	8.449570357876933	second
[:voice]	0.03937486931297157	763.7067235968929	second
pač	0.03880380764818679	16.29708140017637	second
ti	0.0374965082691098	5.789587920409137	second
a	0.036544753303008326	6.611184664673473	second
reči	0.03507827440464862	7.032673208444842	second
tako	0.03488349828161137	3.661944581348252	second
ampak	0.03300775971108695	8.776849122671143	second
mmm	0.03213178836124324	519.9118251928021	second
[incident]	0.03213178836124324	519.9118251928021	second
[all:laughter]	0.03213178836124324	519.9118251928021	second
iti	0.03203659707760247	5.261877926755125	second
aja	0.03063185252461786	475.61510384536297	second
misliti	0.030313213443565672	7.600055895339754	second
ka	0.02979521974817808	158.53763109191857	second
ful	0.02745038147121303	64.78054798745696	second
imeti	0.026728261048545494	2.8695967134026836	second
zdajle	0.02646949225144055	129.01572779605263	second
en	0.026425718058774403	3.8274431810951612	second
dati	0.025769048935856378	6.3530704294133375	second
tale	0.025422305095331388	17.767587428769104	second
te	0.024677083578157007	320.6482861400894	second
ma	0.024643752142178156	114.25932778291704	second
gor	0.023985218122048756	28.75423049244371	second
tam	0.02388517506596499	6.532873916176885	second
samo	0.023726743621988657	4.152648893958719	second
oni	0.02313889520888621	19.023543080403044	second
aaa	0.02269213397252964	276.39252864703764	second
...
```

The output of the tool if the event is `feat`:

```
event	cramers_v	odds_ratio	odds_ratio_direction
Person:1	0.04939558517684953	3.5638167542809853	second
Person:2	0.047040110877374676	5.079993584778688	second
PronType:Dem	0.037175859411220154	3.245197490851571	second
Case:Gen	0.03144647693695685	2.172092243237678	first
NumForm:Word	0.028272960330824267	3.440339072082062	second
VerbForm:Fin	0.0267587007742558	1.5225701489232548	second
Mood:Ind	0.02504649601144956	1.5056171393724438	second
Gender:Masc	0.025042316104147304	1.4748890960035126	first
Reflex:Yes	0.024416673107363094	699.5006025688363	first
Tense:Pres	0.023517648140906504	1.4896134066798563	second
NumForm:Digit	0.022483697118357206	593.4206504177673	first
Case:Loc	0.02179297260457836	1.7083552918624372	first
Gender:Fem	0.02129852556496575	1.4327016742281735	first
PronType:Int	0.019116864692380664	2.8254802331440025	second
Mood:Imp	0.018378570965450702	3.791997651968869	second
Case:Ins	0.018308854809044352	2.0816886275516455	first
Aspect:Imp	0.01595530651024542	1.491526932928539	second
Polarity:Pos	0.015389343283813696	1.3931175980723853	second
Number:Plur	0.015101686601082178	1.2937540626185027	first
PronType:Prs	0.012840220897127438	1.391913401310955	second
Polarity:Neg	0.010467835392964546	1.6218991635410638	second
Abbr:Yes	0.009174573465203728	100.64168811481056	first
Tense:Fut	0.008659708802953942	1.6695810431815645	second
Poss:Yes	0.008622929553196686	1.8430908774898307	first
PronType:Ind	0.007988560895972635	1.5545205160767552	second
VerbForm:Sup	0.007606299226342561	6.994685959408413	second
VerbForm:Part	0.007588183914607259	1.2152331908399183	first
PronType:Neg	0.0072933899363323805	2.492256128366947	second
Gender[psor]:Masc	0.006270434086374932	7.047436934055062	first
Degree:Pos	0.006204462420882378	1.1017059175543398	second
Definite:Def	0.006094185561369171	1.6136203228373713	first
Case:Dat	0.005207882750413297	1.296449710663011	first
NumType:Ord	0.004907024006001947	1.6690069108717716	first
Gender[psor]:Fem	0.0043045691241584165	4.971538958295724	first
Number:Sing	0.0042501886679687595	1.0445633278559616	first
Number[psor]:Sing	0.004168042039340886	1.7214890333000743	first
Variant:Short	0.004116420078423952	1.15773584843268	second
Number[psor]:Plur	0.003520769896614546	1.4800187255151183	second
Animacy:Anim	0.0031959140489293135	1.609703528865278	first
Number:Dual	0.003119169910226467	1.1902942252586413	second
Degree:Sup	0.0028717358191274397	1.4607040532750566	first
NumForm:Roman	0.002735775335474741	10.900416396883163	first
Definite:Ind	0.0027015758531109017	1.230778104542293	first
NumType:Mult	0.002586134214293405	5.35236774349564	second
```
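
The `Name:Value` labels in this table look like individual entries of the CoNLL-U FEATS column (for example `Case=Gen|Number=Sing`). The following sketch shows one way such per-feature events could be derived; it is an assumption about the label format inferred from the output above, not the tool's actual code, and `feat_events` is a hypothetical helper.

```python
def feat_events(feats_column):
    """Split a CoNLL-U FEATS value into one 'Name:Value' label per feature."""
    # An underscore marks a token without morphological features.
    if feats_column == "_":
        return []
    events = []
    for pair in feats_column.split("|"):
        # Each entry has the form Name=Value; layered names such as
        # Gender[psor] are kept unchanged.
        name, _, value = pair.partition("=")
        events.append(f"{name}:{value}")
    return events


print(feat_events("Case=Gen|Gender=Masc|Number=Sing"))
print(feat_events("Gender[psor]=Masc"))
print(feat_events("_"))
```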

The output of the tool if the event is `deprel`:

```
event	cramers_v	odds_ratio	odds_ratio_direction
discourse	0.1995606615688087	75.78024049293329	second
discourse:filler	0.1545168231515424	11514.411314984709	second
reparandum	0.14274706919825084	9797.236607142857	second
punct	0.08173482846847689	3.6189527899648515	first
parataxis:discourse	0.08119452932918503	3149.6412683633353	second
root	0.06758699304934265	2.238339638285662	second
advmod	0.06730374982664965	2.026798690361553	second
nmod	0.05472055507805954	3.295046424712571	first
parataxis:restart	0.05222972721988518	1318.8134851138354	second
amod	0.047244641439906455	2.3917425656320344	first
conj:extend	0.04225937967772594	874.6138211382115	second
parataxis	0.0398717788389968	2.265339381899786	second
dislocated	0.035933307354648995	47.53133350886273	second
case	0.030560438744177788	1.5760166637679451	first
vocative	0.029795832802305106	9.331577965730597	second
fixed	0.024153593693035522	2.418883738805336	second
obl	0.0214600136068333	1.4392252992918104	first
conj	0.01946154708029554	1.509069018715358	first
list	0.014859841032733846	105.90846429170628	first
nummod	0.013237722191482327	1.5856577201396964	first
flat	0.01226834979992188	2.865560442489671	second
ccomp	0.012109670762191542	1.5403849395768785	second
appos	0.01192489066635104	2.0258632065250364	first
mark	0.011210811993208638	1.2446775351581985	second
acl	0.011205361560489591	1.4731494732818564	first
orphan	0.009752798810071982	2.3358535915763845	first
cc	0.008654128343704083	1.1770939880197218	second
advcl	0.007143729012211228	1.284389185911365	second
flat:name	0.006644743576550729	1.417384017797551	first
dep	0.006150595406649066	4.385886921540406	first
aux	0.005735206555028642	1.1100447817440058	first
```

The output of the tool if the event is `deprel+head_deprel`:

```
event	cramers_v	odds_ratio	odds_ratio_direction
discourse_root	0.18202736862729124	76.9150923698931	second
discourse:filler_root	0.10398408766447145	4883.512041884816	second
discourse_parataxis	0.08310549743890455	66.31108288242072	second
reparandum_root	0.07203903218914691	2350.031683626272	second
parataxis:discourse_root	0.0654853384402872	1946.6054409980939	second
advmod_root	0.06281476752222244	2.557906542509336	second
cc_root	0.055707328412306335	6.9863965946940025	second
discourse:filler_parataxis	0.052954640088375506	1283.8629836802952	second
advmod_parataxis	0.0494810299649954	3.0164190203770955	second
advmod_reparandum	0.04878776582774138	1094.9009164793358	second
parataxis:restart_root	0.04824164203380061	1071.2929106628242	second
mark_root	0.04787571436102816	12.4166781015429	second
discourse:filler_conj	0.04656491000281119	1000.4852186941738	second
cc_conj:extend	0.0454127571242677	953.2936905790837	second
advmod_parataxis:restart	0.044825592848028915	929.7020050702926	second
parataxis_root	0.04359034032395506	2.5377329342701413	second
reparandum_advmod	0.042395817338828114	835.3624423963133	second
advmod_parataxis:discourse	0.04112721883273275	788.2089621011404	second
reparandum_parataxis	0.04112721883273275	788.2089621011404	second
punct_acl	0.040188438192249216	81.79756089838132	first
case_root	0.039176461994083936	11.178670793434415	second
parataxis_parataxis:restart	0.038464927462689126	693.9345848209144	second
parataxis:discourse_parataxis	0.038464927462689126	693.9345848209144	second
discourse:filler_obl	0.03777009901119597	670.3727759543962	second
discourse:filler_obj	0.03777009901119597	670.3727759543962	second
reparandum_reparandum	0.03634068952927306	623.2572974840232	second
discourse_conj	0.03560451757262082	599.7036269430051	second
fixed_discourse	0.03489266792523979	74.48440528972571	second
discourse:filler_acl	0.03485283496868408	576.1526682401703	second
reparandum_nsubj	0.03485283496868408	576.1526682401703	second
punct_conj	0.0343037374311866	14.76598098276567	first
nsubj_parataxis:restart	0.03408461936979407	552.604420907207	second
discourse:filler_nsubj	0.03408461936979407	552.604420907207	second
amod_nmod	0.03357833694236596	4.584717922597381	first
obl_parataxis:restart	0.03329873115421483	529.0588844759109	second
reparandum_mark	0.03249389371815959	505.51605847818576	second
punct_reparandum	0.03249389371815959	505.51605847818576	second
discourse:filler_ccomp	0.03249389371815959	505.51605847818576	second
reparandum_ccomp	0.03249389371815959	505.51605847818576	second
dislocated_root	0.032099983181928804	47.94146069256899	second
nmod_nmod	0.03179350017167992	6.360152211571991	first
reparandum_obl	0.03166866930185818	481.9759424460431	second
conj:extend_root	0.030821429049221256	458.4385359116022	second
discourse:filler_parataxis:restart	0.030821429049221256	458.4385359116022	second
punct_root	0.03071311098406396	1.8745367606973529	first
mark_reparandum	0.029950315528831664	434.90383840708984	second
reparandum_case	0.029950315528831664	434.90383840708984	second
reparandum_amod	0.029950315528831664	434.90383840708984	second
punct_parataxis:restart	0.029950315528831664	434.90383840708984	second
punct_appos	0.02980646118765951	133.54577644396244	first
discourse:filler_nmod	0.02905319526331687	411.3718494648406	second
cc_parataxis	0.02872166122883359	22.302391238742672	second
reparandum_advcl	0.028127597813434026	387.8425686172967	second
...
```
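
Labels such as `discourse_root` suggest that each event joins a token's dependency relation with the relation of its head. The sketch below illustrates that reading on a tiny, made-up CoNLL-U sentence. Both the label order (`<deprel>_<head deprel>`) and the decision to skip the root token, whose head is `0`, are assumptions; the tool may handle these cases differently. The helper name `deprel_head_events` is hypothetical.

```python
def deprel_head_events(conllu_sentence):
    """Return '<deprel>_<head deprel>' labels for one CoNLL-U sentence."""
    tokens = {}
    for line in conllu_sentence.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        token_id = cols[0]
        if not token_id.isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        tokens[token_id] = (cols[6], cols[7])  # HEAD and DEPREL columns

    events = []
    for head, deprel in tokens.values():
        if head in tokens:  # head "0" is the artificial root and has no deprel
            events.append(f"{deprel}_{tokens[head][1]}")
    return events


# Hypothetical three-token sentence, built from column lists to keep tabs explicit.
sentence = "\n".join(
    "\t".join(cols)
    for cols in [
        ["1", "ja", "ja", "INTJ", "_", "_", "2", "discourse", "_", "_"],
        ["2", "vem", "vedeti", "VERB", "_", "_", "0", "root", "_", "_"],
        ["3", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"],
    ]
)
events = deprel_head_events(sentence)
print(events)
```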
# Notes for developers

## Building and publishing
```
cd conllu-diff/conlludiff
# Bump version when done:
bumpver update --patch # or --minor or --major
python -m build
twine check dist/*
# test upload:
twine upload --verbose -r testpypi dist/*
# real upload:
twine upload --verbose dist/*
```

## Testing
```
cd conllu-diff/conlludiff/tests
pytest -vv
```

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "conlludiff",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "conlludiff, conllu, conll",
    "author": null,
    "author_email": "Peter Rupnik <rupnikpeter@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/41/7f/30c3f4426fb9b169614efaa84d2681b2493fc0896ff9a880787ddccbec13/conlludiff-0.0.5.tar.gz",
    "platform": null,
    "description": "# conllu-diff\n\nA tool for statistically comparing two conllu files. It offers two equivalent use modes:\n* as a command line script\n* as a python package\n\n## Installation\n\n```\npip install conlludiff\n```\n\n## CLI use\n\nRun as `python -m conlludiff <json>`.\n\nThe tool is configured through a JSON configuration file (example in `config_files/ssj_sst_upos.json`) where the user defines:\n- `file1` - The conllu file containing the first language sample\n- `file2` - The conllu file containing the second language sample\n- `event` - The linguistic feature the comparison is to be based on, optional events are form, lemma, upos, xpos, upos+feats, feat (each feature separately), feats (all features of a word merged), deprel, deprel+head_deprel\n- `filter` - The minimum p-value of the chi-square test for the entry to be filtered from the results\n- `fields` - A list of fields to be retained in the output (list of available values is listed below)\n- `order` - The field by which the output is to be ordered\n- `reverse` - Whether the ordering should be reverse\n- `output` - Where the output is to be produced, either stdout or filename\n\nThe following fields / values are currently available:\n- `chisq` - The chi-square statistical test.\n- `chisq_p` - The p-value of the chi-square test. Very useful for discarding results with p-value below 0.05. These results you simply cannot trust (they might have happened by chance) and do not have to look at.\n- `cramers_v` - The Cramer's V effect size, based on the chi square statistic and the sample size. Traditionally it should be over 0.1 for small effect, over 0.3 for medium effect and over 0.5 for strong effect, but on language phenomena it will never achieve even medium effect. It is comparable across datasets of different sizes, so if the tool is run on multiple pairs of documents, these effect sizes CAN be used for comparison across datasets.\n- `odds_ratio` - The odds ratio effect size. 
Put simply - it reports how many times the odds of an event are higher in one dataset in comparison to another dataset. It is always higher than 1. This is why `odds_ratio_direction` gives info on the dataset for which the odds of a specific event are higher.\n- `odds_ratio_direction` - The direction of the odds ratio presented previously. If `first`, the odds of the event are greater in the first dataset. If `second`, the odds for this event are higher in the second dataset.\n- `log_likelihood_ratio` - The log-likelihood ratio, as defined by Danning (1993), here mostly for reasons of popularity in the computational linguistics circles.\n\n## Python API\n\n### Use\nThe differ can be used through its python API as follows:\n\n```python\nfrom conlludiff import Differ\n\nd = Differ(\n    \"conllu_files/sl_ssj-ud-train.conllu\",\n    \"conllu_files/sl_sst-ud-train.conllu\",\n    event=\"upos\",\n    filter=0.05,\n    fields=[\n        \"event\",\n        \"cramers_v\",\n        \"odds_ratio\",\n        \"odds_ratio_direction\",\n        \"contingency\"\n    ],\n    order=\"chisq\",\n    reverse=True,\n)\nd.results\n#[\n# {'event': 'INTJ', 'cramers_v': 0.18546269144487967, 'odds_ratio': 205.64922609298688, 'odds_ratio_direction': 'second'},\n# {'event': 'PART', 'cramers_v': 0.09362273156839818, 'odds_ratio': 3.3765821947519883, 'odds_ratio_direction': 'second'},\n# {'event': 'PUNCT', 'cramers_v': 0.0817401329794944, 'odds_ratio': 3.619217699912104, 'odds_ratio_direction': 'first'},\n# {'event': 'ADV', 'cramers_v': 0.0697921632735567, 'odds_ratio': 2.368271631503067, 'odds_ratio_direction': 'second'},\n# {'event': 'NOUN', 'cramers_v': 0.06087356761646711, 'odds_ratio': 1.9182977375177561, 'odds_ratio_direction': 'first'}\n# ...]\n\nd.to_tsv(\"output.tsv\")\n# Writes the data to a tsv, same way as CLI.\n```\n\n\n## Outputs\n\n\nRunning the tool on the exemplary JSON configuration file compares the UPOS dependence between the two files, `sl_sst-ud-train.conllu` and 
`sl_ssj-ud-train.conllu`.\n\nThe output of the tool with the `event` set to `upos`:\n\n```\nevent\tcramers_v\todds_ratio\todds_ratio_direction\nINTJ\t0.18546269144487967\t205.64922609298688\tsecond\nPART\t0.09362273156839818\t3.3765821947519883\tsecond\nPUNCT\t0.0817401329794944\t3.619217699912104\tfirst\nADV\t0.0697921632735567\t2.368271631503067\tsecond\nNOUN\t0.06087356761646711\t1.9182977375177561\tfirst\nX\t0.054704563268060884\t3.730096502268552\tsecond\nADJ\t0.04337452602078068\t1.914494685493001\tfirst\nDET\t0.039058849022439675\t1.8178073753376662\tsecond\nVERB\t0.038179122992568475\t1.5098089234134449\tsecond\nPRON\t0.03023447484586653\t1.6315601187054114\tsecond\nADP\t0.027554631145556924\t1.5029875604157752\tfirst\nCCONJ\t0.020521602435811057\t1.3849776400810214\tsecond\nPROPN\t0.018222766826042337\t1.48950205465959\tfirst\nSCONJ\t0.01516452354505404\t1.3147102075692845\tsecond\nSYM\t0.00788752606312867\t31.431147723995498\tfirst\nNUM\t0.006279927568528497\t1.1841482466696223\tfirst\n```\n\nThe output of the tool if the event is 
`lemma`:\n\n```\nevent\tcramers_v\todds_ratio\todds_ratio_direction\nja\t0.1443680388876384\t316.50392575024387\tsecond\neee\t0.14224954241402205\t9727.90290395421\tsecond\n[gap]\t0.1404103062969733\t9473.866117350688\tsecond\n_\t0.10342084138781253\t5109.162639646662\tsecond\n[name:personal]\t0.08543886760201726\t3486.230522337837\tsecond\n,\t0.0798983928857083\t3017.3227483286178\tfirst\n[pause]\t0.07793562469300712\t2903.1409295352323\tsecond\n\u2026\t0.06957402089144266\t29.21324122737166\tsecond\n.\t0.06417706510165858\t1943.1095197895877\tfirst\nne\t0.06172098673848206\t4.567216172071131\tsecond\nmhm\t0.06135550149322792\t1808.49138820132\tsecond\npa\t0.05533923234690654\t3.5866900542385225\tsecond\n[speaker:laughter]\t0.05267879283333141\t1341.047493947355\tsecond\nno\t0.05236304456192136\t33.68782033850626\tsecond\neem\t0.05177677215351289\t1296.5817666752512\tsecond\nta\t0.04975510679153412\t3.112790024717654\tsecond\naha\t0.04842184761560267\t387.7447093352556\tsecond\n?\t0.047909243941398276\t6.54896239567592\tsecond\n[name:surname]\t0.04599517464931115\t1029.9796674731044\tsecond\npol\t0.04522323386592585\t19.314885381209148\tsecond\nvedeti\t0.044877978271710375\t8.204695698185576\tsecond\n[audience:laughter]\t0.04443254560578941\t963.3805970149255\tsecond\nkaj\t0.04422508295450475\t6.665906246951688\tsecond\njaz\t0.04254326905719024\t4.205848770006655\tsecond\nzdaj\t0.04220703488553522\t8.449570357876933\tsecond\n[:voice]\t0.03937486931297157\t763.7067235968929\tsecond\npa\u010d\t0.03880380764818679\t16.29708140017637\tsecond\nti\t0.0374965082691098\t5.789587920409137\tsecond\na\t0.036544753303008326\t6.611184664673473\tsecond\nre\u010di\t0.03507827440464862\t7.032673208444842\tsecond\ntako\t0.03488349828161137\t3.661944581348252\tsecond\nampak\t0.03300775971108695\t8.776849122671143\tsecond\nmmm\t0.03213178836124324\t519.9118251928021\tsecond\n[incident]\t0.03213178836124324\t519.9118251928021\tsecond\n[all:laughter]\t0.03213178836124324\t519.911825192
8021\tsecond\niti\t0.03203659707760247\t5.261877926755125\tsecond\naja\t0.03063185252461786\t475.61510384536297\tsecond\nmisliti\t0.030313213443565672\t7.600055895339754\tsecond\nka\t0.02979521974817808\t158.53763109191857\tsecond\nful\t0.02745038147121303\t64.78054798745696\tsecond\nimeti\t0.026728261048545494\t2.8695967134026836\tsecond\nzdajle\t0.02646949225144055\t129.01572779605263\tsecond\nen\t0.026425718058774403\t3.8274431810951612\tsecond\ndati\t0.025769048935856378\t6.3530704294133375\tsecond\ntale\t0.025422305095331388\t17.767587428769104\tsecond\nte\t0.024677083578157007\t320.6482861400894\tsecond\nma\t0.024643752142178156\t114.25932778291704\tsecond\ngor\t0.023985218122048756\t28.75423049244371\tsecond\ntam\t0.02388517506596499\t6.532873916176885\tsecond\nsamo\t0.023726743621988657\t4.152648893958719\tsecond\noni\t0.02313889520888621\t19.023543080403044\tsecond\naaa\t0.02269213397252964\t276.39252864703764\tsecond\n...\n```\n\nThe output of the tool if the event is `feat`:\n\n```\nevent\tcramers_v\todds_ratio\todds_ratio_direction\nPerson:1\t0.04939558517684953\t3.5638167542809853\tsecond\nPerson:2\t0.047040110877374676\t5.079993584778688\tsecond\nPronType:Dem\t0.037175859411220154\t3.245197490851571\tsecond\nCase:Gen\t0.03144647693695685\t2.172092243237678\tfirst\nNumForm:Word\t0.028272960330824267\t3.440339072082062\tsecond\nVerbForm:Fin\t0.0267587007742558\t1.5225701489232548\tsecond\nMood:Ind\t0.02504649601144956\t1.5056171393724438\tsecond\nGender:Masc\t0.025042316104147304\t1.4748890960035126\tfirst\nReflex:Yes\t0.024416673107363094\t699.5006025688363\tfirst\nTense:Pres\t0.023517648140906504\t1.4896134066798563\tsecond\nNumForm:Digit\t0.022483697118357206\t593.4206504177673\tfirst\nCase:Loc\t0.02179297260457836\t1.7083552918624372\tfirst\nGender:Fem\t0.02129852556496575\t1.4327016742281735\tfirst\nPronType:Int\t0.019116864692380664\t2.8254802331440025\tsecond\nMood:Imp\t0.018378570965450702\t3.791997651968869\tsecond\nCase:Ins\t0.01830885480904435
2\t2.0816886275516455\tfirst\nAspect:Imp\t0.01595530651024542\t1.491526932928539\tsecond\nPolarity:Pos\t0.015389343283813696\t1.3931175980723853\tsecond\nNumber:Plur\t0.015101686601082178\t1.2937540626185027\tfirst\nPronType:Prs\t0.012840220897127438\t1.391913401310955\tsecond\nPolarity:Neg\t0.010467835392964546\t1.6218991635410638\tsecond\nAbbr:Yes\t0.009174573465203728\t100.64168811481056\tfirst\nTense:Fut\t0.008659708802953942\t1.6695810431815645\tsecond\nPoss:Yes\t0.008622929553196686\t1.8430908774898307\tfirst\nPronType:Ind\t0.007988560895972635\t1.5545205160767552\tsecond\nVerbForm:Sup\t0.007606299226342561\t6.994685959408413\tsecond\nVerbForm:Part\t0.007588183914607259\t1.2152331908399183\tfirst\nPronType:Neg\t0.0072933899363323805\t2.492256128366947\tsecond\nGender[psor]:Masc\t0.006270434086374932\t7.047436934055062\tfirst\nDegree:Pos\t0.006204462420882378\t1.1017059175543398\tsecond\nDefinite:Def\t0.006094185561369171\t1.6136203228373713\tfirst\nCase:Dat\t0.005207882750413297\t1.296449710663011\tfirst\nNumType:Ord\t0.004907024006001947\t1.6690069108717716\tfirst\nGender[psor]:Fem\t0.0043045691241584165\t4.971538958295724\tfirst\nNumber:Sing\t0.0042501886679687595\t1.0445633278559616\tfirst\nNumber[psor]:Sing\t0.004168042039340886\t1.7214890333000743\tfirst\nVariant:Short\t0.004116420078423952\t1.15773584843268\tsecond\nNumber[psor]:Plur\t0.003520769896614546\t1.4800187255151183\tsecond\nAnimacy:Anim\t0.0031959140489293135\t1.609703528865278\tfirst\nNumber:Dual\t0.003119169910226467\t1.1902942252586413\tsecond\nDegree:Sup\t0.0028717358191274397\t1.4607040532750566\tfirst\nNumForm:Roman\t0.002735775335474741\t10.900416396883163\tfirst\nDefinite:Ind\t0.0027015758531109017\t1.230778104542293\tfirst\nNumType:Mult\t0.002586134214293405\t5.35236774349564\tsecond\n```\n\nOutput of the tool if the event is 
`deprel`:\n\n```\nevent\tcramers_v\todds_ratio\todds_ratio_direction\ndiscourse\t0.1995606615688087\t75.78024049293329\tsecond\ndiscourse:filler\t0.1545168231515424\t11514.411314984709\tsecond\nreparandum\t0.14274706919825084\t9797.236607142857\tsecond\npunct\t0.08173482846847689\t3.6189527899648515\tfirst\nparataxis:discourse\t0.08119452932918503\t3149.6412683633353\tsecond\nroot\t0.06758699304934265\t2.238339638285662\tsecond\nadvmod\t0.06730374982664965\t2.026798690361553\tsecond\nnmod\t0.05472055507805954\t3.295046424712571\tfirst\nparataxis:restart\t0.05222972721988518\t1318.8134851138354\tsecond\namod\t0.047244641439906455\t2.3917425656320344\tfirst\nconj:extend\t0.04225937967772594\t874.6138211382115\tsecond\nparataxis\t0.0398717788389968\t2.265339381899786\tsecond\ndislocated\t0.035933307354648995\t47.53133350886273\tsecond\ncase\t0.030560438744177788\t1.5760166637679451\tfirst\nvocative\t0.029795832802305106\t9.331577965730597\tsecond\nfixed\t0.024153593693035522\t2.418883738805336\tsecond\nobl\t0.0214600136068333\t1.4392252992918104\tfirst\nconj\t0.01946154708029554\t1.509069018715358\tfirst\nlist\t0.014859841032733846\t105.90846429170628\tfirst\nnummod\t0.013237722191482327\t1.5856577201396964\tfirst\nflat\t0.01226834979992188\t2.865560442489671\tsecond\nccomp\t0.012109670762191542\t1.5403849395768785\tsecond\nappos\t0.01192489066635104\t2.0258632065250364\tfirst\nmark\t0.011210811993208638\t1.2446775351581985\tsecond\nacl\t0.011205361560489591\t1.4731494732818564\tfirst\norphan\t0.009752798810071982\t2.3358535915763845\tfirst\ncc\t0.008654128343704083\t1.1770939880197218\tsecond\nadvcl\t0.007143729012211228\t1.284389185911365\tsecond\nflat:name\t0.006644743576550729\t1.417384017797551\tfirst\ndep\t0.006150595406649066\t4.385886921540406\tfirst\naux\t0.005735206555028642\t1.1100447817440058\tfirst\n```\n\nOutput of the tool if the event is 
`deprel+head_deprel`:\n\n```\ndiscourse_root\t0.18202736862729124\t76.9150923698931\tsecond\ndiscourse:filler_root\t0.10398408766447145\t4883.512041884816\tsecond\ndiscourse_parataxis\t0.08310549743890455\t66.31108288242072\tsecond\nreparandum_root\t0.07203903218914691\t2350.031683626272\tsecond\nparataxis:discourse_root\t0.0654853384402872\t1946.6054409980939\tsecond\nadvmod_root\t0.06281476752222244\t2.557906542509336\tsecond\ncc_root\t0.055707328412306335\t6.9863965946940025\tsecond\ndiscourse:filler_parataxis\t0.052954640088375506\t1283.8629836802952\tsecond\nadvmod_parataxis\t0.0494810299649954\t3.0164190203770955\tsecond\nadvmod_reparandum\t0.04878776582774138\t1094.9009164793358\tsecond\nparataxis:restart_root\t0.04824164203380061\t1071.2929106628242\tsecond\nmark_root\t0.04787571436102816\t12.4166781015429\tsecond\ndiscourse:filler_conj\t0.04656491000281119\t1000.4852186941738\tsecond\ncc_conj:extend\t0.0454127571242677\t953.2936905790837\tsecond\nadvmod_parataxis:restart\t0.044825592848028915\t929.7020050702926\tsecond\nparataxis_root\t0.04359034032395506\t2.5377329342701413\tsecond\nreparandum_advmod\t0.042395817338828114\t835.3624423963133\tsecond\nadvmod_parataxis:discourse\t0.04112721883273275\t788.2089621011404\tsecond\nreparandum_parataxis\t0.04112721883273275\t788.2089621011404\tsecond\npunct_acl\t0.040188438192249216\t81.79756089838132\tfirst\ncase_root\t0.039176461994083936\t11.178670793434415\tsecond\nparataxis_parataxis:restart\t0.038464927462689126\t693.9345848209144\tsecond\nparataxis:discourse_parataxis\t0.038464927462689126\t693.9345848209144\tsecond\ndiscourse:filler_obl\t0.03777009901119597\t670.3727759543962\tsecond\ndiscourse:filler_obj\t0.03777009901119597\t670.3727759543962\tsecond\nreparandum_reparandum\t0.03634068952927306\t623.2572974840232\tsecond\ndiscourse_conj\t0.03560451757262082\t599.7036269430051\tsecond\nfixed_discourse\t0.03489266792523979\t74.48440528972571\tsecond\ndiscourse:filler_acl\t0.03485283496868408\t576.15266824017
03\tsecond\nreparandum_nsubj\t0.03485283496868408\t576.1526682401703\tsecond\npunct_conj\t0.0343037374311866\t14.76598098276567\tfirst\nnsubj_parataxis:restart\t0.03408461936979407\t552.604420907207\tsecond\ndiscourse:filler_nsubj\t0.03408461936979407\t552.604420907207\tsecond\namod_nmod\t0.03357833694236596\t4.584717922597381\tfirst\nobl_parataxis:restart\t0.03329873115421483\t529.0588844759109\tsecond\nreparandum_mark\t0.03249389371815959\t505.51605847818576\tsecond\npunct_reparandum\t0.03249389371815959\t505.51605847818576\tsecond\ndiscourse:filler_ccomp\t0.03249389371815959\t505.51605847818576\tsecond\nreparandum_ccomp\t0.03249389371815959\t505.51605847818576\tsecond\ndislocated_root\t0.032099983181928804\t47.94146069256899\tsecond\nnmod_nmod\t0.03179350017167992\t6.360152211571991\tfirst\nreparandum_obl\t0.03166866930185818\t481.9759424460431\tsecond\nconj:extend_root\t0.030821429049221256\t458.4385359116022\tsecond\ndiscourse:filler_parataxis:restart\t0.030821429049221256\t458.4385359116022\tsecond\npunct_root\t0.03071311098406396\t1.8745367606973529\tfirst\nmark_reparandum\t0.029950315528831664\t434.90383840708984\tsecond\nreparandum_case\t0.029950315528831664\t434.90383840708984\tsecond\nreparandum_amod\t0.029950315528831664\t434.90383840708984\tsecond\npunct_parataxis:restart\t0.029950315528831664\t434.90383840708984\tsecond\npunct_appos\t0.02980646118765951\t133.54577644396244\tfirst\ndiscourse:filler_nmod\t0.02905319526331687\t411.3718494648406\tsecond\ncc_parataxis\t0.02872166122883359\t22.302391238742672\tsecond\nreparandum_advcl\t0.028127597813434026\t387.8425686172967\tsecond\n...\n```\n# Notes for developers\n\n## Building and publishing\n```\ncd conllu-diff/conlludiff\n# Bump version when done:\nbumpver update --patch # or --minor or --major\npython -m build\ntwine check dist/*\n# test upload:\ntwine upload --verbose -r testpypi dist/*\n# real upload:\ntwine upload --verbose dist/*\n```\n\n## Testing\n```\ncd conllu-diff/conlludiff/tests\npytest 
-vv\n```\n",
    "bugtrack_url": null,
    "license": "Apache License Version 2.0, January 2004 http://www.apache.org/licenses/  TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION  1. Definitions.  \"License\" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.  \"Licensor\" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.  \"Legal Entity\" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, \"control\" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.  \"You\" (or \"Your\") shall mean an individual or Legal Entity exercising permissions granted by this License.  \"Source\" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.  \"Object\" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.  \"Work\" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).  \"Derivative Works\" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. 
For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.  \"Contribution\" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, \"submitted\" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as \"Not a Contribution.\"  \"Contributor\" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.  2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.  3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.  4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:  (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and  (b) You must cause any modified files to carry prominent notices stating that You changed the files; and  (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and  (d) If the Work includes a \"NOTICE\" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of 
the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.  You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.  5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.  6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.  7. Disclaimer of Warranty. 
Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.  8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.  9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.  END OF TERMS AND CONDITIONS  APPENDIX: How to apply the Apache License to your work. 
 To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets \"[]\" replaced with your own identifying information. (Don't include the brackets!)  The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same \"printed page\" as the copyright notice for easier identification within third-party archives.  Copyright [yyyy] [name of copyright owner]  Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at  http://www.apache.org/licenses/LICENSE-2.0  Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. ",
    "summary": "Analyze two CONLLU files",
    "version": "0.0.5",
    "project_urls": {
        "Homepage": "https://github.com/clarinsi/conllu-diff"
    },
    "split_keywords": [
        "conlludiff",
        "conllu",
        "conll"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2d2f5262b432e839172cc3a50933d1435538af16bfac6124de413ba4fd288758",
                "md5": "80ecc09f1cca6d1bf44657a2aa0c8b3a",
                "sha256": "706997c031f1bc9822c67097bc6cc8dc128c79d547bb17e7b619ff0acdb605b0"
            },
            "downloads": -1,
            "filename": "conlludiff-0.0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "80ecc09f1cca6d1bf44657a2aa0c8b3a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 21415,
            "upload_time": "2024-04-02T08:49:41",
            "upload_time_iso_8601": "2024-04-02T08:49:41.471388Z",
            "url": "https://files.pythonhosted.org/packages/2d/2f/5262b432e839172cc3a50933d1435538af16bfac6124de413ba4fd288758/conlludiff-0.0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "417f30c3f4426fb9b169614efaa84d2681b2493fc0896ff9a880787ddccbec13",
                "md5": "0e8d2932ba40532b60b61dcd800ac026",
                "sha256": "899d81f0486503da663a8d4fe6357e4209eba25cb09b047555715446851550cd"
            },
            "downloads": -1,
            "filename": "conlludiff-0.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "0e8d2932ba40532b60b61dcd800ac026",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 29147,
            "upload_time": "2024-04-02T08:49:42",
            "upload_time_iso_8601": "2024-04-02T08:49:42.799466Z",
            "url": "https://files.pythonhosted.org/packages/41/7f/30c3f4426fb9b169614efaa84d2681b2493fc0896ff9a880787ddccbec13/conlludiff-0.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-02 08:49:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "clarinsi",
    "github_project": "conllu-diff",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "conlludiff"
}
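
The sample outputs in the description above are tab-separated, with the columns `event`, `cramers_v`, `odds_ratio` and `odds_ratio_direction`. As a minimal sketch of post-processing such output (not part of conlludiff itself), the snippet below parses a few rows copied from the `feat` example and keeps the events whose direction column matches a chosen value and whose Cramér's V is above a threshold. The helper names `load_rows` and `filter_rows` and the threshold are hypothetical; a real run would read the tool's saved output instead of the inline sample.

```python
import csv
import io

# Rows copied from the `feat` sample output; a real run would read the
# tool's output from a file instead of this inline string.
SAMPLE = (
    "event\tcramers_v\todds_ratio\todds_ratio_direction\n"
    "Person:1\t0.04939558517684953\t3.5638167542809853\tsecond\n"
    "Case:Gen\t0.03144647693695685\t2.172092243237678\tfirst\n"
    "Reflex:Yes\t0.024416673107363094\t699.5006025688363\tfirst\n"
)


def load_rows(text):
    """Parse the tab-separated output into a list of typed dicts (hypothetical helper)."""
    # QUOTE_NONE keeps events such as a literal quote character intact.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        {
            "event": row["event"],
            "cramers_v": float(row["cramers_v"]),
            "odds_ratio": float(row["odds_ratio"]),
            "direction": row["odds_ratio_direction"],
        }
        for row in reader
    ]


def filter_rows(rows, direction, min_v=0.0):
    """Keep rows with the given direction value and cramers_v >= min_v (hypothetical helper)."""
    return [r for r in rows if r["direction"] == direction and r["cramers_v"] >= min_v]


rows = load_rows(SAMPLE)
selected = filter_rows(rows, "first", min_v=0.03)
print([r["event"] for r in selected])
```

Parsing the numeric columns as floats up front keeps later sorting or thresholding straightforward; the direction values are compared as plain strings exactly as they appear in the output.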
        