* Name: zensols.calamr
* Version: 0.1.0
* Home page: https://github.com/plandes/calamr
* Summary: CALAMR: Component ALignment for AMR
* Upload time: 2024-05-19 22:12:26
* Author: Paul Landes
* Keywords: tooling

# CALAMR: Component ALignment for Abstract Meaning Representation

[![PyPI][pypi-badge]][pypi-link]
[![Python 3.10][python310-badge]][python310-link]
[![Python 3.11][python311-badge]][python311-link]

This repository contains code for the paper [CALAMR: Component ALignment for
Abstract Meaning Representation] and aligns the components of a bipartite
source and summary AMR graph.  To reproduce the results of the paper, see the
[paper repository](https://github.com/uic-nlp-lab/calamr).

The results are useful as a semantic graph similarity score (like SMATCH) or to
find the summarized portion (as AMR nodes, edges and subgraphs) of a document
or the portion of the source that represents the summary.  If you use this
library or the [PropBank API/curated database], please [cite](#citation) our
paper.

Features:

* Align source/summary AMR graphs.
* Scores for the extent to which AMRs are summarized or represented in their
  source text.
* Rendering of the alignments.
* Support for four AMR [corpora](#corpora).

<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc -->
## Table of Contents

- [Documentation](#documentation)
- [Obtaining](#obtaining)
- [Corpora](#corpora)
- [Usage](#usage)
    - [Command Line](#command-line)
        - [Aligning Corpus Documents](#aligning-corpus-documents)
        - [Ad hoc Corpora](#ad-hoc-corpora)
    - [AMR Release 3.0 Corpus (LDC2020T02)](#amr-release-30-corpus-ldc2020t02)
    - [API](#api)
    - [Docker](#docker)
- [Example Graphs](#example-graphs)
    - [GraphViz](#graphviz)
        - [The Nascent Graph (with flow data)](#the-nascent-graph-with-flow-data)
        - [The Source Graph](#the-source-graph)
    - [Plotly](#plotly)
- [Attribution](#attribution)
- [Citation](#citation)
- [Changelog](#changelog)
- [License](#license)

<!-- markdown-toc end -->



## Documentation

The recommended reading order for this project:

1. The abstract and introduction of the paper [CALAMR: Component ALignment for
   Abstract Meaning Representation]
1. [Overview and implementation guide](https://plandes.github.io/calamr/doc/CalamrImplementGuide.pdf)
1. [Full documentation](https://plandes.github.io/calamr/index.html)
1. [API reference](https://plandes.github.io/calamr/api.html)


## Obtaining

The library can be installed with pip from the [pypi] repository:
```bash
pip3 install zensols.calamr
```
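
After installing, it can be useful to confirm which version is present. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is illustrative and not part of the package:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = 'zensols.calamr') -> Optional[str]:
    """Return the installed version of ``dist_name`` or ``None`` if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == '__main__':
    found = installed_version()
    print(found if found is not None else 'zensols.calamr is not installed')
```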


## Corpora

This repository contains code to support the following corpora with
source/summary AMR for alignment:

* [LDC2020T02] (AMR Release 3.0) Proxy Corpus
* [ISI] Little Prince
* [ISI] Bio AMR
* A [micro corpus] (toy corpus) used in the paper examples and [usage](#usage).


## Usage

The command-line tool and API do not depend on the repository.  However, the
repository has a template configuration file that both the CLI and the API
use.  The examples also use data in the repository.  Do the following to get
started:

1. Clone this repository and change the working directory to it:
   ```bash
   git clone https://github.com/plandes/calamr && cd calamr
   ```
1. Copy the resource file:
   ```bash
   cp src/config/dot-calamrrc ~/.calamrrc
   ```


### Command Line

The steps below show how to use the command-line tool.  First set up the
application environment:

1. Edit the `~/.calamrrc` file to choose the corpus and visualization.  Keep
   `calamr_corpus` set to `adhoc` for these examples.  (Note that you can
   also set the `CALAMRRC` environment variable to a file in a different
   location if you prefer.)
1. Create the micro corpus:
   ```bash
   calamr mkadhoc --corpusfile corpus/micro/source.json
   ```
1. Print the document keys of the corpus:
   ```bash
   calamr keys
   ```
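
The configuration lookup described in the first step can be sketched as follows. The precedence shown (the `CALAMRRC` variable first, then `~/.calamrrc`) is an assumption made for illustration, not the tool's confirmed resolution logic, and `resolve_calamrrc` is a hypothetical helper:

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_calamrrc(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the config path: ``$CALAMRRC`` if set, else ``~/.calamrrc``.

    The precedence is assumed for illustration.
    """
    env = os.environ if env is None else env
    override = env.get('CALAMRRC')
    if override:
        return Path(override).expanduser()
    return Path.home() / '.calamrrc'


if __name__ == '__main__':
    path = resolve_calamrrc()
    print(f'{path} (exists: {path.is_file()})')
```

Passing the environment as a parameter keeps the helper easy to check without modifying the real process environment.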


#### Aligning Corpus Documents

AMR corpora that distinguish between source and summary documents are needed so
the API knows what data to align.  The following examples utilize preexisting
corpora (including the last section's micro corpus):

1. Generate the Liu et al. graph for the micro corpus in directory `example`:
	```bash
	calamr aligncorp liu-example -f txt -d example
	```
1. Force the *Little Prince* AMR corpus download and confirm success with the
   single document key `1943`:
	```bash
	calamr keys --override=calamr_corpus.name=little-prince
	```
1. Set the default AMR parser to [SPRING] (Bevilacqua et al. 2021) and use it
   to parse sentence text extracted from the *Little Prince* AMR corpus:
	```bash
	calamr penman -d lp.txt --limit 5 \
		--override amr_default.parse_model=spring \
		~/.cache/calamr/corpus/amr-rel/amr-bank-struct-v3.0.txt
	```
1. Score the parsed sentences using CALAMR, SMATCH and WLK:
	```bash
	calamr score --parsed lp.txt \
		--methods calamr,smatch,wlk \
		~/.cache/calamr/corpus/amr-rel/amr-bank-struct-v3.0.txt
	```
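
When scripting the commands above, building the argument list in Python avoids shell quoting issues. The sketch below only assembles arguments that already appear in this section and does not run anything; `calamr_argv` is a hypothetical helper, not part of the package:

```python
import shlex
from typing import List, Optional


def calamr_argv(action: str, *args: str,
                override: Optional[str] = None) -> List[str]:
    """Build an argument vector for the ``calamr`` command-line tool."""
    argv = ['calamr', action, *args]
    if override is not None:
        # same form as used above: --override=<section>.<option>=<value>
        argv.append(f'--override={override}')
    return argv


if __name__ == '__main__':
    argv = calamr_argv('keys', override='calamr_corpus.name=little-prince')
    # print a shell-safe rendering of the command
    print(shlex.join(argv))
```

The resulting list can be handed to `subprocess.run` once the `calamr` executable is installed.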


#### Ad hoc Corpora

The [micro corpus] can be edited and rebuilt to add your own data to be
aligned.  However, there's an easier way to align ad hoc documents.

1. Align a summarized document not included in any corpus.  First create the
   annotated documents as the file `short-story.json`:
   ```json
   [
	   {
		   "id": "intro",
		   "body": "The Dow Jones Industrial Average and other major indexes pared losses.",
		   "summary": "Dow Jones and other major indexes reduced losses."
	   },
	   {
		   "id": "dow-stats",
		   "body": "The Dow ended 0.5% lower on Friday while the S&P 500 fell 0.7%. Among the S&P sectors, energy and utilities gained while technology and communication services lagged.",
		   "summary": "Dow sank 0.5%, S&P 500 lost 0.7% and energy, utilities up, tech, comms came down."
	   }
   ]
   ```
   Now align the documents using the `XFM Bart Base` AMR parser, rendering
   with the maximum number of steps (`-r 10`), and save results to `example`:
	```bash
	calamr align short-story.json --override amr_default.parse_model=xfm_bart_base -r 10 -d example -f txt
	```

The `-r` option controls how many intermediate graphs are generated to show the
iteration of the algorithm over all the steps (see the paper for details).
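
The `short-story.json` file can also be generated programmatically. This is a minimal sketch, assuming only the three fields shown in the JSON example (`id`, `body`, `summary`); `write_adhoc_corpus` is a hypothetical helper and its validation is illustrative rather than what `calamr` itself enforces:

```python
import json
from pathlib import Path
from typing import Dict, List

# field names taken from the JSON example above
REQUIRED_KEYS = ('id', 'body', 'summary')


def write_adhoc_corpus(docs: List[Dict[str, str]], path: Path) -> None:
    """Check each document has the expected fields and write them as JSON."""
    for i, doc in enumerate(docs):
        missing = [k for k in REQUIRED_KEYS if not doc.get(k)]
        if missing:
            raise ValueError(f'document {i} is missing fields: {missing}')
    path.write_text(json.dumps(docs, indent=4), encoding='utf-8')


if __name__ == '__main__':
    docs = [{
        'id': 'intro',
        'body': ('The Dow Jones Industrial Average and other major '
                 'indexes pared losses.'),
        'summary': 'Dow Jones and other major indexes reduced losses.',
    }]
    write_adhoc_corpus(docs, Path('short-story.json'))
```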


### AMR Release 3.0 Corpus (LDC2020T02)

If you are using the AMR 3.0 corpus, a preprocessing step must be run before
it can be used.

The Proxy Report corpus from AMR 3.0 does not provide both the `alignments`
(text-to-graph alignments) and `snt-type` (indicates if a sentence is part of
the source or the summary) metadata in one dataset.  By default, this API
expects both.  To merge them into one dataset, do the following:

1. [Obtain or purchase](https://catalog.ldc.upenn.edu/LDC2020T02) the corpus.
1. Move the file where the software can find it:
   ```bash
   mkdir -p ~/.cache/calamr/download
   cp /path/to/amr_annotation_3.0_LDC2020T02.tgz ~/.cache/calamr/download
   ```
1. Merge the alignments and sentence descriptors:
   ```bash
   ./src/bin/merge-proxy-anons.py
   ```
1. Confirm the merge was successful by printing the document keys and aligning a report:
   ```bash
   calamr keys --override=calamr_corpus.name=proxy-report
   calamr aligncorp 20041010_0024 -f txt -d example \
	   --override calamr_corpus.name=proxy-report
   ```
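
The staging step above (create the download directory, then copy the archive) can also be done from Python. This is a sketch under the assumption that the directory layout is exactly as shown in the shell commands; `stage_corpus_archive` is a hypothetical helper, and the archive path is the same placeholder used above:

```python
import shutil
from pathlib import Path


def stage_corpus_archive(archive: Path, download_dir: Path) -> Path:
    """Copy ``archive`` into ``download_dir``, creating it if needed."""
    if not archive.is_file():
        raise FileNotFoundError(f'corpus archive not found: {archive}')
    # like ``mkdir -p``: create parents and tolerate an existing directory
    download_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(archive, download_dir))


if __name__ == '__main__':
    src = Path('/path/to/amr_annotation_3.0_LDC2020T02.tgz')
    dst = Path.home() / '.cache' / 'calamr' / 'download'
    if src.is_file():
        print(f'staged: {stage_corpus_archive(src, dst)}')
    else:
        print(f'place the LDC archive at {src} first')
```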

### API

To use the package programmatically:

1. Get the resource bundle:
   ```python
   from pathlib import Path
   from zensols.amr import AmrFeatureDocument
   from zensols.calamr import DocumentGraph, Resource, ApplicationFactory

   # get the resource bundle
   res: Resource = ApplicationFactory.get_resource()
   ```
1. Get the Liu et al. AMR feature document example and print it:
   ```python
   doc: AmrFeatureDocument = res.get_corpus_document('liu-example')
   doc.write()
   ```
   output:
   ```yaml
   [T]: Joe's dog was chasing a cat in the garden. I saw Joe's dog, which was running in the garden. The dog was chasing a cat.
   sentences:
       [N]: Joe's dog was chasing a cat in the garden.
           (c0 / chase-01~e.4
                 :location (g0 / garden~e.9)
                 :ARG0 (d0 / dog~e.2
                       :poss (p0 / person
                             :name (n0 / name
                                   :op1 "Joe"~e.0)))
                 :ARG1 (c1 / cat~e.6))
   .
   .
   .
   amr:
       summary:
           Joe's dog was chasing a cat in the garden.
       sections:
           no section sentences
               I saw Joe's dog, which was running in the garden.
               The dog was chasing a cat.
   ```
1. Align (if not already aligned and cached) and get the flow results of the example:
   ```python
   flow = res.align_corpus_document('liu-example')
   flow.write()
   ```
   output:
   ```yaml
   summary:
       Joe's dog was chasing a cat in the garden.
   sections:
       no section sentences
           I saw Joe's dog, which was running in the garden.
           The dog was chasing a cat.
   statistics:
       agg:
           aligned_portion_hmean: 0.8695652173913044
           mean_flow: 0.7131309357900468
           tot_alignable: 21
           tot_aligned: 18
           aligned_portion: 0.8571428571428571
           reentrancies: 0
   ```
1. Parse the first document from the [ad hoc JSON file](#ad-hoc-corpora), align
   it, and give its statistics:
   ```python
   doc: AmrFeatureDocument = next(iter(res.parse_documents(Path('short-story.json'))))
   graph: DocumentGraph = res.create_graph(doc)
   flow = res.align(graph)
   flow.write()
   ```
   output:
   ```yaml
   summary:
       Dow Jones and other major indexes reduced losses.
   sections:
       no section sentences
           The Dow Jones Industrial Average and other major indexes pared losses.
   statistics:
       agg:
           aligned_portion_hmean: 1.0
           mean_flow: 0.9269955839429582
           tot_alignable: 24
           tot_aligned: 24
           aligned_portion: 1.0
           reentrancies: 0
    ...
	```
1. Render the results of a flow:
   ```python
   flow = res.align_corpus_document('liu-example')
   flow.render()
   ```
1. Render all graphs of the flow results to the directory `example`:
   ```python
   flow.render(
       contexts=flow.get_render_contexts(include_nascent=True),
       directory=Path('example'),
       display=False)
   ```
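
The aggregate statistics printed in the outputs above can be sanity-checked by hand. In both outputs, `aligned_portion` matches `tot_aligned / tot_alignable`; this is an observation from the numbers shown, not a documented definition, and the `aligned_portion` function below is a hypothetical helper:

```python
def aligned_portion(tot_aligned: int, tot_alignable: int) -> float:
    """Ratio of aligned to alignable components (assumed definition)."""
    if tot_alignable <= 0:
        raise ValueError('tot_alignable must be positive')
    return tot_aligned / tot_alignable


if __name__ == '__main__':
    # values copied from the two `flow.write()` outputs above
    print(aligned_portion(18, 21))  # Liu et al. example
    print(aligned_portion(24, 24))  # first ad hoc document
```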

### Docker

A stand-alone Docker image is also available (see [CALAMR Docker
image](./docker)).  This [docker image] provides a stand-alone container with
all models, configuration and the adhoc micro corpus installed.


## Example Graphs

The Liu et al. example graphs were created from the last step of the
[API](#api) examples, which is equivalent to the first step of the [command
line example](#aligning-corpus-documents).


### GraphViz

To create these graphs, set your `~/.calamrrc` configuration to:

```ini
[calamr_default]
renderer = graphviz
```

#### The Nascent Graph (with flow data)

<p align="center">
	<img src="./doc/graphs/liu-nascent-graphviz.svg"
		alt="nascent graph" width="80%"
		style="outline: 5px solid #D3D3D3;"/>
</p>


#### The Source Graph

<p align="center">
	<img src="./doc/graphs/liu-source-graphviz.svg"
		alt="source graph" width="90%"
		style="outline: 5px solid #D3D3D3;"/>
</p>


### Plotly

To create these graphs, set your `~/.calamrrc` configuration to:

```ini
[calamr_default]
renderer = plotly
```
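
The renderer settings shown in this section are plain INI fragments, so switching between them can be scripted. The following is a sketch under stated assumptions: it operates on a fragment like the ones above, `set_renderer` is a hypothetical helper, and Python's `ConfigParser` drops comments when writing, so it may not be suitable for rewriting a full `~/.calamrrc` in place:

```python
from configparser import ConfigParser
from io import StringIO


def set_renderer(ini_text: str, renderer: str) -> str:
    """Return ``ini_text`` with ``[calamr_default] renderer`` set."""
    # interpolation disabled so values containing '%' or '$' pass through
    parser = ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section('calamr_default'):
        parser.add_section('calamr_default')
    parser.set('calamr_default', 'renderer', renderer)
    out = StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == '__main__':
    print(set_renderer('[calamr_default]\nrenderer = graphviz\n', 'plotly'))
```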

See the [interactive version](https://plandes.github.io/calamr/doc/graphs/liu-source-plotly.html).
[<img src="./doc/graphs/liu-source-plotly-screenshot.png">](https://plandes.github.io/calamr/doc/graphs/liu-source-plotly.html)


## Attribution

This project, or reference model code, uses:

* Python 3.11
* [amrlib] for AMR parsing.
* [amr_coref] for AMR co-reference
* [zensols.amr] for AMR features and summarization data structures.
* [Sentence-BERT] embeddings
* [zensols.propbankdb] and [zensols.deepnlp] for PropBank embeddings
* [zensols.nlparse] for natural language features and [NLP scoring]
* [Smatch] (Cai and Knight 2013) and [WLK] (Opitz et al. 2021) for scoring.


## Citation

If you use this project in your research, please use the following BibTeX entry:

```bibtex
@inproceedings{landes-di-eugenio-2024-calamr-component,
    title = "{CALAMR}: Component {AL}ignment for {A}bstract {M}eaning {R}epresentation",
    author = "Landes, Paul  and
      Di Eugenio, Barbara",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italy",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.236",
    pages = "2622--2637"
}
```


## Changelog

An extensive changelog is available [here](CHANGELOG.md).


## License

[MIT License](LICENSE.md)

Copyright (c) 2023 - 2024 Paul Landes


<!-- links -->
[pypi]: https://pypi.org/project/zensols.calamr/
[pypi-link]: https://pypi.python.org/pypi/zensols.calamr
[pypi-badge]: https://img.shields.io/pypi/v/zensols.calamr.svg
[python310-badge]: https://img.shields.io/badge/python-3.10-blue.svg
[python310-link]: https://www.python.org/downloads/release/python-3100
[python311-badge]: https://img.shields.io/badge/python-3.11-blue.svg
[python311-link]: https://www.python.org/downloads/release/python-3110

[micro corpus]: corpus/micro/source.json
[LDC2020T02]: https://catalog.ldc.upenn.edu/LDC2020T02
[SPRING]: https://github.com/SapienzaNLP/spring
[CALAMR: Component ALignment for Abstract Meaning Representation]: https://aclanthology.org/2024.lrec-main.236
[ISI]: https://amr.isi.edu

[amrlib]: https://github.com/bjascob/amrlib
[amr_coref]: https://github.com/bjascob/amr_coref
[spaCy]: https://spacy.io
[Smatch]: https://github.com/snowblink14/smatch
[WLK]: https://github.com/flipz357/weisfeiler-leman-amr-metrics
[zensols.nlparse]: https://github.com/plandes/nlparse
[NLP scoring]: https://plandes.github.io/nlparse/api/zensols.nlp.html#zensols-nlp-score
[Sentence-BERT]: https://www.sbert.net
[docker image]: https://hub.docker.com/r/plandes/calamr
[zensols.amr]: https://github.com/plandes/amr
[zensols.deepnlp]: https://github.com/plandes/deepnlp
[zensols.propbankdb]: https://github.com/plandes/propbankdb
[PropBank API/curated database]: https://github.com/plandes/propbankdb

            
