Spacy-to-FoLiA
===================

:Name: Spacy2FoLiA
:Version: 0.3.4
:Home page: https://proycon.github.io/folia
:Summary: Library that adds FoLiA (format for linguistic annotation) support to spaCy
:Upload time: 2024-02-27 21:45:47
:Author: Maarten van Gompel
:License: GPL
:Keywords: nlp, computational_linguistics, spacy, linguistics, toolkit, folia

.. image:: https://travis-ci.com/proycon/foliapy.svg?branch=master
    :target: https://travis-ci.com/proycon/spacy2folia

.. image:: http://applejack.science.ru.nl/lamabadge.php/spacy2folia
   :target: http://applejack.science.ru.nl/languagemachines/

Converts spaCy output to `FoLiA XML <https://proycon.github.io/folia>`_ documents. FoLiA documents are also accepted as input.

Installation
--------------

``$ pip install spacy2folia``

You also need to install the spaCy models you want to use, for example:

``python -m spacy download en_core_web_sm``

Usage Example
----------------

Using the command line tool on an input file named ``test.txt``:

``$ spacy2folia --model en_core_web_sm test.txt``

This results in a document ``test.folia.xml`` in the current working directory.
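The naming in the example above (``test.txt`` becomes ``test.folia.xml``) can be sketched as a small helper. This is only an illustration: ``expected_output_name`` is a hypothetical function, not part of spacy2folia, and the rule it applies (strip a ``.txt`` or ``.folia.xml`` extension, then append ``.folia.xml``) is an assumption inferred from the examples in this README.

.. code:: python

   from pathlib import Path


   def expected_output_name(input_path: str) -> str:
       """Guess the output filename for an input file (assumed naming rule)."""
       name = Path(input_path).name  # drop any directory part
       # Strip a known extension; the longer suffix is checked first.
       for suffix in (".folia.xml", ".txt"):
           if name.endswith(suffix):
               name = name[: -len(suffix)]
               break
       return name + ".folia.xml"


   print(expected_output_name("test.txt"))                   # test.folia.xml
   print(expected_output_name("corpus/document.folia.xml"))  # document.folia.xml
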

You can also invoke the command line tool on one or more FoLiA documents as input:

``$ spacy2folia --model en_core_web_sm document.folia.xml``

The output file will be written to the current working directory, so it may overwrite the input if the input is in
that same directory!
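One way to avoid overwriting the input is to run the tool from a separate output directory. The following is a minimal sketch under that assumption: ``plan_run`` is a hypothetical helper, not part of spacy2folia, and it uses only the ``--model`` option shown above. It refuses to proceed if any input file lives in the chosen output directory.

.. code:: python

   from pathlib import Path


   def plan_run(model, inputs, outdir):
       """Build the command line and working directory for a safe run.

       Hypothetical helper: raises ValueError if an input file is located in
       the output directory, where the tool's output could replace it.
       """
       out = Path(outdir).resolve()
       resolved = [Path(p).resolve() for p in inputs]
       clashes = [str(p) for p in resolved if p.parent == out]
       if clashes:
           raise ValueError("input files inside output directory: " + ", ".join(clashes))
       # Absolute input paths keep working when the process runs in another directory.
       argv = ["spacy2folia", "--model", model] + [str(p) for p in resolved]
       return argv, out

The returned values can then be passed to the standard library, for example ``subprocess.run(argv, cwd=cwd, check=True)``, after creating the output directory.
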

Usage from Python:

.. code:: python

   import spacy
   from spacy2folia import spacy2folia

   text = "Input text goes here"

   # Load a spaCy pipeline and process the text
   nlp = spacy.load("en_core_web_sm")
   doc = nlp(text)

   # Convert the spaCy Doc to a FoLiA document and save it as XML
   foliadoc = spacy2folia.convert(doc, "example", paragraphs=True)
   foliadoc.save("/tmp/output.folia.xml")

Usage from Python with FoLiA input:

.. code:: python

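   import spacy
   import folia.main as folia
   from spacy2folia import spacy2folia

   # Load an existing FoLiA document and a spaCy pipeline
   foliadoc = folia.Document(file="/tmp/input.folia.xml")
   nlp = spacy.load("en_core_web_sm")

   # Process the FoLiA document with spaCy, then save the same document object
   spacy2folia.convert_folia(foliadoc, nlp)
   foliadoc.save("/tmp/output.folia.xml")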



            
