fastgedcom

- Name: fastgedcom
- Version: 1.1.3
- Summary: A lightweight tool to easily parse, browse and edit gedcom files.
- Upload time: 2024-11-01 18:36:49
- Requires Python: >=3.10
- License: MIT License
- Keywords: fastgedcom, gedcom, parser, genealogy
- Requirements: No requirements were recorded.

# FastGedcom

A lightweight tool to easily parse, browse and edit gedcom files.

Install FastGedcom using pip from [the PyPI repository](https://pypi.org/project/fastgedcom/):
```bash
pip install fastgedcom
```
To also install the Ansel codec, use the following command. It enables the ANSEL text encoding, which is often used for gedcom files.
```bash
# Quoting keeps shells such as zsh from interpreting the square brackets
pip install "fastgedcom[ansel]"
```

## Highlights of FastGedcom

- FastGedcom is **easy** to write.
- FastGedcom has **type annotations**.
- FastGedcom has **[documentation](https://fastgedcom.readthedocs.io/en/latest/)**.
- FastGedcom has **[code examples](https://github.com/GatienBouyer/fastgedcom/tree/main/examples)**.
- FastGedcom has **[unit tests](https://github.com/GatienBouyer/fastgedcom/tree/main/test)**.
- FastGedcom has **fewer methods** than the alternatives, which makes it easy to learn.
- FastGedcom is **concise** thanks to **optional** operator overloads.
- FastGedcom has a **linear** syntax: if/else and try/except blocks are needed less often.
- Last but not least, FastGedcom is **fast**. See the [benchmarks](https://github.com/GatienBouyer/benchmark-python-gedcom).

Comparison:
<table>
	<tr>
		<th>Gedcom file</th>
		<th>FastGedcom</th>
		<th>python-gedcom</th>
	</tr>
	<tr>
		<td><pre lang="gedcom"><code>
0 HEAD
1 FILE my-file.ged
0 @I1@ INDI
1 NAME John Doe
1 BIRT
2 DATE 1 Jan 1970
1 DEAT
2 DATE 2 Feb 2081
0 TRLR
		</code></pre></td>
		<td><pre lang="python3"><code>
from fastgedcom.parser import strict_parse
document = strict_parse("my-file.ged")
person = document["@I1@"]
# use ">" to get a sub-line
death = person > "DEAT"
# use ">=" to get a sub-line value
date = death >= "DATE"
print(date)
# Prints "" if the field is missing
		</code></pre></td>
		<td><pre lang="python3"><code>
from gedcom.parser import Parser
document = Parser()
document.parse_file("my-file.ged")
records = document.get_element_dictionary()
person = records["@I1@"]
death_data = person.get_death_data()
# data is (date, place, sources)
date = death_data[0]
print(date)
		</code></pre></td>
	</tr>
</table>

## Features

### Multi-encoding support
FastGedcom supports a broad set of encodings for gedcom files, such as UTF-8 (with and without BOM), UTF-16 (also named UNICODE), ANSI, and ANSEL.
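
As an illustration of what telling these encodings apart can involve, here is a minimal sketch that inspects the byte-order mark (BOM) at the start of a file. The `sniff_bom_encoding` helper is hypothetical and written for this example only; it is not FastGedcom's implementation. Encodings without a BOM, such as ANSEL, cannot be recognized this way.

```python
import codecs

# Byte-order marks mapped to the Python codec that strips them when decoding.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on the leading byte-order mark, if any."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            return encoding
    return default


sample = "0 HEAD\n1 CHAR UTF-8\n0 TRLR\n"
for codec in ("utf-8", "utf-8-sig", "utf-16"):
    raw = sample.encode(codec)
    guessed = sniff_bom_encoding(raw)
    # Decoding with the guessed codec gives back the original text.
    assert raw.decode(guessed) == sample
```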

### Stays close to gedcom, with free choice of formatting
There is a lot of genealogy software out there, and each program has its own tags and formats for writing information. With the FastGedcom approach, you can easily adapt your code to your gedcom files. You choose how to parse and format the values. You can also use non-standard fields, for example the "_AKA" field (standing for Also Known As).

```python
from fastgedcom.parser import strict_parse
from fastgedcom.helpers import extract_name_parts

document = strict_parse("gedcom_file.ged")

person = document["@I1@"]
name = person >= "NAME"
print(name)  # Unformatted string such as "John /Doe/"

given_name, surname = extract_name_parts(name)
print(f"{given_name.capitalize()} {surname.upper()}")  # Would be "John DOE"

# Parentheses are required: Python would otherwise chain the two comparisons
alias = (person > "NAME") >= "_AKA"
print(f"a.k.a: {alias}")  # Could be "Johnny" or ""
```
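
One caveat when combining the two operators: Python chains comparison operators, so `a > b >= c` is evaluated as `(a > b) and (b >= c)`. The sketch below uses a hypothetical `Node` class (a stand-in written for this illustration, not one of FastGedcom's classes) to show why the sub-line lookup should be wrapped in parentheses.

```python
from typing import Optional


class Node:
    """Hypothetical stand-in for a gedcom line; not a FastGedcom class."""

    def __init__(self, payload: str = "", children: Optional[dict] = None):
        self.payload = payload
        self.children = children or {}

    def __gt__(self, tag: str) -> "Node":
        # ">" returns the child line, or an empty (falsy) node if it is missing.
        return self.children.get(tag, Node())

    def __ge__(self, tag: str) -> str:
        # ">=" returns the payload of the child line.
        return (self > tag).payload

    def __bool__(self) -> bool:
        return bool(self.payload or self.children)


person = Node(children={"NAME": Node("John /Doe/", {"_AKA": Node("Johnny")})})

# Evaluated as: (person > "NAME") and ("NAME" >= "_AKA")
chained = person > "NAME" >= "_AKA"
# Evaluated as intended: look up NAME, then take the value of _AKA
grouped = (person > "NAME") >= "_AKA"

print(repr(chained))  # False, because the string comparison "NAME" >= "_AKA" is False
print(grouped)        # Johnny
```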

### The Option paradigm replaces the if blocks
If a field is missing, you get a [FakeLine](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.FakeLine) containing an empty string. This greatly reduces boilerplate code. You can still tell a [TrueLine](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.TrueLine) from a [FakeLine](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.FakeLine) with a simple boolean check.

```python
indi = document["@I13@"]

# You can access the date of death, whether the person is deceased or not.
date = (indi > "DEAT") >= "DATE"

# The date of death or an empty string
print("Death date:", date)
```

Another example:

```python
for record in document:
    line = record > "_UID"
    if line:  # Check if field _UID exists to avoid ValueError in list.remove()
        record.sub_lines.remove(line)

# Get the Document as a gedcom string to write it into a file
gedcom_without_uids = document.get_source()

with open("./gedcom_without_uids.ged", "w", encoding="utf-8-sig") as f:
    f.write(gedcom_without_uids)
```
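
The idea behind a placeholder for missing lines can be sketched without the library. The `Present` and `Absent` classes below are hypothetical names chosen for this illustration, with a plain `get` method instead of operators; they only mirror the behaviour described above and are not how FastGedcom is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Present:
    """A line that exists in the file (hypothetical, for illustration)."""
    payload: str = ""
    sub_lines: dict = field(default_factory=dict)

    def get(self, tag: str) -> "Present | Absent":
        # A missing sub-line yields a placeholder instead of raising.
        return self.sub_lines.get(tag, Absent())

    def __bool__(self) -> bool:
        return True


class Absent:
    """Placeholder for a missing line; every lookup yields the placeholder again."""
    payload = ""

    def get(self, tag: str) -> "Absent":
        return self

    def __bool__(self) -> bool:
        return False


living = Present(sub_lines={"BIRT": Present(sub_lines={"DATE": Present("1 Jan 1970")})})

# No if/else needed: the chain is safe even though DEAT does not exist.
print(repr(living.get("DEAT").get("DATE").payload))  # ''
print(living.get("BIRT").get("DATE").payload)        # 1 Jan 1970
print(bool(living.get("DEAT")))                      # False
```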

### Typehints for salvation!
Autocompletion and type checking make development so much easier.

- There are only 3 main classes: [Document](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.Document), [TrueLine](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.TrueLine), and [FakeLine](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.FakeLine).
- There are type aliases for code clarity and code documentation: [Record](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.Record), [XRef](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.XRef), [IndiRef](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.IndiRef), [FamRef](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/base/index.html#fastgedcom.base.FamRef), and more.

```python
from fastgedcom.base import Record, FakeLine
from fastgedcom.family_link import FamilyLink

# For fast and easy family lookups
families = FamilyLink(document)


def ancestral_generation_count(indi: Record | FakeLine) -> int:
    """Return the number of generations registered as ancestors of the given person."""
    if not indi:
        return 1
    father, mother = families.get_parents(indi.tag)
    return 1 + max(
        ancestral_generation_count(father),
        ancestral_generation_count(mother),
    )


root = document["@I1@"]
number_generations_above_root = ancestral_generation_count(root)
```
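
Why a dedicated lookup helps can be shown with plain dictionaries. The sketch below assumes hypothetical family tuples and a `parents_of` table built in a single pass; it illustrates the general idea of precomputing family links and says nothing about how `FamilyLink` works internally. In this sketch a person without recorded parents counts as zero generations.

```python
# Hypothetical family records: (family id, husband id, wife id, child ids).
family_records = [
    ("@F1@", "@I2@", "@I3@", ["@I1@"]),
    ("@F2@", "@I4@", "@I5@", ["@I2@"]),
]

# One pass over the families builds a child -> (father, mother) lookup table.
parents_of = {}
for _fam_id, husband, wife, children in family_records:
    for child in children:
        parents_of[child] = (husband, wife)


def generations_above(person: str) -> int:
    """Count the generations of recorded ancestors above `person`."""
    if person not in parents_of:
        return 0
    father, mother = parents_of[person]
    return 1 + max(generations_above(father), generations_above(mother))


print(generations_above("@I1@"))  # 2
```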

## Why is it called FastGedcom?

FastGedcom's aim is to keep the code close to your gedcom files, so you don't have to learn what FastGedcom does. The data you have is the data you get. The content of the gedcom file is unchanged and there is no abstraction. Hence, the library is quicker to learn than the alternatives. Data processing is optional, so you can tailor it to your needs. FastGedcom is more of a starting point for your data processing than a feature-rich library.
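
To make "the data you have is the data you get" concrete: every gedcom line consists of a level, an optional cross-reference, a tag, and an optional payload. Here is a minimal sketch, assuming a hypothetical `parse_line` helper written for this illustration (it is not FastGedcom's parser and does no validation).

```python
def parse_line(line: str) -> tuple[int, str, str, str]:
    """Split one gedcom line into (level, xref, tag, payload)."""
    level, _, rest = line.strip().partition(" ")
    xref = ""
    if rest.startswith("@"):
        # Record lines such as "0 @I1@ INDI" carry a cross-reference first.
        xref, _, rest = rest.partition(" ")
    tag, _, payload = rest.partition(" ")
    return int(level), xref, tag, payload


sample = """\
0 HEAD
1 FILE my-file.ged
0 @I1@ INDI
1 NAME John Doe
1 DEAT
2 DATE 2 Feb 2081
0 TRLR
"""

for raw in sample.splitlines():
    print(parse_line(raw))
```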

The name **FastGedcom** doesn't just come from its ease of use. Parsing is the fastest among the Python libraries compared in the [benchmarks](https://github.com/GatienBouyer/benchmark-python-gedcom). This holds especially for parsing a file and then getting the relatives of a person: the [FamilyLink](https://fastgedcom.readthedocs.io/en/latest/autoapi/fastgedcom/family_link/index.html#fastgedcom.family_link.FamilyLink) class is built for this purpose.

## Documentation and examples

Want to see more of FastGedcom? Here are some [examples](https://github.com/GatienBouyer/fastgedcom/tree/main/examples).

The [documentation](https://fastgedcom.readthedocs.io/en/latest/) of FastGedcom is available on ReadTheDocs.

## Feedback

Comments and contributions are welcome and greatly appreciated!

If you like this project, consider putting a star on [GitHub](https://github.com/GatienBouyer/fastgedcom). Thank you!

For any feedback or questions, please feel free to contact me by email at gatien.bouyer.dev@gmail.com or via [GitHub issues](https://github.com/GatienBouyer/fastgedcom/issues).

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "fastgedcom",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "fastgedcom, gedcom, parser, genealogy",
    "author": null,
    "author_email": "Gatien Bouyer <gatien.bouyer.dev@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/6c/5b/eb45a9303fcdeba7040b4489c23754137adc308953197a9ec446f37c6fb2/fastgedcom-1.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "A lightweight tool to easily parse, browse and edit gedcom files.",
    "version": "1.1.3",
    "project_urls": {
        "Bug Tracker": "https://github.com/GatienBouyer/fastgedcom/issues",
        "Documentation": "https://fastgedcom.readthedocs.io/en/latest/",
        "Source": "https://github.com/GatienBouyer/fastgedcom"
    },
    "split_keywords": [
        "fastgedcom",
        " gedcom",
        " parser",
        " genealogy"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2fac4b7aa8160febb4af4eccea12c4346bace936cdfa0f8a753f60a354ee4c78",
                "md5": "10e90c0853c82ad0504cd6fce3d24013",
                "sha256": "2b982488514bd398ab52f01c984a00f73a58fe290bd6874901bdbf77ac3cf625"
            },
            "downloads": -1,
            "filename": "fastgedcom-1.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "10e90c0853c82ad0504cd6fce3d24013",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 17441,
            "upload_time": "2024-11-01T18:36:48",
            "upload_time_iso_8601": "2024-11-01T18:36:48.102307Z",
            "url": "https://files.pythonhosted.org/packages/2f/ac/4b7aa8160febb4af4eccea12c4346bace936cdfa0f8a753f60a354ee4c78/fastgedcom-1.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6c5beb45a9303fcdeba7040b4489c23754137adc308953197a9ec446f37c6fb2",
                "md5": "73711dc137d017777e7b53658befff03",
                "sha256": "384556587cc44080794e5a9ffaf60e15699d42fd1a0f2cc8dce16f41957e7ea3"
            },
            "downloads": -1,
            "filename": "fastgedcom-1.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "73711dc137d017777e7b53658befff03",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 24146,
            "upload_time": "2024-11-01T18:36:49",
            "upload_time_iso_8601": "2024-11-01T18:36:49.356657Z",
            "url": "https://files.pythonhosted.org/packages/6c/5b/eb45a9303fcdeba7040b4489c23754137adc308953197a9ec446f37c6fb2/fastgedcom-1.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-01 18:36:49",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "GatienBouyer",
    "github_project": "fastgedcom",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "fastgedcom"
}
        