# af3cli

A command-line interface and Python library for generating [AlphaFold3](https://github.com/google-deepmind/alphafold3) input files.

Version 0.3.1 (uploaded 2025-01-26 13:56:23), requires Python >=3.11. Keywords: alphafold, cli.

## Installation

We recommend using [uv](https://github.com/astral-sh/uv) to manage your installation.

```shell
uv sync --locked
```

This automatically creates a virtual environment `.venv` in the project folder and installs all dependencies. If you do not need the optional dependencies for reading SDF files ([RDKit](https://github.com/rdkit/rdkit)) or FASTA files ([Biopython](https://github.com/biopython/biopython)), you can skip them with `--no-group features`.

## Basic Usage

AlphaFold3 input files can be generated either with the standalone CLI tool or, for more advanced tasks, with the library in Python scripts. The CLI application is implemented with [Python Fire](https://github.com/google/python-fire).

For a detailed overview of all available JSON fields, check the [AlphaFold3 input documentation](https://github.com/google-deepmind/alphafold3/blob/main/docs/input.md).

> [!WARNING]
> In most cases, the checks only ensure that the input file is structurally correct; they do not verify that the inputs themselves are valid.

You can display the help and an overview of the available CLI commands with the following command.

```shell
af3cli -- --help
```

All commands and subcommands are separated with `-` to enable chaining, so that several sequences, ligands, bonds, etc. can be added in a single call.

```shell
af3cli toplevel sub [...] - sub [...] \
    - toplevel sub [...]
```

You can use the `debug` command to display the final file without writing it.

```shell
af3cli debug --show - [...]
```

### Config Parameters

The `config` command is used to manage basic settings, such as the file name of the JSON file to be written, the name of the job or the respective version.

```shell
af3cli config -f "filename.json" -j "jobname" -v 2
```

The library provides an `InputBuilder` that lets you create new jobs conveniently, step by step.

```python
from af3cli import InputBuilder

builder = InputBuilder()
builder.set_name("jobname")
builder.set_version(2)
# builder.set_dialect("alphafold3") # default

input_file = builder.build()
input_file.write("filename.json")
```

You can also initialize the `InputBuilder` with an existing `InputFile` object in order to add further sequences or ligands or to change settings.
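For orientation, the written file follows the top-level layout described in the AlphaFold3 input documentation. The following standard-library sketch builds such a dictionary by hand; it does not use `af3cli`, and the sequence entry and its ID are placeholder values chosen purely for illustration.

```python
import json

# Top-level keys as described in the AlphaFold3 input documentation.
job = {
    "name": "jobname",
    "modelSeeds": [1],
    "sequences": [
        # placeholder entry, for illustration only
        {"protein": {"id": "A", "sequence": "MVKLAGST"}},
    ],
    "dialect": "alphafold3",
    "version": 2,
}

# Serialize the dictionary the same way a JSON input file would be stored.
text = json.dumps(job, indent=2)
print(text)
```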

### Random Seeds

At least one random seed must be specified, so the default value is 1. Alternatively, you can either specify how many random seeds should be generated or pass a list of integers yourself.

```shell
af3cli seeds -n 10 - ...
# generates 10 random numbers

af3cli seeds -v "1,2,3"
# "(1,2,3,...)" or "[1,2,3,...]" are also valid
```

Python:

```python
builder.set_seeds([1, 2, 3])
```
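The accepted list notations and the idea of drawing a number of random seeds can be sketched with the standard library alone. This is an illustrative approximation, not the parser `af3cli` actually uses; the helper names `parse_int_list` and `random_seeds` and the upper bound for seed values are assumptions.

```python
import random


def parse_int_list(text: str) -> list[int]:
    """Parse "1,2,3", "(1,2,3)" or "[1,2,3]" into a list of integers."""
    cleaned = text.strip().strip("()[]")
    return [int(part) for part in cleaned.split(",") if part.strip()]


def random_seeds(n: int, upper: int = 2**31 - 1) -> list[int]:
    """Draw n distinct seeds; the upper bound is an arbitrary choice."""
    return random.sample(range(1, upper), n)


print(parse_int_list("[1,2,3]"))
print(random_seeds(10))
```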

### Sequences

Adding sequences works essentially the same way for all three available types (protein, DNA, RNA), but not every JSON field is supported for each type. The corresponding subcommands therefore differ in some cases.

```shell
# the first sequence is passed as a positional argument
af3cli [...] \
    - protein add "MVKLAGST" \
    - protein add --sequence "AAQAA" \
    - dna add --sequence "AATTTTCC" \
    - rna add --sequence "UUUGGCCGG"
```

The sequence characters are checked against the respective type, either in the CLI application or, in the Python library, when the `Sequence` object is converted into a dictionary. When using the sequence base class, the corresponding `SequenceType` must be specified. We therefore provide derived classes for the respective sequence types to simplify use.

```python
from af3cli import ProteinSequence # DNASequence, RNASequence

# DNASequence / RNASequence
protein_seq = ProteinSequence(
    "MVKLAGST...",
    #[...]
)

builder.add_sequence(protein_seq)
```
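To make the character check more concrete, here is a minimal standard-library sketch of such a validation. The alphabets (20 standard amino acids, `ACGT`, `ACGU`) and the helper name `invalid_characters` are assumptions for illustration; the actual check in `af3cli` may accept a different set of characters.

```python
# Assumed alphabets; the library's own definitions may differ.
ALPHABETS = {
    "protein": set("ACDEFGHIKLMNPQRSTVWY"),
    "dna": set("ACGT"),
    "rna": set("ACGU"),
}


def invalid_characters(sequence: str, seq_type: str) -> set[str]:
    """Return the characters that do not belong to the given sequence type."""
    return set(sequence.upper()) - ALPHABETS[seq_type]


print(invalid_characters("MVKLAGST", "protein"))
print(invalid_characters("AATTUU", "dna"))
```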

For DNA sequences, it is possible to generate the reverse complementary strand. Associated data, such as manually specified IDs or modifications, are not carried over to the complementary strand. The CLI tool will generate appropriate warnings.

```shell
af3cli [...] \
    - dna add --sequence "AATTTTCC" --complement
```

Python:

```python
from af3cli import DNASequence


dna_seq = DNASequence(
    "AATTTTCC...",
    #[...]
)
rc_dna_seq = dna_seq.reverse_complement()

builder.add_sequence(dna_seq)
builder.add_sequence(rc_dna_seq)
```

If modifications or manually defined IDs are required, the complementary sequence must be created separately.
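For reference, the reverse complement itself is a simple string operation. The sketch below uses only the standard library and is independent of `DNASequence.reverse_complement()`; the function name is chosen for illustration.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Complement every base, then reverse the strand."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("AATTTTCC"))  # GGAAAATT
```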

#### FASTA Files

As it is often not very practical to add many or particularly long sequences via the CLI, it is possible to read the respective sequence from a FASTA file. To use this feature, [Biopython](https://github.com/biopython/biopython) must be installed as an optional dependency.

```shell
af3cli protein add --sequence <filename> --fasta
```

Each sequence command expects exactly one sequence; otherwise, it would not be possible to add additional fields such as modifications or templates. If these additional features are not required, you can still read several sequences from a FASTA file with the `fasta` command. For more advanced tasks, use the Python API.

```shell
af3cli [...] - fasta [--filename] <filename>
```

The sequence type is detected automatically. In rare cases this is not possible; the sequence is then ignored and a warning is issued. It is therefore advisable to add sequences whose type cannot be clearly identified separately via the sequence commands.

There are also two ways of doing this in Python. The `fasta2seq` function returns a generator that automatically creates `Sequence` objects, while the `read_fasta` function returns a generator that yields the plain FASTA IDs and sequences from the file as strings.

```python
from af3cli.sequence import fasta2seq, read_fasta

for seq in fasta2seq(filename):
    ...
    # do something with the Sequence object

for fasta_id, seq_str in read_fasta(filename):
    ...
    # create your own Sequence objects
```
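If Biopython is not available, a generator of ID and sequence pairs can be approximated in a few lines. This is a simplified sketch with a hypothetical name (`iter_fasta`), not a replacement for `read_fasta`; it takes the first word of each header line as the ID and joins wrapped sequence lines.

```python
import io
from collections.abc import Iterator


def iter_fasta(handle) -> Iterator[tuple[str, str]]:
    """Yield (id, sequence) pairs from an open FASTA text stream."""
    fasta_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if fasta_id is not None:
                yield fasta_id, "".join(chunks)
            header = line[1:].split()
            fasta_id, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if fasta_id is not None:
        yield fasta_id, "".join(chunks)


demo = io.StringIO(">seq1 demo\nMVKL\nAGST\n>seq2\nAATTTTCC\n")
records = list(iter_fasta(demo))
print(records)
```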

#### Modifications

By applying the `modification` subcommand, any number of modifications can be added to the sequences with the respective CCD identifier and position. The different fields in the JSON file are automatically inserted correctly based on the sequence type.

```shell
# as positional arguments
af3cli [...] protein [...] - modification "SEP" 5
# or with explicit argument names
af3cli [...] dna [...] - modification --mod "6OG" --pos 1
```

When using the Python API, you have to explicitly define what kind of modifications you would like to add, since the resulting JSON fields are different for protein (`ResidueModification`) or nucleotide sequences (`NucleotideModification`). Please note that checks are performed to verify the modification types when the `Sequence` object is converted to a dictionary.

```python
from af3cli import ProteinSequence, ResidueModification
                   # NucleotideModification

rmod = ResidueModification("SEP", 5)

protein_seq = ProteinSequence(
    "<SEQUENCE>",
    # [...]
    modifications=[rmod]
)

# it is possible to add more modifications later
protein_seq.modifications.append(rmod)
```
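The difference between the two modification classes comes from the JSON field names in the AlphaFold3 input format: protein modifications use `ptmType` and `ptmPosition`, whereas DNA and RNA modifications use `modificationType` and `basePosition`. The sketch below shows this mapping with a hypothetical helper; it is not how `af3cli` builds the entries internally.

```python
def modification_entry(seq_type: str, ccd: str, position: int) -> dict:
    """Build one modification entry with the field names for the given type."""
    if seq_type == "protein":
        return {"ptmType": ccd, "ptmPosition": position}
    if seq_type in ("dna", "rna"):
        return {"modificationType": ccd, "basePosition": position}
    raise ValueError(f"unknown sequence type: {seq_type!r}")


print(modification_entry("protein", "SEP", 5))
print(modification_entry("dna", "6OG", 1))
```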

#### Structural Templates

For protein sequences, it is possible to specify multiple structural templates in mmCIF format, either as a string or as a path. Since passing the mmCIF content as a string is impractical on the command line, the CLI tool always takes a plain-text file; with the `--read` flag, the file is read in its entirety and stored as a string.

```shell
# read the file as string with the '--read' flag
af3cli [...] protein [...] - template [--mmcif] <filename> --read
# keep relative/absolute path
af3cli [...] protein [...] - template [--mmcif] <filename>

# specify query and template indices as list of integers
# "1,2,3,..." | "(1,2,3,...)" | "[1,2,3,...]" are valid
af3cli [...] protein [...] \
    - template [--mmcif] <filename> -q "..." -t "..."
```

As it makes no difference to Python whether the string contains a path to a file or the file content, all you need to do is specify the template type. If the JSON file should contain the mmCIF content as a string, you must read the file yourself beforehand.

```python
from af3cli import Template, TemplateType, ProteinSequence

# TemplateType.FILE for relative/absolute path
t = Template(
    TemplateType.STRING,
    "mmCIF content",
    qidx=[], tidx=[]
)

protein_seq = ProteinSequence(
    "<SEQUENCE>",
    # [...]
    templates=[t]
)

# it is possible to add more templates later
protein_seq.templates.append(t)
```

#### Multiple Sequence Alignment

Please refer to the [AlphaFold3 input documentation](https://github.com/google-deepmind/alphafold3/blob/main/docs/input.md#multiple-sequence-alignment) on how to specify the MSA section for protein and RNA sequences.

The A3M-formatted content can be specified either as a path or as a string (mutually exclusive).

```shell
af3cli [...] protein [...] msa --paired ... --unpaired ...
af3cli [...] protein [...] msa --pairedpath ... --unpairedpath ...
```

In the case of the Python API, you must specify whether the respective string is a path.

```python
from af3cli import MSA, ProteinSequence

msa = MSA(
    paired="...", unpaired="...",
    paired_is_path=True, unpaired_is_path=True,
)

protein_seq = ProteinSequence(
    "<SEQUENCE>",
    # [...]
    msa=msa
)

# alternative
protein_seq._msa = msa
```
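In the AlphaFold3 input format, inline content and paths use different field names (`pairedMsa` / `unpairedMsa` versus `pairedMsaPath` / `unpairedMsaPath`), which is why the two variants exclude each other. A minimal sketch of that choice, using a hypothetical helper rather than the `MSA` class:

```python
def msa_fields(paired: str, unpaired: str,
               paired_is_path: bool = False,
               unpaired_is_path: bool = False) -> dict:
    """Pick the field name per entry depending on whether it is a path."""
    return {
        "pairedMsaPath" if paired_is_path else "pairedMsa": paired,
        "unpairedMsaPath" if unpaired_is_path else "unpairedMsa": unpaired,
    }


print(msa_fields("paired.a3m", "unpaired.a3m", True, True))
```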

### Ligands and Ions

Ligands are handled much like sequences and can be defined either as SMILES or with a CCD identifier. SDF files can also be read and converted to SMILES via the optional [RDKit](https://github.com/rdkit/rdkit) dependency. If an SDF file contains multiple entries, they are added as individual ligands. Ions are simply treated as ligands in AlphaFold3.

```shell
# providing a list of CCD codes is also supported for --ccd
af3cli [...] \
    - ligand add --smiles "CCC" \
    - ligand add --ccd "MG" \
    - ligand add --sdf ligands.sdf
```

In Python, either the parent class `Ligand` together with the respective `LigandType` or alternatively the corresponding child classes `CCDLigand` or `SMILigand` can be used to add new ligands.

```python
from af3cli import Ligand, LigandType, SMILigand
from af3cli.ligand import sdf2smiles

ligand = Ligand(
    LigandType.SMILES,
    "CCC",
    #[...]
)

# using SMILigand
# ligand = SMILigand("CCC")

builder.add_ligand(ligand)

# ...
for smi in sdf2smiles("ligands.sdf"):
    builder.add_ligand(
        Ligand(LigandType.SMILES, smi)
    )
```

### Custom CCD

Please refer to the [AlphaFold3 input documentation](https://github.com/google-deepmind/alphafold3/blob/main/docs/input.md#user-provided-ccd-format) on how to generate valid CCD mmCIF files.

The entire file content is stored as a string in the JSON file; here it is only held in a variable. For the CLI, you therefore simply specify a plain text file.

```shell
af3cli [...] ccd [--filename] <filename>
```

In Python, you then have to read the file yourself.

```python
builder.set_user_ccd(filecontent)
```

### Bonds

The bonded atom pairs are defined in the JSON file as a list of lists, each of which contains the Entity ID, the Residue ID and the atom name. To make it as easy as possible to add new bonds, a string format is used, which is then translated into the correct format.

```shell
# E: Entity ID; R: Residue ID; N: atom name
af3cli [...] bond [--add] "E:R:N-E:R:N"

# example
af3cli [...] bond [--add] "A:1:C-B:1:O"
```

Although the sequences should be numbered in the order in which they were added, it is advisable to manually assign a sequence ID to the respective entities for the bonds (see below).

Python:

```python
from af3cli import Bond

bond = Bond.from_string("A:1:C-B:1:O")

builder.add_bonded_atom_pair(bond)
```

You can also use the `Atom` class to initialize new atoms and create a `Bond` object from any two atoms.

```python
from af3cli import Bond, Atom

atom_1 = Atom("A", 1, "C")
atom_2 = Atom("B", 1, "O")
bond = Bond(atom_1, atom_2)

builder.add_bonded_atom_pair(bond)
```
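The translation from the string format to the list-of-lists form used in the JSON file can be sketched as follows. This is an illustrative standard-library example with a hypothetical function name, not the implementation behind `Bond.from_string`.

```python
def parse_bond(text: str) -> list[list]:
    """Turn "E:R:N-E:R:N" into [[E, R, N], [E, R, N]] with integer residue IDs."""
    atoms = []
    for atom in text.split("-"):
        entity_id, residue_id, atom_name = atom.split(":")
        atoms.append([entity_id, int(residue_id), atom_name])
    if len(atoms) != 2:
        raise ValueError("a bond needs exactly two atoms")
    return atoms


print(parse_bond("A:1:C-B:1:O"))  # [['A', 1, 'C'], ['B', 1, 'O']]
```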

### Sequence ID Handling

The IDs for sequences, ligands, and ions are normally assigned automatically and should only be specified manually when really necessary, as ID clashes may occur. An `IDRegister` object keeps track of the IDs in use and, if necessary, skips IDs that have already been registered.

One case where it is necessary to specify the IDs manually is for bonds between different entries, as the chain ID must be specified for the bonded atom pairs (see above).

```shell
# "A,B,..." | "(A,B,...)" | "[A,B,...]" are valid
af3cli [...] protein add [...] -i "A,B"
```

If you only want to calculate homomultimers without specifying explicit IDs, you can instead specify the number of copies.

```shell
af3cli [...] protein add [...] -n 2

# works for all sequence types and ligands/ions
af3cli [...] \
    - protein add [...] -n 5 \
    - ligand add [...] -n 5
```

You can also specify IDs or a number together with an SDF file. Note that the number of manually specified IDs must match the number of ligands in the SDF file. If a number is specified, every entry in the SDF file is multiplied by this number.

In Python, the number or explicit IDs can be specified when initializing `Ligand` or `Sequence` objects. If both parameters are specified, their count must match. The registration or automatic assignment of IDs only takes place in connection with an `InputFile` object and is carried out when the file is converted into a dictionary (e.g. when the file is written).

```python
ligand = SMILigand(
    "CCC",
    seq_id=["A", "B"],
    num=2
)
```
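The idea of automatic assignment with skipping can be illustrated with a small generator. This is a hypothetical sketch, not the `IDRegister` implementation; in particular, the order of multi-letter IDs (`AA`, `AB`, ...) is an assumption and may differ from the convention used by AlphaFold3 or `af3cli`.

```python
from itertools import count, product
from string import ascii_uppercase


def id_candidates():
    """Yield A..Z, then AA, AB, ... (assumed ordering)."""
    for length in count(1):
        for letters in product(ascii_uppercase, repeat=length):
            yield "".join(letters)


def assign_ids(n: int, registered: set[str]) -> list[str]:
    """Return n new IDs, skipping and then registering used ones."""
    ids: list[str] = []
    for candidate in id_candidates():
        if len(ids) == n:
            break
        if candidate not in registered:
            registered.add(candidate)
            ids.append(candidate)
    return ids


used = {"A", "C"}  # e.g. IDs that were set manually
print(assign_ids(3, used))  # ['B', 'D', 'E']
```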

### Merging Files

Occasionally, it can be helpful to create a base file of your system and prepare subsequent AlphaFold3 jobs by merging existing files with new entries. The `merge` command is chainable, so several files can be combined. However, this should be done with caution if certain IDs, bonds, or seeds are important.

```shell
af3cli [...] merge [--filename] <filename>

# add new sequences
af3cli [...] merge [--filename] <filename> \
    - protein add "MVKLAGST..." \
    - ligand add --ccd "MG"

# keep IDs
af3cli [...] merge [--filename] <filename> --noreset

# override/merge special entries:
#   --userccd  override user-specified CCD data
#   --bonds    merge bonded atoms data
#   --seeds    merge seeds (removes duplicates)
af3cli [...] merge [--filename] <filename> \
    --userccd \
    --bonds \
    --seeds
```

Python:

```python
from af3cli import InputFile

input_file = InputFile()
other_input_file = InputFile.read("filename")

input_file.merge(other_input_file)

# with additional parameters
input_file.merge(
    other_input_file,
    reset=True,
    seeds=True,
    bonded_atoms=False,
    userccd=False
)
```
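Merging seeds with duplicate removal amounts to a de-duplicating concatenation. The sketch below shows one way to do this with the standard library; whether `af3cli` preserves the original order in the same way is an assumption, and the function name is hypothetical.

```python
def merge_seeds(base: list[int], other: list[int]) -> list[int]:
    """Concatenate two seed lists, dropping duplicates but keeping order."""
    return list(dict.fromkeys([*base, *other]))


print(merge_seeds([1, 2, 3], [3, 4, 1]))  # [1, 2, 3, 4]
```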
            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "af3cli",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": "AlphaFold, CLI",
    "author": null,
    "author_email": "Lukas Schulig <schuligl@uni-greifswald.de>, Philipp D\u00f6pner <doepnerp@uni-greifswald.de>, Mark D\u00f6rr <mark.doerr@uni-greifswald.de>",
    "download_url": "https://files.pythonhosted.org/packages/92/c2/e3d0b3a972c8090352d5051fdfd6f9b4c5a63636a52419e04d5036bcfd61/af3cli-0.3.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A command-line interface and Python library for generating AlphaFold3 input files.",
    "version": "0.3.1",
    "project_urls": {
        "Repository": "https://github.com/SLx64/af3cli.git"
    },
    "split_keywords": [
        "alphafold",
        " cli"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d706682bdc15a98043b8c12e31cd6af48f35d45540b8b2d9f705ee2731be9f40",
                "md5": "f915385503370f101668f1153ebd9ce8",
                "sha256": "af4f3690150d0e99c45793bd68d197d0d87236a265a28a06a6811b7dfb4bdd6d"
            },
            "downloads": -1,
            "filename": "af3cli-0.3.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f915385503370f101668f1153ebd9ce8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 34492,
            "upload_time": "2025-01-26T13:56:21",
            "upload_time_iso_8601": "2025-01-26T13:56:21.119990Z",
            "url": "https://files.pythonhosted.org/packages/d7/06/682bdc15a98043b8c12e31cd6af48f35d45540b8b2d9f705ee2731be9f40/af3cli-0.3.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "92c2e3d0b3a972c8090352d5051fdfd6f9b4c5a63636a52419e04d5036bcfd61",
                "md5": "7552bfb15c5e31acb16f0a00e132129b",
                "sha256": "bf6169d550dbf3d109dd158910cb653c2a7b3419f691bac5fd703002d4b4ae98"
            },
            "downloads": -1,
            "filename": "af3cli-0.3.1.tar.gz",
            "has_sig": false,
            "md5_digest": "7552bfb15c5e31acb16f0a00e132129b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 73100,
            "upload_time": "2025-01-26T13:56:23",
            "upload_time_iso_8601": "2025-01-26T13:56:23.911210Z",
            "url": "https://files.pythonhosted.org/packages/92/c2/e3d0b3a972c8090352d5051fdfd6f9b4c5a63636a52419e04d5036bcfd61/af3cli-0.3.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-26 13:56:23",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "SLx64",
    "github_project": "af3cli",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "af3cli"
}
        