iadrive

| Field | Value |
| --- | --- |
| Name | iadrive |
| Version | 1.0.5 |
| Home page | https://github.com/Andres9890/iadrive |
| Summary | Download Google Drive files/folders and upload them to the Internet Archive |
| Upload time | 2025-08-28 22:26:47 |
| Maintainer | None |
| Docs URL | None |
| Author | Andres99 |
| Requires Python | >=3.9 |
| License | MIT |
| Keywords | archive.org, archiving, google drive, internet archive, file mirroring |
| Requirements | internetarchive, gdown, docopt-ng, python-dateutil |

[License Button]: https://img.shields.io/badge/License-MIT-black
[License Link]: https://github.com/Andres9890/iadrive/blob/main/LICENSE 'MIT License.'

[PyPI Button]: https://img.shields.io/pypi/v/iadrive?color=yellow&label=PyPI
[PyPI Link]: https://pypi.org/project/iadrive/ 'PyPI Package.'

# IAdrive
[![Lint](https://github.com/Andres9890/iadrive/actions/workflows/lint.yml/badge.svg)](https://github.com/Andres9890/iadrive/actions/workflows/lint.yml)
[![Unit Tests](https://github.com/Andres9890/iadrive/actions/workflows/unit-test.yml/badge.svg)](https://github.com/Andres9890/iadrive/actions/workflows/unit-test.yml)
[![License Button]][License Link]
[![PyPI Button]][PyPI Link]

IAdrive is a tool for archiving Google Drive files/folders and Google Docs/Sheets/Slides by uploading them to the [Internet Archive](https://archive.org/). It downloads the content, generates appropriate metadata, and uploads everything to IA while preserving the folder structure.

- This project is heavily based on [tubeup](https://github.com/bibanon/tubeup) by bibanon; credit goes to them

## Features

- **Google Drive Support**: Downloads files and/or folders from Google Drive using [gdown](https://github.com/wkentaro/gdown)
- **Google Docs Integration**: Directly exports Google Docs, Sheets, and Slides in multiple formats
- **Multiple Format Export**: For Google Docs, automatically exports in all available formats (PDF, DOCX, TXT, HTML, etc.)
- Preserves folder structure when uploading (can be disabled with `--disable-slash-files`)
- Extracts file modification dates to determine the creation date for the item
- Pass custom metadata to Archive.org using `--metadata=<key:value>`
- Supports quiet mode (`--quiet`) and debug mode (`--debug`) for log output
- Automatically cleans up downloaded files after upload
- Sanitizes identifiers and truncates subject tags to fit Archive.org requirements
- Falls back to "IAdrive" as publisher since Google Drive collaborator fetching is not yet implemented
- Improved error handling and debug output

## Installation

Requires Python 3.9 or newer

```bash
pip install iadrive
```

Installing the package creates a console script named `iadrive`. You can also install from source by running `pip install .` in a checkout of the repository.

## Configuration

```bash
ia configure
```

Running `ia configure` (the command-line tool from the `internetarchive` package) prompts you for your Internet Archive account's email and password.

Optional environment variables:

- `GOOGLE_API_KEY` – reserved for looking up the owner names of the Google Drive file or folder
  to fill the `creator` field in metadata; this lookup is not yet implemented

## Usage

```bash
iadrive <url> [--metadata=<key:value>...] [--disable-slash-files] [--quiet] [--debug]
```

Arguments:

- `<url>` – Google Drive file/folder URL or Google Docs/Sheets/Slides URL to archive

Options:

- `--metadata=<key:value>` – custom metadata to add to the IA item
- `--disable-slash-files` – upload files without preserving folder structure
- `--quiet` – only print errors
- `--debug` – print all logs to stdout
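
As a rough illustration of how repeated `--metadata=<key:value>` options could be folded into one metadata mapping, here is a minimal sketch. The function name `parse_metadata` and the choice to collect repeated keys into a list are assumptions made for this example, not IAdrive's confirmed behavior.

```python
from collections import defaultdict


def parse_metadata(pairs):
    """Turn ["key:value", ...] into a dict; repeated keys become lists.

    Hypothetical helper: it only illustrates the key:value convention
    used by the --metadata option.
    """
    collected = defaultdict(list)
    for pair in pairs:
        # Split on the first colon only, so values such as URLs keep theirs.
        key, sep, value = pair.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected key:value, got {pair!r}")
        collected[key.strip()].append(value.strip())
    # Collapse single-element lists to plain strings.
    return {k: v[0] if len(v) == 1 else v for k, v in collected.items()}
```

With this sketch, `parse_metadata(["collection:placeholder", "mediatype:data"])` yields one string per key, while passing the same key twice yields a list of both values.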

## Google Docs Support

IAdrive can archive Google Docs, Sheets, and Slides directly by exporting them in all available formats through Google's public export URLs.

### Available Formats

**Google Documents:**
- `pdf`
- `docx`
- `odt`
- `rtf`
- `txt`
- `html`
- `epub`

**Google Spreadsheets:**
- `xlsx`
- `ods`
- `pdf`
- `csv`
- `tsv`
- `html`

**Google Presentations:**
- `pdf`
- `pptx`
- `odp`
- `txt`
- `jpeg`
- `png`
- `svg`
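
The format lists above can be paired with export URLs along these lines. This is a sketch under stated assumptions: the URL patterns (`/export?format=...` for documents and spreadsheets, `/export/<format>` for presentations) are commonly seen public export endpoints, but whether IAdrive builds exactly these URLs is not confirmed by this README. `EXPORT_FORMATS` and `export_urls` are names invented for the example.

```python
# Formats copied from the lists above, keyed by the URL path segment
# that docs.google.com uses for each document type.
EXPORT_FORMATS = {
    "document": ["pdf", "docx", "odt", "rtf", "txt", "html", "epub"],
    "spreadsheets": ["xlsx", "ods", "pdf", "csv", "tsv", "html"],
    "presentation": ["pdf", "pptx", "odp", "txt", "jpeg", "png", "svg"],
}


def export_urls(doc_type, doc_id):
    """Return {format: url} for every format of the given document type."""
    if doc_type not in EXPORT_FORMATS:
        raise ValueError(f"unknown document type: {doc_type!r}")
    base = f"https://docs.google.com/{doc_type}/d/{doc_id}"
    if doc_type == "presentation":
        # Assumption: Slides uses a path-style export endpoint.
        return {fmt: f"{base}/export/{fmt}" for fmt in EXPORT_FORMATS[doc_type]}
    # Assumption: Docs and Sheets use a query-style export endpoint.
    return {fmt: f"{base}/export?format={fmt}" for fmt in EXPORT_FORMATS[doc_type]}
```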

### Automatic Export Behavior

Every format available for the document type is exported and uploaded automatically. For example, a Google Document is uploaded as:
- `placeholder.pdf`
- `placeholder.docx`
- `placeholder.odt`
- `placeholder.rtf`
- `placeholder.txt`
- `placeholder.html`
- `placeholder.epub`

### Google Docs Examples

```bash
# Archive Google Document
iadrive https://docs.google.com/document/d/1abc123/edit

# Archive Google Spreadsheet
iadrive https://docs.google.com/spreadsheets/d/1abc123/edit

# Archive Google Slides with custom metadata
iadrive https://docs.google.com/presentation/d/1abc123/edit --metadata=collection:placeholder --metadata=creator:placeholder

# Debug mode with Google Docs
iadrive https://docs.google.com/document/d/1abc123/edit --debug
```

## Google Drive Examples

```bash
# Upload with folder structure preserved (default)
iadrive https://drive.google.com/drive/folders/placeholder --metadata=collection:placeholder

# Upload with flat structure
iadrive https://drive.google.com/drive/folders/placeholder --disable-slash-files

# Debug mode with custom metadata
iadrive https://drive.google.com/drive/folders/placeholder --metadata=collection:placeholder \
        --metadata=mediatype:data --debug
```

## Folder Structure Preservation

By default, IAdrive preserves the folder structure from Google Drive when uploading to the Internet Archive. For example, if your Google Drive link contains:

```
placeholder.txt
placeholder.mp3
folder/
  ├── placeholder.pdf
  └── folder/
      └── placeholder.mp4
```

The files will be uploaded to Internet Archive as:
- `placeholder.txt`
- `placeholder.mp3`
- `folder/placeholder.pdf`
- `folder/folder/placeholder.mp4`
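
One way to derive such slash-separated upload names from a downloaded directory is sketched below. `remote_names` is a hypothetical helper written for illustration; it is not taken from IAdrive's source.

```python
import os
from pathlib import Path


def remote_names(download_dir):
    """Map each upload name (relative, '/'-separated) to its local path."""
    root = Path(download_dir)
    mapping = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            local = Path(dirpath) / name
            # as_posix() gives forward slashes on every platform.
            mapping[local.relative_to(root).as_posix()] = str(local)
    return mapping
```

A mapping of this shape (remote name to local path) is also the form the `internetarchive` library accepts for its `files` argument when uploading.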

If you use the `--disable-slash-files` option, all files will be uploaded to the root level:
- `placeholder.txt`
- `placeholder.mp3`
- `placeholder.pdf`
- `placeholder.mp4`

Note: When using flat structure, duplicate filenames are automatically handled by adding a number (e.g., `placeholder.pdf`, `placeholder_1.pdf`).
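
The renaming described in the note can be sketched as follows. The `_1`, `_2` suffix scheme follows the example in the note; the function name `flatten_names` and the exact collision strategy are assumptions for illustration.

```python
from pathlib import PurePosixPath


def flatten_names(relative_paths):
    """Map each relative path to a unique root-level filename."""
    used = set()
    result = {}
    for rel in relative_paths:
        path = PurePosixPath(rel)
        candidate = path.name
        counter = 0
        # Append _1, _2, ... before the extension until the name is free.
        while candidate in used:
            counter += 1
            candidate = f"{path.stem}_{counter}{path.suffix}"
        used.add(candidate)
        result[rel] = candidate
    return result
```

For instance, the paths `placeholder.pdf` and `folder/placeholder.pdf` would be uploaded as `placeholder.pdf` and `placeholder_1.pdf` respectively.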

## How it works

### Google Drive Files/Folders
1. `iadrive` uses `gdown` to fetch the specified Google Drive file or folder
2. It walks the downloaded directory and extracts file extensions and modification dates
3. Metadata is generated, including a file listing (with sizes), the oldest file modification date, and the original URL
4. The content is uploaded to Archive.org with identifier format `drive-{drive-id}`
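
Steps 2 and 3 could look roughly like the sketch below, which walks a download directory and gathers extensions, a size listing, and the oldest modification date. The function name `scan_download`, the returned keys, and the `YYYY-MM-DD` UTC date format are assumptions for this example rather than IAdrive's actual output.

```python
import os
from datetime import datetime, timezone


def scan_download(download_dir):
    """Collect extensions, (path, size) pairs, and the oldest mtime as a date."""
    extensions = set()
    listing = []
    oldest = None
    for dirpath, _dirnames, filenames in os.walk(download_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            ext = os.path.splitext(name)[1].lstrip(".").lower()
            if ext:
                extensions.add(ext)
            rel = os.path.relpath(path, download_dir).replace(os.sep, "/")
            listing.append((rel, info.st_size))
            if oldest is None or info.st_mtime < oldest:
                oldest = info.st_mtime
    date = None
    if oldest is not None:
        # Format the oldest modification time as a UTC calendar date.
        date = datetime.fromtimestamp(oldest, tz=timezone.utc).strftime("%Y-%m-%d")
    return {"extensions": sorted(extensions), "files": sorted(listing), "date": date}
```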

### Google Docs/Sheets/Slides
1. `iadrive` detects Google Docs URLs and determines the document type
2. It automatically exports the document in **all available formats** using Google's public export URLs
3. Each format is downloaded and saved with descriptive filenames
4. Metadata includes comprehensive format information and document type
5. The content is uploaded to Archive.org with identifier format `docs-{doc-id}`
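
Detecting the URL type and deriving the item identifier might be done along these lines. The `drive-{drive-id}` and `docs-{doc-id}` identifier formats come from the steps above; the regular expressions and the function name `classify_url` are assumptions for illustration and may not cover every URL shape Google produces.

```python
import re

# Assumed URL shapes; real Google URLs come in more variants than these.
_DOCS_RE = re.compile(
    r"https?://docs\.google\.com/(document|spreadsheets|presentation)/d/([A-Za-z0-9_-]+)"
)
_DRIVE_PATTERNS = (
    ("file", re.compile(r"https?://drive\.google\.com/file/d/([A-Za-z0-9_-]+)")),
    ("folder", re.compile(r"https?://drive\.google\.com/drive/(?:u/\d+/)?folders/([A-Za-z0-9_-]+)")),
)


def classify_url(url):
    """Return (identifier, kind) for a Google Docs or Google Drive URL."""
    match = _DOCS_RE.match(url)
    if match:
        return f"docs-{match.group(2)}", match.group(1)
    for kind, pattern in _DRIVE_PATTERNS:
        match = pattern.match(url)
        if match:
            return f"drive-{match.group(1)}", kind
    raise ValueError(f"unrecognized Google URL: {url!r}")
```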

### Common Steps
- Identifiers are sanitized and subject tags are truncated to fit Archive.org requirements
- Publisher defaults to "IAdrive" since collaborator fetching is not yet implemented
- Folder structure is preserved by default (can be disabled with `--disable-slash-files`)
- Downloaded files are automatically cleaned up after upload
- Errors are handled gracefully, and debug output is available with `--debug`
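
The sanitizing and truncation mentioned in the first bullet could be approximated as below. The allowed character set, the 100-character identifier cap, and the 255-byte subject limit are assumptions chosen for this sketch; the README does not state the exact limits IAdrive applies. Both function names are hypothetical.

```python
import re

MAX_IDENTIFIER_LENGTH = 100  # assumed limit, not stated in the README
MAX_SUBJECT_BYTES = 255      # assumed limit, not stated in the README


def sanitize_identifier(raw):
    """Replace disallowed characters and cap the identifier length."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "-", raw)
    # Assumption: identifiers should begin with an alphanumeric character.
    cleaned = cleaned.lstrip("._-")
    return cleaned[:MAX_IDENTIFIER_LENGTH]


def truncate_subject(tags, limit=MAX_SUBJECT_BYTES):
    """Join tags with ';', dropping trailing tags once the byte limit is hit."""
    kept = []
    used = 0
    for tag in tags:
        # Count one extra byte for the separator before every tag but the first.
        extra = len(tag.encode("utf-8")) + (1 if kept else 0)
        if used + extra > limit:
            break
        kept.append(tag)
        used += extra
    return ";".join(kept)
```

Dropping whole tags, rather than cutting one in the middle, keeps every remaining subject tag meaningful.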

## Supported Platforms

For a list of supported platforms for archiving, please see [`SUPPORTEDPLATFORMS.md`](SUPPORTEDPLATFORMS.md)

## To-do list

- Google Drive collaborator fetching to use as creator metadata through the Google API
- Batch processing

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/Andres9890/iadrive",
    "name": "iadrive",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "archive.org, archiving, google drive, internet archive, file mirroring",
    "author": "Andres99",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/f3/e3/92bc621ce67c87e298f308e2e614d51b0b21b6ffc0ad38908ea071a0cf60/iadrive-1.0.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Download Google Drive files/folders and upload them to the Internet Archive",
    "version": "1.0.5",
    "project_urls": {
        "Homepage": "https://github.com/Andres9890/iadrive",
        "issues": "https://github.com/Andres9890/iadrive/issues",
        "source": "https://github.com/Andres9890/iadrive.git"
    },
    "split_keywords": [
        "archive.org",
        " archiving",
        " google drive",
        " internet archive",
        " file mirroring"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b0a914aefd23949bc97541f0ad0d9ae541f8631723024b7d63feaa408c610d88",
                "md5": "3897e392b16671d90bb17857a3d45664",
                "sha256": "90f5bde4cfca02f21495fdbb3a0b85ec1825e7645b749de1bbc84f23435f014d"
            },
            "downloads": -1,
            "filename": "iadrive-1.0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3897e392b16671d90bb17857a3d45664",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 18988,
            "upload_time": "2025-08-28T22:26:45",
            "upload_time_iso_8601": "2025-08-28T22:26:45.901412Z",
            "url": "https://files.pythonhosted.org/packages/b0/a9/14aefd23949bc97541f0ad0d9ae541f8631723024b7d63feaa408c610d88/iadrive-1.0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f3e392bc621ce67c87e298f308e2e614d51b0b21b6ffc0ad38908ea071a0cf60",
                "md5": "f4e569650a4ab7286e1bd2dd4df22de0",
                "sha256": "7769596fb27da5cc82893753afa689871e9473899642d1b0a7bf8fdcf69d7a84"
            },
            "downloads": -1,
            "filename": "iadrive-1.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "f4e569650a4ab7286e1bd2dd4df22de0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 19987,
            "upload_time": "2025-08-28T22:26:47",
            "upload_time_iso_8601": "2025-08-28T22:26:47.692625Z",
            "url": "https://files.pythonhosted.org/packages/f3/e3/92bc621ce67c87e298f308e2e614d51b0b21b6ffc0ad38908ea071a0cf60/iadrive-1.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-28 22:26:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Andres9890",
    "github_project": "iadrive",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "internetarchive",
            "specs": [
                [
                    ">=",
                    "5.5.0"
                ]
            ]
        },
        {
            "name": "gdown",
            "specs": [
                [
                    ">=",
                    "5.2.0"
                ]
            ]
        },
        {
            "name": "docopt-ng",
            "specs": [
                [
                    ">=",
                    "0.9.0"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    ">=",
                    "2.9.0.post0"
                ]
            ]
        }
    ],
    "lcname": "iadrive"
}
        