mft2parquet


| Field | Value |
| --- | --- |
| Name | mft2parquet |
| Version | 0.11 |
| home_page | https://github.com/hansalemaos/mft2parquet |
| Summary | mft to parquet (pyarrow dtypes) |
| upload_time | 2023-09-14 03:08:15 |
| maintainer | |
| docs_url | None |
| author | Johannes Fischer |
| requires_python | |
| license | MIT |
| keywords | mft, file, search |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
            
# mft to parquet (pyarrow dtypes)

## Tested against Windows 10 / Python 3.11 / Anaconda

### pip install mft2parquet


```text
Reads file metadata for every entry on the specified drive (taken from the NTFS Master File Table, MFT) and returns it as a pandas DataFrame.

Args:
drive (str, optional): The drive path to read from. Default is "c:\\".
outputfile (str, optional): If provided, the DataFrame will be saved as a Parquet file at this path. Default is None.

Returns:
pd.DataFrame: A DataFrame with pyarrow dtypes containing HDD information with the specified columns.

Raises:
subprocess.CalledProcessError: If the external command fails to execute.

Note:
- This function uses an external command-line utility https://github.com/githubrobbi/Ultra-Fast-File-Search to retrieve HDD information.
- The DataFrame will have the following columns:
    - aa_path
    - aa_name
    - aa_path_only
    - aa_size
    - aa_size_on_disk
    - aa_created
    - aa_last_written
    - aa_last_accessed
    - aa_descendents
    - aa_read-only
    - aa_archive
    - aa_system
    - aa_hidden
    - aa_offline
    - aa_not_content_indexed_file
    - aa_no_scrub_file
    - aa_integrity
    - aa_pinned
    - aa_unpinned
    - aa_directory_flag
    - aa_compressed
    - aa_encrypted
    - aa_sparse
    - aa_reparse
    - aa_attributes

Example:
df = read_hdd(drive="d:\\", outputfile="hdd_info.parquet")
# Reads HDD information from the 'D:' drive and saves it as 'hdd_info.parquet'.
```
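
The description above can be turned into a small script. The following is only a sketch under stated assumptions: the import path `from mft2parquet import read_hdd` is assumed from the package name and is not confirmed by the README, and the helpers `normalize_drive`, `scan_drive`, and `largest_files` are hypothetical names introduced here for illustration. It also assumes that `aa_size` holds sortable values.

```python
import subprocess


def normalize_drive(drive):
    """Turn inputs such as 'D', 'd:' or 'd:/' into the 'd:\\' form used in the README example."""
    letter = drive.strip().rstrip("\\/").rstrip(":")
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError(f"not a drive letter: {drive!r}")
    return letter.lower() + ":\\"


def scan_drive(drive="c:\\", outputfile=None):
    """Call read_hdd and return its DataFrame, or None if the external tool fails."""
    # Assumption: read_hdd is importable from the package top level.
    # Imported lazily so this module still loads where the package is not installed.
    from mft2parquet import read_hdd

    try:
        return read_hdd(drive=normalize_drive(drive), outputfile=outputfile)
    except subprocess.CalledProcessError as exc:
        # The README documents this exception for a failing external command.
        print(f"Scan failed with exit code {exc.returncode}")
        return None


def largest_files(df, n=10):
    """Return the n rows with the largest aa_size (assumes aa_size is sortable)."""
    return df.sort_values("aa_size", ascending=False).head(n)[["aa_path", "aa_size"]]


if __name__ == "__main__":
    # Scans D: and also writes the result to 'hdd_info.parquet', as in the README example.
    df = scan_drive("d", outputfile="hdd_info.parquet")
    if df is not None:
        print(largest_files(df))
```

The drive normalization is kept separate from the scan so that it can be checked on any platform, while the scan itself needs the Windows environment the package was tested against.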

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/hansalemaos/mft2parquet",
    "name": "mft2parquet",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "mft,file,search",
    "author": "Johannes Fischer",
    "author_email": "aulasparticularesdealemaosp@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/89/80/3ad5d0ae0df238fcc0108cb5531012047741a1f0a6aec23862f6989e16f7/mft2parquet-0.11.tar.gz",
    "platform": null,
    "description": "\r\n# mft to parquet (pyarrow dtypes)\r\n\r\n## Tested against Windows 10 / Python 3.11 / Anaconda\r\n\r\n### pip install mft2parquet\r\n\r\n\r\n```python\r\nReads HDD (Hard Disk Drive) information from a specified drive and returns it as a pandas DataFrame.\r\n\r\nArgs:\r\ndrive (str, optional): The drive path to read from. Default is \"c:\\\\\".\r\noutputfile (str, optional): If provided, the DataFrame will be saved as a Parquet file at this path.\r\n\t\t\t\t\t  Default is None.\r\n\r\nReturns:\r\npd.DataFrame: A DataFrame with pyarrow dtypes containing HDD information with the specified columns.\r\n\r\nRaises:\r\nsubprocess.CalledProcessError: If the external command fails to execute.\r\n\r\nNote:\r\n- This function uses an external command-line utility https://github.com/githubrobbi/Ultra-Fast-File-Search to retrieve HDD information.\r\n- The DataFrame will have the following columns:\r\n- aa_path\r\n- aa_name\r\n- aa_path_only\r\n- aa_size\r\n- aa_size_on_disk\r\n- aa_created\r\n- aa_last_written\r\n- aa_last_accessed\r\n- aa_descendents\r\n- aa_read-only\r\n- aa_archive\r\n- aa_system\r\n- aa_hidden\r\n- aa_offline\r\n- aa_not_content_indexed_file\r\n- aa_no_scrub_file\r\n- aa_integrity\r\n- aa_pinned\r\n- aa_unpinned\r\n- aa_directory_flag\r\n- aa_compressed\r\n- aa_encrypted\r\n- aa_sparse\r\n- aa_reparse\r\n- aa_attributes\r\n\r\nExample:\r\ndf = read_hdd(drive=\"d:\\\\\", outputfile=\"hdd_info.parquet\")\r\n# Reads HDD information from the 'D:' drive and saves it as 'hdd_info.parquet'.\r\n```\r\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "mft to parquet (pyarrow dtypes)",
    "version": "0.11",
    "project_urls": {
        "Homepage": "https://github.com/hansalemaos/mft2parquet"
    },
    "split_keywords": [
        "mft",
        "file",
        "search"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5b611ce7c4389039c4ff723171dc32e0c0c996f34fa3a4a0a30dfd8717308516",
                "md5": "c4ba995b68b1a2dc9bb87c9691f33a4e",
                "sha256": "b7bcad862194241311b99cc0e571a03ebebdf0493d164cbef0358096ce718ecb"
            },
            "downloads": -1,
            "filename": "mft2parquet-0.11-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c4ba995b68b1a2dc9bb87c9691f33a4e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 681541,
            "upload_time": "2023-09-14T03:08:11",
            "upload_time_iso_8601": "2023-09-14T03:08:11.131957Z",
            "url": "https://files.pythonhosted.org/packages/5b/61/1ce7c4389039c4ff723171dc32e0c0c996f34fa3a4a0a30dfd8717308516/mft2parquet-0.11-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "89803ad5d0ae0df238fcc0108cb5531012047741a1f0a6aec23862f6989e16f7",
                "md5": "c235d79266314ab7661d0449f653e9c5",
                "sha256": "38eb1b4fea7f7397809d3cf1dba04afee50a6151bff7b5e9c3d3e02222147761"
            },
            "downloads": -1,
            "filename": "mft2parquet-0.11.tar.gz",
            "has_sig": false,
            "md5_digest": "c235d79266314ab7661d0449f653e9c5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 679240,
            "upload_time": "2023-09-14T03:08:15",
            "upload_time_iso_8601": "2023-09-14T03:08:15.015042Z",
            "url": "https://files.pythonhosted.org/packages/89/80/3ad5d0ae0df238fcc0108cb5531012047741a1f0a6aec23862f6989e16f7/mft2parquet-0.11.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-09-14 03:08:15",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "hansalemaos",
    "github_project": "mft2parquet",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "mft2parquet"
}
        