Name | magika |
Version | 0.5.1 |
home_page | |
Summary | A tool to determine the content type of a file with deep-learning |
upload_time | 2024-03-07 16:44:24 |
maintainer | |
docs_url | None |
author | Yanick Fratantonio |
requires_python | >=3.8,<3.13 |
license | |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# Magika Python Package
Magika is a novel AI-powered file type detection tool that relies on recent advances in deep learning to provide accurate detection. Under the hood, Magika employs a custom, highly optimized Keras model that weighs only about 1MB and enables precise file identification within milliseconds, even when running on a single CPU.
Use Magika as a command line client or in your Python code!
Please check out Magika on GitHub for more information and documentation: [https://github.com/google/magika](https://github.com/google/magika).
## Installing Magika
```shell
$ pip install magika
```
If you intend to use Magika only as a command-line tool, you may want to use `$ pipx install magika` instead.
## Using Magika as a command-line tool
```shell
$ magika examples/*
code.asm: Assembly (code)
code.py: Python source (code)
doc.docx: Microsoft Word 2007+ document (document)
doc.ini: INI configuration file (text)
elf64.elf: ELF executable (executable)
flac.flac: FLAC audio bitstream data (audio)
image.bmp: BMP image data (image)
java.class: Java compiled bytecode (executable)
jpg.jpg: JPEG image data (image)
pdf.pdf: PDF document (document)
pe32.exe: PE executable (executable)
png.png: PNG image data (image)
README.md: Markdown document (text)
tar.tar: POSIX tar archive (archive)
webm.webm: WebM data (video)
```
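If you need to consume this default output from a script, the lines can be split into their parts. The following is a minimal sketch that assumes every line has the shape `<path>: <description> (<group>)`, as inferred from the sample above; the `Detection` type and `parse_line` helper are hypothetical names, not part of Magika. For anything robust, the `--json` or `--jsonl` options are likely a better fit than parsing text.

```python
import re
from typing import NamedTuple, Optional


class Detection(NamedTuple):
    """One parsed line of default magika output (hypothetical helper type)."""
    path: str
    description: str
    group: str


# Assumed shape: "<path>: <description> (<group>)".
# The path is matched lazily up to the first ": ", so a path that itself
# contains ": " would be split in the wrong place.
_LINE_RE = re.compile(r"^(?P<path>.+?): (?P<description>.+) \((?P<group>[^()]+)\)$")


def parse_line(line: str) -> Optional[Detection]:
    """Parse one output line; return None if it does not match the assumed shape."""
    match = _LINE_RE.match(line.strip())
    if match is None:
        return None
    return Detection(match["path"], match["description"], match["group"])


sample = "doc.docx: Microsoft Word 2007+ document (document)"
print(parse_line(sample))
```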
```help
$ magika --help
Usage: magika [OPTIONS] [FILE]...

  Magika - Determine type of FILEs with deep-learning.

Options:
  -r, --recursive                 When passing this option, magika scans every
                                  file within directories, instead of
                                  outputting "directory"
  --json                          Output in JSON format.
  --jsonl                         Output in JSONL format.
  -i, --mime-type                 Output the MIME type instead of a verbose
                                  content type description.
  -l, --label                     Output a simple label instead of a verbose
                                  content type description. Use --list-output-
                                  content-types for the list of supported
                                  output.
  -c, --compatibility-mode        Compatibility mode: output is as close as
                                  possible to `file` and colors are disabled.
  -s, --output-score              Output the prediction score in addition to
                                  the content type.
  -m, --prediction-mode [best-guess|medium-confidence|high-confidence]
  --batch-size INTEGER            How many files to process in one batch.
  --no-dereference                This option causes symlinks not to be
                                  followed. By default, symlinks are
                                  dereferenced.
  --colors / --no-colors          Enable/disable use of colors.
  -v, --verbose                   Enable more verbose output.
  -vv, --debug                    Enable debug logging.
  --generate-report               Generate report useful when reporting
                                  feedback.
  --version                       Print the version and exit.
  --list-output-content-types     Show a list of supported content types.
  --model-dir DIRECTORY           Use a custom model.
  -h, --help                      Show this message and exit.

  Magika version: "0.5.0"

  Default model: "standard_v1"

  Send any feedback to magika-dev@google.com or via GitHub issues.
```
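When calling the command-line client from Python, it can help to assemble the argument list in one place. The sketch below uses only flags that appear in the help text above; the function names `build_magika_command` and `run_magika` are hypothetical, and the code does not check whether Magika accepts every combination of options.

```python
import subprocess
from typing import List, Optional, Sequence

# Values listed for -m/--prediction-mode in the help text.
PREDICTION_MODES = ("best-guess", "medium-confidence", "high-confidence")


def build_magika_command(
    paths: Sequence[str],
    recursive: bool = False,
    json_output: bool = False,
    prediction_mode: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for the magika CLI (hypothetical helper)."""
    if prediction_mode is not None and prediction_mode not in PREDICTION_MODES:
        raise ValueError(f"unknown prediction mode: {prediction_mode!r}")
    command = ["magika"]
    if recursive:
        command.append("--recursive")
    if json_output:
        command.append("--json")
    if prediction_mode is not None:
        command += ["--prediction-mode", prediction_mode]
    command.extend(paths)
    return command


def run_magika(paths: Sequence[str], **options) -> str:
    """Run the CLI and return its standard output; needs magika on PATH."""
    completed = subprocess.run(
        build_magika_command(paths, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems with unusual file names.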
## Using Magika as a Python module
```python
from magika import Magika
magika = Magika()
result = magika.identify_bytes(b"# Example\nThis is an example of markdown!")
print(result.output.ct_label) # Output: "markdown"
```
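The same call can be applied to several in-memory samples at once. This is a sketch under the assumption that the result object exposes `output.ct_label` as in the snippet above; `label_samples` and `main` are hypothetical names, and the sample contents are made up for illustration.

```python
from typing import Dict


def label_samples(identifier, samples: Dict[str, bytes]) -> Dict[str, str]:
    """Map each sample name to the label predicted for its bytes.

    `identifier` is any object with an `identify_bytes(bytes)` method whose
    result exposes `.output.ct_label`, as the Magika instance above does.
    """
    return {
        name: identifier.identify_bytes(data).output.ct_label
        for name, data in samples.items()
    }


def main() -> None:
    # Imported here so label_samples can be reused without loading the model.
    from magika import Magika

    samples = {
        "notes": b"# Example\nThis is an example of markdown!",
        "script": b"import os\n\nprint(os.getcwd())\n",
    }
    for name, label in label_samples(Magika(), samples).items():
        print(f"{name}: {label}")
```

Call `main()` in an environment where the package is installed. Creating one `Magika()` instance and reusing it for every sample avoids constructing the object repeatedly.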
## Citation
If you use this software for your research, please cite it as:
```bibtex
@software{magika,
author = {Fratantonio, Yanick and Bursztein, Elie and Invernizzi, Luca and Zhang, Marina and Metitieri, Giancarlo and Kurt, Thomas and Galilee, Francois and Petit-Bianco, Alexandre and Farah, Loua and Albertini, Ange},
title = {{Magika content-type scanner}},
url = {https://github.com/google/magika}
}
```
Raw data
{
"_id": null,
"home_page": "",
"name": "magika",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8,<3.13",
"maintainer_email": "",
"keywords": "",
"author": "Yanick Fratantonio",
"author_email": "yanickf@google.com",
"download_url": "https://files.pythonhosted.org/packages/1a/58/c1d8887354d0ff2256d4d78d08a69bcc55719a0189afa706c51da04390f2/magika-0.5.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "A tool to determine the content type of a file with deep-learning",
"version": "0.5.1",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6679e1c167ec35060692b70bfc4f2d0aa9314dd7e37ba8e30c1c27965e2f1daa",
"md5": "b7198531cbbf7985862259bb10653a2f",
"sha256": "a4d1f64f71460f335841c13c3d16cfc2cb21e839c1898a1ae9bd5adc8d66cb2b"
},
"downloads": -1,
"filename": "magika-0.5.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b7198531cbbf7985862259bb10653a2f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8,<3.13",
"size": 1008301,
"upload_time": "2024-03-07T16:44:22",
"upload_time_iso_8601": "2024-03-07T16:44:22.222115Z",
"url": "https://files.pythonhosted.org/packages/66/79/e1c167ec35060692b70bfc4f2d0aa9314dd7e37ba8e30c1c27965e2f1daa/magika-0.5.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "1a58c1d8887354d0ff2256d4d78d08a69bcc55719a0189afa706c51da04390f2",
"md5": "278586fcc194faa4b2b3df09961c7654",
"sha256": "43dc1153a1637327225a626a1550c0a395a1d45ea33ec1f5d46b9b080238bee0"
},
"downloads": -1,
"filename": "magika-0.5.1.tar.gz",
"has_sig": false,
"md5_digest": "278586fcc194faa4b2b3df09961c7654",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8,<3.13",
"size": 1005077,
"upload_time": "2024-03-07T16:44:24",
"upload_time_iso_8601": "2024-03-07T16:44:24.377635Z",
"url": "https://files.pythonhosted.org/packages/1a/58/c1d8887354d0ff2256d4d78d08a69bcc55719a0189afa706c51da04390f2/magika-0.5.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-07 16:44:24",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "magika"
}
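The `digests` entries in the raw data can be used to check a downloaded file before installing it. Below is a minimal sketch using Python's standard `hashlib` module; the helper names and the local file path in the comment are assumptions for illustration, and only the digest value is taken from the metadata above.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(data: bytes, expected_hex: str) -> bool:
    """Check in-memory bytes against an expected hex digest (case-insensitive)."""
    return hashlib.sha256(data).hexdigest() == expected_hex.strip().lower()


# sha256 value copied from the wheel entry in the "urls" list above.
WHEEL_SHA256 = "a4d1f64f71460f335841c13c3d16cfc2cb21e839c1898a1ae9bd5adc8d66cb2b"

# Hypothetical usage, assuming the wheel was saved under its listed filename:
# assert file_sha256("magika-0.5.1-py3-none-any.whl") == WHEEL_SHA256
```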