simplemagic


- Name: simplemagic
- Version: 0.1.11
- Summary: Simple file magic. We try to get a file's mimetype using 'file-magic', the 'file' command and 'puremagic'. On Linux we need the system package 'file-libs', which is usually already installed. On macOS we need the system package 'libmagic', which can be installed with 'brew install libmagic'. On Windows we need the 'file' command, which can be installed with 'pacman -S file' within MSYS2. If the system package is missing, we fall back to 'puremagic', which is written in pure Python without any extra dependencies.
- Upload time: 2023-05-12 15:06:46
- Author: zencore
- License: MIT
- Keywords: libmagic, file-magic, file

# simplemagic

Simple file magic. We try to get a file's mimetype using 'file-magic', the 'file' command and 'puremagic'. On Linux we need the system package 'file-libs', which is usually already installed. On macOS we need the system package 'libmagic', which can be installed with 'brew install libmagic'. On Windows we need the 'file' command, which can be installed with 'pacman -S file' within MSYS2. If the system package is missing, we fall back to 'puremagic', which is written in pure Python without any extra dependencies.
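
The fallback order described above can be pictured with a small sketch. This is not simplemagic's actual implementation: the function `detect_with_fallback` and the three stand-in engines are hypothetical names used only to illustrate the idea of trying each engine in turn and skipping the ones that are unavailable.

```
from typing import Callable, Iterable, Optional

# A detector takes the file content and returns a mimetype, or None if it cannot tell.
Detector = Callable[[bytes], Optional[str]]


def detect_with_fallback(content: bytes, detectors: Iterable[Detector]) -> Optional[str]:
    """Return the first mimetype reported by any detector, or None."""
    for detector in detectors:
        try:
            mimetype = detector(content)
        except Exception:
            # A missing system library or command should not stop the chain.
            continue
        if mimetype:
            return mimetype
    return None


# Hypothetical stand-ins for the three engines, for illustration only.
def libmagic_engine(content: bytes) -> Optional[str]:
    raise OSError("libmagic is not installed")


def file_command_engine(content: bytes) -> Optional[str]:
    return None  # pretend the `file` command is not on PATH


def pure_python_engine(content: bytes) -> Optional[str]:
    return "application/pdf" if content.startswith(b"%PDF-") else None


engines = [libmagic_engine, file_command_engine, pure_python_engine]
result = detect_with_fallback(b"%PDF-1.7 example", engines)
print(result)
```

Here the first two engines fail or give no answer, so the pure Python engine decides the result.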

## Install

```
pip3 install simplemagic
```

## System requirements

### Linux

- file-libs

It is usually installed already. If not, you can install it with:

```
yum install file-libs
```

### MacOS

- libmagic

You can install it with:

```
brew install libmagic
```

### Windows

libmagic mostly does not work on Windows. We suggest installing MSYS2 on your system; within MSYS2 you can install libmagic with:

```
pacman -S file
```

Add MSYS2's bin path to your system's PATH environment variable. simplemagic can then call the external `file` command to get the mimetype of a file.
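
As a rough illustration of what calling the external `file` command looks like, here is a minimal sketch. It is not taken from simplemagic's source; the helper name `mimetype_from_file_command` and the `timeout` parameter are assumptions. It relies only on the standard `--brief` and `--mime-type` options of `file`, and returns `None` when the command is not on PATH.

```
import shutil
import subprocess
from typing import Optional


def mimetype_from_file_command(path: str, timeout: float = 10.0) -> Optional[str]:
    """Ask the external `file` command for a mimetype; None if it is unavailable or fails."""
    executable = shutil.which("file")
    if executable is None:
        # The command is not on PATH (for example, MSYS2's bin path was not added).
        return None
    try:
        completed = subprocess.run(
            # "--" keeps a path that starts with "-" from being read as an option.
            [executable, "--brief", "--mime-type", "--", path],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return completed.stdout.strip() or None
```

A caller can treat `None` as "this engine is unavailable" and move on to the next one.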

## APIs

- simplemagic.get_mimetype_by_stream
- simplemagic.get_mimetype_by_filename
- simplemagic.guess_all_extensions
- simplemagic.is_file_content_matches_with_file_extension # In most cases this is the only function you need: it checks whether the file content matches the file extension.
- simplemagic.file_content_matches_with_file_extension_test

You can read the source code to find other private APIs, which you may need in order to reset the global settings or the running environment.

### simplemagic.magic.file_content_matches_with_file_extension_test

```
def file_content_matches_with_file_extension_test(
        filename,
        stream=None,
        enable_using_magic=True,
        enable_using_file_command=True,
        enable_using_puremagic=True,
        magic_content_length=MAGIC_CONTENT_LENGTH,
        lax_extensions=None,
        ):
    """Detect the file's mimetype by its content and test whether it matches the given file extension.

    Returns:
        (bool): True if file content matches with the file extension.
        (str): The file's extension.
        (str): The mimetype detected by the file content.

    Parameters:
        filename(str): A filename string.
        stream(file): Opened file instance. If stream is
        enable_using_magic(bool): Use libmagic engine or not. Default to True.
        enable_using_file_command(bool): Use file command or not. Default to True.
        enable_using_puremagic(bool): Use puremagic engine or not. Default to True.
        magic_content_length(int): Maximum number of bytes to read when detecting the file's mimetype.
        lax_extensions(List[List[str]]): Extra information for the comparison. Extensions in the same lax set are treated as interchangeable.
    """
    pass
```
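
The `stream` and `magic_content_length` parameters suggest that only the head of the stream is read, and the v0.1.1 release note says the stream position is recovered after detection. A minimal sketch of that pattern follows. The helper `peek_magic_content` is a hypothetical name, and reading from the start of the stream is an assumption made for this example, not confirmed behaviour of simplemagic.

```
import io


def peek_magic_content(stream, length):
    """Read up to `length` bytes from the start of a seekable binary stream,
    then put the stream back where the caller left it."""
    position = stream.tell()
    try:
        stream.seek(0)
        return stream.read(length)
    finally:
        # Restore the position even if the read raises.
        stream.seek(position)


stream = io.BytesIO(b"%PDF-1.7 example body")
stream.seek(5)  # the caller has already consumed part of the stream
head = peek_magic_content(stream, 4)
print(head, stream.tell())
```

Restoring the position means the caller can still save or process the whole upload after the check.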

### simplemagic.magic.is_file_content_matches_with_file_extension

```
def is_file_content_matches_with_file_extension(*args, **kwargs):
    """Detect the file's mimetype by its content and test whether it matches the given file extension.

    Returns:
        (bool): True if file content matches with the file extension.

    Parameters:
        filename(str): A filename string.
        stream(file): Opened file instance. If stream is
        enable_using_magic(bool): Use libmagic engine or not. Default to True.
        enable_using_file_command(bool): Use file command or not. Default to True.
        enable_using_puremagic(bool): Use puremagic engine or not. Default to True.
        magic_content_length(int): Maximum number of bytes to read when detecting the file's mimetype.
        lax_extensions(List[List[str]]): Extra information for the comparison. Extensions in the same lax set are treated as interchangeable.
    """
```
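
To make the `lax_extensions` idea concrete, here is a small sketch of one way such a comparison could work. It is an illustration under assumptions, not the library's code: `extension_matches` and `candidate_extensions` are hypothetical names, and only the `LAX_IMAGE_EXTENSIONS` values come from the release notes in this README.

```
# Values taken from the v0.1.8 release note.
LAX_IMAGE_EXTENSIONS = [".png", ".jpg", ".jpe", ".jpeg", ".gif", ".bmp", ".tif", ".tiff", ".webp", ".ico"]


def extension_matches(ext, candidate_extensions, lax_extensions=None):
    """Return True if `ext` is a candidate extension of the detected mimetype,
    or shares a lax set with one of the candidates."""
    ext = ext.lower()
    candidates = {candidate.lower() for candidate in candidate_extensions}
    if ext in candidates:
        return True
    for lax_set in lax_extensions or []:
        members = {member.lower() for member in lax_set}
        # Both the given extension and some candidate must be in the same lax set.
        if ext in members and candidates & members:
            return True
    return False


# A PNG image saved as "photo.jpg": rejected strictly, accepted with the lax image set.
strict = extension_matches(".jpg", [".png"])
lax = extension_matches(".jpg", [".png"], lax_extensions=[LAX_IMAGE_EXTENSIONS])
# A plain-text script renamed to ".jpg" is still rejected.
script = extension_matches(".jpg", [".txt", ".sh"], lax_extensions=[LAX_IMAGE_EXTENSIONS])
print(strict, lax, script)
```

The lax set only relaxes the comparison among its own members, so content detected as a non-image type is never accepted under an image extension.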

## Examples

```
import simplemagic

filename = "ok.docx"
result, ext, mimetype = simplemagic.file_content_matches_with_file_extension_test(filename)
if result:
    print("The file content matches the file extension.")
else:
    print("The file content does NOT match the file extension.")
    print(f"The mimetype detected from the file content is {mimetype}, but the given file extension {ext} is not in the suggested extension set of this mimetype!")
```

## filemagic command-line utility

simplemagic also ships a command-line utility, `filemagic`.

### Usage of the command filemagic

```
test@test simplemagic % filemagic --help
Usage: filemagic [OPTIONS] [FILENAME]...

  Get file's mimetype information.

Options:
  --disable-magic         Don't use libmagic.
  --disable-file-command  Don't use file command.
  --disable-puremagic     Don't use puremagic.
  --help                  Show this message and exit.
```

### Example test results

```
test@test simplemagic % filemagic *

ok.bash_history: text/plain
ok.bash_profile: text/plain
ok.bashrc: text/plain
ok.conf: text/plain
ok.coverage: application/vnd.sqlite3
ok.csv: text/plain
ok.dat: application/octet-stream
ok.doc: application/msword
ok.docx: application/vnd.openxmlformats-officedocument.wordprocessingml.document
ok.dot: application/vnd.openxmlformats-officedocument.wordprocessingml.document
ok.dps: application/vnd.openxmlformats-officedocument.presentationml.presentation
ok.dpt: application/vnd.openxmlformats-officedocument.presentationml.presentation
ok.et: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
ok.ett: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
ok.gif: image/gif
ok.gitignore: text/plain
ok.htaccess: text/plain
ok.in: text/plain
ok.ini: text/plain
ok.java: text/x-java
ok.jpg: image/jpeg
ok.less: text/plain
ok.log: text/plain
ok.md: text/plain
ok.pages: application/zip
ok.pdf: application/pdf
ok.pl: text/x-perl
ok.png: image/png
ok.pptx: application/vnd.openxmlformats-officedocument.presentationml.presentation
ok.properties: text/plain
ok.py: text/x-script.python
ok.rpm: application/x-rpm
ok.scss: text/plain
ok.sh: text/x-shellscript
ok.sql: text/plain
ok.svg: image/svg+xml
ok.tar.gz: application/gzip
ok.ttf: font/sfnt
ok.txt: text/plain
ok.txt.bz2: application/x-bzip2
ok.whl: application/zip
ok.woff: application/octet-stream
ok.woff2: application/octet-stream
ok.wps: application/vnd.openxmlformats-officedocument.wordprocessingml.document
ok.wpt: application/vnd.openxmlformats-officedocument.wordprocessingml.document
ok.wsdl: text/xml
ok.xlsx: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
ok.xmind: application/zip
ok.xml: text/xml
ok.xsl: text/xml
ok.yml: text/plain
ok.zip: application/zip
private.DS_Store: application/octet-stream
private.bpmn: text/xml
private.cab: application/vnd.ms-cab-compressed
private.class: application/x-java-applet
private.dll: application/x-dosexec
private.dmg: application/x-bzip2
private.doc: application/msword
private.dwg: image/vnd.dwg
private.fla: application/CDFV2
private.ftl: text/html
private.ico: image/vnd.microsoft.icon
private.img: application/octet-stream
private.inf: text/plain
private.jsp: text/html
private.mht: message/rfc822
private.mp4: video/mp4
private.mpp: application/vnd.ms-office
private.msi: application/x-msi
private.pcap: application/vnd.tcpdump.pcap
private.pps: application/vnd.ms-powerpoint
private.ppt: application/vnd.ms-powerpoint
private.psd: image/vnd.adobe.photoshop
private.pyc: application/x-bytecode.python
private.rar: application/x-rar
private.reg: text/x-ms-regedit
private.swf: application/x-shockwave-flash
private.tar: application/x-tar
private.tif: image/tiff
private.vsd: application/vnd.ms-office
private.xls: application/vnd.ms-excel
private.xps: application/zip
private.xsd: text/xml
```
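
If you want to consume this output from a script, the `name: mimetype` lines are easy to parse. The sketch below is an assumption-laden helper, not part of simplemagic: `parse_filemagic_output` is a hypothetical name, and it assumes the output format stays exactly as shown above.

```
def parse_filemagic_output(text):
    """Turn `filename: mimetype` lines into a dict, skipping blank or malformed lines."""
    results = {}
    for line in text.splitlines():
        # Split on the last ": " so a filename containing ": " stays intact.
        name, separator, mimetype = line.rpartition(": ")
        if separator and name:
            results[name] = mimetype.strip()
    return results


sample = """
ok.docx: application/vnd.openxmlformats-officedocument.wordprocessingml.document
ok.tar.gz: application/gzip
private.DS_Store: application/octet-stream
"""
parsed = parse_filemagic_output(sample)
print(parsed["ok.tar.gz"])
```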

## Notice

Always upgrade libmagic to the latest version; older versions of libmagic may give wrong answers.

## Compatibility

- Tests passed on Python 3.6, 3.7, 3.8, 3.9 and 3.10.
- Tests failed on Python 2.7, 3.3, 3.4 and 3.5.

## Releases

### v0.1.0

- First release.

### v0.1.1

- Restore the stream position after mimetype detection.
- Fix small file handling problem in puremagic.
- Fix .gz extension problem.
- Fix .bz2 extension problem.


### v0.1.5

- Make the function is_file_content_matches_with_file_extension public.
- Use magic.detect_from_fobj instead of magic.detect_from_content to improve recognition.
- Change register_mimetype_extensions' parameters, and fix the problem.
- Fix .dps, .dpt, .et, .ett extension problems.
- Fix .dox problem.
- Fix .mptt problem.
- Fix .csv problem.
- Fix .pcap problem.
- Fix .rpm problem.
- Fix .dmg problem.
- Fix .reg problem.
- Fix .dwg problem.
- Fix .xps problem.
- Fix .ttf problem.
- Fix .woff and .woff2 problem.
- Fix java .class problem.
- Fix .jsp problem.
- Fix .less and .scss problem.
- Fix .pyc problem.
- Fix .fla problem.
- Fix .vsd problem.


### v0.1.7

- Add the magic_content_length parameter to the function is_file_content_matches_with_file_extension to control the stream content read length locally.
- Fix export api name problem.

### v0.1.8

- Add the lax_extensions parameter to the function is_file_content_matches_with_file_extension to support lax extension comparison, especially for users who mix up image extensions such as .jpg and .png.
- Add LAX_IMAGE_EXTENSIONS = [".png", ".jpg", ".jpe", ".jpeg", ".gif", ".bmp", ".tif", ".tiff", ".webp", ".ico"].

### v0.1.9

- Remove .svg from the candidate extensions of text/plain, because neither libmagic nor puremagic treats .svg files as text/plain: libmagic reports image/svg+xml and puremagic reports application/xml. Keeping .svg among the candidate extensions of text/plain would, in lax image comparison mode, allow users to upload plain-text scripts to image file fields.

### v0.1.10

- Add mimetype and file extension bindings from nginx's default mimetype settings.
- Add mimetype and file extension bindings from https://docs.w3cub.com/http/basics_of_http/mime_types/complete_list_of_mime_types.html.
- Add mimetype and file extension bindings from https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types.


### v0.1.11

- Add many items to EXTRA_MIMETYPE_EXTENSIONS.

            
