AstroPT


- Name: AstroPT
- Version: 1.0.0
- Summary: Transformer for galaxy images (and general astronomy)
- Upload time: 2024-12-08 18:18:26
- Requires Python: >=3.10
- Requirements: datasets, gudhi, h5py, kmapper, matplotlib, multiprocess, numpy, Requests, scikit_learn, scipy, tiktoken, torch, tqdm, traces, transformers, umap_learn, wandb

<div align="center">
    
<img src="assets/emoji.png" alt="astroPT" width="150"/>

[![ICML](https://img.shields.io/badge/AI4Science@ICML-2024---?logo=https%3A%2F%2Fneurips.cc%2Fstatic%2Fcore%2Fimg%2FNeurIPS-logo.svg&labelColor=68448B&color=b3b3b3)](https://openreview.net/forum?id=aOLuuLxqav)
[![arXiv](https://img.shields.io/badge/arXiv-2405.14930---?logo=arXiv&labelColor=b31b1b&color=grey)](https://arxiv.org/abs/2405.14930)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![All Contributors](https://img.shields.io/badge/all_contributors-4-orange.svg?style=flat-square)](#contributors)
</div>


# astroPT: a Large Observation Model for astronomy 🔭

Welcome to our simple repository for training astronomical large observation
models. This repository began its life as Andrej Karpathy's
[nanoGPT](https://github.com/karpathy/nanoGPT) and has been adapted to work
with image data. Within `train.py` you will find a ~300-line boilerplate
training loop, and within `model.py` you will find a ~300-line GPT model
definition with an MLP tokeniser and a regression loss.

Check out the [UniverseTBD](https://universetbd.org/) Discord for updates:
[https://discord.gg/MNEVegvfJq](https://discord.gg/MNEVegvfJq)

## install

Dependencies:

- `pip install -r requirements.txt`

## results

AstroPT v1.0.0 has been trained on 8.6M galaxy grz-band `*.png` postage stamps
downloaded from DESI-LS DR8 to see if neural scaling laws apply to galaxian
data (in other words, to see if `more galaxy data == more better model`).
We tried to make the astroPT model as simple as possible so that other
modalities can be easily folded in. We also chose to use a causally trained
autoregressive transformer model as our backbone so that our work can more
easily integrate with the wider deep learning FOSS community.

Our pretraining task is feeding in our galaxy images patch-by-patch and
predicting the next patch in our galaxy patch sequence. We follow ViT
and define a patch as a 16 by 16 pixel square, and feed the galaxy patches
in a spiral order:

<p align="center">
    <img src="explore/galaxy.png" alt="galaxy" width="128"/>
</p>
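
The patch ordering described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the spiral here starts at the central patch and winds outward clockwise, and the image is square with a side that is a multiple of the patch size. The function names `spiral_order` and `spiral_patch_slices` are hypothetical and are not taken from `train.py` or `model.py`.

```python
def spiral_order(n):
    """Return (row, col) coordinates of an n x n patch grid, centre outward.

    Assumption: the spiral starts at the central patch and winds clockwise.
    """
    row = col = (n - 1) // 2
    order = [(row, col)]
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
    step, direction = 1, 0
    while len(order) < n * n:
        for _ in range(2):  # each step length is used for two directions
            d_row, d_col = moves[direction % 4]
            for _ in range(step):
                row, col = row + d_row, col + d_col
                if 0 <= row < n and 0 <= col < n:  # skip cells off the grid
                    order.append((row, col))
            direction += 1
        step += 1
    return order


def spiral_patch_slices(image_size, patch_size=16):
    """Pixel slices for every patch of a square image, in spiral order."""
    if image_size % patch_size:
        raise ValueError("image_size must be a multiple of patch_size")
    n = image_size // patch_size
    return [
        (slice(r * patch_size, (r + 1) * patch_size),
         slice(c * patch_size, (c + 1) * patch_size))
        for r, c in spiral_order(n)
    ]


order = spiral_order(3)
print(order)
# [(1, 1), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0), (0, 0), (0, 1), (0, 2)]

# Next-patch prediction pairs: given the patch at position i, predict i + 1.
pairs = list(zip(order[:-1], order[1:]))
```

With a NumPy array or torch tensor `image`, indexing `image[rows, cols]` with each slice pair from `spiral_patch_slices` would give the 16 by 16 patches in sequence order.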

The trained model results are promising -- below we show our full training run
validation losses across a parameter sweep of `{1,5,12,21,89,309,830,2100}M`
trainable parameters:

<p align="center">
    <img src="explore/scaling_xkcd.png" alt="scaling" width="512"/>
</p>
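
To make the idea of a scaling law concrete, here is a minimal sketch of fitting a power law `loss = a * N**(-alpha)` to a parameter sweep by linear regression in log-log space. The parameter counts are the ones listed above, but the loss values are synthetic and generated from a known power law purely for illustration; they are not our measured validation losses. The pure power-law form and the function name `fit_power_law` are also assumptions.

```python
import math


def fit_power_law(n_params, losses):
    """Least-squares fit of log(loss) = log(a) - alpha * log(N).

    Returns (a, alpha). Assumes a pure power law with no irreducible loss term.
    """
    xs = [math.log(n) for n in n_params]
    ys = [math.log(loss) for loss in losses]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope


# Parameter counts from the sweep; the losses below are SYNTHETIC placeholders.
n_params = [1e6, 5e6, 12e6, 21e6, 89e6, 309e6, 830e6, 2100e6]
synthetic_losses = [2.0 * n ** -0.1 for n in n_params]

a, alpha = fit_power_law(n_params, synthetic_losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

Real loss curves often flatten at large model sizes, so a fit like this is a first diagnostic rather than a full description of the scaling behaviour.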

We also test our astroPT models on some scientifically useful downstream tasks by
taking the models' penultimate layer outputs and finetuning linear probes to
predict emergent physical properties of the galaxies:

<p align="center">
    <img src="explore/downstream_xkcd.png" alt="downstream" width="512"/>
</p>
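
A linear probe is just a linear map fitted on top of frozen embeddings. The sketch below is a hedged illustration and not the procedure used for the plot above: it swaps in closed-form ridge regression with NumPy, and the embeddings and target are random synthetic stand-ins for the penultimate layer outputs and a galaxy property. The names `fit_linear_probe` and `predict` are hypothetical.

```python
import numpy as np


def add_bias(embeddings):
    """Append a column of ones so the probe can learn an offset."""
    return np.hstack([embeddings, np.ones((embeddings.shape[0], 1))])


def fit_linear_probe(embeddings, targets, l2=1e-3):
    """Closed-form ridge regression: solve (X^T X + l2 I) w = X^T y.

    For simplicity the bias weight is regularised along with the others.
    """
    X = add_bias(embeddings)
    A = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)


def predict(weights, embeddings):
    return add_bias(embeddings) @ weights


# SYNTHETIC stand-ins: 1000 "galaxies" with 32-dimensional embeddings and a
# target that depends linearly on them, plus a little noise.
rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 32))
true_w = rng.normal(size=32)
y = emb @ true_w + 0.5 + 0.01 * rng.normal(size=1000)

train, val = slice(0, 800), slice(800, None)
w = fit_linear_probe(emb[train], y[train])
val_mse = float(np.mean((predict(w, emb[val]) - y[val]) ** 2))
print(f"validation MSE: {val_mse:.4f}")
```

Because the transformer weights stay frozen, the probe's validation error measures how linearly accessible a property is in the embedding space.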

In the figure above, $M_g$ and $M_z$ are the absolute magnitudes (or brightness at
a fixed distance) of the galaxies, $g - r$ and $r - z$ are colours, i.e. the
differences between the magnitudes observed in different telescope filter bands,
redshift is a proxy for the distance to the galaxies, sSFR (specific star
formation rate) is the mass of new stars born each year in the galaxies per unit
of stellar mass, and $M_{\*}$ is the total mass of stars within the galaxies.
"smooth?", "disc?", "artefact?", "edge on?" and "tight spiral?" are
morphological properties of the galaxies as described by citizen scientists.
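
For readers less familiar with these quantities, the standard textbook definitions can be written as below. The symbols $m_x$ (apparent magnitude in band $x$), $d_L$ (luminosity distance) and SFR (star formation rate) are introduced here for illustration, and the expression for $M_x$ ignores K-corrections and dust extinction:

$$
M_x = m_x - 5 \log_{10}\left(\frac{d_L}{10\,\mathrm{pc}}\right), \qquad
g - r = m_g - m_r, \qquad
\mathrm{sSFR} = \frac{\mathrm{SFR}}{M_{\ast}}
$$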

The cool thing to take away from these plots is that the surrogate task loss
(predicting the next patch in a sequence of ViT-like galaxy image patches)
is correlated with performance on astronomically "useful" downstream tasks 🤯🚀.

Finally, check out our UMAP projection of astroPT-87M's penultimate layer
outputs of our validation set. We colour each point with an emergent physical
galaxy property described above. The structure suggests that the model has
learnt some knowledge about physics simply from our next-token prediction
pretraining task!

<p align="center">
    <img src="explore/hexbin_xkcd.png" alt="hexbin" width="512"/>
</p>

## pretrained weights, and full galaxy dataset

Check out the paper here: [https://arxiv.org/abs/2405.14930](https://arxiv.org/abs/2405.14930).

We release all our model weights, checkpointed across our full training runs, on [HuggingFace 🤗 here](https://huggingface.co/Smith42/astroPT).

We also release our full dataset and galaxy metadata on [HuggingFace 🔥](https://huggingface.co/datasets/Smith42/galaxies).

## contributors

<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<table>
  <tbody>
    <tr>
      <td align="center" valign="top" width="14.28%"><a href="https://github.com/RJ-Roberts"><img src="https://avatars.githubusercontent.com/u/131991163?v=4?s=100" width="100px;" alt="Ryan Roberts"/><br /><sub><b>Ryan Roberts</b></sub></a><br /><a href="https://github.com/Smith42/astroPT/commits?author=RJ-Roberts" title="Code">💻</a> <a href="#ideas-RJ-Roberts" title="Ideas, Planning, & Feedback">🤔</a> <a href="#content-RJ-Roberts" title="Content">🖋</a></td>
      <td align="center" valign="top" width="14.28%"><a href="https://mjjsmith.com/"><img src="https://avatars.githubusercontent.com/u/8194280?v=4?s=100" width="100px;" alt="Mike Smith"/><br /><sub><b>Mike Smith</b></sub></a><br /><a href="https://github.com/Smith42/astroPT/commits?author=Smith42" title="Code">💻</a> <a href="#ideas-Smith42" title="Ideas, Planning, & Feedback">🤔</a> <a href="#content-Smith42" title="Content">🖋</a> <a href="#data-Smith42" title="Data">🔣</a></td>
      <td align="center" valign="top" width="14.28%"><a href="https://github.com/mhuertascompany"><img src="https://avatars.githubusercontent.com/u/22987973?v=4?s=100" width="100px;" alt="mhuertascompany"/><br /><sub><b>mhuertascompany</b></sub></a><br /><a href="#ideas-mhuertascompany" title="Ideas, Planning, & Feedback">🤔</a> <a href="#content-mhuertascompany" title="Content">🖋</a></td>
      <td align="center" valign="top" width="14.28%"><a href="https://github.com/msiudek"><img src="https://avatars.githubusercontent.com/u/53626980?v=4?s=100" width="100px;" alt="Malgorzata Siudek"/><br /><sub><b>Malgorzata Siudek</b></sub></a><br /><a href="#ideas-msiudek" title="Ideas, Planning, & Feedback">🤔</a> <a href="#content-msiudek" title="Content">🖋</a> <a href="https://github.com/Smith42/astroPT/commits?author=msiudek" title="Code">💻</a> <a href="#data-msiudek" title="Data">🔣</a></td>
    </tr>
  </tbody>
  <tfoot>
    <tr>
      <td align="center" size="13px" colspan="7">
        <img src="https://raw.githubusercontent.com/all-contributors/all-contributors-cli/1b8533af435da9854653492b1327a23a4dbd0a10/assets/logo-small.svg">
          <a href="https://all-contributors.js.org/docs/en/bot/usage">Add your contributions</a>
        </img>
      </td>
    </tr>
  </tfoot>
</table>

<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->

<!-- ALL-CONTRIBUTORS-LIST:END -->

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "AstroPT",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": null,
    "author": null,
    "author_email": "\"Michael J. Smith\" <mike@mjjsmith.com>",
    "download_url": "https://files.pythonhosted.org/packages/ce/55/c1a9a885463fd197b670208a014a598a8c8f412c72c037a1b78b926e64d9/astropt-1.0.0.tar.gz",
    "platform": null,
    "description": "<div align=\"center\">\n    \n<img src=\"assets/emoji.png\" alt=\"earthPT\" width=\"150\"/>\n\n[![ICML](https://img.shields.io/badge/AI4Science@ICML-2024---?logo=https%3A%2F%2Fneurips.cc%2Fstatic%2Fcore%2Fimg%2FNeurIPS-logo.svg&labelColor=68448B&color=b3b3b3)](https://openreview.net/forum?id=aOLuuLxqav)\n[![arXiv](https://img.shields.io/badge/arXiv-2405.14930---?logo=arXiv&labelColor=b31b1b&color=grey)](https://arxiv.org/abs/2405.14930)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![All Contributors](https://img.shields.io/badge/all_contributors-4-orange.svg?style=flat-square)](#contributors-)\n</div>\n\n\n# astroPT: a Large Observation Model for astronomy \ud83d\udd2d\n\nWelcome to our simple repository for training astronomical large observation\nmodels. This repository began its life as Andrej Karpathy's\n[nanoGPT](https://github.com/karpathy/nanoGPT), and has been altered so that it\nis usable for imagery data.  Within `train.py` you will find a ~300-line\nboilerplate training loop and within `model.py` you will find a ~300-line GPT\nmodel definition with an MLP tokeniser and a regressive loss.\n\nCheck out the [UniverseTBD](https://universetbd.org/) Discord for updates:\n[https://discord.gg/MNEVegvfJq](https://discord.gg/MNEVegvfJq)\n\n## install\n\nDependencies:\n\n- `pip install -r requirements.txt`\n\n## results\n\nAstroPT v1.0.0 has been trained on 8.6M galaxy grz band `*.png` postage stamps \ndownloaded from DESI-LS DR8 to see if neural scaling laws apply to galaxian\ndata (in other words, to see if `more galaxy data == more better model`).  \nWe tried to make the astroPT model as simple as possible so that other\nmodalities can be easily folded in. 
We also choose to use a causally trained\nautoregressive transformer model as our backbone so that our work can more\neasily integrate the wider deep learning FOSS community.\n\nOur pretraining task is feeding in our galaxy images patch-by-patch and\npredicting the next patch in our galaxy patch sequence. We follow ViT\nand define a patch as a 16 by 16 pixel square, and feed the galaxy patches\nin a spiral order:\n\n<p align=\"center\">\n    <img src=\"explore/galaxy.png\" alt=\"galaxy\" width=\"128\"/>\n</p>\n\nThe trained model results are promising -- below we show our full training run\nvalidation losses across a parameter sweep of `{1,5,12,21,89,309,830,2100}M`\ntrainable parameters:\n\n<p align=\"center\">\n    <img src=\"explore/scaling_xkcd.png\" alt=\"scaling\" width=\"512\"/>\n</p>\n\nWe also test our astroPT models on some scientifically-useful downstream tasks by\ntaking the models' penultimate layer outputs and finetuning linear probes to\npredict emergent physical properties of the galaxies:\n\n<p align=\"center\">\n    <img src=\"explore/downstream_xkcd.png\" alt=\"downstream\" width=\"512\"/>\n</p>\n\nIn the above pic, $M_g$ and $M_z$ are the absolute magnitudes (or brightness at\na fixed distance) of the galaxies, $g - r$ and $r - z$ are the differences\nbetween the observations of different telescope filter bands, redshift is the\ndistance to the galaxies, sSFR is the total mass of new stars born each year in\nthe galaxies per total galaxy mass, and $M_{\\*}$ is the total mass of stars within\nthe galaxies. 
\"smooth?\", \"disc?\", \"artefact?\", \"edge on?\" and \"tight spiral?\" are\nmorphological properties of the galaxies as described by citizen scientists.\n\nThe cool thing to take away from these plots is that the surrogate task loss\n(predicting the next patch in a sequence of ViT-like galaxy image patches)\nis correlated with astronomically \"useful\" downstream tasks \ud83e\udd2f\ud83d\ude80.\n\nFinally, check out our UMAP projection of astroPT-87M's penultimate layer\noutputs of our validation set. We colour each point with an emergent physical\ngalaxy property described above. The structure suggests that the model has\nlearnt some knowledge about physics simply from our next-token prediction\npretraining task!\n\n<p align=\"center\">\n    <img src=\"explore/hexbin_xkcd.png\" alt=\"hexbin\" width=\"512\"/>\n</p>\n\n## pretrained weights, and full galaxy dataset\n\nCheck out the paper here: [https://arxiv.org/abs/2405.14930](https://arxiv.org/abs/2405.14930).\n\nWe of course release all our model weights checkpointed across our full training runs on [HuggingFace \ud83e\udd17 here](https://huggingface.co/Smith42/astroPT).\n\nWe also release our full dataset and galaxy metadata on [HuggingFace \ud83d\udd25](https://huggingface.co/datasets/Smith42/galaxies).\n\n## contributors\n\n<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->\n<!-- prettier-ignore-start -->\n<!-- markdownlint-disable -->\n<table>\n  <tbody>\n    <tr>\n      <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"https://github.com/RJ-Roberts\"><img src=\"https://avatars.githubusercontent.com/u/131991163?v=4?s=100\" width=\"100px;\" alt=\"Ryan Roberts\"/><br /><sub><b>Ryan Roberts</b></sub></a><br /><a href=\"https://github.com/Smith42/astroPT/commits?author=RJ-Roberts\" title=\"Code\">\ud83d\udcbb</a> <a href=\"#ideas-RJ-Roberts\" title=\"Ideas, Planning, & Feedback\">\ud83e\udd14</a> <a href=\"#content-RJ-Roberts\" title=\"Content\">\ud83d\udd8b</a></td>\n     
 <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"https://mjjsmith.com/\"><img src=\"https://avatars.githubusercontent.com/u/8194280?v=4?s=100\" width=\"100px;\" alt=\"Mike Smith\"/><br /><sub><b>Mike Smith</b></sub></a><br /><a href=\"https://github.com/Smith42/astroPT/commits?author=Smith42\" title=\"Code\">\ud83d\udcbb</a> <a href=\"#ideas-Smith42\" title=\"Ideas, Planning, & Feedback\">\ud83e\udd14</a> <a href=\"#content-Smith42\" title=\"Content\">\ud83d\udd8b</a> <a href=\"#data-Smith42\" title=\"Data\">\ud83d\udd23</a></td>\n      <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"https://github.com/mhuertascompany\"><img src=\"https://avatars.githubusercontent.com/u/22987973?v=4?s=100\" width=\"100px;\" alt=\"mhuertascompany\"/><br /><sub><b>mhuertascompany</b></sub></a><br /><a href=\"#ideas-mhuertascompany\" title=\"Ideas, Planning, & Feedback\">\ud83e\udd14</a> <a href=\"#content-mhuertascompany\" title=\"Content\">\ud83d\udd8b</a></td>\n      <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"https://github.com/msiudek\"><img src=\"https://avatars.githubusercontent.com/u/53626980?v=4?s=100\" width=\"100px;\" alt=\"Malgorzata Siudek\"/><br /><sub><b>Malgorzata Siudek</b></sub></a><br /><a href=\"#ideas-msiudek\" title=\"Ideas, Planning, & Feedback\">\ud83e\udd14</a> <a href=\"#content-msiudek\" title=\"Content\">\ud83d\udd8b</a> <a href=\"https://github.com/Smith42/astroPT/commits?author=msiudek\" title=\"Code\">\ud83d\udcbb</a> <a href=\"#data-msiudek\" title=\"Data\">\ud83d\udd23</a></td>\n    </tr>\n  </tbody>\n  <tfoot>\n    <tr>\n      <td align=\"center\" size=\"13px\" colspan=\"7\">\n        <img src=\"https://raw.githubusercontent.com/all-contributors/all-contributors-cli/1b8533af435da9854653492b1327a23a4dbd0a10/assets/logo-small.svg\">\n          <a href=\"https://all-contributors.js.org/docs/en/bot/usage\">Add your contributions</a>\n        </img>\n      </td>\n    </tr>\n  </tfoot>\n</table>\n\n<!-- 
markdownlint-restore -->\n<!-- prettier-ignore-end -->\n\n<!-- ALL-CONTRIBUTORS-LIST:END -->\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Transformer for galaxy images (and general astronomy)",
    "version": "1.0.0",
    "project_urls": {
        "Homepage": "https://github.com/smith42/astropt",
        "Issues": "https://github.com/smith42/astropt/issues"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "273f7f4cdd8cd55017583f51c2be0e75f28fd97dfeb30208aba502b7f5d99684",
                "md5": "8f7c5f39a2e48538cfbdb346db679108",
                "sha256": "b0eb9d75ee99e8d8051dbbc8f2b293f2a6ef2ea4fb480f7cee5301d4dc924b88"
            },
            "downloads": -1,
            "filename": "astropt-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "8f7c5f39a2e48538cfbdb346db679108",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 4537,
            "upload_time": "2024-12-08T18:18:21",
            "upload_time_iso_8601": "2024-12-08T18:18:21.188169Z",
            "url": "https://files.pythonhosted.org/packages/27/3f/7f4cdd8cd55017583f51c2be0e75f28fd97dfeb30208aba502b7f5d99684/astropt-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ce55c1a9a885463fd197b670208a014a598a8c8f412c72c037a1b78b926e64d9",
                "md5": "857940045c0274ff0d68d61bdee96e69",
                "sha256": "24576c1b6749addded08e29c48e1ae78995f6a7281e1016bcd754adecaddf511"
            },
            "downloads": -1,
            "filename": "astropt-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "857940045c0274ff0d68d61bdee96e69",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 58351425,
            "upload_time": "2024-12-08T18:18:26",
            "upload_time_iso_8601": "2024-12-08T18:18:26.319976Z",
            "url": "https://files.pythonhosted.org/packages/ce/55/c1a9a885463fd197b670208a014a598a8c8f412c72c037a1b78b926e64d9/astropt-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-08 18:18:26",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "smith42",
    "github_project": "astropt",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "datasets",
            "specs": [
                [
                    "==",
                    "2.13.1"
                ]
            ]
        },
        {
            "name": "gudhi",
            "specs": [
                [
                    "==",
                    "3.7.1"
                ]
            ]
        },
        {
            "name": "h5py",
            "specs": [
                [
                    "==",
                    "3.8.0"
                ]
            ]
        },
        {
            "name": "kmapper",
            "specs": [
                [
                    "==",
                    "2.0.1"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    "==",
                    "3.7.1"
                ]
            ]
        },
        {
            "name": "multiprocess",
            "specs": [
                [
                    "==",
                    "0.70.14"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "1.24.3"
                ]
            ]
        },
        {
            "name": "Requests",
            "specs": [
                [
                    "==",
                    "2.32.2"
                ]
            ]
        },
        {
            "name": "scikit_learn",
            "specs": [
                [
                    "==",
                    "1.2.2"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    "==",
                    "1.11.1"
                ]
            ]
        },
        {
            "name": "tiktoken",
            "specs": [
                [
                    "==",
                    "0.3.0"
                ]
            ]
        },
        {
            "name": "torch",
            "specs": [
                [
                    "==",
                    "2.0.1"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    "==",
                    "4.66.3"
                ]
            ]
        },
        {
            "name": "traces",
            "specs": [
                [
                    "==",
                    "0.6.0"
                ]
            ]
        },
        {
            "name": "transformers",
            "specs": [
                [
                    "==",
                    "4.38.0"
                ]
            ]
        },
        {
            "name": "umap_learn",
            "specs": [
                [
                    "==",
                    "0.5.3"
                ]
            ]
        },
        {
            "name": "wandb",
            "specs": [
                [
                    "==",
                    "0.15.4"
                ]
            ]
        }
    ],
    "lcname": "astropt"
}
        