# atai
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
Atomic AI is a flexible, minimalist deep neural network training
framework based on Jeremy Howard’s
[miniai](https://github.com/fastai/course22p2/tree/master) from the
[fast.ai 2022](https://course.fast.ai/Lessons/part2.html) course.
## Install
``` sh
pip install atai
```
## How to use
``` python
from atai.core import *
```
The following example demonstrates how the Atomic AI training framework
can be used to train a custom model that predicts protein solubility.
### Imports
``` python
import torch
import torch.nn.functional as F
import torch.nn as nn
from torch.nn import init
from torch import optim
from torcheval.metrics import BinaryAccuracy, BinaryAUROC
from torcheval.metrics.functional import binary_auroc, binary_accuracy
from torchmetrics.classification import BinaryMatthewsCorrCoef
from torchmetrics.functional.classification import binary_matthews_corrcoef
import fastcore.all as fc
from functools import partial
```
### Load Protein Solubility
This example uses the dataset from the
[DeepSol](https://doi.org/10.1093/bioinformatics/bty166) paper by
Khurana *et al.* which was obtained at
<https://zenodo.org/records/1162886>. It consists of amino acid
sequences of peptides along with solubility labels that are `1` if the
peptide is soluble and `0` if the peptide is insoluble.
``` python
train_sqs = open('sol_data/train_src', 'r').read().splitlines()
train_tgs = list(map(int, open('sol_data/train_tgt', 'r').read().splitlines()))
valid_sqs = open('sol_data/val_src', 'r').read().splitlines()
valid_tgs = list(map(int, open('sol_data/val_tgt', 'r').read().splitlines()))
train_sqs[:2], train_tgs[:2]
```
(['GMILKTNLFGHTYQFKSITDVLAKANEEKSGDRLAGVAAESAEERVAAKVVLSKMTLGDLRNNPVVPYETDEVTRIIQDQVNDRIHDSIKNWTVEELREWILDHKTTDADIKRVARGLTSEIIAAVTKLMSNLDLIYGAKKIRVIAHANTTIGLPGTFSARLQPNHPTDDPDGILASLMEGLTYGIGDAVIGLNPVDDSTDSVVRLLNKFEEFRSKWDVPTQTCVLAHVKTQMEAMRRGAPTGLVFQSIAGSEKGNTAFGFDGATIEEARQLALQSGAATGPNVMYFETGQGSELSSDAHFGVDQVTMEARCYGFAKKFDPFLVNTVVGFIGPEYLYDSKQVIRAGLEDHFMGKLTGISMGCDVCYTNHMKADQNDVENLSVLLTAAGCNFIMGIPHGDDVMLNYQTTGYHETATLRELFGLKPIKEFDQWMEKMGFSENGKLTSRAGDASIFLK',
'MAHHHHHHMSFFRMKRRLNFVVKRGIEELWENSFLDNNVDMKKIEYSKTGDAWPCVLLRKKSFEDLHKLYYICLKEKNKLLGEQYFHLQNSTKMLQHGRLKKVKLTMKRILTVLSRRAIHDQCLRAKDMLKKQEEREFYEIQKFKLNEQLLCLKHKMNILKKYNSFSLEQISLTFSIKKIENKIQQIDIILNPLRKETMYLLIPHFKYQRKYSDLPGFISWKKQNIIALRNNMSKLHRLY'],
[1, 0])
``` python
len(train_sqs), len(train_tgs), len(valid_sqs), len(valid_tgs)
```
(62478, 62478, 6942, 6942)
### Data Preparation
Create a sorted list `aas` of the amino acids that occur in the training
sequences, including an empty string that serves as the padding token,
and determine the size of the vocabulary.
``` python
aas = sorted(list(set("".join(train_sqs))) + [""])
vocab_size = len(aas)
aas, vocab_size
```
(['',
'A',
'C',
'D',
'E',
'F',
'G',
'H',
'I',
'K',
'L',
'M',
'N',
'P',
'Q',
'R',
'S',
'T',
'V',
'W',
'Y'],
21)
Create dictionaries that translate between string and integer
representations of amino acids and define the corresponding `encode` and
`decode` functions.
``` python
str2int = {aa:i for i, aa in enumerate(aas)}
int2str = {i:aa for i, aa in enumerate(aas)}
encode = lambda s: [str2int[aa] for aa in s]
decode = lambda l: ''.join([int2str[i] for i in l])
print(encode("AYWCCCGGGHH"))
print(decode(encode("AYWCCCGGGHH")))
```
[1, 20, 19, 2, 2, 2, 6, 6, 6, 7, 7]
AYWCCCGGGHH
Determine the range of amino acid sequence lengths in the training
data.
``` python
train_lens = list(map(len, train_sqs))
min(train_lens), max(train_lens)
```
(19, 1691)
Create a function that drops all sequences longer than a chosen
threshold. It also returns the indices of the retained sequences, which
can be used later to select the matching labels.
``` python
def drop_long_sqs(sqs, threshold=1200):
    new_sqs = []
    idx = []
    for i, sq in enumerate(sqs):
        if len(sq) <= threshold:
            new_sqs.append(sq)
            idx.append(i)
    return new_sqs, idx
```
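To see how the returned indices keep sequences and labels aligned, here is a minimal sketch on made-up toy data. The sequences and labels in `toy_sqs` and `toy_tgs` are hypothetical and not taken from the DeepSol dataset.

``` python
def drop_long_sqs(sqs, threshold=1200):
    """Keep sequences with len <= threshold and return their original indices."""
    new_sqs, idx = [], []
    for i, sq in enumerate(sqs):
        if len(sq) <= threshold:
            new_sqs.append(sq)
            idx.append(i)
    return new_sqs, idx

toy_sqs = ["ACD", "ACDEFGHIK", "MK", "WYWYWY"]  # hypothetical sequences
toy_tgs = [1, 0, 1, 0]                          # hypothetical labels

kept_sqs, kept_idx = drop_long_sqs(toy_sqs, threshold=5)
kept_tgs = [toy_tgs[i] for i in kept_idx]       # labels of the retained sequences
print(kept_sqs, kept_idx, kept_tgs)
```

Only the first and third toy sequences are short enough, so indexing the labels with `kept_idx` selects exactly their labels.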
Drop all sequences longer than the chosen threshold (200 residues in
this example).
``` python
trnsqs, trnidx = drop_long_sqs(train_sqs, threshold=200)
vldsqs, vldidx = drop_long_sqs(valid_sqs, threshold=200)
```
``` python
len(trnidx), len(vldidx)
```
(18066, 1971)
``` python
max(map(len, trnsqs))
```
200
Create a function for zero padding all sequences.
``` python
def zero_pad(sq, length=1200):
    new_sq = sq.copy()
    if len(new_sq) < length:
        new_sq.extend([0] * (length-len(new_sq)))
    return new_sq
```
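The sketch below checks the padding on a tiny input. It rebuilds the 21-entry vocabulary shown earlier from a literal string (an assumption made only to keep the example self-contained) and shows that decoding a padded sequence returns the original string, because index `0` maps to the empty string.

``` python
aas = [""] + list("ACDEFGHIKLMNPQRSTVWY")  # same 21-entry vocabulary as above
str2int = {aa: i for i, aa in enumerate(aas)}
int2str = {i: aa for i, aa in enumerate(aas)}

def encode(s): return [str2int[aa] for aa in s]
def decode(l): return "".join(int2str[i] for i in l)

def zero_pad(sq, length=1200):
    new_sq = sq.copy()
    if len(new_sq) < length:
        new_sq.extend([0] * (length - len(new_sq)))
    return new_sq

padded = zero_pad(encode("ACD"), length=6)
print(padded)          # [1, 2, 3, 0, 0, 0]
print(decode(padded))  # ACD

# Sequences longer than `length` are returned unchanged, not truncated
too_long = zero_pad(encode("ACDEFGH"), length=6)
print(len(too_long))   # 7
```

Since `zero_pad` never truncates, it relies on the long sequences having been dropped beforehand.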
Now encode and zero pad all sequences and make sure that it worked out
correctly.
``` python
trn = list(map(encode, trnsqs))
vld = list(map(encode, vldsqs))
print(f"Length of the first two sequences before zero padding: {len(trn[0])}, {len(trn[1])}")
trn = list(map(partial(zero_pad, length=200), trn))
vld = list(map(partial(zero_pad, length=200), vld))
print(f"Length of the first two sequences after zero padding: {len(trn[0])}, {len(trn[1])}");
```
Length of the first two sequences before zero padding: 116, 135
Length of the first two sequences after zero padding: 200, 200
Convert the data to `torch.tensor`s using `dtype=torch.int64` and check
for correctness.
``` python
trntns = torch.tensor(trn, dtype=torch.int64)
vldtns = torch.tensor(vld, dtype=torch.int64)
trntns.shape, trntns[0]
```
(torch.Size([18066, 200]),
tensor([11, 9, 1, 10, 2, 10, 10, 10, 10, 13, 18, 10, 6, 10, 10, 18, 16, 16, 9, 17, 10, 2, 16, 11, 4, 4, 1, 8, 12, 4, 15, 8, 14,
4, 18, 1, 6, 16, 10, 8, 5, 15, 1, 8, 16, 16, 8, 6, 10, 4, 2, 14, 16, 18, 17, 16, 15, 6, 3, 10, 1, 17, 2, 13, 15, 6,
5, 1, 18, 17, 6, 2, 17, 2, 6, 16, 1, 2, 6, 16, 19, 3, 18, 15, 1, 4, 17, 17, 2, 7, 2, 14, 2, 1, 6, 11, 3, 19, 17,
6, 1, 15, 2, 2, 15, 18, 14, 13, 10, 4, 7, 7, 7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0]))
``` python
trntns.shape, vldtns.shape
```
(torch.Size([18066, 200]), torch.Size([1971, 200]))
Obtain the correct labels using the lists of indices obtained from the
`drop_long_sqs` function and convert the lists of labels to tensors in
`torch.float32` format.
``` python
trnlbs = torch.tensor(train_tgs, dtype=torch.float32)[trnidx]
vldlbs = torch.tensor(valid_tgs, dtype=torch.float32)[vldidx]
trnlbs.shape, vldlbs.shape
```
(torch.Size([18066]), torch.Size([1971]))
Calculate the ratios of soluble peptides in the train and valid data.
``` python
trnlbs.sum().item()/trnlbs.shape[0], vldlbs.sum().item()/vldlbs.shape[0]
```
(0.4722129967895494, 0.4657534246575342)
These ratios show that slightly fewer than half of the proteins are
soluble in the training and validation data; in the test set (not loaded
in this example), slightly more than half are soluble.
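Because the classes are close to balanced, a classifier that always predicts the majority class already scores a little above 50% accuracy. The following sketch computes that baseline from the two ratios printed above; `majority_baseline` is a hypothetical helper introduced here for illustration.

``` python
train_pos = 0.4722129967895494  # fraction of soluble peptides in the training data
valid_pos = 0.4657534246575342  # fraction of soluble peptides in the validation data

def majority_baseline(pos_ratio):
    """Accuracy of always predicting the more frequent class."""
    return max(pos_ratio, 1 - pos_ratio)

print(round(majority_baseline(train_pos), 3))  # 0.528
print(round(majority_baseline(valid_pos), 3))  # 0.534
```

Accuracies close to these values, such as those in the first epochs of the training log further down, therefore indicate that the model has not yet learned much beyond the class prior.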
### Dataset and DataLoaders
Turn train and valid data into datasets using the
[`Dataset`](https://frenio.github.io/atai/core.html#dataset) class.
``` python
trnds = Dataset(trntns, trnlbs)
vldds = Dataset(vldtns, vldlbs)
trnds[0]
```
(tensor([11, 9, 1, 10, 2, 10, 10, 10, 10, 13, 18, 10, 6, 10, 10, 18, 16, 16, 9, 17, 10, 2, 16, 11, 4, 4, 1, 8, 12, 4, 15, 8, 14,
4, 18, 1, 6, 16, 10, 8, 5, 15, 1, 8, 16, 16, 8, 6, 10, 4, 2, 14, 16, 18, 17, 16, 15, 6, 3, 10, 1, 17, 2, 13, 15, 6,
5, 1, 18, 17, 6, 2, 17, 2, 6, 16, 1, 2, 6, 16, 19, 3, 18, 15, 1, 4, 17, 17, 2, 7, 2, 14, 2, 1, 6, 11, 3, 19, 17,
6, 1, 15, 2, 2, 15, 18, 14, 13, 10, 4, 7, 7, 7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0]),
tensor(0.))
Use the [`get_dls`](https://frenio.github.io/atai/core.html#get_dls)
function to obtain the dataloaders from the train and valid datasets.
``` python
dls = get_dls(trnds, vldds, bs=32)
xb, yb = next(iter(dls.train))  # take features and labels from the same batch
xb[:2], yb[:2]
```
(tensor([[10, 15, 16, 18, 5, 18, 16, 6, 5, 13, 15, 6, 18, 3, 16, 1, 14, 10, 16, 4, 20, 5, 10, 1, 5, 6, 13, 18, 1, 16, 18, 18, 11,
3, 9, 3, 9, 6, 18, 5, 1, 8, 18, 4, 11, 6, 3, 18, 6, 1, 15, 4, 1, 18, 10, 16, 14, 16, 14, 7, 16, 10, 6, 6, 7, 15,
10, 15, 18, 15, 13, 15, 4, 14, 9, 4, 5, 14, 16, 13, 1, 16, 9, 16, 13, 9, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[11, 12, 13, 16, 1, 13, 16, 20, 13, 11, 1, 16, 10, 20, 18, 6, 3, 10, 7, 13, 3, 18, 17, 4, 1, 11, 10, 20, 4, 9, 5, 16, 13,
1, 6, 13, 8, 10, 16, 8, 15, 18, 2, 15, 3, 11, 8, 17, 15, 15, 16, 10, 6, 20, 1, 20, 18, 12, 5, 14, 14, 13, 1, 3, 1, 4,
15, 1, 10, 3, 17, 11, 12, 5, 3, 18, 8, 9, 6, 9, 13, 18, 15, 8, 11, 19, 16, 14, 15, 3, 13, 16, 10, 15, 9, 16, 6, 18, 6,
12, 8, 5, 8, 9, 12, 10, 3, 9, 16, 8, 3, 12, 9, 1, 10, 20, 3, 17, 5, 16, 1, 5, 6, 12, 8, 10, 16, 2, 9, 18, 18, 2,
3, 4, 12, 6, 16, 9, 6, 20, 6, 5, 18, 7, 5, 4, 17, 14, 4, 1, 1, 4, 15, 1, 8, 4, 9, 11, 12, 6, 11, 10, 10, 12, 3,
15, 9, 18, 5, 18, 6, 15, 5, 9, 16, 15, 9, 4, 15, 4, 1, 4, 10, 6, 1, 15, 1, 9, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0]]),
tensor([1., 1.]))
### Design Your Model
Let’s create a tiny model (~10k parameters) that uses a sequence of
1-dimensional convolutional layers with skip connections, Kaiming He
initialization, leaky ReLUs, batch normalization, and dropout.
First, obtain a single batch from `dls` to help design the model.
``` python
idx = next(iter(dls.train))[0] ## a single batch
idx, idx.shape
```
(tensor([[11, 9, 9, ..., 0, 0, 0],
[11, 3, 20, ..., 0, 0, 0],
[ 1, 5, 4, ..., 0, 0, 0],
...,
[ 1, 18, 18, ..., 0, 0, 0],
[10, 6, 5, ..., 0, 0, 0],
[ 2, 8, 1, ..., 0, 0, 0]]),
torch.Size([32, 200]))
#### Custom Modules
``` python
def conv1d(ni, nf, ks=3, stride=2, act=nn.ReLU, norm=None, bias=None):
    # `norm` is passed as a class (e.g. nn.BatchNorm1d), not an instance, so this
    # isinstance check is False and the convolution keeps its bias term
    if bias is None: bias = not isinstance(norm, (nn.BatchNorm1d,nn.BatchNorm2d,nn.BatchNorm3d))
    layers = [nn.Conv1d(ni, nf, stride=stride, kernel_size=ks, padding=ks//2, bias=bias)]
    if norm: layers.append(norm(nf))
    if act: layers.append(act())
    return nn.Sequential(*layers)

def _conv1d_block(ni, nf, stride, act=nn.ReLU, norm=None, ks=3):
    # Two convolutions: the first keeps the length, the second applies the stride
    return nn.Sequential(conv1d(ni, nf, stride=1, act=act, norm=norm, ks=ks),
                         conv1d(nf, nf, stride=stride, act=None, norm=norm, ks=ks))

class ResBlock1d(nn.Module):
    def __init__(self, ni, nf, stride=1, ks=3, act=nn.ReLU, norm=None):
        super().__init__()
        self.convs = _conv1d_block(ni, nf, stride=stride, ks=ks, act=act, norm=norm)
        # Skip path: 1x1 conv to match channels, average pooling to match length
        self.idconv = fc.noop if ni==nf else conv1d(ni, nf, stride=1, ks=1, act=None)
        self.pool = fc.noop if stride==1 else nn.AvgPool1d(stride, ceil_mode=True)
        self.act = act()

    def forward(self, x): return self.act(self.convs(x) + self.pool(self.idconv(x)))
```
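The following module changes the shape of the embedding output from
`(B, L, C)` to `(B, C, L)`, the channel-first layout that `nn.Conv1d`
expects. Note that `view` reinterprets the underlying memory rather than
swapping the two axes; a true axis swap would be `x.transpose(1, 2)`.
The results reported below were obtained with the `view` version.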
``` python
class Reshape(nn.Module):
    def forward(self, x):
        B, L, C = x.shape
        return x.view(B, C, L)
```
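For readers who want to compare `view` with an actual axis swap, the small sketch below applies both to a toy tensor. The tensor `x` is made up for illustration; only standard PyTorch calls are used.

``` python
import torch

x = torch.arange(6).reshape(1, 3, 2)  # toy tensor with B=1, L=3, C=2
viewed = x.view(1, 2, 3)              # what Reshape.forward does
swapped = x.transpose(1, 2)           # swaps the L and C axes

print(viewed[0].tolist())   # [[0, 1, 2], [3, 4, 5]]
print(swapped[0].tolist())  # [[0, 2, 4], [1, 3, 5]]
```

Both results have shape `(1, 2, 3)`, but only `transpose` keeps each channel's values together along the length axis. Whether swapping in `transpose` would change the reported metrics is not something this README tests.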
#### Model Architecture
``` python
lr = 1e-2
epochs = 30
n_embd = 16
dls = get_dls(trnds, vldds, bs=32)
act_genrelu = partial(GeneralRelu, leak=0.1, sub=0.4)
model = nn.Sequential(nn.Embedding(vocab_size, n_embd, padding_idx=0), Reshape(),
ResBlock1d(n_embd, 2, ks=15, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(2, 4, ks=13, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(4, 4, ks=11, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(4, 4, ks=9, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(4, 8, ks=7, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(8, 8, ks=5, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(8, 16, ks=3, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(16, 32, ks=3, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
nn.Flatten(1, -1),
nn.Linear(32, 1),
nn.Flatten(0, -1),
nn.Sigmoid())
model(idx).shape
```
torch.Size([32])
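The final `nn.Linear(32, 1)` works because the eight stride-2 blocks shrink the sequence length from 200 to 1, leaving 32 channels of length 1 to flatten. The sketch below retraces that arithmetic with the standard output-length formulas for `nn.Conv1d` and `nn.AvgPool1d`; the helper names are hypothetical.

``` python
import math

def conv_out_len(L, ks, stride):
    # Conv1d output length with padding = ks // 2 and dilation 1
    pad = ks // 2
    return (L + 2 * pad - ks) // stride + 1

def pool_out_len(L, stride):
    # AvgPool1d(stride, ceil_mode=True): kernel_size == stride, no padding
    return math.ceil(L / stride)

lengths = [200]
for ks in [15, 13, 11, 9, 7, 5, 3, 3]:  # kernel sizes of the eight ResBlock1d layers
    L = lengths[-1]
    main = conv_out_len(conv_out_len(L, ks, 1), ks, 2)  # stride-1 conv, then stride-2 conv
    skip = pool_out_len(L, 2)                           # pooled skip connection
    assert main == skip  # both paths must agree so they can be added
    lengths.append(main)

print(lengths)  # [200, 100, 50, 25, 13, 7, 4, 2, 1]
```

With odd kernel sizes and `padding=ks//2`, each block maps a length `L` to `ceil(L / 2)` on both paths, which is why `ceil_mode=True` is needed on the pooling layer.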
``` python
iw = partial(init_weights, leaky=0.1)
model = model.apply(iw)
metrics = MetricsCB(BinaryAccuracy(), BinaryMatthewsCorrCoef(), BinaryAUROC())
astats = ActivationStats(fc.risinstance(GeneralRelu))
cbs = [DeviceCB(), ProgressCB(plot=False), metrics, astats]
learn = TrainLearner(model, dls, F.binary_cross_entropy, lr=lr, cbs=cbs, opt_func=torch.optim.AdamW)
print(f"Parameters total: {sum(p.nelement() for p in model.parameters())}")
learn.lr_find(start_lr=1e-4, gamma=1.05, av_over=3, max_mult=5)
```
Parameters total: 10175
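The reported total can be retraced by hand. The sketch below is a plain-Python tally under two assumptions: every convolution carries a bias term, and the activation, dropout, flatten, and reshape layers hold no learnable parameters. The helper names are hypothetical.

``` python
def conv_params(ni, nf, ks):
    return ni * nf * ks + nf  # weights plus one bias per output channel

def bn_params(nf):
    return 2 * nf             # learnable scale and shift per channel

def resblock_params(ni, nf, ks):
    main = (conv_params(ni, nf, ks) + bn_params(nf)
            + conv_params(nf, nf, ks) + bn_params(nf))
    # 1x1 conv on the skip path only when the channel count changes
    shortcut = 0 if ni == nf else conv_params(ni, nf, 1)
    return main + shortcut

vocab_size, n_embd = 21, 16
blocks = [(16, 2, 15), (2, 4, 13), (4, 4, 11), (4, 4, 9),
          (4, 8, 7), (8, 8, 5), (8, 16, 3), (16, 32, 3)]  # (ni, nf, ks) per block

total = (vocab_size * n_embd                       # embedding table
         + sum(resblock_params(*b) for b in blocks)
         + 32 * 1 + 1)                             # final linear layer
print(total)  # 10175
```

More than half of the parameters sit in the last block, which maps 16 to 32 channels.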
![](index_files/figure-commonmark/cell-25-output-4.png)
The training data is fairly noisy, so the learning rate finder does not
produce a clean curve. A somewhat informative result can still be
obtained with the `av_over` keyword argument, which tells
[`lr_find`](https://frenio.github.io/atai/core.html#lr_find) to average
over the specified number of batches for each learning rate tested. It
also helps to lower `gamma` from its default value of `1.3`.
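To get a feel for these two settings, the sketch below assumes a generic exponential sweep in which the learning rate is multiplied by `gamma` after each tested value. It is not the implementation of `lr_find`; the helper names and the end value of `1.0` are hypothetical.

``` python
import math

def sweep_steps(start_lr, end_lr, gamma):
    """Multiplications by gamma needed to get from start_lr to at least end_lr."""
    return math.ceil(math.log(end_lr / start_lr) / math.log(gamma))

def averaged_losses(losses, av_over):
    """Average consecutive groups of av_over per-batch losses."""
    return [sum(losses[i:i + av_over]) / av_over
            for i in range(0, len(losses) - av_over + 1, av_over)]

print(sweep_steps(1e-4, 1.0, gamma=1.05))  # 189
print(sweep_steps(1e-4, 1.0, gamma=1.3))   # 36

# Made-up per-batch losses: two learning rates, three batches each
print(averaged_losses([1.0, 0.5, 0.75, 0.5, 0.25, 0.75], av_over=3))  # [0.75, 0.5]
```

Under this assumption, a smaller `gamma` samples the learning rate range much more finely, and averaging several batches per learning rate smooths out batch-to-batch noise at the cost of processing more batches.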
### Training
``` python
learn.fit(epochs)
```
<div>
| BinaryAccuracy | BinaryMatthewsCorrCoef | BinaryAUROC | loss | epoch | train |
|----------------|------------------------|-------------|-------|-------|-------|
| 0.507 | 0.001 | 0.504 | 0.718 | 0 | train |
| 0.534 | -0.021 | 0.514 | 0.691 | 0 | eval |
| 0.515 | 0.008 | 0.506 | 0.697 | 1 | train |
| 0.539 | 0.040 | 0.530 | 0.691 | 1 | eval |
| 0.519 | 0.015 | 0.509 | 0.695 | 2 | train |
| 0.534 | 0.003 | 0.498 | 0.691 | 2 | eval |
| 0.519 | 0.017 | 0.517 | 0.694 | 3 | train |
| 0.525 | 0.036 | 0.527 | 0.692 | 3 | eval |
| 0.582 | 0.155 | 0.592 | 0.675 | 4 | train |
| 0.638 | 0.283 | 0.649 | 0.633 | 4 | eval |
| 0.618 | 0.242 | 0.629 | 0.652 | 5 | train |
| 0.651 | 0.321 | 0.666 | 0.619 | 5 | eval |
| 0.627 | 0.266 | 0.630 | 0.647 | 6 | train |
| 0.656 | 0.329 | 0.664 | 0.618 | 6 | eval |
| 0.635 | 0.286 | 0.637 | 0.641 | 7 | train |
| 0.647 | 0.332 | 0.679 | 0.623 | 7 | eval |
| 0.636 | 0.284 | 0.648 | 0.637 | 8 | train |
| 0.645 | 0.309 | 0.680 | 0.614 | 8 | eval |
| 0.635 | 0.281 | 0.647 | 0.636 | 9 | train |
| 0.649 | 0.346 | 0.682 | 0.627 | 9 | eval |
| 0.636 | 0.286 | 0.656 | 0.632 | 10 | train |
| 0.654 | 0.340 | 0.684 | 0.613 | 10 | eval |
| 0.646 | 0.304 | 0.666 | 0.627 | 11 | train |
| 0.653 | 0.314 | 0.686 | 0.614 | 11 | eval |
| 0.645 | 0.299 | 0.665 | 0.627 | 12 | train |
| 0.666 | 0.327 | 0.702 | 0.605 | 12 | eval |
| 0.648 | 0.301 | 0.676 | 0.622 | 13 | train |
| 0.665 | 0.340 | 0.704 | 0.600 | 13 | eval |
| 0.657 | 0.319 | 0.684 | 0.619 | 14 | train |
| 0.670 | 0.333 | 0.717 | 0.609 | 14 | eval |
| 0.656 | 0.318 | 0.688 | 0.616 | 15 | train |
| 0.655 | 0.313 | 0.703 | 0.619 | 15 | eval |
| 0.652 | 0.308 | 0.682 | 0.619 | 16 | train |
| 0.674 | 0.352 | 0.719 | 0.597 | 16 | eval |
| 0.659 | 0.324 | 0.692 | 0.614 | 17 | train |
| 0.662 | 0.348 | 0.717 | 0.606 | 17 | eval |
| 0.658 | 0.320 | 0.693 | 0.613 | 18 | train |
| 0.668 | 0.336 | 0.718 | 0.604 | 18 | eval |
| 0.657 | 0.316 | 0.698 | 0.612 | 19 | train |
| 0.662 | 0.339 | 0.715 | 0.602 | 19 | eval |
| 0.660 | 0.325 | 0.697 | 0.612 | 20 | train |
| 0.662 | 0.330 | 0.716 | 0.607 | 20 | eval |
| 0.665 | 0.334 | 0.700 | 0.611 | 21 | train |
| 0.662 | 0.317 | 0.703 | 0.609 | 21 | eval |
| 0.665 | 0.337 | 0.698 | 0.609 | 22 | train |
| 0.666 | 0.326 | 0.712 | 0.607 | 22 | eval |
| 0.666 | 0.336 | 0.703 | 0.608 | 23 | train |
| 0.677 | 0.360 | 0.723 | 0.591 | 23 | eval |
| 0.663 | 0.330 | 0.703 | 0.608 | 24 | train |
| 0.676 | 0.358 | 0.722 | 0.595 | 24 | eval |
| 0.666 | 0.335 | 0.708 | 0.607 | 25 | train |
| 0.675 | 0.361 | 0.723 | 0.594 | 25 | eval |
| 0.661 | 0.323 | 0.707 | 0.608 | 26 | train |
| 0.664 | 0.321 | 0.708 | 0.615 | 26 | eval |
| 0.664 | 0.328 | 0.710 | 0.605 | 27 | train |
| 0.663 | 0.322 | 0.723 | 0.602 | 27 | eval |
| 0.669 | 0.337 | 0.711 | 0.606 | 28 | train |
| 0.647 | 0.290 | 0.701 | 0.619 | 28 | eval |
| 0.669 | 0.339 | 0.715 | 0.602 | 29 | train |
| 0.665 | 0.336 | 0.719 | 0.605 | 29 | eval |
</div>
### Inspect Activations
``` python
dls = get_dls(trnds, vldds, bs=256)
model = nn.Sequential(nn.Embedding(vocab_size, n_embd, padding_idx=0), Reshape(),
ResBlock1d(n_embd, 2, ks=15, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(2, 4, ks=13, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(4, 4, ks=11, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(4, 4, ks=9, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(4, 8, ks=7, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(8, 8, ks=5, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(8, 16, ks=3, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
ResBlock1d(16, 32, ks=3, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),
nn.Flatten(1, -1),
nn.Linear(32, 1),
nn.Flatten(0, -1),
nn.Sigmoid())
model = model.apply(iw)
metrics = MetricsCB(BinaryAccuracy(), BinaryMatthewsCorrCoef(), BinaryAUROC())
astats = ActivationStats(fc.risinstance(GeneralRelu))
cbs = [DeviceCB(), ProgressCB(plot=False), metrics, astats]
learn = TrainLearner(model, dls, F.binary_cross_entropy, lr=lr, cbs=cbs, opt_func=torch.optim.AdamW)
print(f"Parameters total: {sum(p.nelement() for p in model.parameters())}")
learn.fit(1)
```
Parameters total: 10175
<div>
| BinaryAccuracy | BinaryMatthewsCorrCoef | BinaryAUROC | loss | epoch | train |
|----------------|------------------------|-------------|-------|-------|-------|
| 0.507 | -0.002 | 0.498 | 0.713 | 0 | train |
| 0.524 | 0.010 | 0.524 | 0.691 | 0 | eval |
</div>
``` python
astats.color_dim()
```
![](index_files/figure-commonmark/cell-28-output-1.png)
``` python
astats.plot_stats()
```
![](index_files/figure-commonmark/cell-29-output-1.png)
``` python
astats.dead_chart()
```
![](index_files/figure-commonmark/cell-30-output-1.png)
Raw data
{
"_id": null,
"home_page": "https://github.com/frenio/atai",
"name": "atai",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "nbdev jupyter notebook python",
"author": "Frenio Redeker",
"author_email": "f.redeker@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/58/61/506e01a5bbe9a02d8bd74ea4eaff77ef214be925ebd001b006eff2609b71/atai-0.0.3.tar.gz",
"platform": null,
"description": "# atai\n\n\n<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->\n\nAtomic AI is a flexible, minimalist deep neural network training\nframework based on Jeremy Howard\u2019s\n[miniai](https://github.com/fastai/course22p2/tree/master) from the\n[fast.ai 2022](https://course.fast.ai/Lessons/part2.html) course.\n\n## Install\n\n``` sh\npip install atai\n```\n\n## How to use\n\n``` python\nfrom atai.core import *\n```\n\nThe following example demonstrates how the Atomic AI training framework\ncan be used to train a custom model that predicts protein solubility.\n\n### Imports\n\n``` python\nimport torch\nimport torch.nn.functional as F\nimport torch.nn as nn\nfrom torch.nn import init\nfrom torch import optim\n\nfrom torcheval.metrics import BinaryAccuracy, BinaryAUROC\nfrom torcheval.metrics.functional import binary_auroc, binary_accuracy\nfrom torchmetrics.classification import BinaryMatthewsCorrCoef\nfrom torchmetrics.functional.classification import binary_matthews_corrcoef\n\nimport fastcore.all as fc\nfrom functools import partial\n```\n\n### Load Protein Solubility\n\nThis example uses the dataset from the\n[DeepSol](https://doi.org/10.1093/bioinformatics/bty166) paper by\nKhurana *et al.* which was obtained at\n<https://zenodo.org/records/1162886>. 
It consists of amino acid\nsequences of peptides along with solubility labels that are `1` if the\npeptide is soluble and `0` if the peptide is insoluble.\n\n``` python\ntrain_sqs = open('sol_data/train_src', 'r').read().splitlines()\ntrain_tgs = list(map(int, open('sol_data/train_tgt', 'r').read().splitlines()))\n\nvalid_sqs = open('sol_data/val_src', 'r').read().splitlines()\nvalid_tgs = list(map(int, open('sol_data/val_tgt', 'r').read().splitlines()))\n\ntrain_sqs[:2], train_tgs[:2]\n```\n\n (['GMILKTNLFGHTYQFKSITDVLAKANEEKSGDRLAGVAAESAEERVAAKVVLSKMTLGDLRNNPVVPYETDEVTRIIQDQVNDRIHDSIKNWTVEELREWILDHKTTDADIKRVARGLTSEIIAAVTKLMSNLDLIYGAKKIRVIAHANTTIGLPGTFSARLQPNHPTDDPDGILASLMEGLTYGIGDAVIGLNPVDDSTDSVVRLLNKFEEFRSKWDVPTQTCVLAHVKTQMEAMRRGAPTGLVFQSIAGSEKGNTAFGFDGATIEEARQLALQSGAATGPNVMYFETGQGSELSSDAHFGVDQVTMEARCYGFAKKFDPFLVNTVVGFIGPEYLYDSKQVIRAGLEDHFMGKLTGISMGCDVCYTNHMKADQNDVENLSVLLTAAGCNFIMGIPHGDDVMLNYQTTGYHETATLRELFGLKPIKEFDQWMEKMGFSENGKLTSRAGDASIFLK',\n 'MAHHHHHHMSFFRMKRRLNFVVKRGIEELWENSFLDNNVDMKKIEYSKTGDAWPCVLLRKKSFEDLHKLYYICLKEKNKLLGEQYFHLQNSTKMLQHGRLKKVKLTMKRILTVLSRRAIHDQCLRAKDMLKKQEEREFYEIQKFKLNEQLLCLKHKMNILKKYNSFSLEQISLTFSIKKIENKIQQIDIILNPLRKETMYLLIPHFKYQRKYSDLPGFISWKKQNIIALRNNMSKLHRLY'],\n [1, 0])\n\n``` python\nlen(train_sqs), len(train_tgs), len(valid_sqs), len(valid_tgs)\n```\n\n (62478, 62478, 6942, 6942)\n\n### Data Preparation\n\nCreate a sorted list of amino acid sequences `aas` including an empty\nstring for padding and determine the size of the vocabulary.\n\n``` python\naas = sorted(list(set(\"\".join(train_sqs))) + [\"\"])\nvocab_size = len(aas)\naas, vocab_size\n```\n\n (['',\n 'A',\n 'C',\n 'D',\n 'E',\n 'F',\n 'G',\n 'H',\n 'I',\n 'K',\n 'L',\n 'M',\n 'N',\n 'P',\n 'Q',\n 'R',\n 'S',\n 'T',\n 'V',\n 'W',\n 'Y'],\n 21)\n\nCreate dictionaries that translate between string and integer\nrepresentations of amino acids and define the corresponding `encode` and\n`decode` functions.\n\n``` python\nstr2int = {aa:i for i, aa in enumerate(aas)}\nint2str = {i:aa 
for i, aa in enumerate(aas)}\nencode = lambda s: [str2int[aa] for aa in s]\ndecode = lambda l: ''.join([int2str[i] for i in l])\n\nprint(encode(\"AYWCCCGGGHH\"))\nprint(decode(encode(\"AYWCCCGGGHH\")))\n```\n\n [1, 20, 19, 2, 2, 2, 6, 6, 6, 7, 7]\n AYWCCCGGGHH\n\nFigure out what the range of lengths of amino acid sequences in the\ndataset is.\n\n``` python\ntrain_lens = list(map(len, train_sqs))\nmin(train_lens), max(train_lens)\n```\n\n (19, 1691)\n\nCreate a function that drops all sequences above a chosen threshold and\nalso returns a list of indices of the sequences that meet the threshold\nthat can be used to obtain the correct labels.\n\n``` python\ndef drop_long_sqs(sqs, threshold=1200):\n new_sqs = []\n idx = []\n for i, sq in enumerate(sqs):\n if len(sq) <= threshold:\n new_sqs.append(sq)\n idx.append(i)\n return new_sqs, idx\n```\n\nDrop all sequences above your chosen threshold.\n\n``` python\ntrnsqs, trnidx = drop_long_sqs(train_sqs, threshold=200)\nvldsqs, vldidx = drop_long_sqs(valid_sqs, threshold=200)\n```\n\n``` python\nlen(trnidx), len(vldidx)\n```\n\n (18066, 1971)\n\n``` python\nmax(map(len, trnsqs))\n```\n\n 200\n\nCreate a function for zero padding all sequences.\n\n``` python\ndef zero_pad(sq, length=1200):\n new_sq = sq.copy()\n if len(new_sq) < length:\n new_sq.extend([0] * (length-len(new_sq)))\n return new_sq\n```\n\nNow encode and zero pad all sequences and make sure that it worked out\ncorrectly.\n\n``` python\ntrn = list(map(encode, trnsqs))\nvld = list(map(encode, vldsqs))\nprint(f\"Length of the first two sequences before zero padding: {len(trn[0])}, {len(trn[1])}\")\ntrn = list(map(partial(zero_pad, length=200), trn))\nvld = list(map(partial(zero_pad, length=200), vld))\nprint(f\"Length of the first two sequences after zero padding: {len(trn[0])}, {len(trn[1])}\");\n```\n\n Length of the first two sequences before zero padding: 116, 135\n Length of the first two sequences after zero padding: 200, 200\n\nConvert the data to 
`torch.tensor`s unsing `dtype=torch.int64` and check\nfor correctness.\n\n``` python\ntrntns = torch.tensor(trn, dtype=torch.int64)\nvldtns = torch.tensor(vld, dtype=torch.int64)\ntrntns.shape, trntns[0]\n```\n\n (torch.Size([18066, 200]),\n tensor([11, 9, 1, 10, 2, 10, 10, 10, 10, 13, 18, 10, 6, 10, 10, 18, 16, 16, 9, 17, 10, 2, 16, 11, 4, 4, 1, 8, 12, 4, 15, 8, 14,\n 4, 18, 1, 6, 16, 10, 8, 5, 15, 1, 8, 16, 16, 8, 6, 10, 4, 2, 14, 16, 18, 17, 16, 15, 6, 3, 10, 1, 17, 2, 13, 15, 6,\n 5, 1, 18, 17, 6, 2, 17, 2, 6, 16, 1, 2, 6, 16, 19, 3, 18, 15, 1, 4, 17, 17, 2, 7, 2, 14, 2, 1, 6, 11, 3, 19, 17,\n 6, 1, 15, 2, 2, 15, 18, 14, 13, 10, 4, 7, 7, 7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0]))\n\n``` python\ntrntns.shape, vldtns.shape\n```\n\n (torch.Size([18066, 200]), torch.Size([1971, 200]))\n\nObtain the correct labels using the lists of indices obtained from the\n`drop_long_sqs` function and convert the lists of labels to tensors in\n`torch.float32` format.\n\n``` python\ntrnlbs = torch.tensor(train_tgs, dtype=torch.float32)[trnidx]\nvldlbs = torch.tensor(valid_tgs, dtype=torch.float32)[vldidx]\ntrnlbs.shape, vldlbs.shape\n```\n\n (torch.Size([18066]), torch.Size([1971]))\n\nCalculate the ratios of soluble peptides in the train and valid data.\n\n``` python\ntrnlbs.sum().item()/trnlbs.shape[0], vldlbs.sum().item()/vldlbs.shape[0]\n```\n\n (0.4722129967895494, 0.4657534246575342)\n\nThese ratios tell us that there are slightly less than half soluble\nproteins in the training an validation data, and slightly more than half\nin the test set.\n\n### Dataset and DataLoaders\n\nTurn train and valid data into datasets using the\n[`Dataset`](https://frenio.github.io/atai/core.html#dataset) class.\n\n``` python\ntrnds = Dataset(trntns, trnlbs)\nvldds = 
Dataset(vldtns, vldlbs)\ntrnds[0]\n```\n\n (tensor([11, 9, 1, 10, 2, 10, 10, 10, 10, 13, 18, 10, 6, 10, 10, 18, 16, 16, 9, 17, 10, 2, 16, 11, 4, 4, 1, 8, 12, 4, 15, 8, 14,\n 4, 18, 1, 6, 16, 10, 8, 5, 15, 1, 8, 16, 16, 8, 6, 10, 4, 2, 14, 16, 18, 17, 16, 15, 6, 3, 10, 1, 17, 2, 13, 15, 6,\n 5, 1, 18, 17, 6, 2, 17, 2, 6, 16, 1, 2, 6, 16, 19, 3, 18, 15, 1, 4, 17, 17, 2, 7, 2, 14, 2, 1, 6, 11, 3, 19, 17,\n 6, 1, 15, 2, 2, 15, 18, 14, 13, 10, 4, 7, 7, 7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0]),\n tensor(0.))\n\nUse the [`get_dls`](https://frenio.github.io/atai/core.html#get_dls)\nfunction to obtain the dataloaders from the train and valid datasets.\n\n``` python\ndls = get_dls(trnds, vldds, bs=32)\nnext(iter(dls.train))[0][:2], next(iter(dls.train))[1][:2]\n```\n\n (tensor([[10, 15, 16, 18, 5, 18, 16, 6, 5, 13, 15, 6, 18, 3, 16, 1, 14, 10, 16, 4, 20, 5, 10, 1, 5, 6, 13, 18, 1, 16, 18, 18, 11,\n 3, 9, 3, 9, 6, 18, 5, 1, 8, 18, 4, 11, 6, 3, 18, 6, 1, 15, 4, 1, 18, 10, 16, 14, 16, 14, 7, 16, 10, 6, 6, 7, 15,\n 10, 15, 18, 15, 13, 15, 4, 14, 9, 4, 5, 14, 16, 13, 1, 16, 9, 16, 13, 9, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0],\n [11, 12, 13, 16, 1, 13, 16, 20, 13, 11, 1, 16, 10, 20, 18, 6, 3, 10, 7, 13, 3, 18, 17, 4, 1, 11, 10, 20, 4, 9, 5, 16, 13,\n 1, 6, 13, 8, 10, 16, 8, 15, 18, 2, 15, 3, 11, 8, 17, 15, 15, 16, 10, 6, 20, 1, 20, 18, 12, 5, 14, 14, 13, 1, 3, 1, 4,\n 15, 1, 10, 3, 17, 11, 12, 5, 3, 18, 8, 9, 6, 9, 13, 18, 15, 8, 11, 19, 16, 14, 15, 
3, 13, 16, 10, 15, 9, 16, 6, 18, 6,\n 12, 8, 5, 8, 9, 12, 10, 3, 9, 16, 8, 3, 12, 9, 1, 10, 20, 3, 17, 5, 16, 1, 5, 6, 12, 8, 10, 16, 2, 9, 18, 18, 2,\n 3, 4, 12, 6, 16, 9, 6, 20, 6, 5, 18, 7, 5, 4, 17, 14, 4, 1, 1, 4, 15, 1, 8, 4, 9, 11, 12, 6, 11, 10, 10, 12, 3,\n 15, 9, 18, 5, 18, 6, 15, 5, 9, 16, 15, 9, 4, 15, 4, 1, 4, 10, 6, 1, 15, 1, 9, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0]]),\n tensor([1., 1.]))\n\n### Design Your Model\n\nLet\u2019s create a tiny model (~10k parameters) that uses a sequence of\n1-dimensional convolutional layers with skip connections, kaiming he\ninitialization, leaky relus, batchnorm, and dropout.\n\nFirst, obtain a single batch from `dls` to help design the model.\n\n``` python\nidx = next(iter(dls.train))[0] ## a single batch\nidx, idx.shape\n```\n\n (tensor([[11, 9, 9, ..., 0, 0, 0],\n [11, 3, 20, ..., 0, 0, 0],\n [ 1, 5, 4, ..., 0, 0, 0],\n ...,\n [ 1, 18, 18, ..., 0, 0, 0],\n [10, 6, 5, ..., 0, 0, 0],\n [ 2, 8, 1, ..., 0, 0, 0]]),\n torch.Size([32, 200]))\n\n#### Custom Modules\n\n``` python\ndef conv1d(ni, nf, ks=3, stride=2, act=nn.ReLU, norm=None, bias=None):\n if bias is None: bias = not isinstance(norm, (nn.BatchNorm1d,nn.BatchNorm2d,nn.BatchNorm3d))\n layers = [nn.Conv1d(ni, nf, stride=stride, kernel_size=ks, padding=ks//2, bias=bias)]\n if norm: layers.append(norm(nf))\n if act: layers.append(act())\n return nn.Sequential(*layers)\n\ndef _conv1d_block(ni, nf, stride, act=nn.ReLU, norm=None, ks=3):\n return nn.Sequential(conv1d(ni, nf, stride=1, act=act, norm=norm, ks=ks),\n conv1d(nf, nf, stride=stride, act=None, norm=norm, ks=ks))\n\nclass ResBlock1d(nn.Module):\n def __init__(self, ni, nf, stride=1, ks=3, act=nn.ReLU, norm=None):\n super().__init__()\n self.convs = _conv1d_block(ni, nf, stride=stride, ks=ks, act=act, norm=norm)\n self.idconv = fc.noop if ni==nf else conv1d(ni, nf, stride=1, ks=1, act=None)\n self.pool = fc.noop if stride==1 else nn.AvgPool1d(stride, ceil_mode=True)\n self.act = act()\n\n def forward(self, x): 
return self.act(self.convs(x) + self.pool(self.idconv(x)))\n```\n\nThe following module swaps the last two dimensions, changing the layout\nfrom BLC (batch, length, channels) to BCL (batch, channels, length),\nwhich is what the 1D convolutions expect.\n\n``` python\nclass Reshape(nn.Module):\n    def forward(self, x):\n        # transpose swaps the L and C axes; `view` would only reinterpret memory\n        return x.transpose(1, 2)\n```\n\n#### Model Architecture\n\n``` python\nlr = 1e-2\nepochs = 30\nn_embd = 16\ndls = get_dls(trnds, vldds, bs=32)\nact_genrelu = partial(GeneralRelu, leak=0.1, sub=0.4)\n\nmodel = nn.Sequential(nn.Embedding(vocab_size, n_embd, padding_idx=0), Reshape(),\n    ResBlock1d(n_embd, 2, ks=15, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    ResBlock1d(2, 4, ks=13, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    ResBlock1d(4, 4, ks=11, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    ResBlock1d(4, 4, ks=9, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    ResBlock1d(4, 8, ks=7, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    ResBlock1d(8, 8, ks=5, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    ResBlock1d(8, 16, ks=3, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    ResBlock1d(16, 32, ks=3, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    nn.Flatten(1, -1),\n    nn.Linear(32, 1),\n    nn.Flatten(0, -1),\n    nn.Sigmoid())\nmodel(idx).shape\n```\n\n    torch.Size([32])\n\n``` python\niw = partial(init_weights, leaky=0.1)\nmodel = model.apply(iw)\nmetrics = MetricsCB(BinaryAccuracy(), BinaryMatthewsCorrCoef(), BinaryAUROC())\nastats = ActivationStats(fc.risinstance(GeneralRelu))\ncbs = [DeviceCB(), ProgressCB(plot=False), metrics, astats]\nlearn = TrainLearner(model, dls, F.binary_cross_entropy, lr=lr, cbs=cbs, opt_func=torch.optim.AdamW)\nprint(f\"Parameters total: {sum(p.nelement() for p in model.parameters())}\")\nlearn.lr_find(start_lr=1e-4, gamma=1.05, av_over=3, max_mult=5)\n```\n\n    Parameters total: 10175\n\n![](index_files/figure-commonmark/cell-25-output-4.png)\n\nThe training set is quite noisy, so the learning rate finder does not\nwork very well. A somewhat more informative result can be obtained with\nthe `av_over` keyword argument, which tells\n[`lr_find`](https://frenio.github.io/atai/core.html#lr_find) to average\nover the specified number of batches for each learning rate tested. It\nalso helps to lower `gamma` from its default value of `1.3`.\n\n### Training\n\n``` python\nlearn.fit(epochs)\n```\n\n<div>\n\n| BinaryAccuracy | BinaryMatthewsCorrCoef | BinaryAUROC | loss | epoch | train |\n|----------------|------------------------|-------------|-------|-------|-------|\n| 0.507 | 0.001 | 0.504 | 0.718 | 0 | train |\n| 0.534 | -0.021 | 0.514 | 0.691 | 0 | eval |\n| 0.515 | 0.008 | 0.506 | 0.697 | 1 | train |\n| 0.539 | 0.040 | 0.530 | 0.691 | 1 | eval |\n| 0.519 | 0.015 | 0.509 | 0.695 | 2 | train |\n| 0.534 | 0.003 | 0.498 | 0.691 | 2 | eval |\n| 0.519 | 0.017 | 0.517 | 0.694 | 3 | train |\n| 0.525 | 0.036 | 0.527 | 0.692 | 3 | eval |\n| 0.582 | 0.155 | 0.592 | 0.675 | 4 | train |\n| 0.638 | 0.283 | 0.649 | 0.633 | 4 | eval |\n| 0.618 | 0.242 | 0.629 | 0.652 | 5 | train |\n| 0.651 | 0.321 | 0.666 | 0.619 | 5 | eval |\n| 0.627 | 0.266 | 0.630 | 0.647 | 6 | train |\n| 0.656 | 0.329 | 0.664 | 0.618 | 6 | eval |\n| 0.635 | 0.286 | 0.637 | 0.641 | 7 | train |\n| 0.647 | 0.332 | 0.679 | 0.623 | 7 | eval |\n| 0.636 | 0.284 | 0.648 | 0.637 | 8 | train |\n| 0.645 | 0.309 | 0.680 | 0.614 | 8 | eval |\n| 0.635 | 0.281 | 0.647 | 0.636 | 9 | train |\n| 0.649 | 0.346 | 0.682 | 0.627 | 9 | eval |\n| 0.636 | 0.286 | 0.656 | 0.632 | 10 | train |\n| 0.654 | 0.340 | 0.684 | 0.613 | 10 | eval |\n| 0.646 | 0.304 | 0.666 | 0.627 | 11 | train |\n| 0.653 | 0.314 | 0.686 | 0.614 | 11 | eval |\n| 0.645 | 0.299 | 0.665 | 0.627 | 12 | train |\n| 0.666 | 0.327 | 0.702 | 0.605 | 12 | eval |\n| 0.648 | 0.301 | 0.676 | 0.622 | 13 | train |\n| 0.665 | 0.340 | 0.704 | 0.600 | 13 | eval |\n| 0.657 | 0.319 | 0.684 | 0.619 | 14 | train |\n| 0.670 | 0.333 | 0.717 | 0.609 | 14 | eval |\n| 0.656 | 0.318 | 0.688 | 0.616 | 15 | train |\n| 0.655 | 0.313 | 0.703 | 0.619 | 15 | eval 
|\n| 0.652 | 0.308 | 0.682 | 0.619 | 16 | train |\n| 0.674 | 0.352 | 0.719 | 0.597 | 16 | eval |\n| 0.659 | 0.324 | 0.692 | 0.614 | 17 | train |\n| 0.662 | 0.348 | 0.717 | 0.606 | 17 | eval |\n| 0.658 | 0.320 | 0.693 | 0.613 | 18 | train |\n| 0.668 | 0.336 | 0.718 | 0.604 | 18 | eval |\n| 0.657 | 0.316 | 0.698 | 0.612 | 19 | train |\n| 0.662 | 0.339 | 0.715 | 0.602 | 19 | eval |\n| 0.660 | 0.325 | 0.697 | 0.612 | 20 | train |\n| 0.662 | 0.330 | 0.716 | 0.607 | 20 | eval |\n| 0.665 | 0.334 | 0.700 | 0.611 | 21 | train |\n| 0.662 | 0.317 | 0.703 | 0.609 | 21 | eval |\n| 0.665 | 0.337 | 0.698 | 0.609 | 22 | train |\n| 0.666 | 0.326 | 0.712 | 0.607 | 22 | eval |\n| 0.666 | 0.336 | 0.703 | 0.608 | 23 | train |\n| 0.677 | 0.360 | 0.723 | 0.591 | 23 | eval |\n| 0.663 | 0.330 | 0.703 | 0.608 | 24 | train |\n| 0.676 | 0.358 | 0.722 | 0.595 | 24 | eval |\n| 0.666 | 0.335 | 0.708 | 0.607 | 25 | train |\n| 0.675 | 0.361 | 0.723 | 0.594 | 25 | eval |\n| 0.661 | 0.323 | 0.707 | 0.608 | 26 | train |\n| 0.664 | 0.321 | 0.708 | 0.615 | 26 | eval |\n| 0.664 | 0.328 | 0.710 | 0.605 | 27 | train |\n| 0.663 | 0.322 | 0.723 | 0.602 | 27 | eval |\n| 0.669 | 0.337 | 0.711 | 0.606 | 28 | train |\n| 0.647 | 0.290 | 0.701 | 0.619 | 28 | eval |\n| 0.669 | 0.339 | 0.715 | 0.602 | 29 | train |\n| 0.665 | 0.336 | 0.719 | 0.605 | 29 | eval |\n\n</div>\n\n### Inspect Activations\n\n``` python\ndls = get_dls(trnds, vldds, bs=256)\n\nmodel = nn.Sequential(nn.Embedding(vocab_size, n_embd, padding_idx=0), Reshape(),\n ResBlock1d(n_embd, 2, ks=15, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n ResBlock1d(2, 4, ks=13, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n ResBlock1d(4, 4, ks=11, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n ResBlock1d(4, 4, ks=9, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n ResBlock1d(4, 8, ks=7, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n ResBlock1d(8, 8, ks=5, 
stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    ResBlock1d(8, 16, ks=3, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    ResBlock1d(16, 32, ks=3, stride=2, norm=nn.BatchNorm1d, act=act_genrelu), nn.Dropout(0.1),\n    nn.Flatten(1, -1),\n    nn.Linear(32, 1),\n    nn.Flatten(0, -1),\n    nn.Sigmoid())\n\nmodel = model.apply(iw)\nmetrics = MetricsCB(BinaryAccuracy(), BinaryMatthewsCorrCoef(), BinaryAUROC())\nastats = ActivationStats(fc.risinstance(GeneralRelu))\ncbs = [DeviceCB(), ProgressCB(plot=False), metrics, astats]\nlearn = TrainLearner(model, dls, F.binary_cross_entropy, lr=lr, cbs=cbs, opt_func=torch.optim.AdamW)\nprint(f\"Parameters total: {sum(p.nelement() for p in model.parameters())}\")\nlearn.fit(1)\n```\n\n    Parameters total: 10175\n\n<div>\n\n| BinaryAccuracy | BinaryMatthewsCorrCoef | BinaryAUROC | loss | epoch | train |\n|----------------|------------------------|-------------|-------|-------|-------|\n| 0.507 | -0.002 | 0.498 | 0.713 | 0 | train |\n| 0.524 | 0.010 | 0.524 | 0.691 | 0 | eval |\n\n</div>\n\n``` python\nastats.color_dim()\n```\n\n![](index_files/figure-commonmark/cell-28-output-1.png)\n\n``` python\nastats.plot_stats()\n```\n\n![](index_files/figure-commonmark/cell-29-output-1.png)\n\n``` python\nastats.dead_chart()\n```\n\n![](index_files/figure-commonmark/cell-30-output-1.png)\n",
"bugtrack_url": null,
"license": "Apache Software License 2.0",
"summary": "Atomic AI \u2013 An attempt at a minimalist, flexible deep learning framework for diverse models.",
"version": "0.0.3",
"project_urls": {
"Homepage": "https://github.com/frenio/atai"
},
"split_keywords": [
"nbdev",
"jupyter",
"notebook",
"python"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "0dbde6ed2d7707ea4c111715d3b8b51ce828b9e5f0d769c612421e99f5333eee",
"md5": "532cc648e52e1707056b2dcd1de8b892",
"sha256": "d753a80544c7abcc84bf16ee2a3d7d5d86b193495c7ff5e32d23f30be16a29ec"
},
"downloads": -1,
"filename": "atai-0.0.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "532cc648e52e1707056b2dcd1de8b892",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 19183,
"upload_time": "2024-04-22T20:22:26",
"upload_time_iso_8601": "2024-04-22T20:22:26.981749Z",
"url": "https://files.pythonhosted.org/packages/0d/bd/e6ed2d7707ea4c111715d3b8b51ce828b9e5f0d769c612421e99f5333eee/atai-0.0.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5861506e01a5bbe9a02d8bd74ea4eaff77ef214be925ebd001b006eff2609b71",
"md5": "904f0bfabf225b5f8efae323f185771a",
"sha256": "6d04d9c409c510b792166b1627430af519981621a36e5489ca4cb8021566fdca"
},
"downloads": -1,
"filename": "atai-0.0.3.tar.gz",
"has_sig": false,
"md5_digest": "904f0bfabf225b5f8efae323f185771a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 26706,
"upload_time": "2024-04-22T20:22:28",
"upload_time_iso_8601": "2024-04-22T20:22:28.108383Z",
"url": "https://files.pythonhosted.org/packages/58/61/506e01a5bbe9a02d8bd74ea4eaff77ef214be925ebd001b006eff2609b71/atai-0.0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-22 20:22:28",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "frenio",
"github_project": "atai",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "atai"
}