# **Caspian - Deep Learning Architectures**
| [**Information**](#information)
| [**Installation**](#installation)
| [**Getting Started**](#getting-started)
| [**Examples**](#examples)
| [**Notes**](#notes)
| [**Future Plans**](#future-plans-and-developments)
|



A flexible deep learning/machine learning research library using [NumPy].
## Information
Caspian is written entirely in base Python and [NumPy], so no other library or
framework is required. It contains many of the basic tools needed to create machine learning
models such as neural networks, regressions, image processors, and more.
Its general structure and functionality are inspired by popular frameworks such as [PyTorch] and [TensorFlow].
On top of providing necessary layers and functions, Caspian also allows for simple creation
of new layers and tools that the user may need. Each part of the Caspian architecture has its
own abstraction that the user can inherit from, including:
- `cspn.Layer`
- `cspn.Activation`
- `cspn.PoolFunc`
- `cspn.Loss`
- `cspn.Optimizer`
- `cspn.Scheduler`
Caspian also provides support for [CUDA] parallel processing, using [CuPy] as an optional secondary
import.
## Installation
Before installing, this library requires:
- Python 3.10+
- NumPy v1.23.5+
- CuPy (CUDA 12.x) v13.0.0+ **(Optional)**
```bash
$ pip install caspian-ml
```
## Getting Started
Caspian architectures are split into 6 different class types:
- `Layers`, the backbone behind any model and the main processors & learners.
- `Activations`, non-linear functions which assist layers in learning and processing data.
- `PoolFuncs`, similar to activations, but to be used with pooling layers and work on strided data rather than standard arrays.
- `Losses`, functions which describe the loss, or error, of a model.
- `Optimizers`, functions which assist in layer weight updating and learning.
- `Schedulers`, functions which define the learning rate at a particular step in a model's learning process.
The structure of a network differs slightly from that of [PyTorch] or [TensorFlow], where each layer,
activation, optimizer, and scheduler is kept separate. In Caspian, a layer can contain an activation or pooling function as well as an optimizer, and each optimizer contains a scheduler, which controls the learning rate of the optimizer and the layer as a whole. Some layers, like `Dropout` and `Upsampling1D`, contain neither an optimizer nor an activation, since they have no learnable parameters and perform no non-linear transformations.
Some types have default classes that allow that operation to be skipped or performed at a base level, like `Linear` for activations, `StandardGD` for optimizers, and `SchedulerLR` for schedulers.
If an optimizer is required by a layer but not provided at initialization, a default `StandardGD` optimizer with a `SchedulerLR` scheduler is assigned automatically, as in the sketch below. Activation and pooling functions are never defaulted, so the user must provide them explicitly.
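A minimal sketch of this behavior, assuming the `Dense` signature used in the examples further down (the layer sizes here are arbitrary):
```python
from caspian.layers import Dense
from caspian.activations import ReLU
from caspian.optimizers import StandardGD

# Explicit optimizer: this layer updates its weights with the StandardGD instance given here.
layer_a = Dense(ReLU(), 64, 32, optimizer=StandardGD(0.001))

# No optimizer given: a default StandardGD optimizer with a SchedulerLR scheduler is assigned.
layer_b = Dense(ReLU(), 64, 32)
```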
#### GPU Computing
Caspian and its tools can also be used with [CUDA] through [CuPy] to increase speeds by a significant amount. Before importing Caspian or any of its tools, place this segment of Python code above the other imports:
```python
import os
os.environ["CSPN_CUDA"] = "cuda"
```
This ensures that all modules and tools from Caspian are synced with [CUDA], and [CUDA]-supported GPU computing should be enabled as long as [CuPy] and the [CUDA] toolkit are both properly installed.
If a custom tool for Caspian is expected to use both CPU and GPU computing, then use this import instead of directly importing [NumPy] or [CuPy]:
```python
from caspian.cudalib import np
```
This imports whichever array library Caspian is currently using, which keeps custom tools compatible with both backends and saves the user from switching between the two libraries manually within their tool.
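As an illustrative sketch, a hypothetical custom activation written against `caspian.cudalib` runs unchanged on either backend:
```python
from caspian.activations import Activation
from caspian.cudalib import np   # resolves to NumPy or CuPy, whichever backend Caspian is using

class LeakyReLU(Activation):
    # Hypothetical custom activation, included only to show the cudalib import in use.
    def __init__(self, alpha: float = 0.01) -> None:
        self.alpha = alpha

    def forward(self, data: np.ndarray) -> np.ndarray:
        return np.where(data > 0, data, self.alpha * data)

    def backward(self, data: np.ndarray) -> np.ndarray:
        return np.where(data > 0, 1.0, self.alpha)
```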
## Examples
The setup and training of a model in Caspian is similar to other deep learning libraries of its kind. Here is a quick example of building and training a neural network:
### Creation of a Model:
```python
from caspian.layers import Layer, Dense
from caspian.activations import Activation, Softmax
from caspian.optimizers import Optimizer
import numpy as np
class NeuralNet(Layer):
    def __init__(self, inputs: int, hiddens: int, outputs: int,
                 activation: Activation, opt: Optimizer):
        in_size = (inputs,)
        out_size = (outputs,)
        super().__init__(in_size, out_size)

        self.x_1 = Dense(activation, inputs, hiddens, optimizer=opt.deepcopy())
        self.x_2 = Dense(activation, hiddens, outputs, optimizer=opt.deepcopy())
        self.softmax = Softmax()

    def forward(self, data: np.ndarray, training: bool = False) -> np.ndarray:
        self.training = training
        step_1 = self.x_1(data, training)
        step_2 = self.x_2(step_1, training)
        return self.softmax(step_2)

    def backward(self, dx: np.ndarray) -> np.ndarray:
        assert self.training is True
        d_sm = self.softmax.backward(dx)
        d_2 = self.x_2.backward(d_sm)
        d_1 = self.x_1.backward(d_2)
        return d_1

    def step(self) -> None:
        self.x_1.step()
        self.x_2.step()
```
This is a simple neural network containing two `Dense` layers, each given the provided activation function and its own copy of the provided optimizer (separate instances are highly recommended). The variables `in_size` and `out_size` are a part of every layer class, and can be set for a layer using `super().__init__()`, which expects the input size and output size as tuples. If constructed like this, it can also be used inside of `Sequence` layers (similar to [PyTorch]'s [Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html)).
### Creation of an Activation Function:
```python
from caspian.activations import Activation
import numpy as np
class ReLU(Activation):
    def forward(self, data: np.ndarray) -> np.ndarray:
        return np.maximum(0, data)

    def backward(self, data: np.ndarray) -> np.ndarray:
        return (data >= 0) * 1
```
Creating a new activation function is quite simple as well, and only expects two functions, `forward()` and `backward()`, which take and return a [NumPy] array. Activations should return an array of the same size as the input for both functions, and can also have an `__init__()` if any internal variables are necessary. The abstract class `cspn.Activation` also provides default functionality for `__call__()`, which allows it to act like a standard Python function. If only one value (representing the last output of the activation) is passed into the `__call__()` function, the forward pass will automatically be performed. If two values are given, then the backward pass is called, and the second array (which is the error gradient with respect to the last output) is multiplied by the backward pass result and returned.
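A quick illustration of that `__call__()` behavior, using the `ReLU` class defined above (the arrays are arbitrary):
```python
import numpy as np

relu = ReLU()
x = np.array([-1.0, 0.5, 2.0])
grad = np.array([0.1, 0.2, 0.3])

out = relu(x)              # one argument: forward pass -> [0.0, 0.5, 2.0]
d_in = relu(out, grad)     # two arguments: backward pass on out, multiplied elementwise by grad
```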
### Creation of a Pooling Function:
```python
from caspian.pooling import PoolFunc
import numpy as np
class Average(PoolFunc):
    def forward(self, partition: np.ndarray) -> np.ndarray:
        return np.average(partition)

    def backward(self, partition: np.ndarray) -> np.ndarray:
        return partition * (1.0 / partition.shape[self.axis])
```
Pooling functions are similar in structure to activation functions, but they return a smaller array rather than one the same size as the partition. Like activations, they can be called like a standard Python function when inheriting from the `PoolFunc` abstract class. Each pooling function has an internal variable `self.axis` (which can be set during initialization) that can be used at any point in both the forward and backward passes.
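A rough usage sketch of the `Average` function above; exactly how the axis is supplied to the constructor is an assumption here, so it is set directly on the instance for illustration:
```python
import numpy as np

pool = Average()
pool.axis = -1                               # illustrative only: pool over the last axis

window = np.array([[1.0, 2.0, 3.0, 4.0]])    # one strided partition of 4 values
fwd = pool.forward(window)                   # -> 2.5, the average of the window
back = pool.backward(window)                 # each entry scaled by 1/4 (the window length)
```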
### Creation of a Loss Function:
```python
from caspian.losses import Loss
import numpy as np
class CrossEntropy(Loss):
    @staticmethod
    def forward(actual: np.ndarray, prediction: np.ndarray) -> float:
        clip_pred = np.clip(prediction, 1e-10, 1 - 1e-10)
        return -np.sum(actual * np.log(clip_pred))

    @staticmethod
    def backward(actual: np.ndarray, prediction: np.ndarray) -> np.ndarray:
        return prediction - actual
```
Loss functions quantify the error of a model's predictions and provide the partial derivative with respect to the output (the gradient array) that a model can use to learn. Losses are not part of any layer or other class and, outside of some special cases, do not store any internal variables. Because of this, losses can be created as either static or instantiable classes, at the user's choice.
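A short usage sketch of the static loss defined above, with an arbitrary one-hot target:
```python
import numpy as np

actual = np.array([0.0, 1.0, 0.0])          # one-hot target
prediction = np.array([0.2, 0.7, 0.1])      # model output (e.g. from a softmax)

loss = CrossEntropy.forward(actual, prediction)    # scalar loss value
grad = CrossEntropy.backward(actual, prediction)   # gradient array fed into the model's backward pass
```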
### Creation of an Optimizer:
```python
from caspian.optimizers import Optimizer
from caspian.schedulers import Scheduler
import numpy as np
class Momentum(Optimizer):
    def __init__(self, momentum: float = 0.9, learn_rate: float = 0.01, *,
                 sched: Scheduler) -> None:
        # sched is keyword-only so it can follow the defaulted parameters above
        super().__init__(learn_rate, sched)
        self.momentum = momentum
        self.previous = 0.0

    def process_grad(self, grad: np.ndarray) -> np.ndarray:
        learn_rate = self.scheduler(self.learn_rate)
        velocity_grad = self.momentum * self.previous - learn_rate * grad
        self.previous = velocity_grad
        return velocity_grad

    def step(self) -> None:
        self.scheduler.step()

    def reset_grad(self) -> None:
        self.previous = 0.0
        self.scheduler.reset()

    def deepcopy(self) -> 'Momentum':
        return Momentum(self.momentum, self.learn_rate, sched=self.scheduler.deepcopy())
```
The general framework for an optimizer is a little bit more complex, but still easy to assemble. The abstract `Optimizer` class initialization takes in two parameters, `learn_rate` as a float, and `sched` as a scheduler class.
The function `process_grad()` is the main transformation of the optimizer. It should process the given gradient array, apply the learning rate (if applicable), and return an array with the same size as the input.
The function `step()` is meant to keep track of the epoch or training iteration of the model that the optimizer is a part of. For the example above, it only calls the internal scheduler's `step()` function and does not modify any variables. However, some more advanced optimizers like `ADAM` may require an internal variable to be kept for this purpose.
Another function expected from optimizers is `reset_grad()`, which clears all previous gradient information and resets the learning rate scheduler for that optimizer.
The function `deepcopy()` is highly recommended if being used on multiple layers of a model, as each layer contains its own version of an optimizer and scheduler. It should pass a deep copy of whatever data structures it contains or needs into the initialization of a new instance.
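A brief usage sketch of the `Momentum` optimizer above, applied to a toy weight array (how a layer actually applies the result internally is up to that layer; `StepLR` is used here only because it appears elsewhere in these examples):
```python
import numpy as np
from caspian.schedulers import StepLR

opt = Momentum(momentum=0.9, learn_rate=0.01, sched=StepLR(10))
weights = np.zeros(3)
grad = np.array([1.0, -2.0, 0.5])

update = opt.process_grad(grad)   # momentum-smoothed, learning-rate-scaled step
weights += update                 # the returned step already carries the negative sign
opt.step()                        # advance the internal scheduler by one training iteration
```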
### Creation of a Learning Rate Scheduler:
```python
from caspian.schedulers import Scheduler
import numpy as np
class ConstantLR(Scheduler):
    def __init__(self, steps: int, const: float = 0.1) -> None:
        self.steps = steps
        self.const = const
        self.epoch = 0

    def __call__(self, init_rate: float) -> float:
        return init_rate * self.const if self.epoch < self.steps else init_rate

    def step(self) -> None:
        self.epoch += 1

    def reset(self) -> None:
        self.epoch = 0

    def deepcopy(self) -> 'ConstantLR':
        return ConstantLR(self.steps, self.const)
```
This is a basic scheduler that multiplies the initial learning rate by a set constant for a specific number of steps. A scheduler is initialized with whatever custom parameters are unique to that subclass, and its `__call__()` method is how it is invoked to process a learning rate. Like optimizers, schedulers also have `step()`, `reset()`, and `deepcopy()` functions, which perform the same operations as described for optimizers above.
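A short usage sketch of the `ConstantLR` scheduler above:
```python
sched = ConstantLR(steps=5, const=0.1)

for _ in range(7):
    lr = sched(0.01)   # 0.001 for the first 5 steps, then back to 0.01
    sched.step()
```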
### Training and Using a Model:
Now, here's an example of how to create a neural network that can recognize digits from the [MNIST](https://keras.io/api/datasets/mnist/) data set using only Caspian tools:
```python
import numpy as np
from caspian.layers import Conv2D, Pooling2D, Reshape, Dense, Container, Sequence
from caspian.activations import Sigmoid, ReLU, Softmax
from caspian.pooling import Maximum
from caspian.losses import BinCrossEntropy
from caspian.optimizers import StandardGD
from keras.datasets import mnist
#Import the dataset and reshape
(xtrain, ytrain), (xtest, ytest) = mnist.load_data()
x_train = np.array(xtrain).reshape(xtrain.shape[0], 784)
x_test = np.array(xtest).reshape(xtest.shape[0], 784)
y_train = np.zeros((ytrain.shape[0], ytrain.max()+1), dtype=np.float32)
for i in range(len(y_train)):
    y_train[i][int(ytrain[i])] = 1
xt = x_train.reshape(-1, 60, 1, 28, 28) / 255.0
yt = y_train.reshape(-1, 60, 10)
print(xt.shape)
print(yt.shape)
#Create the model to be trained
optim = StandardGD(0.001)
d1 = Conv2D(Sigmoid(), 32, 3, (1, 28, 28))
d2 = Pooling2D(Maximum(), 2, (32, 26, 26), 2)
d3 = Conv2D(Sigmoid(), 12, 3, (32, 13, 13))
d4 = Pooling2D(Maximum(), 2, (12, 11, 11), 2)
d5 = Reshape((-1, 12, 5, 5), (-1, 12*5*5))
d6 = Dense(ReLU(), 12*5*5, 100)
d7 = Dense(Sigmoid(), 100, 10)
d8 = Container(Softmax())
Seq1 = Sequence([d1, d2, d3, d4, d5, d6, d7])
Seq1.set_optimizer(optim)
ent = BinCrossEntropy()
#Training
losses = 0.0
for ep in range(50):
    for x, y in zip(xt, yt):
        x_r = Seq1.forward(x, True)
        err_grad = ent.backward(y, x_r)
        loss = ent.forward(y, x_r)

        Seq1.backward(err_grad)
        Seq1.step()
        losses += loss
    print(f"Epoch {ep+1} - {losses / xt.shape[0]}")
    losses = 0.0
```
The example above uses all of the tools that were created to stochastically train a basic neural network that can recognize digits 0 through 9 on a 28x28 size image. Improvements and changes can be made to the model for greater accuracy using other tools in the Caspian library.
### Saving and Loading Layers:
> [!NOTE]
> Saving and loading models may change at an unknown time in the future. If that happens, previously saved `.cspn` files will no longer work with the new format, and the change will be called out in the update that makes it.
Once a model has been trained (or while it is still training), each layer can be exported and loaded again at a later time. Layers, activations, pooling functions, optimizers, and schedulers all have methods which allow them to be encoded into strings and/or saved to files (of type `.cspn`).
#### Saving
Layers can be encoded into a string or saved to a file using the `save_to_file()` method, as shown here:
```python
d1 = Conv2D(ReLU(), 32, 3, (1, 28, 28))
d1.save_to_file("layer1.cspn")
```
If the file name is not specified and no parameters are given to this method, then a string is returned which contains the information of that layer. This includes the activation or pooling function, optimizer, and scheduler of that layer (if any are applicable).
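For example, continuing from the snippet above, the encoded layer can be kept in memory instead of written to disk:
```python
layer_str = d1.save_to_file()   # no file name given: the encoded layer string is returned instead
```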
For other tools like optimizers, schedulers, or functions, the `repr()` function is used in place of a set saving method. It returns a string with the name of the class and all initialized attributes of the object in the order of the initialization function with `/` as a separator (except for schedulers, which use `:`). A quick example:
```python
opt = ADAM(learn_rate = 0.001, sched = StepLR(10))
opt_info = repr(opt)
#Returns "ADAM/0.9/0.99/1e-8/0.001/StepLR:10:0.1"
```
#### Loading
Once a layer has been saved to a file or encoded as a string, it can be re-loaded and re-instantiated from where it was saved. Each layer has a static `from_save()` method, which takes two parameters. The first is a string `context`, which is either the name of the file to load from or the encoded string containing the appropriate information. The second is a boolean `file_load`, which determines whether the context is a file name or the encoded string itself. To use the method on a file:
```python
new_layer = Conv2D.from_save("layer1.cspn", True)
```
If the file provided is incorrectly formatted/modified or the file imported is not an appropriate `.cspn` file, an exception is thrown instead.
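Loading from an encoded string instead works the same way, with `file_load` set to `False`:
```python
layer_str = d1.save_to_file()                    # encoded string, as in the saving example above
new_layer = Conv2D.from_save(layer_str, False)   # context is the string itself, not a file name
```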
For all other saveable tools in the Caspian library, each tool folder has a function which takes the `repr` string and returns a class instance of the encoded object. The functions that correspond to each class type include:
- `Activations` -> `activations.parse_act_info()`
- `Optimizers` -> `optimizers.parse_opt_info()`
- `Pooling` -> `pooling.parse_pool_info()`
- `Schedulers` -> `schedulers.parse_sched_info()`
These classes do not have options to save directly to a file, but the user can export and import them manually if absolutely needed. If the user creates a custom sub-class and wishes to save or load it, they will need to implement an appropriate `repr()` following the same procedure outlined above and add the class to the corresponding tool folder dictionary:
- `Activations` -> `activations.act_funct_dict`
- `Optimizers` -> `optimizers.opt_dict`
- `Pooling` -> `pooling.pool_funct_dict`
- `Schedulers` -> `schedulers.sched_dict`
Loading a class in these categories will look similar to below:
```python
from caspian import activations as act
class CustomFunct(act.Activation):
    ...
    def __repr__(self) -> str:
        ...
#Create instance
a_1 = CustomFunct(...)
saved_str = repr(a_1)
#Load from context string
act.act_funct_dict["CustomFunct"] = CustomFunct
a_2 = act.parse_act_info(saved_str)
```
## Notes
**It's important to note that this library is still a work in progress, and because it relies on very few external resources, it prioritizes efficiency and utility over heavy safety checks. Here are a few things to keep in mind while using Caspian:**
### Memory Safety
> [!CAUTION]
> While most functions and classes in this library are perfectly safe to use and modify, some use unsafe memory operations to greatly increase their speed. Examples include the convolutional and pooling layers, such as `Conv1D`, `Conv1DTranspose`, and `Pooling1D`. For the safety of any machine running Caspian, DO NOT modify the internal variables or functions of these unsafe layers. Any memory-unsafe layer or function carries a warning in its in-line documentation, and changes to critical variables may cause harmful effects such as segmentation faults.
### General Usability
> All classes in this library fit into specific categories of tools that inherit from a basic abstraction ([**See Above**](#information)) and follow specific functionality guidelines which allow them to work seamlessly with one another. To keep that functionality working as intended, avoid modifying variables inside any class that has already been initialized. Some variables, such as the weights of a layer, may be changed safely as long as their shape and integrity are kept the same.
### Gradient Calculation
> Because [NumPy] does not have any integrated automatic differentiation functionality, all gradient calculations performed by each class are done manually. For any new layers that the user may create, an automatic differentiation library may be used for the backward passes as long as it is compatible with [NumPy].
### Further Compatibility
> Caspian only requires Python and [NumPy], so any other libraries that the user wishes to use alongside it will not be required or affected by Caspian's installation. As mentioned previously in [**Gradient Calculation**](#gradient-calculation), any custom class which inherits from a Caspian abstract container may use any helper libraries or frameworks as long as they are [NumPy] compatible.
## Future Plans and Developments
- Recurrent Layers (RNN & LSTM Cells, etc.)
- Transformer grade layers (Encoders, Decoders, etc.)
- More activation functions, base layers, and optimizers.
- Improved model saving and loading.
- More utilities, like train/test data splitting, etc.
[NumPy]: https://github.com/numpy/numpy
[CuPy]: https://github.com/cupy/cupy
[PyTorch]: https://github.com/pytorch/pytorch
[TensorFlow]: https://github.com/tensorflow/tensorflow
[CUDA]: https://developer.nvidia.com/cuda-toolkit