MSc Thesis: “Hedging Derivatives Under Incomplete Markets with Deep Learning”
=============================================================================
| *Buchkov Viacheslav*
| MSc in Applied Mathematics and Informatics
| Machine Learning and Data-Intensive Systems
| Faculty of Computer Science
| NRU Higher School of Economics
Install
-------
::

    pip install deep-hedging

Deep Learning Example
---------------------
::

    from pathlib import Path

    from deep_hedging import ExperimentConfig, EuropeanCall

    from deep_hedging.dl import Trainer, Assessor
    from deep_hedging.dl.models import LSTMHedger
    from deep_hedging.dl.baselines import BaselineEuropeanCall

    # Amend config
    config = ExperimentConfig(
        DATA_ROOT=Path(...),
        OUTPUT_ROOT=Path(...),
        DATA_FILENAME="...",
        REBAL_FREQ="5 min"
    )

    # Train Hedger for 1 epoch
    trainer = Trainer(model_cls=LSTMHedger, instrument_cls=EuropeanCall, config=config)
    trainer.run(1)

    # Assess obtained quality
    assessor = Assessor(
        model=trainer.hedger,
        baseline=BaselineEuropeanCall(dt=trainer.dt).to(config.DEVICE),
        test_loader=trainer.test_loader,
    )
    assessor.run()

    # Save model
    trainer.save(config.OUTPUT_ROOT)

Custom Derivative Example
-------------------------
::

    from pathlib import Path

    from deep_hedging import ExperimentConfig, Instrument

    from deep_hedging.dl import Trainer, Assessor
    from deep_hedging.dl.models import LSTMHedger
    from deep_hedging.dl.baselines import BaselineEuropeanCall

    # Amend config
    config = ExperimentConfig(
        DATA_ROOT=Path(...),
        OUTPUT_ROOT=Path(...),
        DATA_FILENAME="...",
        REBAL_FREQ="5 min"
    )

    # Create custom derivative
    class CustomDerivative(Instrument):
        def __init__(self, *args, **kwargs):
            super().__init__()

        def payoff(self, spot: float) -> float:
            return ...  # any payoff you want - e.g., spot ** 12 - 12 * np.random.randint(12, 121)

        def __repr__(self):
            return "SomeCustomDerivative(param1=..., param2=...)"

    # Train Hedger for 1 epoch
    trainer = Trainer(model_cls=LSTMHedger, instrument_cls=CustomDerivative, config=config)
    trainer.run(1)

    # Save model
    trainer.save(config.OUTPUT_ROOT)

Reinforcement Learning Example
------------------------------
::

    from pathlib import Path

    from deep_hedging import ExperimentConfig, EuropeanCall, seed_everything
    from deep_hedging.rl import DerivativeEnvStep, RLTrainer

    from sb3_contrib import RecurrentPPO
    from stable_baselines3 import SAC, PPO

    # Amend config
    config = ExperimentConfig(
        DATA_ROOT=Path(...),
        OUTPUT_ROOT=Path(...),
        DATA_FILENAME="...",
        REBAL_FREQ="5 min"
    )

    # Create environment
    env = DerivativeEnvStep(n_days=config.N_DAYS, instrument_cls=EuropeanCall)
    env.reset()

    # Train Hedger for 1_000 steps
    trainer = RLTrainer(
        model=RecurrentPPO("MlpLstmPolicy", env, verbose=1),
        instrument_cls=EuropeanCall,
        environment_cls=DerivativeEnvStep,
        config=config,
    )
    trainer.learn(1_000)

    # Assess obtained quality at 100 steps
    trainer.assess(100)

Description of Research Tasks
-----------------------------
**Research Task**: create a universal algorithm that, at each point in
time, produces a vector of weights for the replicating-portfolio assets
in order to dynamically delta-hedge a derivative defined by its payoff
function only. The algorithm should take into account a
“state-of-the-world” embedding, the historical dynamics of the
underlying asset, and the parameters of the derivative (such as the time
to maturity at each point in time). The optimization objective is to
minimize the difference between the derivative’s PnL and the replicating
portfolio’s PnL. The approach might also be adapted to fit a
Reinforcement Learning framework.

**Data**:

* MVP: generate paths via GBM in order to test the basics of the
  architecture (the result should coincide with BSM delta-hedging if no
  constraints are imposed).
* Base: order books for FX and FX options at Moscow Exchange.
* Advanced: use a generative model (GANs, NFs, VAEs, diffusion models)
  to create random paths and then apply the hedging framework.

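
The MVP data step can be sketched with plain NumPy. This is a minimal illustration and not code from the ``deep_hedging`` package: the function name ``simulate_gbm_paths`` and all parameter values are assumptions chosen for the example.

```python
import numpy as np


def simulate_gbm_paths(spot, drift, vol, n_steps, n_paths, dt, seed=0):
    """Simulate GBM paths with the exact log-normal update.

    Returns an array of shape (n_paths, n_steps + 1); column 0 is the initial spot.
    """
    rng = np.random.default_rng(seed)
    shocks = rng.standard_normal((n_paths, n_steps))
    # Log-return per step: (mu - sigma^2 / 2) * dt + sigma * sqrt(dt) * Z
    log_increments = (drift - 0.5 * vol**2) * dt + vol * np.sqrt(dt) * shocks
    log_paths = np.cumsum(log_increments, axis=1)
    log_paths = np.hstack([np.zeros((n_paths, 1)), log_paths])
    return spot * np.exp(log_paths)


# Hypothetical parameters: one year of daily steps.
paths = simulate_gbm_paths(
    spot=100.0, drift=0.05, vol=0.2, n_steps=252, n_paths=10_000, dt=1 / 252
)
print(paths.shape)
```

Because the update is exact in log space, the simulated prices stay positive for any step size, which keeps this sanity-check data free of discretization artifacts.
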
**Baseline**: closed-form solution of delta-hedging (start with BSM
delta-hedging for a vanilla option; base: local volatility, Heston;
advanced: SABR). Also check whether the optimization returns
replications for which put-call parity robustly holds (however, because
of the volatility skew that may be present in real data, even a good
model might fail such a test).

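
For reference, the BSM baseline and the put-call parity check can be written out in a few lines. The sketch below assumes a non-dividend-paying underlying and constant rate and volatility; the function names and the numeric inputs are illustrative and are not part of the ``deep_hedging`` API.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bsm_price_and_delta(spot, strike, rate, vol, ttm, is_call=True):
    """Black-Scholes-Merton price and delta of a European option (no dividends)."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol**2) * ttm) / (vol * sqrt(ttm))
    d2 = d1 - vol * sqrt(ttm)
    discount = exp(-rate * ttm)
    if is_call:
        price = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
        delta = norm_cdf(d1)
    else:
        price = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
        delta = norm_cdf(d1) - 1.0
    return price, delta


# Hypothetical at-the-money option: S = K = 100, r = 5%, sigma = 20%, T = 1 year.
call_price, call_delta = bsm_price_and_delta(100.0, 100.0, 0.05, 0.2, 1.0, is_call=True)
put_price, put_delta = bsm_price_and_delta(100.0, 100.0, 0.05, 0.2, 1.0, is_call=False)

# Put-call parity: C - P = S - K * exp(-r * T)
parity_gap = (call_price - put_price) - (100.0 - 100.0 * exp(-0.05 * 1.0))
print(call_price, put_price, parity_gap)
```

The same parity gap can be evaluated on learned call and put replications: a gap that stays near zero across spot levels suggests the two hedgers are mutually consistent.
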
**Suggested models**: Model 1.

* The model receives an input of the underlying’s historical returns
  and state embeddings for each point in time.
* The model receives another input with the derivative’s parameters
  (time to maturity, strike, barrier level, etc.; for the MVP solution
  it is assumed that a separate model is trained for each payoff type).
* The model receives (price of underlying, state embedding and time to
  maturity) and autoregressively generates :math:`N` vectors (each
  :math:`\in \mathbb{R}^{K_{assets}}`).

The loss function used is MSELoss:

.. math:: \min\limits_{{W}}(PnL_{derivative}-PnL_{portfolio}({W}))^2

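
To make the objective concrete, here is a small NumPy sketch of the squared PnL difference for a single traded asset. It assumes zero interest rates (so the risk-free leg drops out) and weights expressed as units of spot held over each step; ``hedging_mse`` and the toy numbers are illustrative only, not the package's loss implementation.

```python
import numpy as np


def hedging_mse(paths, weights, payoff, premium):
    """Mean squared difference between derivative PnL and hedge-portfolio PnL.

    paths:   (n_paths, n_steps + 1) spot prices
    weights: (n_paths, n_steps) units of spot held over each step
    """
    pnl_portfolio = np.sum(weights * np.diff(paths, axis=1), axis=1)
    pnl_derivative = payoff(paths[:, -1]) - premium
    return np.mean((pnl_derivative - pnl_portfolio) ** 2)


def forward_payoff(terminal_spot):
    # Forward struck at the initial spot of 100 (hypothetical value).
    return terminal_spot - 100.0


paths = np.array([[100.0, 101.0, 99.0, 104.0], [100.0, 98.0, 97.0, 95.0]])

# Holding one unit of spot replicates the forward exactly, so the loss vanishes.
loss = hedging_mse(paths, np.ones((2, 3)), forward_payoff, premium=0.0)
# Not hedging at all leaves the full payoff variance in the loss.
unhedged_loss = hedging_mse(paths, np.zeros((2, 3)), forward_payoff, premium=0.0)
print(loss, unhedged_loss)
```
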
**Architecture**: Model 2. Applying Reinforcement Learning to
derivatives hedging, where the reward function is the risk-adjusted PnL
of the trading book.

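
The text does not fix a particular risk adjustment. One common choice, shown here purely as an assumption, is a mean-variance style reward that penalizes the squared step PnL; the function name and the ``risk_aversion`` value are hypothetical.

```python
def risk_adjusted_reward(step_pnl, risk_aversion=0.1):
    """Per-step reward: PnL penalized by its square (a mean-variance style proxy)."""
    return step_pnl - 0.5 * risk_aversion * step_pnl**2


# Gains and losses of equal size are penalized equally, so the reward is asymmetric.
rewards = [risk_adjusted_reward(pnl) for pnl in (1.0, -1.0, 0.0)]
print(rewards)
```
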
**Experiments outline**:

* One underlying, two assets in the replicating portfolio (spot +
  risk-free asset), fixed market data (spot prices, interest rates):
  linear payoff (Forward).
* One underlying, two assets in the replicating portfolio (spot +
  risk-free asset), fixed market data (spot prices, interest rates,
  historical volatility): vanilla non-linear payoff (European Call,
  European Put).
* One underlying, real market data for each point in time: vanilla
  non-linear payoff (European Call, European Put).
* Application of real-world constraints (transaction costs, especially
  real FX SWAP market data, short-selling, market impact, etc.).
* One underlying, several assets in the replicating portfolio (up to
  all assets that are available for trading), real market data for each
  point in time: vanilla non-linear payoff (European Call, European
  Put).
* One underlying, several assets in the replicating portfolio (up to
  all assets that are available for trading), real market data for each
  point in time: exotic derivatives (start with Barrier Options).
* One underlying, several assets in the replicating portfolio (up to
  all assets that are available for trading), real market data for each
  point in time: path-dependent exotic derivatives (start with Bermudan
  Knock-Out Options).
* Application of Reinforcement Learning.
* ! Correct adjustments for tail-event contingency, introduction of
  VaR restrictions.

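
The payoffs named in the outline can be written as functions of a batch of simulated paths. The sketch below is independent of the package's ``Instrument`` classes; the function names, strike, and barrier level are assumptions, and the barrier is monitored only on the discrete simulation grid.

```python
import numpy as np


def forward_payoff(paths, strike):
    return paths[:, -1] - strike


def call_payoff(paths, strike):
    return np.maximum(paths[:, -1] - strike, 0.0)


def put_payoff(paths, strike):
    return np.maximum(strike - paths[:, -1], 0.0)


def up_and_out_call_payoff(paths, strike, barrier):
    # Knocked out (worth zero) if the spot reaches the barrier at any grid point.
    alive = paths.max(axis=1) < barrier
    return np.where(alive, call_payoff(paths, strike), 0.0)


# Three toy paths: rising, rising through the barrier, falling.
paths = np.array([[100.0, 105.0, 110.0], [100.0, 125.0, 110.0], [100.0, 95.0, 90.0]])
forward_values = forward_payoff(paths, 100.0)
call_values = call_payoff(paths, 100.0)
put_values = put_payoff(paths, 100.0)
barrier_values = up_and_out_call_payoff(paths, 100.0, barrier=120.0)
print(forward_values, call_values, put_values, barrier_values)
```

The second path ends in the money but touches the barrier on the way, which is what makes the knock-out payoff path-dependent while the first three payoffs depend on the terminal spot only.
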
**Ideas for application of real-world constraints**:

1. Transaction costs:

   * Base: use bid-offer prices instead of the mid-price (without
     market impact).
   * Advanced: use a market impact model (exogenous).

2. Correct risk-free rates (implied rate from FX SWAP at the correct
   bid-offer price):

   * (?) Add constraints on the weights of the replicating-portfolio
     assets (in order to account for risk-management, regulatory, and
     open vega / open gamma restrictions).

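
The "Base" transaction-cost idea amounts to charging the half-spread on every rebalancing trade. A minimal sketch, assuming buys execute at the offer and sells at the bid with no market impact; ``trading_cost`` and the quotes below are made up for illustration.

```python
import numpy as np


def trading_cost(position_changes, bid, ask):
    """Cost of crossing the spread relative to trading at the mid-price.

    Buys (positive changes) execute at the ask, sells (negative changes) at the bid.
    """
    mid = 0.5 * (bid + ask)
    buys = np.maximum(position_changes, 0.0)
    sells = np.maximum(-position_changes, 0.0)
    return np.sum(buys * (ask - mid) + sells * (mid - bid))


# Three rebalancing trades against hypothetical quotes with a 0.2 spread.
changes = np.array([0.5, -0.2, 0.1])
bid = np.array([99.9, 100.9, 99.4])
ask = np.array([100.1, 101.1, 99.6])
cost = trading_cost(changes, bid, ask)
print(cost)
```

Subtracting this cost from the replicating portfolio's PnL makes frequent rebalancing expensive, so a hedger trained on the adjusted PnL has a reason to trade less than the frictionless delta would suggest.
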
**Complications / To Be Researched**:

* Compute the gradient at each step of autoregressive generation, not
  for the final PnL only.
* Deal with overfitting to correlations observed in the non-tail state
  of the world when the model is allowed to hedge with not only the
  underlying asset but any other available asset (in a tail event such
  a hedge can produce an unpredictably large loss / gain => until the
  model's behavior is stabilized, the model is expected to produce cheap
  hedging in the non-tail state of the world by taking leveraged
  exposure to the tail event).
* Produce correct ``torch.utils.data.DataLoader`` logic that allows
  using real market data only while having a sufficient number of points
  for training (e.g., produce batches by shifting a window by some step
  and then shuffle them in order to avoid path-dependence in
  optimization); this is expected to achieve a high enough level of
  generalization, outputting a regime-independent solution.

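
The shifting-window idea from the last point can be prototyped with NumPy before wiring it into a ``Dataset``. This is a sketch under assumptions: ``make_windows``, the window length, and the step are illustrative, and the shuffling here only reorders windows, it does not remove the overlap between them.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(prices, window, step, seed=0):
    """Cut one price series into overlapping windows and shuffle their order."""
    # All windows of the given length, then keep every `step`-th start index.
    windows = sliding_window_view(prices, window)[::step]
    rng = np.random.default_rng(seed)
    return windows[rng.permutation(len(windows))]


# Toy series of 10 points: windows of length 4 starting at indices 0, 2, 4, 6.
prices = np.arange(10, dtype=float)
windows = make_windows(prices, window=4, step=2)
print(windows.shape)
```

Each row is a contiguous slice of the original series, so time ordering is preserved inside a training sample while the order in which samples are seen is randomized.
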
Raw data
{
"_id": null,
"home_page": "https://github.com/v-buchkov/deep-hedging",
"name": "deep-hedging",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "deep-hedging, deep hedging, derivatives, hedging, deep learning, reinforcement learning",
"author": "Viacheslav Buchkov",
"author_email": "viacheslav.buchkov@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/fd/4e/87b8cf8bc8db14b0326fdc1c6d22fc60501d0847f72e111fc58ae9a9fb4f/deep_hedging-2.0.2.tar.gz",
"platform": null,
"description": "MSc Thesis: \u201cHedging Derivatives Under Incomplete Markets with Deep Learning\u201d\n=============================================================================\n\n| *Buchkov Viacheslav*\n| MSc in Applied Mathematics and Informatics\n| Machine Learning and Data-Intensive Systems\n| Faculty of Computer Science\n| NRU Higher School of Economics\n\nInstall\n-------\n\n::\n\n pip install deep-hedging\n\nDeep Learning Example\n---------------------\n\n::\n\n from pathlib import Path\n\n from deep_hedging import ExperimentConfig, EuropeanCall\n\n from deep_hedging.dl import Trainer, Assessor\n from deep_hedging.dl.models import LSTMHedger\n from deep_hedging.dl.baselines import BaselineEuropeanCall\n\n # Amend config\n config = ExperimentConfig(\n DATA_ROOT=Path(...),\n OUTPUT_ROOT=Path(...),\n DATA_FILENAME=\"...\",\n REBAL_FREQ=\"5 min\"\n )\n\n # Train Hedger for 1 epoch\n trainer = Trainer(model_cls=LSTMHedger, instrument_cls=EuropeanCall, config=config)\n trainer.run(1)\n\n # Assess obtained quality\n assessor = Assessor(\n model=trainer.hedger,\n baseline=BaselineEuropeanCall(dt=trainer.dt).to(config.DEVICE),\n test_loader=trainer.test_loader,\n )\n assessor.run()\n\n # Save model\n trainer.save(config.OUTPUT_ROOT)\n\nCustom Derivative Example\n-------------------------\n\n::\n\n from pathlib import Path\n\n from deep_hedging import ExperimentConfig, Instrument\n\n from deep_hedging.dl import Trainer, Assessor\n from deep_hedging.dl.models import LSTMHedger\n from deep_hedging.dl.baselines import BaselineEuropeanCall\n\n # Amend config\n config = ExperimentConfig(\n DATA_ROOT=Path(...),\n OUTPUT_ROOT=Path(...),\n DATA_FILENAME=\"...\",\n REBAL_FREQ=\"5 min\"\n )\n\n # Create custom derivative\n class CustomDerivative(Instrument):\n def __init__(self, *args, **kwargs):\n super().__init__()\n\n def payoff(self, spot: float) -> float:\n return ... 
# any payoff you want - e.g., spot ** 12 - 12 * np.random.randint(12, 121)\n\n def __repr__(self):\n return f\"SomeCustomDerivative(param1=..., param2=...)\"\n\n # Train Hedger for 1 epoch\n trainer = Trainer(model_cls=LSTMHedger, instrument_cls=CustomDerivative, config=config)\n trainer.run(1)\n\n # Save model\n trainer.save(config.OUTPUT_ROOT)\n\nReinforcement Learning Example\n------------------------------\n\n::\n\n from pathlib import Path\n\n from deep_hedging import ExperimentConfig, EuropeanCall, seed_everything\n from deep_hedging.rl import DerivativeEnvStep, RLTrainer\n\n from sb3_contrib import RecurrentPPO\n from stable_baselines3 import SAC, PPO\n\n # Amend config\n config = ExperimentConfig(\n DATA_ROOT=Path(...),\n OUTPUT_ROOT=Path(...),\n DATA_FILENAME=\"...\",\n REBAL_FREQ=\"5 min\"\n )\n\n # Create environment\n env = DerivativeEnvStep(n_days=config.N_DAYS, instrument_cls=EuropeanCall)\n env.reset()\n\n # Train Hedger for 1_000 steps\n trainer = RLTrainer(\n model=RecurrentPPO(\"MlpLstmPolicy\", env, verbose=1),\n instrument_cls=EuropeanCall,\n environment_cls=DerivativeEnvStep,\n config=config,\n )\n trainer.learn(1_000)\n\n # Assess obtained quality at 100 steps\n trainer.assess(100)\n\nDescription of Research Tasks\n-----------------------------\n\n**Research Task**: create an universal algorithm that would produce for\neach point of time weights vector for replicating portfolio assets to\ndynamically delta-hedge a derivative, defined by payoff function only.\nThe algorithm should take into account \u201cstate-of-the-world\u201d embedding,\nhistorical dynamics of underlying asset and parameters of a derivative\n(like time till maturity for each point in time). The target function\nfor optimization would be to minimize difference between derivative\u2019s\nPnL and replicating portfolio\u2019s PnL. 
Potentially, approach might be\nadjusted to fit Reinforcement Learning framework.\n\n**Data**: MVP: Generate paths via GBM in order to test basics of the\narchitecture (should coincide with BSM delta-hedging, if no constraints\nare imposed) Base: orderbooks for FX and FX Options at Moscow Exchange\nAdvanced: use generative model (GANs, NFs, VAEs, Diffusion models) to\ncreate random paths and then apply hedging framework\n\n**Baseline**: closed-out solution of delta-hedging (start with BSM\ndelta-hedging for vanilla option, base: local volatility, Heston,\nadvanced: SABR) check, if optimization returns replications, for which\nput-call parity robustly holds (however, due to potentially present\nvolatility skew in real data, potentially even great model might fail\nsuch a test)\n\n**Suggested models**: Model 1. \\* model receives an input of\nunderlying\u2019s historical returns and state embeddings for each point \\*\nmodel receives another input with derivative\u2019s parameters (time till\nmaturity, strike, barrier level etc. \u2014 for MVP solution it is supposed\nthat we train a separate model for each payoff type only) \\* model\nreceives (price of underlying, state embedding and time till maturity)\nand autoregressively generates :math:`N` vectors (each\n:math:`\\in \\mathbb{R}^{K_{assets}}`) loss function used is MSELoss \u2014\n\n.. math:: \\min\\limits_{{W}}(PnL_{derivative}-PnL_{portfolio}({W}))^2\n\n**Architecture**: Model 2. Applying Reinforcement Learning for\nDerivatives Hedging, where reward function is the risk-adjusted PnL of\nthe trading book.\n\n**Experiments outline**: \\* One underlying, two assets in replicating\nportfolio (spot + risk-free asset) fixed market data (spot prices,\ninterest rates) \u2014 linear payoff (Forward). \\* One underlying, two assets\nin replicating portfolio (spot + risk-free asset), fixed market data\n(spot prices, interest rates, historical volatility) \u2014 vanilla\nnon-linear payoff (European Call, European Put). 
\\* One underlying, real\nmarket data for each point of time \u2014 vanilla non-linear payoff (European\nCall, European Put). \\* Application of real-world constraints\n(transaction costs \u2014 especially, real FX SWAP market data,\nshort-selling, market impact etc.). \\* One underlying, several assets in\nreplicating portfolio (up to all assets that are available for trading),\nreal market data for each point of time \u2014 vanilla non-linear payoff\n(European Call, European Put). \\* One underlying, several assets in\nreplicating portfolio (up to all assets that are available for trading),\nreal market data for each point of time \u2014 exotic derivatives (start with\nBarrier Options). \\* One underlying, several assets in replicating\nportfolio (up to all assets that are available for trading), real market\ndata for each point of time \u2014 path-dependent exotic derivatives (start\nwith Bermudian Knock-Out Options). \\* Application of Reinforcement\nLearning. \\* ! Correct adjustments for tail-event contingency,\nintroduction of VaR restrictions.\n\n**Ideas for application of real-world constraints**: 1. Transaction\ncosts: \\* Base: Not mid-price, but bid-offer (without market impact). \\*\nAdvanced: Use market impact model (exogeneous). 2. Correct risk-free\nrates (implied rate from FX SWAP at correct bid-offer price): \\* (?) 
add\nconstraints for weights of replicating portfolio assets (in order to\naccount for risk-management, regulatory, open vega / open gamma\nrestrictions).\n\n**Complications / To Be Researched**: \\* compute gradient at each step\nof autoregressive generation, not for final PnL only \\* deal with\noverfit for correlation in non-tail state of the world, when the model\nis allowed to hedge by not only underlying asset, but any other asset\navailable (as in tail event such hedge can produce unpredictably large\nloss / gain => until model behavior is stabilized, model is expected to\nproduce cheap hedging in non-tail state of the world due to leverage\ntail event \\* produce correct torch.DataLoader logic that allows to use\nreal market data only and have sufficient number of points for training\n(e.g., produce batches via shifting window by some and then shuffle in\norder to avoid path-dependence in optimization \u2014 expected to achieve\nhigh enough level of generalization, outputting regime-independent\nsolution.\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Hedging Derivatives Under Incomplete Markets with Deep Learning",
"version": "2.0.2",
"project_urls": {
"Download": "https://github.com/v-buchkov/deep-hedging/archive/refs/tags/v2.0.2.tar.gz",
"Homepage": "https://github.com/v-buchkov/deep-hedging"
},
"split_keywords": [
"deep-hedging",
" deep hedging",
" derivatives",
" hedging",
" deep learning",
" reinforcement learning"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "fd4e87b8cf8bc8db14b0326fdc1c6d22fc60501d0847f72e111fc58ae9a9fb4f",
"md5": "1f7d72f6a2fe2ef3c2ea4307199599c9",
"sha256": "dc06f95d5a1f9e8a1f7109618a3a707d24ffddf4d5e538f10758791c006e5c28"
},
"downloads": -1,
"filename": "deep_hedging-2.0.2.tar.gz",
"has_sig": false,
"md5_digest": "1f7d72f6a2fe2ef3c2ea4307199599c9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 44855,
"upload_time": "2024-10-29T22:19:35",
"upload_time_iso_8601": "2024-10-29T22:19:35.723730Z",
"url": "https://files.pythonhosted.org/packages/fd/4e/87b8cf8bc8db14b0326fdc1c6d22fc60501d0847f72e111fc58ae9a9fb4f/deep_hedging-2.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-29 22:19:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "v-buchkov",
"github_project": "deep-hedging",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "deep-hedging"
}