Problem 1
Two Research Problems for the Workshop in Finance
HSBC Paris
1 The FX Swap Option Smile Prediction Problem
1.1 Motivation
Foreign exchange (FX) swaps are financial derivatives exchanging cash flows in two different currencies. The price of an FX swap is the fair price K, denominated in domestic currency, of a cash flow at a future date T in a foreign currency. At the maturity T of the contract, the predefined price K in domestic currency and the promised cash flow F in foreign currency are exchanged. As the FX rate X_T at date T is random, the final cash flow (F*X_T - K) is also random. This price is extracted by a no-arbitrage formula from the local discount curves in both currencies and from the FX spot rate X_0. The fair FX forward rate is X_0*(1+r_d)/(1+r_f), where r_d is the discount rate in the domestic market over the period [0,T] and r_f is the discount rate in the foreign market over the same period.
An option on an FX swap, FX option for short, is a financial contract which allows the option buyer to enter into an FX swap at a future date. This means that the option buyer has the right to refuse the transaction at the maturity T, having paid the option price P_0 at the initial date. We will assume that the agents are rational, in the sense that the effective cash flow of the option will be (F*X_T - K)+, where + denotes the positive part. One major FX option parameter is the volatility, which is defined in all major option pricing models, notably Black-Scholes and SABR. The model price of an FX option is increasing in the volatility, and traders are interested in finding the market implied volatility of an option, that is, the value of the volatility parameter for which the model price of the option matches its market price. The smile is then the curve representing the implied volatility as a function of the option strike K (or the surface representing it as a function of strike and maturity).
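To make the notion of implied volatility concrete, here is a minimal root-finding sketch in Python, assuming a simple Black (lognormal) model for an option on the forward; the quote values, the flat domestic rate and the choice of model are illustrative placeholders, not part of the provided data.

```python
# A minimal sketch of implied-volatility extraction by root finding under a
# Black (lognormal) model; forward, strike, maturity and rate are hypothetical.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def black_call(forward, strike, maturity, vol, discount_factor):
    """Black price of a call on a forward."""
    std = vol * np.sqrt(maturity)
    d1 = np.log(forward / strike) / std + 0.5 * std
    d2 = d1 - std
    return discount_factor * (forward * norm.cdf(d1) - strike * norm.cdf(d2))

def implied_vol(market_price, forward, strike, maturity, discount_factor):
    """Volatility for which the model price matches the market price."""
    objective = lambda vol: black_call(forward, strike, maturity, vol, discount_factor) - market_price
    return brentq(objective, 1e-6, 5.0)

# Hypothetical quote: forward 1.10, strike 1.12, 1y maturity, flat 2% domestic rate.
df = np.exp(-0.02 * 1.0)
price = black_call(1.10, 1.12, 1.0, 0.09, df)
print(implied_vol(price, 1.10, 1.12, 1.0, df))   # recovers 0.09
```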
In addition, (interest-rate) swaps are financial derivatives exchanging cash flows indexed on different rates. One leg can be seen as a bond paying a certain rate (possibly fixed or floating) and the other leg as a bond with the same notional but paying another floating rate. Finally, swaptions are financial contracts allowing the option buyer to enter, at a future date, into a swap whose price is predefined at swaption inception. These contracts are used, for instance, by companies to protect themselves against interest-rate movements over a future period.
Unfortunately, market prices for FX options (and consequently the FX smile) are missing or incomplete on certain secondary financial markets, whereas prices for other option contracts (monocurrency swaptions) are broadly available. The goal of this project is to calibrate the FX smile for the SABR model from the monocurrency smiles (for the SABR model), the FX term structure and domestic OIS rates, on markets where FX option prices are quoted. The resulting predictor will then be used to imply an FX smile on markets without quoted FX options.
1.2 Problem
We place ourselves in a supervised learning optimization setup (see Hastie, Tibshirani, and Friedman (2008)), relying on a database of features for numerous FX options in different market contexts, along with the associated smiles. Training samples and testing samples will be provided to the students.
The training set and the testing set will involve disjoint sets of currency pairs, so as to assess the robustness of the predictor across different markets.
1.3 Task
The task at hand is to apply and compare several learning/regression techniques to the FX option smile learning problem, such as radial basis functions, neural networks (see e.g. Hernandez (2017)), inverse distance weighting, random forests, boosting techniques, Chebyshev interpolation (see Gaß, Glau, and Mair (2017)), kriging (see Ankenman, Nelson, and Staum (2010)), .... Students are also free to add feature transformations (kernels, scaling, ...) to the proposed models.
Potential hyperparameters should be chosen through a cross-validation procedure. The students will be responsible for building their own validation procedure (k-fold sampling or other). The testing error will be computed with the squared loss. Once the testing error is considered sufficiently small, attention will be paid to the contribution of each feature and to the study of overfitting. A residual/error analysis could highlight the limit cases of the fitted models and, more precisely, the market contexts in which the results become unstable.
Students are left with the choice of technology in terms of language (Python, C++, ...) and framework (TensorFlow, ...).
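As one possible starting point (Python with scikit-learn is only one of the technology choices left open above), here is a minimal sketch of hyperparameter selection by k-fold cross-validation with a squared-loss testing error; the feature matrix, target, model and hyperparameter grid are synthetic placeholders standing in for the provided samples.

```python
# A minimal cross-validation sketch, assuming the provided samples have been
# loaded into a feature matrix X (market/contract features) and a target y
# (implied volatilities); everything below is an illustrative placeholder.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))                                     # placeholder features
y = 0.1 + 0.02 * X[:, 0] + rng.normal(scale=0.005, size=500)      # placeholder smile points

X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [3, 6, None]},
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)
print("selected hyperparameters:", search.best_params_)
print("test squared loss:", mean_squared_error(y_test, search.predict(X_test)))
```

To mimic the disjoint currency pairs of the training and testing sets, a group-wise split such as scikit-learn's GroupKFold, keyed on the currency pair, could replace the plain KFold above.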
2 The Swaption Smile Deformation Learning Problem
2.1 Motivation
Swaps are financial derivatives exchanging cash flows indexed on different rates. One leg can be seen as a bond paying a certain rate (possibly fixed or floating) and the other leg as a bond with the same notional but paying another floating rate.
Additionally, swaptions are financial contracts which allow the option buyer to enter, at a future date, into a swap whose price is predefined at swaption inception, in exchange for a certain strike. These contracts are used, for instance, by companies to protect themselves at least cost against interest-rate movements over a future period. By convention, a T/U swaption gives the right, in T years, to enter into a predefined swap which pays cash flows during U years. Thus the underlying swap ends in (T+U) years. T is called the maturity and U the expiry.
One major swaption pricing parameter is the volatility, which is defined in all major pricing models (Black-Scholes, SABR). The price of a swaption is increasing in the volatility, and traders are interested in finding the implied volatility corresponding to option market prices. By implied volatility in a model, we mean the value of the volatility parameter for which the model price of a given option matches its market price. The smile is then a curve representing the volatility as a function of the option strike (or a surface representing it as a function of strike and maturity). In the case of swaptions, the smile is in fact a swaption implied volatility (SIV) cube, since the volatility has three dimensions: maturity, expiry and strike.
One crucial input for pricing swaptions is the forward matrix, which gives the price of the swap for each pair (T/U). It encodes the market view on the forward rate curve at future dates. Contrary to the swaption holder, the forward swap buyer cannot refuse to enter into the swap when it is unfavorable.
Some no-arbitrage relationships exist on the smile. These relationships should be satisfied at all times; otherwise the option seller risks losing money. For instance, for a fixed pair (T/U), the smile must be convex with respect to the strike; otherwise it would be profitable to buy a swaption with a high strike and sell a swaption with a smaller strike (see Mercurio (2007) for an explanation).
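As an illustration of how such a condition can be monitored numerically, here is a minimal sketch of a discrete convexity check in the strike direction (non-negative second divided differences on a strike grid); the quotes used below are placeholders, not market data.

```python
# A minimal sketch of a discrete convexity check in the strike direction:
# on a strike grid, convexity requires non-decreasing first divided differences.
# The quotes below are illustrative placeholders.
import numpy as np

def is_convex_in_strike(strikes, quotes, tol=1e-12):
    """Return True if the quoted curve has non-negative second differences."""
    strikes, quotes = np.asarray(strikes, float), np.asarray(quotes, float)
    order = np.argsort(strikes)
    k, q = strikes[order], quotes[order]
    slopes = np.diff(q) / np.diff(k)          # first divided differences
    return bool(np.all(np.diff(slopes) >= -tol))

strikes = [0.01, 0.015, 0.02, 0.025, 0.03]
quotes = [0.0210, 0.0165, 0.0130, 0.0105, 0.0090]    # convex example
print(is_convex_in_strike(strikes, quotes))           # True
```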
Unfortunately, the SIV cube is not frequently updated, which is a limiting factor for trading on an intraday basis. On the contrary, the forward matrix information is easily accessible on an intraday basis. The goal of this project is to update the SIV cube given the previous SIV cube, the previous forward matrix and the new forward matrix.
2.2 Problem
We place ourselves in a supervised learning optimization setup (see Hastie, Tibshirani, and Friedman (2008)), relying on a database of SIV cubes and forward matrices for each day. Training samples and testing samples, with numerous examples, will be provided to the students. Some feature transformations will be proposed to the students, but they will also be free to go back to the initial data and customize their own feature transformations.
To address the SIV cube updating problem, two levels of estimation are possible, in order of increasing difficulty:
1. The first step consists in estimating a predictor separately for each (maturity, expiry) pair in the cube. Particular attention is paid to convexity in strike, so as to respect the no-arbitrage condition in the strike direction.
2. We then try and fit the whole cube globally or, at least, take into account a condensed representation of the whole cube, in order to predict points in the new cube that are consistent across strikes and maturities (see the sketch below).
Step 1 has already been addressed in the context of forward crude oil prices and interest-rate swaps by Kondratyev (2017). The challenge in our cube deformation learning problem lies in its high dimensionality and in the relatively small size of the exogenous feature set (here, the forward matrix).
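A minimal sketch of the condensed representation mentioned in step 2, assuming the daily SIV cubes have been flattened into a matrix of shape (days, cube points); the shapes, the synthetic cubes and the retained rank are illustrative assumptions.

```python
# A minimal sketch of a condensed cube representation via truncated SVD, assuming
# the SIV cubes have been flattened into a matrix `cubes` of shape
# (n_days, n_maturities * n_expiries * n_strikes); all values are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_days, n_maturities, n_expiries, n_strikes = 250, 10, 8, 7
cubes = rng.normal(loc=0.2, scale=0.02, size=(n_days, n_maturities * n_expiries * n_strikes))

mean_cube = cubes.mean(axis=0)
U, s, Vt = np.linalg.svd(cubes - mean_cube, full_matrices=False)

rank = 5                                    # number of retained modes
scores = U[:, :rank] * s[:rank]             # condensed daily representation
reconstructed = mean_cube + scores @ Vt[:rank]

rel_error = np.linalg.norm(reconstructed - cubes) / np.linalg.norm(cubes)
print(f"relative reconstruction error with {rank} modes: {rel_error:.3e}")
```

The deformation learning problem then reduces to predicting the change in the low-dimensional scores from the forward-matrix features; consistency across strikes and maturities is inherited from the retained modes.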
2.3 Task
The task at hand is to apply and compare several learning techniques to the swaption SIV cube deformation problem, such as SVD, neural networks, wavelets (see e.g. Hastie, Tibshirani, and Friedman (2008)) or kernel methods (see Murphy (2012)), .... Students are also free to add feature transformations (kernels, scaling, ...) to the proposed models.
Potential hyperparameters should be chosen through a cross-validation procedure. The students will be responsible for building their own validation procedure (K-Fold sampling or other). The testing error will be computed as the average distance on testing examples between the predicted cube and the realized one.
Students are left with the choice of technology in terms of language (Python, C++, ...) and framework (TensorFlow, ...). Tackling this subject requires skills in computer science and statistics, and preferably some financial culture.
References
1. Ankenman, B., B. Nelson, and J. Staum (2010). Stochastic kriging for simulation metamodeling. Operations Research 58(2), 371-382.
2.Gaß, M., K. Glau, and M. Mair (2017). Magic points in finance: Empirical integration for parametric option pricing. SIAM Journal on Financial Mathematics 8, 766-803.
3. Hastie, T., R. Tibshirani, and J. Friedman (2008). The Elements of Statistical Learning (2nd Ed.). Springer Series in Statistics. Freely available at
https://web.stanford.edu/~hastie/Papers/ESLII.pdf.
4. Hernandez, A. (2017). Model calibration with neural networks. Risk Magazine (June 1-5). Preprint available at SSRN 2812140; code available at
https://github.com/Andres-Hernandez/CalibrationNN.
5.Kondratyev, A. (2017). Learning curve dynamics with artificial neural networks.
6. Mercurio, F. (2007). No-arbitrage conditions for cash-settled swaptions. Technical report, Banca IMI.
7. Murphy, K. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
Problem 2
Recognize Fraudulent Trading Behavior
UniDT Technology
1 Scenario
Use information about users' login behavior on an e-commerce platform to recognize fraudulent trades. That is to say, a model that can recognize users' fraudulent trading behavior needs to be built; the model must make strong business sense and be explainable.
The model needs to achieve two goals:
1. Use users' login behavior to recognize users' abnormal trades.
2. Use the model to identify fraudulent groups, export the fraudulent groups' labels, and extract the behavioral characteristics of the fraudulent groups.
2 Requirements
Given labeled positive and negative samples, learn an ex-ante risk model, extract risk characteristics, and identify fraudulent groups.
1. Data: users' login behavior and online trading behavior.
2. Algorithm: free choice.
3. Model output: should include elements such as “is this a risky trade” and “fraudulent group label”.
The model's results will be validated at the level of trade ID and group ID (groups are to be defined by the students).
Validation Criteria: Accuracy, Precision, Recall.
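A minimal sketch of the trade-level classification and its validation against accuracy, precision and recall, assuming the login and trading behavior data have been aggregated into a feature matrix with a binary fraud label; the synthetic features, labels and the gradient-boosting model are illustrative choices only, since the algorithm is left free.

```python
# A minimal sketch of the trade-level classification step; the feature matrix,
# labels and model choice below are illustrative placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))                                               # placeholder behavior features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=2000) > 2.0).astype(int)      # placeholder fraud labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)

print("accuracy :", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred, zero_division=0))
print("recall   :", recall_score(y_test, pred))
```

The second goal (fraudulent groups) can then be approached, for example, by clustering the accounts flagged as risky on the same behavioral features and exporting the cluster labels as group labels.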
Problem 2 (Chinese Version)
Recognition of Fraudulent Trading Behavior
UniDT Technology
1 Scenario
Use information about users' login behavior on an e-commerce platform to identify users' fraudulent trades. Build a model that recognizes users' abnormal login behavior and has strong business explainability, and use it to identify users' abnormal trades.
This functionality needs to achieve two goals:
1. Based on users' login behavior information, identify users' abnormal trading behavior.
2. Use the model to identify fraudulent user groups, output fraudulent user group labels, and extract the behavioral characteristics of the fraudulent groups.
2 Requirements
Given positive and negative samples, learn an ex-ante risk model, extract risk characteristics, and identify fraud rings.
1. Modeling data: users' login behavior and online trading behavior data.
2. Algorithm selection: free choice.
3. Model output: should include elements such as “is this a risky trade” and “fraudulent user group label”.
Model results will be validated along the dimensions of trade ID and group ID (groups are to be defined by the students).
Validation criteria: model accuracy, precision and recall.
Problem 3
Build a Trading Algorithm via Machine Learning
MathWorks
1 Overview
Machine Learning is invigorating trading and portfolio management, offering new opportunities in strategy development, factor analysis and identification, and, so many would have us believe, better prediction. However, machine learning also comes with the interrelated risks of overfitting, increased model complexity, risk management challenges, and poor model performance.
Through discussion and team work in this event, you'll gain hands-on experience with potential machine learning technologies, how readily they can be applied to the data at hand, and how you can quickly incorporate techniques into back-test frameworks. We need to develop tools and techniques to assess overfitting, and consider ways to apply real-time testing and run models live, if you dare.
We will not advocate that you should use machine learning, but will discuss how easily and quickly techniques can be prototyped, and if you do decide to “go live”, how you can mitigate some of your risks.
2 Highlights
- Learn, apply and test machine learning algorithms faster, using useful, simple, convenient applications.
- Incorporate machine learning algorithms into your testing, trading and portfolio management infrastructures.
3 Agenda
- MATLAB Predictive Analytics Backgrounder, with Reference to Relevant New Features: Timetable, Tall, Live Editor, Apps, App Designer and Database Interactivity
- An Introduction: Machine Learning, Neural Networks and Japanese Equities
- FX EuroDollar Strategy Development and Back-testing, with:
  - Regression & Supervised Learning Classification Trees
  - Convolutional Neural Networks [Deep Learning]
  - LSTM “Deep” Networks [Deep Learning]
- Deploying Trained Models
Problem 4
Research on the Information Community Characteristics of Public Offering Funds and Their Change Trends
Shenzhen Stock Exchange
1 Construction of information network
Based on the quarterly shareholding information of public offering funds, construct an information network based on shareholding synergy. The method may follow Blocher (2016) (see Reference 1 below).
2 Analysis of the characteristics of the information community and the trend of its change
Information clusters can be constructed according to the degree of connection between accounts. Following the practice of Bock and Husain (1950) (Reference 2 below), subgroups are built up repeatedly until the ratio of within-group to between-group connection strength no longer decreases as a small number of new members is added. From the constructed subgroups, we can observe the information community distribution of A-share institutional investors.
Besides this, are there other, better ways to compute subgroups?
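As one possible alternative way to compute subgroups, here is a minimal sketch that builds a co-holding network from a fund-by-stock holdings matrix and extracts communities with a modularity-based method; the random holdings matrix and the simple overlap weight are illustrative stand-ins for the actual quarterly data and the Blocher (2016) synergy measure.

```python
# A minimal sketch: co-holding network from a binary holdings matrix plus a
# modularity-based community detection; all data below are synthetic placeholders.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(0)
n_funds, n_stocks = 30, 200
holdings = (rng.random((n_funds, n_stocks)) < 0.05).astype(float)   # 1 if fund holds stock

# Edge weight = number of stocks jointly held by two funds (holding-synergy proxy).
overlap = holdings @ holdings.T
G = nx.Graph()
for i in range(n_funds):
    for j in range(i + 1, n_funds):
        if overlap[i, j] > 0:
            G.add_edge(i, j, weight=overlap[i, j])

communities = greedy_modularity_communities(G, weight="weight")
for k, members in enumerate(communities):
    print(f"community {k}: {sorted(members)}")
```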
3 Analysis of factors affecting the distribution of community
We examine which factors affect community formation:
1. The city where the institution's office is located.
2. The distribution of the universities from which the investment managers graduated.
3. The distribution of the investment managers' securities business departments (brokerage branches).
4 Analysis of the change characteristics of the information community
What changes have taken place in China's information communities since 2010?
References
1. Blocher, J. (2016). Network externalities in mutual funds. Journal of Financial Markets 30, 1-26.
2. Bock, R. and S. Husain (1950). An adaptation of Holzinger's B-coefficients for the analysis of sociometric data. Sociometry 11, 146-153.
Problem 4 (Chinese Version)
Research on the Information Community Characteristics of Public Offering Funds and Their Change Trends
Shenzhen Stock Exchange
1 Construction of the information network
Using the quarterly shareholding information of public offering funds, construct an information network based on shareholding synergy. The method may follow the approach of Blocher (2016).
2 Analysis of the characteristics of the information community and its change trend
Can information clusters be constructed according to the degree of connection between accounts? The approach currently under consideration is as follows. Following Bock and Husain (1950), subgroups are built up repeatedly until the ratio of within-group to between-group connection strength no longer decreases as a small number of new members is added. From the constructed subgroups, examine the information community distribution of A-share institutional investors.
Besides this, are there other, better ways to compute subgroups?
3 Analysis of the factors affecting the community distribution
We examine whether the following factors affect community formation:
1. The city where the institution's office is located.
2. The distribution of the universities from which the investment managers graduated.
3. The distribution of the investment managers' securities business departments (brokerage branches).
4 Analysis of the change characteristics of the information community
What changes have taken place in China's information communities since 2010?
References
1. Blocher, J. (2016). Network externalities in mutual funds. Journal of Financial Markets 30, 1-26.
2. Bock, R. and S. Husain (1950). An adaptation of Holzinger's B-coefficients for the analysis of sociometric data. Sociometry 11, 146-153.
Problem 5
Two Cases for the Workshop in Finance
Zhengzhou Commodity Exchange
1 Mining Customer Actual-Control Relationships from the Consistency of Day-to-Day Position Changes
1.1 Overview
The concept of convergent trading is very broad and general. It refers to trading through accounts under an actual-control relationship, such as matched orders between accounts, splitting positions across accounts and self-trading; that is, multiple (unreported) accounts are controlled by one person (or entity), resulting in coordinated trading, manipulation of or influence on prices, transfers of funds and so on, which ultimately disrupts the order of trading, obstructs fair trading and threatens financial security. Under the specific rules of the China Securities Regulatory Commission (CSRC) or the exchange, too many suspected relationships are flagged, whereas the exchange is concerned with those trades involving substantive control behavior; intelligent algorithms such as machine learning and data mining are therefore urgently needed for targeted mining and analysis of potentially related accounts that are actually under common control.
Consistency in position increases and decreases is one manifestation of convergent behavior. The same market view held by an actual-control group can be reflected, at the same (or similar) time, in consistent increases or decreases of long and short positions. The factors of interest include the buy/sell flag, contract code, current position, previous day's position, arbitrage positions, hedging positions, speculative positions, etc.
In identifying consistent position behavior, the main difficulties are the imbalance of the data (the number of customers in actual-control groups is far smaller than the total number of customers in the market), the high dimensionality of the data (different products, contracts, periods, directions, etc.) and the noise in the data (many coincidental cases of customers changing positions consistently). In constructing a recognition scheme, the main difficulties are the quality of attribute selection, the uncertainty in the number of clusters, the accuracy of the data similarity measure, and the timeliness and stability of the clustering; these difficulties often have a great influence on how well convergent trading is ultimately identified.
1.2 Research Goals
1. Find the regular characteristics of consistent position behavior and extract effective attributes for identifying convergent trading behavior.
2. Propose and design a new similarity measure suited to time series of consistent position behavior (a baseline sketch follows this list).
3. Use 1) and 2) to build an effective scheme (model) for identifying consistent convergent behavior.
Ultimately improve the accuracy, stability and timeliness of the identification scheme.
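As a baseline for goal 2, here is a minimal sketch measuring the similarity of two accounts' daily position-change series by the correlation of their day-over-day changes; the two position series are illustrative placeholders, and this simple measure is only a starting point to improve upon (e.g. with dynamic time warping to allow for small time lags).

```python
# A baseline similarity sketch: correlation of day-over-day position changes
# of two accounts; the series below are illustrative placeholders.
import numpy as np

def position_change_similarity(positions_a, positions_b):
    """Correlation of daily position changes of two accounts."""
    da = np.diff(np.asarray(positions_a, float))
    db = np.diff(np.asarray(positions_b, float))
    if da.std() == 0 or db.std() == 0:
        return 0.0
    return float(np.corrcoef(da, db)[0, 1])

acct_a = [100, 120, 120, 90, 95, 130]
acct_b = [200, 240, 245, 180, 190, 260]   # moves in the same direction as acct_a
print(position_change_similarity(acct_a, acct_b))
```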
2 Option Modeling for White Sugar American Options on the Zhengzhou Futures Exchange
Because the Zhengzhou futures exchange's white sugar options are American-style, they cannot in general be priced with analytical formulas; they need to be modeled by numerical methods, such as binomial or trinomial trees (a minimal binomial-tree sketch follows the objectives below). The objectives of this case study are:
1. Based on the price information of white sugar options on the Zhengzhou futures exchange, build a pricing model for the white sugar options and calibrate the option model to the implied volatilities.
2. Based on the price volatility of white sugar spot/futures before and after the launch of the white sugar option products on the Zhengzhou futures exchange, briefly explain the effect of the options on spot/futures prices.
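A minimal Cox-Ross-Rubinstein binomial-tree sketch for an American option on a futures price, as one possible numerical scheme; the contract parameters in the example are illustrative placeholders, not actual white sugar quotes.

```python
# A minimal CRR binomial-tree sketch for an American option on a futures price;
# the example parameters are hypothetical, not market data.
import numpy as np

def american_option_on_future_crr(f0, strike, maturity, rate, vol, steps=500, is_call=True):
    dt = maturity / steps
    u = np.exp(vol * np.sqrt(dt))
    d = 1.0 / u
    p = (1.0 - d) / (u - d)                 # zero drift for a futures underlying
    disc = np.exp(-rate * dt)
    payoff = (lambda s: np.maximum(s - strike, 0.0)) if is_call else (lambda s: np.maximum(strike - s, 0.0))

    # Terminal futures prices and payoffs (node i has `steps - i` up moves).
    prices = f0 * u ** np.arange(steps, -1, -1) * d ** np.arange(0, steps + 1)
    values = payoff(prices)

    # Backward induction with early exercise at every node.
    for step in range(steps - 1, -1, -1):
        prices = prices[:step + 1] * d      # u * d = 1, so this shifts one level back
        continuation = disc * (p * values[:-1] + (1 - p) * values[1:])
        values = np.maximum(continuation, payoff(prices))
    return float(values[0])

print(american_option_on_future_crr(f0=5500.0, strike=5600.0, maturity=0.5, rate=0.03, vol=0.18))
```

Calibration to implied volatility (objective 1) can then proceed by inverting such a pricer with a root finder, in the same spirit as the implied-volatility extraction sketched in Problem 1.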
Problem 5 (Chinese Version)
Research Topics for the Workshop
Zhengzhou Commodity Exchange
1 Mining Customer Actual-Control Relationships from the Consistency of Day-to-Day Position Changes
1.1 Overview
The concept of convergent trading is very broad. It refers to trading through accounts under an actual-control relationship, such as matched orders between accounts, splitting positions across accounts and self-trading; that is, multiple (unreported) accounts are controlled by one person (or entity), resulting in coordinated trading, manipulation of or influence on prices, transfers of funds and so on, which ultimately disrupts the order of trading, obstructs fair trading and threatens financial security. Under the specific rules of the China Securities Regulatory Commission or the exchange, too many suspected relationships are flagged, whereas the exchange is concerned with those trades involving substantive control behavior; intelligent algorithms such as machine learning and data mining are therefore urgently needed for targeted mining and analysis of potentially related accounts that are actually under common control.
Consistency in position increases and decreases is one manifestation of convergent behavior. The same market view held by an actual-control group can be reflected, at the same (or similar) time, in consistent increases or decreases of long and short positions. The factors of interest include the buy/sell flag, contract code, current position, previous day's position, arbitrage positions, hedging positions, speculative positions, etc.
In identifying consistent position behavior, the main difficulties are the imbalance of the data (the number of customers in actual-control groups is far smaller than the total number of customers in the market), the high dimensionality of the data (different products, contracts, periods, directions, etc.) and the noise in the data (many coincidental cases of customers changing positions consistently). In constructing a recognition scheme, the main difficulties are the quality of attribute selection, the uncertainty in the number of clusters, the accuracy of the data similarity measure, and the timeliness and stability of the clustering; these difficulties often have a great influence on how well convergent trading is ultimately identified.
1.2 Research Goals
1. Find the regular characteristics of consistent position behavior and extract effective attributes for identifying convergent trading behavior.
2. Propose and design a new similarity measure suited to time series of consistent position behavior.
3. Use 1) and 2) to build an effective scheme (model) for identifying consistent convergent behavior.
Ultimately, improve the accuracy, stability and timeliness of the identification scheme.
2 Modeling White Sugar Options on the Zhengzhou Futures Exchange
Because the white sugar options on the Zhengzhou futures exchange are exercised American-style, there is no general analytical pricing formula for them; they need to be modeled with numerical methods such as binomial or trinomial trees. The objectives of this case study are:
1. Based on the price information of white sugar options on the Zhengzhou futures exchange, build a pricing model for the white sugar options and calibrate the option model to the implied volatilities.
2. Based on the price volatility of white sugar spot/futures before and after the launch of the white sugar option products, briefly explain the effect of the options on spot/futures prices.
Problem 6
FX Volatility Prediction and Insights from Tick Data and Market Events
OANDA
1 Motivation
Volatility permeates all financial markets and is a fundamental feature of any hedging and trading algorithm. Volatility can change rapidly, going from low to high and back again in days, hours, minutes or even seconds. What causes these rapid changes is often not known in advance; however, there are times when we can loosely predict when volatility will likely be higher or lower. It is well known that volatility clusters: large changes tend to be followed by large changes, of either sign, and small changes tend to be followed by small changes (Mandelbrot, 1963).
Market reaction also changes in different volatility regimes; in high volatility periods there tend to be higher volumes, and vice versa. Increasing (decreasing) volume over some fixed time period results in an increase (decrease) in the “tick rate”, i.e., the number of bid-ask prices that are being seen per second. The market reacts to news events in different ways, but there tends to be increased volatility around certain known events, for example USD Non-farm Payrolls or FOMC announcements and minutes.
Understanding the relationship between the current volatility regime, market events, changes in volume and both the emergent patterns and future state of volatility can improve hedge effectiveness and add value to client offerings.
2 Problem
OANDA would like to understand the dynamics of FX volatility and the reaction to known market events. Questions of particular interest are:
1.How can we classify and characterize the current volatility regime for a given FX pair, a given single currency, or the overall FX market[b]? What can we say about volatility clustering for the pair and what does this tell us about emergent patterns and future state of volatility?
(Remark: [b] The “FX market” will be defined as a weighted average across FX pairs.)
2.What is the relationship between the current volatility regime and market reaction to known events?
3.Given a particular FX pair, can we provide reasonable bounds for its trading range in response to known market events? What, if any, is the relationship between the tick rate and the trading range for a given FX Pair?
4.Over what period of time does the change in volatility, in response to a known event, dissipate back to the levels prior to the event? Is this related to “tick rate”?
5. How can we combine this information for volatility prediction at the level of a single FX pair, at the single currency level or at the overall FX market level?
6.What, if any, is the relationship between the current volatility regime and known events and market direction? Do some events exhibit more upside than downside in certain regimes, and vice versa?
To address these questions it will be necessary to use information about known market events from publicly available sources. Due to the vast amount of tick data that will be available, as well as the unknown non-linear relationships we are interested in investigating, solutions are likely to involve sophisticated statistical, machine learning and artificial neural network techniques.
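As an illustrative starting point for question 1, here is a minimal sketch that resamples tick mids to one-minute bars, computes a rolling realized volatility and the tick rate, and buckets the volatility into regimes by historical quantiles; the synthetic ticks, the one-minute/one-hour windows and the tercile thresholds are assumptions, not prescriptions.

```python
# A minimal regime-characterization sketch on synthetic tick data; bar size,
# rolling window, annualization and regime thresholds are illustrative choices.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_ticks = 50_000
times = pd.to_datetime("2024-01-02") + pd.to_timedelta(np.cumsum(rng.exponential(0.5, n_ticks)), unit="s")
mid = 1.10 * np.exp(np.cumsum(rng.normal(0, 1e-5, n_ticks)))     # placeholder EUR/USD mids
ticks = pd.Series(mid, index=times)

bars = ticks.resample("1min").last().ffill()
returns = np.log(bars).diff().dropna()

window = 60                                                       # minutes
realized_vol = returns.rolling(window).std() * np.sqrt(365 * 24 * 60)   # annualized
tick_rate = ticks.resample("1min").count()                        # ticks per minute

low, high = realized_vol.quantile([0.33, 0.66])
regime = pd.cut(realized_vol, bins=[-np.inf, low, high, np.inf], labels=["low", "medium", "high"])
print(regime.value_counts())
print(pd.concat([realized_vol.rename("realized_vol"), tick_rate.rename("tick_rate")], axis=1).corr())
```

The last line gives a first, crude look at the relationship between the tick rate and the volatility level raised in questions 3 and 4.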
References
1.Mandelbrot, B. B., The Variation of Certain Speculative Prices, The Journal of Business 36, No. 4, (1963), 394-419.
Optional Problem
Backtesting a VaR Model for the Chinese Equity Market
Yuzhe ZHU daniel.zhu@msci.com
Disclaimer: this problem proposal reflects the personal views of the author and does not reflect any view of MSCI.
1 Introduction
A variety of techniques for measuring day-to-day investment risk have been investigated by both academia and industry. Still, choosing a proper model for a particular market segment requires careful backtesting. In this exercise we want to backtest whether a direct implementation of a well-developed baseline model works well on the Chinese stock market, and to investigate more sophisticated risk modeling choices if the backtest results of the baseline model are not satisfactory.
2 Baseline model and backtesting methods
We want to start with the Value-at-Risk (VaR) methodology from MSCI RiskMetrics[a]. It is one of the most widely used risk models in the investment community. Although simple, it has proven robust in many markets.
Various methods have been developed for VaR backtesting. A brief introduction to several popular methods can be found on this site[b]. To thoroughly understand the performance of a model, several different methods are usually employed together. In this exercise we want to at least consider a coverage test to analyze the exceedances of realized losses over VaR, and a distribution test to assess the goodness-of-fit of the risk model.
(Remarks: [a] https://www.msci.com/documents/10199/dbb975aa-5dc2-4441-aa2d-ae34ab5f0945. [b] https://www.value-at-risk.net/backtesting-value-at-risk/.)
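A minimal sketch of the baseline exercise for a single asset: an EWMA (RiskMetrics-style) 1-day 99% VaR forecast followed by a Kupiec proportion-of-failures coverage test; the synthetic returns and the classic decay factor of 0.94 are illustrative assumptions, not the exact specification in the referenced document.

```python
# A minimal EWMA VaR forecast plus Kupiec coverage test; returns are synthetic.
import numpy as np
from scipy.stats import norm, chi2

rng = np.random.default_rng(0)
returns = rng.standard_t(df=5, size=750) * 0.015           # placeholder daily returns

lam, alpha = 0.94, 0.99
z = norm.ppf(alpha)                                         # one-sided normal quantile
sigma2 = returns[:30].var()                                 # seed variance
var_forecast = np.empty_like(returns)
for t in range(len(returns)):
    var_forecast[t] = z * np.sqrt(sigma2)                   # 1-day VaR forecast for day t (loss units)
    sigma2 = lam * sigma2 + (1 - lam) * returns[t] ** 2     # EWMA update after observing day t

exceedances = int((-returns > var_forecast).sum())          # days when the loss exceeded VaR
n, p = len(returns), 1 - alpha

# Kupiec proportion-of-failures (unconditional coverage) test.
pi_hat = exceedances / n
lr_pof = -2 * ((n - exceedances) * np.log(1 - p) + exceedances * np.log(p)
               - (n - exceedances) * np.log(1 - pi_hat) - exceedances * np.log(pi_hat))
p_value = 1 - chi2.cdf(lr_pof, df=1)
print(f"exceedances: {exceedances}/{n}, Kupiec p-value: {p_value:.3f}")
```

A distribution test can be run on the same output by mapping each realized return to its forecast quantile (a probability-integral-transform series) and testing that series for uniformity, e.g. with a Kolmogorov-Smirnov statistic.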
3 Objectives
1. We want to start by selecting a universe of representative equities from the Chinese market, around 50 or more. It might be helpful to further divide the equities into segments, e.g. large cap and small cap. We want to see whether one model works well for the whole market or whether different segments should use different models.
2. We then want to backtest 1-day VaR at both the 95% and 99% confidence levels over at least 2 years, with a direct implementation of the MSCI RiskMetrics VaR methodology on the universe/segments from step 1.
3. If the baseline model fails the backtest for certain segments, we want to propose an alternative risk model. There are several things to keep in mind when developing an alternative model:
- Controlling exceedances is usually required by regulators, so it is vital for the success of a risk model.
- Goodness-of-fit is also important, since an accurate forecast of risk is the starting point of effective risk management.
- Transparency and efficiency are important for the success of a risk model. A model that is too complicated to understand or too computationally demanding is not suitable for practical use.
4 Possible choices for alternative risk models
There are two key components in risk modeling: (1) volatility estimation and (2) the return distribution assumption.
We can explore either perspective, or both. Examining the data may also give clues as to whether the volatility estimate or the return distribution assumption plays the more important role.
For volatility estimation, many methods have been proposed besides the EWMA method used in the baseline model. However, we should keep in mind that a risk model typically deals with large portfolios, and we usually cannot afford to calibrate one model for each asset. We should explore whether there is a set of model parameters that works reasonably well for each of the chosen segments.
For the return distribution assumption, a natural enhancement would be to explore distributions with fat tails, since we are mostly interested in tail events, especially at the 99% confidence level. However, in pursuing this direction one should be particularly careful to make sure the expected shortfall[c] remains finite, since it is also a risk measure that we usually look at.
(Remark: [c] https://www.msci.com/documents/1296102/1636401/risk1214msci.pdf/b2e0992f-bdbf-432f-97ef-a679630e8e8f.)
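As an illustration of the fat-tail direction, here is a minimal sketch that fits a Student-t location-scale distribution to daily returns and reads off the 99% VaR and expected shortfall of the fit (the expected shortfall is finite whenever the fitted degrees of freedom exceed 1); the returns are synthetic placeholders standing in for a segment's history.

```python
# A minimal fat-tail sketch: Student-t fit, then 99% VaR and ES of the fitted
# loss distribution evaluated by simulation; returns are synthetic placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
returns = rng.standard_t(df=4, size=1000) * 0.012        # placeholder daily returns

nu, loc, scale = stats.t.fit(returns)                     # fitted degrees of freedom, location, scale

alpha = 0.99
losses = -stats.t.rvs(nu, loc, scale, size=1_000_000, random_state=2)
var_99 = np.quantile(losses, alpha)
es_99 = losses[losses > var_99].mean()
print(f"fitted dof: {nu:.2f}, 99% VaR: {var_99:.4f}, 99% ES: {es_99:.4f}")
```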