Why Deep Reinforcement Learning Can Help Improve Trading Efficiency


What is Deep Reinforcement Learning?

Imagine that you wake up one morning, totally committed to the idea of getting yourself a cat. You head straight to the closest shelter to adopt your new pal. Once you get back home, you start introducing your tiny new roommate to its new home. At first, the cat looks a little confused by the unfamiliar environment. After a few minutes, though, it cautiously starts taking its first steps around and exploring. It tries to figure out and get used to the new surroundings. A sniff here and there and a paw-touch for this and that is how it starts getting familiar with the place: it collects information on how the surrounding objects respond to its actions. By analyzing that feedback, the cat can settle into its new home far quicker. The conclusions it draws from the environment’s reactions to its actions help it make informed choices in the future: which spot has the best characteristics for a good nap, which area of the floor gets the most sunlight, which plants are good to eat and which are the don’t-touch-or-you-will-get-spanked ones. The longer the cat interacts with the surrounding environment, the better decision-maker it becomes* and the more comfortable it starts to feel.

*at least in theory
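In reinforcement-learning terms, the cat is the agent, the apartment is the environment, each sniff or paw-touch is an action, and the feedback it collects (a warm sunny spot, a spanking) is the reward; the "deep" part simply means the agent's decision rules are approximated by a deep neural network. A minimal sketch of that interaction loop is shown below. The `Environment` and `Agent` classes are hypothetical placeholders for illustration, not any particular library's API.

```python
import random

# A minimal sketch of the reinforcement-learning interaction loop.
# `Environment` and `Agent` are hypothetical placeholders, not a real library API.

class Environment:
    def reset(self):
        """Return the initial state (e.g. the cat's starting spot)."""
        return 0

    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""
        next_state = action                     # toy dynamics
        reward = 1.0 if action == 3 else 0.0    # e.g. action 3 = "nap in the sunny spot"
        return next_state, reward, False


class Agent:
    def act(self, state):
        """Pick an action for the current state (pure random exploration here)."""
        return random.randint(0, 3)

    def learn(self, state, action, reward, next_state):
        """Adjust future behaviour based on the feedback received."""
        pass  # a real agent would update a value function or policy here


env, agent = Environment(), Agent()
state = env.reset()
for _ in range(100):                            # repeated interaction with the environment
    action = agent.act(state)
    next_state, reward, done = env.step(action)
    agent.learn(state, action, reward, next_state)
    state = next_state
```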

Deep Reinforcement Learning in Practice

The first major breakthrough in DRL application came from DeepMind and its 2013 work on learning to play seven Atari games directly from screen pixels. The project was later expanded to training DQN agents (bots powered by DeepMind’s deep Q-network algorithm) on 49 different Atari games, with no prior knowledge of the rules. The results, presented in DeepMind’s paper, showed human-level performance on roughly half of the games, with the agent in some cases even outperforming a professional human games tester. At the time of publication, this was a significant achievement compared to existing machine learning methods.
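At the heart of DQN is the idea of approximating the action-value function Q(s, a) with a deep neural network and nudging it toward a bootstrapped Bellman target. The sketch below shows one such update step in PyTorch; it is a simplified illustration of the general technique, not DeepMind's actual implementation, and the layer sizes, batch contents, and hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn

# Simplified DQN-style update step (illustrative only).
n_states, n_actions, gamma = 4, 2, 0.99

q_net = nn.Sequential(nn.Linear(n_states, 32), nn.ReLU(), nn.Linear(32, n_actions))
target_net = nn.Sequential(nn.Linear(n_states, 32), nn.ReLU(), nn.Linear(32, n_actions))
target_net.load_state_dict(q_net.state_dict())           # periodically synced copy
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# One batch of (state, action, reward, next_state) transitions,
# which would normally be sampled from a replay buffer.
states      = torch.randn(8, n_states)
actions     = torch.randint(0, n_actions, (8, 1))
rewards     = torch.randn(8, 1)
next_states = torch.randn(8, n_states)

# Bellman target: r + gamma * max_a' Q_target(s', a')
with torch.no_grad():
    target = rewards + gamma * target_net(next_states).max(dim=1, keepdim=True).values

# Current estimate Q(s, a) and mean-squared TD error
q_sa = q_net(states).gather(1, actions)
loss = nn.functional.mse_loss(q_sa, target)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```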

Source: DeepMind (https://deepmind.com/blog/deep-reinforcement-learning/)
A video that illustrates the improvement in the performance of DQN after 100, 200, 400 and 600 training episodes. Source: DeepMind
Emergence of Locomotion Behaviours in Rich Environments: https://arxiv.org/pdf/1707.02286.pdf

Deep Reinforcement Learning in Trading

For now, most studies conducted within the DRL framework are related to robotics and gameplay. There is not much work focused on the application of deep reinforcement learning in trading. Although adoption is at an early stage, some studies already point to the superiority of DRL algorithms over existing automated trading mechanisms in terms of performance. This means deep reinforcement learning can help handle some of the most complex issues typical of the nature of financial markets:

  • Agents’ actions may result in long-term consequences that other machine-learning mechanisms are unable to measure;
  • Agents’ actions also have short-term effects on current market conditions, which makes the environment highly unpredictable.

Mapping the standard DRL building blocks onto a trading setup makes these challenges easier to see:

1. Agent

In this case, the agent is the trader. He opens his trading account, checks the current market situation and makes a trading decision.

2. Environment

The environment where the trading agent operates is, obviously, the market. However, the market is full of other agents as well — both human and computer-driven. The interaction between all agents within the particular environment is what makes things quite complicated.
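A toy version of such an environment, written in the familiar gym-style reset/step interface, might look like the sketch below. The random price shock is a crude stand-in for the aggregate behaviour of all the other, unobservable agents, and every number in it is an illustrative assumption rather than a model of any real market.

```python
import numpy as np

class ToyMarketEnv:
    """Toy single-asset market; other participants are modelled only as noise."""
    ACTIONS = {0: -1, 1: 0, 2: 1}   # sell / hold / buy one unit

    def __init__(self, n_steps=250, start_price=100.0, seed=None):
        self.n_steps, self.start_price = n_steps, start_price
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.t, self.price, self.position = 0, self.start_price, 0
        return np.array([self.price, self.position], dtype=float)

    def step(self, action):
        trade = self.ACTIONS[action]
        self.position += trade
        # Aggregate effect of all other (unobserved) agents: a random shock,
        # plus a tiny impact of our own trade on the price.
        shock = self.rng.normal(0.0, 0.5)
        impact = 0.01 * trade
        new_price = self.price + shock + impact
        reward = self.position * (new_price - self.price)   # mark-to-market P&L
        self.price = new_price
        self.t += 1
        done = self.t >= self.n_steps
        return np.array([self.price, self.position], dtype=float), reward, done
```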

3. State

The state of the environment (the market) is usually unclear to the agent (unless he has some sort of insider information or a technological advantage over other market participants): he is not aware of the number of other agents, their actions, their positions, their order specifications, and so on.
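In practice, the agent therefore works with whatever partial view it can construct, typically a fixed-size feature vector built from recent, publicly observable data. The snippet below sketches one possible observation; the feature choices (a window of returns, normalized volumes, the current position and cash) are hypothetical examples, not a standard.

```python
import numpy as np

def build_observation(prices, volumes, position, cash, lookback=10):
    """Build a partial-information state vector from recent market data.
    The features here are hypothetical examples for illustration."""
    recent_prices = np.asarray(prices[-(lookback + 1):], dtype=float)
    returns = np.diff(recent_prices) / recent_prices[:-1]           # last `lookback` returns
    recent_volumes = np.asarray(volumes[-lookback:], dtype=float)
    norm_volumes = recent_volumes / (recent_volumes.mean() + 1e-9)  # scale-free volumes
    return np.concatenate([returns, norm_volumes, [position, cash]])

# Example: 20 ticks of synthetic data, flat position, $10,000 cash
prices = list(100 + np.cumsum(np.random.randn(20)))
volumes = list(np.random.randint(1_000, 5_000, size=20))
obs = build_observation(prices, volumes, position=0.0, cash=10_000.0)
print(obs.shape)   # (22,) -> 10 returns + 10 volumes + position + cash
```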

4. Reward

The reward function is another characteristic crucial to the success of a deep reinforcement learning algorithm. If the reward function naively maximizes absolute potential profit, the algorithm will start placing highly risky bets, underestimating the potential losses in the name of reaching its ultimate goal. In reality, traders strive for optimal risk-adjusted returns, most often measured by the Sharpe ratio, and several studies have found Sharpe-based rewards to be among the more effective reward formulations for DRL trading algorithms.
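One simple way to encode that preference is to hand the agent a rolling Sharpe ratio as its per-step reward instead of raw profit and loss. The sketch below does exactly that; the window length and the lack of annualization are arbitrary simplifications, and many papers use a differential Sharpe ratio instead.

```python
import numpy as np

def sharpe_reward(step_returns, window=50, risk_free=0.0, eps=1e-9):
    """Reward = Sharpe ratio of the most recent `window` per-step returns.
    A simple illustration; a differential Sharpe ratio is a common alternative."""
    recent = np.asarray(step_returns[-window:], dtype=float)
    excess = recent - risk_free
    return float(excess.mean() / (excess.std() + eps))

# Example: a steady strategy vs. a high-variance one with the same mean return
steady   = [0.001] * 60
volatile = [0.051, -0.049] * 30            # same average return, much larger swings
print(sharpe_reward(steady))               # very large reward (near-zero volatility)
print(sharpe_reward(volatile))             # near-zero reward despite equal mean profit
```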

Why Deep Reinforcement Learning Can Help Improve Trading Efficiency

The application of deep reinforcement learning to trading still remains largely unexplored. However, the promise the learning mechanism has shown in other fields has attracted researchers’ interest and channeled efforts toward exploring DRL for trading.

Are we there yet?

Although far more flexible and powerful, a deep reinforcement learning agent still requires millions of test scenarios to become proficient at what it does (think of AlphaGo, for example). Also, although these types of agents are closer to autonomy, they still require an operator to define which actions are rewarded as positive or negative, exactly as with your pet: should it decide that your new shoes are the perfect breakfast, the chance of it being rewarded is slim.

