IBKR Quant Blog




Quant

Robot Wealth - Back to Basics: Algorithmic Trading - Part 6


The articles in this series are available as follows: Part I, Part II, Part III, Part IV, and Part V.
In the previous post in this series, Kris shared his views on three core skills.

 

To the three core skills I described, I would also add numerical optimization, machine learning and big data analysis. I think they are incredibly important, but they go a little beyond what I would call “minimum requirements”. These skills are nice to have in your toolkit and will make your life as an algorithmic trader easier, but unlike the other skills I described, they are not absolutely critical.

For the adventurous and truly dedicated, I can also recommend learning about behavioral finance, market microstructure and macroeconomics. Again, these are not minimum requirements, but they will provide insights that can augment one’s ability to navigate the markets.

Finance and economics help with generating trading ideas, but you don’t need formal education in these areas. In fact, I know several folks who are responsible for the hiring and firing that goes on in the professional trading space, and some of these people actually shy away from finance and economics graduates. If you hold such a degree, don’t despair though – just recognize that there is more to the practicalities of trading successfully than what you learned in your formal education.

Finally, it would be remiss of me not to mention the soft (that is, non-technical) skills that come in handy. The single most important of these is a critical mindset. You will read mountains of information about the markets through your algorithmic trading journey, and every page should be read with a critical eye. Get into the habit of testing ideas yourself and gathering your own evidence rather than relying on other people’s claims.

Other soft skills that are worth cultivating include perseverance in the face of rejection (you will unfortunately be forced to reject the majority of your trading ideas) and the ability to conduct high-quality, reproducible and objective research.

Important Practical Matters

Finally, I want to cover some of the practical considerations that I think are important to be aware of when starting out.

Expectations

When you are learning algorithmic trading, you will find that it can be an emotional experience. This will pass as your experience and proficiency develop, but during the early years, your emotional state may become somewhat tied to your success or otherwise in the markets. This can obstruct progress, so it is worth understanding and addressing this issue.

In life, our happiness is often tied up with our expectations. Therefore, it makes sense to set ambitious yet realistic expectations for your algorithmic trading journey right at the outset. Your mental state will thank you for it.

First of all, this is a cliché, but one that rings true: trading is the hardest way to make easy money. Don’t expect an easy ride or fast riches. Rather, expect at least a couple of years of unrewarded effort and slow riches, if any riches at all.

Related to this, don’t expect to make multiples of your money in short periods of time. However, once proficient, you should expect to outperform the market over the long term (potentially significantly). Otherwise, what’s the point of doing this at all? If you look through the Barron’s list of top 100 hedge funds from last year, you’ll see that the best performing funds have a 3-year compounded annual return of just under 30%. Do you think it is reasonable that you could out-perform these top-performing funds, with their quant teams and enormous financial resources?

That was kind of a loaded question, because perhaps surprisingly, the answer is yes! Funds with billions under management face completely different constraints than a do-it-yourself trader, the most interesting of these being related to capacity. For example, an individual trading, say, a half-million-dollar futures account can take a completely different approach from a fund that aims to generate returns on billions. There may exist market phenomena that can generate returns that are significant relative to the position sizes of a retail account, but which cannot absorb the trades of a larger fund. Therefore, while on the surface it may appear that a retail trader is at a significant disadvantage, there are also opportunities.

One last comment about expectations: avoid becoming fixated on how much you can make. The amount of reward you can gain is inextricably tangled up with the amount of risk you are willing to take. Thinking about reward in terms of risk rather than in isolation will lead you to much more sensible expectations.

 

Learn more about Robot Wealth here: https://robotwealth.com/ and visit his blog to read more on this topic: https://robotwealth.com/blog/

This article is from Robot Wealth and is being posted with Robot Wealth’s permission. The views expressed in this article are solely those of the author and/or Robot Wealth and IB is not endorsing or recommending any investment or trading discussed in the article. This material is for information only and is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad-based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation by IB to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.






Quant

QuantConnect - Pairs Trading with Python


In case you missed it! The webinar recording is available on the IBKR YouTube channel.

 

Learn how to select correlated pairs to build a long-short hedged pairs trading position with Python in QuantConnect.

Sponsored by QuantConnect

 

https://youtu.be/cZFqYJmQoTM

 

Quant

 

 

Information posted on IBKR Quant that is provided by third-parties and not by Interactive Brokers does NOT constitute a recommendation by Interactive Brokers that you should contract for the services of that third party. Third-party participants who contribute to IBKR Quant are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.






Quant

K-Means For Pair Selection In Python - StatArb Strategy


By Lamarcus Coleman


 

Read the previous six posts in this series: Overview, Heatmaps and ADF Tests, Historic Problem of Pair Selection, Understanding K-Means, Visualization, and matplotlib subplot functionality.

In this post, Lamarcus will show us how to build a StatArb strategy using K-Means.

 

To begin, we need to gather data for a group of stocks. We’ll continue using the S&P 500. There are 505 stocks in the S&P 500. We will collect some data for each of these stocks and use this data as features for K-Means. We will then identify a pair within one of the clusters, test it for cointegration using the ADF test, and then build a Statistical Arbitrage trading strategy using the pair.

Let’s get started!

We’ll begin by reading in some data from an Excel file containing the stocks and features we will use.

#Importing pandas, which provides the ExcelFile reader used below
import pandas as pd

#Importing Our Stock Data From Excel
file=pd.ExcelFile('KMeansStocks.xlsx')

#Parsing the Sheet from Our Excel file
stockData=file.parse('Example')

Now that we have imported our Stock Data from Excel, let’s take a look at it and see what features we will be using to build our K-Means based Statistical Arbitrage Strategy.

#Looking at the head of our Stock Data
stockData.head()


#Looking at the tail of our Stock Data
stockData.tail()


We’re going to use the Dividend Yield, P/E, EPS, Market Cap, and EBITDA as the features for creating clusters across the S&P 500. From looking at the tail of our data, we can see that Yahoo doesn’t have a Dividend Yield and is missing a P/E ratio. This brings up a good teaching moment. In the real world, data is not always clean, so you will need to clean and prepare it before it is fit to analyze and eventually use to build a strategy.

In actuality, the data imported has been preprocessed a bit, as I’ve already dropped some unnecessary columns from it.
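As a minimal sketch of the kind of cleaning described above, the snippet below counts the missing values in each feature and then drops the rows that lack any of them. The small DataFrame is hypothetical stand-in data, not the contents of KMeansStocks.xlsx, and dropping rows is only one option; filling or imputing the gaps is another.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for stockData; tickers and values are illustrative only
toy = pd.DataFrame(
    {
        "Symbol": ["AAA", "BBB", "CCC", "DDD"],
        "Dividend Yield": [2.1, np.nan, 1.4, 0.0],
        "P/E": [15.2, np.nan, 22.8, 9.7],
        "EPS": [3.4, 1.1, np.nan, 5.0],
    }
)

features = ["Dividend Yield", "P/E", "EPS"]

# Count the missing entries in each feature column
missing_per_column = toy[features].isna().sum()
print(missing_per_column)

# Keep only the rows that have every feature present
clean = toy.dropna(subset=features).reset_index(drop=True)
print(clean)
```

Whether to drop or fill depends on the feature: a missing Dividend Yield may simply mean the company pays no dividend, while a missing P/E usually cannot be replaced with zero without distorting the clusters.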

 

In the next post, Lamarcus will demonstrate the Process of Implementing a Machine Learning Algorithm.

------------------------------------------------------------

*Disclaimer: All investments and trading in the stock market involve risk. Any decisions to place trades in the financial markets, including trading in stock or options or other financial instruments is a personal decision that should only be made after thorough research, including a personal risk and financial assessment and the engagement of professional assistance to the extent you believe necessary. The trading strategies or related information mentioned in this article is for informational purposes only.

If you want to learn more about K-Means Clustering for Pair Selection in Python, or to download the code, visit QuantInsti website and the educational offerings at their Executive Programme in Algorithmic Trading (EPAT™).

This article is from QuantInsti and is being posted with QuantInsti’s permission. The views expressed in this article are solely those of the author and/or QuantInsti and IB is not endorsing or recommending any investment or trading discussed in the article. This material is for information only and is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad-based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation by IB to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.






Quant

Back to Basics: Introduction to Algorithmic Trading - Part 5


In the previous post Kris shared his views on the programming skills quants need to build on.

In this post, he continues the discussion on technical skills.

Statistics

It would be extremely difficult to be a successful algorithmic trader without a good working knowledge of statistics. Statistics underpins almost everything we do, from managing risk to measuring performance and making decisions about allocating to particular strategies. Importantly, you will also find that statistics will be the inspiration for many of your ideas for trading algorithms. Here are some specific examples of using statistics in algorithmic trading to illustrate just how vital this skill is:

  • Statistical tests can provide insight into what sort of underlying process describes a market at a particular time. This can then generate ideas for how best to trade that market.
  • Correlation of portfolio components can be used to manage risk (see important notes about this in the Risk Management section below).
  • Regression analysis can help you test ideas relating to the various factors that may influence a market.
  • Statistics can provide insight into whether a particular approach is outperforming due to taking on higher risk, or if it exploits a genuine source of alpha.

Aside from these, the most important application of statistics in algorithmic trading relates to the interpretation of backtest and simulation results. There are some significant pitfalls – like data dredging or “p-hacking” (Head et al. (2015)) – that arise naturally as a result of the strategy development process and which aren’t obvious unless you understand the statistics of hypothesis testing and sequential comparison. Failing to account properly for these biases can be disastrous in a trading context. While this issue is incredibly important, it is far from obvious and it represents the most significant and common barrier to success that I have encountered since I started working with individual traders. Please, spend some time understanding this fundamentally important issue; I can’t emphasize enough how essential it is.
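To see why testing many strategy variants inflates the chance of a spurious “discovery”, consider the following back-of-the-envelope sketch. It assumes that the tests are independent and that each uses a 5% significance level; both are simplifying assumptions made here for illustration, not figures from the article.

```python
def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent tests of a true null
    hypothesis comes out 'significant' at level alpha purely by luck."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n in (1, 5, 20, 100):
    p = prob_at_least_one_false_positive(n)
    print(f"{n:>3} variants tested -> {p:.1%} chance of a false positive")
```

Under these assumptions, trying just 20 variants of a worthless idea already gives roughly a 64% chance that at least one of them looks significant, which is why the best backtest out of many tells you very little on its own.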

It also turns out that the human brain is woefully inadequate when it comes to performing sound statistical reasoning on the fly. Daniel Kahneman’s Thinking, Fast and Slow (2013) summarizes several decades of research into the cognitive biases with which humans are saddled. Kahneman finds that we tend to place far too much confidence in our own skills and judgements, that human reason systematically engages in fallacy and errors in judgment, and that we overwhelmingly tend to attribute too much meaning to chance. A significant implication of Kahneman’s work is that when it comes to drawing conclusions about a complex system with significant amounts of randomness, we are almost guaranteed to make poor decisions without a sound statistical framework. We simply can’t rely on our own interpretation.

As an aside, Kahneman’s Thinking, Fast and Slow is not a book about trading, but it probably assisted me with my trading more than any other book I’ve read. I highly recommend it. Further, it is no coincidence that Kahneman’s work essentially created the field of behavioral economics.

Risk Management

There are numerous risks that need to be managed as part of an algorithmic trading business. For example, there is infrastructure risk (the risk that your server goes down or suffers a power outage, dropped connection or any other interference) and counter-party risk (the risk that the counter-party of a trade can’t make good on a transaction, or the risk that your broker goes bankrupt and takes your trading account with them). While these risks are certainly very real and must be considered, in this section I am more concerned with risk management at the trade and portfolio level. This sort of risk management attempts to quantify the risk of loss and determine the optimal allocation approach for a strategy or portfolio of strategies. This is a complex area and there are several approaches and issues of which the practitioner should be aware.

Two (related) allocation strategies that are worth learning about are Kelly allocation and Mean-Variance Optimization (MVO). These have been used in practice, but they carry some questionable assumptions and practical implementation issues. It is these assumptions that the newcomer to algorithmic trading should concern themselves with.

Probably the best place to learn about Kelly allocation is in Ralph Vince’s The Handbook of Portfolio Mathematics, although there are countless blog posts and online articles about Kelly allocation that will be easier to digest. One of the tricky things about implementing Kelly is that it requires regular rebalancing of the portfolio, which leads to buying into wins and selling into losses – something that is easier said than done.
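For a feel of the numbers, here is a minimal sketch of the textbook Kelly formula for a simple repeated bet with two outcomes. This is the classic special case, not Vince’s full treatment, and the win probability and payoff ratio below are hypothetical.

```python
def kelly_fraction(win_prob: float, payoff_ratio: float) -> float:
    """Kelly fraction for a repeated bet that wins `payoff_ratio` units per
    unit staked with probability `win_prob`, and loses the stake otherwise."""
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must be between 0 and 1")
    if payoff_ratio <= 0.0:
        raise ValueError("payoff_ratio must be positive")
    loss_prob = 1.0 - win_prob
    return win_prob - loss_prob / payoff_ratio


# Hypothetical strategy: wins 55% of the time, average win equals average loss
full_kelly = kelly_fraction(win_prob=0.55, payoff_ratio=1.0)
print(f"Full Kelly: {full_kelly:.2%} of capital per trade")

# A negative result means the bet has no edge and should not be taken
print(f"No-edge example: {kelly_fraction(0.45, 1.0):.2%}")
```

Because the fraction applies to current capital, the position grows after wins and shrinks after losses, which is the rebalancing behavior mentioned above. The inputs are estimates, so overstating the edge overstates the bet size.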

MVO, for which Harry Markowitz won a Nobel Prize, involves forming a portfolio that lies on the so-called “efficient frontier” and hence minimizes the variance (risk) for a given return, or conversely maximizes the return for a given risk. MVO suffers from the classic problem that new algorithmic traders will continually encounter in their journey: the optimal portfolio is formed with the benefit of hindsight, and there is no guarantee that the past optimal portfolio will continue to be optimal into the future. The underlying returns, correlations and covariances of portfolio components are not stationary and constantly change in often unpredictable ways. MVO therefore does have its detractors, and it is definitely worth understanding the positions of these detractors (see for example Michaud (1989), DeMiguel (2007) and Ang (2014)). A more positive exposition of MVO, governed by the momentum phenomenon and applied to long-only equities portfolios, is given in the interesting paper by Keller et al. (2015).
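The sensitivity to inputs is easy to see in the smallest possible case. The sketch below computes the minimum-variance mix of two assets, which is one point on the efficient frontier, using the standard closed-form solution. The volatilities and correlations are hypothetical, and the example ignores expected returns entirely.

```python
import math


def min_variance_weights(vol_a: float, vol_b: float, corr: float):
    """Weights (w_a, w_b) of the two-asset portfolio with the lowest variance."""
    cov = corr * vol_a * vol_b
    denom = vol_a**2 + vol_b**2 - 2.0 * cov
    if denom <= 0.0:
        raise ValueError("assets are indistinguishable; weights are undefined")
    w_a = (vol_b**2 - cov) / denom
    return w_a, 1.0 - w_a


def portfolio_vol(w_a: float, vol_a: float, vol_b: float, corr: float) -> float:
    """Volatility of a two-asset portfolio holding w_a in A and the rest in B."""
    w_b = 1.0 - w_a
    variance = (
        w_a**2 * vol_a**2
        + w_b**2 * vol_b**2
        + 2.0 * w_a * w_b * corr * vol_a * vol_b
    )
    return math.sqrt(variance)


# Hypothetical inputs: asset A has 10% volatility, asset B has 20%
for corr in (0.0, 0.5):
    w_a, w_b = min_variance_weights(0.10, 0.20, corr)
    vol = portfolio_vol(w_a, 0.10, 0.20, corr)
    print(f"corr={corr:.1f}: weights A={w_a:.0%}, B={w_b:.0%}, vol={vol:.2%}")
```

With these inputs, moving the correlation estimate from 0 to 0.5 shifts the “optimal” allocation from an 80/20 split to holding asset A alone, so a portfolio optimized on last year’s estimates can be far from optimal if the estimates drift.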

Another way to estimate the risk associated with a strategy is to use Value-at-Risk (VaR), which provides an analytical estimate of the maximum size of a loss from a trading strategy or a portfolio over a given time horizon and under a given confidence level. For example, a VaR of $100,000 at the 95% confidence level for a time horizon of one week means that there is a 95% chance of losing no more than $100,000 over the following week. Alternatively, this VaR could be interpreted as there being a 5% chance of losing at least $100,000 over the following week.

As with the other risk management tools mentioned here, it is important to understand the assumptions that VaR relies upon. Firstly, VaR does not consider the risk associated with the occurrence of extreme events: it says nothing about how large a loss could be once the threshold is breached. However, it is often precisely these events that we wish to understand. It also relies on point estimates of correlations and volatilities of strategy components, which of course constantly change. Finally, in the analytical form described above, it assumes returns are normally distributed, which is usually not the case.
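A minimal sketch of the analytical calculation makes the normality assumption explicit. The portfolio value, mean return and volatility below are hypothetical, and the function covers only the normal-distribution case.

```python
from statistics import NormalDist


def parametric_var(
    portfolio_value: float,
    mean_return: float,
    volatility: float,
    confidence: float = 0.95,
) -> float:
    """Normal-distribution VaR over one period, reported as a positive loss."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Return at the lower-tail quantile, e.g. the 5th percentile for 95% confidence
    tail_return = NormalDist(mu=mean_return, sigma=volatility).inv_cdf(1.0 - confidence)
    return max(0.0, -tail_return * portfolio_value)


# Hypothetical account: $1,000,000, zero mean weekly return, 3% weekly volatility
var_95 = parametric_var(1_000_000, mean_return=0.0, volatility=0.03)
var_99 = parametric_var(1_000_000, mean_return=0.0, volatility=0.03, confidence=0.99)
print(f"1-week 95% VaR: ${var_95:,.0f}")
print(f"1-week 99% VaR: ${var_99:,.0f}")
```

Everything here hinges on the single volatility estimate and on the bell-curve shape. If real returns have fatter tails than the normal distribution, losses beyond the reported figure will occur more often than the confidence level suggests.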

Finally, I want to mention an empirical approach to measuring the risk associated with a trading strategy: System Parameter Permutation, or SPP (Walton (2014)). This approach attempts to provide an unbiased estimate of strategy performance at any confidence level at any time horizon of interest. By “unbiased” I mean that the estimate is not subject to data mining biases or “p-hacking” mentioned above. I personally think that this approach has great practical value, but it can be computationally expensive to implement and may not be suitable for all trading strategies.
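The core idea can be sketched in a few lines: rather than reporting the single best parameter set, evaluate every combination in the chosen ranges and summarize the whole distribution of results. The code below is a loose illustration under stated assumptions, not Walton’s procedure in full. The price series is a synthetic random walk, the moving-average crossover is a hypothetical strategy, and using the median and a low percentile as summaries is one reading of the approach.

```python
import itertools
import random
import statistics


def make_prices(n=500, seed=42):
    """Synthetic random-walk prices; a stand-in for real market data."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0002, 0.01)))
    return prices


def crossover_return(prices, fast, slow):
    """Total return of a hypothetical long-only moving-average crossover."""
    equity = 1.0
    for t in range(slow - 1, len(prices) - 1):
        fast_ma = sum(prices[t - fast + 1 : t + 1]) / fast
        slow_ma = sum(prices[t - slow + 1 : t + 1]) / slow
        if fast_ma > slow_ma:  # hold the asset over the next bar
            equity *= prices[t + 1] / prices[t]
    return equity - 1.0


prices = make_prices()

# Evaluate every combination in the chosen parameter ranges, not just the best one
fast_values = range(5, 30, 5)      # 5, 10, 15, 20, 25
slow_values = range(40, 120, 20)   # 40, 60, 80, 100
results = [
    crossover_return(prices, fast, slow)
    for fast, slow in itertools.product(fast_values, slow_values)
]

best = max(results)
median_return = statistics.median(results)
# First of the 19 cut points that split the results into 20 groups: the 5th percentile
fifth_percentile = statistics.quantiles(results, n=20)[0]

print(f"Combinations tested:     {len(results)}")
print(f"Best single combination: {best:+.2%}")
print(f"Median across all:       {median_return:+.2%}")
print(f"5th percentile:          {fifth_percentile:+.2%}")
```

The gap between the best combination and the median gives a sense of how much of the headline result could be due to parameter selection alone. The cost is visible too: every extra parameter multiplies the number of backtests to run.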

So now you know about a few different tools to help you manage risk. I won’t recommend one approach over another, but I will recommend learning about each, particularly their advantages, disadvantages and assumptions. You will then be in a good position to choose an approach that fits your goals and that you understand deeply enough to set realistic expectations around. Bear in mind also that there may be many different constraints under which portfolios and strategies need to be managed, particularly in an institutional setting.

One final word on risk management: when measuring any metric related to a trading system, consider that it is not static – rather, it nearly always evolves dynamically with time. Therefore, a point measurement tells only a tiny fraction of the true story. An example of why this is important can be seen in a portfolio of equities whose risk is managed by measuring the correlations and covariance of the different components. Such a portfolio aims to reduce risk through diversification. However, such a portfolio runs into problems when markets tank: under these conditions, previously uncorrelated assets tend to become much more correlated, nullifying the diversification effect precisely when it is needed most!
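The effect is easy to quantify for a two-asset portfolio. In this sketch the volatilities and the two correlation regimes are hypothetical numbers chosen for illustration; only the portfolio variance formula itself is standard.

```python
import math


def equal_weight_vol(vol_a: float, vol_b: float, corr: float) -> float:
    """Volatility of a portfolio holding 50% in each of two assets."""
    variance = 0.25 * vol_a**2 + 0.25 * vol_b**2 + 0.5 * corr * vol_a * vol_b
    return math.sqrt(variance)


# Two assets, each with 20% volatility, in a calm and a stressed regime
calm = equal_weight_vol(0.20, 0.20, corr=0.0)
stressed = equal_weight_vol(0.20, 0.20, corr=0.9)

print(f"Portfolio vol with correlation 0.0: {calm:.1%}")
print(f"Portfolio vol with correlation 0.9: {stressed:.1%}")
```

With uncorrelated assets the portfolio runs at about 14% volatility, well below the 20% of either component. At a correlation of 0.9 it is back to roughly 19.5%, so most of the diversification benefit has gone, and that is before accounting for the volatilities themselves rising in a sell-off.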

Learn more about Robot Wealth here: https://robotwealth.com/

This article is from Robot Wealth and is being posted with Robot Wealth’s permission. The views expressed in this article are solely those of the author and/or Robot Wealth and IB is not endorsing or recommending any investment or trading discussed in the article. This material is for information only and is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad-based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation by IB to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.






Quant

K-Means Clustering For Pair Selection In Python - matplotlib subplot functionality


In the previous post, Lamarcus Coleman explored Python’s matplotlib.

In this article, he will compare the clusters he created from the toy data to the ones that the K-Means algorithm created based on viewing the data.

 

Now that we have our toy data and have visualized the clusters we created, we can compare those clusters to the ones that our K-Means algorithm found in the same data. We’ll code a visualization similar to the one we created earlier. However, instead of a single plot, we will use matplotlib’s subplot function to create two plots, our clusters and the K-Means clusters, that can be viewed side by side for analysis. If you would like to learn more about matplotlib subplot functionality, you can visit here.

#now we can compare our clustered data to that of kmeans
#data (toy features and labels) and model (the fitted K-Means) come from the earlier posts
import matplotlib.pyplot as plt

#creating subplots
plt.figure(figsize=(10,8))
plt.subplot(121)
plt.scatter(data[0][:,0],data[0][:,1],c=data[1],cmap='gist_rainbow')
#in the above line of code, we are simply replotting our clustered data
#based on already knowing the labels(i.e. c=data[1])
plt.title('Our Clustering')
plt.tight_layout()

plt.subplot(122)
plt.scatter(data[0][:,0],data[0][:,1],c=model.labels_,cmap='gist_rainbow')
#notice that the above line of code differs from the first in that
#c=model.labels_ instead of data[1]...this means that we will be plotting
#this second plot based on the clusters that our model predicted
plt.title('K-Means Clustering')
plt.tight_layout()
plt.show()

[Figure: side-by-side scatter plots titled “Our Clustering” (left) and “K-Means Clustering” (right)]

 

The above plots show that the K-Means algorithm was able to identify the clusters within our data. The coloring has no bearing on the clusters and is merely a way to distinguish clusters. In practice, we won’t have the actual clusters that our data belongs to and thus we wouldn’t be able to compare the clusters of K-Means to prior clusters. This walkthrough shows the ability of K-Means to identify the presence of subgroups within data.

 

At this point in our journey toward better understanding the application and usefulness of K-Means, we’ve created our own clusters from data we created, used the K-Means algorithm to identify the clusters within our toy data, and travelled back in time to a Statistical Arbitrage trading world with no K-Means.

We’ve learned that K-Means initially assigns data points to clusters at random and then calculates centroids, or mean values. It then calculates the distances within each cluster, squares these, and sums them to get the sum of squared errors. The goal is to reduce this error, or distance. The algorithm repeats this process until the in-cluster variation can no longer be reduced or, put another way, until the cluster compositions stop changing.
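The loop just described can be written out in plain Python. This is a bare-bones teaching sketch, not the scikit-learn implementation used elsewhere in the series: the function names and the toy points are hypothetical, and the handling of an empty cluster (re-seeding it with a randomly chosen point) is a common workaround added here so the code always runs.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, seed=0, max_iter=100):
    """Return (labels, centroids, sse) after running the basic K-Means loop."""
    rng = random.Random(seed)
    # Step 1: assign every point to a cluster at random
    labels = [rng.randrange(k) for _ in points]
    centroids = []
    for _ in range(max_iter):
        # Step 2: each centroid is the mean of the points currently in its cluster
        centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster is re-seeded with a randomly chosen point
            centroids.append(mean_point(members) if members else rng.choice(points))
        # Step 3: move each point to the cluster with the nearest centroid
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        converged = new_labels == labels
        labels = new_labels
        if converged:  # cluster compositions stopped changing
            break
    # Sum of squared distances from each point to its assigned centroid
    sse = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, sse


# Hypothetical two-feature toy data with two visibly separate groups
toy_points = [(1.0, 1.2), (0.8, 0.9), (1.1, 1.0), (8.0, 8.2), (7.9, 8.1), (8.3, 7.8)]
toy_labels, toy_centroids, toy_sse = kmeans(toy_points, k=2)
print("labels:", toy_labels)
print("centroids:", toy_centroids)
print("sum of squared errors:", round(toy_sse, 4))
```

The cluster numbers themselves are arbitrary, just like the colors in the plots: what matters is which points share a label. Production implementations also use smarter starting points and several restarts, because the result can depend on the random initial assignment.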

Ahead, we will enter a Statistical Arbitrage trading world where K-Means is a viable option for solving the problem of pair selection and use the same to implement a Statistical Arbitrage trading strategy.

 

 

To see the previous posts in this series, click Part 1, Part 2, Part 3, Part 4 and Part 5.

------------------------------------------------------------

*Disclaimer: All investments and trading in the stock market involve risk. Any decisions to place trades in the financial markets, including trading in stock or options or other financial instruments is a personal decision that should only be made after thorough research, including a personal risk and financial assessment and the engagement of professional assistance to the extent you believe necessary. The trading strategies or related information mentioned in this article is for informational purposes only.

If you want to learn more about K-Means Clustering for Pair Selection in Python, or to download the code, visit QuantInsti website and the educational offerings at their Executive Programme in Algorithmic Trading (EPAT™).

This article is from QuantInsti and is being posted with QuantInsti’s permission. The views expressed in this article are solely those of the author and/or QuantInsti and IB is not endorsing or recommending any investment or trading discussed in the article. This material is for information only and is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad-based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation by IB to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.







Disclosures

We appreciate your feedback. If you have any questions or comments about IBKR Quant Blog please contact ibkrquant@ibkr.com.

The material (including articles and commentary) provided on IBKR Quant Blog is offered for informational purposes only. The posted material is NOT a recommendation by Interactive Brokers (IB) that you or your clients should contract for the services of or invest with any of the independent advisors or hedge funds or others who may post on IBKR Quant Blog or invest with any advisors or hedge funds. The advisors, hedge funds and other analysts who may post on IBKR Quant Blog are independent of IB and IB does not make any representations or warranties concerning the past or future performance of these advisors, hedge funds and others or the accuracy of the information they provide. Interactive Brokers does not conduct a "suitability review" to make sure the trading of any advisor or hedge fund or other party is suitable for you.

Securities or other financial instruments mentioned in the material posted are not suitable for all investors. The material posted does not take into account your particular investment objectives, financial situations or needs and is not intended as a recommendation to you of any particular securities, financial instruments or strategies. Before making any investment or trade, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice. Past performance is no guarantee of future results.

Any information provided by third parties has been obtained from sources believed to be reliable and accurate; however, IB does not warrant its accuracy and assumes no responsibility for any errors or omissions.

Any information posted by employees of IB or an affiliated company is based upon information that is believed to be reliable. However, neither IB nor its affiliates warrant its completeness, accuracy or adequacy. IB does not make any representations or warranties concerning the past or future performance of any financial instrument. By posting material on IB Quant Blog, IB is not representing that any particular financial instrument or trading strategy is appropriate for you.