An introduction to algorithmic trading pdf

Sunday, November 25, 2018 admin

University of Singapore, Nanyang Technological University, Fudan University, etc. Quantitative Trader/Analyst, BNPP, UBS. PhD, Computer Sci, University . Introduction To Algo Trading. What is a Trading System? How do Trading Systems operate? What you call a Trading System is actually a CEP System. Interest in algorithmic trading is growing massively: it's cheaper, faster and better to control than standard trading, and it enables you to pre-think the market, executing.

Language: English, Spanish, Japanese
Country: Uzbekistan
Genre: Personal Growth
Pages: 405
Published (Last): 25.03.2016
ISBN: 858-6-38575-124-5
ePub File Size: 22.34 MB
PDF File Size: 19.46 MB
Distribution: Free* [*Registration Required]
Downloads: 23728
Uploaded by: ADELIA

An Introduction to Algorithmic Trading: Basic to Advanced Strategies (Wiley .. Part I: Introduction to Trading Algorithms. October QuantConnect: An Introduction to Algorithmic Trading. Outline: What is QuantConnect? What is Algorithmic Trading? How can it help . Interest in algorithmic trading is growing massively: it's cheaper, faster and better to control than standard trading, and it enables you to 'pre-think'.

Here is what you do to place it on the right hand axis. Price is one of the defining characteristics of a stock. It is rare to read so lucid a dissertation where most others are still stumbling around in the dark. We fundamentally make no assumptions that the data belong to any particular distribution. The full Watchlist of stocks, most of which we have traded, together with some basic outline information and company descriptions, you will find in Appendix B and Appendix C. This decreases the chances of Excel or the operating system misbehaving.

The cost alone (estimated at 6 cents per share manual, 1 cent per share algorithmic) is a sufficient driver to power the growth of the industry. Algorithmic trading is becoming the industry lifeblood. But it is a secretive industry, with few willing to share the secrets of their success.

Previously he was CEO of an electronics company, supplying point of sale electronics to major retailers such as Sears and Sunoco in Canada and Allied Breweries in the UK, where he gained considerable electronics experience and was the first to automate an assembly line using electronics in the UK.

His main academic background is in mathematics and physics and he has a great interest in the theories of Universality and Complexity as applied to the markets.

He is currently developing a fully automated algorithmic trading system with his co-author Jane Cralle.

It is a bit like looking at a specimen under a high powered microscope. These small swings become invisible as soon as you use any data aggregation, even at one second aggregation. This phenomenon gives us a substantial playing field.

Our real-time recalculation and response time is short enough for us to be able to use market orders most of the time. It is a fair bet that most stock prices are unlikely to move very far in less than milliseconds, our average recalculation time. This imposing figure however includes quite a large number of issues which trade very thinly or even not at all on any particular day.

We trade a very small selection of stocks which meet certain strict criteria on our various metrics including, among many others, the very obvious ones: volatility, volume, and price levels. The set of our proprietary algorithms described in this book was selected for ease of use by new traders with only limited experience. The algos, templates and supporting material are provided on the accompanying CD. We have found, anecdotally, that certain types of algo appear to provide better results with certain types of stocks and certain stock sectors.

Up to this point we have not been able to find any predictors that reliably forecast these patterns. The prevailing market regime, sentiment swings and regulatory changes influence the behavior of the stock-algorithm couple, as do various other stylistic properties of the markets, possibly swamping the specificity effects on which we rely to provide us with the trading information.

At present, we have to be satisfied with relying on a hybrid approach: a melange of chaos theory, complexity theory, physics, mathematics, behavioral Prospect theory and empirical observation, with a great amount of trial and error experimentation, experience and creative insight thrown into the mix. This has so far been the only way for us to achieve reasonable results and to get some rewards for our efforts. We often use the weather analogy to explain our core philosophy: Can you tell us exactly what the weather will be three days from now? Can you tell us exactly what the weather is going to be in the next three minutes?

This naturally leads us to the fairly obvious conclusion that the prediction horizon gets fuzzy rapidly. So going to the limits of shortest time of trade and highest data resolution appears to be a reasonable strategy. Each trade is one line in our Excel template and reports date, time of trade, volume (how many shares traded) and the price at which they changed hands.

We can trade as many instances of Excel simultaneously as the limitations of our hardware, operating system and connectivity allow. Our maximum was reached on a fairly slow trading session where we were running 40 stocks on an 18 core server. For the starting algo trader we recommend starting with one or two stocks and a reasonable machine (see our chapter on Hardware) and then, when that feels OK, adding more stocks and switching to more powerful hardware as you gain confidence.

This is fast enough to put on a trade with a very high probability that the trade price will not have moved appreciably from the trigger point. We should mention that a very quiet market can suddenly erupt like a winter squall on a frozen lake, sweeping all in front of it. This view is slowly changing and is supported by the views of various highly recognized academics, for example Dr Benoit Mandelbrot, who has provided much new financial analysis.


Our core strategy: Many small trades of short duration and quite small returns on each trade. Our software uses various selected algorithms to find these small and short lived market inefficiencies where we can achieve a profitable trade. The sell algos are much harder to get anywhere near optimum. If one thinks about it for a while the puzzle dissolves: When buying you have a huge selection to buy from. When selling you must sell just what you bought. The opportunity size difference is many orders of magnitude.

In all cases we protect the trade with a proprietary adaptive stop. Only very recently mid have we finally been satisfied with the stop loss algo and have added it to all trades. At this high (tick) resolution the price swings are very frequent, which provides us with extremely short holding times: trades of very low duration.

This frees up the capital which had been committed to the trade to be deployed on another trade. It can also be used as a volatility metric between two designated tick points. Here is our explanation: In our experience, trading churn of more than 4 times is hard to achieve and needs substantial starting capital, as you have to trade quite a number of stocks in parallel in order to achieve the multiplier effect.

Our entire methodology has been specifically designed for the individual trader with limited capital. It may be of no great interest to the major brokerages as it does not scale well to trade massive blocks of shares (we need different algos for this purpose). Using small volume trades coupled with market orders to guarantee instant execution, our method seems to hold quite a number of advantages for the individual trader.

No tier 1 player is likely to be interested in competing in this space. Recently the high frequency of trading has been validated by a number of workers in this area of finance. This type of high frequency trading should not be confused with the current fad of ultra high frequency trading, which makes use of colocation of servers at the Exchanges to enable placing extreme order quantities and also to cancel them at an equivalent rate.

The ultra low latency approach facilitates machine placement of order rates of over per second with peaks reported of per second. The high resolution methods which we are working with are accessible to the individual trader in contrast to the huge capital investment required for the ultrahigh frequency methods and will produce a considerable increase in the number of people who will embrace personal trading as a hedge against the market cycles.

The results sound so simple to achieve, and to a certain extent they are, but it nevertheless took us 12 years of trading and research, going in and out of a multitude of blind alleys. We spent three years researching Artificial Neural Networks (ANNs) only to find that the technology could not cope with real time.

To finally reach the level of expertise in the basic methodology described in this book took a fair amount of isolation from friends and family. Trading is enormously demanding, addictive and intellectually extremely stimulating, but also very time consuming.

It is only too easy to lose all track of time when analyzing the markets. It is said that in order of complexity the markets rank a good 4th after the Cosmos, the Human Brain, and the Human Immune System, although the latter three are a little bit older and have had a bit more time to evolve to their present state.

Minimum holding times and many trades. Very few practitioners show the freshness of intellect and insight of Irene Aldridge whose article in FinAlternatives we have the pleasure of quoting in full. It is rare to read so lucid a dissertation where most others are still stumbling around in the dark. With many thanks to Irene.

High-frequency trading has been taking Wall Street by storm. The discourse on the profitability of high-frequency trading strategies always runs into the question of availability of performance data on returns realized at different frequencies. Hard data on performance of high-frequency strategies is indeed hard to find.

Hedge funds successfully running high-frequency strategies tend to shun the public limelight. Others produce data from questionable sources. Yet, performance at different frequencies can be compared using publicly available data by estimating the maximum potential profitability.

Profitability of trading strategies is often measured by Sharpe ratios, a risk-adjusted return metric first proposed by a Nobel Prize winner, William Sharpe. A Sharpe ratio measures return per unit of risk; a Sharpe ratio of 2 means that the average annualized return on the strategy is twice the annualized standard deviation of strategy returns. The Sharpe ratio further implies the distribution of returns. Note that high-frequency strategies normally do not carry overnight positions and, therefore, do not incur the overnight carry cost often proxied by the risk-free rate in Sharpe ratios of longer-term investments.
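As a rough illustration of the Sharpe ratio described above, here is a minimal Python sketch. The function name, the `periods_per_year` annualization factor and the default risk-free rate of zero are our assumptions for illustration, not something taken from the article.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year, risk_free_rate=0.0):
    """Annualized Sharpe ratio from a list of per-period returns.

    returns          -- per-period returns as fractions (0.01 means 1%)
    periods_per_year -- assumed annualization factor, e.g. 252 for daily data
    risk_free_rate   -- annual risk-free rate; left at 0.0 for strategies
                        that carry no overnight positions
    """
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in returns]
    mean = statistics.mean(excess)
    sigma = statistics.stdev(excess)  # sample standard deviation
    # Annualizing: mean scales with N, sigma with sqrt(N).
    return (mean / sigma) * math.sqrt(periods_per_year)
```

A result of 2 would mean that the average annualized return is twice the annualized standard deviation of returns.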

The return is calculated as the maximum return attainable during the observation period within each interval at different frequencies. The standard deviation is then calculated as the standard deviation of all price ranges at a given frequency within the sample. In practice, well-designed and implemented strategies trading at the highest frequencies tend to produce double-digit Sharpe ratios.
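One way to read the estimate described above is sketched below. We assume that the maximum attainable return in an interval is the high to low range expressed as a fraction of the low, and that the result is annualized by the square root of the number of intervals per year. Both choices, and all names, are our assumptions.

```python
import math
import statistics


def max_potential_sharpe(highs, lows, intervals_per_year):
    """Upper-bound Sharpe ratio estimate from per-interval highs and lows.

    highs, lows        -- highest and lowest trade price in each interval
    intervals_per_year -- assumed annualization factor for the chosen frequency
    """
    # Maximum attainable return per interval: buy at the low, sell at the high.
    ranges = [(h - l) / l for h, l in zip(highs, lows)]
    mean_range = statistics.mean(ranges)
    sigma_range = statistics.stdev(ranges)  # dispersion of the ranges
    return (mean_range / sigma_range) * math.sqrt(intervals_per_year)
```

This is an optimistic bound only: it assumes every interval is traded perfectly and ignores costs.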

Real-life Sharpe ratios for well-executed daily strategies tend to fall in the 1 to 2 range.

11 Our Nomenclature

Nomenclature is just a fancy way of saying: We have found that all successful enterprises and projects develop their own language; lingo develops and acronyms come into common use.

Personally we have found it indispensable even in a team of two to ensure that what we say to each other is understood exactly how we meant it. This precision is not pedantic but in some cases can mean the difference between success and failure. Besides, it can save a lot of time and potential friction. We argue less.

We will try to keep our nomenclature consistent, simple and hopefully clearly understandable (perhaps not always using the standard definitions) and hope to succeed in this area by including words which we use in our everyday trading and research work.

You would not believe how we have struggled untold hours trying to comprehend what a symbol or an equation really meant in a book or learned paper, and we are determined to spare you such agony. So here is the list of definitions and explanations of the key words, symbols and concepts we will use.

EOD: End of day. Data totals for the whole trading session.

Backtest: Procedure testing the performance of an algo against historical tick data, over a specified lookback period.

May be expressed as a percentage by dividing return by Price at the start. This period is chosen to suit the requirements of what data is being calculated and what you are trying to find out.

Parameter: Any user definable constant, used in an algorithmic calculation.

Tn thus tells us to sum the series T0.


Sub- and superscripts will only come into play when we have to define the summed area as a subset. If we have to go to seconds for equitemporal reasons this is clearly indicated. Otherwise it is always in ticks as the default. On rare occasions we may use the log of the price when the axis will be in log x.

May include various metrics.

Stats Toolbox: Variance, sigma, standard deviation, average, median, mode, median standard deviation (please see Chapter 13 below).

Math Toolkit: Sigma sign, log, ln (natural logarithm), exponents, first and second derivative (please see Chapter 12 below).

12 Math Toolkit

Our math toolkit is intentionally quite sparse as we have seen that smothering the trading problem with complex math did not provide us with commensurate benefits in the way of returns. Most algo construction can be achieved with the simplest of tools.

We think that this is the most you will need to start with. One of the most important concepts for us is Moving Averages.


In this book we will be concentrating only on stock price as the input data, leaving the analysis of the other factors for future volumes. We shall now describe these moving averages in some detail as they constitute the foundation stone and raw material building blocks of most algorithmic strategies from the most basic to the ultra advanced.

This example will demonstrate how moving averages work in general. The jagged blue line represents the trade price. Before we move to the next tab here is a useful shortcut when you are experimenting to find a good short moving average: It will turn black, and so will the little length scroller. You can now scroll up to a maximum of This maximum is a bit limiting but for quick short MAs it is very fast and convenient.

We use it constantly to quickly see if there are any short patterns we may have missed. Moving on to the next tab, LMA, this plot is usually color coded red. Notice that the blue line, which is the trade price tick line, is jagged. The longer the lookback, the smoother the line. The formula for the simple MA over a lookback of n ticks is the sum of the last n trade prices divided by n. Where addition takes two numbers and produces a third number, convolution takes two signals (we are considering our tick series as signals) and produces a third.
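To make the moving average and its link to convolution concrete, here is a small Python sketch. It assumes a plain simple moving average over a lookback of n ticks; the function names are ours and this is not the Excel implementation used in the templates.

```python
def simple_moving_average(prices, n):
    """Simple moving average over a lookback of n ticks.

    Returns one value per tick, starting at the n-th tick.
    """
    if n <= 0 or n > len(prices):
        raise ValueError("lookback n must be between 1 and len(prices)")
    window_sum = sum(prices[:n])
    averages = [window_sum / n]
    for i in range(n, len(prices)):
        # Slide the window: add the newest tick, drop the oldest.
        window_sum += prices[i] - prices[i - n]
        averages.append(window_sum / n)
    return averages


def convolve_valid(signal, kernel):
    """Direct convolution of two signals, keeping only fully overlapping positions."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[k - 1 - j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]
```

Convolving the tick series with a kernel of n equal weights of 1/n gives the same values as `simple_moving_average(prices, n)`, up to floating point rounding.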

Note that the values are much smaller and have to be displayed on a different scale, on the right hand axis. Please see the Excel Mini Seminar chapter on how to do this.

This is, in fact, an early introduction to our Trade trigger signal line in our ALPHA1 algo which you will be seeing later on. This particular trigger line has not been parameterized so is only shown here as an example of how moving averages are constructed and how convolution works.

You can scroll right to see an expanded version of the chart. The field of Digital Signal Processing is most useful in the design of trading algos but beyond the scope of this volume. If you are interested in this subject we found the most approachable text-book to be by Steve W. Smith (please see the Bibliography).

There are quite a number of other moving averages, such as median, weighted, logarithmic, standard deviation and exponential (EMA), as well as a number of others. In fact any variable can be turned into a time series (homogenous or inhomogenous) and charted to see how it unfolds. The EMA is of particular importance to us as we find that much of the time it gives some of the best results. Note that the first EMA value is seeded with the earliest (oldest) trade price.

In the EMA the earliest (oldest) data is never removed, but as you move the series forward its influence diminishes according to an exponential decay function which you have set by selecting the length of the lookback, n. Thus the earliest trade prices have the least influence on the current value of the EMA. In our charts we color the plain vanilla EMA a bright violet (or any color visible on the chart background you have chosen).
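The EMA behavior described above can be sketched in a few lines of Python. The smoothing factor 2 / (n + 1) is a common convention that we assume here; the text does not state which decay constant the templates use.

```python
def exponential_moving_average(prices, n):
    """EMA with lookback n, seeded with the earliest trade price."""
    if not prices:
        return []
    alpha = 2.0 / (n + 1)  # assumed convention for the decay constant
    ema = [prices[0]]      # the first EMA value is the oldest trade price
    for price in prices[1:]:
        # New value: a blend of the latest price and the previous EMA.
        ema.append(alpha * price + (1 - alpha) * ema[-1])
    return ema
```

Because every new value blends in the previous EMA, the oldest prices are never dropped; their weight simply shrinks by a factor of (1 - alpha) at each tick.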

Here are some more odd items of information which will come in handy understanding some of the nuances.

Iterative means repeating. You are probably quite familiar with the next few sections but we have thrown them in for completeness for those who had the good fortune to miss the classes covering these items. One thing to remember is that variables are all the things that vary and constants are all those values which do not. Parameters are constants where you set the values.

An exponent can be fractional. X^2 means X times X. An exponent can take any real value. Most of our math uses the natural log; ordinary logs and natural logs work the same way. If you remember the basic log operations, well and good; if not, here is a quick review. A logarithm is the power to which you must raise your base (10 for ordinary logarithms) to get your target number. Logs have some useful properties.

They change multiplication into addition: log(a × b) = log(a) + log(b). Trade prices are sometimes expressed as the log trade price, especially when the graph axis has to take a large range. There is also the advantage that plotting logs of an exponential series will produce a straight line plot. Many practitioners use the natural log (logs to the base e, approximately 2.718) return as the default.
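A short Python sketch of these log properties follows. The function names and the sample prices are ours and purely illustrative.

```python
import math


def log_return(price_start, price_end):
    """Natural log return between two trade prices."""
    return math.log(price_end / price_start)


def log_series(values):
    """Natural log of each value in a series."""
    return [math.log(v) for v in values]


# Log returns add up across consecutive legs of a move,
# because logs turn multiplication into addition.
two_legs = log_return(100.0, 110.0) + log_return(110.0, 121.0)
one_leg = log_return(100.0, 121.0)

# The logs of an exponential series rise in equal steps (a straight line).
steps = log_series([1.0, 2.0, 4.0, 8.0])
```

Here `two_legs` and `one_leg` agree up to floating point rounding, and consecutive entries of `steps` differ by the same amount, ln 2.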

This is our main point of reference and understanding of how the trading properties develop and how we can understand and assess, extract meaning and make decisions. All curves have an underlying equation from which they are generated. It is useful to get a mini view of how equations translate into charts. The parabola is a beautiful curve that you will be seeing a lot of.

Power Law Equations have the general form y = a·x^k, where a and k are constants. Nonlinear Exponential Equations can have the form y = a·e^(bx). Obviously things quickly get complex when various equations interact and convolve.

What we observe in the price trajectory is this ultra-convolution of all the factors which influence the price of the stock, a cocktail of equations and parameters, continuously in a state of flux. Of course the expression following the equal sign can be complex but it will always be evaluated by what the x value is. The slope of the Tangent line is equal to the derivative of the function at the point where the straight line touches the plot of the function line.

For the Slope calc we have a sort of mnemonic — imagine a right angled triangle with the hypotenuse touching the function line. The scales must be the same, unless you use it strictly as a standard for comparisons.

To recap: The slope is the vertical distance divided by the horizontal distance between any two points on the line. In all cases the delta value must be kept small.
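The slope calculation recapped above can be sketched in Python as follows. The parabola y = x squared is used only as a convenient example curve, and the names are ours.

```python
def slope(x1, y1, x2, y2):
    """Vertical distance divided by horizontal distance between two points."""
    return (y2 - y1) / (x2 - x1)


def parabola(x):
    """Example curve: y = x squared."""
    return x * x


def secant_slope(f, x, delta):
    """Slope between the points at x and x + delta on the curve f."""
    return slope(x, f(x), x + delta, f(x + delta))
```

With a large delta the result is only a coarse average over the interval. As delta is made smaller, the value approaches the slope of the tangent line at x, which for the parabola is 2x.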

The Derivative evaluates how a function changes as its input changes. In calculus these changes are taken to the limit of being infinitesimally small. It can be thought of as how much the value is changing at a given point. The derivative of a function at a chosen input value describes the best linear approximation of the function near that input value.

Thus the derivative at a point on the function line equals the Slope of the Tangent line to the Plotted line at that point. There are many notation conventions: Leibniz, Lagrange, Euler and Newton all have their own way of writing the derivatives. We shall use that of Leibniz. An example of the first derivative is the instantaneous speed, in miles per hour. The instantaneous acceleration is given by the second derivative.

SETS

The theory of sets is a branch of mathematics.


Sets are collections of defined objects. Georg Cantor and Richard Dedekind started the study of sets circa mids. Surprisingly, the elementary facts about sets are easily understood and can be introduced in primary school, so that set membership, the elementary operations of union and intersection, as well as Venn diagrams, are used in the study of commonplace objects.

Sets can have subsets contained in them and equally there are supersets which can contain other sets. We have two hypothetical lists of stocks — List 1 is filtered by price: List two is filtered on daily volume being not less than shares. The concepts of Sets and Clusters are related and we find them both useful in the selection of stocks to trade.
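The two filtered lists can be treated as sets and combined with intersection and union, as in this Python sketch. The ticker symbols, prices, volumes and thresholds are all made up for illustration; the actual filter values are not given in the text.

```python
# Hypothetical watchlist: symbol -> (last price, daily volume).
STOCKS = {
    "AAA": (25.0, 2_000_000),
    "BBB": (8.0, 5_000_000),
    "CCC": (60.0, 300_000),
    "DDD": (45.0, 1_500_000),
}


def filter_by_min_price(stocks, min_price):
    """List 1: symbols whose price is at least min_price."""
    return {sym for sym, (price, _) in stocks.items() if price >= min_price}


def filter_by_min_volume(stocks, min_volume):
    """List 2: symbols whose daily volume is not less than min_volume."""
    return {sym for sym, (_, volume) in stocks.items() if volume >= min_volume}


list_1 = filter_by_min_price(STOCKS, 10.0)
list_2 = filter_by_min_volume(STOCKS, 1_000_000)

both = list_1 & list_2    # intersection: passes both filters
either = list_1 | list_2  # union: passes at least one filter
```

The intersection is a subset of each list, which is the set relationship used when narrowing a watchlist down to tradable candidates.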

Please see the Bibliography for further reading.

13 Statistics Toolbox

This chapter is intended to give you the tools needed to get a better understanding of the characteristics and properties of the data streams you will be looking at. There are bookshelves of statistical books which we found too wide of reach for the analysis we had in mind. What we need are the tools to express our concepts, many of which are visual impressions, in a melange of mathematical terms, as simply and concisely as possible.

Both the entire populations as well as samples taken from them are used. As we already know a sample is a subset of a population.

This depends on the data vendor selection. Tick data tends to be cleaner than aggregated data. In our experience this has never been a problem.

If it does happen it is usually so far out that any of our algos would ignore it. Should it become a problem we can filter the input data but so far we have never had the need to do this. Next we would like to know something about how the data is distributed. Is it tightly clustered close together, or is it strewn all over the place?

Are there places where it is more concentrated or places where it varies more than it does in others? We use subscripts as position markers, e.g. T10 means that this is the 10th tick. We define the return intervals using the subscripts. Thus a lookback of ticks will be written T.

If we have the series we can use any part we are interested in by defining a lookback length and operating on that. Also, it turns back to a function if you remove the apostrophe. The sample average is the sum of the variables divided by the number of data points. The mean is a most useful function and much can be learned by using different lookbacks, comparing them, and looking for sudden changes to hunt for telltale prodrome signs of trend change. This can be quite confusing till you get the hang of it — that the minus sign when between two vertical bars is read as plus sign.

If you wish to know which value appears most often in the series you choose the mode. The median is the value in the middle of a sorted series lookback. The z-score is the distance that a value lies above or below the mean of the data, measured in units of standard deviation, sigma.

More on standard deviations later in this chapter. It could be a genuine characteristic of the particular distribution you are analyzing or it could be some anomaly such as a mistyped trade that got through or a transmission error, or even a blip on the line due to some mechanical or electrical disturbance.

Thus a much higher z-score would indicate that the value is an outlier and should be ignored in the main deliberation on the dataset. To visually compare the relationship between two quantitative variables we can use the Excel scatter plot. We can also use the Excel Side by Side column graph feature to look at how two variables relate to each other visually.
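The summary statistics and the z-score outlier check discussed above can be sketched with Python's standard `statistics` module. The threshold of 3 sigma is an assumption on our part; the text does not fix a cut-off.

```python
import statistics


def z_scores(values):
    """Distance of each value from the mean, in units of sample standard deviation."""
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [(v - mean) / sigma for v in values]


def flag_outliers(values, threshold=3.0):
    """Values whose absolute z-score exceeds the (assumed) threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


lookback = [10.0] * 20 + [20.0]          # hypothetical series with one bad print
center = statistics.median(lookback)     # middle value of the sorted lookback
most_common = statistics.mode(lookback)  # value that appears most often
suspects = flag_outliers(lookback)
```

In this made-up series the single print at 20.0 is the only value flagged, while the median and mode both stay at 10.0.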

To get a visual handle on how the data is distributed we construct a histogram. Note that we are not making any assumptions on how the data is distributed — we are just dispassionately looking to see what we have. Excel is not the best histogram maker but will do for most work.
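Outside Excel, the same kind of histogram can be built by counting how many values fall into each bin. This is a minimal sketch using only the Python standard library; the bin width and sample values are hypothetical.

```python
import math
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin; keys are the lower edge of each bin.

    No distribution is assumed: we only count what is there.
    Values very close to a bin edge can land in the neighboring bin
    because of floating point rounding.
    """
    counts = Counter(math.floor(v / bin_width) for v in values)
    return {b * bin_width: counts[b] for b in sorted(counts)}


bars = histogram([10.0, 10.2, 10.6, 11.0, 11.4], 0.5)
```

Each key in `bars` plays the role of a histogram bar on the chart, and its value is the bar height.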

Please look at the File: Place the tip of your cursor on a histogram bar and up will come a little window with the message: This gives us a rough idea of what it is possible to achieve with this stock under the current market conditions. We fundamentally make no assumptions that the data belong to any particular distribution. We must be aware that the actual price trajectory is a convolution of an unknown number of factors (equations) which in themselves fluctuate, and also fluctuate in their level of contribution to the total convolution resulting in the trade price.

It has been the workhorse of statisticians for well over a century. Karl Friedrich Gauss was one of the giants of mathematics. Both published results in Gauss actually derived the equation for this distribution. Professor Mandelbrot in his study of early cotton prices found that this distribution did not reflect the entire truth of the data and that it had exaggerated legs, fat legs.

He proposed a more refined distribution but this has also only been a way station as research has still to come up with a definitive answer as to how financial asset series are really distributed.

However, the Gaussian distribution works up to a point (just shut your eyes to the simplifying assumptions for the moment) and in addition it has a bevy of analytic tools which have developed around it, which we can use as long as we are aware that the assumptions we are using may turn around and bite us on occasion. Most of the computations in the past required that the data be IID, meaning that each measurement was taken independently of other measurements and that all measurements came from an identical distribution (independent and identically distributed, thus IID).

This is a stringent requirement and financial data is hardly ever that compliant. So whenever you come across the innocent looking IID, watch out!

What we are really after is some idea of the dynamic trajectory of the price and this is more in the realm of digital signal processing. We interpret the sample variance as a measure of how spread out the data is from a center. The standard deviation is the square root of the variance.

It is a good measure of dispersion for most occasions and worth the fact that we are using it in violation of some of the precepts on how it operates. We suspect that one reason for its popularity may be that it is so easy to calculate.

The standard deviation is the usual way we measure the degree of dispersion of the data; Gaussianity is assumed with clenched white knuckles and crossed fingers. The formula for actually calculating this distribution from first principles is a bit complex so we satisfy ourselves with getting the standard deviation using the Excel function STDEV. A gives us the sample standard deviation of the series in Column A Rows 1 to

If the line falls from left to right the relationship is negative and R will be close to -1.
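For readers who want to see what sits behind these numbers, here is a Python sketch of the sample variance, the sample standard deviation (the quantity Excel's STDEV returns, which divides by n - 1) and a correlation coefficient. We assume that R refers to the Pearson correlation coefficient; the function names are ours.

```python
import math


def sample_variance(values):
    """Average squared distance from the mean, using n - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def sample_std(values):
    """Sample standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(values))


def pearson_r(xs, ys):
    """Pearson correlation between two equally long series (assumed meaning of R)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

A scatter plot whose points fall along a line sloping down from left to right gives a value of `pearson_r` close to -1; a line rising from left to right gives a value close to +1.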

This feature negates most of the classic statistics, as the proviso there is that the values be independent. In most cases the autocorrelation lasts only a relatively short time, say a maximum of five minutes, and then vanishes. However, in a high frequency, short holding time trading horizon this, we believe, is of the essence.
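Autocorrelation at a given lag can be estimated with a few lines of Python. This is the usual sample estimator, shown as a sketch; the choice of lag and the sample series are ours.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` steps."""
    n = len(values)
    if lag < 0 or lag >= n:
        raise ValueError("lag must be between 0 and len(values) - 1")
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("series has no variation")
    num = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return num / denom


zigzag = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # made-up series that flips every step
lag_1 = autocorrelation(zigzag, 1)
```

Values near zero suggest that successive observations behave as if independent. Values far from zero, such as the strongly negative `lag_1` of this zigzag series, indicate the kind of short-lived dependence the text refers to.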

It is just the stylistic anomaly of the market which we can use to make profitable trades. Many strategies for creating powerful trading algorithms may require that the incoming data be manipulated in some way.

A low pass filter lets through the low frequencies while attenuating (curtailing) the high frequencies. A high pass filter attenuates the low frequencies, leaving us with the high frequencies; we use it in cases where we need to remove the low frequencies and keep the high frequencies. Standardization is accomplished by subtracting the mean from all observations and dividing by their standard deviation.
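One simple way to realize these three manipulations is sketched below in Python. Using a trailing moving average as the low pass filter, and the price minus that average as the high pass filter, is our assumption; many other filter designs exist.

```python
import statistics


def low_pass(prices, n):
    """Trailing average over up to n ticks: keeps the slow moves, damps the fast ones."""
    smoothed = []
    for i in range(len(prices)):
        window = prices[max(0, i - n + 1): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def high_pass(prices, n):
    """What is left after removing the slow component: the fast wiggles."""
    return [p - s for p, s in zip(prices, low_pass(prices, n))]


def standardize(values):
    """Subtract the mean from every observation and divide by the standard deviation."""
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [(v - mean) / sigma for v in values]
```

By construction the low pass and high pass outputs add back up to the original series, and a standardized series has mean 0 and standard deviation 1.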

14 Data: Symbol, Date, Timestamp, Volume, Price

In our present methodology we have restricted the input data quite severely, taking in only: Symbol, Date, Timestamp, Volume and Price. We often only use Price and Symbol. Our input data is either provided bundled from our brokerage or from a specialist real-time tick resolution data vendor.

Please have a look at the chapters on Data Feed Vendors and Brokerages in due course, as you will have to decide on the suppliers to bring your tick real-time data feed into your spreadsheets. We will leave this area for the relevant chapters later in the book. For each stock on our Watchlist (either from the NASDAQ or NYSE Exchanges) the data feed provides on request real-time trade data, as it comes from the Exchanges (say via your brokerage account), under the headings which the feed handler writes into our Excel templates, one stock per template.

As previously mentioned, each template is running under its own Excel instance. The ticker symbol of the stock is written into Column A Row 1.

As already mentioned, we shall use alphabetic Column designation throughout this book. Into Col B Row 3 goes the Timestamp of the trade formatted as hh mm ss. Some vendors have the capability of providing millisecond resolution. Perhaps as the markets evolve this resolution might become useful, but at present we do not find it necessary.

Into Col C Row 3 goes the Volume of the trade, in integers. This is the number of shares traded on this transaction. Some feeds do not write the double zeros for whole dollar values. This not only gets rid of the calling strings but also provides us with freedom to move the data around. The initial 4 columns are written as an array in Excel which does not allow any flexibility as arrays cannot be edited.
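The five input fields can be pictured as one small record per trade. The Python sketch below is only an illustration of that layout: the comma separated row format, the ISO date and the `hh:mm:ss` separator are our assumptions, since the real format depends on the data vendor and feed handler.

```python
from dataclasses import dataclass
from datetime import date, time


@dataclass(frozen=True)
class Tick:
    symbol: str       # ticker symbol of the stock
    trade_date: date  # date of the trade
    timestamp: time   # time of the trade, to the second
    volume: int       # number of shares traded on this transaction
    price: float      # price at which the shares changed hands


def parse_tick(symbol, row):
    """Parse one hypothetical feed row: 'YYYY-MM-DD,hh:mm:ss,volume,price'."""
    raw_date, raw_time, raw_volume, raw_price = row.split(",")
    return Tick(
        symbol=symbol,
        trade_date=date.fromisoformat(raw_date),
        timestamp=time.fromisoformat(raw_time),
        volume=int(raw_volume),
        # float() copes with feeds that drop the double zeros on whole dollars.
        price=float(raw_price),
    )


tick = parse_tick("XYZ", "2009-06-01,09:30:05,200,25")
```

Keeping each trade as an immutable record mirrors the one row per transaction layout of the templates.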

The above is standard for all our templates. Just a taste of what is to come. We have chosen Excel to implement our algos as it is the universal spreadsheet with the added advantage of a fairly rich function language in optimized microcode and thus provides fast execution.

We will restrict our rambling through the Excel forest strictly (or as close to strictly as we can manage without getting lost) to what we will be using in our algos. We have lived with spreadsheets since their very inception and the industry standard Excel has been the workhorse for nearly all of our research.

We are using Excel version rather than which is the latest version except where clearly stated the reason is that has much expanded grid size with one million Rows, where only gives us 64K Rows and Columns. This can become a problem with high activity stocks as each transaction takes up one row. We use one instance for each stock that is traded.

This provides us with a safe, redundant, well behaved and very fast overall system. Every instance of Excel can have further sheets added to the default 3. We prefer not to go over 12 as that seems to be our limit for keeping oriented. When possible it is good practice to keep the Excel file size below 20Mb, certainly below 30Mb. This decreases the chances of Excel or the operating system misbehaving. We must confess that we often use quite huge Excel files, Mb plus.

This is not recommended. Use multiple backups. You are probably an expert Excel jockey; if so, please skip the following. If you are at all unsure, give it a go. The workspace has been set up and configured to fit the work we shall be doing on the algo templates, and in general it is the same throughout except where an exception is flagged. For ease and speed we will describe actions like the one you have just done in a sort of shorthand. On the menu bar click in sequence: Insert, Worksheet. This shorthand is also useful when designing and note keeping.

To get the sheets in the order of your choice just put your cursor on the tab named Sheet4, click, hold down the mouse button and drag the tab past Sheet3, letting go when you have reached the position you want. This is a useful habit to develop.

Frequent saves will do much to eliminate, or perhaps we should say minimize, the chance of suddenly finding the system crashed and losing your work. There is also a handy safety feature which saves a copy automatically on a timescale which you define. You will find it at Tools, Options, Save. To name the file go again to the Menu Bar: File, Save As, type in your filename, Save. Practice navigating around the workspace area using the up, down, right and left arrows.

Try PgDn and PgUp. Type in a couple of numbers — see, it works fine. Now just hit the Del key to clear the cell. Each cell location is defined by a letter and a number.

The letter refers to the Column and the number refers to the Row. You can configure Excel so that the columns also use numbers, but at the start that is unnecessarily confusing and we shall use letters to identify Columns throughout this book. The Excel workspace continues off your screen to the right. Select a cell, press the right arrow and hold it down; you will notice that the sheet scrolls to the right and stops at Column IV (that is, letter I and letter V).
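
The letter-and-number addressing described above can be sketched in Python. The helper names `column_index` and `split_address` are hypothetical, chosen only for this example; the sketch shows how a column letter code maps to a column number, which is why Column IV is the 256th column.

```python
import re
from typing import Tuple


def column_index(letters: str) -> int:
    """Convert a column letter code to its 1-based position (A=1, Z=26, AA=27)."""
    index = 0
    for ch in letters.upper():
        # Treat the letters as digits of a base-26 number with no zero digit.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def split_address(address: str) -> Tuple[int, int]:
    """Split a cell address such as 'B3' into (column number, row number)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not a cell address: {address!r}")
    letters, row = match.groups()
    return column_index(letters), int(row)


print(column_index("IV"))   # 256
print(split_address("B3"))  # (2, 3)
```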

Cell ranges are defined by the top cell and the bottom cell. Cell ranges of more than one column are referred to in the same way: a range that starts at A1 and ends at a cell in column B runs from A1 all the way down to that cell, so column B down to the same row is also highlighted. You can highlight a whole column by clicking on the Column letter in the header.

Similarly, highlight a whole row by clicking on the Row number. Now for some calculations, which is what spreadsheets are all about. You can activate any cell by placing the cursor (usually a hollow cross) in the cell and left clicking.

This will add the values in cell A1 and cell A2 into cell A3. What you type will also appear in the edit box of the formula bar, where you can edit it as well, just as you can in the active cell. In case you want to change the worksheet name just double click on the sheet tab and type in your new name.

Copying formulas can be tricky. Relative addressing is a crucial feature of Excel: it gives you the capability to copy down a formula for an entire column and still have all the relative addresses correct. A reference can also be held fixed, in either the Column ref or the Row ref or both; mixing is OK but can be pretty confusing, so be careful. Here is a trick: put an apostrophe in front of a formula. This will display your formula, including the apostrophe at the front. You can later remove the apostrophe and it will revert to being a formula.
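
What copying a formula down does to its addresses can be sketched in Python. This is a simplified model, not Excel's own implementation: the `copy_down` helper is hypothetical, it relies on Excel's `$` prefix for holding a Column or Row ref fixed, and it only handles plain cell references (not sheet-qualified references or function names that end in digits).

```python
import re

# Optional "$", column letters, optional "$", row digits: e.g. A1, $A1, A$1, $A$1.
_CELL_REF = re.compile(r"(\$?)([A-Z]{1,3})(\$?)(\d+)")


def copy_down(formula: str, rows: int) -> str:
    """Return the formula as it would read after being copied down by `rows` rows."""

    def shift(match):
        col_anchor, col, row_anchor, row = match.groups()
        # A "$" before the row number holds the row fixed; otherwise it moves.
        new_row = row if row_anchor else str(int(row) + rows)
        return f"{col_anchor}{col}{row_anchor}{new_row}"

    return _CELL_REF.sub(shift, formula)


print(copy_down("=A1+A2", 1))    # =A2+A3
print(copy_down("=A1*$B$1", 2))  # =A3*$B$1
```

Only rows change here because the copy is straight down a column; copying across would shift the unanchored Column refs in the same way.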

Formatting the look of the sheet is standard Office stuff. The format paintbrush which lets you copy formats from one place to another comes in handy as well.

Careful, it can be tricky. Charts in Excel have a special chapter devoted to them a little later on. Excel has a very rich function language.

The added beauty of it is that the functions are optimized in microcode so execution is blazing fast. You will encounter these a lot in moving averages.

Boolean logic is an indispensably useful tool in devising algorithmic strategies. George Boole was hugely gifted with the ability to learn very rapidly. He studied the works of Newton, Laplace, Lagrange and Leibniz and created a logic language which we now call Boolean algebra.

It is the foundation stone of all logic operations and of electronic circuit, software and chip design. Excel supports the Boolean operators AND, OR and NOT (always write them in capitals). With the comparison operators you can test for equal, greater than or less than.

Any number of operators may be chained. ANDs, NOTs and ORs can be chained and nested, but the mixture can be confusing so we advise against it unless absolutely necessary; it is better to get the logic to work some other way.

Parentheses help, with the operators in the innermost pair being done first, followed by the next pair, till all parenthesized operations have been accomplished. You can then take care of any operations which have been left outside the parentheses. Excel abides by all these rules. We find it is easiest to work from the left-most column over to the right, and from the top down. This is also how Excel recalculates so we are being kind to the machine.
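
The effect of parentheses on nested Boolean operators can be sketched in Python, whose `and`, `or` and `not` play the role of Excel's AND, OR and NOT. The function names and the three inputs are hypothetical; the point is only that the same inputs give different answers depending on which pair of parentheses is evaluated first, and that splitting the logic into named steps avoids the confusion.

```python
def nested(a: bool, b: bool, c: bool) -> bool:
    """Excel form AND(a, OR(b, c)): the innermost OR is evaluated first."""
    return a and (b or c)


def regrouped(a: bool, b: bool, c: bool) -> bool:
    """Excel form OR(AND(a, b), c): same operators, different grouping."""
    return (a and b) or c


def stepwise(a: bool, b: bool, c: bool) -> bool:
    """Same result as nested(), split into steps like separate columns."""
    inner = b or c      # first column: the innermost parentheses
    return a and inner  # second column: the outer operation


print(nested(False, True, True))     # False
print(regrouped(False, True, True))  # True
print(stepwise(False, True, True))   # False
```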

We often split up an algo into a number of columns to give us the transparency that we are doing what we think we are doing. This can later be compacted if necessary, though there is not much in the way of execution speed improvement. In creating algo strategies and writing them in function code we will come upon a number of Excel functions (preferably always capitalize them), which we will explain at the point of use. For anything else, a method we have seen used is to query Google with the question about Excel; it is usually successful in providing a useful answer.

16 Excel Charts: How to Read Them and How to Build Them

The knowledge of handling charts in Excel easily transfers to handling charts in your Order Management System and to specialist charting programs. We are using the older Excel version throughout this book and advise you to do the same unless forced to use the newer version for capacity reasons. Excel charts limit plots to 32 datapoints per chart, and safely up to six plots per chart. More than six may cause problems.

As we have already stressed many times, the visual side of our research is key to most of our analyses, so proficiency in the use of charts is an important skill. We are going to go into considerable detail on how to set up, modify and generally use charts in Excel, as this skill makes the creation of algorithms, from the very basic to the very advanced, immeasurably faster and easier to deal with. A good facility for creating and working with charts reduces the inertia one sometimes feels when facing a testing task.

If you want to create a chart, here is what you do. Highlight the data range you want to chart. Go to the main menu and hit the chart icon. This will bring up the Chart Wizard panel. Data Range should show the formula for the range you have selected to chart. This is the address of the range you are charting.

More about setting the axis values later. Now click on Finish and, lo and behold, you have a chart. The sample line on the bottom shows you the effect of your choices. On the right are three sub panels: check only the one required and leave the rest unchecked.

This is important as Excel will proportionately change the type size when you change the size of the chart. If you are charting volumes reduce decimals to 0. Last tab here: Phew, done it. This needs a latte reward. If you place your cursor on any one of the black squares around the chart you can change the size of the chart: use the black squares at the sides to make it wider or narrower, the top or bottom black squares to make it taller or shorter, and the corner squares to change the aspect ratio. If you use a glide pad just touch down and hold down your finger as you move about the pad till you get the chart exactly where you want it, then let go.

Double click anywhere in the plot area and this will pull up the color selection panel for you to select what color you want it to be. Put your cursor in the plot area and right click. This brings up a minipanel with useful stuff: We always prefer the former. To format a gridline put tip of pointer against it and right click. There is a neat trick for adding a new series to the chart: Now right click again to bring up a mini menu showing: Here you see six equation choices, each of which describes a particular type of curve for which we may want to have a fit against the real live data.

The moving average has a selection feature for how long you want the trend line to look back, up to a maximum number of periods (ticks). It gives an immediate idea of how quickly the series we are looking at will smooth out and if there are any extremes showing up. In the POWER equation s and b are constants. We have one more rather important point on charts which needs careful attention: What do you do if you have an additional curve whose values lie outside the left hand Y axis values?

Here is what you do to place it on the right hand axis. Or you can identify the series by looking for it the same way but on the formula bar. You may want to adjust the scales that Excel defaults to. In all cases we can craft a series from the metrics which can take on any lookback period required, from one day to 60 sessions, or more if required.
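
The idea of a series with a selectable lookback, as in the moving average trend line discussed for charts, can be sketched in Python. This is a plain simple moving average under stated assumptions: the `moving_average` name is hypothetical, and positions without a full lookback are left empty rather than averaged over fewer points.

```python
from collections import deque
from typing import Iterable, List, Optional


def moving_average(values: Iterable[float], lookback: int) -> List[Optional[float]]:
    """Simple moving average; None until `lookback` values are available."""
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    window = deque(maxlen=lookback)  # holds the most recent `lookback` values
    running_sum = 0.0
    averages: List[Optional[float]] = []
    for value in values:
        if len(window) == lookback:
            running_sum -= window[0]  # the oldest value is about to drop out
        window.append(value)
        running_sum += value
        averages.append(running_sum / lookback if len(window) == lookback else None)
    return averages


print(moving_average([10, 11, 12, 13], 2))  # [None, 10.5, 11.5, 12.5]
```

A longer lookback smooths the series more but reacts later to extremes, which is the trade-off the trend line makes visible.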

We have not found it profitable to extend the lookback past the session point at the present time and believe it more efficient to work on smaller session lookbacks and perhaps a larger variety of analyses. Simple EOD series are useful for a rough and ready global understanding of the data.

Fundamental analysis, where we would look in detail at the company, its products, the management, the balance sheet (to evaluate the financial condition of the company) and the profit and dividend history, is not considered in our analyses. To all intents and purposes our trades happen in too short a time span for any fundamental consideration to have time to exert an effective influence. The objective of this chapter on Algometrics is to give you a range of tools with which to start developing a close understanding of the individual stocks.

We have found that there are major differences in how stocks trade in the various price tiers. Anecdotally we have experienced considerably better results trading the higher priced stocks.

A possible explanation may be that the lower price stocks attract less experienced traders while the higher price stocks are the domain of the Tier 1 players. Price is one of the defining characteristics of a stock. It appears that the trading character of a stock is strongly modulated by its price. So one could assume that there are some differences in the trading characteristics. See the examples on the CD. This can usefully be extended over a lookback EOD of 5 trading sessions and onward to 60 sessions for some exceptional situations, preferably in boxcar fashion.

Here we string say T sequential lookbacks over a global lookback period of ticks, which produces 50 data points which can then be plotted. All may be plotted in Excel. Use intraday series of T, T and T to see the series on each session. Taken over long EOD lookbacks (20 plus) these may on occasion prove useful. These should be plotted as line plots and also as histograms.
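
Stringing sequential lookbacks in boxcar fashion can be sketched in Python as non-overlapping blocks of ticks, each reduced to one data point. The function name, the use of the block mean as the summary value, and the sizes in the usage lines (blocks of 100 ticks over 5000 ticks, chosen only so that 50 points result) are assumptions for illustration, not the authors' settings.

```python
from statistics import mean
from typing import List, Sequence


def boxcar_means(ticks: Sequence[float], window: int) -> List[float]:
    """Mean of each consecutive, non-overlapping block of `window` ticks."""
    if window < 1:
        raise ValueError("window must be at least 1")
    # Ignore a trailing block that is shorter than the window.
    usable = len(ticks) - len(ticks) % window
    return [mean(ticks[start:start + window]) for start in range(0, usable, window)]


# Synthetic placeholder prices, not market data.
prices = [float(i % 7) for i in range(5000)]
points = boxcar_means(prices, 100)
print(len(points))  # 50
```

Unlike a moving average, the blocks do not overlap, so each plotted point is built from its own set of ticks.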

The number of zero center line crossings per nT or nt. The amplitude of the reversion characteristic either as an average over nT or as a series to reflect increases or decreases in this value. As usual also EOD. These can be carried out in small segments but we prefer to lookback at EOD. To put it in slightly more general terms: When we head for multiple metrics things can get quite complex when dealing with stocks, so normally we work with one metric at a time. Clustering may also be used for understanding the natural structure of the data.

It may be used as a summarization tool. It may be used to define and provide useful subsets. Our prime conjecture is that stocks of close proximity are likely to exhibit similar trading characteristics. The proximity dimension of the cluster members is considered to provide a good indication of their property similarities.

This conjecture, that sets of similar objects will have similar behaviors, at least to some extent, has no rigorous proof but appears to behave up to expectations reasonably well — in some instances the parallelism of behavior is quite remarkable. So that if we manage to solve the problem of price movement trajectory for one member of the set we may have also come closer to solving it for all similar members of that cluster set.

The data can be partitioned in many ways, with little reason for the preference of one partition solution to another. Thus we will still need further elaboration of our model to choose appropriate metrics on which to prioritize. Clustering techniques can be strongly influenced by both the strategy adopted as well as the biases and preferences of the operator.

Skill and experience make all the difference to obtaining successful results. This is a cluster we trade often. Notice also the distance from this cluster to GOOG at 0. Before we dive any further into the metrics here is a brief discussion of cluster analysis.

For a reasonably comprehensive analysis we suggest this should be comprised of between 20 and 60 sessions of the NASDAQ and NYSE Exchanges using your selected Watchlist ticker symbols and tick data as recent as you can get. Be careful not to underestimate the task: The underlying idea is that different stocks appeal to different traders — different people segments like to own particular stocks without any great logic underpinning the preference.

And, taking the logical skein further it could be conjectured that the stocks in a tight cluster might respond to the same algos and that they are also likely to move together, more or less, or at least exhibit some properties which we can use in the design of our algos.

Thus by analyzing stock properties and their behavior clustered on various metrics we hope to develop more or less homogeneous cohorts of stocks which will respond similarly and where selected algos will have a higher probability of producing excess returns.

This very clustering process also provides thought for new algorithmic strategies which explore and possibly make use of some particular facet or feature which a particular stock exhibits. Once we have chosen a property or metric on which to cluster a Watchlist of ticker symbols we have a choice of a large number of clustering methods.

We usually choose what is termed the Euclidean distance method, which has the nice property of allowing us to use Excel scatter diagrams besides the actual math. The Euclidean distance D is computed between two values of a stock variable metric. Just a simple scatter plot in Excel is usually quite adequate and a lot simpler.
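
For readers who want the math as well as the scatter plot, the standard Euclidean distance (the square root of the sum of squared differences between two points) can be sketched in Python. The ticker symbols AAA, BBB and CCC and their metric values are invented placeholders, and the helper names are hypothetical; the sketch simply ranks pairs of stocks from closest to farthest.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the sum of squared differences between two points."""
    if len(a) != len(b):
        raise ValueError("both points need the same number of metrics")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_distances(metrics: Dict[str, Sequence[float]]) -> List[Tuple[str, str, float]]:
    """All symbol pairs with their distance, closest pair first."""
    pairs = [
        (first, second, euclidean_distance(metrics[first], metrics[second]))
        for first, second in combinations(sorted(metrics), 2)
    ]
    return sorted(pairs, key=lambda pair: pair[2])


# Invented two-metric coordinates, for illustration only.
watchlist = {"AAA": (0.0, 0.0), "BBB": (3.0, 4.0), "CCC": (0.3, 0.4)}
for first, second, distance in pairwise_distances(watchlist):
    print(first, second, round(distance, 2))
```

If the metrics are on very different scales, it is common to rescale them first so that one metric does not dominate the distance; whether to do so here is a choice left to the reader.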

Some concluding driving directions: You want to partition your stocks so that data that are in the same cluster are as similar as possible. Please be aware that a really crisp and clean cluster assignment is not always possible and we have to make do with the best that is available.

Much of the clustering will tend to be related to sector membership. Department of Management, University of St Andrews.

We have not used these but they are highly recommended:

19 Selecting a Cohort of Trading Stocks

This chapter lists our first preferences in stock sectors we have experience with. You will find the full Watchlist of stocks, most of which we have traded, together with some basic outline information and company descriptions, in Appendix B and Appendix C.

We have our modified industry sector list in Appendix A. Our definition of a cohort: Symbols which we have found to have similar trading characteristics when they are defined by our metrics.

Here is a first, cursory narrative routine to get you started. The Price Tier metric is our first port of call. If there is sufficient capital in the account we have found that we have had more success trading the higher priced stocks. We speculate that the higher priced stocks have a totally different ownership profile to that of the lower priced stocks. Trade share lots as a stop gap if there is not enough cash in your trading account to comfortably cover at least two share trades at the same time.

The volume level metric is the second stop on the tour. We are using volume here as a proxy for liquidity. Do not trade any stock which has a three-session average share volume of less than 1 million shares per day or has one of the three sessions at less than shares.
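
The volume rule above can be sketched in Python as a simple screen. The function name is hypothetical, and the single-session floor is left as a required parameter because its value is not given in the text; the 800,000 used in the usage lines is a placeholder for illustration only.

```python
from statistics import mean
from typing import Sequence


def passes_volume_filter(
    session_volumes: Sequence[int],
    min_single_session: int,
    min_average: int = 1_000_000,
) -> bool:
    """True if the last three sessions meet both volume conditions."""
    if len(session_volumes) != 3:
        raise ValueError("expected the share volume of exactly three sessions")
    # Condition 1: the three-session average must reach the minimum average.
    average_ok = mean(session_volumes) >= min_average
    # Condition 2: no single session may fall below the per-session floor.
    floor_ok = min(session_volumes) >= min_single_session
    return average_ok and floor_ok


# 800_000 is a placeholder floor, not a value from the text.
print(passes_volume_filter([1_200_000, 900_000, 1_100_000], min_single_session=800_000))  # True
print(passes_volume_filter([1_500_000, 1_400_000, 200_000], min_single_session=800_000))  # False
```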

An EOD up trend over the last 5 days could mean that this will provide more trading opportunities. Ideally it should be over 0. Another version of this metric measures the price motion, or absolute traverse. We also take the same measure at EOD. Over time one inevitably tends to collect favorite trading stocks.

This is totally individual — no two traders will have matching preferences. These preferences evolve from your trading experience.
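
The price motion, or absolute traverse, mentioned among the metrics above can be sketched in Python. The definition used here, the sum of the absolute tick-to-tick price changes, is an assumption for illustration rather than the authors' stated formula, and the prices are invented.

```python
from typing import Sequence


def absolute_traverse(prices: Sequence[float]) -> float:
    """Total distance the price travelled, regardless of direction (assumed definition)."""
    return sum(abs(later - earlier) for earlier, later in zip(prices, prices[1:]))


def net_move(prices: Sequence[float]) -> float:
    """Last price minus first price, for comparison with the traverse."""
    return prices[-1] - prices[0]


session = [10.0, 10.5, 10.0, 10.25]  # invented prices, not market data
print(absolute_traverse(session))  # 1.25
print(net_move(session))           # 0.25
```

A traverse much larger than the net move indicates a stock that moved back and forth a lot during the period, which is the kind of activity an intraday algo can work with.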