The Seventh Year with TiDB: The Kelly Formula

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 与tidb的第七年,凯利公式

| username: tidb狂热爱好者

The Kelly Criterion was proposed by John L. Kelly, Jr. in 1956 (Kelly 1956). It gives the optimal proportion of capital to bet in each round of a repeated gamble or investment with a positive expected return. The Kelly formula is expressed as: f* = (bp - q) / b = (p(b + 1) - 1) / b, where f* is the proportion of capital to bet, p is the probability of winning, q = 1 - p is the probability of losing, and b is the net odds. In investment, what we should really care about is the long-term cumulative return. Because returns compound multiplicatively, the final result depends only on the number of wins and losses, not on the order in which they occur. The Kelly formula derives the position size that maximizes long-term cumulative returns.
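The claim that only the number of wins and losses matters, not their order, can be checked with a small sketch. The function name `final_capital` and the numbers used (a 10% stake at 2-to-1 odds, two wins and two losses) are illustrative assumptions, not taken from the original post.

```python
from itertools import permutations

def final_capital(outcomes, fraction, odds, start=1.0):
    """Compound a fixed-fraction bet over a sequence of True (win) / False (loss)."""
    capital = start
    for won in outcomes:
        if won:
            capital *= 1 + fraction * odds  # win: gain fraction * odds of capital
        else:
            capital *= 1 - fraction         # loss: lose the staked fraction
    return capital

# Two wins and two losses, in every possible order.
# Rounding absorbs tiny floating-point differences between orderings.
results = {
    round(final_capital(order, fraction=0.1, odds=2.0), 12)
    for order in permutations([True, True, False, False])
}
print(results)  # a single value: every ordering ends at the same capital
```

Each win multiplies capital by 1.2 and each loss by 0.9, so any ordering gives 1.2 × 1.2 × 0.9 × 0.9 = 1.1664 times the starting capital.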

Discovery History

The Kelly Criterion was initially established by AT&T Bell Labs physicist John Larry Kelly based on research by his colleague Claude Elwood Shannon on noise in long-distance telephone lines. Kelly explained how Shannon’s information theory could be applied to a gambler with insider information on horse racing. The gambler wants to determine the optimal bet amount, and his insider information does not need to be perfect (no noise) to give him a useful edge. Kelly’s formula was later applied by another colleague of Shannon, Edward Thorp, to blackjack and the stock market.

Introduction

Besides maximizing long-term growth rates, this equation prevents the possibility of losing all existing funds in any bet, thus eliminating bankruptcy concerns. The equation assumes that money and bets can be infinitely divided, which is not an issue in practical applications as long as the funds are sufficient. The most general statement of the Kelly formula is that by finding the capital proportion f* that maximizes the logarithmic expected value of the outcome, one can achieve the maximum long-term growth rate. For a simple bet with only two outcomes (losing all funds or winning a payout multiplied by specific odds), the following formula can be derived from the general statement:

f* = (bp - q) / b

where

f* is the proportion of existing funds to bet next time;

b is the odds (net odds);

p is the winning probability;

q is the losing probability, i.e., 1 - p.

For example, if a gamble has a 40% winning probability (p = 0.4, q = 0.6), and the gambler gets two-to-one odds when winning (b = 2), then f* = (2 × 0.4 - 0.6) / 2 = 0.1. The gambler should bet 10% of the existing funds each time to maximize the long-term growth rate of the funds.
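As a rough check of this example, the sketch below computes f* from the formula and compares it with a brute-force search for the fraction that maximizes the expected logarithm of capital. The function names `kelly_fraction` and `log_growth` are made up for illustration.

```python
import math

def kelly_fraction(p, b):
    """f* = (b*p - q) / b for a bet that pays b-to-1 and loses the whole stake."""
    q = 1 - p
    return (b * p - q) / b

def log_growth(f, p, b):
    """Expected log growth per bet when staking a fraction f of capital."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)

p, b = 0.4, 2.0
f_star = kelly_fraction(p, b)

# Brute-force check: scan fractions 0.00, 0.01, ..., 0.99 and keep the best one.
grid = [i / 100 for i in range(100)]
best = max(grid, key=lambda f: log_growth(f, p, b))

print(f"formula: {f_star:.2f}, grid search: {best:.2f}")
```

Both approaches land on a 10% stake, which is the sense in which the formula "maximizes the logarithmic expected value of the outcome".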

Note that this well-known formula only applies to table gambling, where losing means losing the entire stake. The more widely applicable Kelly formula is:

f* = (p * rW - q * rL) / (rL * rW)

where f*, p, and q are as above, rW is the net gain per unit staked on a win, and rL is the net loss per unit staked on a loss.

In other words, the first formula is just the special case of the second with rL = 100% and rW = b.
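A minimal sketch of the general formula follows. The function name `kelly_general` and the second set of numbers (a 30% gain or a 20% loss with even chances) are assumptions chosen only for illustration.

```python
def kelly_general(p, r_win, r_loss):
    """f* = (p * rW - q * rL) / (rL * rW), with q = 1 - p."""
    q = 1 - p
    return (p * r_win - q * r_loss) / (r_loss * r_win)

# Special case rL = 100%: reduces to (bp - q) / b with b = rW.
table_bet = kelly_general(p=0.4, r_win=2.0, r_loss=1.0)

# Partial-loss case: win 30% or lose 20% of the stake, each with probability 0.5.
partial_loss = kelly_general(p=0.5, r_win=0.3, r_loss=0.2)

print(f"table bet: {table_bet:.4f}, partial loss: {partial_loss:.4f}")
```

The first call reproduces the 10% stake from the earlier example. When losses are only partial, the suggested fraction can be much larger, and it can exceed 1 (implying leverage) for small rL.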

The Kelly Criterion

Kelly % = W – [(1 – W) / R]

Where:

W = Winning probability

R = Win/loss ratio
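This form is the same formula as f* = (bp - q) / b written differently, assuming W plays the role of p and R plays the role of the net odds b:

```latex
\[
f^{*} = \frac{bp - q}{b} = p - \frac{q}{b} = W - \frac{1 - W}{R},
\qquad \text{with } W = p,\; R = b,\; q = 1 - p .
\]
```

With the earlier numbers (W = 0.4, R = 2) this gives 0.4 - 0.6 / 2 = 0.1, the same 10% as before.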

Investment Application

The Kelly formula can be applied in investment as follows:

  1. The Kelly formula cannot replace stock selection; stock selection should still follow the methods of Buffett and Fisher.
  2. The Kelly formula can be used for timing; even a company with investment value is sometimes overvalued and sometimes undervalued, and the Kelly formula can be used to compare entry points.
  3. The Kelly formula is suitable for finding short-term speculative opportunities in non-core assets.
  4. The Kelly formula is suitable for asset allocation considerations, which is beneficial for fund management and can fully consider opportunity costs.

Blind Spots

The Kelly formula was originally devised to help plan the design of electronic bit streams and was later applied to blackjack. The problem lies in a simple fact: blackjack is not a commodity or a trade. In blackjack, the amount you can lose is limited to the chips you put in, and the profit you can win is also bounded by the chips you bet. In commodity trading, however, the size of gains and losses is unpredictable, so account value and profit and loss can swing widely.

With all that said, can the Kelly formula be used in stock trading? In principle yes: a stock's daily fluctuations resemble the heads and tails of a coin toss. The catch is that you do not sell a stock on the same day you buy it. You might sell it a few days later or even decades later, so the win probability and the payoff of each "bet" are uncertain. Applying the Kelly formula directly is therefore very inaccurate, and actual results can deviate widely from what it suggests.

Kelly Probability Calculation Code

The following script downloads one year of Nasdaq 100 data and estimates a Kelly position from it:

==========================================
import numpy as np
import pandas as pd
import yfinance as yf

# Download historical data for the Nasdaq 100 index.
# auto_adjust=False keeps the 'Adj Close' column; depending on the yfinance
# version, the default may drop it.
data = yf.download('^NDX', period='1y', auto_adjust=False)

# Depending on the yfinance version, data['Adj Close'] may be a DataFrame
# with one column per ticker; reduce it to a single Series.
prices = data['Adj Close']
if isinstance(prices, pd.DataFrame):
    prices = prices.iloc[:, 0]

# Calculate daily returns
daily_returns = prices.pct_change().dropna()

# Calculate annualized standard deviation (252 trading days per year)
annual_std_dev = np.std(daily_returns) * np.sqrt(252)

# Calculate annualized average return
annual_mean_return = np.mean(daily_returns) * 252

# Assume a risk-free rate of 0 to simplify the Kelly formula calculation
# Kelly formula: f* = (bp - q) / b
# Where:
# f* is the proportion of current funds to bet;
# b is the odds (b to 1);
# p is the winning probability;
# q is the losing probability, i.e., 1 - p.

# For the Kelly formula, approximate b and p.
# b is approximated as average annualized return / standard deviation.
# This is a return-to-risk ratio standing in for the odds, not true betting odds.
# The standard deviation reflects the price's volatility: the greater the
# volatility, the higher the risk of profit and loss, and a higher annualized
# return usually comes with a higher standard deviation. The stock market is
# volatile, so both sides have to be weighed together.
b = annual_mean_return / annual_std_dev

# p is the number of days with positive returns divided by the total number
# of trading days, i.e., the probability of making money on a given day.
p = len(daily_returns[daily_returns > 0]) / len(daily_returns)

# Calculate the Kelly formula
kelly_criterion = (b * p - (1 - p)) / b

# Output results
print(f"Nasdaq 100 index odds (b): {b}")
print(f"Nasdaq 100 index winning probability (p): {p}")
print(f"Nasdaq 100 index annual standard deviation: {annual_std_dev}")
print(f"Nasdaq 100 index annual return: {annual_mean_return}")
print(f"Nasdaq 100 index Kelly position ratio: {kelly_criterion}")
==================================================
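As a hedged alternative to using mean / standard deviation for the odds, the win probability W and win/loss ratio R from the earlier "Kelly % = W - (1 - W) / R" form can be estimated directly from a return series. This is not the method used in the script above; the function name `kelly_from_returns` and the sample returns are made up for illustration, and the function assumes the series contains at least one gain and one loss.

```python
import numpy as np

def kelly_from_returns(returns):
    """Estimate W, R and Kelly % = W - (1 - W) / R from a series of returns."""
    returns = np.asarray(returns, dtype=float)
    wins = returns[returns > 0]
    losses = returns[returns < 0]
    win_prob = len(wins) / len(returns)
    # R = average gain on winning days / average loss on losing days
    win_loss_ratio = wins.mean() / abs(losses.mean())
    kelly_pct = win_prob - (1 - win_prob) / win_loss_ratio
    return win_prob, win_loss_ratio, kelly_pct

# Made-up daily returns, for illustration only (not market data).
sample = [0.02, -0.01, 0.01, -0.01, 0.03, 0.02, -0.01, 0.01]
w, r, k = kelly_from_returns(sample)
print(f"W = {w:.3f}, R = {r:.3f}, Kelly % = {k:.3f}")
```

The same function could be fed the `daily_returns` series from the script above, with the caveat that treating each trading day as an independent bet is itself a rough approximation.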

Data Backtesting

After calculating this probability, you can input the data to calculate your returns:


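=====================================
import pandas as pd
import yfinance as yf

# Get historical data for the Nasdaq 100 index (Yahoo Finance symbol '^NDX').
# auto_adjust=False keeps the 'Adj Close' column; depending on the yfinance
# version, the default may drop it.
data = yf.download('^NDX', period='1y', auto_adjust=False)

# Depending on the yfinance version, this may be a DataFrame with one column
# per ticker; reduce it to a single Series.
prices = data['Adj Close']
if isinstance(prices, pd.DataFrame):
    prices = prices.iloc[:, 0]

# Calculate daily returns, dropping the first day's NaN
daily_returns = prices.pct_change().dropna()

# Define the Kelly formula function
def kelly_criterion(b, p):
    q = 1 - p
    return (b * p - q) / b

# Assume win probability and odds
win_prob = 0.56  # Win probability
win_loss_ratio = 2.3  # Odds

# Calculate the investment ratio (the same fraction is used every day)
investment_ratio = kelly_criterion(win_loss_ratio, win_prob)

# Initial capital
initial_capital = 100000
capital = initial_capital

# Simulate daily investment: only the Kelly fraction of capital is exposed
for daily_return in daily_returns:
    capital += capital * investment_ratio * daily_return

# Calculate the total amount after one year
final_amount = capital
print(f'Total amount after one year: {final_amount}')
=====================================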

Conclusion


jiangming@admin
jiangming ~ % python3 m2.py

[*********************100%%**********************] 1 of 1 completed

Total amount after one year: 114810.36208716096

I calculated a one-year return of about 14.8% (100,000 grew to roughly 114,810).
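The compounding step of the backtest can be isolated and tried offline, without downloading anything. The sketch below is an assumption-laden illustration: `simulate_fixed_fraction` is a hypothetical helper, and the three returns are invented rather than real index data.

```python
def simulate_fixed_fraction(returns, fraction, initial_capital=100000.0):
    """Compound capital when a fixed fraction of it is exposed to each return."""
    capital = initial_capital
    for r in returns:
        capital += capital * fraction * r
    return capital

# Invented daily returns: +1%, -2%, +3%.
sample_returns = [0.01, -0.02, 0.03]

fully_invested = simulate_fixed_fraction(sample_returns, fraction=1.0)
half_invested = simulate_fixed_fraction(sample_returns, fraction=0.5)

print(f"fully invested: {fully_invested:.2f}")
print(f"half invested:  {half_invested:.2f}")
```

A smaller fraction scales down both gains and losses, which is why a Kelly position of roughly 37% (the value implied by p = 0.56 and b = 2.3) captures only part of the index's move over the year.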

Shortcomings

The Kelly formula cannot predict the future. It is a calculation based on historical data, which is informative up to a point but cannot be relied on. The Kelly formula only makes sense when the market has a long-term upward trend over 10 or 20 years; the Nasdaq and the Dow Jones, for example, have trended upward over the long term (for more than a century in the Dow's case). In that respect it really is like a casino: if stocks were like lottery tickets whose total payout is officially set at 65% of the amount wagered, you would lose 35% of your stake on average with each bet, and the more you bet, the more you would lose.
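The lottery case can be put into the formula directly. In the sketch below the win probability and odds are hypothetical, chosen only so that the expected payout is 65% of the stake; the function name `kelly_fraction` is likewise illustrative.

```python
def kelly_fraction(p, b):
    """f* = (b*p - q) / b for a bet that pays b-to-1 and loses the whole stake."""
    q = 1 - p
    return (b * p - q) / b

# Hypothetical lottery: 1% chance to win, total payout of 65 per 1 staked,
# so net odds b = 64 and the expected payout is 0.01 * 65 = 65% of the stake.
p = 0.01
b = 64.0

edge = b * p - (1 - p)       # expected net gain per unit staked
f_star = kelly_fraction(p, b)
stake = max(f_star, 0.0)     # a negative f* means "do not bet"

print(f"edge: {edge:.2f}, f*: {f_star:.6f}, suggested stake: {stake}")
```

With a negative edge the formula returns a negative fraction, which is its way of saying that no bet size helps when the expected return is below zero.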

Applying the Kelly formula can help you achieve better annualized returns compared to the Nasdaq 100 index.

You can calculate the annualized returns and risk ratios for each stock, such as Apple or Microsoft, but remember that it cannot predict the future. Many tech companies in the Nasdaq have high returns, but many have also disappeared over time, like Luckin Coffee.

The company’s performance will be reflected in the stock price, and the company’s quality is strongly correlated with the CEO. For example, Apple’s Tim Cook is reliable, and Buffett has about 10%-15% of his funds invested in Apple.

However, Apple has also faced numerous challenges over the years, such as direct competition from Huawei, so there is still a probability of loss. These black swan events cannot be tested with the Kelly formula.

For example, Nvidia’s profits soared last year on demand for AI computing, and its stock price increased tenfold. Moves like this cannot be predicted from historical data.

But one common point is that people like Tim Cook, Elon Musk, and Jensen Huang are among the most outstanding humans on this planet. They are worth betting on.

On the other hand, consider US-listed Alibaba (BABA, which trades on the NYSE). It is heavily influenced by policies and regulations.

This is why Nasdaq is thriving while the A-share market remains at 3000 points. When China is not strong enough, and profitable companies like Huawei are not listed in the A-share market, it is not worth investing in A-shares.

But testing the Nasdaq index is more accurate.

However, consider this question: if a company has a long-term upward trend, should you go all-in at the beginning or use the Kelly formula for daily investments?