After the last post introduced the Kelly criterion and its application in
deciding how much to invest in a single gamble, we'll investigate whether Kelly
can help us choosing between multiple investment opportunities.
We'll start with a mathematical model:
V1=V0((1+r0)(1−i∑li)+i∑li(1+ri))
Simple, right? :-)
V0,V1 are the available money before and after the first
round, as before.
r0 is the risk-free return rate. This allows us to model opportunity costs
(could put the money into a bank account instead of investing, r0=0.004), or inflation
(non-invested money loses value over time r0=−0.02).
li is the fraction of the available money invested in stock i.
The first term (1+r0)(1−∑ili) represents all the money not
invested in any stock, being invested at the risk-free return rate r0. The
second term ∑ili(1+ri) represents the outcomes of the investments
in individual stocks i.
We will re-formulate a bit to simplify the expression:
We can further simplify by merging the li and ri into the vectors
l and r, with 1 being a vector where all elements
are 1:
V0V1=(1+r0)+l⋅(r−r01)
Let's move from a single game round to n rounds:
V0Vn=((1+r0)+l⋅(r−r01))n
Now we are again in a position where we can follow Kelly's prescription:
invoking the law of large numbers and finding the l which maximizes
the gain Vn/V0.
For large n, we can approximate:
V0Vn≈j∏(1+r0+l⋅(rj−r01))pjn
In this step we switched from the random variable r to its outcomes
rj. Each possible outcome rj occurs with probability
pj. We justify this switch with the law of large numbers: the outcome
rj will be observed proportionally to its probability, pjn
times (for large n). Consequently, we will have pjn factors involving
rj in the overall product, with j iterating over all potential
outcomes of r.
This equation is not analytically solvable, but may be approximated as a
quadratic programming problem as described in a paper by Vasily
Nekrasov.
Conclusion
It should be obvious that the Kelly criterion is applicable in a wide range of
scenarios, from gambling over investment decisions to whether to buy
insurance.
If you're interested in interactive plots that really helped me understand this
material, you can find them in this Jupyter Notebook.
In the next post we'll discuss the origins of the logarithm
in the Kelly formula.
The Kelly criterion is a formula used to determine the optimal size of a
series of bets in order to maximize wealth. It is often described as optimizing
the logarithm of wealth, and will do better than any other strategy in the long
run.
Let's look at a simple gamble: when you play, you win with probability p and
lose with probability q=1−p. If you win, you get back your initial bet plus
a fraction r of that bet. If you lose, your bet is forfeited. The Kelly
formula calculating the optimal fraction of your available wealth to bet is:
lopt=rpr−q=rpr−(1−p)=p−r1−p
Here is a small demo to explore the formula:
Kelly calculation demo
p
0.80
r
0.40
Probability of winning: 80 %
Profit on win, as percentage of placed bet: 40 %
Optimal fraction of wealth bet in this game: 30 %
The main takeaways are:
In each game you invest a fraction l of the total amount of money you
have. If you play the game from the demo (p=0.8,r=0.4) and you have
$100 available, you should invest $30 (since l=0.3). If you win,
you will have $112 (100+30×0.4) and invest $33.60 (112×0.3) in the second round. If you lost the first round you'd have $70,
and invest $21 in the second round. If you can not reinvest your
earnings, Kelly does not apply.
This formula holds when a large number of games are played. If you play only
few rounds of a game this may not apply.
When it applies, the Kelly strategy will generate the most wealth out of your
initial starting capital.
Derivation of the Kelly criterion
We will take a short dive into mathematics to understand how Kelly came up with
this formula. Let's compute the wealth V1 after the first round of gaming,
when starting from an initial wealth V0:
V1=V0(1+WlrW+(1−W)lrL)
Here, l is the fraction of money bet, rW is the profit on win, rL
is the fraction of the bet forfeited on loss (in the above scenario,
rL=−1). W is a random variable that's 1 with probability p and
0 with 1−p. So with probability p we'll have W=1, giving
V1,win=V0(1+lrW), with probability 1−p we'll get V1,loss=V0(1+lrL).
We can reformulate the expression in a way that will show an important generalization:
V1=V0(1+lrW)W(1+lrL)(1−W)
The two expressions look very different, but they are equivalent since W
can only take on the values 0 and 1. If W=1 the second term will
vanish since x0=1, and we will get the same result for V1,win.
The same consideration holds for V1,loss.
If we repeat this game n times, we arrive at
Vn=V0(1+lrW)Wn(1+lrL)(n−Wn)
with Wn being the number of wins in those n games. Essentially, each
won round adds a factor (1+lrW) to the expression, each lost round
(1+lrL).
Now comes a crucial step: how many wins do we expect after playing many games?
Asked mathematically, what value will Wn have for large n? Because of
the law of large numbers we expect W=pn. So for p=0.8 and
n=10000 we would expect to win 8000 out of the 10000 games.
Let's plug this into our formula:
Vn=V0(1+lrW)pn(1+lrL)(1−p)n
The law of large numbers allowed us to eliminate the random variables W,
Wn and instead obtain an explicit formula in terms of p,n,rW,rL.
Since we are interested in maximizing our profit, we are looking for lopt
— the l value that maximizes Vn/V0.
We are free to introduce the logarithm and drop the n factor because this
does not change the value of lopt.1 Differentiating the
logarithmic expression and solving for l gives:
lopt=−rLp−rW1−p
This is identical with the formula given in the introduction after substituting
rL=−1.
Extension to games with more than two possible outcomes
We can extend the Kelly argument to games that have a greater number of
possible outcomes. Such games could be:
A game with 6 different rewards selected by roll of a 6-sided die, where
rewards may be negative (money lost)
An investment in a stock, here we have a continuous range of outcomes.
If we have multiple outcomes ri, each happening with probability pi,
we obtain:
lopt=largmaxi∑pilog(1+lri)
While this equation is not easily solvable analytically, it is trivial to solve
numerically. This allows us to extend the Kelly approach to a new kind of game
with some interesting practical applications.
If you're interested in interactive plots that really helped me understand this
material, you can find them in this Jupyter Notebook.
The Kelly formalism can be generalized further still, taking into account
multiple parallel investment opportunities. We'll look at this in the
next post.
1 The logarithm is a strictly monotonically increasing function, so
α>β⇔logα>logβ. Thus it
doesn't change the location of any maxima: argmaxf(x)=argmaxlogf(x). See Mark Schmidt's Argmax and Max Calculus paper for more information
about properties of argmax.
GDB — the most common console debugger on Linux systems — has
a Python API for adding new debugger commands and
pretty-printers for complex data structures, or to automate debugging
tasks.
While Python scripts can be loaded from files, it is nice to interactively
explore the API or the debugged program. IPython would be
perfect for the job, but starting it directly inside GDB doesn't work well.
Fortunately, it's easy to launch an IPython kernel and connect with an external
Jupyter console.
Currently, only the console client can connect to existing kernels.
Support in the Notebook or in Jupyter Lab is tackled in this Github
issue. Even with the limited capabilities of the console client, it's
a great way to explore the API and to tackle more complicated debugging
problems that require automation to solve.
PyDays 2017 was Austria's first conference dedicated to the Python programming language. It took place on May 5 and May 6 and was graciously hosted by the Linuxwochen Wien at FH Technikum in Vienna. It was great on many levels: meeting new people excited about Python and where it's headed, over 20 talks, interesting hallway conversations, etc.
I helped out with the organization of the conference, mostly taking care of catering. We (the organization team) are very happy with how everything went. The atmosphere was welcoming and open, the quality of the talks and workshops was very good, attendance was great, and people seemed to have a really good time. From an organizational perspective, the talks mostly stayed on-time, the audio/video hardware worked fine, and the PyDays booth was well-received. Giving out snacks, coffee, soda, cake during breaks worked out nicely and was received very well. The feedback, both at and after the conference, was very positive.
In addition to the co-organization of the Conference, I also held a talk about Dask, a Python library for parallel computing. It provides data structures similar to NumPy Arrays or Python Dataframes that process operations in parallel and scale to out-of-memory datasets. Dask is exciting since it provides the analytical power and familiar mental model of Dataframes, but handles hundreds of gigabytes of data and allows seamless scaling from a single laptop to a computing cluster. The talk was well-received, and there was a lot of interest at the end and in the hallway afterwards. The slides are online, if you're interested as well.
Overall it was great to see so much community interest in Python. If you're passionate about Python as well, join us on the Python Austria Slack and our meetups!
Keeping libraries binary-compatible with old versions is hard. Recently, GCC
was in the unenviable situation of having to switch its
std::string implementation.1 GCC used inline
namespaces and ABI tags to minimize the extent of breakage and to ensure
that old and new versions could only be combined in a safe way. Here, we'll have a
look at those mitigation techniques: what they are and how they work.
First, some background: GCC used to have a copy-on-write implementation for
std::string. However, C++11 does not allow this anymore
because of new iterator and reference invalidation rules. So, GCC 5.1
introduced a new implementation of std::string. The new version was not
binary-compatible with the old one: exchanging strings between old (pre-5.1)
and new code would crash. We say the application binary interface (ABI)
changed.
To understand the consequences of this, let's look at several scenarios:
A program uses only code compiled with GCC < 5.1: only old strings, works
A program uses only code compiled with GCC >= 5.1: only new strings, works
A program mixes code compiled with different GCC versions: both types of
strings exist in the program, it could crash.
Let's dig into the last bullet point. When would it crash? Whenever
"new" code accesses an "old" string or vice versa. Here are some examples
where f() is compiled with an older GCC version and called from "new"
code:
voidf(inti);// (a) safe, no std::string involvedvoidf(conststd::string&s);// (b) will crash when f accesses sstd::stringf();// (c) will crash when the returned string is used
Given how common std::string is, such crashes would happen frequently if
"old" and "new" code were combined. Unfortunately, it's surprisingly easy to
end up with a program with some pre-5.1 parts and some newer ones. It's
sufficient to link a "new" executable to an "old" library. Given the bad
consequences, the GCC developers needed to solve this.
The solution is to change the symbol names of the GCC 5.1 std::string and
all functions using it. Because the linker uses symbol names to resolve
function calls into libraries, this would cause link-time errors for cases (b)
and (c) while (a) would still work. Exactly the intended behavior.
How could this be done? The mechanism that converts C++ names into symbol names
is called name mangling. The generated symbol names contain information
about namespaces, function names, argument types, etc. Putting the new
std::string into a separate namespace would change the
symbol names of all functions accepting std::string as argument. But that's
crazy—std::string needs to be in namespace std, right?
The solution for this is inline namespaces. All classes, functions, and
templates declared in an inline namespace are automatically imported into the
parent namespace. Their mangled name still references the original location,
though.
Indeed, the symbol names are different and the linker would give an error if
GCC 5.1 code called f(const std::string &s) from a library compiled
with an older GCC version. This solves the compatibility problem for all
functions taking std::string as argument.
One problem remains: case (c) from above, std::string f(). The return value
type of a function is not part of its mangled name.3 Thus, the symbol
name wouldn't change for functions that return std::string, which could
still lead to runtime crashes.
GCC 5.1 solves this with the use of ABI tags. From the documentation:
The abi_tag attribute can be applied to a function, variable, or class
declaration. It modifies the mangled name of the entity to incorporate the
tag name, in order to distinguish the function or class from an earlier
version with a different ABI
[...]
When a type involving an ABI tag is used as the type of a variable or return
type of a function where that tag is not already present in the signature of
the function, the tag is automatically applied to the variable or function.
GCC applies the attribute __abi_tag__ ("cxx11")) to the std::__cxx11
namespace. This affects all classes therein, including the new version of
std::string. The ABI tag propagates to all functions that
return a string and changes their symbol name.
Let's look at the symbol names for std::string f() for different compiler versions:
Older GCC
GCC 5.1
Symbol name
_Z1fv
_Z1fB5cxx11v
Decoded symbol name2
f()
f[abi:cxx11]()
Again, the symbol name is different, and the "cxx11" ABI tag is applied to
std::string f() with GCC 5.1. This completes the second part of GCC's solution
for the std::string ABI change.
These two methods to induce symbol name changes for anything using
std::string make the migration path to GCC 5.1 much easier.
If the program links correctly it should work, and be safe from difficult-to-debug
runtime errors.
There would be more to discuss about the GCC 5.1 changes, such as how
libstdc++.so still exports the old std::string implementation for
compatibility. But this post has gone on too long already, so let's leave it at
that :-)
1 GCC 5.1 also introduced a new version of std::list. It is handled analogously to std::string. 2 Decoded using c++filt from the binutils package. 3 The return value type doesn't need to be part of the symbol name because it doesn't get used for overload resolution.