Towers of Hanoi

Towers of Hanoi Technical Aspects


The basic Towers of Hanoi problem is moving multiple discs on three pegs - there are more than enough discussions about this (eg see [1]).

Legend has it that a bunch of monks are moving a physical tower of 64 discs from one of three pegs to another; when they finish, the world will end. However, of more relevance to the current generation is the e-Towers of Hanoi legend. Here, Urban Legend has it that a number of WebMonks are watching (in shifts) the animation you can see on the 'Animate' tab but with 32 discs on 3 pegs going really, really fast. When the animation is complete, the epoch of the internet will end. This is currently predicted to be early in 2038. You have been warned!

The generalised Towers of Hanoi problem concerns moving multiple discs using p ≥ 3 pegs. The particular case where p = 4 is called the Reve's puzzle, but is no more special than our favourite value of p, which is 5. The 'presumed optimal' solution was given in a paper back in 1941 [2] addressing the problem posed in 1939. It addresses broadly equivalent solutions by Frame and Stewart. The solutions depend on the assumption that an optimal solution is obtainable by moving the top nk discs to a new disc (for a suitable value of nk) then optimally moving the remaining discs to the destination peg and moving the nk discs to the destination peg. This missing lemma, ie that the assumption generates an optimal solution, was noted by the journal editor and has not been proved since - hence the 'presumed' bit. Note that the original paper uses 'washers' for what we denote as 'discs'. The problem as stated is to find the number of moves, though the algorithm used to achieve it is probably more interesting to most people.

A hint when checking references/links is to verify whether the document is addressing 'the solution' or commenting on 'the presumed solution'. Because of the looseness of definitions and assumptions, a useful link which ties things down is [3]

The Frame-Stewart solution acknowledges that there are (generally) a number of partitions that are minimal; thus the partition used for moving 'off' the base disc is not necessarily the partition used to move 'on' the destination disc. This is is not explored further, though its consequences can be seen on the 'Animate' page where not all solutions are symmetric.

A. M. Hinz (a prolific contributor to the Towers of Hanoi problem) introduced an interesting input to the 'even more generalised' Towers of Hanoi where the initial and final configurations are arbitrary. In the context of 3 pegs, he stated in [4] that the solution does not always require that the base disc only moves once. However, for the less-generalised (normal) problem, symmetry arguments show that it does (as assumed by the 'presumed optimal' solution). Application of this paper in a physics context is discussed in [5]. For those interested in the maths related to the Towers of Hanoi, an excellent tome discussing all facets of the topic is [6].


This section develops and proposes a proof for a formula for the number of moves required to solve the general Towers of Hanoi problem. The approach is to develop a formula for a lower bound for the number of moves, based on structural arguments, and then show that this is achievable. The proposed proof depends critically on the statement highlighted in blue below. This has been rejected (or rather non-accepted) by the mathematical guardians of the tower. The WebMonks maintain it is true and will refine the proof to convince the non-believers.

The maths is rendered using jqMath, and was developed on Firefox, so if you don't like the appearance, try another browser.

A formula for a lower bound is developed below for moving n discs from one of p pegs to another peg. Disc 1 is the smallest disc, disc n is the largest disc. While the lower bound is valid for each clause where the condition applies, there is an implicit condition that a tighter (higher-valued) clause does not apply, since we are really looking for a greatest lower bound (GLB).

It is important to note that a lower bound formula is being developed – whether a sequence of moves that achieves it is possible is irrelevant.

From the rules of the game, and symmetry, all discs above the base disc must move off the base disc on the source peg, onto one of the k = p-2 intermediate pegs, then after the base disc moves once, move onto the destination peg. We will define the base disc as the height 0 base disc.

So, a lower bound for the total number of moves $T(n, k)$ of n disks with k ≥ 1 intermediate pegs from source to destination is:

$T(n, k)= 1.2^0+(n-1)2^1$
if $n-1≥0$
However, because the discs must be unstacked and stacked in order, only k discs can be moved in two moves; denote these as height 1 base discs. The rest must move:
  • off one of the height 1 base discs,
  • then onto a height 1 base disc that has moved to one of the k intermediate pegs,
  • then off the intermediate peg to allow the intermediate height 1 base discs to move to the destination peg,
  • then onto the destination peg,
(all in the correct order). A disc that has moved from a height 1 base disc cannot move onto a smaller base disc, by the rules. A disc cannot move onto a base disc that has not moved, since this would increase the minimum number of moves (as it must then move off again to allow the base disc to move) and so would not be a lower bound. So:
$T(n, k) =1.2^0+k.2^1+(n-(1+k))2^2$
if $n-(1+k)≥0$      (E0)
We will introduce the concept of the maximum number of supported discs at a height h (for a given number, f, of intermediate free discs), $S_f(h)$. This is the maximum number of discs that can move off and on the supported discs at height h-1, without any intervening moves. $S_f(0) = 1$, by definition. Reformulate E0 as follows:
$ S_k(0) = 1$
which is the base disc of the tower
$S_k(1) = k$
$ T(n, k) =S_k(0)2^0+S_k(1)2^1+(n-(S_k(0)+S_k(1)))2^2$
if $n-(S_k(0)+S_k(1))≥0$

Examine the constraints further. Discs moving off the topmost height 1 disc have k intermediate pegs available to them. Once the topmost height 1 disc has moved, the rest (≥ height 1) only have k-1 pegs available, since they cannot use the peg occupied by the smaller height 1 disc.

Note that for the h>1 discs in a ‘moving off’ phase, the original destination peg can be treated as an intermediate peg and an original intermediate peg is the destination. This is valid since the h>1 discs will have cleared the original destination peg (on their second move) before the h = 0 base disc needs it, so will not increase the minimum number of moves. Similarly, for a ‘moving on’ phase the original source peg can be treated as an intermediate peg and the height 1 destination as the destination for the h>1 discs.

Similarly for the discs moving off subsequent height 1 discs; these will have decreasing intermediate pegs available, until for the last one only 1 intermediate peg is available. So we will refine the $S$ function with a version which reflects the number of free intermediate pegs available to it.

$S_k(h, f)=$number of discs supported at height h with f intermediate pegs from a total of k
$$S_k(h)= ∑↙{f=1}↖kS_k(h, f)$$
Recasting the discussion above in this formulation gives:
$S_k(1, f) = 1, f=1..k$
For completeness, we can define:
$S_k(0,f)=0, f=1..k-1$ and $S_k(0,k)=S_k(0)=1$

ie there is only one base disc.

Although the rules allow a disc on a height 1 disc with f free pegs to move onto a larger height 1 disc (which has moved) this would not be consistent with a lower bound, since the disc moving onto a larger height 1 disc would need to wait for discs on the larger height 1 disc to move off the larger height 1 disc. This would reduce the available intermediate pegs for the discs on the larger height 1 disc by one, and so increase the number of moves needed.

So, to maximise the number of height>1 discs that can move, and so maintain a lower bound, we may assume that if they move off a particular height 1 disc, they move on the same height 1 disc.

Assume that there are $S_k(h, f)$ discs supported at height h with $f=1..k$ free pegs available. Then by the argument for h = 1, we see that each $S_k(h, f)$ disc supports one disc at $(h+1, f’)$ for $f’=1..f$, where each h+1 disc contributes two disc movements, one off a height h disc and then one back onto the corresponding height h disc. This is illustrated in Figure 1.

Support Function Relationship

Figure 1 Support Function relationships


$$ S_k(h+1, f)=∑↙{f'=f}↖k S_k(h, f')$$
$$ SS_k(h)=∑↙{h'=0}↖h S_k(h')$$
Then, using this, we can produce the corresponding lower bound clause for h+1 from the clause for height h as follows. Given:
$$ T(n,k)=∑↙{h'=0}↖{h} S_k(h')2^{h'} +(n-SS_k(h))2^{h+1}$$
if $(n-SS_k(h))≥0$
$$ T(n,k)=∑↙{h'=0}↖{h+1} S_k(h')2^{h'} +(n-SS_k(h))2^{(h+1)+1}$$
if $(n-SS_k(h+1)) ≥0$
To arrive at an explicit formula for $Sk(h, f)$, we will apply a sprinkling of Pascal’s triangle magic, in particular that:
$$ ∑↙{m'=0}↖m \binom {n+m'} n=\binom {n+m+1} {n+1} $$
$$ ∑↙{m'=0}↖m \binom {n+m'} {m'}=\binom {n+m+1} m $$
to establish by induction on h that:
$S_k(h,f)= \binom {k-f+h-1}{h-1}$
for h>0, k>0
Inductive basis is for h=1:
$S_k(1,f)= \binom {k-f+1-1}{1-1}$
as required
Assume the hypothesis:
$S_k(h,f)= \binom {k-f+h-1}{h-1} $
$$S_k(h+1,f)= ∑↙{f'=f}↖k \binom {k-f'+h-1}{h-1} $$
- E1 and E2
$$= ∑↙{f''=f}↖{k-f} \binom {f''+h-1}{h-1} $$
- substitute f’’=k-f’ and invert limits
$= \binom {k-f+(h+1)-1}{(h+1)-1} $
- from L1, which is of the required form
Now derive the formula for the total at a given h: If $h ≥ 1$,
$$ S_k(h)= ∑↙{f=1}↖k \binom {k-f+h-1}{h-1} $$
$$ = ∑↙{f'=0}↖{k-1} \binom {f'+h-1}{h-1} $$
- substitute f’=k-f and invert limits
$ = \binom {k-1+h}{h} $
- L1
The formula, but not the derivation, is valid for h=0 where:
$S_k(0) = \binom {k-1+0}{0} = 1 $
The total number of moves up to height h, h≥0:
$$SS_k(h) = ∑↙{h'=0}↖h S_k(h') $$
$$ = ∑↙{h'=0}↖h \binom {k-1+h'}{h'} $$
$ = \binom {k+h}{h}$
- L2
Denote by $hf_k$ the maximum height which is ‘full’, ie
$ hf_k= $max h where $n-SS_k(h) ≥0 $
Finally, we can obtain a tight lower bound on the total number of moves of n discs on k intermediate pegs as:
$$ T(n, k)= ∑↙{h'=0}↖{hf_k}{ S_k(h') 2^{h’}} + (n-SS_k(hf_k) )2^{hf_k+1} $$
$$ = ∑↙{h'=0}↖{hf_k}{\binom {k-1+h’} {h’} 2^{h’}} +(n- \binom {k+hf_k} {hf_k})2^{hf_k+1} $$

This formulation results in the last term being ≥ 0. An equivalent formula, as presented in [2] results in the last term being > 0 and decreases $hf_k$ by 1 if the last term is zero.

Having established an explicit expression for a lower bound on the number of moves, we need to show it is a GLB by providing an algorithm which achieves it. Unsurprisingly, FS solution shows it is achievable; hence is a GLB; hence T(n,k) is the minimum number of moves.


GLB canonical algorithm

However, we will develop an algorithm (broadly in line with Frame’s [2]) more in keeping with the derivation and amenable to relaxing into a general solution.

  • Draw a tree with root node (h = 0) of degree k.

    Each node has the properties:

    n number of discs
    base base disc number
    f max degree (number of free pegs)
    child node[1..f] , where child[i] might be null
    via Intermediate peg
  • Add to each node at height h of degree f, child nodes at height h+1 of degree 1..f (left to right).
  • Repeat until height $hf_k$ nodes are added.
  • Add child $hf_k+1$ nodes, l to r of the appropriate degree, until total n nodes are present. The remaining child nodes of height $hf_k+1$ will be regarded as ‘null’ nodes. Non-null height $hf_k+1$ nodes are leaf nodes, with no children.
  • Carry out a post-order traversal of the tree, starting at the child node with the largest degree, labelling the base property of each node visited with the next available disc number, starting with the smallest (1). An example is shown in Figure 2 for k=3 and n=16, with null nodes shown by dashed connections.
    Support Function Relationship

    Figure 2 Example Canonical Numbering

  • Apply the recursive procedure Move (node, s, d, ip) to the root node (labelled with base disc n) from source peg s=1 to destination peg d=2, with intermediate pegs ip={3..k+2}.
    Move (node, s, d, ip)
        foreach  non-null node cn in node.child from degree node.f to 1
            d’= smallest peg in ip
            ip= ip-{d’}
            Move (cn, s, d’, ip+{d})
        Move node.base from s to d
        foreach  non-null node cn in node.child from degree 1 to node.f
            s’= cn.via
            Move (cn, s’, d, ip+{s})

It is clear (we hope) that the algorithm agrees with the rules of the game and achieves the minimal value based on the argument presented in seeking a GLB. This is not proved formally because we are publishing this on our website, so you can do the donkey work! In particular, it works for the case where the number of child nodes is less than the degree of the parent node – this just means that ip is not empty before the node.base disc is moved. For leaf nodes, the recursion terminates by just moving node.base from source to destination as there are no child nodes.

An animation is given of the canonical algorithm being applied to the example solution here Animation of Canonical Partitioning.

Interestingly, [2] equation 7 would split the rightmost partition incorrectly, by allocating disc 1 to the middle partition and then allocate the ‘left-overs’ (in this case 0 discs) to the right-most partition. This is wrong! The WebMonk community was much-anguished by this error in a much-revered text. However, facts always win over authority in a rational society. We learn and move on.

GLB non-canonical algorithm

In terms of the number of moves, there is no more to be said. However, we can explore the canonical algorithm and relax all the constraints, which we claim will generate all the possible minimal solutions (in terms of a sequence of moves of “top disc from peg p1 to peg p2” which obey the rules and move the tower from a source peg to a destination peg).

Example animations are given of each strategy for generating solutions; if you view the examples you will see the WebMonk philosophical interpretation of the strategies – may this bring you algorithmic enlightenment!

Selection of intermediate node

It is clear that choosing the smallest peg from ip is arbitrary. Any peg can be chosen as an intermediate peg where the choice is made in the canonical algorithm (at each recursive level of Move). As a special case, for k=1 (three pegs) there is only one peg in the set, which is why the solution is unique. A further special case is the assumption that peg 1 is the source and peg 2 is the destination – the algorithm works just as well for any peg (eg the last) as the destination, with ip ={1..k+2}-{s,d}, d ≠ s. See example here Animation of Intermediate Peg Selection.

Identical sequential partitioning

The canonical algorithm assumes that the partition of child nodes is identical in each case of Move, as implied by carrying out the tree-traversal once, initially. This is not necessary, since the partition can be carried out at the start of each Move. This is more appropriately addressed in terms of an algorithmic discussion, but the illustrations should make the issue clear.

However, the original algorithm is easiest if you are using pencil and paper or doing it in your head!

This means that a disc may be a root node at one partition and a leaf node at another. Consequently it also means that a particular disc does not necessarily (unlike in the canonical solution) move $2^n$ times; it is the node, for $h≤hf_k$, that moves $2^n$ times, with the base disc possibly being different. For $h=hf_k+1$ each node moves twice when its parent moves once. The FS paper [2] states that a given disc (washer) moves $2^n$ times, which is correct only for their assumed solution (not claimed to be unique). Different partitioning is illustrated here Animation of Sequential Partitioning.

Sequential movement of child nodes

The canonical solution assumes that child nodes are moved off and on sequentially. This is not necessary, since moving a disc from a larger child node cannot interfere with the movement of any discs from a smaller child node. What stops a larger disc from moving is having a smaller one on top of it. So, as discs are consolidated onto a smaller node, any pegs not containing children of the smaller node are available for use by the next largest node.

Conversely, when consolidating a larger child node onto its destination peg, the next smallest node can start moving off its base, provided this does not block a peg needed by any larger disc. Blocking means occupying an empty disc or being placed on top of a larger disc. This does not prevent a disc from temporarily blocking a larger disc on its way to a peg that would not block a disc from the next larger node. This can be formalised (as embodied in the animation page) and is illustrated here Animation of Interleaved sequences.

Leaf node propagation

The requirement that child nodes move off and on the same parent does not apply to the topmost (leaf) height. Assume that node n1 is of degree k1 with k1a non-null children and its left sibling n2 is of degree k2 = k1-1 with k2a non-null children. Once the k1a discs have moved off n1.base disc and n1.base has moved, up to k2-k2a of the k1a smaller discs can be ‘adopted’ by n2 as its missing smallest child nodes. The rest must be placed on n1.base as normal. This does not increase the number of moves, since the adopted nodes still move twice, once off their original parent and once on their adopted parent. n2 does not have an increased number of moves, since the intermediate pegs occupied by the adopted nodes were spare anyway. Note that the adopted nodes need not have sequential discs.

This can be extended to n3, n2’s larger sibling, where up to k3-k3a nodes can be adopted, chosen from n2’s children and adopted children, and so on. Thus, if an $hf_k-1$ node has a set of k’ child nodes, each of which has at least one null leaf child, then the smallest disc of the child node[k’] could move off its parent then onto the child node[1]. Propagation cannot go any further, since there are no spare intermediate pegs for non-$hf_k$ nodes. Also, as a special case, if a height $hf_k$ node has no null child nodes, then no smaller nodes can be adopted. Propagation is illustrated here Animation of Leaf Node Propagation.

When moving the $hf_k$ nodes back onto their parent, any adopted children will be moved off and left on a ‘spare’ intermediate peg until they can move onto their original parent node, after it has moved onto its parent node. This effectively reverses the original sequence of moves, so cannot increase the total from that used by the canonical sequence. This is subject to the freedom to select different intermediate nodes on the return journey.

Combination of Strategies

The essential regularity of the solution is masked for small numbers of discs, especially by leaf node propagation. When investigating by hand or even by exhaustive search for small n, this strategy tends to dominate, making the rules governing the solution unclear. We hope the explanations and examples above will increase the enjoyment of watching the animations.

We believe that the above strategies (in combination) cover all possible variations in solutions. We offer no proof but a reward of $100 (Hey! We’re poor WebMonks, not Bill Gates) to the first to find a new strategy or an example solution not explainable by combinations of the above strategies.


The 'Animate' page shows an animation of a random solution to the general problem of moving d discs on p pegs from a source disc to a destination disc, with source = 1 and destination = 2.

If you want to analyse the solution, view the page source and search for id="MainContent_Moves". The values shown are:
number of discs, number of pegs, [source peg, destination peg (0-based)]

The 3D animation uses WebGL accessed through the THREE.js library.

Some images are obtained from cgtextures.

There is a C# project which generates all possible solutions. This is a generate-and-test exhaustive search taking advantage of symmetries such as 'all empty pegs are equivalent' which still rapidly runs out of puff for even small values of (disc, peg), but is useful for generating and testing theories and algorithms.

Also a C# project which generates (nearly) all solutions based on an algorithm derived from the solution theory. This runs out of puff much less rapidly than the G&T solution, but still takes a lot of computation, because there are a lot of solutions. Tested against the previous project, for small values. An interesting debate around 'how many unique solutions are there?' arises when taking account of 'is using a different empty peg' a different solution and 'is making two valid moves in the opposite order' two solutions or one.

However, due to the constrained nature of the general solution, generating one solution at random is computationally a breeze. For example, 500 discs on 50 pegs is unchallenging.

If there is any interest, these projects may be published once they have been tidied up a bit.


  1. Tower of Hanoi
  2. Stewart B.M. Problem 3918 solution, American Mathematical Monthly 48 pp 216-219, 1941
  3. On the Frame–Stewart algorithm for the multi-peg Tower of Hanoi problem
  4. Hinz A.M. Shortest Paths Between Regular States of the Tower of Hanoi, Information Sciences 63 pp 173-181 (1992)
  5. Symmetry-breaking considerations of the general Hanoi group in the context of the TW3-tensor formulation of a Theory Of Everything (TOE)
  6. The Tower of Hanoi - Myths and Maths, by Hinz, Klavžar, Milutinović & Petr, published by Birkhäuser

Copyright © 2017 by

You are welcome to reference or link to this site, page or proof.