Over the decades since the Busy Beaver game was first proposed, the process of proving values has come to solidify into the following:

  1. Enumerate all TMs (perhaps up to equivalence using, say, Brady’s TNF algorithm)
  2. Run all halting TMs until they halt
  3. Run a sequence of “filters” on all non-halting TMs to incrementally prove that they will never halt.
  4. Analyze a small number of remaining “holdout” machines by hand to prove they will never halt.

This is the process that Radó used to prove BB(2, 2) (using no filters), Lin used to prove B(3, 2) (using a single filter: Lin Recurrence / Translation Cycling) and Brady used to prove BB(4, 2) (with several filters for various tree/counter detections). Likewise, it is the model that the folks at bbchallenge.org are using to try and prove BB(5, 2).

Each step in this process has its own challenges, but step (3) (creating filters to prove non-halting) is (I believe) the biggest challenge to proving BB(5, 2). In order to support this mission, I will be sharing details about various filters that my father and I use in our codebase.

Note that what I call “filters” the bbchallenge.org folks call “deciders”.

CTL Filter

The first filter I would like to share is the CTL (Closed Tape Language) family of filters.

Background

Heiner Marxen explained this technique to me over email and in person around the summer of 2006. It is a technique that he invented and named. I have not seen it written about or talked about anywhere else, so I am excited to share it here.

The idea behind CTL is to define a set C of configurations so that:

  1. This TM reaches a configuration c in C at some point s.
  2. C is closed: For any configuration c in C, if you advance forward some positive number of steps, you will reach d that is also in C.
  3. C does not contain any halting configs.

If all of these facts are true, then we can see that the TM will, enter set C and once it does, it will continue to revisit unboundedly many times and thus this machine will never halt!

In fact, this general idea is basically how all “Forward Reasoning” proofs of non-halting work. For TMs that repeat in place, this is a finite set of configurations that the machine cycles through. For machines which enter Chain Recurrence (aka Spin Out) that set is the set of all half-zero tapes (in the direction of spin out) and the spin out state. And you can define it for Lin Recurrence and Proof System Recurrence, etc.

Language

However, in all of these examples, we are very rigid, we are effectively defining the exact configuration that the TM will be in at all future steps. The idea with CTL is to be much more fluid, what if we can define a superset of the configurations that this TM will reach in some mathematical language. Specifically, the CTL filter uses the language of Regular Expressions (RegEx).

Example

Let us consider an example (#11_781_743 on bbchallenge):

1RB1LB_1LC0RE_1LD1LC_1RA---_0LB0LD

  0 1
A 1RB 1LB
B 1LC 0RE
C 1LD 1LC
D 1RA
E 0LB 0LD

As of step 14 it is in the following CTL set:

(00^inf (11)* B> (10|11)* 00^inf | 00^inf (11)* <C (10|11)* 00^inf)

Evaluating each potential option, we see the following:

  • B> 10 -> <C 10 in 3 steps
  • B> 11 -> 11 B> in 4 steps
  • B> 00 -> <C 10 in 1 step
  • 11 <C -> <C 11 in 2 steps
  • 00 <C -> 11 B> in 7 steps

Each of these results leaves us in a subset of the same Regular Expression. Thus we have proven that this machine is non-halting using the CTL method!

Implementation

The example above fits into the simplest version of the CTL filter, the one I call A*. Specifically, it attempts to build a CTL Regular Expression of the form ( A* S> B* | C* <T D* | E* T> F* | ...) where A, B, C, D, E, F, … are all of the form (00|10|01) (or-ings of a fixed width block of symbols). The fixed width is called the block size and we set that before calling the CTL filter (or cycle over all block sizes programmatically).

We start by running the TM some parameterized number of steps (ex: 200), then setup the initial RegEx. In this case, after 200 (simulator) steps we ended up in the configuration: 00^inf 11^1 B> 11^7 10^1 00^inf, turning that into a A* CTL RegEx we get: 00^inf (11)* B> (10|11)* 00^inf. Now we iterate the following procedure, expanding the RegEx until either (A) the RegEx expands to include a Halt (failure, nothing can be proven) or (B) the RegEx stops expanding (success, it is closed and thus this TM is proven to never halt). The procedure is: For each state/dir combo (only B> here), for each possible symbol in front of the TM head 10, 11 (or 00 if (10|11)* is empty) evaluate the TM on that block until it either halts, loops forever or leaves the block and then expand the RegEx to include the result.

  • B> 10 -> <C 10 in 3 steps Add 00^inf (11)* <C 10 (10|11)* 00^inf (simplifies to 00^inf (11)* <C (10|11)* 00^inf).
  • B> 00 -> <C 10 in 1 step. Add 00^inf (11)* <C 10 00^inf (subset of previous RegEx).
  • B> 11 -> 11 B> in 4 steps Add 00^inf (11)* 11 B> (10|11)* 00^inf (subset of starting RegEx).

Now the RegEx has become (00^inf (11)* B> (10|11)* 00^inf | 00^inf (11)* <C (10|11)* 00^inf) and as I showed above, one more iteration will show that this is a closed set.

Generalizations

This A* version is the simplest version. In our codebase we also have versions for expressions like A* B, A B* and A* B C. However, I have not yet found any way to program these in general (so far I have to program the logic for each one at a time). Heiner claimed to have found a general way to do this for any RegEx form, but I don’t know how he did it!

Note that in the above example I am really using Heiner Marxen’s Multi-Symbol Macro Machines in order to simulate the TM on a block of symbols at a time. Using Backward-Symbol Macro Machines adds even more power and I highly recommend using them by default in CTL. In general, you can technically do CTL without “Macro Machines”, but it tends to be pretty weak. I remember Heiner mentioning to me, when he first explained this concept, how CTL really only shows it’s true value when combined with Macro Machines.

Effectiveness

This CTL technique is very strong when combined with Macro Machines. Applying our CTL A* B version with block sizes up to 6 and using Backward-Symbol Macro Machines I was able to prove infinite 86% (1,336,882 / 1,538,612) of the outstanding machines on https://bbchallenge.org/ (leaving 201,730 remaining).

This technique is especially nice because it categorizes all sorts of machines (counters, various flavors of X-mas trees, even some chaotic machines) because it does not attempt to show what the TM will for sure do, instead it shows what it will for sure never do.

Future Work

  1. Can anyone describe how to generalize to arbitrary Regular Expressions?
  2. Develop other types of Macro Machines to use here. Experience shows that adding new types of Macro Machines is extremely effective.
  3. Allow for more complicated CTL-type proofs. Right now the simulator only moves one step forward (perhaps one Macro Machine step) and then adds that config to the CTL set. But I am working on some hand-written proofs that are more complicated and require making many steps to get back to the CTL set.

Attribution

I mentioned it above several times, but none of this is my discovery. This technique was invented by Heiner Marxen. In a recent email he told me that his first use of it was not very effective, but his collaborator Jürgen Buntrock was much more enthusiastic about the idea and encouraged him to add variants including using Macro Machines which turned out to be the key to their effectiveness.