Problems & Puzzles: Puzzles

Problem 74. Prime Number Polyomino Islands in Consecutive-Odd-Number-Labeled Hilbert Curves

Mike Keith sent the following interesting problem:

Here’s a new (I think) prime number problem for your consideration.  Since it involves the Hilbert space-filling curve, let’s start with a brief introduction to the Hilbert Curve.

Hilbert Curve

The Hilbert curve is object that results from iterating a certain procedure an infinite number of times.  We will call each stage of this process an iteration; the curve at each step can either be referred to as “the nth iteration of the Hilbert curve” or the “order n Hilbert curve”.  We are not interested in the limit curve here, only the individual iterations of its construction.

The first 6 iterations of the Hilbert curve are shown in the diagram below.

As you can see, the curve of each order starts in the lower left corner of a square and ends at the lower-right corner and visits the center of every subsquare in the square.  Its construction can be understood in terms of this simple procedure: for each order, take four copies of the previous-order curve and make them half size.  Place two copies in the top two quadrants unchanged, but rotate the copy in the lower left quadrant clockwise by 90 degrees and the lower right copy anticlockwise by 90 degrees, then connect the four copies with three unit-length connector lines.  For illustration, the connector lines are colored black in the order-2 diagram above.

Of course, there are four possible orientations for the whole structure (and hence the Hilbert curve itself), corresponding to rotations of 0, 90, 180, and 270 degrees, but the choice of orientation makes no difference to the properties of the Hilbert curve (or our problem), so for simplicity we will always use the orientation shown in the diagrams above.  We have labelled the order 2 diagram to show the order which with the points are visited as we “walk” the curve from start to end. (The curve could be walked in the opposite order, but again we fix on the convention shown in the diagram.)  Specifically, the numbered labels indicate the number of steps away from the origin, counting along the path of the curve.

In the discussion below we will sometimes want to refer to the position of points along the curve according to their two-dimensional (x,y) position in the square.  For this we will use the convention that the origin of the x,y axes (i.e., (0,0)) is in the lower left corner.  The order-n curve inhabits a grid with 2n points on each side, numbered 0 to 2n-1 in x and y, and there are a total of 2n x 2n = 4n points in the whole grid.  The x,y numbering is illustrated in the order 4 diagram.

When writing programs that involve the Hilbert curve, a very useful function is one that takes the curve order n and an integer d (specifing the distance along the curve of a point we are interested in) and calculates the x,y position of that point in the grid.  The Wikepedia article entitled “Hilbert curve”, in the “Applications and mapping algorithms” section, contains an implementation of this function - see the subroutine called d2xy.  This routine conforms to our x,y usage.

Our Problem

For some integer n > 0, label the 2n x 2n grid of squares with the consecutive odd numbers 1, 3, 5, 7, ... as you walk the path of order-n Hilbert curve.  The diagrams below show the result of doing this for n=1, 2, and 3.  Now color each square according to whether it contains a prime number or not.  In the diagrams below, every non-prime square is colored white while the prime squares are colored various non-white colors (color meanings to be explained in a moment).

Now, consider two prime-number squares to be connected if they share a common edge.  We call a group of one or more connected primes an island, and these islands are simply polyominoes of various sizes: 1-square monominoes, 2-square dominoes, 3-square trominoes, etc.  We color each prime polyomino based on its size; in the diagrams below we only have monominoes (size=1, blue), trominoes (size=3, yellow), pentominoes (size-5, gray), and a 13-omino (orange), though in general, of course, other sizes will appear.

The order-1 Hilbert curve (2x2 square) has a single prime tromino containing 3,5,7.  In order 2 (4x4 square) the primes form two separate pentomino islands.  The order-3 curve (8x8 square, right diagram) is more interesting: there are 5 isolated monomino islands (blue), two trominoes (yellow), one pentomino, and...a massive island of size 13, shown in orange.

To proceed further it is almost necessary to use a computer program, so I wrote such a program that works up to the 15th-order Hilbert Curve (this limit is because it is easy to go to that far with unsigned 32-bit integers, and less easy after order 15).  The program steps through the 2N x 2N grid along the Hilbert Curve path, labels the squares, determines which labels are prime numbers, and then uses the 4-neighbor flood-fill algorithm to find the size of all the islands in the whole grid and produce a population count of how many islands of various sizes there are.  It is probably obvious what question I initially wanted to explore: how many more iterations do we have to go to find an island bigger than the 13-square island in the diagram at the right above?  If you want to take a guess, do so before continuing to read.

The answer to the question just posed (when does the next biggest island occur?) is currently: we don’t know.  As shown by the table below, no new islands of size 13 were discovered out to the 15th-order Hilbert Curve, and no islands of size 12 have appeared either.  The behavior of all the smaller island sizes (1 to 11) is pretty much as one would expect, which is to say that as we continue searching more islands of a given size continue to be found.  Islands of size 11 are very rare but new ones continue to be encountered, including 4 new ones in the order 15 curve.

 Hilbert Curve 1 2 3 4 5 6 7 8 9 10 11 12 13 Iteration 1 0 0 1 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 2 0 0 0 0 0 0 0 0 3 6 0 2 0 1 0 0 0 0 0 0 0 1 4 25 7 2 0 3 2 0 0 0 0 1 0 1 5 80 31 17 6 6 5 1 0 0 0 1 0 1 6 340 128 66 22 14 6 1 1 0 0 1 0 1 7 1452 454 185 68 35 16 3 1 0 0 1 0 1 8 5805 1541 590 233 82 30 4 1 0 0 2 0 1 9 22685 5565 1821 637 198 62 18 5 0 0 2 0 1 10 87449 19602 5855 1830 537 154 45 13 1 0 2 0 1 11 337324 69318 18987 5272 1385 384 99 22 1 0 3 0 1 12 1295714 246587 61871 15615 3851 917 213 47 4 1 4 0 1 13 4966960 878897 205175 47738 10754 2313 504 106 21 4 5 0 1 14 19053955 3149515 682824 147447 30557 6272 1229 238 49 6 6 0 1 15 73888499 11641540 2395694 489712 95493 18232 3306 611 127 24 10 0 1

Table 1.  Number of prime polyomino islands of each size that appear in each iteration (1 to 15) of a Hilbert Curve that’s been labeled with consecutive odd numbers.

Could it be that the size-13 island in the 8x8 square is a fluke, and other than that one there are no islands bigger than 11, no matter how far we go in the Hilbert Curve?  Possibly, but there is currently no mathematical proof that this is the case.

To perhaps gain some more insight we can generalize just slightly.  We still label the Hilbert path with successive odd numbers, but instead of 1 we can use any positive odd integer f  as the first number in the sequence.

Here are the results of counting prime islands for all (odd) values of f (first label on the curve) from 1 to 39.  We show the results just for order 15, since the order k Hilbert curve contains all smaller orders within it:

Prime Island Size (number of squares)

 f 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 1 73888499 11641540 2395694 489712 95493 18232 3306 611 127 24 10 0 1 0 0 3 73087134 11345842 2297891 463010 89166 16830 2968 535 102 21 1 1 0 0 1 5 73087666 11343104 2300033 461941 89579 16843 3032 559 107 18 3 0 0 0 0 7 73091120 11344405 2298546 461754 89379 16837 3109 530 97 14 4 0 0 0 0 9 73088373 11344179 2298565 462931 89256 16672 3067 557 99 25 3 0 0 0 0 11 73085348 11343614 2300929 462407 89053 16747 2994 574 113 21 3 1 0 0 0 13 73086426 11342119 2300530 462307 89986 16610 3004 539 91 25 5 2 0 0 0 15 73092722 11346128 2295769 462466 89670 16782 2973 548 109 17 1 1 0 0 0 17 73092644 11344765 2296283 462919 89510 16803 3045 509 99 19 1 0 1 0 0 19 73086458 11342783 2299905 462892 89568 16676 2980 540 109 15 4 0 0 0 0 21 73092435 11344776 2296817 463368 88904 16746 3053 533 98 8 5 0 0 0 0 23 73081790 11346072 2298898 463274 89416 16547 3098 573 106 18 5 1 0 0 0 25 73093687 11339862 2299813 462581 89265 16810 3051 580 108 16 3 3 0 0 0 27 73090562 11342460 2300647 462064 88882 16873 2947 556 102 19 3 1 0 0 0 29 73085940 11346096 2299979 461275 89434 16723 3068 535 94 22 3 2 0 0 0 31 73089819 11343998 2296939 463694 89470 16657 3040 555 107 15 2 0 0 0 0 33 73088935 11345633 2298769 461980 89505 16591 2974 524 90 19 3 0 1 0 0 35 73088586 11342363 2297950 463738 89569 16733 3053 568 114 20 3 2 0 0 0 37 73094514 11343223 2297561 461970 89693 16751 3085 521 105 33 2 0 0 0 0 39 73093210 11341954 2300338 461629 89491 16578 2963 521 99 15 3 0 0 0 0

Right away we find a nice surprise: when f = 3, there is a size-15 island - and it occurs very near the origin!  Here is an illustration of the 16x16 square at the origin (i.e., the order-4 Hilbert curve) when f = 3.

Here we’ve only colored all the prime squares in the lower left 6x9 rectangle.  There are five islands in this region: the size-15 one (orange), one of size 8 (purple) and three singletons.  If the size-13 island in the 3rd Hilbert iteration with f = 1 is remarkable, this one of size 15 is even more remarkable.  No other size-15 islands were found out to f = 39, and there are none of size 14 either.

With the generalization to arbitrary f, the size-13 island with f = 1 is no longer unique: there is one with f = 17 and another when f = 33.  The size-13 island with f = 17 is located at x = 5860 and y = 4766, which means it can be found in the order-13 curve.  The size-13 island with f = 33 is located at x = 1345 and y = 993, so it first appears at order 11.

As an illustration of what some of these more far-flung islands look like, here is the island of size 13 when

f = 17, located at (x,y) = (5861, 4767), with all the primes colored as usual and the Hilbert Curve overlaid.  The orange squares form the prime island of size 13.

The bottom 8x8 of this 8x9 rectangle is a single continuous strand of the Hilbert Curve that starts at left center and exits at right center.  The top row of the 8x9 is a different piece of the curve that’s pretty far away, in terms of distance along the curve, from the bottom section.  The numbers in the bottom section are of the form 70351xxx while those in the top row are 70346xxx, making these two sections about 2500 steps away from each other (the numbers are about 5000 apart, but this is divided by 2 to get number of steps, since the numbers count by 2).  Note that 12 of the 13 primes are located in the contiguous bottom section of the curve!

The problem we propose is to continue the search for prime islands.  Some of the main open questions are:

1) When f = 1 (arguably the most natural case), are there more islands of size > 11 other than the size-13 island near the origin?

2) For f > 1, is there an island bigger than the size-15 island near the origin when f = 3?  (Side note: a good argument can be made that f=3 is even more natural than f=1, since it eliminates the problematic “unit” of 1.)  Is there an island with size > 15?  No island of size 14 has be found yet, either.

Numbering the path with odd numbers can be viewed as “use all natural numbers but eliminate multiples of 2 in order to increase the density of primes”.  This can be generalized to “use all natural numbers but eliminate multiples of the first m primes”.  So the m=2 numbering would eliminate multiples of 2 and 3 (the sequence starts 5, 7, 11, 13, 17, 19, 23, 25 [the first non-prime], etc).  As m increases the expected island size increases, and indeed by increasing m it is possible to guarantee the appearance of arbitrarily-large islands.  Perhaps only the first few small values of m make interesting problems.

3) Would you like to experiment with this last way of numbering?

Postscript

It would be very desirable to have a second independent computer program to check the results given here.  Although we have some confidence that our program is correct, independent verification of these calculations is important.

 Records   |  Conjectures  |  Problems  |  Puzzles