Problems & Puzzles: Puzzles

Puzzle 82.- The Goldbach's Comet

As you know Goldbach conjectured (1742) that "every even number greater than 4 can be expressed as the sum of two odd primes"

Henry F. Fliegel & Douglas S. Robertson proposed (*) the function G(E) as "the  number of prime pairs which can be found to sum a given even value E".

Example: G(10)=2 because 3+7=10, 5+5=10

They also plotted G(E) for E values up to 100,000. This is the figure they obtained (kindly sent to me by snail-mail by Mr. Robertson and scanned at home)


Amazing!!!... Isn't it? An extreme beautiful artistic-tail-Comet-shape obtained in a very simple way; asking very simple questions related to prime numbers.

Fliegel & Robertson already explained in their article the precise banded structure, with a reasoning based in modular arithmetic. They also obtained an empirical formula for the "sharp bottom of the lowermost band". Their formula is:

G(E)=>0.02745 *(E^0.86)

This lowermost band corresponds "to those even numbers which either are powers of two, or contain at most one large prime factor in addition to two" (p. 28) 

By email I asked to Mr. Robertson if this comet-tail fits inside a cone, that is to say if all these G(E) values can be bounded by (inside) two lines m1E+b1 and m2E+b2.  But they didn't try to do this; as a matter of fact they didn't studied the upper frontier of the comet.

a)Would you like to try to find the thinnest cone that bounds the Goldbach' comet?

Hardy & Littlewood conjectured that the number N2(n) of representation of an even number n as the sum of two primes, is given asymptotically by:

N2(n)=2cn/(ln(n))* Product((p-1)/(p-2)), for all odd prime divisors of n, 2c =1.3203

b) Can you produce a graphic that shows both G(E) and N2(n) for the same & adequate ("even numbers which either are powers of two, or contain at most one large prime factor in addition to two") n numbers for 4<n<100000

c) With the help of the Hardy & Littlewood formula can you explore the upper side of the comet and produce a simplified formula like the Fliegel & Robertson one for this region or band? 


(*)Ref. J. Recreational mathematics, Vol. 21(1),1989


Max S.C. Woon - student at the Cambridge University - has found a theoretical approach to a very simple and analytical function that provides an approximation to a lower bound of G(n), let's say G(n), where:

G(n) > G(n) for all n
G(n) ~ S[1/(ln(k).ln(n-k))]
, for k=1 to n/2, for n=even=>6

According to the Gauss estimate the quantity of primes less or equal to n is p(n) = n/ln(n) approximately. Then the probability P(n) of n being a prime is approximately equal to 1/ln(n). "...Pick an integer k<= n/2, even n => 6. The probability of k and n - k both being primes is P(k).P(n - k). Since the partition function G(n) counts the number of pairs of primes, G(n) can be approximated by the sum of the probabilities P(k).P(n - k)=G(n) over all k such that 3<= k<= n/2...".

Click here to see the complete paper (6 pages) that Max sent to me (you'll need Acrobat Reader to see it).  You may notice that G(n) - at the end - is calculated by this formula without any need of prime numbers.


The approximation to a lower bound of G(n) obtained by Max, G(n) is so good that for n<130000 only fails in 5 cases. This is an excerpt of the email that I (C.R.) wrote him yesterday (8/10/2000):

"...It happens that only for n=332, 398, 632, 992 & 2678 the real partitions G(n) are lower than the G(n) predicted by your formula...

N, G(n), G(n)

332, 6, 7
398, 7, 8
632, 9, 11
992, 13, 15
2672, 28, 29

no other cases for n<130000..."


Chris Nash has (12/10/2000) produced the following approximation to the upper bound for the Goldbach's Comet, after knowing the Max's results.

I consider interesting the Nash's comments about the Max's results and then I reproduce them here:

"I am most impressed by Max Woon's approach here! When first I saw the "1/ln(n)" estimate I did not expect it t work, after all it is only an asymptotic estimate, a better estimate of the number of primes less than n is n/(ln(n)-1), but this 'better' estimate will not easily give a bound [it does not work, since there are more primes close to zero than close to n, so it is no longer 'random']. So it is a wonderful surprise when the method gives an elegant result.

And also a very good result... with only 5 exceptions you have found (and still they are very close) and this is for the lower bound only. I wondered why these five results occurred, 332, 398, 632, 992 & 2678. I notice three of them are very "suspicious" - look at 332, 992/3, and 2678/8. They are surprisingly close. Why is their G(n) so small, is there something special about this number that is close to 333?

I think all these exceptions are a power of 2 times a prime and have no small prime factors. For instance 332=4 mod 7, this means any two numbers summing to 332 must be 0+4, 1+3, 2+2, 3+1, 4+0, 5+6, 6+5 mod 7. And two of those cases do not give primes (this explains the p-2 in the Hardy-Littlewood estimate), in a sense these numbers do not work well if we assume primes are 'random' (2/7 of cases is a lot different from random). But that is exactly what you say on the site, 'exceptions' are likely to be even numbers which either are powers of two, or contain at most one large prime factor in addition to two I wonder if that means this could become a proof of Goldbach for all except these 'difficult' numbers? (Perhaps Max could use the better estimates for prime distribution such as the Schinzel conjecture)."

Regarding the approximation to the upper bound, here is what Chris develops:

"I will try it using a method just like Max did, this is a great idea. Here is a very rough estimate of an upper bound. Let there be P primes less than n/2, and Q primes between n/2 and n. Then the Goldbach number G(n)<=min(P,Q) - the Goldbach number is highest if every one of the Q primes has a pair among the first P primes. This is not a very good estimate, because it is *not* possible that every prime in the set Q has a match in the set P. Suppose a prime p does not divide n. Then about a proportion 1/p of the members of Q will have n-q divisible by p. This suggests we should multiply the estimate by (1-1/p) for many primes p. But which ones? There is no need to consider primes greater than the square root of n (since you only need to trial factor to sqrt(n) to list all the primes). How many different small primes could divide n? Since ln(p#) is about p, n cannot be divisible by all the primes up to ln n, so in the worst case, we should ignore all primes less than ln n and take the primes between ln(n) and sqrt(n) in the product.

Merten's theorem states the product of (1-1/p) up to N is proportional to 1/ln N. So the multiplying factor seems to be (log log n)/(log sqrt n) = (2 log log n)/(log n). Q is asymptotically (n/2)/(log n). And so the upper bound is asymptotically (n log log n)/(log n)^2

This seems to be like Max's bound, it covers a lot of points (but there are a few points like the lonely ones near 60000 which are above this curve). I do not know how good this estimate really is?..."

Some minutes later he added:

"Just a thought.... I realized after sending the last mail what the 'lonely point' around 60000 actually is. As I suggested in the proof, to get the upper bound, you need to scale the number of primes by (1-1/p) for all the odd primes that do not divide n.

This makes me think that lonely point is n=60060 (15015=3*5*7*11*13). Of course, because the estimate is asymptotic I expect that point is a little higher than (60060 log log 60060)/(log 60060)^2 = 1190 but perhaps it works for most *ordinary* points!

That is interesting, the lower bounds are points with few odd factors. The upper bounds are points with many odd prime factors?"


I (C.R.) made a little numerical search to test the approximation to the upper bound of G(n) developed by Nash. This is what I found and wrote to Chris the 14/10/2000:

"Hola Chris: Regarding your upper bound here are some results for n<=10000:

* G(n) is the real partitions for an even "n"

* LB(G(n)) is the lower bound calculated by the Max's formula, rounding-down with "floor" function

* UB(G(n)) is the upper bound calculated with your formula, rounding-up with the "ceil" function

* Z=G(n)/UB(G(n))

N LB(G(n)) G(n) UB(G(n))  Z
6  0  1  2  0.5
8  1  2  2  1.0
24  1  4  3  1.33
90  3  10  7  1.43
120  4  13  9  1.44
210  5  20  13  1.53
... ... ... ... ...
60060 305 1565 1190 1.32

...then I would say that the real upper bound is something like your formula multiplied by a factor like 5/3 or 8/5 or 11/7 or 17/11, or simply 20/13"


Max S.C. Woon has created (15/10/2000) a .gif file to show the lower bound LB(G(n)). Click here to open it.


A nice link sent by Felice Russo is this one article by Kevin S. Brown. Thanks, Felice.


One more contribution to this issue has been sent by Dr. Wes Munsil (20/7/2002):

At one time Max See Chin Woon published on your site a lower bound to G (E), the number of Goldbach partitions of an even number E. I don't know what work has gone on since then, but I thought you might be interested in a result I obtained independently, through similar means. The lower bound I have found seems higher than Max's, though I have only verified this for the first thousand or so cases. Its computation is certainly much faster than the computation of Max's. I have verified this lower bound for even numbers up to 10,000,000, with exceptions as noted below.

The principle is this. For even n = 2k, choose a in [3, k] and b in [k, 2k - 3] at random. The probability of a and b summing to 2k is 1 / (k - 2). The probability of a being prime is (pi (k) - 1) / (k - 2). The probability of b being prime is (pi (2k - 3) - pi (k - 1)) / (k - 2). Multiplying these together, and multiplying the product by the number of pairs (k - 2) (k - 2), gives the following estimate for G (2k):

    m (k) = (pi (k) - 1) (pi (2k - 3) - pi (k - 1)) / (k - 2)

For the first two pi (x) (the positive ones), I use Dusart's upper bound (x/log x)(1 + 1.2762/log x). For the third (the negative one), I use Dusart's lower bound (x/log x)(1 + 0.992/log x).

I do not know to what extent the apparent superiority of this lower bound depends on my use of Dusart's bounds, and to what extent it is inherent in the method. I do know that if I use for pi (x) the estimate x / (log x - 1), then this bound is lower than Max's.

For even numbers between 6 and 10,000,000, this lower bound fails on these cases:

2k   G (2k)             m (k)
12      1       1.7203248493827894
28      2       2.185588894741499
32      2       2.2855108925470966
38      2       2.4327035042901377
68      2       3.1146102667784916
98      3       3.722785033781057
122     4       4.171844243356386
128     3       4.2799798204110076
152     4       4.698705382929435
188     5       5.2924184624062125
248     6       6.213387796600334
326     7       7.319466740588393
332     6       7.401208304313093
398     7       8.274831526333923
458     9       9.034280873021315
488     9       9.403547689703203
542     10      10.052939810006036
632     10      11.097572723489389
668     11      11.503971964887542
692     11      11.771643447119315
992     13      14.939517716903463
1112    16      16.133516903844427
1412    18      18.98519968974206
1718    21      21.741038069953376
2252    26      26.281016830430502
2642    29      29.43295715296445
2672    28      29.670676102652898
2936    31      31.736630558574962


Records   |  Conjectures  |  Problems  |  Puzzles