Project Euler #2: Even Fibonacci Numbers

Posted on 2017-03-06 by nbloomf
Tags: project-euler, literate-haskell

Spoiler alert! This page is part of a series on solutions to Project Euler problems. If you prefer to solve problems yourself, do not read on!

This post is literate Haskell; you can load the source into GHCi and play along.


First some boilerplate.

module ProjectEuler002 where

import Data.Ratio
import System.Exit

Problem 2 from Project Euler:

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be: \[1, 2, 3, 5, 8, 13, 21, 34, 55, 89, \ldots\]

By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

Let’s start with the obvious thing: take the Fibonacci numbers under four million, filter for the evens, and sum. We’ll use the definition of the sequence from the problem statement, indexed from 1 like so: \[F(1) = 1, F(2) = 1, F(n+2) = F(n) + F(n+1).\] We’ll translate this to an unfold like so.

fibs :: [Integer]
fibs = 1 : 1 : fibs' 1 1
  where
    fibs' a b = let c = (a+) $! b in c : fibs' b c

pe2' :: Integer -> Integer
pe2' n = sum $ filter even $ takeWhile (< n) fibs

Note that strictness annotation ($!) in the definition of fibs; it’s needed to force the evaluation of each successive number, so we don’t end up with a bunch of unevaluated additions.

We can verify that pe2' 89 == 44 as expected. Then our answer is

$> pe2' 4000000
4613732

And done.

But wait! Anything worth doing is worth overdoing, so let’s try something less obvious.

I think it will be pretty hard to squeeze much more performance out of pe2'. The Fibonacci numbers grow reasonably quickly, getting a new digit every 5 terms or so. Why is that? The first Fibonacci number with a given number of digits (always?) has leading digit 1, and the one after that has leading digit 1 or 2, so at that point (after moving the decimal point) it looks like we’re starting a new Fibonacci sequence with initial terms \(1 + \varepsilon\) and \(1 + \delta\), which reaches 10 after at most six terms. This means that computing pe2' (10*n) requires looking at only 5 or 6 more terms in fibs than pe2' n does. And computing the next term in fibs is cheap.

Let’s look at the first several \(F(n)\) again, this time highlighting the even terms: \[1, 1, \fbox{2}, 3, 5, \fbox{8}, 13, 21, \fbox{34}, 55, 89, \fbox{134}, \ldots\] And what’s this then – there’s a pattern! It appears that every third \(F(n)\) is even, and no others. But does this pattern always hold? Indeed it does; to see why, unpack the recursive definition of \(F\) on \(F(n+3)\). This gives the congruence \[F(n+3) = F(n) + 2F(n+1) \equiv F(n) \pmod{2}.\] So the parity pattern of \(F(n)\) repeats every third term, and because \(F(1)\) and \(F(2)\) are odd while \(F(3)\) is even, this explains the pattern. That means we don’t really need to check the parity of our Fibonacci numbers; it’s enough to simply throw out all but every third term.

pe2'' :: Integer -> Integer
pe2'' n = sum $ takeWhile (< n) $ take3rd fibs
  where
    take3rd (_:_:a:as) = a : take3rd as

But the observation that \(F(n)\) is even precisely when \(n = 3k\) leads to another idea; what we really want is \[\sum_{k=1}^t F(3k)\] where \(t\) is the index of the largest Fibonacci number less than \(N = 4,000,000\). This is interesting because there is a closed form formula for \(F(n)\) known as Binet’s formula: \[F(n) = \frac{1}{\sqrt{5}}(\varphi^n - \overline{\varphi}^n),\] where \(\varphi = (1+\sqrt{5})/2\) is the largest real solution of \(x^2 - x - 1 = 0\) and \(\overline{\varphi} = (1-\sqrt{5})/2\) is its quadratic conjugate.

But now:

\[\begin{eqnarray*} \sum_{k=1}^t F(3k) & = & \sum_{k=1}^t \frac{1}{\sqrt{5}}\left( \varphi^{3k} - \overline{\varphi}^{3k} \right) \\ & = & \frac{1}{\sqrt{5}} \left( \sum_{k=0}^{t-1} (\varphi^3)^{k+1} - \sum_{k=0}^{t-1} (\overline{\varphi}^3)^{k+1} \right) \\ & = & \frac{1}{\sqrt{5}} \left( \varphi^3 \sum_{k=0}^{t-1} (\varphi^3)^k - \overline{\varphi}^3 \sum_{k=0}^{t-1} (\overline{\varphi}^3)^k \right) \\ & = & \frac{1}{\sqrt{5}} \left( \varphi^3 \frac{\varphi^{3t} - 1}{\varphi^3 - 1} - \overline{\varphi}^3 \frac{\overline{\varphi}^{3t} - 1}{\overline{\varphi}^3 - 1} \right) \\ & = & \frac{1}{2\sqrt{5}} \left( \varphi^{3t+2} - \overline{\varphi}^{3t+2} - \varphi^2 + \overline{\varphi}^2 \right) \\ & = & \frac{1}{2}\left( \frac{1}{\sqrt{5}}\left( \varphi^{3t+2} - \overline{\varphi}^{3t+2} \right) - \frac{1}{\sqrt{5}}\left( \varphi^2 - \overline{\varphi}^2 \right) \right) \\ & = & \frac{F(3t+2) - 1}{2} \end{eqnarray*}\]

That is, the sum of the first \(t\) even Fibonacci numbers is essentially the \((3t+2)\)th Fibonacci number. The only obstacle to applying this to our current problem is that it’s not obvious how to take an upper bound \(N\) and find the index of the largest even Fibonacci number less than \(N\).

Here’s a quick and dirty way to find the largest index of a Fibonacci number below \(N\).

maxfibidx' :: Integer -> Integer
maxfibidx' n =
  fst $
  head $
  dropWhile (\(_,m) -> m < n ) $
  zip [1..] (tail fibs)

But this doesn’t really improve on our naive implementation of pe2', since it still requires generating all the Fibonacci numbers up to \(n\). Here’s a better idea. \(F\) is a monotone increasing function on the natural numbers, and we wish to find the largest \(m\) such that \(F(m) < N\). We can find this \(m\) using a binary search.

First, we find integers \(a\) and \(b\) such that \(F(a) < N \leq F(b)\); this can be done by looking at \(m = 2^i\) for increasing \(i\).

Then, once such \(a\) and \(b\) are found, we tighten the interval \((a,b)\) bounding a root of \(F(x) - N\) by bisection. The binarysearch function does this, taking a function \(G\) as a parameter.

-- g is nondecreasing on positive integers
-- g(1) < n
-- returns the largest t such that g(t) < n
binarysearch :: (Integer -> Integer) -> Integer -> Integer
binarysearch g n = refine init
  where
    -- find powers of 2 that bound a root of g(x) - n
    k =
      fst $
      head $
      dropWhile ((< n) . snd) $
      map (\k -> (k, g $ 2^k)) [1..]

    -- this interval contains a root of g(x) - n
    init = (2^(k-1), 2^k)

    -- bisect a root-containing interval
    refine (a,b)
      | b-a <= 1 = a
      | otherwise = let m = (a+b)`quot`2 in
          case compare (g m) n of
            EQ -> m-1
            LT -> refine (m,b)
            GT -> refine (a,m)

With an implementation of Binet’s formula, we’d be nearly there. But in order to do exact arithmetic on quadratic numbers of the form \(p + q\sqrt{5}\) – as required by Binet – we need the following type and instances.

data Root5 = Root5 Rational Rational
  deriving (Eq, Show)

phi   = Root5 (1%2) (1%2)
sqrt5 = Root5  0     1

-- quadratic conjugate
conj :: Root5 -> Root5
conj (Root5 a b) = Root5 a (negate b)

-- rational part
ratpart :: Root5 -> Rational
ratpart (Root5 p _) = p

instance Num Root5 where
  fromInteger k = Root5 (k%1) 0

  (Root5 a1 b1) + (Root5 a2 b2) =
    Root5 (a1 + a2) (b1 + b2)

  (Root5 a1 b1) * (Root5 a2 b2) =
    Root5 (a1*a2 + 5*b1*b2) (a1*b2 + a2*b1)

  negate (Root5 a b) =
    Root5 (negate a) (negate b)

  abs = undefined; signum = undefined

instance Fractional Root5 where
  fromRational q = Root5 q 0

  recip (Root5 0 0) = undefined
  recip (Root5 a b) = Root5 (a/d) (-b/d)
    where d = a*a - 5*b*b

Then, first of all, we can implement and test Binet’s formula.

-- directly compute F(n)
binet :: Integer -> Integer
binet n = numerator $ ratpart $
  (phi^n - (conj phi)^n) / sqrt5

-- binet k == fibs !! (k-1)
test_binet :: Integer -> Bool
test_binet n = and $ zipWith (==) fibs $ map binet [1..n]

We can now implement maxfibidx using binary search.

maxfibidx :: Integer -> Integer
maxfibidx n = binarysearch binet n

And finally, an alternative implementation of pe2'.

pe2''' :: Integer -> Integer
pe2''' n = ((binet $ 3*t+2) - 1) `quot` 2
  where
    t = (maxfibidx n) `quot` 3

As a sanity check:

$> pe2' 4000000
4613732
$> pe2'' 4000000
4613732
$> pe2''' 4000000
4613732

And more generally:

test :: Integer -> Bool
test n = (pe2' n == pe2'' n) && (pe2' n == pe2''' n)

With a check:

$> all test [1..1000]
True

For comparison, here is a table of timings on my machine. I should note that the times for pe2'' are misleading; they do not include time spent thrashing in memory. pe2''' did not suffer from this. The problem was so bad I couldn’t complete this table, but I think a trend is already evident.

n pe2'' n Time (s) pe2''' n Time (s)
\(10^{1 \cdot 10^4}\) 0.59 0.15
\(10^{2 \cdot 10^4}\) 1.07 0.28
\(10^{3 \cdot 10^4}\) 1.50 0.44
\(10^{4 \cdot 10^4}\) 2.03 0.58
\(10^{5 \cdot 10^4}\) 3.02 0.74
\(10^{6 \cdot 10^4}\) 6.84 0.98
\(10^{7 \cdot 10^4}\) ??? 1.08
\(10^{8 \cdot 10^4}\) ??? 1.25
\(10^{9 \cdot 10^4}\) ??? 1.40

Just for fun, the sum of all even Fibonacci numbers less than \(10^{1000000}\) is even. Shocking, I know! But my machine determined this by brute force in 14 seconds. :)

So the final answer is:

pe2 :: Integer
pe2 = pe2''' 4000000

main :: IO ()
main = do
  let success = all test [1..1000]
  if success
    then putStrLn $ show pe2
    else exitFailure