Binary GCD algorithm

Visualisation of using the binary GCD algorithm to find the greatest common divisor (GCD) of 36 and 24. Thus, the GCD is 2² × 3 = 12.

The binary GCD algorithm, also known as Stein's algorithm, is an algorithm that computes the greatest common divisor of two nonnegative integers. Stein's algorithm uses simpler arithmetic operations than the conventional Euclidean algorithm; it replaces division with arithmetic shifts, comparisons, and subtraction.

Although the algorithm in its contemporary form was first published by the Israeli physicist and programmer Josef Stein in 1967,^[1] it may have been known by the 2nd century BCE, in ancient China.^[2]^[3]

Algorithm

The algorithm reduces the problem of finding the GCD of two nonnegative numbers v and u by repeatedly applying these identities:

gcd(0, v) = v, because everything divides zero, and v is the largest number that divides v. Similarly, gcd(u, 0) = u.
gcd(2u, 2v) = 2·gcd(u, v)
gcd(2u, v) = gcd(u, v), if v is odd (2 is not a common divisor). Similarly, gcd(u, 2v) = gcd(u, v) if u is odd.
gcd(u, v) = gcd(|u − v|, min(u, v)), if u and v are both odd.

An extended binary GCD, analogous to the extended Euclidean algorithm, can provide the Bézout coefficients in addition to the GCD, i.e. integers a and b such that a·u + b·v = gcd(u, v).^[4]^[5]

Implementation

Recursive version in C

Following is a recursive implementation of the algorithm in C. The implementation is similar to the description of the algorithm given above, and optimised for readability rather than speed, though all but one of the recursive calls are tail recursive.

unsigned int gcd(unsigned int u, unsigned int v)
{
    // Base cases
    //  gcd(n, n) = n
    if (u == v)
        return u;
    
    //  Identity 1: gcd(0, n) = gcd(n, 0) = n
    if (u == 0)
        return v;
    if (v == 0)
        return u;

    if (u % 2 == 0) { // u is even
        if (v % 2 == 1) // v is odd
            return gcd(u/2, v); // Identity 3
        else // both u and v are even
            return 2 * gcd(u/2, v/2); // Identity 2

    } else { // u is odd
        if (v % 2 == 0) // v is even
            return gcd(u, v/2); // Identity 3

        // Identities 4 and 3 (u and v are odd, so u-v and v-u are known to be even)
        if (u > v)
            return gcd((u - v)/2, v);
        else
            return gcd((v - u)/2, u);
    }
}

Iterative version in Rust

Following is an implementation of the algorithm in Rust, adapted from uutils. It removes all common factors of 2, using identity 2, then computes the GCD of the remaining numbers using identities 3 and 4, combining these to form the final answer.

pub fn gcd(mut u: u64, mut v: u64) -> u64 {
    use std::cmp::min;
    use std::mem::swap;

    // Base cases: gcd(n, 0) = gcd(0, n) = n
    if u == 0 {
        return v;
    } else if v == 0 {
        return u;
    }

    // Using identities 2 and 3:
    // gcd(2ⁱ u, 2ʲ v) = 2ᵏ gcd(u, v) with u, v odd and k = min(i, j)
    // 2ᵏ is the greatest power of two that divides both u and v
    let i = u.trailing_zeros();  u >>= i;
    let j = v.trailing_zeros();  v >>= j;
    let k = min(i, j);

    loop {
        // u and v are odd at the start of the loop
        debug_assert!(u % 2 == 1, "u = {} is even", u);
        debug_assert!(v % 2 == 1, "v = {} is even", v);

        // Swap if necessary so u <= v
        if u > v {
            swap(&mut u, &mut v);
        }

        // Using identity 4 (gcd(u, v) = gcd(|v-u|, min(u, v))
        v -= u;

        // Identity 1: gcd(u, 0) = u
        // The shift by k is necessary to add back the 2ᵏ factor that was removed before the loop
        if v == 0 {
            return u << k;
        }

        // Identity 3: gcd(u, 2ʲ v) = gcd(u, v) (u is known to be odd)
        v >>= v.trailing_zeros();
    }
}

This implementation showcases several performance optimisations:

Trial division by 2 is eschewed in favour of a single bitshift and the count trailing zeros primitive u64::trailing_zeros.
The loop is laid out so as to avoid repeated work; for instance, eliminating 2 as a factor of v was moved to the back of the loop, and after the exit condition, as v is known to be odd upon entering the loop.
The body of the loop is branch-free except for its exit condition (v == 0), as the exchange of u and v (ensuring u ≤ v) compiles down to conditional moves.^[6] Hard-to-predict branches can have a large, negative impact on performance.^[7]^[8]

Efficiency

The algorithm requires O(n) steps, where n is the number of bits in the larger of the two numbers, as every step reduces at least one of the operands by at least a factor of 2. Each step involves only a few arithmetic operations (O(1) with a small constant), so the number of machine operations is on the order of log(max(u, v)) when working with word-sized numbers.

However, the asymptotic complexity of this algorithm is O(n²),^[9] as those arithmetic operations (subtract and shift) each take linear time for arbitrarily-sized numbers (one machine operation per word of the representation). This is the same as for the Euclidean algorithm, though neither is the fastest for arbitrary-precision arithmetic; instead, recursive methods that combine ideas from the binary GCD algorithm with the Schönhage–Strassen algorithm for fast integer multiplication can find GCDs in near-linear time, but only outperform older algorithms for numbers larger than about 64 kilobits (i.e. greater than 8×10¹⁹²⁶⁵).^[10]

A more precise analysis by Akhavi and Vallée proved that binary GCD uses about 60% fewer bit operations than the Euclidean algorithm.^[11]

Historical description

An algorithm for computing the GCD of two numbers was known in ancient China, under the Han dynasty, as a method to reduce fractions:

If possible halve it; otherwise, take the denominator and the numerator, subtract the lesser from the greater, and do that alternately to make them the same. Reduce by the same number.
— Fangtian - Land surveying, The Nine Chapters on the Mathematical Art

The phrase "if possible halve it" is ambiguous,^[2]^[3]

if this applies when either of the numbers become even, the algorithm is the binary GCD algorithm;
if this only applies when both numbers are even, the algorithm is similar to the Euclidean algorithm.

References

^ Stein, J. (1967), "Computational problems associated with Racah algebra", Journal of Computational Physics, 1 (3): 397–405, doi:10.1016/0021-9991(67)90047-2, ISSN 0021-9991
^ ^a ^b Knuth, Donald (1998), Seminumerical Algorithms, The Art of Computer Programming, 2 (3rd ed.), Addison-Wesley, ISBN 978-0-201-89684-8
^ ^a ^b Zhang, Shaohua (2009-10-05). "The concept of primes and the algorithm for counting the greatest common divisor in Ancient China". arXiv:0910.0095 [math.HO].
^ Knuth 1998, p. 646, answer to exercise 39 of section 4.5.2
^ "Handbook of Applied Cryptography" (PDF). Retrieved 2017-09-09.
^ Godbolt, Matt. "Compiler Explorer". Retrieved 4 November 2020.
^ Kapoor, Rajiv (2009-02-21). "Avoiding the Cost of Branch Misprediction". Intel Developer Zone.
^ Lemire, Daniel (2019-10-15). "Mispredicted branches can multiply your running times".
^ "GNU MP 6.1.2: Binary GCD".
^ Stehlé, Damien; Zimmermann, Paul (2004), "A binary recursive gcd algorithm", Algorithmic number theory (PDF), Lecture Notes in Comput. Sci., 3076, Springer, Berlin, pp. 411–425, CiteSeerX 10.1.1.107.8612, doi:10.1007/978-3-540-24847-7_31, ISBN 978-3-540-22156-2, MR 2138011.
^ Akhavi, Ali; Vallée, Brigitte (2000), "Average Bit-Complexity of Euclidean Algorithms", Proceedings ICALP'00, Lecture Notes Computer Science 1853: 373–387, CiteSeerX: 10.1.1.42.7616, archived from the original on 2006-10-02

External links

NIST Dictionary of Algorithms and Data Structures: binary GCD algorithm
Cut-the-Knot: Binary Euclid's Algorithm at cut-the-knot
Analysis of the Binary Euclidean Algorithm (1976), a paper by Richard P. Brent, including a variant using left shifts
"Dynamics of the Binary Euclidean Algorithm: Functional Analysis and Operators" (1998), a paper by Brigitte Vallée.

[Stein-1] Stein, J. (1967), "Computational problems associated with Racah algebra", Journal of Computational Physics, 1 (3): 397–405, doi:10.1016/0021-9991(67)90047-2, ISSN 0021-9991

[Knuth98-2] Knuth, Donald (1998), Seminumerical Algorithms, The Art of Computer Programming, 2 (3rd ed.), Addison-Wesley, ISBN 978-0-201-89684-8

[Zhang-3] Zhang, Shaohua (2009-10-05). "The concept of primes and the algorithm for counting the greatest common divisor in Ancient China". arXiv:0910.0095 [math.HO].

[egcd-knuth-4] Knuth 1998, p. 646, answer to exercise 39 of section 4.5.2

[egcd-applied-crypto-5] "Handbook of Applied Cryptography" (PDF). Retrieved 2017-09-09.

[rust_disassembly-6] Godbolt, Matt. "Compiler Explorer". Retrieved 4 November 2020.

[intel-7] Kapoor, Rajiv (2009-02-21). "Avoiding the Cost of Branch Misprediction". Intel Developer Zone.

[lemire-8] Lemire, Daniel (2019-10-15). "Mispredicted branches can multiply your running times".

[gmplib-9] "GNU MP 6.1.2: Binary GCD".

[stehlé-zimmermann-10] Stehlé, Damien; Zimmermann, Paul (2004), "A binary recursive gcd algorithm", Algorithmic number theory (PDF), Lecture Notes in Comput. Sci., 3076, Springer, Berlin, pp. 411–425, CiteSeerX 10.1.1.107.8612, doi:10.1007/978-3-540-24847-7_31, ISBN 978-3-540-22156-2, MR 2138011.

[bit-complexity-11] Akhavi, Ali; Vallée, Brigitte (2000), "Average Bit-Complexity of Euclidean Algorithms", Proceedings ICALP'00, Lecture Notes Computer Science 1853: 373–387, CiteSeerX: 10.1.1.42.7616, archived from the original on 2006-10-02

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

Number-theoretic algorithms
Primality tests	AKS APR Baillie–PSW Elliptic curve Pocklington Fermat Lucas Lucas–Lehmer Lucas–Lehmer–Riesel Proth's theorem Pépin's Quadratic Frobenius Solovay–Strassen Miller–Rabin
Prime-generating	Sieve of Atkin Sieve of Eratosthenes Sieve of Sundaram Wheel factorization
Integer factorization	Continued fraction (CFRAC) Dixon's Lenstra elliptic curve (ECM) Euler's Pollard's rho p − 1 p + 1 Quadratic sieve (QS) General number field sieve (GNFS) Special number field sieve (SNFS) Rational sieve Fermat's Shanks's square forms Trial division Shor's
Multiplication	Ancient Egyptian Long Karatsuba Toom–Cook Schönhage–Strassen Fürer's
Euclidean division	Binary Chunking Fourier Goldschmidt Newton-Raphson Long Short SRT
Discrete logarithm	Baby-step giant-step Pollard rho Pollard kangaroo Pohlig–Hellman Index calculus Function field sieve
Greatest common divisor	Binary Euclidean Extended Euclidean Lehmer's
Modular square root	Cipolla Pocklington's Tonelli–Shanks Berlekamp
Other algorithms	Chakravala Cornacchia Exponentiation by squaring Integer square root Integer relation (LLL) Modular exponentiation Montgomery reduction Schoof
Italics indicate that algorithm is for numbers of special forms

Conheça Walt Disney World

Binary GCD algorithm

Contents