Last Updated:

8 Ways to Find the Largest Common Divisor

This article came to light quite unexpectedly. I accidentally came across the code for calculating the NOD in C#.

At first glance, I even liked everything: simple, concise and without unnecessary fanfare. However, the longer I looked at this code, the more doubts arose about its effectiveness. I decided to do a little research.

In the C++ adaptation, the code looked something like this:

#include <iostream>

using namespace std;

int main() {

    int a, b;
    cin >> a >> b;
    for (int i = a; i > 0; i--) {
        if (a % i == 0 && b % i == 0) {
            cout << "nod = " << i;
            break;
        }
    }
    return 0;
}

Preparation

It was decided:

  1. move from type to type, which would have more room for maneuver;intlong
  2. make the calculation of the NOD in the form of a function;
  3. in the function, call alternate versions of the GCD calculation function and compare their performance.main

The obvious prototype of the function is: . The name of the function is from the English Greatest Common Divisor - i.e. NOD.long gcd(long a, long b)

For such a simple task, I did not contact the profiler, but used primitive timing (details can be found in the article "Optimizing the code through manual timing").

The final version of the "test bench" is given at the end of the article. And in the process of research, I used its simplified version, which provided the launch of one of the tested functions and timing.

In the code, I do not use library functions, so that the maximum amount of code is controlled. The use of library functions of type or - a separate topic for experiments. I leave it for particularly meticulous readers.minswap

01 brute force from arbitrary number

long gcd01(long a, long b) {

    long nod = 1L;
    for (long i = a; i > 0; i--) {
        if (a % i == 0 && b % i == 0) {
            nod = i;
            break;
        }
    }
    return nod;
}

This algorithm is the starting point of the study. The idea of the algorithm is very simple: drive the variable of the loop from the first number to 1. If both numbers are divisible by a loop variable with no residue, then the cycle variable is equal to the NOD; the cycle can end prematurely. If the cycle has passed to the end, then for these numbers the NOD is 1.

The obvious drawback is the asymmetry regarding the arguments. Obviously, the NOD is less than or equal to the lesser of the two numbers. Therefore, driving the cycle away from a larger number does not make sense.

02 brute force from minimum number

long min(long a, long b) {
    return a > b ? b : a;
}

long gcd02(long a, long b) {

    long nod = 1L;
    for (long i = min(a, b); i > 0; i--) {
        if (a % i == 0 && b % i == 0) {
            nod = i;
            break;
        }
    }
    return nod;
}

Just add the simplest function to calculate the minimum number for a pair of numbers and initialize the loop variable to the smaller of the two numbers. In half of the cases, such optimization will not work (when the first argument is already less than the second); but in the other half of cases, the time gain can be very significant.

03 divisor decomposition algorithm

But the second option remained an algorithm with a sequential search. What else can I try?

From the school course of mathematics it is known that the NOD can be found from the decomposition of numbers into prime factors.

a = m1 * m2 * m3 * ... * mN
where m are prime numbers

The NOD will be equal to the product of the prime factors common to both numbers. For example:

24 = (2) * 2 * 2 * (3)
54 = (2) * (3) * 3 * 3

common factors are in brackets

gcd(24, 54) = 2 * 3 = 6

Let's implement this idea:

long gcd03(long a, long b) {

    long nod = 1L;

    if (a > b) {
        long tmp = a;
        a = b;
        b = tmp;
    }

    while (a > 1L && b > 1L) {
        for (long i = 2; i <= a; i++) {
            if (a % i == 0 && b % i == 0) {
                nod *= i;
                a /= i;
                b /= i;
                break;
            }
            if (a % i == 0) {
                a /= i;
                break;
            }
            if (b % i == 0) {
                b /= i;
                break;
            }
        }
    }
    return nod;
}

The first catches a situation where both numbers are divided by a loop variable and, therefore, the cycle variable is a common prime (!) multiplier for both numbers and is taken into account to calculate the NOD. The other two catch cases where only one of the numbers is divided by a loop variable; these multipliers are not included in the NOD.ifif

In a good way, the loop should only go through prime numbers. But finding prime numbers is an independent computational problem, which I would not like to solve here. You can, of course, use a table of prime numbers, for example, within the first thousand, and for large numbers, if necessary, calculate prime numbers by checking for primality ... But I did not climb into the wilds of factorization of natural numbers, but simply closed my eyes to the obviously idle passes of the cycle, when the variable of the cycle is not a prime number, because after finding any of the multipliers for at least one of the numbers, the cycle starts again.forfor

The third performance option isn't bad... But for now, it's self-inflicted. And what have smart people come up with in the centuries-old history of mathematics? "O'key, Gug... that is, Wikipedia, what is a NOD?" here, by the way, describes the finding of a NOD through canonical decomposition into simple multipliers, which we have already implemented. But something new...

04 Euclid algorithm (recursive)

Euclid's algorithm.

In the simplest case, Euclid's algorithm is applied to a pair of positive integers and forms a new pair that consists of a smaller number and a difference between a larger and a smaller number. The process is repeated until the numbers are equal. The number found is the largest common divisor of the original pair.

We implement the recursive version:

long gcd04(long a, long b) {

    if (a == b) {
        return a;
    }
    if (a > b) {
        long tmp = a;
        a = b;
        b = tmp;
    }
    return gcd04(a, b - a);
}

It is believed that recursive algorithms are less efficient than iterative algorithms due to the overhead of calling a function. To check, we make an iterative version.

05 Euclid algorithm (iterative)

long gcd05(long a, long b) {

    while (a != b) {
        if (a > b) {
            long tmp = a;
            a = b;
            b = tmp;
        }
        b = b - a;
    }
    return a;
}

By the way, wikibooks have other implementations of Euclid's algorithm.

06 binary algorithm (recursive)

Euclid binary algorithm.

  1. NOD(0, n) = n; NOD(m, 0) = m; NOD(m, m) = m;
  2. NOD(1, n) = 1; NOD(m, 1) = 1;
  3. If m, n are even, then NOD(m, n) = 2*NOD(m/2, n/2);
  4. If m is even, n is odd, then NOD(m, n) = NOD(m/2, n);
  5. If n is even, m is odd, then NOD(m, n) = NOD(m, n/2);
  6. If m, n are odd and n > m, then NOD(m, n) = NOD((n-m)/2, m);
  7. If m, n are odd and n < m, then THE ND(m, n) = NOD((m-n)/2, n);

Recursive version:

long gcd06(long a, long b) {
    if (a == 0L)
        return b;
    if (b == 0L)
        return a;
    if (a == b)
        return a;
    if (a == 1L || b == 1L)
        return 1L;
    if (a % 2L == 0L && b % 2L == 0L)
        return 2L * gcd06(a / 2L, b / 2L);
    if (a % 2L == 0L && b % 2L != 0L)
        return gcd06(a / 2L, b);
    if (a % 2L != 0L && b % 2L == 0L)
        return gcd06(a, b / 2L);
    if (a < b)
        return gcd06((b - a) / 2L, a);
    else
        return gcd06((a - b) / 2L, b);
}

07 binary algorithm (iterative)

Iterative version:

long gcd07(long a, long b) {
    long nod = 1L;
    long tmp;
    if (a == 0L)
        return b;
    if (b == 0L)
        return a;
    if (a == b)
        return a;
    if (a == 1L || b == 1L)
        return 1L;
    while (a != 0 && b != 0) {
        if (a % 2L == 0L && b % 2L == 0L) {
            nod *= 2L;
            a /= 2L;
            b /= 2L;
            continue;
        }
        if (a % 2L == 0L && b % 2L != 0L) {
            a /= 2L;
            continue;
        }
        if (a % 2L != 0L && b % 2L == 0L) {
            b /= 2L;
            continue;
        }
        if (a > b) {
            tmp = a;
            a = b;
            b = tmp;
        }
        tmp = a;
        a = (b - a) / 2L;
        b = tmp;
    }
    if (a == 0)
        return nod * b;
    else
        return nod * a;
}

By the way, wikibooks have other implementations of Euclid's binary algorithm.

08 Binary algorithm (iterative) using bit operations

Since a distinctive feature of the binary algorithm is the ability to use bit shifts instead of slow operations of division and multiplication, it is a sin not to take advantage of such an opportunity.

This version of the function is identical in logic to the previous one, but the operations of dividing and multiplying by 2 are replaced by arithmetic shifts, and the parity check is replaced by checking the lowest bit of the number.

long gcd08(long a, long b) {
    long nod = 1L;
    long tmp;
    if (a == 0L)
        return b;
    if (b == 0L)
        return a;
    if (a == b)
        return a;
    if (a == 1L || b == 1L)
        return 1L;
    while (a != 0 && b != 0) {
        if (((a & 1L) | (b & 1L)) == 0L) {
            nod <<= 1L;
            a >>= 1L;
            b >>= 1L;
            continue;
        }
        if (((a & 1L) == 0L) && (b & 1L)) {
            a >>= 1L;
            continue;
        }
        if ((a & 1L) && ((b & 1L) == 0L)) {
            b >>= 1L;
            continue;
        }
        if (a > b) {
            tmp = a;
            a = b;
            b = tmp;
        }
        tmp = a;
        a = (b - a) >> 1L;
        b = tmp;
    }
    if (a == 0)
        return nod * b;
    else
        return nod * a;
} 

A little more preparation

 

So, the functions for which there was enough time, brains and imagination are written. You can finish the "test bench" and start testing.

As the results of the test runs showed, the operating time of the functions is highly dependent on the input data. Therefore, it was decided to generate random pairs of numbers. However, it turned out that completely random pairs of numbers are too often mutually simple. So I decided to add a random multiplier to a couple of random numbers. It turned out that even if initially the numbers in the pair were mutually simple, then the NOD search algorithm will have to calculate this added multiplier.

Otherwise, the testing algorithm is simple: for each set of random data, one of the tested functions is taken and run once. The operating time of a function is written to the array element corresponding to that function. After all the tests are passed, the results from the array are divided by the number of datasets - we get the average time for the set.REPEAT_TIMES

#include <cstdio>
#include <clocale>
#include <cstdlib>
#include <ctime>

using namespace std;


//
// definitions of GCD calculation functions should be here
//


struct sub {
    long(*func)(long, long);
    const char*comm;
};


sub arr[] = {
    { gcd01, "01 iteration from an arbitrary number " },
    { gcd02, "02 search from the minimum number " },
    { gcd03, "03 with factorization " },
    { gcd04, "04 Euclid's recursive algorithm " },
    { gcd05, "05 Euclid's iterative algorithm " },
    { gcd06, "06 binary algorithm recursive " },
    { gcd07, "07 binary algorithm iterative " },
    { gcd08, "08 binary iteration algorithm with shift " }
};


const unsigned int RAND_TIMES = 500u;
const unsigned long REPEAT_TIMES = 10000uL;

int main() {

    clock_t the_time;
    double elapsed_time;

    setlocale(LC_ALL, "English");

    long a, b, c, nod;

    srand((unsigned int)time(NULL));
    double times[sizeof(arr) / sizeof(sub)] = { 0.0 };

    for (unsigned int t = 0u; t < RAND_TIMES; t++) {

        c = rand() % 50 + 1;
        a = (rand() % 1000 + 1) * c;
        b = (rand() % 1000 + 1) * c;

        for (unsigned int alg = 0u; alg < sizeof(arr) / sizeof(sub); alg++) {

            the_time = clock();

            for (unsigned long m = 0uL; m < REPEAT_TIMES; m++) {
                nod = arr[alg].func(a, b);
            }

            elapsed_time = double(clock() - the_time) / CLOCKS_PER_SEC;
            times[alg] += elapsed_time;
        }
        printf("%4u gcd(%7ld, %7ld) = %7ld\n", t + 1, a, b, nod);
    }

    printf("\nAverage time for %d pairs of random numbers:\n", RAND_TIMES);
    for (unsigned int alg = 0; alg < sizeof(arr) / sizeof(sub); alg++) {
        printf("%s: %8.4f sec\n", arr[alg].comm, times[alg] / RAND_TIMES);
    }

    return 0;
}

Testing

The program was compiled for MS Visual Studio 2013 and TDM-GCC 4.8.1 in 64-bit Release mode.

Average time for 500 pairs of random numbers:
-------------------------------------------------- -----
01 iteration from arbitrary number : 0.5022 sec.
02 search from the minimum number : 0.3256 sec.
03 with factorization : 0.0063 sec.
04 Euclid recursive : 0.0007 sec.
05 Euclid's algorithm iterative : 0.0008 sec.
06 binary algorithm recursive : 0.0006 sec.
07 binary algorithm iterative : 0.0003 sec.
08 binary algorithm iterative with shift : 0.0002 sec.

As expected, the first algorithm is disastrously inefficient. Algorithm No. 2 – the minimum crutch for No. 1 – works almost 2 times faster.

The third algorithm unexpectedly showed a very decent result: 50 times faster than algorithm No. 2.

The fourth and fifth variants are Euclid's algorithm: the recursive version, oddly enough, overtook the iterative one. Compared to the third option, the time has improved by almost an order of magnitude.

Euclid's binary algorithm showed the best results. Of the three implementations, the recursive version is the most leisurely. The best result is for the optimized version using bit operations.

In total, the most productive version works more than 2500 times faster than the original version.

Note. The time specified in the table is the time it takes for the function calls to be executed.REPEAT_TIMES

Findings

The conclusions are rather banal, but still:

  1. The first algorithm that comes to mind for solving a problem is often suboptimal. Although it can work correctly and quite acceptably in certain conditions.
  2. You should always try to think about improving the algorithm or an alternative option.
  3. Learn math!
  4. If possible, see what people before you have done. Maybe we shouldn't reinvent the wheel?
  5. Optimize your code.

Updated : 

 

Even more efficient code

checks — D B x>0 && y>0

while(y){
if(x>y) x=x%y;
if(x==0)break;
y=y%x;
}
return x==0?y:x;

 

Recursive version of Euclid's algorithm:

int gcd(int a, int b) {return (a == 0 ? b : gcd(b % a, a));}

Offer another more or less understandable and easy-to-write code:

 



#include "stdafx.h"
#include<iostream>

using namespace std;

int main()
{

    int a , b ;
    cin >> a , b ;

    int c;

    if (a < b)
    {
        do
        {
            c = b%a;

            if (b%a == 0)
            {
                break;
            }

            b = a;

            a = c;
        } while (b%a == 0);
    }

    cout << "GCD" << " ( " << a << " ; " << b << " ) " << " = "; // GCD is mean " Greatest Common Divisor "

    do
    {
        c = a%b;

        if (a%b == 0)
        {
            break;
        }

        a = b;

        b = c;
    }

    while (a%b == 0);

    cout << b << "\n";

}