Saturday, March 10, 2007

Mode Problem

I recently came across this interesting problem.

Given a list of N integers where more than half of the integers are the same, compute the mode (most common element) of the list.

When I came across the problem, the best I could immediately do was to compute the mode for any given set of numbers by putting the numbers into a heap using the frequency as the comparator. If you put the numbers in a heap like that then the mode is always at the root of the heap. It's equivalent to a heap sort, so it's a N*log(N) computation.

I knew this was not the best solution since it did not use the fact that the mode's frequency is greater than N/2. I mentioned the to my friend Chris and he figured out it was a classic Pigeonhole Principle problem.

His solution involved constructing dislike pairs of numbers. Basically you keep doing this until you can't. Eventually you won't be able to do it (because more than half of the numbers are the same) and the leftover will be the mode. Here's the code:

#include <iostream>
#include <vector>

using namespace std;

int mode(int*, int);

int main (int argc, char * const argv[]) {
    vector<int> nums;
    int n = 0;
    while (n != -999){
        cout << "Enter next number, -999 to quit" << endl;
        scanf("%d",&n);
        if (n != -999){
            nums.push_back(n);
        }
    }
    int sz = nums.size();
    int* a = new int[sz];
    for (int i=0;i<sz;i++){
        a[i] = nums[i];
    }
    cout << "Mode: " << mode(a,sz);
   
    return 0;
}

int mode(int* a, int n){
    int* b = new int[n];
    int j=0;
    for (int i=0; i<n; i++){
        if (i == j)
            b[j] = a[i];
        else if (a[i] == b[j])
            b[2*i-j] = a[i];
        else
            b[(++j)++] = a[i];
    }
    return b[j];
}

Blogged with Flock

No comments: