Vogue Media BLOG


No doubt there have been times you've wondered, "How did that dev come up with that solution?"

How knowing more than programming makes better programmers

author: Marius Conradie

Summary

This blog post is long, I know, but it uses a recent example to illustrate the difference that knowing more than coding makes.

When we program, we're actually translating real-world requirements into a different language that uses simpler constructs to express what needs to happen. It's not always a mere translation from requirement to code statements. This is where broadening one's experiences and spheres of understanding helps one find simple solutions to difficult or just obscure requirements (or problem statements). We draw inspiration from different interests and subject matter and sometimes find the actual solution, which then just needs to be expressed in code. Other times, exposure to different fields of expertise gives us insights that drive our decisions during coding, which leads to code that supports the desired outcome better than if we had no insight into a client's world, domain, vocabulary and issues.

Understand the original requirement

Recently a fellow developer got stuck writing a method to calculate the consistency between the values in a list. He's a good programmer and knows his stuff, but he could not work out how to start on the algorithm. He tried a few approaches, and the calculated results didn't make sense except when all the values were the same, in which case the answer came out as 100% (yay, the code works for one sample set). For other test sample sets the results would sometimes be above 100%, which isn't possible: after all, you can be at most 100% consistent.

So the first step was to find out what is meant by consistency in this case. It's easy to draw a meaning from one's own frame of reference, but the intended meaning from the client's perspective could be very different. It's irrelevant what the technically correct dictionary meaning is. What is relevant is the context in which the client uses the term to describe a required outcome.

He explained to me that they're trying to calculate the consistency in a set of test results. This simple explanation changed my odd frame of reference from calculating the mean value to calculating a higher-level concept: a percentage indication of how much the results are the same.

I went to work thinking about the requirement and its meaning. So far I had not touched a keyboard or started writing code or, even worse, written experimental code until the test results were fulfilled; that leads to false positives and ultimately circumstantial programming.

Understanding the context and the non-technical requirement is the first step. Avoiding 'coding till it works' helps avoid circumstantial programming: code that looks like it's working until untested data comes in.

A deeper understanding of the requirement

Now that I had reached a deeper understanding of what's required (regardless of whether code does the work or it is done manually), I could start exploring existing concepts in fields other than programming that may already provide a proven way to calculate the consistency.

I started to reason my way to discovering the solution: I'm looking for a field of expertise where I can calculate quantitatively a concept that's going to be uncertain upon visual inspection of the data, when the data isn't intuitive.

Examples: all values the same means 100% consistent, but what about three out of five values being the same? Is that 66% consistent? 33% consistent?
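To make the uncertainty concrete, here is a minimal sketch of one naive reading of "consistency": the share of values that equal the most common value. The helper name `shareOfMostCommonValue` and the sample numbers are made up for illustration and are not part of the original solution. It gives the same answer for two sets that clearly differ in how spread out they are, which suggests that simply counting matches is not enough.

```javascript
// Hypothetical helper: percentage of values equal to the most common value.
function shareOfMostCommonValue(values) {
    if (values.length === 0) {
        return NaN; // nothing to measure
    }
    // Count how often each distinct value occurs
    const counts = new Map();
    for (const value of values) {
        counts.set(value, (counts.get(value) || 0) + 1);
    }
    const highestCount = Math.max(...counts.values());
    // Multiply before dividing to keep the arithmetic exact for small integers
    return (highestCount * 100) / values.length;
}

// Illustrative sample sets: three out of five values are the same in both
const nearMiss = [70, 70, 70, 69, 71];
const farMiss = [70, 70, 70, 10, 100];

console.log(shareOfMostCommonValue(nearMiss)); // 60
console.log(shareOfMostCommonValue(farMiss));  // 60, although these results are far less alike
```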

Exploring fields of expertise other than programming tends to lead to already proven methods or calculations to solve problems that aren't always intuitively derived.

Exploring different fields of expertise

Finally, I remembered a concept in statistics (not one of my pet subjects, but some of its concepts have proven valuable, in some cases by applying them directly and in others by providing inspiration for an algorithm based on the concept itself rather than the mathematical formula).

My thinking first explored the concept of correlation. I realized that it's almost the right concept for the problem, but it's meant to express a different characteristic of the data being evaluated. In other words, this was an "island of false hope" (in creative problem-solving terms; in maths terms this would be a local minimum when you're looking for the global minimum).
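One way to see why correlation is only almost right: the sketch below computes the Pearson correlation coefficient, assuming that is the kind of correlation in question. The function name `pearsonCorrelation` and the sample numbers are my own, for illustration only. Correlation measures how two paired series move together, so it needs two lists, whereas the consistency problem has a single list of results.

```javascript
// Hypothetical helper: Pearson correlation between two paired series.
function pearsonCorrelation(xs, ys) {
    if (xs.length !== ys.length || xs.length === 0) {
        throw new Error('Both series must be non-empty and of equal length');
    }
    const mean = (list) => list.reduce((sum, value) => sum + value, 0) / list.length;
    const meanX = mean(xs);
    const meanY = mean(ys);

    let covariance = 0;
    let spreadX = 0;
    let spreadY = 0;
    for (let i = 0; i < xs.length; i++) {
        const dx = xs[i] - meanX;
        const dy = ys[i] - meanY;
        covariance += dx * dy;
        spreadX += dx * dx;
        spreadY += dy * dy;
    }
    // Yields NaN when either series is constant (no spread to correlate)
    return covariance / Math.sqrt(spreadX * spreadY);
}

// Two series that rise together correlate perfectly
console.log(pearsonCorrelation([1, 2, 3, 4], [2, 4, 6, 8])); // 1
```

Note that a perfectly consistent list (all values the same) has no spread at all, so correlation cannot even be computed for it; that is a hint that it describes a different characteristic of the data.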

I used this idea to narrow down the problem-solving process to stats and soon came across a concept I had used years back: the standard deviation.


Research concepts to find a solution

I was a bit rusty on stats 101, so I did a quick recap of the standard deviation formula (the concept I remember well, but I'm not going to remember the formula if I'm not using it daily). I worked through the maths only to the level I needed to create an algorithm to calculate it. The answer would give me the wrong part of the right answer: "By how much are the samples different in percentage terms?"
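For reference, this is the population standard deviation as it is commonly defined, for N samples x_i with mean μ. The notation is mine and is not taken from the original post:

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2}
```

The quantity under the square root is the variance; the standard deviation is its square root, which brings the result back to the same units as the samples.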

Now that I had the difference, the natural final step was to offset this difference from 100% to get the answer he was looking for: "By how much are the samples the same in percentage terms?"

The answer would give me the wrong part of the right answer: "By how much are the samples different in percentage terms?" This is okay, it's part of the solution, keep going...

Calculating standard deviation (JavaScript)


	/**
	 * Calculate the (population) standard deviation.
	 * @param {number} meanValue The mean value for the arrayOfValues
	 * @param {number[]} arrayOfValues The array of values for which to calculate standard deviation.
	 * @returns {number} The standard deviation, in the same units as the values
	 *                   (a percentage only when the values are themselves percentages).
	 */
	function calculateStandardDeviation(meanValue, arrayOfValues) {
	    // Weight of each sample: 1/N
	    let N = 1/arrayOfValues.length;
	    // Each sample's weighted squared distance from the mean
	    let sigma = function(value) { return N*(value - meanValue) * (value - meanValue); };
	    let sumTotal = function(previousValue, currentValue) { return previousValue + currentValue; };
	    // The sum is the variance; its square root is the standard deviation
	    return Math.sqrt(arrayOfValues.map(sigma).reduce(sumTotal, 0));
	}
	

Finally, write the code and test it

And now, finally, I could complete the whole solution for him. The above was worked out from reviewing the standard deviation article on Wikipedia.

The final solution (JavaScript)


    /**
     * Calculate the (population) standard deviation.
     * @param {number} meanValue The mean value for the arrayOfValues
     * @param {number[]} arrayOfValues The array of values for which to calculate standard deviation.
     * @returns {number} The standard deviation, in the same units as the values
     *                   (a percentage only when the values are themselves percentages).
     */
    function calculateStandardDeviation(meanValue, arrayOfValues) {
        // Weight of each sample: 1/N
        let N = 1/arrayOfValues.length;
        // Each sample's weighted squared distance from the mean
        let sigma = function(value) { return N*(value - meanValue) * (value - meanValue); };
        let sumTotal = function(previousValue, currentValue) { return previousValue + currentValue; };
        // The sum is the variance; its square root is the standard deviation
        return Math.sqrt(arrayOfValues.map(sigma).reduce(sumTotal, 0));
    }

    /**
     * Calculate the expected value given a range of values.
     * In other words, calculate the mean value assuming each
     * possible value has the same probability of occurring.
     * @param {number[]} arrayOfValues The list of numeric values.
     * @returns {number} The mean value.
     */
    function expectancy(arrayOfValues) {
        let sumTotal = function(previousValue, currentValue) { return previousValue + currentValue; };
        let u = arrayOfValues.reduce(sumTotal);
        // Assume each member carries an equal weight in expected value
        u = u / arrayOfValues.length;
        return u;
    }

    /**
     * Calculate consistency of the members in the vector or list.
     * @description sigma contains the standard deviation: by how much the values differ.
     *              We want to know by how much the values are the same, so we calculate the
     *              difference between 100% (all) and the standard deviation (the difference)
     *              to get the similarity. The result reads as a percentage only when the
     *              values are themselves percentages.
     * @param {number[]} arrayOfValues The vector or list of members to inspect for similarity
     * @return {number} The similarity of the members, as an offset from 100%
     */
    var consistency = function(arrayOfValues) {
        // The improved version using map reduce to simplify code expressions.

        // Step 1: Calculate mean value of the samples
        let meanValue = expectancy(arrayOfValues);

        // Step 2: Calculate the standard deviation of the samples
        let sigma = calculateStandardDeviation(meanValue, arrayOfValues);

        // Step 3: Offset from 100% to get the similarity
        return 100 - sigma;
    };

    // Illustrative sample sets (assumed values); the post does not show the original test data.
    const ar1 = [80, 80, 80, 80, 80];
    const ar2 = [79, 80, 81, 80, 80];
    const ar3 = [70, 80, 90, 75, 85];
    const ar4 = [40, 80, 100, 60, 95];
    let answer;

    console.log('\n\nOutput from second version:\n');
    answer = consistency(ar1);
    console.log(`Consistency for ${ar1} is ${answer.toFixed(3)} fixed to 3 decimal places`);

    answer = consistency(ar2);
    console.log(`Consistency for ${ar2} is ${answer.toFixed(3)} fixed to 3 decimal places`);

    answer = consistency(ar3);
    console.log(`Consistency for ${ar3} is ${answer.toFixed(3)} fixed to 3 decimal places`);

    answer = consistency(ar4);
    console.log(`Consistency for ${ar4} is ${answer.toFixed(3)} fixed to 3 decimal places`);
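Since the point of this section is to test the code, here is a minimal sketch of the kind of checks that would have caught the "above 100%" results early. It restates the calculation compactly, using the textbook population standard deviation (the square root of the variance). The names `meanOf`, `standardDeviationOf`, `consistencyOf` and `check`, and the sample sets, are assumptions of mine for illustration, and the values are assumed to be percentages between 0 and 100.

```javascript
// Compact restatement of the approach: mean, population standard deviation, offset from 100.
function meanOf(values) {
    return values.reduce((sum, value) => sum + value, 0) / values.length;
}

function standardDeviationOf(values) {
    const mu = meanOf(values);
    const variance = values.reduce((sum, value) => sum + (value - mu) * (value - mu), 0) / values.length;
    return Math.sqrt(variance);
}

function consistencyOf(values) {
    return 100 - standardDeviationOf(values);
}

// Tiny assertion helper so the sketch needs no test framework
function check(condition, message) {
    if (!condition) {
        throw new Error(`Check failed: ${message}`);
    }
}

// Illustrative sample sets (assumed values on a 0 to 100 scale)
const identical = [80, 80, 80, 80, 80];
const tight = [79, 80, 81, 80, 80];
const loose = [40, 80, 100, 60, 95];

// Property 1: identical values are 100% consistent
check(consistencyOf(identical) === 100, 'identical values should give exactly 100');

// Property 2: the result can never exceed 100
for (const sample of [identical, tight, loose]) {
    check(consistencyOf(sample) <= 100, `result above 100 for ${sample}`);
}

// Property 3: more spread means less consistency
check(consistencyOf(tight) > consistencyOf(loose), 'tighter results should score higher');

console.log('All consistency checks passed');
```

Checking properties like these, rather than tuning the code until a single sample set produces the expected number, is one way to stay clear of circumstantial programming.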
    

Marius Conradie
A freelance full stack web developer and solutions architect based in Johannesburg, South Africa.