Post No.: 0552
When it comes to making predictions, algorithms – at least decent ones that, if they use machine learning, have been trained on sufficiently accurate and complete datasets – beat or do no worse than the subjective impressions and intuitions of even trained individuals in their relevant fields, especially in low-validity (or low-predictability) environments and for long-term predictions where there’s lots of uncertainty.
Not all information that our intuitions rely upon is relevant (e.g. personal statements when predicting student grade outcomes). Experts can try to be too clever by thinking ‘outside of the box’ and considering complex combinations of variables that are irrelevant, when simpler combinations of relevant variables would be better.
Yet even when provided the prediction output from a decent formula or algorithm, people will often overrule it because they think they have additional insight. But these people are more often wrong than right. Humans are typically overconfident in their own instincts and will assign too much weight to their own personal impressions and too little weight to other sources of information, thus lowering predictive validity.
Of course, rare and decisive events that aren’t accounted for in a simple algorithm may make human predictions more accurate (e.g. if someone has received information that a competitor has broken their leg and will therefore not compete in a particular race at all) but the point is that such events are incredibly rare. Because of these outliers, algorithms can now and again lead to obviously absurd decisions, although these would present opportunities to improve them. Woof.
Computers also parse and follow their rules or instructions, as laid out by their software or computer programs, rigidly. Humans might misread or forget to carry out an instruction whilst computers won’t. This can however be a major problem if some code has been badly written (e.g. with typing errors) or has failed to account for certain events (e.g. the risk of stack buffer overflows). You could argue that these are still human programmer mistakes though.
And human decisions don’t escape absurdity either – humans can be irrationally inconsistent. When asked to evaluate the same information twice, they can give different predictions – even if a case is re-evaluated only a few minutes later! These inconsistencies could possibly be due to momentary priming and contextual effects that affect ‘system one’ (e.g. the sunlight warms the room that moment and makes one feel more optimistic). And because these are subconscious or unconscious effects, one will likely not recognise what has influenced one’s feelings and decisions, and so cannot mitigate them. They’re chaotic too – a slightly different circumstance could lead to a totally different judgement.
Meanwhile, algorithms are infallibly consistent: given the same information, they return the same prediction every time. Therefore, in low-validity, low-predictability domains where the evidence is fragmentary, such as recruitment, algorithms would serve final admission decisions better than human interviews. (Post No.: 0543 identified several flaws regarding traditional job interviews.) Generally speaking, true experts rely on data, whilst false ‘experts’ rely purely on intuition. These standardised formulas, algorithms, checklists or simple rules don’t always have to be complex or ‘optimally weighted’ either.
True experts do utilise their own past experiences and recognition of genuine patterns to make reasonably reliable predictions. But few hiring managers follow up on the candidates they recruited to check if their past predictions were good, or indeed follow up on those they rejected to see if they should’ve been rejected. Without feedback on the accuracy of our predictions, we cannot hope to detect genuine patterns.
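As a rough illustration of what following up could look like, here is a minimal Python sketch that logs each hiring prediction alongside what later happened and reports how often the two agreed. The record fields (`predicted_success`, `outcome`) and the function name `hit_rate` are made up for this example, and it assumes you can somehow learn how rejected candidates fared elsewhere, which in practice is often the hard part.

```python
def hit_rate(records):
    """Return the fraction of followed-up predictions that matched the outcome.

    Each record is a dict with two hypothetical keys:
      'predicted_success' - True/False, the call made at interview time
      'outcome'           - True/False once known, or None if never followed up
    Returns None when no prediction has been followed up yet.
    """
    # Only predictions with a known outcome can give us any feedback.
    checked = [r for r in records if r["outcome"] is not None]
    if not checked:
        return None
    hits = sum(r["predicted_success"] == r["outcome"] for r in checked)
    return hits / len(checked)


# Illustrative (invented) follow-up log covering hires and rejections alike.
follow_ups = [
    {"predicted_success": True, "outcome": True},    # hired, did well
    {"predicted_success": True, "outcome": False},   # hired, did poorly
    {"predicted_success": False, "outcome": False},  # rejected, struggled elsewhere
    {"predicted_success": False, "outcome": None},   # rejected, never followed up
]

print(hit_rate(follow_ups))
```

A log like this won’t say why a prediction went wrong, but without it there’s no way to tell a genuine pattern from a lucky streak.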
Relying on data and analytics, mathematics and metrics, is better than relying on gut feelings, or on factors like the reputations or existing salaries of prospective employees or sports players – ‘sabermetrics’ revolutionised player evaluation, recruitment, development and strategy in baseball, and its ‘Moneyball’ principles now influence many other team sports too, including relatively more unpredictable games like football/soccer. Success still won’t be guaranteed because luck is still a factor (e.g. season-ending injuries caused by reckless tackles), and the past doesn’t always predict the future, but understanding the numbers increases the probabilities of success. Ego, however, can make coaches and managers believe that their own instincts are more reliable than algorithms.
When interviewing candidates for a vacancy – statistical summaries of separately evaluated relevant attributes are better than global evaluations of a candidate. Specific information about a candidate’s life in their normal environment is better than an abstract evaluation of their mental or hypothetical life. So ask questions with objective answers about their past performance, such as ‘how punctual were you in your previous employment?’ or ‘how many different jobs have you previously held?’ These are better than questions with hypothetical answers like ‘if you were stranded on an island, what three items would you take?’ or ‘how will you adjust to moving home?’
Standardised, factual questions and scores/ratings delivered in a consistent order and manner can help combat the ‘halo effect’, where favourable/unfavourable first impressions can influence later judgements. After this disciplined collection of objective data, one can add a bit of intuitive judgement, but only afterwards, because this data should hopefully anchor one’s subsequent intuitions. Don’t simply trust your or anyone’s intuitions – yet don’t dismiss them completely either. (It’s like when utilising one’s unconscious to come up with creative ideas – one must first gather as much (factual) information as possible, then close one’s eyes and let the brain mull on this information to come up with some novel ideas or insights.)
To conduct a structured interview – first select about six traits that are prerequisite for success in the vacancy (e.g. technical proficiency, people skills). These traits should be independent from each other and you should feel that you could assess them reliably by asking a few factual questions. Next make a list of those factual questions for each trait and think about how you’ll score them (e.g. on a 1-5 scale, where you should have a clear idea of what ‘very weak’ or ‘very strong’ means). If you are to weight these attributes in terms of their importance rather than equally then do so in advance to avoid ad hoc biases. To avoid the halo effect, collect information on one trait at a time, scoring each before moving onto the next, with no skips or deviations. To evaluate each candidate, add up the six scores, then hire the person with the highest final score, even if your intuition is trying to tell you otherwise – resist your temptation to append ad hoc dimensions that’ll alter the rankings to the conclusion you expected. This process is better than what most interviewers do, which is to rely on an overall intuitive judgement such as ‘I looked into their eyes and liked what I saw’.
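The scoring step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post names just two example traits, so the other four trait names here are invented, as are the `Candidate` class and the `total_score` and `rank_candidates` functions. The weights default to equal and are meant to be fixed before any interview takes place.

```python
from dataclasses import dataclass

# Six traits chosen in advance. Only the first two come from the post's
# examples; the remaining four are hypothetical placeholders.
TRAITS = (
    "technical_proficiency",
    "people_skills",
    "reliability",
    "communication",
    "organisation",
    "learning_ability",
)

# Weights decided before interviewing anyone (equal by default).
WEIGHTS = {trait: 1.0 for trait in TRAITS}


@dataclass
class Candidate:
    name: str
    scores: dict  # trait name -> score on a 1-5 scale


def total_score(candidate, weights=WEIGHTS):
    """Weighted sum of trait scores; every trait must be scored from 1 to 5."""
    total = 0.0
    for trait in TRAITS:
        if trait not in candidate.scores:
            # No skips: every trait has to be scored for every candidate.
            raise ValueError(f"{candidate.name}: missing score for {trait}")
        score = candidate.scores[trait]
        if not 1 <= score <= 5:
            raise ValueError(f"{candidate.name}: {trait} score must be 1-5")
        total += weights[trait] * score
    return total


def rank_candidates(candidates, weights=WEIGHTS):
    """Return candidates ordered from highest to lowest total score."""
    return sorted(candidates, key=lambda c: total_score(c, weights), reverse=True)


# Invented example data.
alice = Candidate("Alice", {trait: 3 for trait in TRAITS})
bilal = Candidate("Bilal", {**{trait: 4 for trait in TRAITS}, "people_skills": 2})

for person in rank_candidates([alice, bilal]):
    print(person.name, total_score(person))
```

Because the traits and weights are module-level constants set before the interviews, there is no place in the calculation to slip in an ad hoc dimension afterwards, which is the whole point of the exercise.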
Intuition may be fine for making short-term predictions (and this makes up the majority of people’s personal experiences of any predictive successes they’ve ever made) but it is generally poor at making long-term predictions of a year or longer ahead because these are far more difficult. Even algorithms will do only modestly well, although better than humans overall. Even relatively well-experienced people won’t have experienced many long-term events to build up their intuitions regarding these kinds of events because, by their nature, such events take years to deliver their feedback rather than delivering it instantly. And the line between what a skilled person can do well (their genuine skill) and cannot do at all (their illusion of skill) isn’t always obvious, and not even obvious to them – they may know that they’re skilled but not necessarily the boundary of their skill (e.g. even the most skilled clinician doesn’t know everything, especially as medical science advances). So when we talk about ‘a skill’, we’re really talking about ‘a collection of skills’, where an expert will be proficient in some of them and not so much in others.
But, again, most people would rather trust in their own subtle, natural, dynamic and complex human judgements than a sterile, mechanical, static and synthetic formula or algorithm that only analyses a relatively few variables, especially when the decision affects human lives. Most people heuristically prefer things that appear naturally derived. (Sellers can increase sales by labelling a product as ‘all natural’ or ‘organic’ compared to something that isn’t labelled as such, even when they’re both the same in every other way.)
Most people currently also don’t like the idea that computers or robots can take over their jobs, so they’ll reject them vehemently. Highly-paid people, like top division football managers or company CEOs, might feel that they need to make the decisions themselves in order to justify their salaries. A clinician may prefer to carry out an intervention despite a formula or checklist saying ‘no’, just in case, because of the potential consequences of not doing it. This may sound prudent, but overall these unnecessary treatments cost time and money – time and money that could be spent on other areas in a health system to save more lives overall. It’s not just the effect of a mistake – the cause of a mistake matters to most people too. In this case, the comparison is between the error of a stone-cold algorithm and the error of a human who was trying their best, where the emotional intensity of the anticipated injustice will be automatically matched into a moral preference.
Relying on algorithms is becoming more commonplace and everyday though (e.g. for making book or music recommendations), and so more and more people now trust them or at least accept them in their lives. Many of us are ignorant of the fact that our work and leisure lives rely on so many algorithms! It’s often the case that when something works, it becomes invisible to us and we take it for granted – it’s only when something goes wrong that we take notice.
So wherever we can replace a human judgement with an algorithm, we should at least consider it. Various experts do currently debate the impact of automated algorithms on human jobs though – some think that people will just do different jobs and others think that lots of people will be left without jobs at all. Well, the fact that human experts can’t seem to even agree on the same predictions just fittingly highlights how capricious humans can be(!) That said, not all algorithms are created equal either.
In my little humble opinion as I learn more about the world – whenever something is being described as an ‘art form’, it only really means that we don’t quite fully understand the science behind it yet. Well if there are genuine patterns to be found in a particular domain then there will be reasons behind those patterns. Computers can beat human players at chess because the algorithms for playing and improving the probabilities of winning at chess have been figured out. Chess is a game that involves a limited-size playing space with tightly-restricted moves though, which makes it far easier to computationally model than complex real-world behaviours and environments – but ultimately everything in the universe could theoretically be accurately modelled if we could understand its full complexity. Everything in the universe is underpinned by mathematical structures (including all life). Or if there are no genuine patterns in a particular domain then humans would fare no better than making random guesses too.
Woof. They’re not always perfect but algorithms are rapidly improving at a rate that’s much faster than what genetic evolution can achieve. The real issues relate to which areas they should or shouldn’t be allowed to be used in (e.g. criminal and racial profiling), the quality of the data that’s fed to train them (because ‘crap in, crap out’), and – especially if they’re artificial neural networks – how we should regulate them when the way they reach their decisions can be so opaque. And companies and authorities, arguably, cannot pass blame onto their self-adapting algorithms because it’s their choice to use those algorithms and to feed particular sets of data for them to learn from in the first place.