Thursday 15 June 2006

Modelling.

As you may know, I am a computer programmer by trade. As you almost certainly know, lots of scientists these days — especially climatologists — draw conclusions about the real world from computer models. I have therefore compiled this handy list. It's a list of the questions you need to ask any scientist who has used a computer model to reach a conclusion — and I'm not just picking on the climate-change crowd here; they may be the most prominent in the news, but there are lots of other guilty parties out there in all sorts of scientific fields. If a scientist doesn't give confident and reasonable answers to these questions, take their conclusions with a handful of salt.

  • Who programmed the computer model?

  • Did the same person do both the programming and the science?

  • If not, how was the science communicated from the scientist to the programmer? Are you confident that the programmer fully understood the science?

  • If more than one person programmed the model, do they all have the same background in and approach to programming?

  • If they have different backgrounds or approaches, what did you do to ensure that their contributions to this project would be compatible and consistent?

  • What proportion of total programming time was spent on debugging?

  • Was all the debugging done by the same person?

  • If not, was there a set of rules governing preferred debugging methods?

  • If so, are you sure everyone followed said rules to the letter?

  • Did any of the debugging involve putting in any hacks or workarounds?

  • If not, could you pull the other one, which has bells on?

  • Is there any part of the program which just works even though it looks like it probably shouldn't?

  • Are there any known bugs in the computer hardware or operating system that you run your model on?

  • If so, what have you done to ensure that none of those bugs affects your model?

  • What theories did you base the model on?

  • What proportion of these theories are controversial and what proportion are pretty much proven valid?

  • What information did you put into the model?

  • Where did this information come from?

  • How accurate is the information?

  • Have you at any point had to use similar types of information from significantly different sources? Have you, for instance, got some temperature data from thermometers and some other temperature data from tree rings?

  • If so, what have you done to ensure that these different data sources are compatible with each other?

  • If you've done something to make different data sources compatible, did it involve using any more theories to adjust data? If so, see the previous questions about theories.

  • Where you couldn't get real-world information, what assumptions did you use?

  • What is your justification for those assumptions?

  • Do any other scientists in your field tend to use different assumptions?

  • Have any of your theories, information, or assumptions led to inaccurate predictions in the past?

  • If so, why are you still using them?

  • If they previously led to inaccurate predictions, do you know why?

  • If you think you know why they led to inaccurate predictions, what else have you done to test them before using them in this model?

  • How many predictions has your computer model led to that have been verified as accurate in the real world?

  • How accurate?

  • Has any other computer model used roughly the same theories, assumptions, and data as yours to give significantly different conclusions?

  • If so, do you know why the conclusions were different?

  • How much new information has your computer model given you?


Most of the time, programmers ignore most of these questions. But then, most of the time, programmers aren't asking the world's governments to force all their people into a lower standard of living.

Also, programmers are generally creating software which merely has to work well enough, because the whole point of what we're doing is to create tools, not to discover facts. Who cares whether Excel crashes now and then when it's so powerful most of the time? It is merely a tool, and it works. That's all you need.

But scientists aren't creating mere tools: they are trying to discover facts about the world, often about the future. They are trying to find out things that they would not otherwise know. This means that there is no way for them to verify their results until it is too late. When your target is to achieve something, faulty bits of information don't matter as long as you achieve it. When your target is to discover information, every single bit of faulty information pushes you further from your target.
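
To make that last point concrete, here is a minimal sketch in Python. It is a toy, not anybody's real model: the logistic map is a standard textbook equation, and the starting value, the size of the measurement error, and the names `logistic_step` and `run` are all made up for illustration. It runs the same little model twice, once from a "true" starting value and once from a measurement of it that is wrong by one part in ten billion, and records how far apart the two runs drift.

```python
def logistic_step(x, r=4.0):
    """One step of the logistic map, a standard toy model (not a climate model)."""
    return r * x * (1.0 - x)


def run(x0, steps):
    """Iterate the toy model from x0 and return every value it visits."""
    values = [x0]
    for _ in range(steps):
        values.append(logistic_step(values[-1]))
    return values


# Hypothetical inputs: the "true" starting value, and a measurement of it
# that is wrong by one part in ten billion.
STEPS = 100
true_run = run(0.2, STEPS)
measured_run = run(0.2 + 1e-10, STEPS)

# Gap between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(true_run, measured_run)]
largest_gap = max(gaps)

# First step at which the runs disagree by more than 0.1 (None if never).
first_bad_step = next((i for i, g in enumerate(gaps) if g > 0.1), None)

print(f"gap at step 0: {gaps[0]:.1e}")
print(f"largest gap over {STEPS} steps: {largest_gap:.3f}")
print(f"first step with gap > 0.1: {first_bad_step}")
```

The values in this toy always stay between 0 and 1, so a gap of 0.1 is a tenth of everything the model can say. Real models need not be this sensitive, but whether yours is or not is exactly the sort of thing the questions above are meant to find out.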

The software on your mobile phone is buggy. Yet you call someone and get through and talk to them. If you can understand each other, you've verified that the software works well enough; the bugs don't matter. But until you try to make a call, you don't know whether it works. The software Burger King use in their tills is buggy. Yet they usually charge you the right amount of money for your delicious Whopper and get near enough to balancing their books at the end of each month: so the software works well enough; the bugs don't matter. But until they use the tills and do their accounts, they don't know whether it works. The software used by a climatologist is buggy. They say that world temperatures will rise 2 degrees by 2120. And if they wait until 2120 and measure world temperatures and see that they have indeed increased by 2 degrees, then they'll know that the software works well enough; that the bugs don't matter. Until then, they won't know.

The last question on the list is a trick question, by the way. Either the answer is "None" or the scientist knows nothing about computers and should be ignored at all costs. No computer has ever given any human being any new information whatsoever, because computers are literally incapable of doing so.

As far as I can see, the usual answer is "Lots."
