In order to run the climate-modelling software more effectively, our second volunteer (Eric Raymond) bought a new motherboard that should roughly double the speed of his computer. It's a 2.66 GHz Intel Core 2 Duo. It is *not* a super-high-end quad-core hotrod; and thereby hangs a tale.
Some kinds of algorithms parallelize well - graphics rendering is one of the classic examples, signal analysis for oil exploration is another. If you are repeatedly transforming large arrays of measurements in such a way that the transform of each point depends on simple global rules, or at most information from nearby measurements, the process has what computer scientists call "good locality". Algorithms with good locality can, in theory, be distributed to large numbers of processors for a large speedup.
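To make "good locality" concrete, here is a minimal sketch in Python. The smoothing rule and the names `smooth_point`, `smooth_chunk`, `smooth_serial`, and `smooth_parallel` are illustrative assumptions, not anything taken from the climate code. Each output point depends only on itself and its two nearest neighbours, so the array can be cut into chunks and each chunk handed to a separate worker without changing the answer.

```python
from concurrent.futures import ThreadPoolExecutor


def smooth_point(data, i):
    """Average point i with its immediate neighbours (clamped at the edges)."""
    lo = max(i - 1, 0)
    hi = min(i + 1, len(data) - 1)
    return (data[lo] + data[i] + data[hi]) / 3.0


def smooth_chunk(data, start, stop):
    """Transform one slice of the output; reads only nearby input points."""
    return [smooth_point(data, i) for i in range(start, stop)]


def smooth_serial(data):
    """Reference version: one worker walks the whole array."""
    return smooth_chunk(data, 0, len(data))


def smooth_parallel(data, workers=4):
    """Split the output range into chunks and compute them independently."""
    n = len(data)
    step = max(1, (n + workers - 1) // workers)
    bounds = [(s, min(s + step, n)) for s in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: smooth_chunk(data, *b), bounds)
        return [x for part in parts for x in part]


if __name__ == "__main__":
    measurements = [float(i % 7) for i in range(20)]
    print(smooth_parallel(measurements) == smooth_serial(measurements))
```

The point of the sketch is the independence of the chunks, not the speed. In standard CPython, threads doing pure-Python arithmetic will not actually run faster because of the global interpreter lock; getting a real speedup means handing the chunks to separate processes or to compiled code.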
Some algorithms are intrinsically serial. A good example is the kind of complex, stateful logic used in optimizing the generation of compiled code from a language like C or Fortran.
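A compiler's optimizer is too big to show here, so as a toy stand-in (an assumption for illustration, not the compiler case itself) consider a chain in which every step consumes the result of the step before it. The function name `hash_chain` is hypothetical. There is no known way to compute step 1000 without first computing step 999, so extra cores have nothing to do.

```python
import hashlib


def hash_chain(seed, steps):
    """Apply SHA-256 repeatedly; each step needs the previous step's output."""
    state = seed
    for _ in range(steps):
        state = hashlib.sha256(state).digest()
    return state


if __name__ == "__main__":
    # The second half of the work cannot begin until the first half is done:
    first_half = hash_chain(b"start", 500)
    print(hash_chain(first_half, 500) == hash_chain(b"start", 1000))
```

Contrast this with a transform that has good locality: there, any chunk can be computed at any time, while here the only valid order is the original one.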
Sometimes, you can artificially carve an intrinsically serial problem into chunks that are coupled in a controlled way - for example, to get faster compilation of a program linked from multiple modules, compile the modules in parallel and run a link pass on them all when done. This approach requires a human programmer to partition the code up into module-sized chunks and control the interfaces carefully.
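The compile-then-link pattern can be sketched as follows. This is a simulation under stated assumptions: `compile_module`, `link`, and `build` are hypothetical stand-ins that invoke no real compiler or linker, and a "module" is just a name plus a string of symbols. What it shows is the shape of the work: the per-module step runs in parallel, and the link step has to wait for every module to finish.

```python
from concurrent.futures import ThreadPoolExecutor


def compile_module(source):
    """Stand-in for a compiler: turn one module's source into an 'object'."""
    name, body = source
    return {"name": name, "symbols": sorted(set(body.split()))}


def link(objects):
    """Stand-in for a linker: needs every object, so it runs once, at the end."""
    table = {}
    for obj in objects:
        for sym in obj["symbols"]:
            # First module to define a symbol wins in this toy model.
            table.setdefault(sym, obj["name"])
    return table


def build(sources, workers=4):
    """Compile the modules independently, then run a single link pass."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        objects = list(pool.map(compile_module, sources))  # order preserved
    return link(objects)


if __name__ == "__main__":
    sources = [("a.c", "main helper"), ("b.c", "helper util")]
    print(build(sources))
```

This is roughly what a parallel build such as `make -j` does with real compilers. Note that the partitioning into modules was done by a human ahead of time; nothing in the sketch discovers the chunk boundaries for itself.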
The kind of math used in climate modeling has some parts that have good locality (and thus could theoretically parallelize well) and others that don't. Unfortunately it's difficult to capture any benefits from throwing the parts with good locality onto multi-core machines, because recognizing that locality and using it to do automatic partitioning is hard.
Here's what "hard" means: computer scientists have been poking at the problem for four decades, and parallelizing compilers are still weak, poor makeshifts that tend to be tied to specialized hardware, require tricky maneuvers from programmers, and don't work all that well even on the limited categories of code for which they work at all.
What it comes down to is that if you're compiling C or Fortran climate-modeling code on a general-purpose machine, each model run is going to use one core and one only. Two cores are handy so that one of them can run flat-out doing arithmetic while the other does housekeeping and OS stuff, but above two cores diminishing returns start to set in pretty rapidly. By the time you get to quad-core machines, two of the cores will be space heaters.
This is good news, in a way. It means really expensive hardware is pointless for this job - or, to put it another way, a modern commodity PC is already nearly as well suited to the task as any hardware anywhere.
Now, on to downloading the code to look more closely at what it does and to poke at it. Yeah, Eric was going to upgrade his computer soon anyway, and yes, this was an excuse...