When you say "computer models", you do mean computer models that have made predictions into the future, and we've waited around and seen if these predictions came true, right? And that we've done this several times?
You're not talking about computer models which have only predicted the past based on the further past (i.e. predicted known observations), right?
This sort of post is pretty rare in the blogosphere (which is more frequently a debating club than an avenue for inquiry), and I think it reflects well on your character and the site.
You actually believe that it is "much much harder" to look at current conditions and predict whether cloud formations will propagate rain in 12 hours or not, yet climate is easier to predict, even as the results of computer models that you believe in can not take 1975 data and predict today.
Climate on this planet has always been very variable. The reason we are here today and not, say, 55 million years ago is because we are in what climatologists call a climate optimum. Now we have politicians attempting to seize power (is seize too strong a word?) by the old trick of predicting the future and taking responsibility for it, the way the Connecticut Yankee in King Arthur's Court seized power when he could predict the onset of a solar eclipse. He could not cause an eclipse, and you and Algore and seventy million credulous fools cannot convince the rest of us that you are the wizard either.
Well, guess what? Reducing or increasing CO2 levels by a fraction of one percent is NOT the cause of Earth's climate variability. Changing the rate of increase of emission of CO2, or even causing a slight decline in same, will, perhaps, be an issue into office for Algore and some others, but if we indeed wish to alter the future climate of our planet, someone, somewhere better start studying that subject. But NOOOOO. Instead of trying to figure out how to take control of a naturally variable climate, we study ways to reduce CO2 emission. A classic misdirection play. Merely punishing the industrialized nations by forcing us to force ourselves to become incrementally poorer, well, let's just say that it has not worked yet. Kyoto signers have not done it yet, and the USA, China and India will never do so, so you guys better get yourselves a better plan.
This would be funny if it were not so serious. Like Bush v. Gore, our political classes never give us a choice between real alternatives. When they tell us that the Earth can be saved by raising CAFE standards, you gotta realize that you are being lied to, by both sides of the debate. This entire global warming debate reminds me of one of the sayings of Charles de Gaulle: when asked what his political philosophy was, he replied "Anticipate the inevitable, and support it."
Actually, I'm with Aziz on his assertion that predicting weather is harder than predicting climate.
For the statistically inclined, just consider applying the Law of Large Numbers to a random distribution and realize that climate is the average of samples and weather is a particular sample.
For the rest: sure, as Michael points out, it's easy to predict the weather a short time in advance, but it's even easier to predict climate (which works in generalities) a short time in advance. Which of the following two predictions is more likely to be accurate?
1) The Midwest will be hot this summer with occasional afternoon storms.
2) On August 3rd, Des Moines will have a high of 83 and a quarter inch of rain fall.
Climate, being the average of a large number of weather events, will always be easier to predict than an individual weather event if you give them the same time scale to work with.
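The Law of Large Numbers point can be sketched with a toy simulation. This is only an illustration under loud assumptions: daily highs are drawn independently from one fixed normal distribution, and the names and numbers (`mean_high`, `daily_sd`, 90 days) are made up for the example. Whether real climate has a fixed underlying mean is exactly what the rest of the thread argues about.

```python
import random
import statistics

def spread_of_day_vs_season(days=90, seasons=2000, mean_high=83.0,
                            daily_sd=8.0, seed=0):
    """Return (std dev of a single day's high, std dev of the seasonal mean)."""
    rng = random.Random(seed)
    single_days, season_means = [], []
    for _ in range(seasons):
        # Assumption: each day is an independent draw from the same distribution.
        highs = [rng.gauss(mean_high, daily_sd) for _ in range(days)]
        single_days.append(highs[0])                  # "weather": one sample
        season_means.append(statistics.fmean(highs))  # "climate": the average
    return statistics.pstdev(single_days), statistics.pstdev(season_means)

day_sd, season_sd = spread_of_day_vs_season()
print(f"spread of one day's high: {day_sd:.2f}")
print(f"spread of the seasonal mean: {season_sd:.2f}")
```

Under these assumptions the seasonal mean varies far less from season to season than any single day does, which is why the Midwest prediction is the safer bet than the Des Moines one.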
However, the point that shouldn't be lost in this discussion is that the climate models being used to make these drastic future predictions and as a basis for drastic economic changes fail the most basic simulation test - given past inputs, can they reproduce observed results?
I can't stress enough how fundamental a methodological flaw it is to be claiming results from unvalidated models. Being a wireless guy, the first test that I use when determining what channel model to apply is comparing statistics from the simulated model against measured statistics. The model could be derived from first principles (e.g., Ricean fading) or blindly from the measurements themselves (e.g., Hidden Markov Models), but if the model doesn't match up with measurements, then simulations based on that model are considered invalid.
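A minimal sketch of that kind of check, comparing first- and second-order statistics (mean and variance) of a simulated channel against measurements. Everything here is assumed for illustration: the "measured" samples are themselves synthetic stand-ins, the 10% tolerance is arbitrary, and `rician_envelope` and `matches` are hypothetical helper names, not anyone's actual validation procedure.

```python
import math
import random
import statistics

def rician_envelope(n, los, sigma, rng):
    """Envelope samples: a line-of-sight component plus complex Gaussian scatter."""
    return [math.hypot(los + rng.gauss(0, sigma), rng.gauss(0, sigma))
            for _ in range(n)]

def matches(simulated, measured, rel_tol=0.10):
    """True if the mean and variance both agree within rel_tol (arbitrary choice)."""
    m_sim, m_meas = statistics.fmean(simulated), statistics.fmean(measured)
    v_sim, v_meas = statistics.pvariance(simulated), statistics.pvariance(measured)
    return (abs(m_sim - m_meas) <= rel_tol * abs(m_meas)
            and abs(v_sim - v_meas) <= rel_tol * v_meas)

rng = random.Random(7)
# Stand-in for field measurements (synthetic, for the sake of the example).
measured = rician_envelope(20000, los=2.0, sigma=1.0, rng=rng)
# Candidate model A: Ricean with the same parameters.
model_a = rician_envelope(20000, los=2.0, sigma=1.0, rng=rng)
# Candidate model B: no line-of-sight term (Rayleigh), the wrong model here.
model_b = rician_envelope(20000, los=0.0, sigma=1.0, rng=rng)

print("model A accepted:", matches(model_a, measured))
print("model B accepted:", matches(model_b, measured))
```

A real validation would look at more than two moments (autocorrelation, level-crossing rates and so on), but the principle is the same: a model that cannot reproduce the measured statistics does not get used for simulation.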
Based on reports cited in the previous thread and what I've read elsewhere, I am simply dumbfounded that so many climate model predictions have passed peer review without model validation, particularly at least for first and second order statistics. And whatever state of shock is beyond dumbfounded is what I am that massive policy shifts are being insisted upon when the models are so inaccurate.
Fortunately, this problem appears to have been noted by the climate community (see for example this link where researchers are attempting to improve their models, though the errors are far too great to be useful). So I expect that over the coming years, we'll have better models and once the models are better, I'll feel much more comfortable with any proposed solutions. (I also believe that the investment of five or so years cleaning up the models will be well worth it, assuming that's where the effort is actually placed.)
Because for any dataset, you can make models that predict it. Every set of data, even if it's randomly generated, has patterns in it. Most data sets have patterns which mean nothing. When all you do is predict what's already known, you might just be picking up on meaningless patterns. You can only start to differentiate between meaningful patterns and meaningless patterns by using new data that wasn't available when the model was made.
There's a related truism which applies to mathematical modeling: a high enough order polynomial will fit any data set. It's actually easy, for any discrete data set, to find a polynomial which will yield all of the data in your dataset. The real test is to then have your polynomial predict data points that aren't in your set, and get new data, and see if it works. (E.g. if your data is being generated by something cyclic, then your polynomial is guaranteed to fail fairly quickly, and you'll see that only with new data.)
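The polynomial truism is easy to try out. The sketch below uses Lagrange interpolation, which is one standard way to build a polynomial through every point of a discrete data set. The data is an assumed toy example: eight samples of a sine wave standing in for "something cyclic".

```python
import math

def lagrange(xs, ys):
    """Return the unique polynomial of degree < len(xs) through all (x, y) points."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

# Known data: a cyclic process sampled at x = 0..7 (toy data).
xs = list(range(8))
ys = [math.sin(x) for x in xs]
poly = lagrange(xs, ys)

# The fit reproduces every known point...
in_sample_error = max(abs(poly(x) - y) for x, y in zip(xs, ys))

# ...but falls apart on points that were not in the set.
new_xs = [8, 9, 10, 11, 12]
out_of_sample_error = max(abs(poly(x) - math.sin(x)) for x in new_xs)

print(f"worst error on known points: {in_sample_error:.3g}")
print(f"worst error on new points:   {out_of_sample_error:.3g}")
```

The sine never leaves the range -1 to 1, yet a few steps past the known data the polynomial is off by far more than that whole range. Only the new points reveal it.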
The problem with modeling complex systems is that they are complex, and unless you completely understand the system, you can never know if you've actually accounted for all of the influences on the system.
For example, it's unlikely that the computer models being used accurately predict how plants will respond to a hotter earth because that's going to depend on the actual species on our planet; they could all die off or some of them could thrive in a carbon-depleting way that drives us into another ice-age (unless we all start flying on chartered jets). The reaction of plants to their new environment may or may not be significant; that depends on things that we almost certainly don't know. We're far from having a complete catalog of all plants that live on the planet.
There's no way to know if a computer model has properly taken these plants, which might exist, into account. Cloud cover reducing sunlight in the right places, increased phytoplankton growth cooling the earth by converting incoming solar radiation into chemical energy (as opposed to it sticking around as heat energy), or a subspecies which grows much faster starting to dominate in a hotter climate: these might all happen. We don't know if such things exist (but the biological world yields to nothing in complexity; it would hardly be surprising if such surprises were in store for a hotter world). The only way to find out if the computer models accurately model them is to run the computer models against data that the models were not built for.
This has been the bane of modeling in general; unless your model is built on a complete understanding of all of the mechanisms involved, models tend to do well on the data sets that you developed them on and not so well when applied to the real world. Models are by no means useless; they just need a lot of testing.
There's a reason why engineers still build prototypes after testing their designs in CAD software, though CAD software is getting better all of the time, and its modeling is increasingly correct. CAD software gets an awful lot of testing against data sets that weren't available during the design. (I'm using the term CAD software to include all of the testing components where you can assign material strengths and loads and so forth and then have the software model how the thing will behave under stress.)
Predicting weather is also subject to a lot more testability than predicting the climate is. If I predict that it's going to rain in my neighborhood on Thursday, we can all tell pretty quickly.
If I predict that the average summer temperature of my neighborhood is going to go down by 1 degree Celsius over the next hundred years, very few people are going to be around to remember my prediction. And if I'm wrong, it probably won't be by much, percentage-wise, anyhow.
The other thing about predictions which are the sum of unpredictable things is that in many cases you don't get all that much precision. Presidential votes are the sum of an awful lot of unpredictable decisions, and people get that sum wrong pretty frequently.
You're right, though, that summing unknown quantities can be successful when you have some reason to suspect a bias, e.g. quantum physics.
Whether climate models involve a thorough-enough understanding of all of the biases (and the biases that will crop up in reaction to the biases), is less... certain.
Plus, predicting the weather is pretty easy if you don't constrain yourself to the details. E.g. the temperature on Thursday near my apartment will be between 270K and 310K. The atmosphere will not shoot up into space for a day. Humidity will be between 0% and 100%, with no super-saturation or massive flooding (we live about 2000' above sea-level with no major rivers nearby).
The trick to predicting the future, as most good psychics know or intuit, is to make general predictions sound specific.
The only way that a model can train on meaningless patterns is if you allow the validation and training sets to mingle. If you build a model of data from 1890-1955, and then predict 1959-1975 (or whatever), then this is just as valid as building a model using data from 1940-2005 and predicting 2009-whenever. The only difference is that in the former case you have instant validation on whether your model is correct or not; in the latter case you'd have to wait a few years. Just because the data is already recorded doesn't have anything to do with anything. In either case your "predictions" could be based on meaningless trends; that's a risk you run whenever you try to build a predictive model. Which is why climate researchers generally try to restrict their variable sets to meaningful parameters with some scientific justification for how such a variable could affect climate patterns.
What you say about knowing all of the variables in the system is obviously true, but it has no bearing on using past versus future data. So long as you make a model based on data from some date range and then predict patterns for a separate date range, it is totally irrelevant where the date ranges fall with respect to the present.
You're right, if the people building the model in question actually don't know the data that they're going to test on. The relevance of where the data sets fall with respect to the present is that future data is something we can guarantee the model-builders don't know. To what degree the model-builders actually don't know historical data that they're using as a test is much harder to determine.
E.g. I have a hard time believing that climate researchers building a predictive model would actually be completely ignorant of the data for 1940-2005. Absent that complete ignorance, you have some mingling. Especially since we're not talking about very big changes, that can easily be significant. 0.1C warming per decade can be the result of a very subtle bias.
Worse, every time you test a model which fails and then build a better one, you're tainting the validation data set. Every time you find out that your model was off by a certain amount, and go back and fix your model so that it won't be off by that amount (and you'll do so because you'd have to be an idiot to try again with a model that will fail in the same way), you've incorporated the validation data set into the construction of your model. I guess that they could have clean-room tested the models and got only a pass-fail grade, but does anyone in climate research actually use this sort of rigorous, horribly inefficient approach?
Absent that sort of double-blind pass-fail approach, it would also work out if the researchers were in total ignorance of all climate data between 1940 and 2005 (for example they had no idea whatsoever that the models are supposed to predict warmer by the end), and happened to get their model right on the very first try. Then all would be well.
But I don't think that these sorts of fanciful scenarios are sufficiently realistic as to be worth considering. Absent them, all testing on historical data will involve feedback based on that data to get it right. For example, how many people who tested against 1940-2005 tested before they had a model which at least predicted the rough outline of what they knew it to be? How many people bothered testing a model which showed uniform global cooling?
The value of future data is that it avoids even good-faith cheating, whereas it's very hard to clean-room engineer something when everyone who might build the thing is already tainted with knowledge.
First, I didn't say anything about the output of stars, although any pat on the head is nice!
Second, I mis-spoke. I should have said climate -not weather- models. As far as I know, climate models can't take a given base state (say 1950), go 30 or 50 years into the "future" (our past) and end at an accurate forecast of what actually happened in 1980 or 2000.
Aziz may know of such models, but I haven't heard of 'em.
Finally, I feel very few people completely reject peer review, etc. It is neither extremely cynical nor conspiracist to state that some politics (in this case, as explained in my last post, politics is used in the more generic sense of groups of humans with conflicting goals trying to achieve said goals) exists in the scientific community.
If Aziz has problems with this statement (which he dismisses as a "cop out" without any sort of support except personal experience in a single field), I would direct his attention to the regular and flagrant practices of folks such as Scientific American, who have engaged in politics (in all senses of the word) for over 30 years, pushing for specific agendas and ignoring papers which challenge said agendas. I also cite groups such as the Union of Concerned Scientists, a political action group if there ever was one! Check out their website.
Anyone who wishes to do so will find it quite easy to discover all sorts of political influences in the scientific community. While Aziz can dismiss (again, merely by assertion) the reality by claims of conspiracy, he would do better to prove that the scientific community is free of any political bias.
I think we're talking past each other here. I'm assuming the researchers know the climate data for all times in the past for which there is data. But the researchers aren't there strenuously figuring out each parameter in their model themselves. There are efficient and neutral computer algorithms available for that. The idea is that the researchers feed the algorithm data from one set, the so-called training set. The algorithm then smartly tries to determine which variables are varying together, and puts together trends and patterns based on the training set. The researchers then use this model to formulate predictions for a second set of data, the so-called validation set. Therefore, for the purposes of this example, you can see that an honest researcher will create independent training and validation sets. So the computer, when building up its model, is truly in the dark about the time periods the researcher is leaving out for prediction.
In other words, the computer operates like a black box. The researcher feeds in data corresponding to years A to B. The black box spits out data corresponding to its prediction for years C to D (where both C and D are greater than A and B). The researcher then takes this data and compares it to the actual recorded data for years C to D and determines if the model is any good.
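The black-box procedure can be sketched in a few lines. This is a toy under stated assumptions: the temperature series is synthetic (a made-up trend plus noise, not real records), the "model" is just a least-squares straight line, and the date ranges are the 1890-1955 and 1959-1975 ones used as an example earlier in the thread.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

rng = random.Random(1)
years = list(range(1890, 1976))
# Synthetic "record": an assumed 0.01 degree/year trend plus random noise.
series = {y: 0.01 * (y - 1890) + rng.gauss(0, 0.1) for y in years}

# Years A to B go into the box; years C to D are held back.
train_years = [y for y in years if y <= 1955]
valid_years = [y for y in years if y >= 1959]

slope, intercept = fit_line(train_years, [series[y] for y in train_years])

# Compare the box's predictions for C..D against the recorded values.
squared_errors = [(slope * y + intercept - series[y]) ** 2 for y in valid_years]
rmse = (sum(squared_errors) / len(valid_years)) ** 0.5
print(f"fitted trend: {slope:.4f} per year, validation RMSE: {rmse:.3f}")
```

The fitting code never sees the held-back years. What the sketch cannot show is whether the person choosing the model form had already seen them.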
Hope that made some sense. In any case, intermingling the training and validation sets will, as you rightly say, cause perfect results based on spurious trends that won't be borne out in other, independent validation trials. Therefore any scientist who's worth anything will be careful to prevent this from happening. But that doesn't limit him to only prediction on unknown-to-him data. It only has to be unknown to whatever's building the model. In this case, the computer.
Aziz, that's some high-quality research you must have been doing to come up with all this. I studied meteorology for only a semester in the early 1950s, so I have just enough knowledge to avoid confusing the two.
Now you have me wondering whether or not a temporary fluctuation in solar radiation is responsible for trends in Earth warming and cooling over long periods of time. Knowing little about heat output from stellar bodies, I would have to depend on expert judgement. Perhaps such as yours.
If your assumptions are correct in this matter, it could alter the endless arguing over Earth warming and cooling. On the other hand, are politically motivated arguments ever settled by science, hard data and logic?
As far as I know, climate models can't take a given base state (say 1950), go 30 or 50 years into the "future" (our past) and end at an accurate forecast of what actually happened in 1980 or 2000.
No 'model' produces the same result as reality, because by definition models are simple when compared to reality. But models can predict where natural processes are LIKELY to go, and in that case, yes, we have models that can take data and show similar results to what happened 30 years later in climate. They predicted El Niño and its consequences, though that was over a shorter time span, just as one example.
We model nuclear reactions, ballistics, medical procedures, chemical reactions, the start of the universe; all of them prove enormously useful in your life every single day, from national security to how your car runs. None of them equate to reality though, they just approximate it.
If you are expecting exactitude, especially when modeling something as inherently chaotic as a thermodynamic system, you just won't understand the science enough to come to reasonable conclusions.
Hurricane prediction is amazingly good. Of course not perfect nor omniscient; we thought Rita was going to hammer Galveston when at the last minute it turned north. But the accuracy steadily increases and the warning window steadily gets longer as well.
Also you don't need to model the entire universe at one shot; you just need to model the major processes and see whether the result gives you the right trends. It's like when you fit data points to a line; the final fit doesn't ever actually intersect the data, but it does go along the least-squares path that is closest to all of them collectively.
"Also you don't need to model the entire universe at one shot; you just need to model the major processes and see whether the result gives you the right trends."
The problem with that is that, in the grand scheme of things, climate change is very gradual, especially the kind which is supposedly caused by CO2.
Errors are frequently found in the functions or parameters used in models which swamp the supposed amount of global warming in terms of magnitude. Doubling CO2 is widely believed to mean an increase of something on the order of 2.7 W/m^2 of radiation reaching the surface of the earth. Yet the models often calculate the effects of, say, cloud cover or water vapour feedback in ways which are off by larger amounts than this.
Most modellers--the honest ones, anyway--will admit there are plenty of processes not fully understood which could have effects of that magnitude or larger. Given that, how can we expect models to accurately predict the consequences of changes in CO2 levels, solar brightness, etc. - changes which are small in the grand scheme of things?
Sure, but the problem isn't escaped because you keep fine-tuning the model until it produces the results which match your available data, regardless of how many intermediate steps you introduce.
The point is that if the researchers keep fiddling with something — anything — until the model accurately predicts the validation data, then what you're doing is training to the validation data.
Now, there will undoubtedly be a component which is trained on a different data set, but that's not the entire process. So long as researchers make a model and throw out all of those that don't predict the validation set (i.e. so long as they develop the model, rather than pull it out of thin air correctly, the very first time), you're training to it in a meta-training sense; you're actually evolving your models with a selection criterion to predict that validation data from that training data.
And that's why it's better by far to validate against new data. When you are constantly varying the validation data, you can't be evolving a model to the eccentricities of one bit of data.
Again, this isn't a function of how the model actually works. Regardless of whether the model trains on data, has magic values plugged in, or is created by a random number generator, if it's being validated against a static data set then it's being evolved to that particular data set. So long as you'll throw out any model which doesn't predict the validation data better than what you've got now, you're evolving a system which might be predicting your validation data by luck. It doesn't matter what the researcher is changing; so long as they change anything which has the effect of changing the prediction, this problem comes about.
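The "evolving to the validation set" effect can be shown with the most extreme case mentioned above: models created by a random number generator. This is a toy sketch, not a description of how any climate model is actually built. The "truth" is a made-up sequence of coin flips (say, warmer or cooler than the previous year), and each candidate model is just another sequence of coin flips with no skill at all.

```python
import random

rng = random.Random(42)
N_VALID, N_FRESH, N_MODELS = 20, 200, 2000
N_TOTAL = N_VALID + N_FRESH

# Made-up "reality": 20 points used for validation, then 200 points of new data.
truth = [rng.choice([True, False]) for _ in range(N_TOTAL)]

# Candidate models with zero skill: every prediction is a coin flip.
models = [[rng.choice([True, False]) for _ in range(N_TOTAL)]
          for _ in range(N_MODELS)]

def accuracy(model, start, stop):
    """Fraction of points in [start, stop) where the model matches the truth."""
    hits = sum(model[i] == truth[i] for i in range(start, stop))
    return hits / (stop - start)

# Keep whichever model predicts the static validation set best.
best = max(models, key=lambda m: accuracy(m, 0, N_VALID))
valid_acc = accuracy(best, 0, N_VALID)
fresh_acc = accuracy(best, N_VALID, N_TOTAL)

print(f"accuracy on the validation set: {valid_acc:.2f}")
print(f"accuracy on new data:           {fresh_acc:.2f}")
```

The surviving model looks impressive on the validation set purely because it was selected on it; on data that played no part in the selection it is back to coin-flipping.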
plants will respond to a hotter earth because that's going to depend on the actual species on our planet; they could all die off or some of them could thrive in a carbon-depleting way that drives us into another ice-age (unless we all start flying on chartered jets). The reaction of plants to their new environment may or may not be significant; that depends on things that we almost certainly don't know. We're far from having a complete catalog of all plants that live on the planet.
There's no way to know if a computer model has properly taken these plants, which might exist, into account. Cloud cover reducing sunlight in the right places, increased phytoplankton growth cooling the earth by converting incoming solar radiation into chemical energy (as opposed to it sticking around as heat energy), or a subspecies which grows much faster starting to dominate in a hotter climate: these might all happen. We don't know if such things exist
To me, this is the very crux of this issue. I find it eminently more likely - a downright certainty, in fact - that the above scenario, or something like it, is exactly what will happen. To think that this giant planet of ours is so fragile that our little CO2 contribution, or anything else for that matter, has such a terrible effect, is the HEIGHT of human arrogance. The earth will no doubt adapt, survive, and thrive. If a couple species aren't around to see it, so be it. If Homo sapiens is one of them, well, tough shit.
I have ZERO confidence that anything can be done to intentionally affect the earth's climate, especially when 80% of the world is living in third world conditions and clearly has designs on industrialization or at least modernization. The US can go absolutely hog wild reducing pollution and CO2 emissions and whatever else, but countries like India and China not only won't reduce their emissions, it is a virtual certainty that they will increase them 2 or 3-fold.
"I have ZERO confidence that anything can be done to intentionally affect the earth's climate, especially when 80% of the world is living in third world conditions and clearly has designs on industrialization or at least modernization."
Several years ago I read a paper which concerned reduction of insolation by launching huge mylar sheets into orbit, and manipulating them to provide incremental shadow, to affect weather and climate. This was before the warming debate, and the knock on the deal was that we needed more warming, as an ice age was approaching. Since all science is political, the idea vanished.
If altering weather is the desired result, we would approach the subject from that angle. If disarming the West economically is the desired goal, one could go about slowing down industrialization by attacking, say, fossil fuel use, while blocking non-emitting power sources, like nuclear.
Most weather alteration research has been wound down (as far as I know) due to tort liability issues. If we make it rain in one area, another area will sue when they experience drought. If we change the course of a hurricane, to save one city, the city next door, that gets hit anyway, will sue, claiming that it would never have hit them without interference. If the people in power really believed that we were in trouble, these would be trivial obstacles to taking action to save the planet. Since they believe no such thing, they play with "global warming" in order to achieve political power. I will know that Al Gore believes his own lies about CO2 when he advocates clean, safe, nuclear power, and plug-in electric cars. As long as he is against the only proven, non-emitting, and inexhaustible power source, I know him to be the lying fraud that he so clearly is.
I agree that predicting the weather is different than predicting the climate. But the criticism I've read of the current climate models and their use in global warming is that even using the best climate models we have, the margin of error for predicting climate 50 or 100 years out is +/- several degrees C. (i.e. take all the climate data from 50 years ago, plug it in to the best climate model we have, and it will come within +/- 3 degrees C of the current climate.)
In other words, the model's predicted rise in temperature due to global warming is less than the margin of error in the model. Which means it's statistically meaningless.
Sadly, I don't have links for this. I'm typing from memory. If someone has links defining the margin of error for the current climate models vs historical data, I'd like to see them. (And if I'm wrong about the margins of error, I'll apologize and retract.)
The future is a static data set, too. We just don't know what it is yet. Obviously training, fiddling, and retraining is not a valid way to go about generating a model regardless of whether you use known past data or wait until you know some future data. The problem is the same in either case.
I'm not taking a position here on the right or wrong way to create computer models, but I am saying that once you've recorded the data, it is then static. Thus any data a computer is training on is going to be static. Obviously as new information becomes available, that new information can be put into the model, and no one here is seriously suggesting that we accept as truth a model which accurately predicts past data but inaccurately predicts future data. My point is that your distinction between future and past data is illusory. If you made a model that predicted climate for 2005 accurately, obviously you would test to see if it did just as well at predicting 2007. But at that point, 2007 is simply another validation set. There's nothing special about it occurring after you generated the model. And if a model trained on data from the 60s accurately predicts data from the 70s, then it is by definition reporting on meaningful trends, since the two datasets are independent.
Whether the future is static is a philosophical question. What's relevant is that the future (1) is unknown and (2) contains a lot more data than the pittance we have now.
The thing about using future data is that you'll always be using different data. This is the crux of #2. The future constantly unfolds new data for you to try your model on. It's constantly giving you new data in which illusory patterns disappear, leaving only the meaningful patterns.
You've got to get past the issue of training data; obviously the training data is going to be static since you can only get it once it is. The point is what you validate on. And the point isn't about how good the model is, but about how trustworthy the model is. When evaluating a model for how much we trust it, we want the circumstances of its creation and especially of its validation to make it, to the greatest degree possible, unlikely that the modelers were merely lucky.
This is especially true of climate modeling; a climate model is supposed to take in a whole lot of variables between 1850 and 1940, and then output a slight warming trend after 1940. All sorts of things vary over time; heck: a model which shows a warming trend regardless of the data that you train it on would pass this validation data. We, as people who are supposed to act based on the predictions of this model, have to guess at whether the model got the correct answer for the wrong reasons, or the right reasons.
Validating these models against future data is what will lend them credibility that they're going to predict the future well. A million wrong models can validate against any particular data set; it's only the one right model which will consistently validate against the ever-changing data set of what we've just learned.
That's the point — accurately predicting the past has never been very impressive, and never will be, because predicting what you already know can be done more easily by error than by understanding. (I'm talking here of the humans making the models and selecting or rejecting them, not of the models themselves.)
You're not talking about computer models which have only predicted the past based on the further past (i.e. predicted known observations), right?
what's the difference?
Climate on this planet has always been very variable. The reason we are here today and not, say, 55 million years ago is because we are in what climatologists call a climate optimum. Now we have politicians attempting to seize power (is seize too strong a word?) by the old trick of predicting the future and taking responsibility for it, the way The Connecticut Yankee in King Arthur;s Court seized power when he could predict the onset of a solar eclipse. He could not cause an eclipse, and you and Algore and seventy million credulous fools can not convince the rest of us that you are the wizard either.
Well, guess what? Reducing or increasing CO2 levels by a fraction of one percent is NOT the cause of Earth's climate variability. Changing the rate of increase of emission of CO2, or even causing a slight decline in same, will, perhaps, be an issue into office for Algore and some others, but if we indeed wish to alter the future climate of our planet, someone, somewhere better start studying that subject. But NOOOOO. Instead of trying to figure out how to take control of a naturally variable climate, we study ways to reduce CO2 emission. A classic misdirection play. Merely punishing the industrialized nations by forcing us to force ourselves to become incrementally poorer, well, let's just say that it has not worked yet. Kyoto signers have not done it yet, and the USA, China and India will never do so, so you guys better get yourselves a better plan.
This would be funny if it were not so serious. Like Bush v. Gore, our political classes never give us a choice between real alternatives. When they tell us that the Earth can be saved by raising CAFE standards, you gotta realize that you are being lied to, by both sides of the debate. This entire global warming debate reminds me of one of the sayings of Charles de Gaulle. When asked what his political philosophy was, he replied, "Anticipate the inevitable, and support it."
For the statistically inclined, just consider applying the Law of Large Numbers to a random distribution and realize that climate is the average of samples and weather is a particular sample.
For the rest: sure, as Michael points out, it's easy to predict the weather a short time in advance, but it's even easier to predict climate (which works in generalities) a short time in advance. Which of the following two predictions is more likely to be accurate?
1) The Midwest will be hot this summer with occasional afternoon storms.
2) On August 3rd, Des Moines will have a high of 83 and a quarter inch of rainfall.
Climate, being the average of a large number of weather events, will always be easier to predict than an individual weather event if you give them the same time scale to work with.
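To make the averaging argument concrete, here is a minimal simulation sketch. It assumes daily highs are independent draws from a fixed normal distribution, which real weather is not (days are correlated, and whether the mean itself shifts is the whole debate), and every number in it is made up for illustration.

```python
import random
import statistics

def simulate_spread(days_per_season=90, seasons=2000,
                    mean_high=85.0, daily_sd=8.0, seed=0):
    """Compare the spread of one day's high with the spread of a seasonal average.

    All parameters are hypothetical; days are assumed independent.
    """
    rng = random.Random(seed)
    single_days = []
    seasonal_means = []
    for _ in range(seasons):
        highs = [rng.gauss(mean_high, daily_sd) for _ in range(days_per_season)]
        single_days.append(highs[0])                    # one particular sample: "weather"
        seasonal_means.append(statistics.fmean(highs))  # average of samples: "climate"
    return statistics.stdev(single_days), statistics.stdev(seasonal_means)

day_sd, season_sd = simulate_spread()
print(f"spread of a single day's high:  {day_sd:.2f}")
print(f"spread of the seasonal average: {season_sd:.2f}")
```

Under these assumptions the seasonal average should vary by roughly the daily spread divided by the square root of the number of days, which is why the general prediction is the safer bet.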
However, the point that shouldn't be lost in this discussion is that the climate models being used to make these drastic future predictions and as a basis for drastic economic changes fail the most basic simulation test - given past inputs, can they reproduce observed results?
I can't stress enough how fundamental a methodological flaw it is to be claiming results from unvalidated models. Being a wireless guy, the first test that I use when determining what channel model to apply is comparing statistics from the simulated model against measured statistics. The model could be derived from first principles (e.g., Ricean fading) or blindly from the measurements themselves (e.g., Hidden Markov Models), but if the model doesn't match up with measurements, then simulations based on that model are considered invalid.
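A rough sketch of that kind of check is below. It uses the mean and variance as the simplest first- and second-order statistics; an actual channel-model check would likely also compare things like autocorrelation. The function name `stats_match`, the 20% tolerance, and the synthetic "measurements" are all assumptions for illustration, not anyone's real procedure.

```python
import random
import statistics

def stats_match(simulated, measured, rel_tol=0.20):
    """Return True if the simulated mean and variance are both within
    rel_tol (relative) of the measured mean and variance."""
    m_mean = statistics.fmean(measured)
    m_var = statistics.variance(measured)
    mean_ok = abs(statistics.fmean(simulated) - m_mean) <= rel_tol * abs(m_mean)
    var_ok = abs(statistics.variance(simulated) - m_var) <= rel_tol * m_var
    return mean_ok and var_ok

rng = random.Random(42)
# Synthetic stand-in for field measurements (hypothetical values).
measured = [rng.gauss(10.0, 2.0) for _ in range(5000)]
# Candidate A has the right spread; candidate B has twice the spread.
candidate_a = [rng.gauss(10.0, 2.0) for _ in range(5000)]
candidate_b = [rng.gauss(10.0, 4.0) for _ in range(5000)]

a_ok = stats_match(candidate_a, measured)
b_ok = stats_match(candidate_b, measured)
print("candidate A accepted:", a_ok)
print("candidate B accepted:", b_ok)
```

Note that candidate B gets the mean right and still fails, which is the reason for checking more than the first-order statistic.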
Based on reports cited in the previous thread and what I've read elsewhere, I am simply dumbfounded that so many climate model predictions have passed peer review without model validation, at least for first- and second-order statistics. And whatever state of shock lies beyond dumbfounded is what I am in, given that massive policy shifts are being insisted upon when the models are so inaccurate.
Fortunately, this problem appears to have been noted by the climate community (see, for example, this link, where researchers are attempting to improve their models, though the errors are far too great to be useful). So I expect that over the coming years, we'll have better models and once the models are better, I'll feel much more comfortable with any proposed solutions. (I also believe that the investment of five or so years cleaning up the models will be well worth it assuming that's where the effort is actually placed.)
Because for any dataset, you can make models that predict it. Every set of data, even if it's randomly generated, has patterns in it. Most data sets have patterns which mean nothing. When all you do is predict what's already known, you might just be picking up on meaningless patterns. You can only start to differentiate between meaningful patterns and meaningless patterns by using new data that wasn't available when the model was made.
There's a related truism which applies to mathematical modeling: a high enough order polynomial will fit any data set. It's actually easy, for any discrete data set, to find a polynomial which will yield all of the data in your dataset. The real test is to then have your polynomial predict data points that aren't in your set, and get new data, and see if it works. (E.g. if your data is being generated by something cyclic, then your polynomial is guaranteed to fail fairly quickly, and you'll see that only with new data.)
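Here is a small sketch of that truism, using Lagrange interpolation to build the polynomial that passes exactly through every known point. The choice of a sine wave as the "cyclic" source, the eight training points, and the test point at x = 12 are my own picks for illustration.

```python
import math

def lagrange_predict(xs, ys, x):
    """Evaluate at x the unique polynomial of degree len(xs)-1
    that passes through every (xs[i], ys[i]) pair."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Known data: eight samples from something cyclic.
train_x = [float(k) for k in range(8)]
train_y = [math.sin(x) for x in train_x]

# The polynomial reproduces every known point.
max_train_error = max(
    abs(lagrange_predict(train_x, train_y, x) - y)
    for x, y in zip(train_x, train_y)
)

# New data: a point outside the range the polynomial was built on.
new_x = 12.0
new_error = abs(lagrange_predict(train_x, train_y, new_x) - math.sin(new_x))

print(f"largest error on known points: {max_train_error:.3g}")
print(f"error at new point x={new_x}: {new_error:.3g}")
```

The true value can never leave the range -1 to 1, but nothing in the fitted polynomial knows that, so a perfect score on the known points says nothing about the new one.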
The problem with modeling complex systems is that they are complex, and unless you completely understand the system, you can never know if you've actually accounted for all of the influences on the system.
For example, it's unlikely that the computer models being used accurately predict how plants will respond to a hotter earth because that's going to depend on the actual species on our planet; they could all die off or some of them could thrive in a carbon-depleting way that drives us into another ice-age (unless we all start flying on chartered jets). The reaction of plants to their new environment may or may not be significant; that depends on things that we almost certainly don't know. We're far from having a complete catalog of all plants that live on the planet.
There's no way to know if a computer model has properly taken these plants, which might exist, into account. Cloud cover might reduce sunlight in the right places, increased phytoplankton growth might cool the earth by converting incoming solar radiation into chemical energy (as opposed to it sticking around as heat energy), or a subspecies which grows much faster might start to dominate in a hotter climate. We don't know if such things exist (but the biological world yields to nothing in complexity; it would hardly be surprising if such surprises were in store for a hotter world). The only way to find out if the computer models accurately model them is to run the computer models against data that the models were not built for.
This has been the bane of modeling in general; unless your model is built on a complete understanding of all of the mechanisms involved, models tend to do well on the data sets that you developed them on and not so well when applied to the real world. Models are by no means useless; they just need a lot of testing.
There's a reason why engineers still build prototypes after testing their designs in CAD software, though CAD software is getting better all of the time, and its modeling is increasingly correct. CAD software gets an awful lot of testing against data sets that weren't available during the design. (I'm using the term CAD software to include all of the testing components where you can assign material strengths and loads and so forth and then have the software model how the thing will behave under stress.)
Predicting weather is also subject to a lot more testability than predicting the climate is. If I predict that it's going to rain in my neighborhood on Thursday, we can all tell pretty quickly.
If I predict that the average summer temperature of my neighborhood is going to go down by 1 degree Celsius over the next hundred years, very few people are going to be around to remember my prediction. And if I'm wrong, it probably won't be by much, percentage-wise, anyhow.
The other thing about predictions which are the sum of unpredictable things, is that in many cases you don't get all that much precision. Presidential votes are the sum of an awful lot of unpredictable decisions, and people get that sum wrong pretty frequently.
You're right, though, that summing unknown quantities can be successful when you have some reason to suspect a bias, e.g. quantum physics.
Whether climate models involve a thorough-enough understanding of all of the biases (and the biases that will crop up in reaction to the biases), is less... certain.
Plus, predicting the weather is pretty easy if you don't constrain yourself to the details. E.g. The temperature on Thursday near my apartment will be between 270K and 310K. The atmosphere will not shoot up into space for a day. Humidity will be between 0% and 100%, with no super-saturation or massive flooding (we live about 2000' above sea-level with no major rivers nearby).
The trick to predicting the future, as most good psychics know or intuit, is to make general predictions sound specific.
The only way that a model can train on meaningless patterns is if you allow the validation and training sets to mingle. If you build a model of data from 1890-1955, and then predict 1959-1975 (or whatever), then this is just as valid as building a model using data from 1940-2005 and predicting 2009-whenever. The only difference is that you have instant validation on whether your model is correct or not. In the latter case you'd have to wait a few years. Just because the data is already recorded doesn't have anything to do with anything. In either case your "predictions" could be based on meaningless trends; that's a risk you run whenever you try to build a predictive model. Which is why climate researchers generally try to restrict their variable sets to meaningful parameters with some scientific justification for how such a variable could affect climate patterns.
What you say about knowing all of the variables in the system is obviously true, but it has no bearing on using past versus future data. So long as you make a model based on data from some date range and then predict patterns for a separate date range, it is totally irrelevant where the date ranges fall with respect to the present.
You're right, if the people building the model in question actually don't know the data that they're going to test on. Where the data sets fall with respect to the present is relevant because future data is the one kind that we can guarantee the model-builders don't know. To what degree the model-builders actually don't know historical data that they're using as a test is much harder to determine.
E.g. I have a hard time believing that climate researchers building a predictive model would actually be completely ignorant of the data for 1940-2005. Absent that complete ignorance, you have some mingling. Especially since we're not talking about very big changes, that can easily be significant. 0.1C warming per decade can be the result of a very subtle bias.
Worse, every time you test a model which fails and then build a better one, you're tainting the validation data set. Every time you find out that your model was off by a certain amount, and go back and fix your model so that it won't be off by that amount (and you'll do so because you'd have to be an idiot to try again with a model that will fail in the same way), you've incorporated the validation data set into the construction of your model. I guess that they could have clean-room tested the models and got only a pass-fail grade, but does anyone in climate research actually use this sort of rigorous, horribly inefficient approach?
Absent that sort of double-blind pass-fail approach, it would also work out if the researchers were in total ignorance of all climate data between 1940 and 2005 (for example they had no idea whatsoever that the models are supposed to predict warmer by the end), and happened to get their model right on the very first try. Then all would be well.
But I don't think that these sorts of fanciful scenarios are sufficiently realistic as to be worth considering. Absent them, all testing on historical data will involve feedback based on that data to get it right. For example, how many people who tested against 1940-2005 tested before they had a model which at least predicted the rough outline of what they knew it to be? How many people bothered to test a model which showed uniform global cooling?
The value of future data is that it avoids even good-faith cheating, whereas it's very hard to clean-room engineer something when everyone who might build the thing is already tainted with knowledge.
Second, I mis-spoke. I should have said climate -not weather- models. As far as I know, climate models can't take a given base state (say 1950), go 30 or 50 years into the "future" (our past) and end at an accurate forecast of what actually happened in 1980 or 2000.
Aziz may know of such models, but I haven't heard of 'em.
Finally, I feel very few people completely reject peer review, etc. It is neither extremely cynical nor conspiracist to state that some politics exist in the scientific community (in this case, as explained in my last post, politics is used in the more generic sense of groups of humans with conflicting goals trying to achieve said goals).
If Aziz has problems with this statement (which he dismisses as a "cop out" without any sort of support except personal experience in a single field), I would direct his attention to the regular and flagrant practices of folks such as Scientific American, who have engaged in politics (in all senses of the word) for over 30 years, pushing for specific agendas and ignoring papers which challenge said agendas. I also cite groups such as the Union of Concerned Scientists, a political action group if there ever was one! Check out their website.
Anyone who wishes to do so will find it quite easy to discover all sorts of political influences in the scientific community. While Aziz can dismiss (again, merely by assertion) the reality by claims of conspiracy, he would do better to prove that the scientific community is free of any political bias.
Good luck on that, by the way! :)
I think we're talking past each other here. I'm assuming the researchers know the climate data for all times in the past for which there is data. But the researchers aren't there strenuously figuring out each parameter in their model themselves. There are efficient and neutral computer algorithms available for that. The idea is that the researchers feed the algorithm data from one set, the so-called training set. The algorithm then smartly tries to determine which variables are varying together, and puts together trends and patterns based on the training set. The researchers then use this model to formulate predictions for a separate period and compare them against the recorded data for that period. That held-out data is the so-called validation set. Therefore, for the purposes of this example, you can see that an honest researcher will create independent training and validation sets. So the computer, when building up its model, is truly in the dark about the time periods the researcher is leaving out for prediction.
In other words, the computer operates like a black box. The researcher feeds in data corresponding to years A to B. The black box spits out data corresponding to its prediction for years C to D (where both C and D are greater than A and B). The researcher then takes this data and compares it to the actual recorded data for years C to D and determines if the model is any good.
Hope that made some sense. In any case, intermingling the training and validation sets will, as you rightly say, cause perfect results based on spurious trends that won't be borne out in other, independent validation trials. Therefore any scientist who's worth anything will be careful to prevent this from happening. But that doesn't limit him to only prediction on unknown-to-him data. It only has to be unknown to whatever's building the model. In this case, the computer.
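The black-box procedure described above can be sketched in a few lines. This is only an illustration under loud assumptions: the "record" is synthetic (a small made-up trend plus noise, not real temperature data), the model is a plain least-squares line rather than anything a climate group would use, and the year ranges are just examples of a training span A to B and a later validation span C to D.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

rng = random.Random(1)
years = list(range(1890, 1976))
# Synthetic record: hypothetical trend of 0.005 per year plus noise.
record = {y: 0.005 * (y - 1890) + rng.gauss(0.0, 0.1) for y in years}

# The fitting step only ever sees the training years (A to B).
train_years = [y for y in years if 1890 <= y <= 1955]
# The validation years (C to D) are held back until the model is fixed.
valid_years = [y for y in years if 1959 <= y <= 1975]

slope, intercept = fit_line(train_years, [record[y] for y in train_years])
predictions = [slope * y + intercept for y in valid_years]
validation_rmse = rmse(predictions, [record[y] for y in valid_years])
print(f"validation RMSE: {validation_rmse:.3f}")
```

The code keeps the two sets apart, but it cannot show whether the person choosing the model form had already looked at the held-out years, which is the objection raised earlier in the thread.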
Now you have me wondering whether or not a temporary fluctuation in solar radiation is responsible for trends in Earth warming and cooling over long periods of time. Knowing little about heat output from stellar bodies, I would have to depend on expert judgement. Perhaps such as yours.
If your assumptions are correct in this matter, it could alter the endless arguing over Earth warming and cooling. On the other hand, are politically motivated arguments ever settled by science, hard data and logic?
Arnold Harris
Mount Horeb WI
No 'model' produces the same result as reality, because by definition models are simple when compared to reality. But models can predict where natural processes are PROBABLE to go, and in that case, yes we have models that can take data and show similar results to what happened 30 years later in climate. They predicted El Niño and its consequences, though that was over a shorter time span, just as one example.
We model nuclear reactions, ballistics, medical procedures, chemical reactions, the start of the universe. All of them prove enormously useful in your life every single day, from national security to how your car runs. None of them equate to reality though; they just approximate it.
If you are expecting exactitude, especially when modeling something as inherently chaotic as a thermodynamic system, you just won't understand the science enough to come to reasonable conclusions.
Also you don't need to model the entire universe at one shot; you just need to model the major processes and see whether the result gives you the right trends. It's like when you fit data points to a line; the final fit doesn't ever actually intersect the data, but it does go along the least-squares path that is closest to all of them collectively.
The problem with that is that, in the grand scheme of things, climate change is very gradual, especially the kind which is supposedly caused by CO2.
Errors are frequently found in the functions or parameters used in models which swamp the supposed amount of global warming in terms of magnitude. Doubling CO2 is widely believed to mean an increase of something on the order of 2.7W/m^2 of radiation reaching the surface of the earth. Yet the models often calculate the effects of, say, cloud cover or water vapour feedback in ways which are off by larger amounts than this.
Most modellers--the honest ones, anyway--will admit there are plenty of processes not fully understood which could have effects of that magnitude or larger. Given that, how can we expect models to accurately predict the consequences of changes in CO2 levels, solar brightness, etc. - changes which are small in the grand scheme of things?
Sure, but the problem isn't escaped because you keep fine-tuning the model until it produces the results which match your available data, regardless of how many intermediate steps you introduce.
The point is that if the researchers keep fiddling with something — anything — until the model accurately predicts the validation data, then what you're doing is training to the validation data.
Now, there will undoubtedly be a component which is trained on a different data set, but that's not the entire process. So long as researchers make a model and throw out all of those that don't predict the validation set (i.e. so long as they develop the model, rather than pull it out of thin air correctly, the very first time), you're training to it in a meta-training sense; you're actually evolving your models with a selection criterion to predict that validation data from that training data.
And that's why it's better by far to validate against new data. When you are constantly varying the validation data, you can't be evolving a model to the eccentricities of one bit of data.
Again, this isn't a function of how the model actually works. Regardless of whether the model trains on data, has magic values plugged in, or is created by a random number generator, if it's being validated against a static data set then it's being evolved to that particular data set. So long as you'll throw out any model which doesn't predict the validation data better than what you've got now, you're evolving a system which might be predicting your validation data by luck. It doesn't matter what the researcher is changing; so long as they change anything which has the effect of changing the prediction, this problem comes about.
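A toy simulation of that selection effect, as a sketch only: each "model" here is nothing but a string of coin-flip guesses (no skill at all, by construction), the data is also coin flips, and the counts (300 models, 20 data points, 100 repeats) are arbitrary choices of mine.

```python
import random

def accuracy(guesses, truth):
    """Fraction of positions where the guess matches the truth."""
    return sum(g == t for g, t in zip(guesses, truth)) / len(truth)

def one_experiment(rng, n_models=300, n_points=20):
    """Keep whichever no-skill model scores best on a static validation set,
    then score that same model on data nobody had seen."""
    past = [rng.random() < 0.5 for _ in range(n_points)]    # static validation set
    future = [rng.random() < 0.5 for _ in range(n_points)]  # genuinely new data
    best_val, best_future = -1.0, 0.0
    for _ in range(n_models):
        guesses = [rng.random() < 0.5 for _ in range(2 * n_points)]
        val = accuracy(guesses[:n_points], past)
        if val > best_val:  # throw out anything that doesn't beat what we've got
            best_val = val
            best_future = accuracy(guesses[n_points:], future)
    return best_val, best_future

rng = random.Random(7)
results = [one_experiment(rng) for _ in range(100)]
mean_val = sum(v for v, _ in results) / len(results)
mean_future = sum(f for _, f in results) / len(results)
print(f"surviving model, accuracy on static validation set: {mean_val:.2f}")
print(f"surviving model, accuracy on new data:              {mean_future:.2f}")
```

The surviving model looks well validated purely because it was selected on that data; on new data it should fall back to roughly the coin-flip rate it always had.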
To me, this is the very crux of this issue. I find it eminently more likely - a downright certainty, in fact - that the above scenario, or something like it, is exactly what will happen. To think that this giant planet of ours is so fragile that our little CO2 contribution, or anything else for that matter, has such a terrible effect, is the HEIGHT of human arrogance. The earth will no doubt adapt, survive, and thrive. If a couple species aren't around to see it, so be it. If Homo sapiens is one of them, well, tough shit.
I have ZERO confidence that anything can be done to intentionally affect the earth's climate, especially when 80% of the world is living in third world conditions and clearly has designs on industrialization or at least modernization. The US can go absolutely hog wild reducing pollution and CO2 emissions and whatever else, but countries like India and China not only won't reduce their emissions, it is a virtual certainty that they will increase them 2 or 3-fold.
Several years ago I read a paper which concerned reduction of insolation by launching huge mylar sheets into orbit, and manipulating them to provide incremental shadow, to affect weather and climate. This was before the warming debate, and the knock on the deal was that we needed more warming, as an ice age was approaching. Since all science is political, the idea vanished.
If altering weather is the desired result, we would approach the subject from that angle. If disarming the West economically is the desired goal, one could go about slowing down industrialization by attacking, say, fossil fuel use, while blocking non-emitting power sources, like nuclear.
Most weather alteration research has been wound down (as far as I know) due to tort liability issues. If we make it rain in one area, another area will sue when they experience drought. If we change the course of a hurricane, to save one city, the city next door, that gets hit anyway, will sue, claiming that it would never have hit them without interference. If the people in power really believed that we were in trouble, these would be trivial obstacles to taking action to save the planet. Since they believe no such thing, they play with "global warming" in order to achieve political power. I will know that Al Gore believes his own lies about CO2 when he advocates clean, safe, nuclear power, and plug-in electric cars. As long as he is against the only proven, non-emitting, and inexhaustible power source, I know him to be the lying fraud that he so clearly is.
I agree that predicting the weather is different than predicting the climate. But the criticism I've read of the current climate models and their use in global warming is that even using the best climate models we have, the margin of error for predicting climate 50 or 100 years out is +/- several degrees C. (i.e. take all the climate data from 50 years ago, plug it in to the best climate model we have, and it will come within +/- 3 degrees C of the current climate.)
In other words, the model's predicted rise in temperature due to global warming is less than the margin of error in the model. Which means it's statistically meaningless.
Sadly, I don't have links for this. I'm typing from memory. If someone has links defining the margin of error for the current climate models vs historical data, I'd like to see them. (And if I'm wrong about the margins of error, I'll apologize and retract.)
The future is a static data set, too. We just don't know what it is yet. Obviously training, fiddling, and retraining is not a valid way to go about generating a model regardless of whether you use known past data or wait until you know some future data. The problem is the same in either case.
I'm not taking a position here on the right or wrong way to create computer models, but I am saying that once you've recorded the data, it is then static. Thus any data a computer is training on is going to be static. Obviously as new information becomes available, that new information can be put into the model, and no one here is seriously suggesting that we accept as truth a model which accurately predicts past data but inaccurately predicts future data. My point is that your distinction between future and past data is illusory. If you made a model that predicted climate for 2005 accurately, obviously you would test to see if it did just as well at predicting 2007. But at that point, 2007 is simply another validation set. There's nothing special about it occurring after you generated the model. And if a model trained on data from the 60s accurately predicts data from the 70s, then it is by definition reporting on meaningful trends, since the two datasets are independent.
Whether the future is static is a philosophical question. What's relevant is that the future is (1) unknown and (2) it contains a lot more data than the pittance we have now.
The thing about using future data is that you'll always be using different data. This is the crux of #2. The future constantly unfolds new data for you to try your model on. It's constantly giving you new data in which illusory patterns disappear, leaving only the meaningful patterns.
You've got to get past the issue of training data; obviously the training data is going to be static since you can only get it once it is. The point is what you validate on. And the point isn't about how good the model is, but about how trustworthy the model is. When evaluating a model for how much we trust it, we want the circumstances of its creation and especially of its validation to make it to the greatest degree impossible that the modelers were lucky.
This is especially true of climate modeling; a climate model is supposed to take in a whole lot of variables between 1850 and 1940, and then output a slight warming trend after 1940. All sorts of things vary over time; heck: a model which shows a warming trend regardless of the data that you train it on would pass this validation data. We, as people who are supposed to act based on the predictions of this model, have to guess at whether the model got the correct answer for the wrong reasons, or the right reasons.
Validating these models against future data is what will lend them credibility that they're going to predict the future well. A million wrong models can validate against any particular data set; it's only the one right model which will consistently validate against the ever-changing data set of what we've just learned.
That's the point — accurately predicting the past has never been very impressive, and never will be, because predicting what you already know can be done more easily by error than by understanding. (I'm talking here of the humans making the models and selecting or rejecting them, not of the models themselves.)