
Topic: All models are wrong, but some are useful 

1.  All models are wrong, but some are useful

Posted 11-16-2017 12:00
A controversial quote, cited by others during my career and attributed to the famous statistician George Box (although sometimes attributed to W. Edwards Deming), is that "All models are wrong, but some are useful."  I have never accepted this as a universal blanket statement, probably because some of my colleagues in the sciences used it as an excuse not to do modeling and also because I have worked with examples that would seemingly argue against it. E.g., for processes that can be represented with only first-principles equations (e.g., material and energy balances), models can be highly accurate (i.e., within the limits of measurement capability). Also, in implementing Expert Systems (a branch of AI), professional colleagues whose expertise on a topic was contained (i.e., modeled) in the application have signed off that the model was sufficiently accurate that it always drew the same conclusion from analyzing incoming production data that the expert himself/herself would draw.  These applications then eliminated the need for the human "experts" to be constantly available during manufacturing operations to deal with process troubleshooting.  So, in cases such as these, an argument exists that the model is NOT wrong, and that any variability in output is due to measurement error or to inaccurate or missing statements by the expert regarding their knowledge of a topic. I.e., a model can accurately represent a process (i.e., is not wrong) if the process can be sufficiently represented with first-principles deterministic equations or if its purpose is to mimic an expert's knowledge of a topic (which, of course, depends on the expert articulating their expertise in statements, e.g., if-then-else or probability, that a computer can understand).  What do you think of the Box (or Deming) quote?

2.  RE: All models are wrong, but some are useful

Posted 11-17-2017 02:29
No generalization is worth a damn, including this one. 
Oliver Wendell Holmes? (Maybe.)


3.  RE: All models are wrong, but some are useful

Posted 3 days ago
It is true that all models are wrong and some are useful. It is left to the discretion of the Chem E to discern the useful inferences. A classic example is the modeling of the polytropic efficiency of a multistage centrifugal compressor. With so many unknowns, the model is only as good as its degree of agreement with hardcore field parameters. For Chem Es, the actual process vital signs, such as temperatures, pressures, levels, and flows, are much truer and more tangible than models.
Models are useful diagnostic and predictive tools at the command and discretion of a Chem E. Look, listen, and feel techniques in the field or control room are much more useful. At times it is important to perform paper calculations to support a particular decision.

Venkat Subramanian
Consultant Expert
Principal Consultant
Self Employed
Chennai TN State

4.  RE: All models are wrong, but some are useful

Posted 11-17-2017 09:40
I've loved this quote for a long time. I believe it is from a chapter George E. P. Box wrote, "Robustness in the strategy of scientific model building", in the book Robustness in Statistics. For me, it is a reminder that any model, even a highly accurate first-principles one such as F = ma, is a formalized abstraction of reality, not reality itself.

I also love a (sort of) contrapositive version from the founder of social psychology, Kurt Lewin, from his Field Theory in Social Science: "There is nothing more practical than a good theory".  So, a formalized theory/model is not reality, but it can describe reality in a manner that can really help us manipulate it in ways that we want (i.e., it is "useful" and "practical").

Matthew Wagner
Senior Analyst
Lux Research
Cortlandt Manor NY

5.  RE: All models are wrong, but some are useful

Posted 11-18-2017 18:04

There are two other similar quotes that I also like.  The first is from Arthur Conan Doyle via Sherlock Holmes:  "It is a capital mistake to theorize before one has data.  Insensibly, one begins to twist facts to suit theories instead of theories to suit facts."  The second has been attributed to various people, but with no verifiable source:  "In theory, theory and practice are the same.  In practice, they are not."


All models have limitations.  It is very important to understand the limitations of a model before using it.  A model is a tool.  In the hands of a skilled craftsman it is very useful.  It is not nearly as useful in the hands of a novice.  Understanding the technology being modeled and the limitations of the model are more important than understanding how to use the model. 


Steve McGovern

6.  RE: All models are wrong, but some are useful

Posted 11-17-2017 10:20
I'm not sure you entirely understand the quote.  When he says "wrong" he means "not absolutely perfect"; when he says "useful" he is saying "accurate" or "not wrong."

"Not wrong" =/= "Right"

The point is that all models must tolerate some degree of model/reality mismatch. If the model can't do this, it won't be useful.
This quote, in my opinion, is one of the most important for keeping in mind the limitations of modeling.
-Bob LeBrell
Dow Chemical
Houston ES
Process Automation

7.  RE: All models are wrong, but some are useful

Posted 11-17-2017 13:35
Here's another one along those lines, and one of my favorites:

"Plans are worthless, but planning is everything."
                 Dwight D. Eisenhower,
                     34th US President

William Stuble PE
Senior Design Engineer
Cora WY

8.  RE: All models are wrong, but some are useful

Posted 11-17-2017 17:37
To me, the quote is more thought-provoking than controversial. William G. (Bill) Hunter was George Box's protege and his first graduate student after Box started the Department of Statistics at the University of Wisconsin-Madison in 1959.  I have seen the quote in the book Process Quality, published by Pearson.

I remember Bill, in his Statistics for Experimenters class, talking about a client of his that was reputed to have a superior computer model of their process.  He stated that the model turned out to be just a linear factorial model.  He went on to state that, over a limited range, a linear empirical model could approximate a more complicated mechanistic model.  I believe that is what George meant when stating that some [models] are useful.
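
Bill's point about a linear empirical model standing in for a mechanistic one can be illustrated with a small sketch. Everything in it is an assumption chosen for illustration, not anything from the client's process: the "mechanistic" model is an Arrhenius rate expression with made-up constants, and the straight line is fitted by ordinary least squares over a 10 K window.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol K)


def arrhenius_rate(temp_k, pre_exp=1.0e7, activation_energy=5.0e4):
    """Hypothetical mechanistic model: k = A * exp(-Ea / (R * T))."""
    return pre_exp * math.exp(-activation_energy / (GAS_CONSTANT * temp_k))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def max_relative_error(temps, intercept, slope):
    """Largest relative gap between the fitted line and the Arrhenius model."""
    return max(
        abs(intercept + slope * t - arrhenius_rate(t)) / arrhenius_rate(t)
        for t in temps
    )


# Fit the line over a narrow operating window (350 K to 360 K).
narrow_temps = [350.0 + i for i in range(11)]
intercept, slope = fit_line(narrow_temps, [arrhenius_rate(t) for t in narrow_temps])

# Check the same line inside the window and well outside it (300 K to 400 K).
narrow_error = max_relative_error(narrow_temps, intercept, slope)
wide_temps = [300.0 + 10.0 * i for i in range(11)]
wide_error = max_relative_error(wide_temps, intercept, slope)

print(f"max relative error, 350-360 K: {narrow_error:.3f}")
print(f"max relative error, 300-400 K: {wide_error:.3f}")
```

Inside the fitted window the line tracks the exponential closely; extrapolated over the wider range it does not, which is the "limited range" caveat in action.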

T. David Griffith, Ph.D.
Blessing (Bay City), TX
(361) 588-6907

9.  RE: All models are wrong, but some are useful

Posted 11-22-2017 14:29
Thank you all for your input.  As hoped, it generated some interesting discussion. I have ended up thinking that the differences we might have in interpreting "All models are wrong..." are mostly due to semantics.

Some of the discussion input suggested that models cannot be a true reflection of reality and that is why they are "wrong."
So, what represents reality?  We used to think that F = ma, E = mc^2, and the atom as protons, neutrons, and electrons represented reality. Clearly, in some cases, more terms or more subatomic particles are needed to complete the picture.
Of course, the level of detail we need to go to capture reality depends on the application.

I would argue, instead, that models, at least many of them, are intended as a reflection of a user's understanding of reality, and in such cases, many models are quite correct - i.e., they can accurately reflect a user's understanding of a situation.  E.g., one of the examples cited in the discussion is the widely used fundamental relationship F = ma. This can be accurately represented in a computer model. If F = ma doesn't represent reality in certain situations, it is not that the model couldn't represent F = ma; it's that the user's understanding of reality is not complete or accurate enough.

I am reminded of a possible analogy: computers being blamed for most glitches in applications of which they are a part. In the large majority of computer system problems, the computer is just doing what it was told (i.e., programmed/configured) to do. But rarely does an organization report that a computer programmer screwed up.  It's the computer that always gets blamed.  In a similar way, I sometimes think that models get a bad rap when it is the engineer's or scientist's understanding and representation of reality (or other factors such as measurement errors) that gives rise to model inaccuracies.

Anyway, thanks again for all your input.

Joseph Alford PE
Zionsville IN

10.  RE: All models are wrong, but some are useful

Posted 11-23-2017 02:19
There is a more provocative take on modelling that I heard from my brother-in-law, may he rest in peace.  He had a pretty important position in the California State Energy Commission, with a degree in economics.  Circa 2010 I overheard him telling someone else this unattributed quote: "All models are wrong, what do you want it to say?"  This is where monetary compound interest, for example, diverges from physical reality, in expecting crop yields or chemical plants to grow their output year over year without additional resources other than the initial borrowed cash.  Einstein, when asked what he thought was the greatest human invention, is said to have answered "compound interest"!  Humans seem to have a propensity to believe in exponential extrapolations even when they are not physically sustainable.
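
To put rough numbers on that divergence, here is a minimal sketch comparing a compound-interest extrapolation with a capacity-limited one. The starting output, the 7% rate, the capacity ceiling, and the logistic-style update rule are all hypothetical choices for illustration; no real plant or crop is being modeled.

```python
def compound_growth(initial, rate, years):
    """Financial-style extrapolation: output compounds at a fixed rate forever."""
    return initial * (1.0 + rate) ** years


def capped_growth(initial, rate, years, capacity):
    """Logistic-style yearly update: growth slows as output nears capacity."""
    output = initial
    for _ in range(years):
        output += rate * output * (1.0 - output / capacity)
    return output


# Hypothetical plant: 100 units/yr today, 7% target growth, 150 units/yr ceiling.
initial, rate, capacity = 100.0, 0.07, 150.0

for years in (5, 20, 50):
    financial = compound_growth(initial, rate, years)
    physical = capped_growth(initial, rate, years, capacity)
    print(f"year {years:2d}: compound {financial:8.1f}   capacity-limited {physical:6.1f}")
```

The two curves are close for the first few years, which is part of why the exponential extrapolation is so easy to believe.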

John Rudesill
Columbia MD

11.  RE: All models are wrong, but some are useful

Posted 12-06-2017 22:00
I think Joseph's model is wrong, but it is useful!  Here is my two cents:

Statistically speaking, I think the accuracy of any model is related to our tolerance for the degree of confidence.  F=ma may be right for 99% of what we deal with every day, but if we are concerned about the remaining 1%, then it is wrong.

The remaining 1% may be more significant than we think.  If we try to use Newton's Second Law to model the concepts behind the design of the screen we are staring at right now, then we are missing a big part of the picture.

Moreover, it is impossible to have an empirical model of an ongoing natural phenomenon with a 100% confidence level, simply because the sample would have to cover the entire population.  If we lived in an abstract world, then the models could be true.

On a different note, the notion of blaming any computing system, whether it is a silicon-based computer or a carbon-based organism, comes from the expectation of some level of perfection.  Unfortunately, perfection comes at an infinite cost, so any complex computing system has inherent weaknesses or biases that increase the output while reducing perfection.  Ideally, we expect the designers to evaluate the risk of each weakness before implementing it in the design, but because the designers are also not perfect, there is always a possibility of mistakes.

At a deeper level, if you allow your computing system to self-evolve from one generation to the next, certain traits will surface that are highly effective in certain environments/conditions and not in others.  In other words, the system optimizes itself for a certain outcome.  In current human-made computing systems we do not yet see frequent occurrence of this phenomenon (excluding genetically modified microorganisms).  But some cutting-edge artificial intelligence, robotics, microelectronics, and bio-engineering research is focused on it.  I think it may be possible to "blame" a human-made computing system in the near future, and that is why many researchers and thinkers in the ethics field are working on establishing boundaries for these emerging technologies.

Nader Shakerin
Facilities Manager
Intel Corporation
Chandler AZ

12.  RE: All models are wrong, but some are useful

Posted 12-07-2017 00:30
Here is an example of what I am talking about.

Watch "This Neural Network Optimizes Itself | Two Minute Papers #212" on YouTube
The paper "Hierarchical Representations for Efficient Architecture Search" is available here: https://arxiv.org/pdf/1711.00436.pdf Genetic algorithm (+ Mona Lisa problem) implementation: 1. https://users.cg.tuwien.ac.at/zsolnai/gfx/mona_lisa_parallel_genetic_algorithm/ 2. https://users.cg.tuwien.ac.at/zsolnai/gfx/knapsack_genetic/ Andrej Karpathy's online demo: http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html Overfitting and Regularization For Deep Learning - https://www.youtube.com/watch?v=6aF9sJrzxaM Training Deep Neural Networks With Dropout - https://www.youtube.com/watch?v=LhhEv1dMpKE How Do Genetic Algorithms Work?

Nader Shakerin
Facilities Manager
Intel Corporation
Chandler AZ

13.  RE: All models are wrong, but some are useful

Posted 13 days ago
To say that "All models are wrong" is misleading; a more accurate statement would be "all models are approximate, but are useful depending on their application for decision making." For example, to get a rough idea of the influence input parameters have on the final output (e.g., for policy making or regulation), a simplified model may be adequate.  For making a decision on modifying a potentially hazardous runaway reaction in a chem plant, a far more comprehensive (i.e., accurate and reliable) model would be needed.  The basic concept is that we need different horses for different courses.

J Kumana
CEO, Kumana & Associates
Missouri City (Houston), Tx

14.  RE: All models are wrong, but some are useful

Posted 01-08-2018 17:48
Can't help mentioning GIGO!

Sankar Raghavan
Eastman Professor of Practice
University of Tennessee-Knoxville
Knoxville TN

15.  RE: All models are wrong, but some are useful

Posted 12 days ago
This discussion made me think of one of my first projects as a process engineer right after grad school.  There was a pump used to feed recycled water to four titanium sponge leaching systems in parallel.  In previous attempts to do this, the pump kept failing.  I looked at the problem as a piping network and used the Bernoulli equation combined with material balances, solved simultaneously (I think it was 11 equations and 11 unknowns).  Each branch had a control valve, a steam injector, and a flowmeter, and all were fed by a single pump from a tank.  The solution resulted in a pump that worked reliably, but was only 13% efficient.  As it turns out, most experienced engineers would size the pump for the BEP, which would suck Ti fines settled in the tank into the pump at startup and cause early pump failure.  I'll try to put this into a quote:  "Sometimes a good engineering model trumps experience"

P.S. I published a paper on this if anyone wants a copy.  It's kind of timely in the sense that there's an article in January CEP on piping networks.
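
For anyone who wants a feel for this kind of network calculation, here is a much-reduced sketch, not the 11-equation model from the paper. It assumes a quadratic pump curve, lumps each branch (valve, injector, flowmeter) into a single hypothetical resistance coefficient, and uses bisection to find the header head at which the pump curve and the material balance agree. All numbers and function names are invented for illustration.

```python
import math


def branch_flows(head, resistances):
    """Flow in each parallel branch, assuming head loss = k * q**2 per branch."""
    return [math.sqrt(head / k) for k in resistances]


def pump_head(total_flow, shutoff_head, curve_coeff):
    """Assumed quadratic pump curve: H = H0 - a * Q**2."""
    return shutoff_head - curve_coeff * total_flow ** 2


def solve_operating_point(resistances, shutoff_head, curve_coeff, tol=1e-9):
    """Bisect on header head until pump head matches the branch head loss."""

    def residual(head):
        # Material balance: the pump delivers the sum of the branch flows.
        total = sum(branch_flows(head, resistances))
        return pump_head(total, shutoff_head, curve_coeff) - head

    low, high = 0.0, shutoff_head  # residual is positive at low, negative at high
    while high - low > tol:
        mid = 0.5 * (low + high)
        if residual(mid) > 0.0:
            low = mid
        else:
            high = mid
    head = 0.5 * (low + high)
    return head, branch_flows(head, resistances)


# Four parallel branches with hypothetical resistances, m per (m3/h)**2.
resistances = [0.8, 1.0, 1.2, 1.5]
shutoff_head, curve_coeff = 40.0, 0.05  # hypothetical pump curve, m and m/(m3/h)**2

head, flows = solve_operating_point(resistances, shutoff_head, curve_coeff)
print(f"header head: {head:.2f} m, total flow: {sum(flows):.2f} m3/h")
for index, flow in enumerate(flows, start=1):
    print(f"  branch {index}: {flow:.2f} m3/h")
```

A real version would add elevation terms, pipe friction, and valve positions as unknowns, which is how the equation count grows.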


Samuel Davis PE
Principal Engineer
Las Vegas NV
1(702) 589-0338