Explaining Validation

This forum contains all archives from the SD Mailing list (go to http://www.systemdynamics.org/forum/ for more information). This is here as a read-only resource, please post any SD related questions to the SD Discussion forum.
Timothy Quinn tdquinn MIT.EDU
Junior Member
Posts: 3
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by Timothy Quinn tdquinn MIT.EDU »

Posted by "Timothy Quinn" <tdquinn@MIT.EDU>

I have two very specific questions for the practitioners among you. Before I pose them, here is the setup:

In the past three weeks, I have been asked on several occasions how we "validate" our models when they are built from "subjective" accounts of how the system works (i.e., on expert opinion). Jay Forrester, in Industrial Dynamics, wrote,

"Any 'objective' model-validation procedure rests eventually at some lower level on a judgment or faith that either the procedure or its goals are acceptable without objective proof." (Forrester 1961, p. 123)

I take these judgments and faith to be the intuition or common-sense knowledge of a model's critic. For example, we know that physical inventory cannot go negative, or that more workers have a higher work completion rate than fewer (unless you reach the nonlinear too-many-cooks regime). Therefore, John Sterman, in Business Dynamics, concludes,

"Validation is ... intrinsically social. The goal of modeling, and of scientific endeavor more generally, is to build shared understanding that provides insight into the world and helps solve important problems. Modeling is therefore inevitably a process of communication and persuasion among modelers, clients, and other affected parties. Each person ultimately judges the quality and appropriateness of any model using his or her own criteria." (Sterman 2000, p. 850)

To restate, validation is building shared understanding of

(1) the problem,
(2) how the model simplifies the real world in favor of achieving a purpose, and
(3) the appropriateness of those simplifications for the purpose.

I have struggled to make a compelling case for model "validity" when my potential critic's attention span is limited to a few minutes only. I cannot take him or her equation-by-equation and explain the empirical or common-sense justification for each.
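
For concreteness, here is the kind of equation-level, common-sense check I mean. This is only a minimal sketch in Python, with a hypothetical inventory formulation and made-up numbers, not anything from a published model:

```python
import numpy as np

# Minimal sketch (hypothetical formulation, made-up numbers). The reality
# constraint being checked: physical inventory cannot go negative.

def simulate_inventory(demand, production=10.0, initial_inventory=100.0,
                       dt=0.25, t_end=40.0):
    inventory = initial_inventory
    path = []
    for k in range(int(t_end / dt)):
        t = k * dt
        # Shipments are limited by what can physically be shipped this step.
        max_feasible_shipments = inventory / dt
        shipments = min(demand(t), max_feasible_shipments)
        inventory += dt * (production - shipments)
        path.append(inventory)
    return np.array(path)

def extreme_demand(t):
    # Extreme-conditions test input: demand jumps to ten times production.
    return 100.0 if t >= 5.0 else 8.0

inventory_path = simulate_inventory(extreme_demand)
assert (inventory_path >= -1e-9).all(), "Formulation fails the reality check"
print("Minimum inventory over the run:", round(float(inventory_path.min()), 2))
```

Each such check is easy to state and easy to agree on; the trouble is that a full model contains hundreds of them, and I cannot walk a skeptic through every one in a few minutes.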

Questions:

(1) Is there a quick way, possibly a good analogy, to communicate the point that validation is about confidence in a model's simplifications of reality for a specific purpose? I am looking for something akin to the bathtub analogy so often used for explaining stocks and flows to a lay audience.

Here is my best failed attempt: Imagine your goal is to cross a ford without getting your feet wet. Can you do it? Only by demonstrating that there exists a sequence of stepping stones, each within one stride of the next, spanning the river. If we agree on the existence of each hop, then you must concede the goal is possible.

The reason this analogy fails is that the goal is achieved by cutting it into sequential pieces, each of which must lead to the next. In contrast, from uncontroversial pieces, SD models can produce surprising and counterintuitive results.

(2) What published paper best exemplifies how a model's formulations should be justified, based on tests of intended rationality, "common knowledge" reality constraints (e.g., conservation of mass), and expert opinion?

Thanks,
Tim Quinn
~~~~~~~~~
MIT Sloan School of Management
System Dynamics Group
Posted by "Timothy Quinn" <tdquinn@MIT.EDU>
posting date Tue, 28 Feb 2006 10:19:17 -0600
Jack Homer jhomer comcast.net
Junior Member
Posts: 5
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by Jack Homer jhomer comcast.net »

Posted by "Jack Homer" <jhomer@comcast.net>
Good questions, Tim. People ask me about validation all the time, too. Many of them have statistical or other technical backgrounds and are thinking of confidence intervals and other numerical demonstrations of reliability via sensitivity testing. I tell them about the difference between numerical sensitivity and behavioral or policy sensitivity, but that can be pretty abstract without further explanation.

One way of explaining it is to say that if what I was after primarily was a tight confidence interval, then I would not bother with 99+% of my model's equations, and would just look for one or two simple polynomial equations that provide a nice tight fit to historical data. In most cases, the more uncertain parameters I add to my model, the more uncertain its outputs will be as well. Why would I want to add uncertainty to my model? Because that is the only way I can make it useful for understanding what is going on in the real world and for informing policy. But doesn't that uncertainty stand in the way of achieving such understanding? Yes, it can, but only to the extent that the model is behaviorally or policy sensitive to changes--not to the extent that it is numerically sensitive (unable to produce tight confidence intervals).

Fortunately, well-constructed models that adhere to laws of conservation and bounded rationality, etc., typically display much less behavioral and policy sensitivity than one might expect, and therefore can tell us useful things even if they do not produce nice tight confidence intervals. And, even if a model does display behavioral or policy sensitivity, it can at least direct research and data gathering to just those areas that really matter the most and not all the other areas one might think of, which is a pretty big deal by itself.
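
To make that distinction concrete, here is a toy sketch (made-up structure and parameter ranges, not any client model): a simple first-order goal-seeking stock with two uncertain parameters sampled over wide ranges. The runs spread out a great deal numerically, yet the behavior mode and the policy conclusion are the same in every draw.

```python
import numpy as np

rng = np.random.default_rng(0)

def goal_seeking(adjustment_time, goal, dt=0.25, t_end=40.0):
    # Toy first-order adjustment: d(stock)/dt = (goal - stock) / adjustment_time
    stock, path = 0.0, []
    for _ in range(int(t_end / dt)):
        stock += dt * (goal - stock) / adjustment_time
        path.append(stock)
    return np.array(path)

final_values, policy_wins = [], 0
for _ in range(1000):
    goal = rng.uniform(50, 150)               # made-up uncertain parameters
    adjustment_time = rng.uniform(4, 16)

    base = goal_seeking(adjustment_time, goal)
    policy = goal_seeking(adjustment_time / 2, goal)   # policy: halve the delay

    final_values.append(base[-1])
    t_index = int(10 / 0.25)                  # compare the remaining gap at month 10
    policy_wins += abs(goal - policy[t_index]) < abs(goal - base[t_index])

print(f"Final value ranges from {min(final_values):.1f} to {max(final_values):.1f}"
      " (numerically quite uncertain)")
print(f"The policy closes the gap faster in {policy_wins} of 1000 runs"
      " (the policy conclusion is not sensitive)")
```

The final values spread widely, but the conclusion we would act on is the same in every run; that is the sense in which a model can be numerically sensitive yet behaviorally and policy insensitive.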

Analogy? Model validation, like model building itself, is like baking a cake. There are several necessary ingredients to get it to look right and smell right and taste right, but the final result is a blending and transformation of all those components and often has elements of surprise from the gestalt of the whole. (I have also in the past compared modeling to a zipper, involving the careful bringing together of structural hypotheses and evidence to create useful theory; see SDR 13(4) 1997.)

Published works? On validation, I generally point people to Sterman Chapter 21 and Forrester and Senge 1980. On testing for intended rationality, I point them to Sterman Chapter 15 and Morecroft 1983 and 1985.

Hope this helps.
- Jack Homer
Posted by "Jack Homer" <jhomer@comcast.net>
posting date Wed, 1 Mar 2006 10:03:57 -0500
John Gunkler jgunkler sprintmail
Member
Posts: 30
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by John Gunkler jgunkler sprintmail »

Posted by "John Gunkler" <jgunkler@sprintmail.com>
Tim and others,

One approach that helps me is to think about what makes any scientific theory plausible.

1. It must explain the phenomena of interest -- which means you must be able to derive or create the phenomena using the theory (or model).
2. It must predict future phenomena. (If some of these are surprising, all the better!)
3. You must make a good case for ruling out alternative explanations.

It's number 3 that's most interesting and, I believe, useful. (1 and 2 are sort of "table stakes" -- you must have these or you're not even in the game.) And it's number 3, I think, that has led Drs. Forrester and Sterman to write some of the things they've written about communication, judgment, and persuasion. Because in order to rule out alternative explanations, you first need to know what your client would think of as plausible explanations (models). This means you must get "subjective" with your clients and encourage them to provide their best shot at causal explanations before you build your model. And it means you also have to think very hard about what a harsh critic of your model will say -- what alternative explanations they could use to discredit your efforts.

You score big if your model explains (and, even more dramatically, if it
predicts) something that their formulations do not. You also score big if, with the deeper understanding provided by your model, it becomes evident that their explanation does not work.

The kinds of plausible alternative explanations I run into are mostly exogenous events -- partly because that's the way most people think about causes, and partly because it seems it is always all of the "other stuff going on" that wants the credit for positive changes in outcomes. If, for example, we are working on an endogenous theory of sales growth (i.e., our model is identifying policy decisions that affect growth in sales) -- and sales go up during the modeling period -- then everyone else who did anything at all about sales is going to try to claim credit: "Oh, that's because of the contest we ran in January. Best contest we ever had!" or "You know, the competition had a product recall and that really boosted our sales." and so on. The good news here is that if we show that endogenous factors override (after some transition effects) exogenous factors, then we can argue against all of these kinds of alternative explanations. They become mere blips on the outcome graph, not fundamental forces.
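
To illustrate what I mean by exogenous events ending up as blips, here is a toy sketch (a generic word-of-mouth sales structure with made-up numbers, not anyone's client model): an endogenous growth loop limited by market size, run with and without a one-month exogenous promotion.

```python
import numpy as np

def product_sales(with_promotion, dt=0.25, t_end=36.0):
    # Toy endogenous structure: word-of-mouth adoption limited by market size.
    market_size, adopters = 10000.0, 100.0
    contact_effect = 0.6   # made-up: adoptions per adopter per month in a nearly empty market
    sales_path = []
    for k in range(int(t_end / dt)):
        t = k * dt
        word_of_mouth = contact_effect * adopters * (market_size - adopters) / market_size
        # Exogenous event: a one-month promotion ("best contest we ever had").
        promotion = 300.0 if (with_promotion and 6.0 <= t < 7.0) else 0.0
        sales = word_of_mouth + promotion
        adopters += dt * sales
        sales_path.append(sales)
    return np.array(sales_path)

baseline = product_sales(with_promotion=False)
promoted = product_sales(with_promotion=True)
months = np.arange(len(baseline)) * 0.25
gap = promoted - baseline

print(f"Largest sales gap during the promotion month: {gap[(months >= 6) & (months < 7)].max():.0f}")
print(f"Average absolute sales gap in the final year:  {np.abs(gap[months >= 24]).mean():.1f}")
```

The promotion is plainly visible while it runs and nudges the timing a little, but the shape of the trajectory and where it ends up are set by the endogenous loop; a year or two later the two runs are practically indistinguishable. That is what I mean by a blip rather than a fundamental force.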

So, I recommend focusing on learning all you can about people's "pet projects" to improve (and "pet theories" to explain) whatever outcome you're modeling. I recommend getting all of your critics' mental models captured on paper -- not a bad idea, anyway. And I recommend doing a deep critique of your own thinking, looking for loopholes and poorly understood or poorly communicated aspects of your own work. And I recommend actually performing the exercise (maybe with others' help) of coming up with alternative explanations for the phenomena. Then your challenge is to come up with something that does a better job of (fill in the goal of modeling here -- whether it's helping people understand where the leverage points for change are, or helping choose new policies to improve outcomes, or creating a common mental model to improve communications, etc.) and find all the ways it is superior to their existing mental models, then rule out (or account for in the model) the effects of their pet projects.

Posted by "John Gunkler" <jgunkler@sprintmail.com>
posting date Wed, 1 Mar 2006 09:33:27 -0600
Michael J Schwandt SCHWANDT char
Junior Member
Posts: 2
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by Michael J Schwandt SCHWANDT char »

Posted by "Michael J Schwandt" <SCHWANDT@charter.net>
Hello, Tim.

The paper that Michael Radzicki presented at the 2004 ISDC, "Expectations Formulation and Parameter Estimation in Uncertain Dynamical Systems: The System Dynamics Approach to Post-Keynesian-Institutional Economics", contains a solid review of SD validity research and provides a good example of applying those techniques.

I don't know that this is the best article, but I found it to be quite enlightening.

Michael Schwandt
Virginia Tech
Posted by "Michael J Schwandt" <SCHWANDT@charter.net>
posting date Wed, 1 Mar 2006 08:35:02 -0500
R. Oliva roliva tamu.edu
Newbie
Posts: 1
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by R. Oliva roliva tamu.edu »

Posted by "R. Oliva" <roliva@tamu.edu>
Tim,

You might also be interested in:

Oliva R. 2003. Model calibration as a testing strategy for
system dynamics models. European Journal of Operational
Research 151(3): 552-568.

An abstract of the paper and the tools described in it are
available at:

http://iops.tamu.edu/faculty/roliva/res ... ation.html

Regards,

Rogelio
---
Rogelio Oliva
Associate Professor | Ford Supply Chain Fellow
Mays Business School | Wehner 301F - 4217 TAMU | College Station, TX 77843-4217
Posted by "R. Oliva" <roliva@tamu.edu>
posting date Thu, 2 Mar 2006 18:30:16 -0600
yaman barlas ybarlas boun.edu.tr
Junior Member
Posts: 4
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by yaman barlas ybarlas boun.edu.tr »

Posted by yaman barlas <ybarlas@boun.edu.tr>
Hello Tim;
Here are some references (of mine) on model validation that you may find
useful:

1) On conceptual and philosophical aspects of validity and validation:
- "Philosophical Roots of Model Validation: Two Paradigms" (with Stanley Carpenter), System Dynamics Review, Vol. 6, No. 2, 1990, pp. 148-166.
- "Comments on 'On the very idea of a system dynamics model of Kuhnian science'", System Dynamics Review, Vol. 8, No. 1, 1992.

2) On Quantitative BEHAVIOR Testing:
- "Multiple Tests for Validation of System Dynamics Type of Simulation Models", European Journal of Operational Research, Vol. 42, No. 1, 1989, pp. 59-87.
- "An Autocorrelation Test for Output Validation", SIMULATION, Vol. 55, No. 1, 1990, pp. 7-16.
- "A Behavior Validity Testing Software (BTS)" (with H. Topaloğlu and S. Yılankaya), Proceedings of the 15th International System Dynamics Conference, Istanbul, 1997.

3) On STRUCTURE Testing:
- "Tests of Model Behavior That Can Detect Structural Flaws: Demonstration with Simulation Experiments", in Computer-Based Management of Complex Systems (P. M. Milling and E. O. K. Zahn, eds.), 1989, pp. 246-254.
- "A Dynamic Pattern-oriented Test for Model Validation" (with K. Kanar), Proceedings of the 4th Systems Science European Congress, Valencia, Spain, Sept. 1999, pp. 269-286 (under revision, to be submitted to SDR or another journal).
- "Automated dynamic pattern testing, parameter calibration and policy improvement" (with Suat Bog), Proceedings of the International System Dynamics Conference, NY, USA, 2005.

4) Overviews on all aspects of validity and validation
- "Fundamental Aspects and Tests of Model Validity in Simulation", Proceedings of SMC Simulation Multiconference, Phoenix, Arizona, 1995, pp. 488-493.
- "Formal Aspects of Model Validity and Validation in System Dynamics", System Dynamics Review, Vol. 12, No. 3, 1996, pp. 183-210.
- "System Dynamics: Systemic Feedback Modeling for Policy Analysis", in Knowledge for Sustainable Development - An Insight into the Encyclopedia of Life Support Systems, UNESCO-Eolss Publishers, Paris, France, and Oxford, UK, 2002, pp. 1131-1175.

Finally, related to item (2) above, we have the Behavior Testing Software (BTS II), available at our web site; and related to item (3) above, we recently developed a software tool (SiS) that I demonstrated in Boston, but it has a few bugs and should be ready in a couple of months.
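
To give just a flavor of what the quantitative behavior tests in item (2) look at (this is only a schematic sketch with invented series, not the actual BTS procedures), the idea is to compare summary pattern measures of the simulated and observed behavior, such as means, amplitudes of fluctuation, and autocorrelations at selected lags, rather than point-by-point error:

```python
import numpy as np

def pattern_measures(series):
    # Schematic pattern summary (illustrative only, not the BTS statistics):
    # mean, amplitude of fluctuation, and autocorrelation at two lags.
    x = np.asarray(series, dtype=float)
    centered = x - x.mean()
    def autocorr(lag):
        return float(np.corrcoef(centered[:-lag], centered[lag:])[0, 1])
    return {"mean": round(float(x.mean()), 2),
            "amplitude": round(float(x.std()), 2),
            "autocorr_lag1": round(autocorr(1), 2),
            "autocorr_lag4": round(autocorr(4), 2)}

# Made-up "observed" and "simulated" series with similar oscillatory patterns.
rng = np.random.default_rng(1)
t = np.arange(0, 40, 0.25)
observed = 100 + 10 * np.sin(2 * np.pi * t / 8) + rng.normal(0, 2, t.size)
simulated = 102 + 9 * np.sin(2 * np.pi * (t - 0.5) / 8) + rng.normal(0, 2, t.size)

print("observed: ", pattern_measures(observed))
print("simulated:", pattern_measures(simulated))
```

If the two sets of measures agree within reasonable tolerances, the model passes the behavior pattern test even when the point-by-point fit is modest.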

Hope these will be of some use for your questions.
best,
Yaman Barlas
---------------------------------------------------------------------------
Yaman Barlas, Ph.D.
Professor, Industrial Engineering Dept.
Bogazici University,
34342 Bebek, Istanbul, TURKEY
Posted by yaman barlas <ybarlas@boun.edu.tr>
posting date Fri, 3 Mar 2006 19:05:23 +0200
Erling Moxnes Erling.Moxnes geog
Junior Member
Posts: 3
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by Erling Moxnes Erling.Moxnes geog »

Posted by "Erling Moxnes" <Erling.Moxnes@geog.uib.no>
Let me add one simple example to Jack Homer's excellent summary.

In a model of a renewable resource (SDR 20, No. 2, 2004, p. 151),
it turns out that a static model explains the (simulated)
historical development very well. The amount of lichen is found
to be a linear function of the number of reindeer, with the very
impressive t-ratio of 63 for the slope parameter! No polynomials
are needed, only two parameters describing the linear
relationship. A proper dynamic model is not likely to beat this
impressive fit to the data. However, when the static and the
dynamic models are used to recommend policies, the advice differs
greatly, with the static model supporting a disastrous policy. To
generalise, the example illustrates the problem of using overly
simple models in systems with shifting dominance.
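
The flavor of the problem is easy to reproduce with a toy sketch
(made-up parameters, not the model in the SDR paper): generate a
historical period from a simple dynamic lichen-reindeer model,
then fit the static linear relationship to the simulated history
and see how well it does.

```python
import numpy as np

# Toy renewable-resource dynamics with made-up parameters (not the SDR 2004 model):
# lichen regrows logistically and is grazed in proportion to the reindeer herd,
# while the herd is slowly increased over the "historical" period.

def simulate(years=20, dt=0.1):
    lichen, herd = 800.0, 20.0
    regrowth_rate, carrying_capacity, grazing_per_animal = 0.15, 1000.0, 1.0
    herd_path, lichen_path = [], []
    for _ in range(int(years / dt)):
        regrowth = regrowth_rate * lichen * (1 - lichen / carrying_capacity)
        grazing = grazing_per_animal * herd
        lichen = max(lichen + dt * (regrowth - grazing), 0.0)
        herd += dt * 1.5                    # herd grows by 1.5 animals per year
        herd_path.append(herd)
        lichen_path.append(lichen)
    return np.array(herd_path), np.array(lichen_path)

herd, lichen = simulate()

# The static model: lichen as a linear function of herd size, fitted to history.
slope, intercept = np.polyfit(herd, lichen, 1)
residuals = lichen - (intercept + slope * herd)
r_squared = 1 - (residuals ** 2).sum() / ((lichen - lichen.mean()) ** 2).sum()
print(f"Static fit: lichen = {intercept:.0f} {slope:+.1f} * herd,  R^2 = {r_squared:.3f}")
```

The static equation can track the simulated history closely, yet
it carries no information about the collapse that sets in once
grazing exceeds the maximum regrowth rate, which is exactly what
matters for policy.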

The above critique of statistical tests could also be launched
from the point of view of statistics itself: when using
statistical methods, there is no excuse for using inappropriate
models - a healthy perspective here is provided by Bayesian
statistics.

Erling Moxnes
Posted by "Erling Moxnes" <Erling.Moxnes@geog.uib.no>
posting date Mon, 6 Mar 2006 11:01:02 +0100
John Barton jabarton ozemail.com
Newbie
Posts: 1
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by John Barton jabarton ozemail.com »

Posted by "John Barton" <jabarton@ozemail.com.au>
Tim

Thank you for again raising this critical issue and so generating
a great reference list on this topic.

Another way of viewing "validation" is to realize that this term
frames the debate within the domain of deductive logic and its
related objectivist and refutationist positions. However, managers
do not act on the basis of well-established hypotheses. Instead,
they act on the basis of what they perceive as the best
hypothesis available within their "community of inquiry". SD
modelling, with its checks and balances (triangulation), provides
a rigorous approach to establishing such a hypothesis using the
events-patterns-structure framework. This constitutes abductive
reasoning, not deductive reasoning.

On taking action, the manager, as participant/observer, monitors
the implementation of the strategy and actively intervenes "to
make it happen".

Using abduction (the method of hypothesis) significantly reframes
the debate from validation to triangulation of the hypothesis and
emphasises the importance of a "community of inquiry", much as
described by John Sterman and others. This shifts the emphasis
from "validating" a hypothesis to evaluating the strategy from an
open-systems perspective.

(A paper detailing this argument has been submitted to the Nijmegen
conference).

Abductive inference is not new - it was part of the Greek
dialectic, but it was reconstituted by the American pragmatist
philosopher Charles Sanders Peirce (1839-1914) and forms the
basis of Dewey's method of reason.

You may wish to follow up on Thomas Powell (2001), "Competitive
Advantage: Logical and Philosophical Considerations", Strategic
Management Journal, Vol. 22: 875-888. Powell argues that
competitive advantage is pragmatic, inductive inference.

John Barton

John Barton Consulting (& Monash University)
Office: Suite 4, 24 Bay Road
Sandringham
Victoria, 3191, Australia
Mailing Address: 2 Arthur Street
Sandringham, Victoria, 3191
Office Tel: +61 3 9598 7061
Posted by "John Barton" <jabarton@ozemail.com.au>
posting date Wed, 8 Mar 2006 15:05:27 +1100
Geoff McDonnell gmcdonne bigpond
Junior Member
Posts: 10
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by Geoff McDonnell gmcdonne bigpond »

Posted by "Geoff McDonnell" <gmcdonne@bigpond.net.au>
John, your words below reminded me of Gary Klein's Sources of Power work at
http://www.decisionmaking.com/approach/ ... power.html
He talks about the way highly time-stressed experts make decisions (especially in medical, military, and fire-fighting settings, using abductive inference, IMHO). This book had a profound effect on the way I now think about clinical and health policy decision support. As one review states (Richard I. Cook MD, Focus on Patient Safety, review of Sources of Power, 1998; see page 2 at the site link): "With his colleagues, Klein has spent the past two decades observing people doing mental work in order to discover how they cope with demands of the workplace. What are the processes of decision making? How do people deal with uncertainty and risk? How is it that experts are able to discern subtle cues and do just the right thing in situations where novices fail? Sources of Power is a marvelous summation of Klein's long experience. It may also be the most readable and coherent description of what is presently known about how human cognition works in the real world."

Posted by "Geoff McDonnell" <gmcdonne@bigpond.net.au>
posting date Thu, 9 Mar 2006 02:13:50 +1100
Mike Fletcher mefletcher gmail.c
Junior Member
Posts: 4
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by Mike Fletcher mefletcher gmail.c »

Posted by "Mike Fletcher" <mefletcher@gmail.com>
Klein's book is good, and I too found it very interesting, but I think it's important to point out that his conclusions probably only hold for expert judgement within the expert's specific area of expertise. Most of his studies show experts working in their specific domain. The decision making of experts within their field probably holds up reasonably well, as Klein argues, with some limitations. (Experts, for example, can miss new patterns that a novice might see, because the experts are too busy looking for old patterns.)

In the real world most decisions are made by people who think they are experts but really aren't, or are only expert in a small portion of the problem! That is, decisions are routinely made using Klein's "expert model" but without the required expertise!

Overall I'm somewhat alarmed by several of what I would term the "natural decision making" books, which have been quite popular of late. "The Wisdom of Crowds" is another similar title, discussed earlier on this list.

I think it's extremely risky to profess that there are relatively easy prescriptions available for making hard decisions about complex problems! Several authors, and probably Klein to a small degree, risk becoming "prescriptive." The implicit message is that all the pessimism about "bounded rationality" is no reason for a permanent state of decision despair: we don't have to apply real critical thinking or rigorous analytic processes to our decisions, because a prescription is available. That is risky thinking. There might be easy solutions to complex problems, but finding them is usually hard, takes time, and will likely involve costly mistakes.

Crowds do indeed know more than any individual member of the crowd, but their wisdom might simply amount to the sum of their ignorance. That is, everyone can be wrong, and the sum total of ideas might still be insufficient. Experts generally make good decisions within the narrow boundaries of their field, but they may be totally incapable of making effective decisions elsewhere.

--
------------------------------------------------------
Michael E. Fletcher
Posted by "Mike Fletcher" <mefletcher@gmail.com>
posting date Sun, 12 Mar 2006 15:21:08 -0500
Mike Fletcher mefletcher gmail.c
Junior Member
Posts: 4
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by Mike Fletcher mefletcher gmail.c »

Posted by "Mike Fletcher" <mefletcher@gmail.com>
I would like to add some comments regarding your reference to abductive inference, "triangulation", and model validation. Many published definitions of abductive inference are a bit vague, and that has led to some confusion. Just to be clear, and hoping our definitions are the same, I want to provide my current working definition. Schum's short definition is: something "plausibly true."

Peirce's "Beanbag" definition:
Rule: All the beans from this bag are white.
Result: These beans are white.
Case: These beans are from this bag.

Based on these definitions, triangulation is a good description. The analogy one could draw here is that the "truth" is the point at the center of a circle, and that all inquiry begins on the circumference. Luckily, due to the nature of circles, the distance to the truth is equal from any point on the circumference. We triangulate, and what results is a region, possibly an ellipse, that gives a rough estimate of where the "truth" lies.

So what does this have to do with model validation? Unfortunately, validation, as it is often practiced, isn't very abductive.
Validation is often practiced the way societies practice rituals. The ritual originally served some valid purpose, but often, as time goes by, that purpose gets lost in the rote execution of the ritual. Validation performed only at the very end of the procedure rather misses the point. We should be learning throughout the entire modeling process, and saving validation for the very end means that not much is going to be learned. The "proof" at the end isn't likely to be very abductive, divergent, or generative, or to help us expand our mental models.

Validation isn't just throwing a bit of "rigorous math" in at the end to "prove" we are right and all our work brilliant! As my SD professors at WPI commented, validation is a continual process of learning. If we do validation that way, it is likely to be abductive.

Abductive inference is inherently divergent and generative, and is therefore extremely helpful in bringing new information into the debate and thus expanding mental models (which, after all, should be our true goal). Information is not evidence until it is attached to a hypothesis, so multiple hypotheses are likely to bring in new information. Information that is vital but not relevant to the current hypothesis may be discounted, so entertaining multiple hypotheses for a question is usually a wise course.

Abduction can also be part of an analysis of competing hypotheses (summarized as: accepting as the working hypothesis the one with the least inconsistent evidence). We should probably look at each iteration of the model as a hypothesis. All the model iterations we throw into the dustbin are just as important to validation as (and perhaps more important than) the "proof" at the end. They certainly are critical to the learning process. I should probably also comment, as others no doubt have, that validation is not really a good word for what is going on, since we are really building confidence and "goodness to purpose," not proving truth.



--
------------------------------------------------------
Michael E. Fletcher
Posted by "Mike Fletcher" <mefletcher@gmail.com>
posting date Sun, 12 Mar 2006 16:18:07 -0500
George Richardson gpr albany.edu
Junior Member
Posts: 6
Joined: Fri Mar 29, 2002 3:39 am

Explaining Validation

Post by George Richardson gpr albany.edu »

Posted by George Richardson <gpr@albany.edu>
I'm a little late seeing these posts, but if it helps anyone,
there's a PowerPoint presentation among my papers online at
http://www.albany.edu/~gpr/Papers.html containing the slides
from a talk on Validation as an Integrated Social Process.
The slides are given without explanatory notes, so let me know
if you need explanations.

..George

George P. Richardson
Chair of public administration and policy
Rockefeller College of Public Affairs and Policy
University at Albany - SUNY, Albany, NY 12222
Posted by George Richardson <gpr@albany.edu>
posting date Sun, 12 Mar 2006 14:50:45 -0500