People have brought up many threads in this discussion. I will just focus
on what I think are some key aspects of the issues that Jim raises. This
will not be a politically correct, balanced perspective. (I have full
sympathy for L. L. Lauble's entry.)
To start, there are no first principles in social science. We have already
noted there are none in science either. (John Sterman brought up the
example of Newton versus Einstein versus string theory, and so on.) From Gödel,
we know such a foundation does not even exist for mathematics. There is no
"truth." ALL models are wrong, but all we have, all anybody has, are models
(mental or formal) of everything. We want a model to be useful. What does "useful"
mean? It means (I suggest) that there has to be confidence that the
recommendations implied in the model results would produce an outcome more
satisfactory than the current (dynamic) REAL WORLD condition. Otherwise,
why waste the time? (A hidden assumption behind this assertion is
rationality. Carl Sagan has argued that being rational is our only
choice in his "The Demon-Haunted World." I personally take the long view.
Darwin will ultimately sort out the rational from the
irrational --
http://www.darwinawards.com/ . )
The confidence in the hypotheses or our models comes not from proving what
is "right." The only choice is to winnow out the wrong (as best we can) and
make progress by examining the hypotheses that seem to have the best chance
of not being so wrong that they are not useful. (Get used to the double
negatives. They don't make a positive. Karl Popper is still the key
starting reference for this logic.) Thus, we have nothing useful unless we
have some basis (corroboration) that it *might* be true - or, more easily,
that it passes tests that try to prove it wrong. If you can't come up with
a test to prove it wrong, it is not even a legitimate hypothesis and can
only lead, by definition, to a useless end. This is obvious from a
rationality perspective. John Sterman has, again, discussed this at length.
We seem to be polarizing ourselves in this discussion for no good purpose.
We talk of the sacred stories and anecdotes. We rail against the
over-powerful statistics in all its abstract glory. We want to fall back to
the supposed security of experimental science. We want the freedom to
assume true what we "know" rings true. Do we just want to be a designer
cult? Give me a story. I make a model. It is the new religion. Whatever
you want. Come one, come all. Hope, want, and belief do not make things so.
If everything were black and white, we would not need to discuss these topics.
We are prolific in opinion and wanting in substance. We believe we are the
first to have these thoughts and care none to learn what others before us
have done to explore these topics.
More to the point, we need to connect words with purpose and meaning. By
statistics we mean using data to increase the confidence in our hypotheses.
Using statistics does not necessarily mean using regression and R-squares.
It can mean experimental designs of our model runs with non-parametric
statistical comparisons of what the model produces relative to what the data
indicate the real world, which we are attempting to model, produces (a small
sketch of such a comparison appears after this paragraph). This thread
is not a discussion of two equals -- stories versus statistics, where we
"choose" one or both. All of us are just repositories of misperceptions (me
included). John Sterman and Linda Booth Sweeney have reams of DATA to show
this - albeit to a less extreme point than the one I am venting. The human
mind is very limited. Its senses and perceptions are colored by experiences
that we necessarily misunderstand, at least in part. Data are a record of the
actual events. Like all else, they are far from perfect, but the use of data is
the only connection we have between what we think "is" and what the
objective world "does." Jim Hines wants to make dichotomies that I don't
think exist. There is no perfect approach, but there are approaches that
are more formal and rigorous (that is rational) than others. An approach is
irrational until proven rational. It is NOT our choice. Running
hypothetical models can tell us how equations work and provide valid
experimental platforms to understand dynamics. They don't tell us whether
those results have any relevance to the real world. Even if we make
experiments that could be like those of basic science (which we can't), we
can never know whether we have proven anything. We can only increase
confidence or falsify our current thinking.
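To make the non-parametric comparison mentioned above concrete, here is a
minimal sketch in Python. Everything in it is a hypothetical stand-in: the
observed and simulated series are invented, and the particular tests (Spearman
rank correlation and a two-sample Kolmogorov-Smirnov test) are just one
reasonable pairing, not the only choice.

    # Minimal, hypothetical sketch: non-parametric comparison of model output
    # against observed data. Both series below are invented for illustration.
    import numpy as np
    from scipy import stats

    observed  = np.array([102.0, 110.0, 125.0, 118.0, 130.0, 142.0, 138.0, 150.0])
    simulated = np.array([100.0, 112.0, 121.0, 120.0, 133.0, 140.0, 141.0, 148.0])

    # Rank correlation: do the model and the data rise and fall together?
    rho, rho_p = stats.spearmanr(observed, simulated)

    # Two-sample Kolmogorov-Smirnov: are the value distributions compatible?
    ks_stat, ks_p = stats.ks_2samp(observed, simulated)

    print(f"Spearman rho = {rho:.2f} (p = {rho_p:.3f})")
    print(f"KS statistic = {ks_stat:.2f} (p = {ks_p:.3f})")
    # A damning result here counts as evidence against the model (refutation),
    # which is the only kind of evidence we can ever get.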
Anecdotes and stories are only useful to develop hypotheses to be tested. I
think the data shows that these stories and anecdotes have logic flaws that
render 99.999% of the hypotheses wrong. (I am sincerely being an optimist
on how many 9s I used.) R. Dawes is a good place to start on showing
(proving?) how well we deceive ourselves (Dawes, R., Rational Choice in an
Uncertain World, Harcourt Brace Jovanovich, San Francisco, 1988). Michael
Shermer makes a living showing our inability to convert experience into
rational conclusions (e.g., Michael Shermer, Why People Believe Weird Things,
1997, Freeman and Company, NY). There are also numerous books on social
delusions, mass hysteria, mass panics, and, what relates to our work, group
delusion - as in group modeling. I will give an example shortly to make
this point meaningful.
As Erling notes, the Bayesian approach can connect "beliefs" to hard data.
Stories and anecdotes are primarily (exclusively?) made of beliefs. There
is much written about the Bayesian approach. A starting book might be
"Scientific Reasoning: the Bayesian Approach" by Colin Howson and Peter
Urbach (Open Court, 1993). In general, the "beliefs" come from experts, and
not from producers of anecdotes. The Bayesian twist is often to help avoid
testing bad hypotheses by having the expert explain what does not make
sense. The weighting, however, is still not zero-one, in that you may only
partially believe the experts (they have the picture wrong too) or you
(typically) have multiple experts with mutually exclusive positions (just as
you often have multiple data sources that are mutually inconsistent).
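Since the Bayesian point above is abstract, here is a toy sketch of what
updating an expert "belief" with hard data can look like. The Beta prior, the
counts, and the interpretation are all invented for illustration; they are not
taken from any of the references cited.

    # Toy, hypothetical sketch: Bayesian updating of an expert belief with data.
    # A Beta prior encodes the expert's partial belief that a proposed causal
    # mechanism is at work; binomial data (periods in which the predicted
    # behavior actually showed up) update that belief. All numbers are invented.
    from scipy import stats

    prior_alpha, prior_beta = 4.0, 2.0   # expert leans toward "mechanism present" (~0.67)
    successes, trials = 3, 12            # data: behavior observed in 3 of 12 periods

    post_alpha = prior_alpha + successes
    post_beta  = prior_beta + (trials - successes)
    posterior  = stats.beta(post_alpha, post_beta)

    print(f"prior mean belief:     {prior_alpha / (prior_alpha + prior_beta):.2f}")
    print(f"posterior mean belief: {posterior.mean():.2f}")
    # The data pull the expert's belief down, but neither the expert nor the
    # data gets a zero-one weight.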
Camilo brings up an important new thread - although maybe in a different
vein than intended. Clark Glymour comes to the table trying to discover
causation (Clark Glymour, Computation, Causation, and Discovery, 1999, MIT
Press). He is very much a statistician. What he really argues, as should
we, is that statistics are mostly valuable to us for aiding our
understanding of causality. Causality should be intrinsic to our
hypothesis formation - those wonderful levels and rates embedded within
feedback loops. The statistical use of data should be looking to refute or
support a causal explanation. This perspective is consistent with the
quotes that Camilo notes and, I think, more consistent with Glymour's
intent. Conveniently, this week's The Economist ("Signifying Nothing," page
76, January 31, 2004 edition) notes how economists misuse statistics. The
prime complaint is that statistics are allowed to provide conclusions
without theory (i.e., without a causal hypothesis). I propose that we would
(or should) agree that, devoid of causal meaning, statistical implications
mean nothing. We really do need to focus on the "why." (And maybe we do need
to search for better ways to make SD students and practitioners more readily
embrace that logic.)
We could probably show (as maybe others have) that even when limiting the
degrees of freedom to maintain "statistical quality," you can always make an
a-theoretical equation that produces statistically impeccable, but causally
meaningless, results. I am arguing that we need statistics to keep us
honest, and we need TESTED hypotheses to maintain any sense of causal
meaning. We need the "why" -- that causal meaning -- without which we cease
to be system dynamists.
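As a hedged illustration of that claim, the sketch below regresses one
simulated random walk on another, completely independent one. The series,
seed, and sample size are arbitrary; the point is only that an a-theoretical
equation can produce an impressive R-squared with zero causal content (the
classic "spurious regression" result).

    # Hypothetical sketch: a statistically "impeccable" but causally meaningless fit.
    # Two independent random walks have no causal connection, yet regressing one
    # on the other typically yields a large R-squared.
    import numpy as np

    rng = np.random.default_rng(seed=1)
    n = 200
    x = np.cumsum(rng.normal(size=n))    # random walk, unrelated to y
    y = np.cumsum(rng.normal(size=n))    # independent random walk

    X = np.column_stack([np.ones(n), x])          # OLS of y on x, with intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    r_squared = 1.0 - residuals.var() / y.var()

    print(f"R-squared = {r_squared:.2f}")
    # Often "significant" looking, yet there is no causal hypothesis here
    # at all - which is exactly the complaint.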
Lastly, Jim Hines is right to ask for data on statistics versus stories.
Here is a small collection. As John notes, there are so many, it is hard to
pick a few. I chose those that brought the point home to me. Certainly,
Alan Graham, a long history of PA Consulting folks (formerly Pugh Roberts),
Bob Eberlein, and a myriad of others have more relevant and numerous
examples.
In the late 1970s we had the oil crises. Roger Naill and I were visiting
oil companies to determine how they invested in new energy projects. The
CFOs and comptrollers invariably explained that they used "hurdle" rates (the
then in-vogue management fashion). There would be no investment unless the
project would earn 18% ROI. Other than during the actual crises, the
companies would earn between 7% and 12% (using the same accounting procedure
as used for the "hurdle rate.") What the data indicated was that investment
was driven by growth expectations derived from departments asking for budgets
to maintain their importance and "turf." The budgets they felt they could
request depended on the past ROI. The projects were culled and selected by
an executive committee. The results looked (as expected) like a
Qualitative Choice Theory response (Daniel McFadden, 2000 Nobel Prize in
Economics). When faced with this data and the model results, the oil
companies agreed that this made much more sense and were glad we put a myth
(which they always suspected was false) to rest.
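For readers unfamiliar with the Qualitative Choice Theory response mentioned
above, here is a hypothetical sketch of the shape involved: the probability
that the executive committee funds a project rises smoothly (an S-curve) with
expected return rather than jumping at a hard 18% hurdle. The reference return
and the sensitivity parameter are invented, not values from the original work.

    # Hypothetical binary-logit (Qualitative Choice Theory) response: a smooth
    # S-shaped funding probability rather than a hard hurdle-rate cutoff.
    # The parameters below are illustrative only.
    import math

    def funding_probability(expected_roi, reference_roi=0.10, sensitivity=40.0):
        """Binary logit response around a reference return."""
        return 1.0 / (1.0 + math.exp(-sensitivity * (expected_roi - reference_roi)))

    for roi in (0.06, 0.09, 0.12, 0.18):
        print(f"ROI {roi:.0%}: funding probability {funding_probability(roi):.2f}")
    # Contrast with a pure hurdle rate, which would give probability 0 below
    # 18% and 1 above it.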
In the electric industry of the 1970s there was only growth on the horizon.
"Build them there nukes as fast as you can." Up until recently, the
electric industry remained one of the few purist optimization shops.
Everything is "perfect." The forecasters do a perfect job in estimating
future demands (using naive statistics), and the linear programming models
show the perfect capacity expansion plan. And that is what the utility
builds... except it doesn't. The board of directors (or the senior
management) has a fair bit of trepidation about putting a $billion or so
on the line due to results from obtuse techniques they really don't
understand. They use their "experience and wisdom" to guide the final
decision. As John Sterman showed, this experience is exponential smoothing
of actual growth-rate information (see the figure on page 641 of John's
Business Dynamics). All the planning departments below the board had little impact
on final decisions. They were all shocked. The management realization was
to go to the other extreme and get rid of forecasting and the planning
departments. Few electric utilities now have such departments.
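A minimal sketch of the mechanism John Sterman documented might look like the
following: management "experience" behaving as first-order exponential
smoothing of recent growth rates. The growth-rate series and the smoothing
time are invented for illustration; see the Business Dynamics figure cited
above for the real evidence.

    # Hypothetical sketch: "experience and wisdom" as first-order exponential
    # smoothing of actual growth-rate information. All numbers are invented.
    def exponential_smoothing(series, smoothing_time=4.0, dt=1.0):
        """First-order exponential smoothing of a time series."""
        smoothed = [series[0]]
        for value in series[1:]:
            previous = smoothed[-1]
            smoothed.append(previous + (dt / smoothing_time) * (value - previous))
        return smoothed

    actual_growth = [0.07, 0.07, 0.06, 0.04, 0.02, 0.01, 0.00]   # demand growth collapsing
    perceived_growth = exponential_smoothing(actual_growth)

    for actual, perceived in zip(actual_growth, perceived_growth):
        print(f"actual {actual:.2f}   perceived {perceived:.2f}")
    # The perceived growth rate lags the actual decline, which is how capacity
    # keeps getting ordered after growth has already fallen off.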
As another minor utility "story," there is the problem of having an
integrated model with the physical and financial dynamics as well as the
competitors and consumers. In our case, we initially made the dumb
assumption that revenues equal sales multiplied by price plus other income.
That works great if you look at the financial books, the regulatory books,
or the production books. But each used different prices and different
sales. There is often a long delay between the time a utility produces the
energy and receives the money. There are convolutions as to when it counts
the money in a regulatory, tax, or a financial sense. When integrated in a
model, with lots of prices and delayed sales, the picture of how a utility
works looks much different, and the hitherto unexplainable fluctuations in
the income of a regulated utility make sense from the internal processes
rather than from the conventional blaming of the regulatory regime. Everybody
in the utility had their own local story from their perspective. They
could not understand the feedback loops without a model any more than we
could. They were further blinded by their nearsightedness. The stories
(myths) fit their needs, not reality. Their stories did give us hypotheses
to test and refute. We then had more confidence in the hypotheses that
survived.
In a high-tech setting we actually did the classic group modeling. (I
usually use a strawman-model approach to take advantage of the ease with
which people can refute as opposed to make hypotheses.) Their stories fit
well together. If there were discrepancies, the "higher" up departments
"corrected" the story and all agreed with a "oh ya" thats how it must
work." All VPs and the CEO were 100% approving of this representation for
the company. They knew! Among a host of refuted myths, one was that
production was based on orders. The (simple) statistics found little
correlation, even though the eye wanted to see one. After questioning the
production line staff and field sales-staff, we discovered that the company
liked to roll out new products at the end of the year. The customers knew
they could wait until the end of the year for deep discounts. Marketing
knew it needed hype for the roll out to make customers eager to get the new
units and that it needed "salesman incentives" to keep the sales staff
pushing during the mid-year fall-off. Many of these sales motivations would
give the sales staff an incentive to just get the product onto the customers'
sites (to be returned later). Thus "sales" and the sales forecast had
little to do with what the production line really needed to produce. The
production line simply followed the inventory. They threw out the orders
and the forecasts. They kept the company afloat. Everybody in the "model
development group" - all managers - suffered group delusion.
As a side note to the high-tech story, the company collected lots of data.
Data overload city. Staff could only tolerate the quarterly summaries,
because they could not deal with the daily (hourly) fluctuations in the
data. Humans have an overpowering urge to see patterns (even ones that are
not there). The quarterly data balanced the books. There was great comfort
in the quarterly data, but the critical time constants of the system were
on the order of 2 to 6 weeks in many cases. The sales cycle combined with
these time constants meant that internal statistical analyses never made
even causal sense, but they were used anyway. We could show via the model
how these wrongly interpreted data could be produced and what variables had
to be sampled more frequently to allow meaningful decisions. The actual
point here is that bad data is still data. Understanding why it is bad or
how the dynamics shown could be produced, directs our search for the most
valid (most confidence) representation. Unless it is incompetently measured
data, I argue all data has information that can be used constructively - for
refutation of hypotheses.
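To illustrate the sampling point in that last paragraph, here is a
hypothetical sketch: a cycle with a roughly five-week period dominates the
weekly series but nearly disappears when the data are rolled up into 13-week
quarterly averages. The amplitudes, noise level, and period are invented, not
the company's actual numbers.

    # Hypothetical sketch: quarterly summaries hiding 2-to-6-week dynamics.
    # A ~5-week cycle swamps the weekly data but is almost invisible quarterly.
    import numpy as np

    rng = np.random.default_rng(seed=7)
    weeks = np.arange(104)                                  # two years of weekly data
    weekly = 100 + 20 * np.sin(2 * np.pi * weeks / 5.0) + rng.normal(0, 5, size=weeks.size)

    quarterly = weekly.reshape(-1, 13).mean(axis=1)         # 13-week quarterly averages

    print(f"weekly    swing: {weekly.max() - weekly.min():6.1f}")
    print(f"quarterly swing: {quarterly.max() - quarterly.min():6.1f}")
    # The week-to-week swings are large; the quarterly averages look placid, so
    # statistics done only on the quarterly data cannot recover the short time
    # constants that actually drive the system.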
Lastly (before I fill a book), the US Department of Energy collects energy
data. It is actually relatively easy to collect data at the regional level by
looking at in-flows and out-flows over a year. Inventory noise tends to
balance out. It gets tough when there are small states in a region. You
can't afford to sample more when the impact is minimal to the region or to
the more critical big states. The typical approach is to accept the small-state
data and subtract their sum from the regional value to obtain the
number for the big state. It is big, and the "small" state errors won't have
much of an effect. When we modeled the region (New England), the small
state data "held together" but Massachusetts showed historical dynamics the
model could not produce. The dynamics were significant and would have big
implications for policy. There were few stories that even tried to explain
the data. We could make hypotheses for each data point. There was no
generalizable causal theory. We could not believe that history was dominated
by a multitude of one-time random events. We went back and smoothed the
data to estimate parameters. We then ran all the states and produced a new
history consistent with the regional total. All the small states made sense
within the error bounds of the collected statistics while Massachusetts now
made sense. A follow-up (small) survey indicated that the data the model
produced were more in line with what the actual state data would indicate.
Thus, the model, to our and the client's satisfaction, showed the data to be
wrong but "correctable." Was this valid? The model and the data supported
the adjustments - and the model results and data could be understood. There
was a high degree of confidence in the model and the model was useful. We
were most assuredly "wrong." By using causal hypotheses and data, we could
bound how wrong we were and how confident we could be.
In summary, stories and anecdotes are not data. We all make up stories to
fit our needs. They are (ill-informed) mental models and ALL models are
wrong. Stories are a valid source of hypotheses that we can test - but have
no reason to believe a priori. Any meaningful data is better than no data.
Without data, there is no ability to have any confidence in the hypothesis.
We must run the model and use statistics (data). The process produces bad
results, but they are better than any others.
George
George Backus
Policy Assessment Corporation
14602 West 62nd Place
Arvada, CO 80004
Bus: 303-467-3566
Fax: 303-467-3576
Email: George_Backus@ENERGY2020.com