QUERY Beer Game Simulation Performance Metrics

Duggan James james.duggan nuigal · Sat Sep 16, 2006 4:56 pm

Posted by "Duggan, James" <james.duggan@nuigalway.ie> Hi,

I'd be interested in views regarding metrics for measuring individual performance in a Beer Game simulation. My rationale is to find an objective measure that identifies the best player in a simulated beer game. As I see it, there are three options:

1) Cost (based on, for example, $2.00 for stockouts and $0.50 for inventory cost)

2) Distance from inventory goal, based on a sum-of-squares distance of each individual from their desired inventory.

3) Number of stockouts
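
A minimal sketch of how the three candidate measures above could be computed from one player's weekly net inventory. Several things here are assumptions rather than anything stated in the post: negative net inventory is read as backlog, the example costs are applied per case per week, a "stockout" is counted as any week that ends in backlog, and the function name and sample data are made up for illustration.

```python
def beer_game_metrics(net_inventory, desired_inventory=0.0,
                      stockout_cost=2.00, holding_cost=0.50):
    """Return (total_cost, squared_distance, stockout_weeks) for one player.

    net_inventory: weekly net inventory, negative values meaning backlog
    (an assumption of this sketch).
    """
    total_cost = 0.0
    squared_distance = 0.0
    stockout_weeks = 0
    for level in net_inventory:
        if level < 0:
            # Backlog: charge the stockout cost per case short.
            total_cost += stockout_cost * (-level)
            stockout_weeks += 1
        else:
            # Stock on hand: charge the holding cost per case.
            total_cost += holding_cost * level
        # Option 2: squared distance from the desired inventory.
        squared_distance += (level - desired_inventory) ** 2
    return total_cost, squared_distance, stockout_weeks


# Hypothetical seven-week record for a single position.
weekly_net_inventory = [12, 8, 3, -2, -5, 0, 4]
cost, distance, stockouts = beer_game_metrics(weekly_net_inventory,
                                              desired_inventory=4)
print(cost, distance, stockouts)
```

Computing all three on the same record makes it easy to see how far the rankings they produce agree or diverge across players.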

Are there others that people can think of, or which of the above three would they consider most suitable?

regards,
Jim.

________________________________
Dr. Jim Duggan, Chartered Engineer,
Department of Information Technology,
National University of Ireland, Galway.
IRELAND.
Posted by "Duggan, James" <james.duggan@nuigalway.ie> posting date Fri, 15 Sep 2006 12:36:47 +0100

Erling Moxnes Erling.Moxnes geog · Mon Sep 18, 2006 1:26 pm

Posted by "Erling Moxnes" <Erling.Moxnes@geog.uib.no> The three measures you suggest all measure both how well subjects do and how lucky they are. If for instance the factory stops producing beer, all players will experience stockouts, no matter how clever they are. Similarly, if the retailer suddenly places a huge order, the upstream players will experience stockouts.

Hence, if the goal is to identify the "best player", you need a more advanced metric that considers player strategies.

Erling Moxnes

----------------------
Erling Moxnes
System Dynamics Group
Dept. of Geography
University of Bergen
Posted by "Erling Moxnes" <Erling.Moxnes@geog.uib.no> posting date Mon, 18 Sep 2006 09:54:55 +0200

John Sterman jsterman MIT.EDU · Mon Sep 18, 2006 1:29 pm

Posted by John Sterman <jsterman@MIT.EDU> The metric for evaluating individual performance should be the same as the metric used to determine the reward individuals receive. Anything else is not incentive compatible. So if individuals are rewarded (paid) according to the total costs of their supply chain, then that should be the metric used. You would not be surprised to find that if you reward someone for x but then measure how well they did achieving y that they didn't do too well on y. Of course, x and y may be correlated (e.g., inventory and stockout cost will be correlated to the mean square deviation between actual and desired net inventory).

--
John Sterman
Jay W. Forrester Professor of Management
Director, MIT System Dynamics Group
MIT Sloan School of Management
E53-351
30 Wadsworth Street
Cambridge, MA 02142
Posted by John Sterman <jsterman@MIT.EDU> posting date Sun, 17 Sep 2006 10:57:17 -0400

Carl Betterton carlb uga.edu · Mon Sep 18, 2006 1:36 pm

Posted by Carl Betterton <carlb@uga.edu> Jim,

Several commercial business simulators permit a range of performance measures similar to those you list, but also include the option for a weighted combination of measures. This, I think, is in recognition of the difficulty of choosing any one "best" measure. Also, you can consider letting players choose (before playing) the performance measure they will use.

Best regards,

Carl
Posted by Carl Betterton <carlb@uga.edu> posting date Sun, 17 Sep 2006 10:53:51 -0400

Thompson James. P (Jim) A142 Jim · Tue Sep 19, 2006 2:19 pm

Posted by "Thompson, James. P (Jim) A142" <Jim.Thompson@CIGNA.COM> Regarding performance metrics for the Beer Game, Carl Betterton notes that facilitators may want to "consider letting players choose (before playing) the performance measure they will use."

It would be interesting to have a record of how and what people learn from playing the Beer Game. From a constructivist learning perspective, for example, players should be asked to choose, before the game, which performance measures they will use, and then to record how their choices change as a result of playing the game and reflecting on their experience.

Jim Thompson
Economic & Operations Research
Cigna HealthCare
900 Cottage Grove Road, A142
Hartford, CT 06152
Posted by "Thompson, James. P (Jim) A142" <Jim.Thompson@CIGNA.COM> posting date Mon, 18 Sep 2006 10:21:42 -0400

John Gunkler jgunkler sprintmail · Tue Sep 19, 2006 2:19 pm

Posted by "John Gunkler" <jgunkler@sprintmail.com> To be realistic about it, taking John Sterman's advice into account, the metrics that matter are measures of impact on customers -- with the resulting impact on the player. So, while it would require a bit of work, I would suggest devising a metric, or metrics, based on what happens to each player's "customer." You can either do this thoroughly, adding to the dynamic model, or use approximations.

Adding to the Model:

For the retailer, you would need to have some model of how beer buyers react to stock outs. This model would take into account their immediate changes in buying behavior, their long-term changes in buying behavior [long-term, here, meaning the duration of the game], and the word-of-mouth effect on other buyers (a well-documented parameter, in other contexts, is that a disappointed customer tells 11 other customers; but then you must model the effect of negative word-of-mouth on the short-term and long-term buying behavior of these customers). Then you would have to include a computation of what these changes in behavior did to the revenues of the retailer.

For everyone else it gets a bit trickier because everyone else has two customers: the immediate next step downstream (toward the beer buyer) in the supply chain plus the ultimate beer buyer. It seems more elegant to combine these two into some measure of how the beer buyers' behavior affects the next step downstream, but my work with process improvement suggests that it is more instructive to actually keep the two separate and look at both. So, you would need to model how failure to deliver enough product affects the behavior (both short-term and long-term) of the next step downstream and how that affects the supplier -- including a fraction representing the chance they will switch to another supplier and you'll lose their business.

Note: Some may protest that if you're, say, three steps removed from the ultimate customer you should pay attention to what happens to your direct customer, to their direct customer, and to the ultimate customer. But that is unnecessary and needlessly complex. The "Everybody upstream has two customers" method seems to work well in real-life situations.

Approximations:

For upstream suppliers (distributor, warehouse, brewery), to handle the immediate customer (next step downstream) you might simply track cash flow from the next step downstream (compared to "best case" cash flow -- report it as a "Percentage of Possible Cash Flow"), plus an expected lost value calculation to take into account the fraction representing the chance they will switch to another supplier (reported as something like "$ Weekly Cash Flow Put at Risk"). This loss fraction could be a simple table, or graph function, relating back orders to likelihood of switching suppliers, to keep it relatively simple.

For measuring the ultimate customer, "Dollar Value of Expected Sales Not Made" would capture the short-term impact. A table or graph function (like the one mentioned in the paragraph above) could capture long-term impact, relating, say, "# sales not made" to a customer loss fraction, then multiplying the customer loss fraction by the "weekly $ value of a customer." You could account for the word-of-mouth effect by steepening the graph relating sales not made to customer losses.
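
The table (graph) function approach described in the approximations above might be sketched as follows. Everything specific here is hypothetical and chosen only for illustration: the breakpoints in `SWITCH_TABLE`, the function names, and the sample figures. The sketch assumes linear interpolation between table points and clamping at both ends, which is how table functions commonly behave in system dynamics tools.

```python
def table_lookup(x, points):
    """Piecewise-linear interpolation through sorted (x, y) points,
    clamped to the first and last y values outside the table range."""
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]


# Hypothetical table: back orders (cases) -> chance the customer switches supplier.
SWITCH_TABLE = [(0, 0.0), (4, 0.05), (8, 0.20), (16, 0.50)]


def cash_flow_at_risk(back_orders, weekly_cash_flow, table=SWITCH_TABLE):
    """Expected lost value: switching fraction times weekly cash flow."""
    return table_lookup(back_orders, table) * weekly_cash_flow


def percent_of_possible_cash_flow(actual_cash_flow, best_case_cash_flow):
    """Actual cash flow as a percentage of the best-case cash flow."""
    return 100.0 * actual_cash_flow / best_case_cash_flow


print(cash_flow_at_risk(back_orders=6, weekly_cash_flow=400.0))
print(percent_of_possible_cash_flow(300.0, 400.0))
```

Steepening the table, as suggested for the word-of-mouth effect, would only mean raising the y values in `SWITCH_TABLE`; the lookup code itself would not change.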

John
Posted by "John Gunkler" <jgunkler@sprintmail.com> posting date Mon, 18 Sep 2006 11:05:33 -0400

Jean-Jacques Laublé jean-jacques · Tue Sep 19, 2006 2:25 pm

Posted by Jean-Jacques Laublé <jean-jacques.lauble@wanadoo.fr> Hi Jim

There are two ways to avoid both risks: the risk of introducing a bias from the examiner's own view of the problem when measuring the player's strategy, and the risk of rewarding the player's luck when measuring the play's results.

Both measure the results, using as the metric the objective chosen by the player, as explained by John Sterman.

The first method is to run several plays, and take the average result. This will reduce the probability of a player winning by luck.

The second, less costly method is to play only one game and to ask every player to write down, for every decision taken in the game, the reasons for that decision.
This makes it possible to relate the way each player played to his results, and to eliminate the players who won by chance.
Regards.
J.J. Laublé Allocar
Strasbourg France
Posted by Jean-Jacques Laublé <jean-jacques.lauble@wanadoo.fr> posting date Tue, 19 Sep 2006 11:28:34 +0200

Duggan James james.duggan nuigal · Tue Sep 19, 2006 2:38 pm

Posted by "Duggan, James" <james.duggan@nuigalway.ie> Thanks for the replies.

I guess an important challenge is how to separate the "noise" from the "signal" for each player's strategy, in order to evaluate how effective such strategies are. A complicating factor is the strong interdependencies: for example, the distributor may play a perfect individual game, but may look uncompetitive (and be unlucky) because of the behaviours of upstream or downstream players (behaviours which the distributor cannot directly control).

A related issue: consider the following scenario. Let's say you run a beer game simulation with four different ordering strategies (s1, s2, s3 and s4), where each strategy is randomly assigned to a player. You know in advance that s1 is the best strategy. Using a metric that is linked with reward (e.g. total cost), and where every agent could review, compare and change their strategy at regular intervals, would the supply chain strategies always "evolve" to a dominant best strategy (i.e. after 100 time units, would all the players adopt s1?) Are good ordering strategies path-dependent, or will they always win out?

regards,
Jim.

P.S. I'm not asking anyone to develop such a model! I'm just wondering what their intuition might be on the eventual outcome.
Posted by "Duggan, James" <james.duggan@nuigalway.ie> posting date Mon, 18 Sep 2006 15:22:45 +0100

Richard Karash Richard Karash.co · Tue Sep 19, 2006 2:40 pm

Posted by Richard Karash <Richard@Karash.com> On Sat, 16 Sep 2006, Duggan James james.duggan nuigalway.ie wrote:

> I'd be interested in views regarding metrics for measuring individual
> performance in a Beer Game simulation. My rationale is to find an
> objective measure that identifies the best player in a simulated beer
> game. ...snip...

Hello James --

I think this is very difficult, if you really want to identify and reward the most effective individual action. I'll assume this is what you want to do.

If you want to use this as a teaching tool to demonstrate that obvious-seeming metrics can be bad, this should be easy to accomplish.

As others have said, the numeric performance at any position depends on the actions of others.

Here is an example... It's the most remarkable beer game performance I've seen:

I was leading the game for a group of BP executives and managers in Africa. When I run the game, I give teams a few minutes to strategize after the explanation and the first two weeks of demo play. The leader of one team, playing the Retailer position, got his team to follow this strategy:

1. "Factory, watch what I ship. I'll move each shipment so you can see it. Produce that much, plus enough to catch me up." This will skip levels like an MRP system will do.

2. "Other positions, order exactly what you see coming in the pipeline. That is, order what your supplier will be able to ship two weeks hence. That way, we'll move all the stock to the retail end where it can do some good; stock at other positions has no value to the customer. And, you'll have zero inventory and never be in backlog." (Take a good look at the beer game table; you'll be able to see which box to look at in order to do this. In ordering, each player will look to the right to see what's coming.)

The team executed perfectly. The retailer was out of stock for a while, then had a small inventory (say six) for the remainder of the play (the factory didn't get the "catch up" exactly right). After clearing the initial four cases, every other position was exactly zero inventory (no backlog) for the entire game.

Best score I have ever seen. The team metric (beer game score) accurately measures their performance.

What would you do with individual metrics? The retailer was out of stock then had inventory; so the retailer had the worst metrics just about any way you measure.

But, it was the retailer who demonstrated the most effective leadership I've ever seen in the beer game. And, the rest of the team were really good team players. No panic. No one bailed-out. That's team performance!!

How could you capture individual metrics on performance like this in the beer game? Or in the real world?

I think there is a general lesson that in a system, measurements at fine granularity may be ineffective when performance is determined more widely.

I hope this is helpful.

-=- Rick

--
Posted by Richard Karash <Richard@Karash.com> posting date Mon, 18 Sep 2006 22:11:22 -0400

Joaquim Matos joaquim.n.matos gm · Tue Sep 19, 2006 2:41 pm

Posted by "Joaquim Matos" <joaquim.n.matos@gmail.com> Hi,

If you choose cost, players will minimize the cost by reducing the level of service, and if you choose number of stockouts, players will increase that level. The distance from inventory goal is not a simple and clear performance indicator, and no player will find it outside the game. In addition, players will be encouraged to close the gap more aggressively.
All of these behaviours will maximize their local performance but will be irrational from a global perspective. Key performance indicators (and agents' rewards) should be linked to the company goal, that is, to maximize the company value.
A KPI that you may consider is EVA, economic value added (= revenue - direct costs - (total assets * cost of capital)). With this indicator players will balance the maximization of the gross margin and the minimization of asset usage (inventory cost included). Some additional parameters and co-flows will be needed, but players will have a composite measure always available.
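
The EVA formula quoted above can be written out as a small worked example. The formula itself is the one given in the post; the function name and every number below are hypothetical, picked only to show how holding more inventory (counted here as part of total assets) lowers the score even when the margin is the same.

```python
def economic_value_added(revenue, direct_costs, total_assets, cost_of_capital):
    """EVA = revenue - direct costs - (total assets * cost of capital)."""
    return revenue - direct_costs - total_assets * cost_of_capital


# Two hypothetical players with the same margin but different inventory levels.
lean_player = economic_value_added(revenue=1000.0, direct_costs=700.0,
                                   total_assets=500.0, cost_of_capital=0.10)
heavy_player = economic_value_added(revenue=1000.0, direct_costs=700.0,
                                    total_assets=1500.0, cost_of_capital=0.10)
print(lean_player, heavy_player)
```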
All the best,
joaquim
Posted by "Joaquim Matos" <joaquim.n.matos@gmail.com> posting date Mon, 18 Sep 2006 15:06:03 +0100

Ilker Soydan ilkersoydan hotmail · Wed Sep 20, 2006 2:50 pm

Posted by "Ilker Soydan" <ilkersoydan@hotmail.com> Hello all,

During my research I have seen some of the measures that authors have used. Apart from the previous comments (most of which I agree with), I wanted to list some of the critical measures that were preferred as evaluation criteria. Some of them might be synonyms, but they might spark some other terms for the research:

amplification ratio
rogue seasonality
stock variance
coefficient of variation
noise bandwidth
peak order amplification
number of occurrences the demand is magnified
backlog-to-inventory cost ratio
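
A rough sketch of how three of the measures listed above might be computed for one position. The definitions used are assumptions, since the post does not define the terms and usage varies between authors: amplification ratio is taken as the variance of orders placed divided by the variance of orders received, peak order amplification as the ratio of the two peaks, and coefficient of variation as standard deviation over mean. The function names and data are illustrative.

```python
import statistics


def amplification_ratio(orders_placed, orders_received):
    """Variance of orders placed over variance of orders received (assumed definition)."""
    return statistics.pvariance(orders_placed) / statistics.pvariance(orders_received)


def peak_order_amplification(orders_placed, orders_received):
    """Largest order placed over largest order received (assumed definition)."""
    return max(orders_placed) / max(orders_received)


def coefficient_of_variation(series):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(series) / statistics.mean(series)


# Hypothetical four-week record for one position.
orders_received = [4, 4, 8, 8]
orders_placed = [4, 4, 12, 12]
print(amplification_ratio(orders_placed, orders_received))
print(peak_order_amplification(orders_placed, orders_received))
print(coefficient_of_variation(orders_placed))
```

A ratio above 1 under these definitions would indicate that the position is passing on more variability than it receives.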

Well, there is a huge list of references. I can provide them privately to anyone interested.

I hope it helps...

Ilker Soydan, PhD Candidate
Politecnico di Milano, Italy
Posted by "Ilker Soydan" <ilkersoydan@hotmail.com> posting date Tue, 19 Sep 2006 18:34:47 +0200

Joel Rahn jrahn sympatico.ca · Wed Sep 20, 2006 2:54 pm

Posted by Joel Rahn <jrahn@sympatico.ca> Richard Karash Richard Karash.com wrote:
> Here is an example... It's the most remarkable beer game performance

Are we still playing the Beer Game when the sectors are so flagrantly allowed to communicate with each other?

R. Joel Rahn
Posted by Joel Rahn <jrahn@sympatico.ca> posting date Tue, 19 Sep 2006 11:53:09 -0400

Jim Hines Jim VentanaSystems.com · Wed Sep 20, 2006 3:11 pm

Posted by "Jim Hines" <Jim@ventanasystems.com> Richard Karash writes
> Here is an example... It's the most remarkable beer game performance

Rick,

I think one element of your perfect team's strategy may be missing from the description: What did the retailer order? (I know that you were making a loftier point -- and making it very nicely, too -- but I'm curious about the strategy itself).

John Sterman writes
>>>> The metric for evaluating individual performance should be the same
>>>> as the metric used to determine the reward individuals receive.

It seems like there's a structure-vs-behavior difference between performance metrics and reward metrics. A reward metric is part of the structure of the system, because it influences the policy a player pursues. In contrast, a performance metric summarizes an aspect of system behavior.

The reward metric in the beer game is usually total costs of the supply chain; but interesting performance metrics might include a player's ordering range (difference between min and max), the damping ratio of system-wide inventory, and the variance of system-wide inventory. In fact one interesting performance metric (reported in a very nice paper) was whether the estimated policy parameters of a team would produce chaotic behavior -- this binary performance metric is very different from the game's reward metric.
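
Two of the behavioural metrics mentioned above, a player's ordering range and the variance of system-wide inventory, could be computed along these lines. This is only a sketch under stated assumptions: system-wide inventory is taken as the weekly sum across all positions, the population variance is used, and the function names and sample data are invented for illustration. The damping ratio and the chaos test are not attempted here.

```python
import statistics


def ordering_range(orders):
    """Difference between the largest and smallest order a player placed."""
    return max(orders) - min(orders)


def system_inventory_variance(inventory_by_position):
    """Variance of total inventory across all positions, week by week.

    inventory_by_position maps a position name to its weekly inventory list;
    all lists are assumed to cover the same weeks.
    """
    weekly_totals = [sum(week) for week in zip(*inventory_by_position.values())]
    return statistics.pvariance(weekly_totals)


# Hypothetical four-week record for a two-position chain.
inventory = {
    "retailer": [6, 8, 4, 10],
    "wholesaler": [4, 6, 6, 4],
}
print(ordering_range([4, 4, 12, 16, 8]))
print(system_inventory_variance(inventory))
```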

Jim Hines
jim@ventanasystems.com
Posted by "Jim Hines" <Jim@ventanasystems.com> posting date Tue, 19 Sep 2006 14:52:56 -0400

Thompson James. P (Jim) A142 Jim · Wed Sep 20, 2006 3:19 pm

Posted by "Thompson, James. P (Jim) A142" <Jim.Thompson@CIGNA.COM> The original query and responses raise interesting questions: What's the Beer Game for? Why do teachers, instructors, trainers or consultants use the Beer Game? What do we want from the players? Does the composition of the players affect what one aims to accomplish with the Game? What do players learn from playing the Beer Game?

In sum, what have we learned about players' learning from the use of the Beer Game?

Jim Thompson
Economic & Operations Research
Cigna HealthCare
900 Cottage Grove Road, A142
Hartford, CT 06152
Posted by "Thompson, James. P (Jim) A142" <Jim.Thompson@CIGNA.COM> posting date Tue, 19 Sep 2006 08:25:57 -0400

Richard Karash Richard Karash.co · Thu Sep 21, 2006 1:41 pm

Posted by Richard Karash <richard@Karash.com> On Wed, 20 Sep 2006, Jim Hines Jim ventanasystems.com wrote:

>> I think one element of your perfect team's strategy may be missing
>> from the
>> description: What did the retailer order? (I know that you were
>> making a

The retailer followed the prescription perfectly... He looked to the right and ordered what he saw that his supplier would be able to ship. As a result, his supplier was never in backlog. Only the factory was making decisions; every other position simply ordered what they saw their supplier would be able to ship.

Another writer asked, "Are we letting people communicate?"

Unless you take extraordinary precautions, there is always some communication; players can see up and down the table.

My approach to running the beer game is to allow the teams some time to strategize after the demo rounds and before the real play. In my experience, this supports the learning... There is always a lot of wild variation to debrief. Even in this singular case, the other tables produced the usual variation and we had lots of material to debrief.

-=- Rick

--
Posted by Richard Karash <richard@Karash.com> posting date Wed, 20 Sep 2006 09:55:51 -0400 (EDT)

Alan McLucas a.mclucas adfa.edu. · Sat Sep 23, 2006 3:13 pm

Posted by "Alan McLucas" <a.mclucas@adfa.edu.au> Thanks folks for an interesting discussion. I tend to agree with Jim: we need to ask what we learn from the Beer Game, how we use it to teach SD and how it informs consulting practice (and research?).

There are a few critical lessons which come from the game, such as setting out to reduce fluctuations in supply by having the brewery respond as directly as practicable to customer demand (an important practical point made by John Sterman in the Beer Game video).

Such an inventory management strategy was implemented as a consequence of SD analysis of a Defence logistics problem I recently worked on. Each failure of operationally critical equipment in a remote theatre of operations (customer demand for beer) is transformed into a requisition for a replacement submitted directly to the national depot (the brewery). A replacement is dispatched immediately (there is a 10-14 day delivery delay).
A small inventory is held in the theatre of operations (retailer's shop).
Because a stock-out situation is highly undesirable, locally held inventory was increased by a small margin. Further, in this instance the Defence logisticians could not prescribe inventory holdings and effective rules for re-ordering at the intervening lines of supply (wholesaler / distributor) and intervening delivery delays for this high-cost, operationally critical item. These intervening stages (inventory holdings at 2nd and 3rd resupply lines in military logistics parlance) were omitted in the simplified and highly responsive resupply strategy that was ultimately implemented.
Indeed, this is a fundamental lesson that comes from playing the Beer Game.
The logistics planner involved was previously an SD student who had played the Beer Game. He quickly related to the management flight simulator created to demonstrate the dynamics of the logistics problem he was now responsible for managing. One could argue that the Beer Game led to the creation of important insights.

However, there is limited utility in analysing the Beer Game to death.
After all, it is a game, albeit an interesting one. It is contrived and as such replicates complex real world behaviour only to a limited extent.
Regards, Alan

Dr Alan McLucas
School of Information Technology and Electrical Engineering, UNSW@ADFA,
Australian Defence Force Academy,
Northcott Drive, CAMPBELL ACT 2600
AUSTRALIA
Posted by "Alan McLucas" <a.mclucas@adfa.edu.au> posting date Fri, 22 Sep 2006 07:50:42 +1000