census data
Posted: Tue Feb 08, 2000 9:29 am
The project Geoff has suggested is actually something that is done
frequently, as a matter of course, by demographers. It is, in fact, the
technique by which both current population estimates and population
projections are arrived at. It is also something that is done
frequently by people, such as myself, doing dynamic models of other
things.
Clearly when you are building such a model you want to compare the state
of the model with data at an appropriately matched time. If you are
looking at 5 sets of census data over 50 years you need to be using the
model results at the times those censuses were taken.
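As a concrete illustration, here is a minimal Python sketch (not tied to
any particular modeling package, with invented census figures) of
sampling a model trajectory only at the census dates and comparing:

import numpy as np

# Five sets of census data; counts in millions, purely invented.
census_years = np.array([1950, 1960, 1970, 1980, 1990])
census_counts = np.array([8.0, 9.3, 10.9, 12.6, 14.7])

def model_population(t):
    # Stand-in for a dynamic model's population trajectory.
    return 8.0 * np.exp(0.015 * (t - 1950))

# Sample the model only at the times the censuses were taken,
# then compare it against the counts at those same dates.
model_at_census = model_population(census_years)
print(np.round(model_at_census - census_counts, 2))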
Geoff is absolutely right that taking a census is a highly error-prone
activity. Model building is also highly error prone. Fortunately,
those errors are not simply additive: the model and the data can each be
used to improve the quality of the other. There is, in fact, a
wonderful story about census counts and a model as applied, I think, to
Kenya. There had been two censuses taken some number of years apart
(more than 10 if I recall correctly), and a model built to bridge
between them. The model simply could not reconcile the two, and a little
digging by the demographers demonstrated how poor the early census
results had been. Actually, it demonstrated that to my satisfaction, but many people
simply refuse to believe anything but data measured by generally
accepted sampling practices and so there was little consensus on the
census.
As to the degree of accuracy required in models, demographics is one of
the few areas in which it is easy to see which formulation is more
accurate. People do indeed age year by year, and it is very easy to see
echoes in population structures. All of this is lost with an aggregate
representation, and even with a yearly aging chain (as opposed to
discrete cohort shifting) you get significant spreading: a first-order
chain smears each birth cohort across neighboring ages, so, for example,
there are more 22 year olds in 1920 because more people were born in 1900.
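To see the spreading directly, here is a small Python comparison (all
numbers invented) of discrete cohort shifting against a first-order
yearly aging chain, starting from a single birth-cohort bump at age 0:

import numpy as np

ages = 40                 # track ages 0..39
years = 20
pulse = np.zeros(ages)
pulse[0] = 1000.0         # a one-time bump in births

# Discrete cohort shifting: everyone moves up exactly one age per year.
shifted = pulse.copy()
for _ in range(years):
    shifted = np.roll(shifted, 1)
    shifted[0] = 0.0      # no further births in this experiment

# First-order yearly aging chain: each age stock drains into the next
# at a rate of 1/year; integrate with small Euler steps.
chain = pulse.copy()
dt = 0.0625
for _ in range(int(years / dt)):
    outflow = chain * dt
    chain = chain - outflow
    chain[1:] += outflow[:-1]

print("cohort shifting, age 20:", shifted[20])             # the bump, still sharp
print("aging chain, ages 15-25:", np.round(chain[15:26]))  # the bump, smeared

With cohort shifting the entire bump is still sitting at age 20 after 20
years; with the aging chain it has been smeared across roughly a decade
of ages, which is exactly the spreading described above.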
Given the speed of modern computers and the ability to put things into a
relatively compact notation, having 100 population cohorts by *** is
often sensible. I would say this with two caveats: first, you do not
want to become so mired in detail that you are unable to actually spend
time on the problem you are trying to work on, and second, advanced
statistical and analytical techniques such as Kalman filtering will not
work with these beasts.
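As an example of the compact notation I have in mind, here is a
hypothetical 100-cohort update written as a single vectorized step per
year (the birth and death rates are placeholders, not real demographic
figures):

import numpy as np

cohorts = np.full(100, 1000.0)              # population aged 0..99, invented
death_rate = np.linspace(0.002, 0.25, 100)  # invented mortality rising with age
crude_birth_rate = 0.014                    # invented births per capita per year

for year in range(50):
    survivors = cohorts * (1.0 - death_rate)
    births = crude_birth_rate * cohorts.sum()
    cohorts[1:] = survivors[:-1]            # everyone ages exactly one year
    cohorts[0] = births                     # the new age-0 cohort

print("total population after 50 years:", round(cohorts.sum()))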
Bob Eberlein
bob@vensim.com