Enhancing the Power of Household Panel Studies - The Case of the German Socio-Economic Panel Study (SOEP)

Gert G. Wagner, Joachim R. Frick and Jürgen Schupp
Montreal Jan 2006

We are grateful to Dean R. Lillard (PAM; Cornell University), Stephen P. Jenkins (ISER, University of Essex), and Silke Anger for comments. All remaining errors, in particular gaps in our descriptions of other studies, are our own. We would like to emphasize that this paper is the result of teamwork with our colleagues in Berlin and Munich (see Section 3.1), without whom SOEP’s continuing development would not be possible. We are particularly grateful to the principal investigators, staff and supporters of SOEP in its early years, especially Hans-Juergen Krupp, Richard Hauser, Christof Helberger, Reinhard Hujer, Karl Ulrich Mayer, Horst Seidler, Wolfgang Zapf, Christoph F. Büchtemann and Ute Hanefeld.

For over a century, empirical research in the social sciences was based not on data collected by researchers, as is the case in the natural sciences, but on official statistical data. Sociologists and economists in particular thus relied on the statistical tables provided by federal agencies for their analyses. From the 1960s onward, however, and in many countries even later, social scientists began to obtain limited access to statistical agencies’ microdata on private households and individuals (and later on firms as well). When working with these new data, social scientists concentrated on “objective” variables such as occupational status and income. Longitudinal analysis in the social sciences was impossible, although many theories and models dealt with the life course. Today it is more apparent than ever that longitudinal analysis is crucial, not only to test life course models, but also as a basis for establishing the causes of social phenomena and evaluating public policy programs.

Over the last two decades, many statistical agencies have significantly changed the kind of data they provide in order to meet this demand. Some have even responded by vastly increasing their research capacities. StatCan is one example of this evolution in official statistics: its SLID survey now comprises statistics that not only describe but also can be used to explain the causes and effects of social change.

Many important longitudinal insights can be gleaned from routinely collected administrative data. But there remain numerous theoretical concepts in the social sciences, and psychology in particular, that cannot be addressed by administrative data, and numerous policy questions that still cannot be answered. In particular for the evaluation of labor market policies, additional survey data are often needed. 

While the data collected for official statistics are often more applicable to social scientific concepts than administrative data, their focus is mainly on “objective data” that can be used for political and administrative purposes. The surveys are not built on theory-driven concepts and do not take into account, for example, “subjective indicators” or “physical health measures”. Many of the central concepts of social science theory, such as utility (and the parameters of the utility function), thus cannot be analyzed empirically using official statistical data due to their inherent limitations.

The difficulty of measuring such concepts as life satisfaction (as a proxy for utility) or risk aversion (a prominent parameter of the utility function) is documented by the ongoing debates on measurement methodology, and for this reason statistical agencies have been wise not to measure them.

Finally, at the end of the 1960s in the US (with the Panel Study of Income Dynamics, PSID) and in the 1980s in Europe (with PSELL in Luxembourg, the German SOEP, and the Swedish HUS), social scientists began collecting not only cross-sectional but also longitudinal household data themselves (and Statistics Netherlands started a household panel study as well). Glen Elder’s classic long-running panel study on “The Children of the Great Depression” is an early example of a successful and empirically grounded theory supported by a long-running, interdisciplinary panel survey uniting the disciplines of sociology, psychology, and history (Giele/Elder 1998). The “Wisconsin Longitudinal Study” which was started in 1957 is another example of a study under academic direction.

The early household panel studies began working with theoretical concepts and questionnaires much like those used in official statistics, and this tradition continues up to the present day. Over the course of time, however, panel studies have widened their scope to include new research questions as well, especially those dealing empirically with the utility of respondents and the parameters of their utility function (e.g., health, trust, fairness and reciprocity, risk aversion, control beliefs, inequality aversion). In other words, socio-economic panel studies are incorporating more and more concepts from the fields of medicine and psychology. This development is driven by specific research questions, and its pioneers include the Health and Retirement Study (HRS), the English Longitudinal Study of Ageing (ELSA), and the Survey of Health, Ageing and Retirement in Europe (SHARE). The latter study provides a new comprehensive, international view on aging, but does not cover the population under 50 years of age.

General household panel studies that seek to provide a representative view of the entire population of a given society (for example, PSID, BHPS, and SOEP) have stuck much more closely to traditional survey concepts, partly due to the need for a stable conceptual basis and the requirements of longitudinal analysis. Mature panel studies, however, are also recognizing the need to incorporate new research concepts. SOEP has undertaken efforts to create a solid methodological basis for such expansions (with the hope that other panel studies will ultimately follow suit), making it a more open academic research tool than when it began in 1984. This has included comprehensive discussions of methods of data collection. In SOEP, the basic sampling unit is the household, and the focus is on all current (and future) members thereof. Household composition is not stable over time, and demographic changes (including regional mobility, migration, death, and birth) are a crucial part of the research questions addressed in SOEP questionnaires. The more data that can be collected on the individual life course, the better the opportunities for analyzing intergenerational transmissions of behavior and social structures. The possibilities of doing research along the traditional lines of “Behavioral Genetics” are improved by household panel data too, because of the mixture of different intergenerational relationships in households. Of particular interest are the similarities and differences in behavior of siblings, twins, stepchildren, adopted children, and different kinds of grandchildren. The analyses of “family networks” help to disentangle the influence of genes and environment without measuring genes directly.

To put it succinctly, SOEP, as one of the major household panel studies, stands for theory-based data collection, not just more data and better statistics.

If we look at social statistics, and especially longitudinal studies, from a natural science perspective, it becomes clear that the “production” of empirical data should be the task of the scientific community itself, and not of a governmental agency. This means that longitudinal studies and social statistics should be treated as “big science” with all the consequences: they require the continuous expenditure of “big money” in order to compile the longitudinal data needed to establish a sound methodological basis. This is especially difficult for international comparisons.

Given the inherent differences in institutions and countries, which sometimes establish natural experiments, international comparison makes it possible to draw causal inferences on the impacts of diverse institutional, social, and economic developments on human behavior. The international comparability of data must therefore be a central objective in the governance of social statistics and longitudinal studies, one that can only be guaranteed through the design of optimal organizational and financial structures. Two prime examples of “good governance” are the European Social Survey (ESS – a set of repeated cross-sectional surveys run by political scientists) and SHARE (a truly interdisciplinary longitudinal study of economics, sociology, and health), internationally harmonized data sets that provide an infrastructure of theory-driven research questions. Unfortunately, initiatives for internationally harmonized household panels, which are more expensive than studies like ESS, are often not research-driven; the ECHP (European Community Household Panel) is one example. EU-SILC, the successor to ECHP, will have a short-term panel structure of just four years that will not allow the in-depth studies of life courses that are necessary to test social science theories.

This paper is organized as follows. In Section 1, we very briefly sketch current theoretical and empirical developments in the social sciences. All of them point in the same direction: they demand multidisciplinary longitudinal data covering a multitude of variables for valid empirical testing of social science theories and for valid evaluation of policy measures. Cohort and panel studies are therefore called upon to become truly interdisciplinary tools. In Section 2, we outline an “ideal” household panel study. In Section 3, we describe the German Socio-Economic Panel Study (SOEP), identifying remaining shortcomings as well as recent improvements made to approach the ideal. Section 4 concludes with a discussion of potential future issues and developments.

 

1. Our Evaluation of Theoretical Developments

A comprehensive overview of the numerous theoretical and empirical developments that have taken place in the social sciences in the last two decades is far beyond the scope of this paper. We focus on the theoretical developments that are most crucial for the development of empirical testing and analyses and thus for data collection in the social sciences.

In general, we can identify an increasing interdisciplinarity of concepts within the social sciences, as does, for example, Diewald (2001). Many disciplines are dealing with the life course as a central element of theoretical constructs. Sociology is incorporating elements of “rational choice” theory, a basic economic paradigm. Economics is still dealing with “objective” concepts like employment, income, and wealth, but economic models have expanded to incorporate even “harder” biological concepts such as the structure of the human brain and a wide array of “soft” sociological and social-psychological concepts such as tastes, values, personal traits, and particularly expectations (as an indicator of “bounded rationality”; cf. Kahneman 2003) as a framework for social behavior and actions.

Many of the social sciences are looking at health variables as well (e.g. Kalwij and Vermeulen 2005). The importance of controlling for health factors in empirical analyses has gained salience, among other reasons because of the differing effects of health factors on different social groups (e.g., illness affects less-educated people more severely than highly educated people).

Finally, empirical research in the social sciences has focused on two major gaps that have come to light—in our view—through some newly available data sets: the issues of “ability” and “utility”. In the latter case, SOEP was one of the data sets that made meaningful analysis possible. 

Utility is a basic concept in the social sciences, described by economists in terms of its “outcome feeling” and by sociologists in terms of “satisfaction”. But due to severe measurement problems, this ultimate outcome has been a kind of black box for the last two centuries.

The same is true for “ability”. Social scientists (like everybody else) have long known that people, due to genetic codes, past experience (including education) and other reasons, possess different “basic skills” (described as cognitive abilities and personal traits by psychologists). But these differences were never explicitly taken into account in social science theories. Ability was modeled as a distribution of “noise”. Personal traits were not even mentioned. The lack of explicit modeling of ability and personal traits limits the understanding of economic behavior, especially given the likelihood that concepts like “education” and “human capital” have differential impacts on people with different levels of ability, possibly rooted in the individual’s genetic make-up (economists are aware of this and take measurement error into account when they model the correlation of the “noise” in their models with the variables of interest, but they do not model it explicitly).

It is important that new kinds of data are providing the basis for studies by social scientists, and in particular economists, that are attempting to better understand the determinants of satisfaction (“utility”) and the interrelation between economic behavior, success, and ability and personal traits. In order to disentangle natural effects and social environment, it will be necessary to consider the methodological consequences of starting as early as possible in the life course with the collection of data.

Looking beyond the social sciences, we see that geographers are interested in virtually every imaginable variable that relates to spatial information (and spatial data may also be a control device for the common clustering effect in most survey samples). We also see that researchers in psychology, public health, and epidemiology have become more and more interested in “social” and “economic” control variables and the richness of large surveys. And we predict that researchers who do research along the traditional lines of “Behavioral Genetics” will not only discover social context (Shanahan et al. 2004) but also exploit household survey data more and more. What makes household survey data most interesting for research in behavioral genetics is the mixture of different intergenerational relationships in households and between households. Of particular interest are the similarities and differences in behavior of siblings, twins, stepchildren, adopted children, and different kinds of grandchildren. The analyses of “family networks” help to disentangle the influence of genes and environment without measuring genes directly (cf. Baker 2004, pp. 42). The combination of “traditional” household panel data with new kinds of data can turn household panel studies into powerful instruments for new kinds of studies in behavioral genetics.

In sum, social scientists ranging from economists, sociologists and demographers, to epidemiologists, public health researchers, psychologists and geographers all share an interest in extremely broad, multi-topic data sets. The variables of interest include not only traditional “objective” concepts (like employment status and incomes) and non-traditional “objective” concepts (like doctor visits and physical health measures such as height and weight), but also “subjective” variables summarizing cognitive ability, tastes and traits, expectations as input and “throughput” variables, and satisfaction (“utility”) as the final “outcome variable”.

Contextual information about networks, neighborhood and the environment has attracted interest as well. Economists and sociologists call this embeddedness of behavior “social capital”. In economics, “(matched) employer-employee datasets” are representative of this new focus, and neighborhood effects (measured by geocode data) are also being analyzed more and more. If these efforts succeed, we will be able to improve the empirical differentiation between “genetic/biological” and “socially” motivated behavior.

Our very brief selection of recent theoretical and empirical developments in the social sciences points to the strong conclusion that for valid empirical testing of theories in the social sciences and for a valid evaluation of policy measures, we need longitudinal data that not only cover the variables of one discipline of social science, but of multiple disciplines. Cohort and panel studies must therefore become more and more interdisciplinary devices starting with the collection of data on individuals as early as possible in the life course (cf. Diewald 2001).

 

2. Recent and Open Developments in Household Panel Studies

If we think about a household panel study as the basis for analysis of life course behavior and its determinants (necessitating a model of the household context), then a multitude of questions arise. The reason is simple: the “old” panel studies were not designed along these lines due to the lack of experience with panel studies 25 years ago.

These questions have been discussed within the SOEP group at DIW Berlin for over 20 years together with our “power users” and Advisory Board. They are also being discussed with and within the BHPS team, currently one of the few other panel studies in Europe under academic direction. Others include the SHP (Swiss Household Panel), PSELL (Panel Study of Luxembourg), and the Russian and Ukrainian Longitudinal Monitoring Studies.

In the following discussion of the problems facing panel studies, we sketch some of the innovations we have introduced to SOEP (brief notes in brackets; a detailed discussion follows in Section 3).

2.1. Coverage of Population
Household panel studies have the aim of representing a “full population” and not just specific cohorts. Many social phenomena (for example, income distribution and poverty issues) cannot be understood without this kind of complete picture, because those questions require the consideration of relative positions and mobility within the entire population. Full household panel studies are important as well because they are the only means of contextualizing cohort studies. A new insight is that household panel studies, by their very nature of gathering data on “family networks”, can become a powerful resource for research in behavioral genetics. Household panel studies which represent all kinds of family relations might be especially helpful for obtaining better estimates of measurement errors in traditional family studies.

Such “complete” samples always provide the relevant reference population, and only if we succeed in observing the entire population do we have the basis for defining and following any given special group of interest (e.g., age cohorts, individuals experiencing a specific event, full range of family relations). This aspect will be discussed below in more detail, together with the measurement of the impact of events (“triggered” studies).

Although gaining a picture of the full population is the aim of all household panel studies, none of them meet this aim. For practical reasons of sampling, the universe covered by the panel sample is the “population living in private households”. This is unsatisfactory even in a cross-sectional framework, and completely unsatisfactory when thinking about a household panel study as a means for generating data on life course questions. This in turn raises issues of the more comprehensive coverage of the population of a territory.

Subpopulations that are missed by all panel surveys, both cross-sectional and household, include:

The last two groups are of special importance in Europe and the enlarged EU, but may also be relevant in the case of the USA and Mexico.

Note: Resurveying dropouts (when available and willing) would enable coverage of a very special “subpopulation” that is normally absent from panel studies.

2.2. Unit of Analysis and Age of Entry
One of the main questions regards the definition of a longitudinal unit of analysis. Household panels are always cross-sectional representative studies with (private) households as their sampling unit. But a life course perspective requires following a selection of individuals who live in changing living and household arrangements prospectively over time.

With respect to new research questions about the full life course, the question of the optimal “age of entry” into a household panel study arises. “Age of entry” means the age at which an individual becomes a respondent in his or her own right. At the moment, 16 is the age of entry for BHPS and SOEP. But starting individual surveys at 13 or 14 might make it possible to better observe the transition into adulthood (a special BHPS questionnaire is given to children aged 11 to 16). Developmental psychologists and researchers in behavioral genetics might even consider surveying small children as respondents in their own right or as “test subjects”. The age-of-entry issue raises not only practical questions about acceptance in the field, but ethical questions as well (unavoidable not only because interviewing and/or testing children raises ethical questions but because collecting “proxy information” about children from their parents can raise ethical questions as well).

2.3. Sample Size
The “traditional” sample size for household panel studies was about 5,000 households, with information on about 10,000 individual respondents. One finds this sample size with PSID, SOEP, and BHPS and later in the national sub-samples of the ECHP (European Community Household Panel Study). There was probably no real scientific rationale behind this “magic number” but only the insight that (1) the sample size of an opinion poll (1,000 or 2,000 individuals) was too small and that (2) it was almost impossible to raise more money to afford a larger sample.

After several years of experience with panel analyses, it became apparent that this sample size was too small to answer important research questions dealing with smaller subgroups of the population. With respect to small subgroups, the specific issue of attrition in panels must be taken into account: sufficiently large numbers of observations are needed to identify selectivity in attrition, and sufficient numbers of continuous participants are then also needed to analyze substantive research questions (e.g., income or labor mobility). With SOEP, for example, the sample size was doubled in the year 2000 (after 15 years). By adding extensions for Scotland, Wales and Northern Ireland, BHPS doubled its sample size as well. SOFIE, run by Statistics New Zealand, has about the same sample size. SLID was larger from the very beginning (but the number of waves per unit is limited).

It should be kept in mind that especially for policy analyses, it is not sufficient to understand the “all-purpose sample” of a household panel study merely as a framework for special studies that have larger numbers of cases for special target groups. For policy analyses, data needs to be available instantaneously, and for special studies, one needs control groups that are large enough for in-depth comparisons. Both arguments imply the need for “all-purpose” household panels with large sample sizes.

Cohort studies do not offer a completely satisfactory alternative. For example, a cohort study of newborn babies does not enable a researcher to control for the selectivity of motherhood, whereas household panel studies sample not only new mothers but also women in the same age groups who, although having the potential to be mothers, do not give birth to a child.

We believe that the time has come to think more systematically about the optimal sample sizes of household panel studies. Based on our experience with the different kinds of analyses that have been conducted on the basis of household panel studies in the last two decades, we offer the following proposals.

For many research questions, it is necessary that a researcher analyze single birth cohorts. For example, after the introduction of a new retirement scheme, it is necessary to have enough cases at hand for the first cohort that retires under the new scheme and the last cohort that retired under the old scheme. Of course there is some potential for pooling similar cohorts (e.g., some age cohorts or cohorts of movers from school to employment) in order to get sufficient case numbers, but the restriction is that one has to wait several years before analyzing the new scheme (it should be noted that policy makers’ expectations of longitudinal data sets are higher). More importantly, pooling mixes the effects of the different schemes with other cohort effects. So what is the minimal number of cases/events needed to analyze a change in a policy regime?

There is no clear-cut answer, but our research experience suggests that the minimal case/event number is about 500 per age cohort. If we expand this to the overall population that needs to be covered by the sample of a household panel study, we get about 40,000 individual respondents, which means about 20,000 households.

If one accepts our argument, this means that the sample sizes of panel studies like BHPS and SOEP are not large enough yet: in fact, both should double in size again. And if one wants 500 persons per single year of age who are completely independent of each other (that is, who do not live in the same household), then even more households are needed: about 40,000 households are needed to arrive at 500 persons per single year of age who are living in different households. In clustered samples, the effective sample size is smaller than the nominal number of observations. However, our research experience shows that the numbers mentioned here are sufficient even in the case of clustered samples.
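
The arithmetic behind these figures can be made explicit. The following is a minimal sketch and not part of our original argument: the figure of 80 single-year age cohorts and the average of two respondents per household are merely inferred from the numbers quoted above (40,000 / 500 and 40,000 / 20,000), and the intra-household correlation of 0.05 is a purely hypothetical value used to illustrate the standard Kish approximation of the design effect, deff = 1 + (m - 1) * rho, for clusters of average size m.

```python
import math


def required_respondents(cases_per_cohort: int, n_cohorts: int) -> int:
    """Respondents needed to reach a minimum case number in every cohort."""
    return cases_per_cohort * n_cohorts


def required_households(respondents: int, respondents_per_household: float) -> int:
    """Households needed, given an average number of respondents per household."""
    return math.ceil(respondents / respondents_per_household)


def kish_design_effect(cluster_size: float, icc: float) -> float:
    """Kish approximation: deff = 1 + (m - 1) * rho for average cluster size m."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n: int, cluster_size: float, icc: float) -> float:
    """Nominal sample size deflated by the design effect."""
    return n / kish_design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Assumption: 80 single-year age cohorts (implied by 40,000 / 500).
    respondents = required_respondents(cases_per_cohort=500, n_cohorts=80)
    # Assumption: two respondents per household on average (40,000 / 20,000).
    households = required_households(respondents, respondents_per_household=2)
    # Hypothetical intra-household correlation, chosen only for illustration.
    n_eff = effective_sample_size(respondents, cluster_size=2, icc=0.05)
    print(respondents, households, round(n_eff))
```

With one respondent per household the design effect from household clustering equals 1, which is why fully independent observations require as many households as respondents.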

2.4. Disproportional Representation of Sub-groups
Even in the case of a large sample, an overrepresentation of particular sub-populations might be of interest. Due to their “policy relevance,” the following sub-populations are “natural” candidates for disproportional representation:  

For the purpose of basic research in behavioral genetics, groups like twins or adopted children could be overrepresented.

It should be noted that almost all of these subgroups are subject to differential attrition in panel studies, which reinforces the argument for disproportional sampling. For example, in the case of migrants, higher attrition may be caused by return migration, and for the observation of extremes in the income distribution (especially low-income individuals and households), there is ample evidence for an endogenously higher risk of dropping out.

2.5. Rhythm of Interviewing
Most major household panels conduct interviews once a year (some, like HRS, interview at two-year intervals, as the PSID has done since 2001 due to a shortage of funding). Using retrospective questions about income and employment, it is possible to generate “continuous” histories (for example, on a monthly, weekly, or bi-weekly level) for “objective” variables. A recall period of about twelve months still produces acceptable results with regard to potential response errors for objective indicators (cf. Schuman and Scott 1989). The Australian HILDA survey improves this “calendar” on employment and education by asking for three measures per month over the last year. But for subjective concepts like satisfaction, such retrospective designs are impossible, and the data we get are not very precise if the aim is to link life events with outcome variables like satisfaction. Another issue is the attrition losses that may occur because people are contacted less frequently.
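
How retrospective spell questions translate into a “continuous” monthly history can be sketched as follows. This is an illustration only: the spell format (status, first month, last month within the previous calendar year), the status labels, and the function names `monthly_calendar` and `months_in_status` are assumptions and do not reproduce the actual questionnaire or data structure of SOEP, BHPS, or HILDA.

```python
from typing import List, Set, Tuple

# Hypothetical spell format: (status, first_month, last_month), months 1..12.
Spell = Tuple[str, int, int]


def monthly_calendar(spells: List[Spell]) -> List[Set[str]]:
    """Expand retrospectively reported spells into twelve monthly status sets.

    Overlapping spells are kept side by side (e.g. employment during
    education); months without any reported spell remain empty sets.
    """
    calendar: List[Set[str]] = [set() for _ in range(12)]
    for status, first, last in spells:
        if not (1 <= first <= last <= 12):
            raise ValueError(f"invalid spell boundaries: {first}-{last}")
        for month in range(first, last + 1):
            calendar[month - 1].add(status)
    return calendar


def months_in_status(calendar: List[Set[str]], status: str) -> int:
    """Count the months in which a given status was reported."""
    return sum(1 for month in calendar if status in month)


if __name__ == "__main__":
    # Invented answers of one respondent about the previous year.
    reported = [("employed", 1, 6), ("unemployed", 7, 9), ("employed", 10, 12)]
    calendar = monthly_calendar(reported)
    print(months_in_status(calendar, "employed"))
    print(months_in_status(calendar, "unemployed"))
```

A structure of this kind works for “objective” states that respondents can date in retrospect; it has no counterpart for subjective concepts such as satisfaction, which have to be measured at the time of the interview.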

When thinking about the rhythm of interviewing, one should keep in mind that measuring subjective data is a special challenge. Subjective concepts are more than just a supplementary outcome measure to other “hard facts” such as income. Indeed, individual perceptions, together with objective indicators, form some of the most important instruments for shaping our understanding of human behavior. As such, it is crucial to be able to measure subjective indicators consistently. Only panel studies make it possible both to distinguish between “noise” and “signals” and to control for fixed effects. In addition, there is evidence that respondents give better answers to subjective questions when they are interviewed repeatedly (cf. Frick et al. 2006).
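
What “controlling for fixed effects” means for a subjective indicator can be illustrated with the so-called within transformation: each respondent’s answers are expressed as deviations from that respondent’s own mean over all waves, which removes any time-invariant, person-specific level (for example, a personal habit of using the upper end of a satisfaction scale). The following is a minimal sketch with invented data; the record layout and the function name `within_transform` are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical record layout: (person_id, wave, reported value).
Record = Tuple[str, int, float]


def within_transform(records: List[Record]) -> List[Record]:
    """Subtract each person's own mean over all waves from every value."""
    values_by_person: Dict[str, List[float]] = defaultdict(list)
    for person_id, _wave, value in records:
        values_by_person[person_id].append(value)
    person_means = {pid: mean(vals) for pid, vals in values_by_person.items()}
    return [
        (person_id, wave, value - person_means[person_id])
        for person_id, wave, value in records
    ]


if __name__ == "__main__":
    # Invented satisfaction scores on a 0-10 scale for two respondents.
    panel = [("a", 1, 8), ("a", 2, 6), ("b", 1, 3), ("b", 2, 5)]
    for row in within_transform(panel):
        print(row)
```

The transformation requires at least two observations per person, which is exactly what a cross-sectional survey cannot offer.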

If we want to link events with subjective data, then it must be discussed whether periods of less than one year between interviews are feasible. One possibility would be to conduct interviews, at least on subjective concepts, two or even four times per year. An even better possibility would be to use the Internet to screen for a recent event, “triggering” interviews or questions about the event, its causes, and its effects. In any case, if this were implemented in studies like BHPS or SOEP, it would be a major deviation from the usual approach in official statistics, where the idea is to collect information on a well-defined reference day or period.

Here are some examples of interesting transitions that could lead to “triggered interviews” by means of special questionnaires which allow for more in-depth analyses than would be possible by merely adding questions to existing questionnaires:

2.6. Subject Fields
In principle, the social sciences offer thousands of interesting fields and research questions, and many that are scientifically important. But if our community of social scientists spends public money, we must focus on relevant fields that meet the public’s needs for our data infrastructure.

In the UK, Peter Elias mentions the following fields in his outline of a “National Data Strategy”. We, and many SOEP users, believe that these fields are relevant for panel studies in general (globalization is not an issue in its own right here, but it is evident in immigration and also in rapid changes in labor markets and employment structures):

Fortunately, there is general agreement that these fields are also interesting from a purely academic point of view, and thus no need for a “real world vs. academic relevance” discussion. However, if we think about household panels as a series of cohort studies, we can still identify some additional areas that need to be covered. By starting very early in the life course with the collection of a broad set of individual data, the following extensions could be made:

2.7. Theoretical Concepts
As mentioned in Section 1, the users of household panel data are increasingly interested in broad interdisciplinary theoretical concepts that address human life and behavior from different disciplinary views. So fields and variables such as the following are becoming important for household panel studies:

2.8. Survey Methodology
In recent years, many new possibilities have emerged for innovative surveying methods. When the “old” household surveys started, the possibilities of telephone- and computer-assisted surveying were nonexistent. For this reason, all panel surveys started with standardized questionnaires administered by interviewers using “paper and pencil”. In the future, further new methods are foreseeable. They are of interest for three reasons: (1) cheaper fieldwork, (2) better data quality, and (3) the possibility to measure new concepts that were not measurable before (which will help to examine and develop some of the new theoretical concepts mentioned above).

Examples of new methods and their main aims are:

2.9. Context Data
In any case, one of the top priorities appears to be that of supplementing the micro-data with contextual information describing the individual’s environment and institutional context. Such data is even more important for panel analyses in a cross-national context, given the intertemporal, interregional and international sources of variation. Selected examples of contextual information include the following:

 

3. The Case of SOEP

The German Socio-Economic Panel Study (SOEP) is a “multi purpose” household panel study like PSID or BHPS. And like them, it is carried out under academic direction, but with special funding from the German (federal and state) government.

Like the PSID and BHPS data, SOEP data are available free of charge as “scientific use files”. Together with Cornell University, the SOEP Group has compiled all data and documentation in English (and German). A primer on longitudinal data analysis with the statistical package Stata, with examples based on the SOEP database, is available as a book in both English and German (cf. Kohler and Kreuter 2005, 2006).

Up to now more than 1,500 users have signed a user contract, which is necessary for data protection reasons. Each year, more than 300 users ask for the new releases of the study. Users are working in the fields of economics, sociology, statistics, demography, survey methodology, psychology, public health, political science, geography and sport science.

More than 3,100 SOEP-related publications (in peer-reviewed and other journals, collected volumes, etc.) have been entered into our literature database SOEPlit (http://www.diw.de/english/sop/soeppub/soeplit/index.html).

3.1. Governance of SOEP
The SOEP is a panel study that has, from the very beginning, been under full academic direction. SOEP was originally conducted as a project of the Special Research Unit 3 “SfB 3: Micro-analytical Foundations of Social Policy”, which was financed by the German Science Foundation (DFG) at universities of Frankfurt, Mannheim and Berlin. The project also included the DIW Berlin, a non-profit, non-partisan think-tank (German Institute for Economic Research – Deutsches Institut für Wirtschaftsforschung – DIW Berlin). When the activities of the Special Research Unit came to their scheduled conclusion in 1990, the entire responsibility for the SOEP project was transferred to the DIW Berlin, which runs SOEP as a “public good” in a joint research and service department which supports social sciences by collecting high-quality micro data. SOEP is part of the German and global “research infrastructure”.

To give a sense of the importance of this kind of infrastructural tool for the scientific community, one can compare SOEP and its funding with the large-scale telescopes and accelerators shared by astronomers and physicists around the world. Perhaps the best analogy in the natural sciences is the worldwide network of weather stations (like our network of respondents), which collects data that are shared by meteorologists all over the world (just as our data are analyzed not only in Germany but also abroad, often in comparison with panel data for other countries).

At DIW Berlin, the SOEP survey group designs the survey questionnaire, regularly incorporating suggestions from the SOEP advisory board and SOEP users around the world. The DIW Berlin, as the host institute of the survey and its council, has no privileges whatsoever in designing the SOEP survey. The DIW Berlin is just one of many research institutions that use the data.

The SOEP fieldwork, cross-sectional data editing and coding are outsourced to a private sector survey institute (TNS Infratest Sozialforschung, Munich). This is the most efficient and effective option due to the skill and experience that professional interviewers from large survey institutes bring with them, in contrast to those of interviewers hired on a contractual basis. However, surveys like SOEP cannot be performed by fieldwork institutes without adequate research competence and a high-quality staff of interviewers trained and provided with different survey technologies. Infratest Sozialforschung, Munich, is not just a fieldwork organization with a broad field staff (about 500 interviewers are needed per wave for the SOEP survey; households are now spread across nearly all county districts (Landkreise) in Germany), but a high-quality survey research institute and, as part of TNS Global (Taylor Nelson Sofres), London, a global provider of market research, information, and consultancy operating out of 70 countries worldwide.

From 1982 to 2002, SOEP funding was provided mainly by the DFG (German Science Foundation – Deutsche Forschungsgemeinschaft). In addition, DIW Berlin supported the SOEP from the very beginning by providing rooms, information and telecommunication support (hardware and software), and some research and service staff. The funds granted by the DFG came from the Federal Ministry of Science (BMBF) and the State Ministries of Science via the Senatsverwaltung für Wissenschaft, Forschung und Kultur (SenWFK) in Berlin.

In 1994, the German Council on Higher Education and Research (Wissenschaftsrat) recommended that the SOEP group be financed in the future as an independent unit with the functions of a service institution within the DIW Berlin. After lengthy negotiations, the German Commission for Educational Planning and Research Promotion (BLK) followed this recommendation, and from January 1, 2003 onwards the SOEP has been funded as a “Service Unit” (Serviceeinrichtung) of the Wissenschaftsgemeinschaft Gottfried Leibniz (WGL). It is set up as a special department of the DIW Berlin. The funding agencies have remained the same as before (BMBF and SenWFK). Thus, on the federal side, the SOEP is still funded by a different ministry (BMBF) than the one providing the institutional funding of the DIW Berlin (BMWA, Ministry of Economic Affairs). The Federal Government funds two-thirds of the SOEP’s budget; the Länder (federal states) fund the remaining third. SOEP is now funded out of the basic budget (Grundhaushalt) of the DIW Berlin, but its budget makes up a separate part thereof.

In order to generate high-quality data for a broad international user community, it is necessary that the SOEP survey group know about the latest methodological and conceptual developments in the relevant disciplines and in data distribution. To realize these objectives in the long term, it is crucial that the SOEP survey group conduct internationally recognized research. To this end, SOEP staff members carry out self-defined research projects covering the entire conceptual and methodological range of the SOEP questionnaire. This enables the SOEP team to strengthen its skills in all areas and focus on special problems of data collection methodology for panel surveys. The SOEP group is supported by designated departmental research professors and research affiliates of the DIW Berlin and guest researchers at the SOEP group.

3.2. The Design and Development of the Study
SOEP is a household panel study that was designed along the basic ideas of PSID: all members of the first-wave survey households are part of the sample, and they, along with all their offspring, are followed as long as possible in the field. But in contrast to PSID, not just one respondent per household is interviewed (proxy interview) but all adult members (individuals 17 years and older).

SOEP was started in 1984 as a regular cross-section of the adult population living in private households in Germany. But the coverage of minority groups was improved from the very beginning by oversampling immigrants and later with a special sub-sample of new immigrants (started in 1995) and a subsample of high-income households (started in 2002).

In SOEP, children (up to the age of 16) were not (and still are not) respondents on their own. For this reason, for most of the respondents in the first wave, there is a considerable degree of left-censoring. However, the retrospective information gathered for adult respondents does not go back to their birth but only to the beginning of adulthood. In the case of SOEP, entry to adulthood is defined as age 15. But for many theory-based research questions, information about the full life cycle of a respondent is desirable. And for an identification of causal effects, even more information is desirable, namely about the parents and the whole family history and social background of a respondent.

In order to address life-course questions and research, SOEP started collecting retrospective information about childhood in 2001, when the first children who were born into a “SOEP household” became respondents on their own. And in 2003, we started to collect information about newborn babies (and about their mothers’ period of pregnancy). The latter method of collecting “proxy data” about childhood of later respondents to SOEP will be extended in the coming years by asking age-group specific information at age three (upon entry to pre-school institutions), age six (upon entry to school), and age 12 (at the transition from childhood to youth).

Due to an increasing demand for “subjective data”, we started in the 1990s to integrate more and more psychological and “behavioral” concepts into the SOEP questionnaire, and since 2003 we have also added behavioral experiments. In 2006, we are introducing the first physical health measure (grip strength) and are starting to substantially improve the measurement of cognitive potential (ability).

3.3. Enhancement of the Power of SOEP in Detail
The SOEP survey was started in West Germany in 1984 with two subsamples: Sample A, the main sample, covering the population of private households, and Subsample B, which oversampled the “guest worker households” (Turkish, Spanish, Italian, Greek and (Ex-)Yugoslavian heads of household) that were not covered by Sample A. The original sample size was slightly below 6,000 households.

3.3.1. Data Collection up to 2005
In 1989, Germany faced a historically unique situation: an enlargement of its national territory. With the fall of the Wall, Germany was reunited after more than 40 years of separation. In terms of integration into a household panel framework, unification was an extremely promising and interesting enterprise.

The extension of SOEP to cover the former German Democratic Republic (GDR) was an exciting task, but also one that presented many challenges (with the questionnaire, funding, but also new cooperation partners). From a sampling and methodological point of view, it was an easy task to establish a new subsample for SOEP because sample C covered the GDR population completely, independent of the original SOEP, which was started in 1984 in West Germany (Federal Republic of Germany). We were thus able to simply add the new sample to the old one (with independent weighting/expansion-factors) in order to make SOEP not only representative for West Germany, but for the unified Germany as well.
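The idea of adding an independent sample with its own expansion factors can be illustrated with a minimal sketch. It assumes that each subsample covers a disjoint population and that each case carries a weight stating how many persons it represents; the record layout, sample labels and all numbers below are hypothetical and are not taken from SOEP.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    sample: str    # hypothetical subsample label, e.g. "A" (West) or "C" (East)
    weight: float  # expansion factor: persons in the population this case represents
    value: float   # any survey variable, e.g. monthly income


def pooled_weighted_mean(respondents):
    """Weighted mean over the pooled samples (sum of w*y divided by sum of w)."""
    total_weight = sum(r.weight for r in respondents)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(r.weight * r.value for r in respondents) / total_weight


def represented_population(respondents):
    """Population size represented by each subsample (sum of its weights)."""
    totals = {}
    for r in respondents:
        totals[r.sample] = totals.get(r.sample, 0.0) + r.weight
    return totals


# Illustrative, made-up cases: two from each independently weighted sample.
west = [Respondent("A", 4000.0, 2000.0), Respondent("A", 6000.0, 3000.0)]
east = [Respondent("C", 2500.0, 1000.0), Respondent("C", 2500.0, 1400.0)]
pooled = west + east

print(represented_population(pooled))
print(pooled_weighted_mean(pooled))
```

Because the two samples were weighted independently to their own populations, simply concatenating the records yields estimates for the combined population; no re-weighting across samples is needed in this simplified setting.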

After this sample was established, all subsequent moves from East to West (and, after a few years, from West to East as well) have been and continue to be covered by our standard annual tracking procedures for households that change their address.

Subsample C, however, is unique in the sense that it provides the only longitudinal micro data that allow analysis of the transition of an entire society from one regime to another. This is possible because we collected the first wave in June 1990, prior to German unification.

Because immigrants who do not join an existing household have a sampling probability of zero in an ongoing panel study, they are not covered by studies like PSID, BHPS and SOEP. But because the massive immigration that took place between 1985 (just after the start of SOEP) and the beginning of the nineties makes up more than five percent of Germany’s population, we felt it was necessary to deal with this problem in a constructive manner and look for an innovative solution. We therefore raised special funds to start a small subsample of new immigrants (subsample D) in 1994/1995. This is a random sample based on a screening of 20,000 households.

After a test run in 1998 (based on subsample E, which included a methodological test of a new survey technology, computer-assisted personal interviews, CAPI), we were able to raise additional money in 2000, almost doubling the sample size of SOEP with the addition of subsample F. This expansion responded to demand, last but not least from the Federal Government, for better policy analyses for subgroups of the population (focusing on labor market integration, welfare recipients, family formation, etc.).

Even with a sample size of more than 10,000 households, it is almost impossible to draw valid conclusions for high-income households (the top 2.5 percent of the income distribution). We therefore started subsample G in 2002, representing “high-income households” in Germany. Like subsample D, this sample is also a random sample based on a screening of households. In order to get about 1,000 high-income households, we screened nearly 100,000 households. In 2002 we also introduced wealth measures at the individual level for the first time (in 1988 we already had a wealth supplement as a drop-off questionnaire at the household level).

In 2003 we created a very special sample of “genuine fakes” that were identified in the existing SOEP interviews (see Schraepler and Wagner 2005, Schaefer et al. 2005). This was possible because data collected in the course of a panel survey can reveal fabricated interviews that would never be detected in a cross-sectional survey. Detection was possible, for example, because interviewers who made up interviews were unable to do so in a consistent manner over time, and because some households that were sent small gifts for participating in SOEP but never actually got interviewed called the fieldwork organization and asked why they had received the letters and gifts. Data users can thus analyze about 180 faked interviews (less than 0.5 percent of all interviews in the respective waves). These fakes are stored in a special file and deleted from the sample representing the German population.

Beginning with subsample E, we introduced CAPI as an additional interview mode. We were able to do this in a controlled experiment that revealed no major mode effects when changing the interview mode in an ongoing panel survey from PAPI (paper and pencil interview) to CAPI.

In the 1990s, adding new subsamples was one of our major tasks in strengthening the analytical power of SOEP. We also started, on a very low level, to broaden the theoretical scope of our questionnaire. We introduced questions and improved scales about expectations, personal values and self-control (locus of control).

Our users’ publications and developments in other longitudinal studies provide evidence that we can, and should, strengthen SOEP data by introducing broader self-reported health measures and new self-reported measures of our respondents’ personal traits and social capital. In 2002, we introduced new health indicators (smoking, height and weight), which are collected on a bi-annual basis. In 2003 we started to introduce subjective indicators on personal traits. In the tradition of SOEP, which was designed mainly for economic and sociological research, we began with specific concepts of personal traits that are of particular interest to economists and sociologists, namely trust, trustworthiness and fairness, followed in 2004 by indicators on risk aversion. In 2005 we added indicators for reciprocity and a short version of the NEO Personality Inventory: the “Big Five Inventory” (BFI) of personal traits. This is a purely psychological concept, but one with the potential to “rekindle the dialogue between sociology and personality psychology” (Roberts et al. 2004, 592). In 2006, we start to repeat these new indicators for the first time and, following up on our 1986 and 1996 surveys, again repeat the so-called Inglehart Index. This will make SOEP the first long-term panel survey worldwide to study period, cohort and age effects on this established and important index, which was introduced by a political scientist but is used by many sociologists to study value changes in modern societies.

Because there is considerable debate as to whether personal traits can be measured in a valid manner by “ordinary” survey questions, we supplemented the new survey questions with selected behavioral (that is, controlled) experiments of the kind used, for example, by experimental economists and psychologists in laboratory settings. Starting in 2003, we ran, on a random subsample of nearly 1,500 households, experiments on “trust and trustworthiness” (a two-step social dilemma experiment with two randomly paired individuals), and in 2006 we run an experiment on “time preferences” (this will be a one-step experiment with randomly chosen winning chances for each 9th of the sample). These three concepts are personal traits that are conceptualized in economics and sociology (they are more specific concepts than the “Big Five Traits” as conceptualized by psychologists).

In 2001, we introduced “triggered questionnaires”, which contain in-depth questions asked if a respondent experienced a specific event. We started these in-depth interviews in 2001 because that was the first year in which children born into a “SOEP household” (in 1984/1985) reached the age of becoming respondents on their own. Since then, we have given young people at this age a special “Youth Questionnaire” to collect retrospective information about childhood, school performance indicators, in-depth information about living conditions and “feelings” as a teenager (including a baseline measure of personal traits, values, etc.), relationship to parents (social capital), cultural capital and sports, and expectations about family, work and their future life.

In 2003, we began to deal with the event of “birth”, which had received too little attention in SOEP (and other household panel studies). Household panel studies have a great advantage compared to cohort studies: they observe not only mothers but also women who do not become mothers. The analysis of the selectivity of childhood and its impact on mothers and children is possible with household panel data, if the questionnaire is sensitive to this. So, using a “Mother and Child” questionnaire, we collected information about newborn babies, their mothers’ pregnancy, a first valuation of motherhood, the “care setting” of the babies, and support by the partner.

Starting in 2005, we followed up the birth events with a special “Infant” questionnaire, which collects information about two- and three-year-old children (again with health indicators, activities with the child, “care setting”, support by the partner and an ability and fitness scale (Vineland)). This means that we collect these data on children whose birth we observed in SOEP two waves ago. In other words, we have started to collect data about the birth cohorts 2003 and later. In 2007 or 2008 we will introduce a questionnaire for four-, five- or six-year-old children. Later we will also introduce a questionnaire for older children, covering the period before they leave elementary school and until they reach the entry age (17 years) for the standard SOEP adult questionnaire (and the special Youth Questionnaire). The first cohort of newborn sample members with completely enriched life-course data will be interviewed in 2018. By then, SOEP will be in its 34th year (which is not an inconceivably old age for a household panel study, as PSID shows).

3.3.2. Data Collection 2006 and Beyond
Coverage and Sample Size
As a basic rule for the administration of SOEP, we want to stabilize the cross-sectional number of cases by drawing fresh samples on a regular basis. We believe that—due to the minimal number of cases for small subgroups in the population—the cross-sectional number of cases should not be smaller than 10,000 households (as recommended by EUROSTAT for the German component of the official European Survey EU-SILC as well).

Regular Refresher Samples will also enable us to identify and survey recent immigrants on a regular basis. However, it could become necessary (as in 1994/95) to survey special immigrant samples from time to time to obtain sufficient numbers of cases for sound analyses of this specific subgroup.

A special problem with certain cohorts will soon materialize in East Germany. Due to the dramatic drop in fertility rates after the fall of socialism in East Germany to half of their original level, the numbers of births in Sample C were small. Thus, the size of the teenager and young adult cohorts in Sample C will also be small, especially when they reach the age of respondents. Because we believe that the research interest in this unique “transition generation” will be significant, it may become necessary to oversample this generation in a future refresher sample.

Refresher samples always require raising supplementary funding for the additional task. At the moment, we are fairly confident that we can carry out a refresher sample of about 1,500 households starting in late spring of 2006. We will also take this as an opportunity to test the Internet as a survey tool. The first wave will be conducted offline in the usual way, recruiting a gross sample by random walk, using interviewers to conduct the household interviews, and distributing a special non-response survey to the households refusing to participate. Although not all respondents have online access (and access is selective, as we can see from the existing sample), we will try to do the second wave (or an in-between wave) by Internet. In a small pretest sample of about 500 respondents carried out in spring 2004, we tested a small set of the same questions asked in the current SOEP. The results showed that the questionnaire can be administered by Internet, but non-response on income data was significantly higher.

Beyond 2006 we are thinking about asking widows and widowers about the death of their partners (“exit interviews”). We are also thinking about tracing emigrants. Migration plays a growing role all over the world. Thus not only immigrants but also emigrants have become an important issue for panel surveys. Up to now, respondents have no longer been considered part of the universe to be represented once they leave the country where a panel survey takes place. However, if we think about a panel survey as a set of cohorts, then emigrants still belong to the universe of the survey. Even if one is not interested in cohort analysis, keeping track of emigrants will raise the quality of a panel because a large part will return to their country of origin. Our prognosis is that tracing emigrants and interviewing them will be considered a standard procedure for household panels in Europe over the next decade.

Another potentially worthwhile improvement to our sample would be an attempt to reinterview those dropouts that did not explicitly refuse to participate (see BHPS procedures). There are doubts whether these dropouts represent a self-selection of all dropouts. We plan to check whether this could be a worthwhile future project. A first step was an address check of all “lost” survey members, carried out in 2002. For about 8,000 individuals with whom we had lost contact due to non-response and attrition, we found that about 1,000 had died and a significant number had emigrated. This information improves not only the number of observations available for mortality or migration-related analyses but also the quality of attrition analysis by differentiating demographic losses from “regular” panel attriters.

Concepts and Measurement
In 2006, we introduce the physical health measure of grip strength (for a subsample only, after a successful pre-test in 2005). Changes in grip strength are a predictor for changes in health status, and are more accurate than the self-reported health scales that are standard in all household panel studies. The grip strength measure is already used, for example, in SHARE.

In 2006 we also collect “physical” information about twins who can be identified as such in the SOEP samples. We will ask them or their mothers whether they are monozygotic or not. This marginal investment (in terms of costs) in better information will considerably improve the possibility of doing analyses in the research tradition of “Behavioral Genetics”.

In 2006 we also introduce measures or tests of the “cognitive ability” of our respondents. One test takes about 30 minutes for three dimensions of ability (verbal potentials, numerical potentials, and figural potentials). It is applied to first-time teenage respondents. And two ultra-short tests (enumerating animals and a symbol-digit test with three time stops each after 30, 60 and 90 seconds), which take less than five minutes, are applied in a subsample of the adult respondents.

Together with some of our users and our advisory board, we are currently discussing the idea of introducing a new kind of triggered questionnaire in 2006: “exit interviews,” which would be triggered by the death of a partner. This innovation has been recommended based on positive experiences in the HRS that have improved the analysis of intergenerational wealth transfers and inheritance.

In order to reduce the burden for our respondents, we will test “matrix sampling”, which means that not all questions are given to all respondents, but that the questionnaire includes “missing values” by design. Because this design ensures that the missing values are completely random, “perfect” multiple imputations are possible. This kind of sampling is more or less nonexistent in official statistics and survey research, but fairly common in educational research, for example, in tests like PISA. We plan a serious test of its potential use in SOEP because matrix sampling reduces the burden for the respondent and thus gives the opportunity to introduce new subject fields (e.g. triggered questionnaires) and concepts.
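The logic of matrix sampling can be sketched in a few lines. The following is an illustration under assumptions, not the design planned for SOEP: every respondent receives a core block of questions plus a fixed number of rotating modules drawn at random, so that which modules are skipped is independent of respondent characteristics. The module names and the function names `assign_modules` and `planned_missing_share` are hypothetical.

```python
import random


def assign_modules(respondent_ids, core, rotating, k, seed=0):
    """Give every respondent the core block plus k randomly drawn rotating modules."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    design = {}
    for rid in respondent_ids:
        design[rid] = list(core) + sorted(rng.sample(rotating, k))
    return design


def planned_missing_share(design, module):
    """Share of respondents who were not given the module (missing by design)."""
    skipped = sum(1 for modules in design.values() if module not in modules)
    return skipped / len(design)


# Hypothetical questionnaire: two core modules and four rotating modules.
core = ["income", "employment"]
rotating = ["health", "values", "trust", "leisure"]
design = assign_modules(range(1, 101), core, rotating, k=2, seed=42)

print(design[1])
print(planned_missing_share(design, "health"))
```

With two of four rotating modules per respondent, each rotating module is skipped by roughly half of the sample, while the core is never missing. Because the skip pattern is generated by the random draw alone, the resulting gaps are missing completely at random, which is the property that makes multiple imputation well behaved.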

3.4. Data Preparation, Documentation and Access
Data preparation, documentation and access are just as important as the collection of micro data (cf. Collins 2006, 524). Here we cannot provide anything close to a comprehensive overview of these aspects, but would like to mention some highlights and features of SOEP data that are new and not yet commonly known.

In panel studies like SOEP, the absolute focus is on standardized answers. But in all studies we also collect some “qualitative data”, for example, questions on worries or an open “cool-down question” at the end of a questionnaire. In SOEP, we also ask for the given name of all sample members, mainly for intra-household and longitudinal control purposes. These data are of interest for special research questions. In 2004, we started putting these answers into data formats and codes that allow for user-friendly and data-protected analysis.

Much of this kind of information is embedded in the data but difficult to “find” and analyze. We have made a significant effort to generate user-friendly data, for example, by identifying variables like “tenure with current employer”, which are straightforward and in high demand. We also provide data files with extensive biographical information (on parents, fertility, migration, marital status history, employment history, social origin, youth, etc., cf. Frick and Schneider 2005) as well as status variables with a focus on demographics like “year of death”, time-invariant immigration-related variables (country of birth, year of first migration to Germany), and link variables like pointers to parents, partners, children and to twin siblings as well as to other households at the same postal address (the latter only available since 2005).

In 2001, we started compiling spatial context data, identified by detailed geo-code information, that can be matched to the micro data in SOEP. At the moment, this is possible at the level of the sixteen federal states (NUTS1), the 95 German spatial planning regions (Raumordnungsregionen), the almost 400 counties (NUTS3) and at the zip-code level (reduced information only). Finally, we are in the process of preparing geo-coded data at the block level (Strassenabschnitte).

We are considering a match of SOEP micro data with register data from the employment office by asking a sub-sample of respondents for their social security numbers. This new procedure would offer a considerably better opportunity for modeling labor market processes.

The imputation of missing income values has been a major undertaking in recent years, and was particularly relevant for achieving full comparability across the various member datasets of the Cross National Equivalent File (CNEF) (see below). In this context, it appeared most important to include longitudinal information in the imputation process (if available), which yields more reliable imputation results than purely cross-sectional imputation techniques.

Up to now, non-responding individuals within a responding household have been treated as missing, which can bias household income structures. Following the BHPS, we will invest in the imputation of missing income values for those temporary non-respondents by using their income structure from previous waves. Moreover, the imputation methods will be checked on a regular basis.
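To make the idea of using longitudinal information concrete, here is a deliberately simplified sketch. It is an assumption for illustration and not the procedure used for SOEP or the BHPS: a missing income in the current wave is replaced by the person's own income from the previous wave, scaled by the aggregate income trend among persons observed in both waves. The function name and the numbers are hypothetical.

```python
def impute_from_previous_wave(prev, curr):
    """Fill missing current-wave incomes from each person's previous-wave income.

    prev, curr: dicts mapping person id -> income, with None for a missing value.
    The previous income is scaled by the overall trend (ratio of income sums)
    among persons observed in both waves.
    """
    both = [pid for pid, income in curr.items()
            if income is not None and prev.get(pid) is not None]
    base = sum(prev[pid] for pid in both)
    if not both or base == 0:
        raise ValueError("no usable cases observed in both waves")
    trend = sum(curr[pid] for pid in both) / base

    imputed = {}
    for pid, income in curr.items():
        if income is not None:
            imputed[pid] = income                # observed value is kept
        elif prev.get(pid) is not None:
            imputed[pid] = prev[pid] * trend     # longitudinal imputation
        else:
            imputed[pid] = None                  # no longitudinal information
    return imputed


# Made-up example: person 3 did not respond in the current wave.
previous_wave = {1: 1000.0, 2: 2000.0, 3: 3000.0, 4: None}
current_wave = {1: 1100.0, 2: 2200.0, 3: None, 4: None}
print(impute_from_previous_wave(previous_wave, current_wave))
```

In this toy case incomes grew by ten percent among persons observed twice, so person 3 is assigned roughly 3,300. A purely cross-sectional technique would ignore the fact that person 3 earned well above average in the previous wave, which is why longitudinal information tends to give more plausible individual values. Person 4, with no earlier observation, would still need a cross-sectional method.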

In the more than 20 years of running the SOEP, we have learned much about the analysis of dropouts. For example, over the years, more and more variables have been taken into account for attrition analyses. We will check whether these improvements can be used to improve attrition analyses and the longitudinal weighting of the first waves (in the 1980s). A special project will be the analysis of non-response of individual household members within participating households (“partial unit-non-response”). This will also entail analysis of elderly respondents approaching death (observed over the course of time), which will be of special interest.

The longitudinal weighting of SOEP is based on a solid attrition analysis and on certain assumptions about the survey probabilities of respondents who join the survey for the first time by moving into existing households (i.e., living with sample members). In this context, it is also worthwhile to note that in 2004, 21 waves after the start of SOEP, the share of newly founded households in Samples A and B was 45% and 55%, respectively. For this and other reasons, we will have to invest in alternative sets of assumptions about the survey probabilities of new respondents (for example, like the one used in the case of the ECHP). The “fair share” approach will be tested in subsequent years.
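As a rough illustration of what a “fair share” rule can look like, the sketch below follows one formulation often described in the weighting literature; it is an assumption here, since the text does not specify which variant will be tested. Original sample members keep their base weights, persons who moved in later carry a base weight of zero, and the household (and each of its current members) receives the average of these values. The function name and numbers are hypothetical.

```python
def fair_share_household_weight(member_base_weights):
    """Average the base weights of all current household members.

    member_base_weights: one entry per current member; original sample
    members carry their base weight, new entrants carry 0.0 (assumption).
    """
    if not member_base_weights:
        raise ValueError("household has no members")
    return sum(member_base_weights) / len(member_base_weights)


# Made-up household: two original sample members and one person who moved in.
weights = [1200.0, 1200.0, 0.0]
shared = fair_share_household_weight(weights)
print(shared)
```

The total weight carried by the household is unchanged (three members at 800 each instead of two at 1,200 each), so no explicit selection probability has to be modeled for the new entrant. That is the appeal of this family of rules when, as noted above, a large share of households has been newly founded since the first wave.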

Since the beginning of 2006, online access to the sensitive geo-codes has been possible through a “secure interface”. The software we use is called SOEPremote and is basically adapted from the LIS remote system LISSY, which is better suited to our aims than, for example, NESSTAR. For a description of SOEPremote see Goebel (2005).

Extensive documentation of SOEP data is available via the project’s homepage (www.diw.de/gsoep), including the “Desktop Companion, DTC” (cf. Haisken-DeNew and Frick 2005), a detailed description of the set-up of the biographical information (cf. Frick and Schneider 2005) and various introductory papers for using prominent statistical software packages (SPSS, Stata, SAS) with SOEP. The most important documentation tool is SOEPinfo, a web-based information system that allows users to identify information at the variable level (including frequencies and an item’s correspondence across time) and gives support in setting up data retrievals (in Stata, SPSS, SAS) for generating rectangular analysis files from the underlying 250 SOEP micro-data files (http://panel.gsoep.de/soepinfo/).

The SOEPmonitor publishes statistical time series information based on SOEP data (http://www.diw.de/english/sop/service/soepmonitor/index.html). We provide data series for the years 1984 to 2004, disaggregated for East and West Germany since 1990, for selected cross-sectional and longitudinal information at the level of households and persons. This gives interested parties relevant information on how “life in Germany” has developed since the mid-1980s, and may also provide users with benchmark information for their own research.

SOEP plays an important and active role in international networks working on the construction of cross-nationally comparative databases (of both a cross-sectional and a panel nature) (cf. Burkhauser and Lillard 2005). SOEP data are available for such comparative academic research and policy analyses in the following datasets and projects:

In order to achieve this goal, it is of utmost importance to apply international coding and classification standards in compiling national micro data. We have identified the following as prime examples of user-friendly data produced by using “flexible” concepts in our questionnaires and subsequently carrying out ex-post harmonization:

 

4. Conclusion & Prospects

Panel studies are particularly well-suited to address the major substantive social science research questions that will have sweeping effects on society in the near future, from the local to the global level: aging, migration, globalization, and childhood development.

Recent theoretical and empirical developments in the social sciences provide strong evidence that for valid empirical testing of social science theories and for reliable evaluation of policy measures, we need longitudinal data that cover the variables of not only one but many disciplines. Cohort and panel studies must therefore expand continuously to become more interdisciplinary devices, and must begin with data collection on individuals as early as possible in the life course.

SOEP, like other panel studies under academic direction, such as BHPS, HRS, ELSA and SHARE, stands for theory-based longitudinal data collection, not just “more and better statistics” (cf. Burkhauser 2006). Given the multidisciplinary set-up of the data and its users, recent SOEP-based papers have been published in top academic journals in a variety of disciplines. See for example:

In addition to the scientific use of the data, they have also been used in numerous policy analyses for Germany alone as well as for the German component of cross-national comparative policy studies. SOEP data have been used in recent years in the following reports by the German government and the OECD:

Major current concerns with longitudinal analysis include how to provide researchers with appropriate concepts that enable them to make full use of the data, and how to design the organizational infrastructure to facilitate and improve access to the data. The SOEP team is currently grappling with these issues and will continue to seek solutions in line with the plans for SOEP’s future development outlined here. Above and beyond this, through our ongoing interaction with other producers of panel data, we are currently discussing methodological issues (e.g., pre-testing) and substantive issues (e.g., timing of special topical modules) that can simplify future data harmonization and thus support cross-national analyses as the most efficient means for identifying the “best practice” in various policy fields. In any case, successful ex-ante coordination of further survey improvements will also facilitate future ex-post harmonization.

Panel studies under academic direction will undoubtedly continue to provide an important data source for policy analyses in the future. Some division of labor between official statistics and academic data collection is therefore conceivable in the next few decades (at least in Europe). Official statistics would run short-term panels (like EU-SILC) that satisfy the short-term needs of policymakers, whereas panel studies under academic direction could emphasize the life course of respondents, including intergenerational aspects and transmission in particular.

SOEP is currently discussing questions of data collection and analysis more and more intensively with the teams who run PSID and BHPS. BHPS and SOEP are the only panel studies under academic direction at the moment in Europe. Expanding the existing network of active panel data providers and analysts from official statistics and the academic community by pooling their experiences will improve not only the quality of the international panel data infrastructure, but also the analytic competence of users. This can also foster the emergence of new panels, as can be seen in the case of New Zealand (SOFIE) and the Australian HILDA survey, which has succeeded in combining all the “goodies” of the “old” panels in Europe and North America.

 

References

Adler, Marina A. 2004. Child-Free and Unmarried: Changes in the Life Planning of Young East German Women, Journal of Marriage and Family, 66(5), 1170-1179.

Baker, Catherine. 2004. Behavioral Genetics, Washington, D.C.: http://www.aaas.org/spp/bgenes/publications.shtml.

Banks, Randy and Heather Laurie. 2000. From PAPI to CAPI: The Case of the British Household Panel Survey, Social Science Computer Review, 18(4), 397-406

Becher, Gunther et al. 2005.  Nicht-invasive Diagnostik von Entzündungsmarkern bei chronischen Atemwegserkrankungen, Laborwelt 6(3): 26-29

Becker, Irene, Joachim R. Frick, Markus M. Grabka, Peter Krause, and Gert G. Wagner. 2003. A Comparison of the Main Household Income Surveys for Germany: EVS and SOEP, in: Richard Hauser and Irene Becker (eds.), Reporting on Income Distribution and Poverty. Berlin, Springer: 55-90

Boersch-Supan, Axel and Hendrik Juerges (eds.). 2005.  The Survey of Health, Ageing and Retirement in Europe – Methodology. Mannheim: http://www.mea.unimannheim.de

Borghans, José A.M., Lex Borghans and Bas ter Weel. 2005. Is there a Link between Economic Outcomes and Genetic Evolution? Cross-Country Evidence from the Major Histocompatibility Complex, IZA Discussion Paper No. 1838; Bonn

Burkhauser, Richard V. 2006.  Creating Internationally Comparable All-Age or Older-Age Cohort Long-Term, Social-Science-Based Longitudinal Data Sets in Canada - Discussion prepared for the Conference on Longitudinal Social Surveys in an International Perspective, January 24, 2006, Ithaca/NY

Burkhauser, Richard V. and Dean R. Lillard. 2005.  The Contribution and Potential of Data Harmonization for Cross-National Comparative Research, Journal of Comparative Policy Analysis, 7 (4) (December): 313-330 

Burkhauser, Richard V. and Dean R. Lillard. 2006.  The Case for NIA Leadership in Integrating State-of-the-Art Biomarkers into Next Generation Social-Science-Based Data, Cornell University Working Paper, Ithaca/NY

Burkhauser, Richard V. and John Cawley. 2005.  Obesity, Disability, and Movement onto the Disability Insurance Rolls, paper presented at the International Health Economics Association 5th World Congress

Burkhauser, Richard V., Michaela Kreyenfeld, and Gert G. Wagner. 1997.  The German Socio-Economic Panel - A Representative Sample of Reunited Germany and its Parts, Vierteljahrshefte zur Wirtschaftsforschung 66(1): 7-16

Camerer, Colin F. and George Loewenstein. 2003. Behavioral Economics: Past, Present, Future, in: Camerer, Colin F., George Loewenstein and Matthew Rabin (eds.), Advances in Behavioral Economics, Princeton University Press: 3-51

Camerer, Colin, George Loewenstein, and Drazen Prelec. 2005.  Neuroeconomics: How Neuroscience Can Inform Economics. Journal of Economic Literature, 43(1): 9-64 

Carneiro, Pedro, James Heckman and Dimitriy Masterov. 2005. Labor market discrimination and racial differences in premarket factors, IFAU Working Paper Series 2005:3, Uppsala

Cawley, John, Markus M. Grabka, and Dean R. Lillard. 2005. A Comparison of the Relationship between Obesity and Earnings in the U.S. and Germany, Schmollers Jahrbuch 125(1): 119-129

Clark, W. A. V., M. C. Deurloo, and F. M. Dieleman. 1997. Entry to Home-ownership in Germany: Some comparisons with the United States, Urban Studies, 3, 7–19

Collins, Linda M. 2006. Analysis of Longitudinal Data: The Integration of Theoretical Model, Temporal Design, and Statistical Model, Annual Review of Psychology 57: 505-528

De Quervain, D., U. Fischbacher, V. Treyer, M. Schellhammer, U. Schnyder, A. Buck, and E. Fehr. 2004.  The Neural Basis of Altruistic Punishment, Science 305: 1254-1258

Denny, Kevin and Vincent O'Sullivan. 2004. Can education compensate for low ability? Evidence from British data, IFS Working Paper No. W04/19, London

Diener, Ed. 1994. Assessing Subjective Well-Being: Progress and Opportunities, Social Indicators Research 31(2): 103-157.

Diener, E., R. E. Lucas, and C. N. Scollon. 2006. Beyond the Hedonic Treadmill: Revising the Adaptation Theory of Well-Being, American Psychologist 61(4), 305-314.

Diewald, M. 2001. Unitary social science for causal understanding? Experiences and prospects of life course research. Canadian Studies in Population. Special issue on Longitudinal Research, 28(2), 219-248.

Diewald, Martin, Jörg Lüdicke, Frieder Lang, and Jürgen Schupp. 2006. Familie und soziale Netzwerke. Ein revidiertes Erhebungskonzept für das Sozio-oekonomische Panel (SOEP) im Jahr 2006. DIW Research Note 2006-14. Berlin.

DiPrete, T. A., and H. Engelhardt. 2004. Estimating Causal Effects with Matching Methods in the Presence and Absence of Bias Cancellation, Sociological Methods and Research, 32(4), 501–528.

DiPrete, Thomas A., S. Philip Morgan, Henriette Engelhardt, and Hana Pacalova. 2003. Do Cross-National Differences in the Costs of Children Generate Cross-National Differences in Fertility Rates?, Population Research and Policy Review, 22(5-6), 439-477.

Dohmen, T. et al. 2005.  Individual Risk Attitudes: New Evidence from a Large, Representative, Experimentally-Validated Survey, DIW Discussion Paper No. 511, Berlin.

Dolton, Peter, Gerald Makepeace and Oscar Marcenaro-Gutierrez. 2005. Career progression: Getting-on, getting-by and going nowhere, Education Economics 13(2): 237-255

Drever, Anita I. and William A. V. Clark. 2002. Gaining Access to Housing in Germany: The Foreign-minority Experience, Urban Studies, 39 (13): 2439-2453

Dustmann, C., and A. van Soest. 2002. Language Fluency and Earnings: Estimation with Misclassified Language Indicators, The Review of Economics and Statistics, 83(4), 663–674

Ermisch, J., Francesconi, M., Siedler, T., 2006. Intergenerational Economic Mobility and Assortative Mating, Economic Journal, July 2006 (forthcoming).

Fehr, Ernst, Michael Kosfeld, Markus Heinrichs, Paul Zak, and Urs Fischbacher. 2005a. Oxytocin increases Trust in Humans, Nature 435: 673-676

Fehr, Ernst, Urs Fischbacher and Michael Kosfeld. 2005b.  Neuroeconomic Foundations of Trust and Social Preferences, American Economic Review - Papers & Proceedings:  346-351.

Fehr, Ernst, Urs Fischbacher, Bernhard von Rosenbladt, Jürgen Schupp, and Gert G. Wagner. 2002. A Nation-Wide Laboratory: Examining Trust and Trustworthiness by Integrating Behavioral Experiments into Representative Surveys, Schmollers Jahrbuch, 122(4): 1-24

Freese, Jeremy, Jui-Chung Allen Li and Lisa D. Wade. 2003.  The Potential Relevances of Biology to Social Inquiry, Annual Review of Sociology 29: 233-256.

Frey, Bruno S, and Alois Stutzer. 2002.  Happiness & Economics. Princeton and Oxford: Princeton University Press.

Frick, Joachim R. and Markus M. Grabka. 2005. Item-non-response on income questions in panel surveys: incidence, imputation and the impact on inequality and mobility. Allgemeines Statistisches Archiv, 89: 49-60

Frick, Joachim R. and Markus M. Grabka. 2003. Imputed Rent and Income Inequality: A Decomposition Analysis for Great Britain, West Germany and the U.S., The Review of Income and Wealth 49(4): 513-537.

Frick, Joachim R., and Thorsten Schneider. 2005. Biography and Life History Data in the German SOEP, DIW Berlin: http://www.diw.de/deutsch/sop/service/doku/docs/bio.pdf

Frick, Joachim R., Jan Göbel, Edna Schechtman, Shlomo Yitzhaki, and Gert G. Wagner. 2006. Using Analysis of Gini (ANOGI) for Detecting Whether Two Sub Samples Represent the Same Universe: The German Socio-Economic Panel Study (SOEP) Experience, Sociological Methods and Research (forthcoming)

Frick, Joachim R., John P. Haisken-DeNew, Martin Spiess, and Gert G. Wagner. 2005a. Overview of the SOEP, in: Haisken-DeNew, John P. and Joachim R. Frick (eds.). 2005. Desktop Companion to the German Socio-Economic Panel (SOEP) – Version 8.0, Berlin: http://www.diw.de/deutsch/sop/service/dtc/dtc.pdf

Frick, Joachim R., John P. Haisken-DeNew, Peter Krause, Rainer Pischner, Juergen Schupp, and C. Katharina Spiess. 2005b. Survey Extensions, in: Haisken-DeNew, John P. and Joachim R. Frick (eds.). 2005. Desktop Companion to the German Socio-Economic Panel (SOEP) – Version 8.0, Berlin: http://www.diw.de/deutsch/sop/service/dtc/dtc.pdf

Frijters, P., J. P. Haisken-DeNew, and M. A. Shields. 2004. Investigating the Patterns and Determinants of Life Satisfaction in Germany Following Reunification, Journal of Human Resources, 39(3), 649–673.

Fryer, Roland G. Jr. and Steven D. Levitt. 2004. The Causes and Consequences of Distinctively Black Names, Quarterly Journal of Economics, CXIX(3): 767-803

Gallagher, Dympna, Marjolein Visser, Dennis Sepulveda, et al. 1996. How Useful is Body Mass Index for Comparison of Body Fatness Across Age, Sex, and Ethnic Groups? American Journal of Epidemiology, 143(3): 228-39

Gangl, Markus. 2004. Welfare States and the Scar Effects of Unemployment: A Comparative Analysis of the United States and West Germany, American Journal of Sociology, 109(6): 1319–1364.

Giele, Janet Z. and Glen H. Elder Jr. 1998. Methods of Life Course Research: Qualitative and Quantitative Approaches. Thousand Oaks-London: Sage.

Green, David A. and W. Craig Riddell. 2003. Literacy and earnings: an investigation of the interaction of cognitive and unobserved skills in earnings generation, Labour Economics 10: 165-184

Gul, Faruk and Wolfgang Pesendorfer. 2005. The Case for Mindless Economics. Levine's Working Paper Archive [2005-11-13]

Haisken-DeNew, John P. and Joachim R. Frick (eds.). 2005. Desktop Companion to the German Socio-Economic Panel (SOEP) – Version 8.0, Berlin: http://www.diw.de/deutsch/sop/service/dtc/dtc.pdf 

Hamermesh, Daniel S. 2004.  Subjective outcomes in economics, NBER Working Paper 10361. Cambridge.

Hank, Karsten, Michaela Kreyenfeld, and C. Katharina Spiess. 2004. Kinderbetreuung und Fertilität in Deutschland, Zeitschrift für Soziologie, 33 (3), 228-244.

Hank, Karsten. 2003. The Differential Influence of Women's Residential District on the Risk of Entering First Marriage and Motherhood in Western Germany, Population and Environment, 25 (1), 3-21.

Hank, Karsten, Hendrik Jürges, Jürgen Schupp, and Gert G. Wagner. 2006. Die Messung der Greifkraft als objektives Gesundheitsmaß in sozialwissenschaftlichen Bevölkerungsumfragen: Erhebungsmethodische und inhaltliche Befunde auf Basis von SHARE und SOEP. DIW Discussion Paper  577. Berlin

Headey, Bruce, Markus M. Grabka, Jonathan Kelley, Prasuna Reddy, and Yi-Ping Tseng. 2002. Pet Ownership Is Good for Your Health and Saves Public Expenditure too: Australian and German Longitudinal Evidence, Australian Social Monitor 5(4): 93-99

Hedley, Allison A., Cynthia L. Ogden, Clifford L. Johnson, Margaret D. Carroll, Lester R. Curtin, and Katherine M. Flegal. 2004. Overweight and Obesity Among U.S. Children, Adolescents, and Adults, 1999-2002, JAMA, 291(23): 2847-2850

Heijke, Hans, Christoph Meng and Catherine Ris. 2003.  Fitting to the job: the role of generic and vocational competencies in adjustment and performance, Labour Economics 10: 215-229

Hsu, Ming, Meghana Bhatt, Ralph Adolphs, Daniel Tranel, and Colin F. Camerer. 2005. Neural Systems Responding to Degrees of Uncertainty in Human Decision-Making, Science 310: 1680-1683.

Hunt, Jenny. 1999. Has Work-Sharing Worked in Germany?, Quarterly Journal of Economics, 114, 1–32

Hunt, Jenny. 2004. Are Migrants More Skilled than Non-Migrants? Repeat, Return and Same-Employer Migrants, Canadian Journal of Economics, 37, 830-849.

Hurd, Michael D. and Smith, James P. 2001. Anticipated and Actual Bequests. In: David A. Wise (Ed), Themes in the Economics of Aging. Chicago, IL: University of Chicago Press.

Huschka, Denis, Juergen Gerhards, and Gert G. Wagner. 2005.  Naming Differences in Divided Germany, DIW Research Note 8/2005, Berlin.

Jäckle, Annette. 2006. Dependent Interviewing: A Framework and Application to Current Research. ISER Working Paper 2006-32. Essex.

Kahneman, Daniel. 2003.  A Psychological Perspective on Economics, AEA Papers and Proceedings 93(2), 162-174.

Kalwij, Adriaan and Frederic Vermeulen. 2005.  Labour Force Participation of the Elderly in Europe: The Importance of Being Healthy, IZA Discussion Paper No. 1887. Bonn

Karlan, Dean S. 2006.  Using Experimental Economics to Measure Social Capital and Predict Financial Decisions, American Economic Review, forthcoming

Klein, Markus and Manuela Pötschke. 2004. Die intra-individuelle Stabilität gesellschaftlicher Wertorientierungen - Eine Mehrebenenanalyse auf der Grundlage des sozio-oekonomischen Panels (SOEP), Kölner Zeitschrift für Soziologie und Sozialpsychologie (KZfSS), 56(3): 432-456.

Knutson, Brian and Richard Peterson. 2005. Neurally reconstructing expected utility, Games and Economic Behavior 52: 305-315.

Kohler, Ulrich and Frauke Kreuter. 2005. Data Analysis Using Stata. College Station, Texas: Stata Press.

Kohler, Ulrich and Frauke Kreuter. 2006. Datenanalyse mit Stata: Allgemeine Konzepte der Datenanalyse und ihre praktische Anwendung, 2nd edition. München: Oldenbourg Verlag.

Korupp, Sylvia E. and Marc Szydlik. 2005. Causes and Trends of the Digital Divide, European Sociological Review, 21(4), 109-422.

Kreyenfeld, Michaela and Karsten Hank. 2000. Does the availability of child care influence the employment of mothers? Findings from western Germany, Population Research and Policy Review, 19 (4), 317-337.

Kroh, Martin. 2005. An Experimental Evaluation of Popular Well-Being Measures, DIW Discussion Papers, Berlin.

Kuhnen, Camelia M. and Brian Knutson. 2005. The Neural Basis of Financial Risk Taking, Neuron 47: 763-770.

Lang, Frieder R. 2006. Erfassung des kognitiven Leistungspotenzials und der "Big Five" mit Computer-Assisted-Personal-Interviewing (CAPI): Zur Reliabilität und Validität zweier ultrakurzer Test und des BFI-S, DIW Research Note 9/2005, Berlin

Lieberson, Stanley. 2000. A Matter of Taste: How Names, Fashions, and Culture Change. New Haven: Yale University Press.

Lillard, Dean R. and Richard V. Burkhauser. 2005.  Income Inequality and Health:  A Cross Country Analysis.  Schmollers Jahrbuch:  Journal of Applied Social Science Studies, 125 (1): 109-118.

Lofstrom, Magnus and John H. Tyler. 2004. Measuring the Returns to the GED: Using an Exogenous Change in GED Passing Standards as a Natural Experiment, IZA Discussion Paper No. 1306, Bonn.

Lucas, R. E., A. E. Clark, Y. Georgellis, and E. Diener. 2003. Reexamining Adaptation and the Set Point Model of Happiness: Reactions to Changes in Marital Status, Journal of Personality and Social Psychology, 84(3), 527–539.

Lucas, Richard E. 2005. Time Does Not Heal All Wounds, Psychological Science 16(12): 945-950.

McClure, Samuel M., David I. Laibson, George Loewenstein, et al. 2004. Separate Neural Systems Value Immediate and Delayed Monetary Rewards, Science 306: 503-507.

McCrae, Robert R. and P. T. Costa Jr. 1992. Revised NEO Personality Inventory (NEO PI-R) and NEO Five Factor Inventory – Professional Manual. Odessa: Psychological Assessment Resources.

McGinnity, F. 2002. The Labour Force Participation of the Wives of Unemployed Men, European Sociological Review, 18(4), 473–488.

National Institutes of Health.  1996.  Bioelectrical Impedance Analysis in Body Composition Measurement: National Institutes of Health Technology Assessment Conference Statement, American Journal of Clinical Nutrition, 64(3S): 524S-532S.

Neave, Nick, Sarah Laing, Bernhard Fink and John T. Manning. 2003. Second to fourth digit ratio, testosterone and perceived male dominance, Proc. R. Soc. Lond. B 270: 2167-2172.

Nolte, Ellen and Martin McKee. 2004. Changing Health Inequalities in East and West Germany after Unification, Social Science & Medicine, 58(1), 119-136.

Nyhus, Ellen K. and Empar Pons. 2005. The Effects of Personality on Earnings, Journal of Economic Psychology, 26: 363-384.

Osborne Groves, Melissa. 2005.  How Important is your Personality? Labor Market Returns to Personality for Women in the US and UK, Journal of Economic Psychology 26: 827-841.

Pannenberg, Markus, Rainer Pischner, Ulrich Rendtel, Martin Spiess and Gert G. Wagner. 2005.  Sampling and Weighting, in: Haisken-DeNew, John P. and Joachim R. Frick (eds.). 2005.  Desktop Companion to the German Socio-Economic Panel (SOEP) – Version 8.0, Berlin: http://www.diw.de/deutsch/sop/service/dtc/dtc.pdf

Prentice, Andrew M. and Susan A. Jebb. 2001. Beyond Body Mass Index. Obesity Reviews, 2(3): 141-147.

Roberts, Brent W., Richard W. Robins, Kali H. Trzesniewski, and Avshalom Caspi. 2004. Personality Trait Development in Adulthood, in: Jeylan T. Mortimer and Michael J. Shanahan (eds.), Handbook of the Life Course. New York: Kluwer Academic/Plenum Publishers: 579-595.

Sampson, Robert J., Jeffrey D. Morenoff, and Thomas Gannon-Rowley. 2002. Assessing “Neighborhood Effects”: Social Processes and New Directions in Research, Annual Review of Sociology 28: 443-478

Schaefer, Christin, Joerg-Peter Schräpler, Klaus-Robert Müller, and Gert G. Wagner. 2005.  Automatic Identification of Faked and Fraudulent Interviews in Surveys by Two Different Methods,  Schmollers Jahrbuch 125(1): 183-193. 

Schraepler, Joerg-Peter and Gert G. Wagner. 2005.  Characteristics and Impact of Faked Interviews in Surveys: An Analysis of Genuine Fakes in the Raw Data of SOEP, Allgemeines Statistisches Archiv 89 (1): 7-20.

Schraepler, Joerg-Peter. 2005.  Respondent Behavior in Panel Studies: A Case Study for Income Nonresponse by Means of the German Socio-Economic Panel (SOEP), Sociological Methods & Research 33(1): 113-156.  

Schraepler, Joerg-Peter. 2006.  Explaining Income-Nonresponse: A Case Study by Means of the British Household Panel Study (BHPS), Quality & Quantity (forthcoming).

Schraepler, Joerg-Peter. 2004. Respondent Behavior in Panel Studies - A Case Study for Income Nonresponse by Means of the German Socio-Economic Panel (SOEP), Sociological Methods & Research, 33 (1): 118-156.

Schraepler, Joerg-Peter, Jürgen Schupp and Gert G. Wagner. 2006. Changing from PAPI to CAPI: A longitudinal study of Mode-Effects based on an Experimental Design (mimeo). Paper presented at the European Conference on Quality in Survey Statistics (Q2006) Cardiff (UK). DIW-Discussion Papers No. 593. Berlin.

Schuman, Howard and Jacqueline Scott. 1989. Response Effects Over Time, Sociological Methods & Research 17(4): 398-408.

Schupp, Jürgen and Gert G. Wagner. 1995.  The German Socio-Economic Panel – a Database for Longitudinal International Comparisons, Innovation 8(1): 95-108. 

Schupp, Jürgen and Gert G. Wagner. 2002.  Maintenance of an Innovation in Long Term Panel Studies: The Case of the German Socio-Economic Panel (SOEP), Allgemeines Statistisches Archiv 86(2): 163-175.

Singer, Tanja, Ben Seymour, John P. O'Doherty, Klaas E. Stephan, Raymond J. Dolan, and Chris D. Frith. 2006. Empathic neural responses are modulated by the perceived fairness of others, Nature doi:10.1038/04271.

Shanahan, Michael, Scott M. Hofer, Lilly Shanahan. 2003. Biological Models of Behavior and the Life Course, in: Jeylan T. Mortimer and Michael J. Shanahan (eds.), Handbook of the Life Course. New York: Kluwer Academic/Plenum Publishers: 597-622.

Skitka, Linda J. and Edward G. Sargis. 2006. The Internet as Psychological Laboratory, Annual Review of Psychology 57: 529-555.

Solga, Helga, Elsbeth Stern, Bernhard von Rosenbladt, Juergen Schupp and Gert G. Wagner. 2005. The measurement and importance of general reasoning potentials in schools and labor markets, DIW Research Note 10/2005, Berlin.

Spiess, C. Katharina. 2005. Das Sozio-oekonomische Panel (SOEP) und die Moeglichkeiten regionalbezogener Analysen, in: Gert Grözinger and Wenzel Matiaske (eds.), Deutschland regional, München and Mering: Rainer Hampp Verlag: 57-64.

Spiess, C. Katharina, Felix Buechel, and Gert G. Wagner. 2003. Children's School Placement in Germany: Does Kindergarten Attendance Matter?, Early Childhood Research Quarterly 11: 255-270

Spiess, Martin and Jan Goebel. 2004. A Comparison of Different Imputation Rules. In: Rendtel, Ulrich, Manfred Ehling et al. 2004.  Harmonisation of Panel Surveys and Data Quality. Stuttgart: Metzler-Poeschel: 293-316.

Spiess, Martin and Martin Kroh. 2005.  Documentation of Sample Sizes and Panel Attrition in the German Socio-Economic Panel (SOEP) (1984 until 2004), DIW Data Documentation 6. Berlin.

Tyler, John H., Richard J. Murnane and John B. Willett. 2000.  Estimating The Labor Market Signaling Value Of The GED, The Quarterly Journal of Economics, 115(2): 431-468.

Urry, Heather L., Jack B. Nitschke, Isa Dolski, Daren C. Jackson, Kim M. Dalton, Corrina J. Mueller, Melissa A. Rosenkranz, Carol D. Ryff, Burton H. Singer, and Richard J. Davidson. 2004. Making a Life Worth Living, Psychological Science 15(6): 367-372.

Van Kerm, P. 2004. What Lies Behind Income Mobility? Reranking and Distributional Change in Belgium, Western Germany and the USA, Economica, 71(282), 223–239.

Wagner, Gert G., Richard V. Burkhauser and Friederike Behringer. 1993.  The English Language Public Use File of the German Socio-Economic Panel, The Journal of Human Resources 28(2): 429-433.

Wagner, Gert G. 1988. Zur Notwendigkeit empirischer Analysen für die ökonomische Fundierung staatlicher Versicherungs- und Sozialpolitik, in: Gabriele Rolf, P. Bernd Spahn and Gert G. Wagner (eds.), Sozialvertrag und Sicherung – Zur ökonomischen Theorie staatlicher Versicherungs- und Umverteilungssysteme. Frankfurt and New York: Campus Verlag: 275-317.
