Lagged Variables

Lagged variables often appear in time series and time-series cross-sectional (TSCS, or panel) models. A lagged variable is the variable’s value from period t-1, used in the model for time t. Adding lagged variables to a model allows us to control for the persistence of an effect from the prior period’s value. The best predictor of your weight today (or your car’s mileage, a district’s two-party presidential vote, etc.), for example, is your weight (or whatever) yesterday. To ensure that we are capturing the effect of our variables of interest instead of the lingering effects of underlying causes and previous values, we can include this previous value in the model itself. This turns a static OLS model into a dynamic one.

Most of the time, we are interested in purging the DV of the unwanted residue of previous periods. That’s the primary use of lagged variables: lagged DVs that enter the model as independent (right-hand-side) variables. It is possible, however, and sometimes desirable, to use lagged independent variables. That is, we use the value of IV1 at t-1 to explain the value of the DV at time t. This makes sense where the X variable causes the Y variable not instantaneously but only after some delay. Consider the time between when a government enacts a policy and when we observe some effect of that policy. Economic policy reforms don’t influence economic growth at the time they are approved but at some later period. Lagged variables are also particularly useful for variables that we think cause one another. If I want to explain the influence of democracy on the current year’s conflict behavior, there is some chance that getting involved in a war reduced a state’s observed democracy level in that same year; in other words, a simultaneous (and therefore endogenous) relationship may exist between the two variables. Instead, I should use the previous year’s democracy level, which by definition could not have been affected by a conflict that hadn’t happened yet. Likewise, we often see lagged values of GDP growth, GDP, and similar economic variables whose full effect isn’t felt until the end of the year.
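In equation form, the two uses of a lag can be sketched roughly as follows. The notation is my own shorthand for illustration (y for the DV, x for a single IV, ρ for the coefficient on the lagged DV), not a specification taken from any particular text, and a real model would usually carry more IVs.

```latex
% Lagged dependent variable: last period's y enters as a regressor
y_t = \beta_0 + \rho \, y_{t-1} + \beta_1 x_t + \varepsilon_t

% Lagged independent variable: last period's x explains this period's y
y_t = \beta_0 + \beta_1 x_{t-1} + \varepsilon_t
```

In a panel setting, each term would also carry a unit subscript (for example, y_{i,t-1} for country i).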

There are costs to using lagged variables. First, the earliest observation for each unit studied gets ‘lost’ because we don’t have a previous value to use as its lag. If you only have a few observations per unit, this is a cost you may be unwilling to pay. Second, a lagged DV soaks up a lot of the explanatory power of the remaining independent variables, and so makes finding significant effects for any of them much more difficult. If your relationship is weak, you may not find it at all when using a lagged DV; try the model with and without the lagged DV just to be sure.
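A minimal sketch of the "with and without" comparison, meant to be run as a Stata do-file. Everything here is made up for illustration: the data are simulated, and the variable names (t, x, y, lagy), the seed, and the coefficients 0.8 and 0.3 are arbitrary assumptions, not values from any real study.

```stata
* Simulated single time series; all names and numbers are illustrative only
clear
set seed 12345
set obs 100
generate t = _n
generate x = rnormal()
generate y = rnormal()

* Make y persistent: from the 2nd observation on, each value depends on the
* one before it. replace works through the observations in order, so
* y[_n-1] already holds the updated value.
replace y = 0.8*y[_n-1] + 0.3*x + rnormal() in 2/l

* The lag is missing for the first observation
generate lagy = y[_n-1]

* Model without, then with, the lagged DV
regress y x
regress y x lagy
```

The second regression reports one fewer observation than the first, because the first row has no lag. That is the first cost described above, on a small scale. Comparing the coefficient and standard error on x across the two models shows how much the lagged DV changes the picture in your own data.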

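Generating lagged variables in Stata is easy. It relies on so-called ‘underscore variables,’ which are system variables that Stata generates internally and doesn’t usually show to you. One of those, _n, holds the number of the current observation, counting in the order the observations appear in the data. Let’s say I have time series data for the United States and want to generate a lagged value of GDPgrowth. First I need to make sure my data are sorted in time order (for example, sort year, assuming the time variable is named year). My command would look something like generate lagGDPgrowth = GDPgrowth[_n-1] . Note the square brackets: Stata uses them, not parentheses, to refer to a variable’s value in a particular observation. The [_n-1] part tells Stata to look at the observation (denoted by _n) above (denoted by -1) to locate the desired value. If I have panel data, with multiple countries, the command gets a little more complex because we have to tell Stata to copy the value down for each country, without mixing them up. Fortunately, the fix is simple: I would prepend bysort country (year): to the previous command. This sorts the data by country and then by year, and restarts _n at 1 within each country, so the first year for each country gets a missing lag rather than the last value from the previous country. (A plain by country: prefix does the same job, but only if the data are already sorted by country.)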
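Here is a small sketch of the panel case, meant to be run as a do-file. The dataset is a toy one typed in by hand: the variable names (country, year, GDPgrowth, cid) and all of the values are invented for illustration. The second half shows an alternative route, Stata’s built-in lag operator L., which requires declaring the panel structure with xtset first.

```stata
* Toy panel data; names and values are made up for illustration
clear
input str3 country year GDPgrowth
"USA" 2001 1.0
"USA" 2002 1.7
"USA" 2003 2.8
"CAN" 2001 1.8
"CAN" 2002 3.0
"CAN" 2003 1.8
end

* Option 1: explicit subscripting.
* bysort sorts by country, then year, and restarts _n within each country,
* so each country's first year gets a missing lag.
bysort country (year): generate lagGDPgrowth = GDPgrowth[_n-1]

* Option 2: the lag operator.
* xtset needs a numeric panel identifier, so encode the string first.
encode country, generate(cid)
xtset cid year
generate lagGDPgrowth2 = L.GDPgrowth

* Inspect the two versions side by side
list country year GDPgrowth lagGDPgrowth lagGDPgrowth2, sepby(country)
```

On complete data like this the two approaches agree. They differ when a year is missing for some country: [_n-1] simply takes the row above, whatever year it belongs to, while L. returns a missing value because there is no observation exactly one period earlier. Which behavior you want is a design choice worth making deliberately.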

As always, if you think you need a lagged variable, consult closely with your professor as you design and implement your research. For more on the costs and benefits of using lagged variables in OLS, check out Michael Bailey, Real Stats, chapters 13 and 15.  He also includes detailed instructions for generating lagged variables in both Stata and R in his Computing Corner. There is no remotely comparable resource online; sorry.


Site contents (c) Leanne C. Powner, 2012-2026.
Background graphic: filo / DigitalVision Vectors / Getty Images.
Cover graphic: Cambridge University Press.
