
Slide: one function for lag/lead variables in data frames, including time-series cross-sectional data

I often want to quickly create a lag or lead variable in an R data frame. Sometimes I also want to create the lag or lead variable for different groups in a data frame, for example, if I want to lag GDP for each country in a data frame.

I've found the various R methods for doing this hard to remember, and I usually need to look back at old blog posts. Any time we find ourselves writing the same sequence of code over and over, it's probably time to put it into a function.

So, I added a new command, slide, to the DataCombine R package (v0.1.5).

Building on the shift function TszKin Julian posted on his blog, slide allows you to slide a variable up by any time unit to create a lead or down to create a lag. It returns the lag/lead variable to a new column in your data frame. It works with both data that has one observed unit and with time-series cross-sectional data.

Note: your data needs to be in ascending time order with equally spaced time increments, for example 1995, 1996, 1997.
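If your data is not already sorted, you can order it by the time variable before sliding. Here is a minimal base R sketch of that step; the Unsorted data frame and the EquallySpaced check are made up for illustration and are not part of DataCombine:

```r
# Hypothetical example data, deliberately out of time order
Unsorted <- data.frame(Year = c(1997, 1995, 1996), A = c(3, 1, 2))

# Sort the rows into ascending time order
Sorted <- Unsorted[order(Unsorted$Year), ]

# Rough check that the time increments are equally spaced:
# all gaps between consecutive years should be identical
EquallySpaced <- length(unique(diff(Sorted$Year))) == 1
```

For time-series cross-sectional data, the same idea would apply with order() called on the unit ID first and the time variable second.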


Non-cross-sectional data

Let's create an example data set with three variables:

# Create time variable
Year <- 1980:1999

# Dummy covariates
A <- B <- 1:20

Data1 <- data.frame(Year, A, B)

##   Year A B
## 1 1980 1 1
## 2 1981 2 2
## 3 1982 3 3
## 4 1983 4 4
## 5 1984 5 5
## 6 1985 6 6

Now let's lag the A variable by one time unit.


library(DataCombine)
DataSlid1 <- slide(Data1, Var = "A", slideBy = -1)

##   Year A B A-1
## 1 1980 1 1  NA
## 2 1981 2 2   1
## 3 1982 3 3   2
## 4 1983 4 4   3
## 5 1984 5 5   4
## 6 1985 6 6   5

The lag variable is automatically given the name A-1.
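Because A-1 is not a syntactically valid R name, it needs backticks when you refer to it, and some functions may be happier if you rename it. The sketch below uses only base R. The data frame is a hand-built stand-in for slide's output so that the example runs without DataCombine, and the replacement name A_lag1 is an arbitrary choice:

```r
# Stand-in for the slide output above, built by hand for illustration.
# check.names = FALSE keeps the non-syntactic column name "A-1".
DataSlid1 <- data.frame(Year = 1980:1982, A = 1:3,
                        `A-1` = c(NA, 1, 2), check.names = FALSE)

# Backticks let you refer to the column by its non-syntactic name
LaggedA <- DataSlid1$`A-1`

# Or rename the column to something syntactic
names(DataSlid1)[names(DataSlid1) == "A-1"] <- "A_lag1"
```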

To lag a variable (i.e. the lag value at a given time is the value of the non-lagged variable at a time in the past), set the slideBy argument to a negative number. Lead variables are created by using positive numbers in slideBy. A lead variable at a given time has the value of the non-lead variable from some time in the future.
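To make the sign convention concrete, here is a small base R sketch of the shifting idea. ShiftSketch is a hypothetical helper written for this illustration; it is not slide's actual implementation or the shift function mentioned above:

```r
# Hypothetical helper: negative `by` lags, positive `by` leads
ShiftSketch <- function(x, by) {
  n <- length(x)
  k <- abs(by)
  if (k == 0) return(x)
  if (k >= n) return(rep(NA, n))
  if (by < 0) {
    # Lag: pad the start with NA and drop the last k values
    c(rep(NA, k), x[seq_len(n - k)])
  } else {
    # Lead: drop the first k values and pad the end with NA
    c(x[(k + 1):n], rep(NA, k))
  }
}

A <- 1:5
LagOne <- ShiftSketch(A, -1)  # NA 1 2 3 4
LeadTwo <- ShiftSketch(A, 2)  # 3 4 5 NA NA
```

Each lagged value comes from an earlier position in the vector, and each lead value comes from a later one, which is why the NAs sit at the start for lags and at the end for leads.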

Time-series Cross-sectional data

Now let's use slide to create a lead variable with time-series cross-sectional data. First create the example data:

# Create time and unit ID variables
Year <- rep(1980:1983, 5)
ID <- rep(1:5, each = 4)

# Dummy covariates
A <- B <- 1:20

Data2 <- data.frame(Year, ID, A, B)

##   Year ID A B
## 1 1980  1 1 1
## 2 1981  1 2 2
## 3 1982  1 3 3
## 4 1983  1 4 4
## 5 1980  2 5 5
## 6 1981  2 6 6

Now let's create a two time unit lead variable based on B for each unit identified by ID:

DataSlid2 <- slide(Data2, Var = "B", GroupVar = "ID",
                    slideBy = 2)

##   Year ID A B B2
## 1 1980  1 1 1  3
## 2 1981  1 2 2  4
## 3 1982  1 3 3 NA
## 4 1983  1 4 4 NA
## 5 1980  2 5 5  7
## 6 1981  2 6 6  8
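For comparison, the same grouped lead can be sketched in base R with ave(), which applies a function separately within each group. This is only an illustration of the idea, not how slide works internally. LeadTwoSketch and the column name B_lead2 are made up here, and the helper assumes every group has at least two rows:

```r
# Recreate the example data
Year <- rep(1980:1983, 5)
ID <- rep(1:5, each = 4)
A <- B <- 1:20
Data2 <- data.frame(Year, ID, A, B)

# Hypothetical helper: drop the first two values, pad the end with NA
LeadTwoSketch <- function(x) c(x[-(1:2)], rep(NA, 2))

# Apply the lead within each ID group so values never cross units
Data2$B_lead2 <- ave(Data2$B, Data2$ID, FUN = LeadTwoSketch)
```

The last two rows of each unit are NA because there are no observations two periods ahead within that unit.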

Hopefully you'll find slide useful in your own data analysis. Any suggestions for improvement are always welcome.


Unknown said…
In your last example, what if I want the first 3 variables in each group to have a lag value of 2? How can I do it?

Unknown said…
Tilly, do you mean lag the A and B variables by 2? Or do you mean lag the first three rows for each group by 2?
Highgamma said…
Cool. I can use this to create returns for any time lag.
Is this faster than using plyr, which is what I'm currently using? plyr is beautiful but incredibly slow at times.

Unknown said…
Sebastian, for lags/leads with time-series cross-sectional data it will probably be about as fast as plyr because it relies on ddply.

There might be a way to have slide use data.table rather than plyr. data.table is usually faster. I'll look into it. Thanks for the comment.
Anonymous said…
I'm using lagpanel() in the {simcf} package which makes sure that the panel structure is taken into account. Works perfectly! The package is described here:
Unknown said…
@politicalsciencereplication Thanks for the info. lagpanel looks pretty interesting. I like how it doesn't rely on ddply. I'm not sure if it does lead variables, and its syntax is possibly less intuitive for new users who want to lag a variable in an existing data frame and return the new variable to that data frame.

You've got a great blog by the way.
Fr. said…
I have submitted a fix that should make the function as quick as possible with plyr. On my training data, it is now as quick as lagpanel (which cannot do lead variables).
Unknown said…
Thanks Fr. I just merged it in and the new version of DataCombine should be on CRAN shortly.
Danilo said…
This is by far the easiest function to lag and lead variables that I've seen. It works wonders! Thanks for lowering R's entry costs for newbies like me. :)
Unknown said…
Thanks Danilo. Really glad that you found it useful.
Anonymous said…
Thanks very much for this package! I am using your slide function with a fairly ordinary time-series cross-country dataset. When I use slideBy = -1 (or slideBy = 1) it works fine, but when I put, say, slideBy = -2 (or slideBy = 2) I get the following error:

Error in `[<`(`*tmp*`, , NewVar, value = c(NA, NA, NA, NA, :
replacement has 4098 rows, data has 4097

What do you think could cause this problem?
Unknown said…
Hi homo-loquens

First of all, can you run the examples in the slide documentation ok?

If so then there might be some issue with your grouping or time variables. Not exactly sure without seeing the data.
Anonymous said…
Thank you, I think I have figured it out for the time being. Would post further if necessary.
Unknown said…
Hi Christopher
I really like your slide function! However, I'm getting kind of annoyed about the naming of lag variables where the "-" confuses other functions. Could this possibly be changed into some other naming such as lagivar where i is the lag number and var is the variable name?
Best, Christoffer
When I use slide to construct a lagged x, the resulting variable is named x-1. When I try to use the data frame in other functions, this name does not work. I could not find a way to rename it either. Please help.