
Dropbox & R Data

I'm always looking for ways to download data from the internet into R. Though I prefer to host and access plain-text data sets (CSV is my personal favourite) from GitHub (see my short paper on the topic), sometimes it's convenient to get data stored on Dropbox.

There has been a change in the way Dropbox URLs work and I just added some functionality to the repmis R package. So I thought that I'd write a quick post on how to directly download data from Dropbox into R.

The download method differs depending on whether or not your plain-text data is in a Dropbox Public folder.

Dropbox Public Folder

Dropbox is trying to do away with its Public folders. New users need to actively create a Public folder. Regardless, sometimes you may want to download data from one. It used to be that files in Public folders were accessible through non-secure (http) URLs. It's easy to download these into R: just use the read.table command with the URL as the file name. Dropbox recently changed Public links to be secure (https) URLs. These cannot be accessed with read.table.

Instead you can use the source_data command from repmis:

FinURL <- "https://dl.dropbox.com/u/12581470/code/Replicability_code/Fin_Trans_Replication_Journal/Data/public.fin.msm.model.csv"

# Download data
FinRegulatorData <- repmis::source_data(FinURL,
                             sep = ",",
                             header = TRUE)
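
To give a rough idea of the general approach to secure URLs, here is a minimal sketch that downloads the file to a temporary location and then reads it with read.table. The read_remote_table helper is hypothetical and is not how repmis is implemented; it also assumes that your R build's download.file can handle https (for example via libcurl).

```r
# Hypothetical helper, not the repmis implementation: download a
# delimited text file to a temporary location, then read it.
read_remote_table <- function(url, ...) {
  tmp <- tempfile()
  on.exit(unlink(tmp))  # remove the temporary file when the function exits
  # mode = "wb" avoids line-ending translation on Windows
  utils::download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  # Extra arguments (sep, header, ...) are passed on to read.table
  utils::read.table(tmp, ...)
}

# Usage (requires network access; FinURL as defined above):
# FinRegulatorData <- read_remote_table(FinURL, sep = ",", header = TRUE)
```

Passing the extra arguments straight through keeps the call looking the same as the read.table call you would have used with an http URL.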

Non-Public Dropbox Folders

Getting data from a non-Public folder into R was trickier. When you click on a Dropbox-based file's Share Link button you are taken to a secure URL, but not for the file itself. The Dropbox webpage you're taken to is filled with lots of other Dropbox information. I used to think that accessing a plain-text data file embedded in one of these webpages would require some tricky web scraping. Luckily, today I ran across this blog post by Kay Cichini.

With some modifications I was able to easily create a function that could download data from non-Public Dropbox folders. The result is the source_DropboxData command, which is in the most recent version of repmis (v0.2.4). All you need to know is the name of the file you want to download and its Dropbox key. You can find both of these things in the URL for the webpage that appears when you click on Share Link. Here is an example:

https://www.dropbox.com/s/exh4iobbm2p5p1v/fin_research_note.csv

The file name is at the very end (fin_research_note.csv) and the key is the string of letters and numbers in the middle (exh4iobbm2p5p1v). Now we have all of the information we need for source_DropboxData:

FinDataFull <- repmis::source_DropboxData("fin_research_note.csv",
                                  "exh4iobbm2p5p1v",
                                  sep = ",",
                                  header = TRUE)
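
If you have a lot of share links, you may not want to pick out the key and file name by hand. Here is a minimal sketch that splits a link of the form shown above into its two parts. The parse_dropbox_link helper is hypothetical (it is not part of repmis) and assumes the link follows the https://www.dropbox.com/s/KEY/FILE pattern.

```r
# Hypothetical helper (not part of repmis): split a Dropbox share link
# of the form https://www.dropbox.com/s/<key>/<file> into its parts.
parse_dropbox_link <- function(url) {
  # Drop any query string or fragment before splitting
  clean <- sub("[?#].*$", "", url)
  parts <- strsplit(clean, "/", fixed = TRUE)[[1]]
  # The key and file name are the two path segments after "s"
  s_pos <- which(parts == "s")
  if (length(s_pos) == 0 || s_pos[1] + 2 > length(parts)) {
    stop("Not a recognised Dropbox share link: ", url)
  }
  list(key = parts[s_pos[1] + 1], file = parts[s_pos[1] + 2])
}

link <- "https://www.dropbox.com/s/exh4iobbm2p5p1v/fin_research_note.csv"
info <- parse_dropbox_link(link)
info$key   # "exh4iobbm2p5p1v"
info$file  # "fin_research_note.csv"

# The two pieces can then be handed to source_DropboxData:
# repmis::source_DropboxData(info$file, info$key, sep = ",", header = TRUE)
```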

Comments

K. Ram said…
You should also just try authenticating with Dropbox directly using my R package, rDrop.. Feedback welcome.
Unknown said…
Yeah, I only just now saw your rDrop package (I added a shout out to it in the repmis README). I'm definitely going to start using it.

Correct me if I'm wrong, but can you use rDrop db.read.csv to access data if you don't have the user's credentials?
Bob Muenchen said…
Thanks for the helpful post! If you find that rDrop does the same two examples (i.e. without authentication), it would make a great followup article.

Cheers,
Bob
Unknown said…
this seems to no longer be supported, unfortunately.
