Introduction
A question I get often when talking to younger statisticians and data scientists is “how did you learn R?” The real/cheating answer I want to give first is “well, I took a Java class and that made everything click wayyy smoother.” And although that was an important step, taking one course in another language didn’t get me to where I am today. I should add that I have not learned R. I would argue there are very few people who really know R, especially because, like any active language, it changes every few hours. Okay, that is a little hyperbolic, but the point I’m making is that even when you get good, there will always be ways to change what you’re doing for the better.
First, for good advice on practice or programming broadly, I turn to this article every few months, and I agree with nearly every point. Another artifact that helps me think about growth in anything is this YouTube video. If you can get over the fact that it’s about a video game, I think it’s a valuable approach to practice (review what you’ve done, focus on a specific thing, practice in real scenarios once you have focused on that thing). I only found it a few months ago, so we’ll see how effective it is with time.
I’m going to go from the most relatable point to the least relatable one, covering things like making or finding a data set you understand on the way to the sad but relatable point.
Books and Googling (or Ducking or Bing…Ing)
Okay, I’m well aware that reading a programming book is only like 0.00001 percent of the population’s idea of something they want to do after work, but I think spending time with a programming book is extremely valuable. This isn’t going to solve all of your problems immediately, and finding a book that’s at your skill level is going to be extremely important. I remember my dad bringing home a Java book from work when I was a teen (I know what it sounds like, and although my dad has many talents, math and programming are not among them). Like any young nerd, I was excited to hack into the mainframe or whatever. Unfortunately, many programming books are super jargon heavy or start teaching programming by talking about hardware. I highly recommend trying R for Data Science, as it’s free online and the people behind it are extremely focused on helping other people understand their material.
As you read the text and work through the examples, sometimes things aren’t going to work. No worries! ALL PROGRAMMERS USE WEB SEARCH CONSTANTLY. Knowing how to search for what you are looking for is half of the work you have to do.
Find or Make a Dataset You Like
I made a few tiny data sets in Excel when I started learning SAS (don’t ask). Using these small data sets allowed me to practice the language on something I understood well and was moderately interested in. It was ten or so rows of my friends with their ages and my guess at their heights and birthdays. Be honest. You don’t know all your friends’ birthdays off the top of your head either. Working with date values is important in statistics and data science, too. I also included some text fields like their names and hair color just so I had some text to work with in addition to numbers. Having this data set was extremely helpful for these practice exercises, and you can even use it for more complex things you try in the future. Data sets with thousands of observations can be frustrating and hard to look at, so a small data set like this is nice when you just want to see if your code works at all.
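If you would rather skip the spreadsheet, here is a rough sketch of what a data set like that could look like built directly in R. Every name, number, and birthday below is invented for illustration (these are not my actual friends), and the column names are just my assumption about what is handy to practice on: a couple of numbers, a date, and some text.

```r
# A tiny, made-up practice data set. All values here are invented.
friends <- data.frame(
  name       = c("Ana", "Ben", "Cleo", "Dev"),
  age        = c(27, 31, 25, 29),
  height_in  = c(64, 70, 67, 72),  # guessed heights, in inches
  birthday   = as.Date(c("1996-03-14", "1992-11-02",
                         "1998-07-21", "1994-01-30")),
  hair_color = c("brown", "black", "red", "brown"),
  stringsAsFactors = FALSE         # keep text columns as plain character
)

# Look at the column types to confirm dates and text came through as intended
str(friends)

# Practice with numbers: average age
mean(friends$age)

# Practice with dates: pull the birth month out as a two-digit string
friends$birth_month <- format(friends$birthday, "%m")

# Practice with text: upper-case the names and count hair colors
friends$name_upper <- toupper(friends$name)
table(friends$hair_color)
```

With only four rows you can check every result by eye, which is the whole point: if the code does something weird, you will notice right away.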
Follow R Programmers on Twitter or Wherever
I follow some R programmers on Twitter to make sure I know what’s new or fun in R. Off the top of my head I would recommend @drob, @sharlagelfand, and @thomas_mock. Once you get a good sampling of these pages, you can find a ton of other ones by checking who they retweet and follow.
Other Languages
Well I haven’t written this part, but like 5 people read this blog, so I’ll get around to it. Progress not perfection, amirite?