Hello again! In this video we will learn how to import data into R, or how-to-do-that-thing-you’ll-be-doing-all-the-time-as-data-scientists. As you can tell, this is an important lesson, so buckle up, and let’s get to it.
If you want to work in parallel with me and still haven’t downloaded the resources for the course, please do it now. The resources for this lesson include a couple of data sets which we will be loading into R during the lesson. If you have already downloaded them, check where they are saved, or copy them into your working directory so they’re easy to locate. The data file names are: pokRdex_comma, and pokRdex_delim.
Do you remember how to check and set your working directory? The working directory is where we will be loading data from and saving data to. To check what your current working directory is, type getwd() into the console. If you’re not happy with where this takes you, you can set your directory by typing setwd() and passing a path to the location on your computer that works best for you. Alternatively, you can use RStudio’s Session menu in the menu bar at the top and set your directory from there.
Cool. Now that we’re in the directory where our files are, we can load data easily. Although there are tons of different data file types, we will stick to the most commonly used ones: text and .csv; you can find information about importing other file types in the resources for this lesson.
Okay, let's get to it! The general-purpose data reading function in R is the read.table() function. To load data in the form of a text file from your working directory, pass the name of the file in quotes, and then specify at least the following: first, the separator for your data (this tells R which character separates one value from the next); second, whether your data set has a header row or not; and third, whether R should encode your string variables as factors. Using our Pokémon data, the command looks like this...
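The on-screen command is not captured in this transcript. As a rough sketch of what a read.table() call with these three arguments can look like, here is a self-contained example. It uses a small made-up file written to a temporary folder, not the course's pokRdex data; the file name pokemon_demo.txt, its columns, and the comma separator are all assumptions made for illustration.

```r
# Hypothetical stand-in for the course data: a tiny comma-separated text file
# written to a temporary folder so the example runs on any machine.
demo_path <- file.path(tempdir(), "pokemon_demo.txt")
writeLines(c("name,type,hp",
             "Bulbasaur,Grass,45",
             "Charmander,Fire,39",
             "Squirtle,Water,44"),
           demo_path)

# Check the current working directory; setwd("some/path") would change it.
print(getwd())

# Read the file: values are separated by commas, the first row holds the
# variable names, and strings stay as plain character vectors.
pokemon <- read.table(demo_path,
                      sep = ",",
                      header = TRUE,
                      stringsAsFactors = FALSE)

# Inspect the structure of the resulting data frame.
str(pokemon)
```

With a file that sits in the working directory, the first argument would simply be the file name in quotes instead of demo_path. Note that since R 4.0.0 stringsAsFactors already defaults to FALSE, so spelling it out mostly matters for older R versions and for readability.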
So, why is it important to specify all these things? Well. Separators tell R how to split each line of your data into values and how to structure the data frame. The header argument signals whether the first row of the data file contains values or variable names. If your data doesn’t have variable names and begins directly with the first observation, then set that argument to FALSE. Finally, the stringsAsFactors argument tells R whether to convert all string variables to factors or not. Since our string variables include Pokémon names and these are clearly not categorical variables, we set the argument to FALSE. Of course, there are tons more arguments for the read.table() function but these are the crucial ones to remember.
Okay, this covers the basic architecture of reading a data file into R. You can use the read.table() function with a lot of different file formats because of its flexibility and argument set-up. Next, we’ll talk about reading specific file types: comma-separated values, or CSV files, and tab-delimited files. See you there.
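To make the effect of each argument concrete, the sketch below reads the same small file several times with different settings. The file is made up for illustration (a tab-separated file named pokemon_args_demo.txt in a temporary folder), so none of the names or values come from the course data.

```r
# Hypothetical tab-separated file, written to a temporary folder.
demo_path <- file.path(tempdir(), "pokemon_args_demo.txt")
writeLines(c("name\ttype\thp",
             "Bulbasaur\tGrass\t45",
             "Charmander\tFire\t39"),
           demo_path)

# Correct settings: tab separator, first row treated as variable names.
with_header <- read.table(demo_path, sep = "\t", header = TRUE,
                          stringsAsFactors = FALSE)

# header = FALSE: the first row is read as data, so R invents the
# column names V1, V2, V3 and the data frame gains an extra row.
no_header <- read.table(demo_path, sep = "\t", header = FALSE,
                        stringsAsFactors = FALSE)

# stringsAsFactors = TRUE: string columns are converted to factors.
as_factors <- read.table(demo_path, sep = "\t", header = TRUE,
                         stringsAsFactors = TRUE)

# Wrong separator: no commas in the file, so each line becomes one value
# and the data frame ends up with a single column.
wrong_sep <- read.table(demo_path, sep = ",", header = TRUE,
                        stringsAsFactors = FALSE)

print(names(no_header))
print(class(with_header$name))
print(class(as_factors$name))
print(ncol(wrong_sep))
```

Comparing the four results side by side shows why a wrong separator or header setting tends to produce a data frame with the wrong shape rather than an error, which is easy to miss if you don't inspect the result.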
Data frames - Importing data in R (posted by 林宜悉 on 2020/03/09)