Alright. So, now we know how read.table() works, and that paves our way to learning some shortcuts. The file we loaded into R in the previous lesson is a CSV file; it is a simple text document in which the values are separated by commas. CSV files are extremely common, so R’s brainy developers have given us a shortcut function with which to load them faster. This function is from the read.table() family, and it is called read.csv(). read.csv() needs fewer arguments spelled out than read.table(), because its defaults are set in a very convenient way: header is set to TRUE, and sep is set to a comma. All we need to do in order to read a table is pass the name of the file, and specify whether we want our strings to be factors or not. Sweet!

Apart from commas, values in a text data file can be separated by tabs; these types of documents are called tab-delimited files. And just as with CSVs, there is a read.table() shortcut for reading them: read.delim(). What’s happening behind the scenes here is that the sep = argument is set to \t, header is again TRUE, and a bunch of other useful arguments default to their most commonly used values.

Now, just before we wrap this up, I want to mention a few important things. First, for those of you in Europe or anywhere else in the world where the decimal mark is a comma, and therefore comma-separated files don’t really work for you, there is a read.csv2() function designed to deal with this problem. It reads files that use a semicolon as the separator and a comma as the decimal mark. The same goes for read.delim(), which also has a version 2, read.delim2(): it keeps the tab as the separator and treats the comma as the decimal mark.

Second, data files from external sources often come with additional text, either as an introduction or a sign-off, which will only cause havoc in your data if you end up importing it. Therefore, it is excellent that we can tell R to completely ignore the first few lines of text in our data file with the skip = argument.
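To make the shortcuts above concrete, here is a minimal sketch in R. The file contents and variable names are made up for illustration (they are not the lesson's Pokémon file), and the data is written to temporary files only so the example runs on its own.

```r
# Hypothetical sample data, written to a temporary file for illustration.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("name,type,weight",
             "Bulbasaur,Grass,6.9",
             "Charmander,Fire,8.5"), csv_file)

# read.csv() already assumes header = TRUE and sep = ","
pokemon <- read.csv(csv_file, stringsAsFactors = FALSE)

# The same import spelled out with read.table()
pokemon_long <- read.table(csv_file, header = TRUE, sep = ",",
                           stringsAsFactors = FALSE)

# Tab-delimited file: read.delim() assumes header = TRUE and sep = "\t"
tab_file <- tempfile(fileext = ".txt")
writeLines(c("name\ttype\tweight",
             "Bulbasaur\tGrass\t6.9"), tab_file)
pokemon_tab <- read.delim(tab_file, stringsAsFactors = FALSE)

# Semicolon-separated file with decimal commas: read.csv2()
csv2_file <- tempfile(fileext = ".csv")
writeLines(c("name;type;weight",
             "Bulbasaur;Grass;6,9"), csv2_file)
pokemon_eu <- read.csv2(csv2_file, stringsAsFactors = FALSE)

# File with two lines of introductory text: skip them before the header
messy_file <- tempfile(fileext = ".csv")
writeLines(c("Exported from a hypothetical source",
             "Internal use only",
             "name,type,weight",
             "Bulbasaur,Grass,6.9",
             "Charmander,Fire,8.5"), messy_file)
pokemon_clean <- read.csv(messy_file, skip = 2, stringsAsFactors = FALSE)
```

In this sketch, skip = 2 drops the two introductory lines, so the third line of the file is read as the header. With read.csv2(), the weight column comes back as a number rather than as text, because the comma is interpreted as the decimal mark.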
If you want to restrict where R stops reading the data file, you can tell it to read a precise number of rows with the nrows = argument. For example, our Pokémon data is way too large, and I may only be interested in the first 100 Pokémon. If I set nrows = 100, this is exactly what I will get. Pay attention to what happened here: the header doesn’t count towards the number of rows specified. nrows = stands for rows of observations.

Okay, let’s break it off here. Super good job, everyone! The next lesson will be very short, and it will complete the data import/export circle: we will be talking about exporting data. See you there! And… May the Force Be with You
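As a small sketch of the row limit described above: the argument's full name in read.table() is nrows (the shorter nrow also works, through R's partial argument matching). The file below is a made-up ten-row example, not the lesson's Pokémon data.

```r
# Hypothetical ten-row file, written to a temporary location for illustration.
data_file <- tempfile(fileext = ".csv")
writeLines(c("id,name", paste0(1:10, ",pokemon_", 1:10)), data_file)

# Read only the first three observations; the header line is not counted.
first_three <- read.csv(data_file, nrows = 3, stringsAsFactors = FALSE)

nrow(first_three)   # number of observation rows that were read
names(first_three)  # the header is still used for the column names
```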
Video: Data frames in R - Import a CSV in R (posted by 林宜悉 on 2020/03/09)