Well, this week I'm making up for the lack of time I had to devote to this blog over the last three weeks, after Darren kindly created this sub-domain for me and my soon-to-be huge fan base! Thanks, Darren!
This post is all about learning R on Codeschool.com, in a course sponsored by O'Reilly Press.
Earlier this week I completed the course, as any of my social media followers will know.
Here is an image of the final badge I received on completion of the course! You may applaud now.
Final Badge showing completion of the course
So what did I learn? Well, the course began by explaining expressions and logical values, and showing how to assign values to a variable that you name, which could be x, y, line, city or just about anything. Functions, such as sum, were then explained.
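To give you a flavour, here is a tiny sketch of my own (not the course's exact exercises). The variable names x, city and total, and the values in them, are just examples I picked.

```r
# Expressions are evaluated as soon as you enter them.
1 + 1              # simple arithmetic
3 < 4              # a comparison gives a logical value: TRUE

# Assign a value to a variable with <- (the names are my own choice).
x <- 42
city <- "Dublin"

# Call a function by name, passing its arguments in round brackets.
total <- sum(1, 3, 5)
print(total)       # 9
```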
Vector is a word that comes up time and again in the course. It refers to a list of values, albeit all of the same type. The values 1, TRUE and "Three" are all different types, so if you combine them in one vector R converts every one of them to a character string and the output is just "1", "TRUE" and "Three".
If you ever see the letter 'c' before brackets, it is a function that says 'combine the following values in these brackets'. For example, a string vector: c('a', 'b', 'c').
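Here is a short sketch of both points, assuming nothing beyond base R. The names letters_vec and mixed are mine, purely for illustration.

```r
# c() combines values into a single vector.
letters_vec <- c("a", "b", "c")

# A vector holds one type only, so mixed values are converted to a common type.
# Here the number and the logical value both become character strings.
mixed <- c(1, TRUE, "Three")
print(mixed)          # "1" "TRUE" "Three"
print(class(mixed))   # "character"
```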
A vector holding a sequence of numbers, such as 2 through 9, can be shortened to 2:9 with the use of a colon. There were other variations of vectors, but if I told you about everything you'd have no need to try out the course for yourself.
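As a quick sketch of sequences (the names nums and halves are my own), the colon and the base R function seq do much the same job, and seq also lets you choose the step size:

```r
# The colon builds a sequence of whole numbers.
nums <- 2:9
print(nums)                      # 2 3 4 5 6 7 8 9

# seq() does the same thing and can also step by a custom amount.
halves <- seq(5, 9, by = 0.5)
print(halves)                    # 5.0 5.5 6.0 ... 9.0
```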
For vector access, and indeed most other types of access in R, you need to remember to use square brackets, like so: [data].
Plotting a vector was discussed, as was vector math: adding and subtracting vectors, comparing them, creating scatter plots from them and coping with NA (missing) values.
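A rough sketch of vector access, vector math and NA handling follows. The vectors sentence, a, b and values are invented for this example and are not taken from the course.

```r
# Square brackets pick out elements; R counts from 1, not 0.
sentence <- c("walk", "the", "plank")
sentence[3]                  # "plank"
sentence[c(1, 3)]            # "walk" "plank"

# Arithmetic and comparisons work element by element.
a <- c(1, 2, 3)
b <- c(4, 5, 6)
a + b                        # 5 7 9
a - b                        # -3 -3 -3
a == c(1, 99, 3)             # TRUE FALSE TRUE

# NA marks a missing value; many functions return NA unless told to skip it.
values <- c(1, 3, NA, 7)
sum(values)                  # NA
sum(values, na.rm = TRUE)    # 11
```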
After all that my brain was exhausted and I was ready to lie down on a mattress! Instead I was introduced to Matrices!
Matrices are like frames or tables to slip your data into, not unlike the columns and rows of a spreadsheet. I won't spoil it for you with the gory details, but when you have to access a matrix, you've guessed it... yes, you have to use square brackets again, like so: [ ]. It was at this stage that art met science, well nearly, when we got to visualize the data, although, except for the heat map, the visualizations weren't very aesthetic, in my view.
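Without giving away the course exercises, here is a minimal sketch of matrix access using a small matrix of my own called m. Inside the square brackets the row comes first, then the column.

```r
# Build a 3 x 4 matrix; R fills it column by column.
m <- matrix(1:12, nrow = 3, ncol = 4)

m[2, 3]    # row 2, column 3: 8
m[2, ]     # the whole of row 2: 2 5 8 11
m[, 4]     # the whole of column 4: 10 11 12
dim(m)     # 3 4

# image(m) would draw this matrix as a simple heat map.
```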
LIES, DAMN LIES AND STATISTICS
Well, in fact at this stage we weren't lied to; we were simply introduced to some of the statistics that can be applied to data, such as the mean (average), the median and the standard deviation. This led nicely on to chapter five, which was about...
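For the curious, these three statistics are all one-liners in base R. The vector limbs below is made-up data of my own, with one deliberate outlier to show why the mean and the median can disagree.

```r
# Made-up data with one outlier (14).
limbs <- c(4, 3, 4, 3, 2, 4, 4, 14)

mean(limbs)     # 4.75, pulled upwards by the outlier
median(limbs)   # 4, the middle value once sorted
sd(limbs)       # standard deviation: how spread out the values are
```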
FACTORS
Which is where data needs to be grouped by category. We were introduced to the long-hand version of accessing this data (don't forget the square brackets), but I really got excited, as one does, when I saw dollar signs! Ka-ching (sound of money drawer opening)... $$$$$$!! Except in this case there was no money in the drawer, or in any drawer, but it was still exciting because the dollar sign turned out to be a simple way of picking out a column or group of data, such as treasure$prices or treasure$types. This short cut actually came following an explanation of...
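Grouping by category is what R calls a factor. This is a minimal sketch with invented values; the names chests and types are my own choice here.

```r
# A plain character vector with repeated categories.
chests <- c("gold", "silver", "gems", "gold", "gems")

# factor() works out the distinct categories (levels), sorted alphabetically.
types <- factor(chests)
levels(types)        # "gems" "gold" "silver"

# Behind the scenes each value is stored as the number of its level.
as.integer(types)    # 2 3 1 2 1

# table() counts how many values fall into each category.
table(types)
```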
DATA FRAMES
Which, like matrices, are about columns and rows, but with an indeterminate number of rows. Data frame access requires...? Square brackets... come on guys, you should know this by now! But only if you're doing it without the $ symbol.
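Here is a small sketch of a data frame and the two ways of getting at a column. The numbers are invented for illustration; only the names treasure, prices and types come from the examples I mentioned.

```r
# Each column is a vector; all columns have the same number of rows.
treasure <- data.frame(
  weights = c(300, 200, 100, 250, 150),
  prices  = c(9000, 5000, 12000, 7500, 18000),
  types   = c("gold", "silver", "gems", "gold", "gems")
)

treasure[["prices"]]   # the long-hand way, with square brackets
treasure$prices        # the short cut, with the dollar sign
treasure[2, ]          # a whole row: row first, then column, as with a matrix
nrow(treasure)         # 5
```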
Our kind friends at O'Reilly have provided a number of data files to load into R and experiment with. To find them, write list.files(). You can then load "target.csv" (CSV stands for Comma-Separated Values) with read.csv and, in this case, "infantry.txt". The latter needs a bit of tidying up... do the course to find out how. Hint: read.table("infantry.txt", sep="\t", header=TRUE), although header could be FALSE, depending on whether there is a header row in the original data.
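If you want to try this outside the course, the sketch below writes two tiny files to a temporary folder first, so that it runs anywhere. The file contents are toy data I invented, not the course's real files.

```r
# Create two small sample files in a temporary folder (toy data, my own).
tmp <- tempdir()
csv_path <- file.path(tmp, "target.csv")
txt_path <- file.path(tmp, "infantry.txt")
writeLines(c("Port,Population", "Port A,35000", "Port B,49000"), csv_path)
writeLines(c("Port\tInfantry", "Port A\t500", "Port B\t700"), txt_path)

# See which files are available.
list.files(tmp)

# Comma-separated values load with read.csv().
targets <- read.csv(csv_path)

# Tab-separated text needs read.table() with the separator spelled out.
infantry <- read.table(txt_path, sep = "\t", header = TRUE)
print(infantry)
```

Both calls hand back a data frame, so infantry$Infantry works just like treasure$prices did.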
REAL WORLD DATA
A recommended Book on ggplot2, below:
Finally we were introduced to real-world data and more statistical tools and methods, and at last saw the beauty that is ggplot2. It is like a library, but we're told it is a package and not a library, because there are assassination squads out there in 'The Real World' who, apparently, will get very angry if you describe it as a library. This is despite the fact that the word library has to be used in the syntax to call up ggplot2 and other packages when one wants to use these non-libraries, OK?
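To show what I mean about the word library, here is a rough sketch. It assumes the ggplot2 package is already installed (hence the check on the first line), and the treasure data is invented for the example.

```r
# Only run the example if the ggplot2 package is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  # The function is called library(), but what it loads is a package.
  library(ggplot2)

  # Invented example data.
  treasure <- data.frame(
    weights = c(300, 200, 100, 250, 150),
    prices  = c(9000, 5000, 12000, 7500, 18000),
    types   = c("gold", "silver", "gems", "gold", "gems")
  )

  # A scatter plot of price against weight, coloured by type.
  p <- ggplot(treasure, aes(x = weights, y = prices, colour = types)) +
    geom_point()
  print(p)
}
```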
Just some of the many visualizations one can produce with 'R'.
List of European Airports
A map of political groupings in Europe