This workshop provides both a short introduction to multivariate exploratory approaches and step-by-step instructions to implement each of them with R. Five methods are presented. These methods are designed to explore and summarize large and complex data tables by means of summary statistics. They help generate hypotheses by providing informative clusters using the variable values that characterize each observation.

1 Downloads, setup, and R packages

1.1 Downloads

To fully benefit from this workshop, you need the R code file (TaLC2020.code.R) and the R environment file (TaLC2020.RData).

Save the R code and R environment files in a folder that is easy to find! This means that the path to the folder should not be too long and should not contain spaces.
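As a quick check, the folder path can be tested from the R console. The sketch below is only an illustration: the folder `~/TaLC2020` is an assumed location (replace it with your own), `path_is_safe()` is a hypothetical helper that is not part of the workshop code, and the 60-character limit is an arbitrary stand-in for "not too long".

```r
# Hypothetical helper (not part of the workshop code): returns TRUE when a
# path contains no spaces and is at most max_chars characters long.
path_is_safe <- function(path, max_chars = 60) {
  full_path <- path.expand(path)                      # turn "~" into the home directory
  no_spaces <- !grepl(" ", full_path, fixed = TRUE)   # TRUE if there is no space anywhere
  short_enough <- nchar(full_path) <= max_chars       # arbitrary length limit
  no_spaces && short_enough
}

# Assumed folder name; replace with the folder where you saved the files.
workshop_dir <- "~/TaLC2020"
path_is_safe(workshop_dir)
```

If the call returns FALSE, consider moving the files to a shorter path without spaces before going further.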

1.2 Setup

I am going to be using RStudio. I suggest you do the same.

In order to run RStudio, you need to have already installed R 2.11.1 or higher (preferably higher). You can download the most recent version of R for your environment from CRAN.
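To see whether your installed version of R meets this requirement, you can run the following lines in the R console. This is a minimal sketch; the variable names are illustrative and not part of the workshop code.

```r
# Minimum version stated above for running RStudio.
minimum_version <- "2.11.1"

# getRversion() returns the running R version as an object that can be
# compared directly with a version string.
installed_version <- getRversion()
version_ok <- installed_version >= minimum_version

if (version_ok) {
  message("R ", installed_version, " meets the minimum requirement.")
} else {
  message("R ", installed_version, " is older than ", minimum_version,
          "; please update from CRAN.")
}
```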

The setup is quick and easy.

1.2.1 Load the R code

First, you are going to load the R code. Its filename is TaLC2020.code.R.

This is what RStudio looks like upon opening.

Welcome to RStudio

Next, you are going to open the R code file. The easiest way is to look for TaLC2020.code.R and double click the file.

Otherwise, click on File > Open File..., look for TaLC2020.code.R, and select it.

Selecting the R code file

If the file has opened properly, RStudio displays a script window in the upper left pane. This is where you find the code.

The R code file is loaded
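The script can also be opened from the R console. The sketch below assumes that TaLC2020.code.R sits in the current working directory, which may not be the case on your machine. It uses the base R function `file.edit()`, which in RStudio typically opens the file in the script pane.

```r
# Assumes the script is in the current working directory; adjust the path if not.
code_file <- "TaLC2020.code.R"

if (!file.exists(code_file)) {
  # Nothing is opened; report where R looked for the file.
  message("Cannot find ", code_file, " in ", getwd())
} else if (interactive()) {
  # Open the script in the editor (only meaningful in an interactive session).
  file.edit(code_file)
}
```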

1.2.2 Load the R environment

The R environment is a data file in which the data sets that we are going to use are preloaded. Its filename is TaLC2020.RData. There are different ways you can load the R environment.

The easiest way is simply to double click TaLC2020.RData!

Another way is to enter the following in the R console:

load(file.choose())

This will open an interactive window. Look for TaLC2020.RData and select it.
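If you prefer not to use the interactive window, the file can also be loaded by giving its path directly. The sketch below assumes that TaLC2020.RData is in the current working directory; `load_workshop_data()` is a hypothetical helper, not part of the workshop code. It relies on the fact that `load()` returns the names of the objects it restores, which gives a quick way to confirm what was loaded.

```r
# Hypothetical helper: loads an .RData file into an environment and returns
# the names of the restored objects (character(0) if the file is missing).
load_workshop_data <- function(path, envir = globalenv()) {
  if (!file.exists(path)) {
    message("Cannot find file: ", path)
    return(character(0))
  }
  load(path, envir = envir)  # load() returns the names of the objects it creates
}

# Assumes the file is in the working directory; adjust the path if needed.
loaded_objects <- load_workshop_data("TaLC2020.RData")
print(loaded_objects)
```

An empty result means the file was not found at that path; in that case, check the working directory with `getwd()` or fall back on `load(file.choose())`.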