Introduction to web analysis with R

Depending on the web analyst's preferences, or on how a company or agency approaches web analysis, the statistical software R may come into play. R is particularly suitable when a web analyst needs to evaluate large volumes of data statistically or make forecasts.

What is R about?

R is a software environment and programming language with object-oriented features for statistical computing and visualisation. It is available free of charge on the Internet. Development of the language began in 1992.

The base installation of R already covers the most important statistical tests and calculations. The large R community constantly publishes new packages and makes them available for download, so the software keeps evolving. For example, a web analyst who wants to analyse data from the tracking tool Google Analytics in R can download a Google Analytics package such as googleAnalyticsR. It allows the tracking data to be retrieved via the Google Analytics API and processed directly in R.

Where can I download R?

R can be downloaded free of charge from https://www.r-project.org/. Under “Download” you will find the link “CRAN”, which stands for “Comprehensive R Archive Network”. CRAN hosts the software itself as well as the packages.

Install and load the Google Analytics package in R

a) Install

To install a package that is not part of the base R installation, we click on “Packages” in the top menu bar of RGui and select “Install package(s)…”.

[Figure: RGui]

In the window that opens next, we select a CRAN mirror. A mirror is a server that hosts a copy of the CRAN data; any one can be chosen.

After the server selection, another window appears, where all available packages are displayed in alphabetical order.

[Figure: Packages]

Alternatively, the desired package can be installed by entering install.packages("googleAnalyticsR") in the R console.

b) Load

Once a package is installed, it can be loaded in any new session. This is done via the “Packages” menu in the menu bar: click on “Load package…” and select the package.
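
The same loading step can also be done from the console. The following is a minimal sketch, assuming the package name googleAnalyticsR from the installation step; the variable names pkg and is_installed are only illustrative. It checks whether the package is available before attaching it with library(), so it runs without an error even if the package has not been installed yet.

```r
# Package to load; the name is taken from the installation step above
pkg <- "googleAnalyticsR"

# requireNamespace() returns TRUE if the package is installed and can be loaded
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (is_installed) {
  # Console equivalent of choosing "Load package..." in the "Packages" menu
  library(pkg, character.only = TRUE)
} else {
  message("Package '", pkg, "' is not installed yet; run install.packages(\"",
          pkg, "\") first.")
}
```

In everyday use, a plain library(googleAnalyticsR) at the top of a script does the same job; the check is only there so the sketch does not stop when the package is missing.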

First commands in R

In R, commands are entered after the input character “>” (also called the “prompt”). Note that R is case-sensitive: it distinguishes between lower and upper case.

If 5+15 is entered after the prompt, 20 appears as the result.

Furthermore, logical queries like 3>2 can be performed.

Each entry is executed by pressing the Enter key.
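
The commands described above can be tried as follows. This is a small sketch; the object name sessions is made up purely to illustrate case sensitivity and does not appear elsewhere in this article.

```r
# Arithmetic: R evaluates the expression and prints the result
5 + 15               # [1] 20

# Logical query: the result is TRUE or FALSE
3 > 2                # [1] TRUE

# Case sensitivity: "sessions" and "Sessions" are two different names
sessions <- 10       # hypothetical object, for illustration only
exists("sessions")   # [1] TRUE
exists("Sessions")   # FALSE unless an object named Sessions was created elsewhere
```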

Create first object

When working with R, the results of functions are usually stored in an object so that they can be reused. For example, if the web analyst wants to create a vector named “OrganicTraffic” and fill it with the session numbers of the last five days, they can enter the following:

OrganicTraffic <- c(100,120,150,180,200)

[Figure: Organic Traffic]

When “OrganicTraffic” is called, the values are displayed (see figure).

An object is thus created by writing the object name, followed by the assignment operator <- and the value to be stored, here the result of a function call. In this case, the function “c”, which stands for “combine”, was applied. It combines the individual values into one vector.

To find the average number of sessions over these five days, the web analyst can type mean(OrganicTraffic).

Alternatively, the web analyst could have created a new object:

[Figure: Organic Traffic]
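
Since the original screenshot is not reproduced here, the following sketch shows what both variants might look like. The object name MeanOrganicTraffic is an assumption chosen for illustration; only OrganicTraffic and mean() come from the text.

```r
# Sessions of the last five days, as defined earlier in the article
OrganicTraffic <- c(100, 120, 150, 180, 200)

# Variant 1: compute the average and print it directly
mean(OrganicTraffic)                         # [1] 150

# Variant 2: store the average in a new object (hypothetical name) and call it
MeanOrganicTraffic <- mean(OrganicTraffic)
MeanOrganicTraffic                           # [1] 150
```

Storing the result in its own object is handy when the value is needed again later, for example to compare single days against the average.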

Create factors

In statistics and web analytics, there is data that cannot be put in any natural order. Such data is called nominally scaled. In web analytics, examples are the visitor type (new or returning visitors) and the entry channel (paid, organic, display, direct, etc.).

In R, an object or vector is converted into a factor as follows:

First, a vector is created with the respective entry channels. Here 1 stands for “Direct”, 2 for “SEO” and 3 for “SEA”.

EntryChannel <- c(1,2,2,1,3,1,3,2,2)

To perform the conversion, the function factor() is now applied to the vector:

[Figure: entry channel factors]

In this example, the factor EntryChannel.factor contains three levels or characteristics: 1, 2 and 3.
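
The conversion can be sketched as follows. The object names EntryChannel and EntryChannel.factor follow the text; EntryChannel.labelled and the use of the labels argument are an addition for illustration and not part of the original example.

```r
# Entry channels coded as numbers: 1 = Direct, 2 = SEO, 3 = SEA
EntryChannel <- c(1, 2, 2, 1, 3, 1, 3, 2, 2)

# Convert the numeric vector into a factor
EntryChannel.factor <- factor(EntryChannel)
levels(EntryChannel.factor)        # [1] "1" "2" "3"

# Optional addition: attach readable labels to the three levels
EntryChannel.labelled <- factor(EntryChannel,
                                levels = c(1, 2, 3),
                                labels = c("Direct", "SEO", "SEA"))

# Count how often each entry channel occurs
table(EntryChannel.labelled)       # Direct: 3, SEO: 4, SEA: 2
```

With labels attached, tables and plots show the channel names instead of the numeric codes.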

Data Frames

Data often does not come as a single row or a single column. Instead, the analyst is presented with multiple rows and columns that contain a variety of data types. A typical Google Analytics CSV export with at least one dimension and one metric is an example of data that fits into a data frame.

Since we won’t be linking R to Google Analytics until later, we’ll create our own data frame.

First, we create two more vectors in addition to the “OrganicTraffic” vector we already created:

[Figure: Organic Traffic]

Then we combine the data into a data frame with the function data.frame(a,b). It joins the vectors a and b as columns of one table; the vectors must have the same length.

[Figure: Data Frame Channel]

Calling dataframe.channel now displays our three online entry channels.
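
The two steps can be sketched as follows. The vector names PaidTraffic and DirectTraffic and all of their values are invented for illustration, because the original screenshots are not reproduced here; only OrganicTraffic and the object name dataframe.channel come from the text.

```r
# Vector from the earlier section
OrganicTraffic <- c(100, 120, 150, 180, 200)

# Two further vectors of the same length (hypothetical names and values)
PaidTraffic   <- c(80, 95, 90, 110, 130)
DirectTraffic <- c(60, 70, 65, 75, 85)

# Combine the three vectors as columns of one data frame
dataframe.channel <- data.frame(OrganicTraffic, PaidTraffic, DirectTraffic)

# Call the object to display the table: one row per day, one column per channel
dataframe.channel

# Compact overview of the columns and their data types
str(dataframe.channel)
```

Each column keeps the name of the vector it was built from, so a single channel can be addressed with dataframe.channel$OrganicTraffic.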

An easier way to create such a data frame would be to create an empty table:

[Figure: console and data editor table]

The function edit(as.data.frame(NULL)) creates an empty table and opens it in the data editor. There, any cell can be selected with a mouse click and the data typed in manually. To keep the entered data, the result of the call has to be assigned to an object.

The series continues in part 2 of the web analysis with R.
