Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

A flexible two-column Jekyll theme. Perfect for personal sites, blogs, and portfolios hosted on GitHub or your own server. Latest release v4.9.1

Data Analysis

Use R for data analysis and visualization, train models and estimate errors, and use GitHub for comprehensive documentation and task management.

Splash Page

Bacon ipsum dolor sit amet salami ham hock ham, hamburger corned beef short ribs kielbasa biltong t-bone drumstick tri-tip tail sirloin pork chop.

Posts

unit00

Learning Environment

This course is intended as a blended learning module, although the provided introductions, explanations, and examples might also be useful for self-study alone.

Deliverables

Assignments We distinguish between unmarked and marked deliverables (“Studien- und Prüfungsleistung”). Both are required for passing the course but only the...

Frequently Asked Questions

This is a continuously updated collection of frequently asked questions. Course requirements What is the expected workload for this course? This course giv...

unit01

First Things First

Go through a brute-force introduction to R, R Markdown, the RStudio IDE, version management with Git and GitHub’s classroom functionality to get ready for ...

R and RStudio

Start to learn the essentials for working with R and RStudio.

Example: Vector Basics

Vectors are the basis for many data types in R. Creating a vector A vector is created using the c function. Here are some examples: my_vector_1 <- c(1,2...

Example: Data Frame Basics

Data frames are one of the most heavily used data structures in R. Creation of a data frame A data frame is created from scratch by supplying vectors to the...

Example: R Markdown with html output

This page shows what a compiled R Markdown file looks like (in fact, all code examples in this course were compiled with R Markdown). This is a header This ...

Assignments and GitHub

A note on individual learning log assignments with GitHub Within this course, you will individually submit your personal solutions for the course assignments...

unit02

First Things Second

Look closer at data types and object types before focusing on the most important features of programming languages, namely operators and control structures.

Data Types

Learn how data is measured and organized from an R perspective.

Object Types

Learn how data types are structured within different object types in R.

Indexing

Learn how to find, address, and change elements in R objects.

Operations

Learn how operators and control structures work in R.

Unmarked Assignment: Loop and Conquer

This worksheet provides some control structure and loop examples to help you get familiar with what are probably the most important properties of any programmin...

unit03

Look at Your Data

Become familiar with reading and writing data, computing summary statistics and visual data exploration as the basics of data analysis.

Tabulated Data I/O

Reading tabulated data into a data frame, or writing it out from one, is quite a common task in data analysis. You could use the read.table function for this. df <-...

Visualization

Do not wait until the very final analysis stage to produce some publication-quality graphics; instead, produce fast (not necessarily nice) visualizations all the w...

Example: CSV I/O

Reading data from text files Reading text files is done using the read.table function from R’s utils package. The function will return a data frame whic...

Example: Aggregation Statistics

Summarizing a data set The most straightforward function that returns some aggregated statistical information about a data set is summary. a <- c("A",...

Example: Visual Data Exploration

Visual data exploration should be one of the first steps in data analysis. In fact, it should start right after reading a data set. The following examples ar...

Marked Assignment: Read and Plot

This worksheet will guide you in getting a first overview of the wood harvest in Hessen between 1997 and 2014 using a visual data exploration. After completi...

unit04

Introduction

Check the integrity of datasets and clean them up to ensure that the data basis for your analysis is consistent.

Cleaning 101

Cleaning a dataset is a standard procedure in data analysis, and the most annoying one. It can be quite time-consuming but it is the most important ste...

Example: Missing Values

Handling missing values is straightforward. Let’s start with a vector with one NA value at position 3. Please note that NA is not inside quotation marks sin...

Example: Date/Time

Coercing data types to date and/or time information is generally performed using as.Date, as.POSIXct, or as.POSIXlt. Let’s start with as.Date: as.Da...

Example: Sorting

For a quick introduction to sorting and combining data in R, check out our own material in the accompanying Base R course.

Example: Cleaning Columns

Cleaning data frames involves quite different tasks, such as splitting cell entries, converting data types, or converting “wide” to “long” format. In ge...

Unmarked Assignment: Cleaning Crops

This assignment is the first in a series that uses regional statistical data. While the wood harvest data from Hessen was (i) quite small and (ii) quite tidy...

unit05

Describe your linear data

Compute simple statistical linear regression models that relate a dependent to an independent variable.

Basic idea of statistical modeling

Basic idea of statistical modeling Use observation samples to describe the relationship between a dependent variable and one or more independent variables. ...

unit06

Predict your linear data

Compute simple linear models to predict dependent data and assess the performance with independent test samples.

Cross validation

Test statistics can describe the quality or accuracy of regression models if the assumptions of the models are met. However, the assessment would still be b...

unit07

Select your variables

Evaluate the importance of your independent variables and select an optimal subset for your prediction model.

unit08

Tune your model

Evaluate model tuning strategies and find optimal settings for your prediction model.

Generalized additive models

So far, the models we have seen only considered linear relationships. The corresponding model type to simple linear models would be an additive model and fo...

Unmarked Assignment: Model Tuning

This worksheet uses cross-validation strategies for tuning an additive model. After completing this worksheet you should have improved your skills for handl...

unit09

Predict Your Temporal Data

Look into some specific characteristics of time series data and predict future observations based on past dynamics.

Time Series

Although we have already encountered some temporal datasets, we have not yet taken a closer formal look at time series analysis. Time series datasets often exhibit...

Predicting time series

Time-series analyses can generally be divided into forecasting future dynamics and describing and potentially explaining past patterns. Since the latter oft...

unit10

Time Series Decomposition

After looking into time-series forecasting, we will now switch to some basics of describing time series. To illustrate this, we will again use the (mean mon...

Time series clustering

As one last example of time series analysis for this module, mainly to demonstrate that this module has touched on only a very small set of analysis conce...

Unmarked Assignment: NAO and Cölbe

This worksheet focuses on the analysis of meteorological time series data recorded at a station near Marburg University Forest and some global teleconnection...

unit11

unit12

Graphics

Visualize your data, get some hints for publication quality graphics, and learn about some packages specifically made for visualizations.

Example: Colours

Before we expand our plotting capabilities, we want to spend a bit more time thinking about colours and colour spaces. A careful study of colour-spaces (e....

Example: Colours and maps

This is a short example of how to use the hcl colour palette for colouring features of a shapefile. Set up # Load the required packages library("terra") # ...

Example: The R Graph Gallery

Finally, check out the R Graph Gallery to get an impression of the many more data visualization possibilities in R.

worksheets