Master's-level course in Physical Geography at Marburg University

Data analysis is a key competence for professional geographers and requires sound knowledge of both (statistical) analysis methods and computer science. While the need for the former is obvious, the latter is a direct result of a growing data deluge, driven by technological progress in both data collection and data distribution.

Data analysis draws on a variety of skills related to organizing, handling, describing and understanding diverse datasets. By using the programming environment R, this course not only opens the door to a wide range of data analysis functionality but also provides a domain-specific and flexible tool for workflow automation.

Intended learning outcomes

At the end of this course you should be able to

  • organize a variety of datasets and (intermediate) analysis results in a structured fashion,
  • document your workflow in an understandable and transparent manner, collaborate in teams, and handle issue and task management using Git as a version control tool and GitHub as a platform,
  • implement data analysis workflows using tailored R scripts along with readily available functions from third-party R packages,
  • model relationships between data variables and calculate reliable error estimates, and
  • critically evaluate your analysis.

Coronavirus

Due to the ongoing coronavirus pandemic, this course will take place in a digital classroom, with up to ten students additionally present in person in the physical classroom. Details on this synchronous hybrid format will be provided in the first session, which will take place online only on Tuesday, 10.11.2020, at 9:15 am. The link to the digital classroom for the first session is provided in the Ilias course environment (accessible only to course members who are logged in to Ilias). Please also carefully read the hygiene policy for the course, which can be found on Ilias, and the regularly updated Information on the Coronavirus of the University of Marburg.

Syllabus

The course encompasses 13 sessions from 10.11.2020 to 23.02.2021, with a Christmas break between 19.12.2020 and 10.01.2021.

Session  Topic                         Content

         Data basics
01       First things first            Data and information, R, RStudio, R Markdown, GitHub, GitHub Classroom
02       First things second           Working environment, datasets, data types, data structures, logical operators, control structures

         Data exploration
03       Look at your data             Reading and writing (tabulated) data, visual data exploration, descriptive statistics
04       Clean your data               Tailoring datasets, fill values and NA, aggregating, merging or subsetting datasets

         Data modelling
05       Explain your data             Linear regression modelling, confidence intervals, sample tests, variance analysis
06       Predict your data             Cross-validation
07       Select your variables         Multiple linear models, feature selection
08       Predict your non-linear data  Generalized additive models
09       Predict your temporal data    Auto-correlation, AR and ARIMA models
10       Explain your temporal data    Decomposing time series

         Marburg Open Hackathon
11       MOHA session                  Marked assignment

         Visualization
12       Visualize your data           Publication-quality graphics

         Wrap up
13       Wrap up                       Time for questions, addressing potential individual data analysis problems, goodbye

Deliverables

The graded course certificate will be based on an individual portfolio hosted as a personal repository on GitHub. The individual portfolio items are defined in the respective course assignments, which also state whether each item will be marked.

Preparation and prerequisites

The course assumes basic knowledge and skills in R and geo-information science. A pre-course for learning basic R skills is currently under construction; you are welcome to check it out and help improve it.

Parallel courses

The course provides a basis for the parallel Geo Information Systems and Remote Sensing courses. It is intended as a blended learning module in our study program, although the introductions, explanations and examples provided might be useful for self-study, too.

Instructor

Dirk Zeuss

University of Marburg