Master level course in Physical Geography at Marburg University

Data analysis is a key competence for professional geographers. It requires profound knowledge of both (statistical) analysis methods and computer science. While the reason for the former is obvious, the latter is a direct result of a growing data deluge, driven by technological progress in both data collection and data distribution.

Data analysis is based on a variety of skills related to organizing, handling, describing and understanding diverse datasets. By using the programming environment R, this course not only opens the door to a cosmos of data analysis functionality but also provides a domain-specific and flexible tool for workflow automation.

Intended learning outcomes

At the end of this course you should be able to

  • organize a variety of datasets and (intermediate) analysis results in a structured fashion,
  • document your workflow in an understandable and transparent manner, collaborate in teams, and manage issues and tasks using Git for version control and GitHub as the hosting platform,
  • implement data analysis workflows using tailored R scripts along with readily available functions from third-party R packages,
  • model relationships between data variables and calculate reliable error estimates, and
  • critically evaluate your analysis.

Setting

This course will take place in person, in a synchronous setting, in room F 14 | 00A19. In addition, there will be regular meetings with a tutor. Details on the additional tutor sessions will be provided in the first regular session, which will take place on Tuesday 24.10.2023 at 9:15 am. The tutor sessions take place directly after the course in room 00A19 and on Fridays from 15:00 to 17:00 in room 00A12. Note that the tutor sessions are voluntary.

Syllabus

The course encompasses 14 sessions from 24.10.2023 to 06.02.2024 with a Christmas break between 27.12.2023 and 05.01.2024.

Session  Date        Topic                         Content
                     Data basics
01       24.10.2023  First things first            Data and information, R, RStudio, R Markdown, GitHub, GitHub Classroom
02       31.10.2023  First things second           Working environment, data sets, data types, data structures, logical operators, control structures
                     Data exploration
03       07.11.2023  Look at your data             Reading and writing (tabulated) data, visual data exploration, descriptive statistics
04       14.11.2023  Clean your data               Tailoring data sets, fill values and NA, aggregating, merging or sub-setting data sets
                     Data modelling
05       21.11.2023  Explain your data             Linear regression modelling, confidence intervals, sample tests, variance analysis
06       28.11.2023  Predict your data             Cross-validation
07       05.12.2023  Select your variables         Multiple linear models, feature selection
08       12.12.2023  Predict your non-linear data  Generalized additive models
09       19.12.2023  Predict your temporal data    Auto-correlation, AR and ARIMA models
Christmas break
10       09.01.2024  Explain your temporal data    Decomposing time series
         16.01.2024  Built-in hold                 No course
                     Marburg Open Hackathon
11       23.01.2024  MOHA session                  Marked assignment
                     Visualization
12       30.01.2024  Visualize your data           Publication quality graphics
                     Wrap up
13       06.02.2024  Wrap up                       Time for questions and feedback, individual data analysis problems, goodbye

Deliverables

The graded course certificate will be based on an individual portfolio hosted as a personal repository on GitHub. The individual portfolio items are defined in the respective course assignments, which also state whether each item will be marked.

Preparation and prerequisites

The course assumes basic knowledge and skills in R and geo-information science.

Additional material for teaching basic R skills is provided alongside this course and can be found here.

Instructor

Dirk Zeuss

University of Marburg