eXtrem Gradient Boosting (XGBoost)

Tree boosting is a widely used machine learning method and XGBoost use far fewer resources than existing systems. It`s usage went from advertising systems learn to match from users and their interests up to detecting anomaly events in physics that lead to new pyhsics (Chen, T., Guestrin, C. 2016, P. 1).

Example for Decision Tree: Decision Tree: https://xgboost.readthedocs.io/en/latest/tutorials/model.htm (Zugriff: 29.06.2021)

Note: Score in model is not the Gini impurity!

Gradient Boost

Creates a single leaf which represents an initial guess for a value of all of the samples. Out of the leaf it´s create a tree and build a next tree based on the errors made by the previous tree. Until it failed to fit or it created the trees you set at the beginning.

First Tree

First Tree

Second Tree

Second Tree

Third Tree

Third Tree

Statquest Gradient Boost Part 1 (of 4): Regression Main Ideas

XGBoost

Does an initial prediction regardless of regression or classification. Use unique regression trees starting with one leaf. Leaf contains a similarity score of (lamdba). This scores used for calculating the gain. Higher gain = better split in a tree. Gamma can be used to prune Trees.

Lambda is a regularization parameter this results into a less higher sensitivity to individual observations. It also prevent the training data to over fitting.

Next tree will be created by learning rate “eta” (default setting is 0.3).

Statquest XGBoost Part 1 (of 4): Regression

[XGBoost read the docs]: https://xgboost.readthedocs.io/en/latest/R-package/xgboostPresentation.html

Data Preparation

#Loading Tools
library(xgboost)
## Warning: package 'xgboost' was built under R version 4.0.5
library(raster)
## Loading required package: sp
#Check Workspace
getwd()
## [1] "C:/Users/Jan-Niklas Tripp/Desktop/SDM/XGBoost_Tutorial"
#Set Workspace into created Folder
setwd("C:/Users/Jan-Niklas Tripp/Desktop/SDM/XGBoost_Tutorial")

##Data Preparation
#Loading the CSV Data
species_locations = read.csv("PakistanLadakh.csv", sep = ",")

#Check for Colnames
head(species_locations)
##       ï..species        x        y
## 1 Acraea_issoria 73.08635 34.43060
## 2 Acraea_issoria 73.36326 34.24367
## 3 Acraea_issoria 73.36758 34.14221
## 4 Acraea_issoria 73.82637 33.84266
## 5 Acraea_issoria 74.19839 33.94852
## 6  Acraea_violae 72.34951 31.70947
#Change Colnames for readability
colnames(species_locations) = cbind("species", "long", "lat")

#Check for NA Values
summary(species_locations)
##    species               long            lat       
##  Length:5870        Min.   :62.23   Min.   :24.13  
##  Class :character   1st Qu.:71.90   1st Qu.:33.79  
##  Mode  :character   Median :73.20   Median :34.74  
##                     Mean   :72.71   Mean   :34.27  
##                     3rd Qu.:73.93   3rd Qu.:35.94  
##                     Max.   :77.78   Max.   :37.12
#New Colnames check
head(species_locations)
##          species     long      lat
## 1 Acraea_issoria 73.08635 34.43060
## 2 Acraea_issoria 73.36326 34.24367
## 3 Acraea_issoria 73.36758 34.14221
## 4 Acraea_issoria 73.82637 33.84266
## 5 Acraea_issoria 74.19839 33.94852
## 6  Acraea_violae 72.34951 31.70947
##A raster background map showing elevation data 
#get ISO code for Pakistan
getData("ISO3")
##     ISO3                                         NAME
## 1    AFG                                  Afghanistan
## 2    XAD                        Akrotiri and Dhekelia
## 3    ALA                                        Åland
## 4    ALB                                      Albania
## 5    DZA                                      Algeria
## 6    ASM                               American Samoa
## 7    AND                                      Andorra
## 8    AGO                                       Angola
## 9    AIA                                     Anguilla
## 10   ATA                                   Antarctica
## 11   ATG                          Antigua and Barbuda
## 12   ARG                                    Argentina
## 13   ARM                                      Armenia
## 14   ABW                                        Aruba
## 15   AUS                                    Australia
## 16   AUT                                      Austria
## 17   AZE                                   Azerbaijan
## 18   BHS                                      Bahamas
## 19   BHR                                      Bahrain
## 20   BGD                                   Bangladesh
## 21   BRB                                     Barbados
## 22   BLR                                      Belarus
## 23   BEL                                      Belgium
## 24   BLZ                                       Belize
## 25   BEN                                        Benin
## 26   BMU                                      Bermuda
## 27   BTN                                       Bhutan
## 28   BOL                                      Bolivia
## 29   BES            Bonaire, Saint Eustatius and Saba
## 30   BIH                       Bosnia and Herzegovina
## 31   BWA                                     Botswana
## 32   BVT                                Bouvet Island
## 33   BRA                                       Brazil
## 34   IOT               British Indian Ocean Territory
## 35   VGB                       British Virgin Islands
## 36   BRN                                       Brunei
## 37   BGR                                     Bulgaria
## 38   BFA                                 Burkina Faso
## 39   BDI                                      Burundi
## 40   KHM                                     Cambodia
## 41   CMR                                     Cameroon
## 42   CAN                                       Canada
## 43   CPV                                   Cape Verde
## 44   XCA                                  Caspian Sea
## 45   CYM                               Cayman Islands
## 46   CAF                     Central African Republic
## 47   TCD                                         Chad
## 48   CHL                                        Chile
## 49   CHN                                        China
## 50   CXR                             Christmas Island
## 51   XCL                            Clipperton Island
## 52   CCK                                Cocos Islands
## 53   COL                                     Colombia
## 54   COM                                      Comoros
## 55   COK                                 Cook Islands
## 56   CRI                                   Costa Rica
## 57   CIV                                Côte d'Ivoire
## 58   HRV                                      Croatia
## 59   CUB                                         Cuba
## 60   CUW                                      Curaçao
## 61   CYP                                       Cyprus
## 62   CZE                               Czech Republic
## 63   COD             Democratic Republic of the Congo
## 64   DNK                                      Denmark
## 65   DJI                                     Djibouti
## 66   DOM                           Dominican Republic
## 67   DMA                                     Dominica
## 68   ECU                                      Ecuador
## 69   EGY                                        Egypt
## 70   SLV                                  El Salvador
## 71   GNQ                            Equatorial Guinea
## 72   ERI                                      Eritrea
## 73   EST                                      Estonia
## 74   ETH                                     Ethiopia
## 75   FLK                             Falkland Islands
## 76   FRO                                Faroe Islands
## 77   FJI                                         Fiji
## 78   FIN                                      Finland
## 79   FRA                                       France
## 80   GUF                                French Guiana
## 81   PYF                             French Polynesia
## 82   ATF                  French Southern Territories
## 83   GAB                                        Gabon
## 84   GMB                                       Gambia
## 85   GEO                                      Georgia
## 86   DEU                                      Germany
## 87   GHA                                        Ghana
## 88   GIB                                    Gibraltar
## 89   GRC                                       Greece
## 90   GRL                                    Greenland
## 91   GRD                                      Grenada
## 92   GLP                                   Guadeloupe
## 93   GUM                                         Guam
## 94   GTM                                    Guatemala
## 95   GGY                                     Guernsey
## 96   GNB                                Guinea-Bissau
## 97   GIN                                       Guinea
## 98   GUY                                       Guyana
## 99   HTI                                        Haiti
## 100  HMD            Heard Island and McDonald Islands
## 101  HND                                     Honduras
## 102  HKG                                    Hong Kong
## 103  HUN                                      Hungary
## 104  ISL                                      Iceland
## 105  IND                                        India
## 106  IDN                                    Indonesia
## 107  IRN                                         Iran
## 108  IRQ                                         Iraq
## 109  IRL                                      Ireland
## 110  IMN                                  Isle of Man
## 111  ISR                                       Israel
## 112  ITA                                        Italy
## 113  JAM                                      Jamaica
## 114  JPN                                        Japan
## 115  JEY                                       Jersey
## 116  JOR                                       Jordan
## 117  KAZ                                   Kazakhstan
## 118  KEN                                        Kenya
## 119  KIR                                     Kiribati
## 120  XKO                                       Kosovo
## 121  KWT                                       Kuwait
## 122  KGZ                                   Kyrgyzstan
## 123  LAO                                         Laos
## 124  LVA                                       Latvia
## 125  LBN                                      Lebanon
## 126  LSO                                      Lesotho
## 127  LBR                                      Liberia
## 128  LBY                                        Libya
## 129  LIE                                Liechtenstein
## 130  LTU                                    Lithuania
## 131  LUX                                   Luxembourg
## 132  MAC                                        Macao
## 133  MKD                                    Macedonia
## 134  MDG                                   Madagascar
## 135  MWI                                       Malawi
## 136  MYS                                     Malaysia
## 137  MDV                                     Maldives
## 138  MLI                                         Mali
## 139  MLT                                        Malta
## 140  MHL                             Marshall Islands
## 141  MTQ                                   Martinique
## 142  MRT                                   Mauritania
## 143  MUS                                    Mauritius
## 144  MYT                                      Mayotte
## 145  MEX                                       Mexico
## 146  FSM                                   Micronesia
## 147  MDA                                      Moldova
## 148  MCO                                       Monaco
## 149  MNG                                     Mongolia
## 150  MNE                                   Montenegro
## 151  MSR                                   Montserrat
## 152  MAR                                      Morocco
## 153  MOZ                                   Mozambique
## 154  MMR                                      Myanmar
## 155  NAM                                      Namibia
## 156  NRU                                        Nauru
## 157  NPL                                        Nepal
## 158  NLD                                  Netherlands
## 159  NCL                                New Caledonia
## 160  NZL                                  New Zealand
## 161  NIC                                    Nicaragua
## 162  NER                                        Niger
## 163  NGA                                      Nigeria
## 164  NIU                                         Niue
## 165  NFK                               Norfolk Island
## 166  PRK                                  North Korea
## 167  XNC                              Northern Cyprus
## 168  MNP                     Northern Mariana Islands
## 169  NOR                                       Norway
## 170  OMN                                         Oman
## 171  PAK                                     Pakistan
## 172  PLW                                        Palau
## 173  PSE                                    Palestina
## 174  PAN                                       Panama
## 175  PNG                             Papua New Guinea
## 176  XPI                              Paracel Islands
## 177  PRY                                     Paraguay
## 178  PER                                         Peru
## 179  PHL                                  Philippines
## 180  PCN                             Pitcairn Islands
## 181  POL                                       Poland
## 182  PRT                                     Portugal
## 183  PRI                                  Puerto Rico
## 184  QAT                                        Qatar
## 185  COG                            Republic of Congo
## 186  REU                                      Reunion
## 187  ROU                                      Romania
## 188  RUS                                       Russia
## 189  RWA                                       Rwanda
## 190  BLM                             Saint-Barthélemy
## 191  MAF                                 Saint-Martin
## 192  SHN                                 Saint Helena
## 193  KNA                        Saint Kitts and Nevis
## 194  LCA                                  Saint Lucia
## 195  SPM                    Saint Pierre and Miquelon
## 196  VCT             Saint Vincent and the Grenadines
## 197  WSM                                        Samoa
## 198  SMR                                   San Marino
## 199  STP                        Sao Tome and Principe
## 200  SAU                                 Saudi Arabia
## 201  SEN                                      Senegal
## 202  SRB                                       Serbia
## 203  SYC                                   Seychelles
## 204  SLE                                 Sierra Leone
## 205  SGP                                    Singapore
## 206  SXM                                 Sint Maarten
## 207  SVK                                     Slovakia
## 208  SVN                                     Slovenia
## 209  SLB                              Solomon Islands
## 210  SOM                                      Somalia
## 211  ZAF                                 South Africa
## 212  SGS South Georgia and the South Sandwich Islands
## 213  KOR                                  South Korea
## 214  SSD                                  South Sudan
## 215  ESP                                        Spain
## 216  XSP                              Spratly Islands
## 217  LKA                                    Sri Lanka
## 218  SDN                                        Sudan
## 219  SUR                                     Suriname
## 220  SJM                       Svalbard and Jan Mayen
## 221  SWZ                                    Swaziland
## 222  SWE                                       Sweden
## 223  CHE                                  Switzerland
## 224  SYR                                        Syria
## 225  TWN                                       Taiwan
## 226  TJK                                   Tajikistan
## 227  TZA                                     Tanzania
## 228  THA                                     Thailand
## 229  TLS                                   East Timor
## 230  TGO                                         Togo
## 231  TKL                                      Tokelau
## 232  TON                                        Tonga
## 233  TTO                          Trinidad and Tobago
## 234  TUN                                      Tunisia
## 235  TUR                                       Turkey
## 236  TKM                                 Turkmenistan
## 237  TCA                     Turks and Caicos Islands
## 238  TUV                                       Tuvalu
## 239  UGA                                       Uganda
## 240  UKR                                      Ukraine
## 241  ARE                         United Arab Emirates
## 242  GBR                               United Kingdom
## 243  UMI         United States Minor Outlying Islands
## 244  USA                                United States
## 245  URY                                      Uruguay
## 246  UZB                                   Uzbekistan
## 247  VUT                                      Vanuatu
## 248  VAT                                 Vatican City
## 249  VEN                                    Venezuela
## 250  VNM                                      Vietnam
## 251  VIR                         Virgin Islands, U.S.
## 252  WLF                            Wallis and Futuna
## 253  ESH                               Western Sahara
## 254  YEM                                        Yemen
## 255  ZMB                                       Zambia
## 256  ZWE                                     Zimbabwe
#ISO code = PAK
elevation_PAK = getData("alt", country="PAK", level=0)

#Boundary of Pakistan for 
boundary_PAK = getData("GADM", country="PAK", level=0)

show(boundary_PAK)
## class       : SpatialPolygonsDataFrame 
## features    : 1 
## extent      : 60.89944, 77.84308, 23.70292, 37.09701  (xmin, xmax, ymin, ymax)
## crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 2
## names       : GID_0,   NAME_0 
## value       :   PAK, Pakistan
#loading Bioclim Data Bio 2 around Pakistan
bioclim_data = getData(name = "worldclim",
                       var = "bio",
                       res = 2.5,
                       lon = 23.70292,
                       lat = 77.84308)[[c(2)]]

#Elevation Map
plot(elevation_PAK, main = "Elevation of Pakistan", col = terrain.colors(255))


#Checking if data are out of Pakistan
points(x = species_locations$long,
       y = species_locations$lat,
       col = "olivedrab",
       pch = 15,
       cex = 0.50)

#Convert to SP-DataFrame
coordinates(species_locations) = ~long+lat

#Setting CRS to Elevation Map
crs(species_locations) = crs(boundary_PAK)

#Crop species to Pakistan
crop_species_locations = crop(x = species_locations, y = boundary_PAK)

#Plot for checking
plot(elevation_PAK, main = "Elevation of Pakistan", col = terrain.colors(255))

plot(crop_species_locations, add = TRUE)

#loading Bioclim Data Bio 2 
bioclim_data = getData(name = "worldclim",
                       var = "bio",
                       res = 5)[[c(2)]]

plot(bioclim_data, main = "Worldclim map of layer Bio_2")

#Shows extent of Pakistan
print(extent(boundary_PAK))
## class      : Extent 
## xmin       : 60.89944 
## xmax       : 77.84308 
## ymin       : 23.70292 
## ymax       : 37.09701
#Create an extent Object which contains the max and min Values for Longitude and Latitude
extent_pak = extent(boundary_PAK)

#Use the extent via Crop Function on bioclim Data
bioclim_data = crop(bioclim_data, extent_pak) #bioclim_data = mask(bioclim_data, boundary_PAK)

#Test Plot for visual check 
plot(bioclim_data, main = "Worldclim map of layer Bio_2")

plot(crop_species_locations, add = TRUE)

What does XGBoost need?

Head Documentation in R-Studio

Head Documentation in R-Studio

Closer look into Documentation in R-Studio

Closer look into Documentation in R-Studio

Example of XGBoost

#Matrix of bioclim_data
bio_values = extract(bioclim_data, extent_pak)
head(bio_values)
## [1] 140 140 140 140 139 140
#Matrix of Values where our species are located
species_bio = extract(bioclim_data, crop_species_locations)
head(species_bio)
## [1] 121  90  89  99  88 148
#Our train dataset.
train_bio = data.matrix(species_bio)
head(train_bio)
##      [,1]
## [1,]  121
## [2,]   90
## [3,]   89
## [4,]   99
## [5,]   88
## [6,]  148
#Remember we need to transform our species (Char) into numeric values 
#For this we have to convert the char in a factor and than into a number otherwise it contains only NA
classes = as.numeric(as.factor(crop_species_locations$species)) -1
head(classes)
## [1] 0 0 0 0 0 1
#A very simple xgb model
xgb <- xgboost(booster="gbtree",
               data = train_bio, 
               label = classes, 
               eta = 0.3,
               max_depth = 6, 
               nround=10, 
               objective = "multi:softprob",
               num_class = length(unique(classes)),#Number of unique Classes = 421
               nthread = 3)
## [09:59:29] WARNING: amalgamation/../src/learner.cc:1095: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'multi:softprob' was changed from 'merror' to 'mlogloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
## [1]  train-mlogloss:5.661157 
## [2]  train-mlogloss:5.437970 
## [3]  train-mlogloss:5.276015 
## [4]  train-mlogloss:5.150365 
## [5]  train-mlogloss:5.049788 
## [6]  train-mlogloss:4.967163 
## [7]  train-mlogloss:4.898900 
## [8]  train-mlogloss:4.841171 
## [9]  train-mlogloss:4.791974 
## [10] train-mlogloss:4.749990
#Use the Model on the Matrix of bioclim_data 
result = predict(xgb, as.matrix(bio_values))

#Creating a raster with bioclim_data
res = raster(bioclim_data)

#Giving the raster the Values of Species 1 and their probability to occur in one Cell of the raster
res = setValues(res, result[1:32683]+1)

#Plotting the first Species
plot(res, main = "Probability of species 1 to occur")

XGBoost with Train and Test data

#Set.seed ensures that we get the same sample 
set.seed(1)

#Train index contain 25% random numbers in range of the length from species_bio
train_index = sample(length(species_bio), floor(0.25*length(species_bio)))
head(train_index)
## [1] 1017 4775 2177 5026 1533 4567
#Train data contain 25% of the data of species_bio 
train_data = species_bio[train_index]
head(train_index)
## [1] 1017 4775 2177 5026 1533 4567
#Test data contain 75% of the data of species_bio
test_data = species_bio[-train_index]
head(test_data)
## [1] 121  90  89  99  88 148
#Container for species classes
train_label = as.numeric(as.factor(classes[train_index]))-1

test_label = as.numeric(as.factor(classes[-train_index]))-1

#Container for species bio clim values
train_data2 = as.matrix(train_data)

test_data2 = as.matrix(test_data)

#Create an xgb model with train_data2
xgb_test <- xgboost(booster="gbtree",
               data = train_data2,
               label = train_label,
               eta = 0.9,
               max_depth = 8, 
               nround=50, 
               objective = "multi:softmax",#softprob doesn´t work with this set of data.
               num_class = length(unique(train_label)),
               nthread = 3)
## [09:59:36] WARNING: amalgamation/../src/learner.cc:1095: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'multi:softmax' was changed from 'merror' to 'mlogloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
## [1]  train-mlogloss:5.003776 
## [2]  train-mlogloss:4.597611 
## [3]  train-mlogloss:4.347867 
## [4]  train-mlogloss:4.176858 
## [5]  train-mlogloss:4.055059 
## [6]  train-mlogloss:3.957138 
## [7]  train-mlogloss:3.877603 
## [8]  train-mlogloss:3.808987 
## [9]  train-mlogloss:3.749453 
## [10] train-mlogloss:3.699752 
## [11] train-mlogloss:3.655303 
## [12] train-mlogloss:3.618612 
## [13] train-mlogloss:3.589193 
## [14] train-mlogloss:3.563043 
## [15] train-mlogloss:3.540428 
## [16] train-mlogloss:3.521586 
## [17] train-mlogloss:3.504595 
## [18] train-mlogloss:3.488876 
## [19] train-mlogloss:3.475848 
## [20] train-mlogloss:3.463068 
## [21] train-mlogloss:3.451809 
## [22] train-mlogloss:3.441724 
## [23] train-mlogloss:3.432001 
## [24] train-mlogloss:3.423303 
## [25] train-mlogloss:3.415419 
## [26] train-mlogloss:3.408140 
## [27] train-mlogloss:3.400987 
## [28] train-mlogloss:3.394957 
## [29] train-mlogloss:3.388676 
## [30] train-mlogloss:3.383113 
## [31] train-mlogloss:3.377686 
## [32] train-mlogloss:3.372787 
## [33] train-mlogloss:3.367757 
## [34] train-mlogloss:3.363231 
## [35] train-mlogloss:3.358610 
## [36] train-mlogloss:3.354526 
## [37] train-mlogloss:3.350906 
## [38] train-mlogloss:3.347500 
## [39] train-mlogloss:3.343986 
## [40] train-mlogloss:3.340747 
## [41] train-mlogloss:3.337713 
## [42] train-mlogloss:3.334894 
## [43] train-mlogloss:3.331864 
## [44] train-mlogloss:3.329363 
## [45] train-mlogloss:3.327100 
## [46] train-mlogloss:3.324417 
## [47] train-mlogloss:3.321903 
## [48] train-mlogloss:3.319762 
## [49] train-mlogloss:3.317321 
## [50] train-mlogloss:3.314564
#Predict with the model classes in test_data
result2 = predict(xgb_test, test_data2)

head(test_label)
## [1] 0 0 0 0 0 1
head(result2)
## [1] 132 235  41   3 204 111

Just Predict 2 Species

#New container for species_locations
species1 = species_locations

#Transform species in species1 into numbers
species1$species = as.numeric(as.factor(species1$species))

#I choose the species 6 because it has 53 entries and added another one to the set
species1 = species1[species1$species==6 | species1$species==5,]
head(species1)
##    species
## 16       5
## 17       5
## 18       5
## 19       5
## 20       5
## 21       5
species1_bio = extract(bioclim_data, species1)

#Set class value of species 5 to 0
species1[species1$species==5,] = 0
head(species1)
##    species
## 16       0
## 17       0
## 18       0
## 19       0
## 20       0
## 21       0
set.seed(2)

#Train index contain 25% random numbers in range of the length from species1
train_index2 = sample(length(species1), floor(0.25*length(species1)))
head(train_index2)
## [1] 21 15  6 60 32  8
#Train data contain 25% of the data of species1_bio 
train_data3 = species1_bio[train_index2]
head(train_index2)
## [1] 21 15  6 60 32  8
#Test data contain 75% of the data of species1_bio
test_data3 = species1_bio[-train_index2]
head(test_data2)
##      [,1]
## [1,]  121
## [2,]   90
## [3,]   89
## [4,]   99
## [5,]   88
## [6,]  148
#Container for species classes
train_label2 = as.numeric(as.factor(species1$species[train_index2]))-1

test_label2 = as.numeric(as.factor(species1$species[-train_index2]))-1

#Container for species bio clim values
train_data4 = as.matrix(train_data3)

test_data4 = as.matrix(test_data3)


#Create an xgb model with train_data2
xgb_test <- xgboost(booster="gbtree",
               data = train_data4,
               label = train_label2,
               eta = 0.1,
               max_depth = 6, 
               nround=10, 
               objective = "binary:hinge",
               num_class = length(unique(train_label2))-1,
               nthread = 4)
## [1]  train-error:0.266667 
## [2]  train-error:0.266667 
## [3]  train-error:0.266667 
## [4]  train-error:0.266667 
## [5]  train-error:0.266667 
## [6]  train-error:0.266667 
## [7]  train-error:0.000000 
## [8]  train-error:0.000000 
## [9]  train-error:0.000000 
## [10] train-error:0.000000
#Predict with the model classes in test_data
result3 = predict(xgb_test, test_data4)

head(test_label2)
## [1] 0 0 0 0 0 1
head(result3)
## [1] 0 0 0 1 0 0
#Wow it looks perfect! Too perfect...
#Lets do a prediction with all data
result_perf = predict(xgb_test, as.matrix(bio_values))

#Check what it predict for every Cell
summary(result_perf)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  1.0000  1.0000  0.9144  1.0000  1.0000
res2 = raster(bioclim_data)

res2 = setValues(res2, result_perf)

plot(res2, main = "Prediction if species 5 or 6 occur")
plot(boundary_PAK, add = TRUE)

Sources: StatQuest with Josh Starmer (22.01.2018): StatQuest: Decision Trees. https://www.youtube.com/watch?v=7VeUPuFGJHk (Zugriff: 28.06.2021).

StatQuest with Josh Starmer (16.12.2019): XGBoost Part 1 (of 4): Regression. https://www.youtube.com/watch?v=OtD8wVaFm6E&t=213s (Zugriff: 28.06.2021).

StatQuest with Josh Starmer (13.01.2020): XGBoost Part 2 (of 4): Classification. https://www.youtube.com/watch?v=8b1JEDvenQU&t=923s (Zugriff: 28.06.2021).

StatQuest with Josh Starmer (25.03.2019): Gradient Boost Part 1 (of 4): Regression Main Ideas. https://www.youtube.com/watch?v=3CC4N4z3GJc (Zugriff:28.06.2021).

Chen, T., Guestrin, C. (2016): XGBoost: A Scalable Tree Boosting System. https://arxiv.org/pdf/1603.02754.pdf (Zugriff: 15.06.2021).