Overview of functions

| Category | Function | Input Type | Description | Key Considerations |
| --- | --- | --- | --- | --- |
| Reading Data | read.csv | File path (character string) | Reads a CSV file into a data frame. | Ensure the file path is correct. Use fileEncoding to declare the file's encoding. Since R 4.0.0, stringsAsFactors defaults to FALSE; set it explicitly on older versions to avoid unexpected factor conversion. |
| | read.table | File path (character string) | Reads a delimited text file into a data frame. | Flexible for delimiter-separated files, but requires manual settings such as sep for the delimiter and header for column names. |
| Simple Math | sum | Numeric vector | Calculates the sum of vector elements. | Returns NA if any element is NA; use na.rm = TRUE to ignore missing data. |
| | mean | Numeric vector | Computes the mean (average) of vector elements. | Returns NA if any element is NA unless na.rm = TRUE is specified. |
| | round | Numeric vector | Rounds numeric values to a specified number of decimal places (digits). | Values exactly halfway are rounded to the even digit (round(2.5) is 2), and floating-point representation can affect results; use carefully when precision matters. |
| | log | Numeric vector | Calculates the natural logarithm of values; the base argument selects another base. | Check input for non-positive values: log(0) returns -Inf and negative inputs return NaN with a warning. |
| Indexing | [] | Vector, matrix, or data frame | Extracts elements of vectors, matrices, or data frames. | Indices are 1-based in R. Accepts numeric, logical, or character indices; negative numeric indices exclude elements, and an out-of-range index on a vector returns NA rather than an error. |
| Subsetting | subset | Data frame | Returns subsets of data frames based on conditions. | Simpler than using [ for interactive work, but it relies on non-standard evaluation, so [ is safer inside functions. Can be slower for large datasets. |
| Sorting | sort | Vector | Sorts vector elements in ascending or descending order. | Removes NA values by default; use na.last = TRUE to keep them at the end. Specify decreasing = TRUE for descending order. |
| | order | Vector | Returns the indices that would sort the data in ascending or descending order. | Often used to sort data frames by multiple columns, e.g. df[order(df$a, df$b), ]. |
| | rank | Vector | Returns the ranks of elements in a vector. | Be cautious with ties; the ties.method argument determines tie-breaking behavior (the default is "average"). |
| Selecting Data | which | Logical vector | Returns the indices of elements that are TRUE. | NA values in the condition are skipped; helpful for subsetting data programmatically. |
| | match | Vector x, Vector y | Finds the position of the first match of each element of x in y. | Returns indices of matches, with NA for unmatched elements. |
| | %in% | Vector x, Vector y | Logical operator to test whether elements of x belong to y. | Easier than match for boolean results but does not return indices. |
| Writing Data | write.csv | Data frame | Writes a data frame to a CSV file. | Check the path and write permissions. Use row.names = FALSE to avoid writing row names as an extra first column. |
| | write.table | Data frame or matrix | Writes a data frame or matrix to a delimited text file. | Use sep to specify the delimiter. Be careful with data that contains the delimiter or quote characters; quoting is controlled by the quote argument. |
| Aggregating Data | aggregate | Data frame | Splits data into groups and applies a function to summarize each group. | Useful for simple aggregation but limited for complex tasks. Choose grouping variables carefully. |
| | tapply | Vector x, Factor (or list of factors) | Applies a function to subsets of a vector defined by a factor. | Well suited to grouping by a single factor; a list of factors produces a multi-way array, so consider alternatives for multidimensional aggregation. |
| Merging Data | merge | Data frame x, Data frame y | Joins two data frames on shared key columns or row names. | Specify by to avoid unexpected joins. The default is an inner join; use all, all.x, or all.y to keep unmatched rows. Handles one-to-one, one-to-many, and many-to-many relationships. |
| | cbind | Vectors or data frames | Combines objects by columns. | Data frames must have the same number of rows. Rows are paired by position, so data will be mismatched if row orders differ. |
| | rbind | Vectors or data frames | Combines objects by rows. | Objects must have the same number of columns. Data frames must also share column names, which are matched by name; a mismatch raises an error. |
| Plotting | plot | x: Numeric vector, y: Numeric vector | Creates a scatterplot or line plot depending on the inputs. | Highly customizable through arguments such as type, col, pch, and main. |
| | hist | Numeric vector | Creates a histogram to display the distribution of data. | Use breaks to control the bins; a single number is treated as a suggestion. Labels and axis scaling may need adjustment for clarity. |
| | boxplot | Formula or numeric vectors | Creates boxplots to display data distribution and outliers. | Handles grouped data through the formula interface. Use notch = TRUE to draw notches that approximate a confidence interval around the median. |
| | barplot | Numeric vector or matrix | Creates barplots for categorical data or summary statistics. | Grouped barplots require matrix input with beside = TRUE; the default for a matrix is stacked bars. Customize colors and labels for better visualization. |
| | lines | x: Numeric vector, y: Numeric vector | Adds connected lines to an existing plot. | Typically used to overlay data on an existing plot. Ensure x and y lengths match. |
| | points | x: Numeric vector, y: Numeric vector | Adds points to an existing plot. | Useful for highlighting specific data points. Combine with pch and col for customization. |
| | legend | Character labels and positioning | Adds a legend to an existing plot. | Set the position with coordinates or a keyword such as "topright"; adjust symbol appearance using pch, col, and cex. |
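
The missing-value, sorting, and selecting rows in the table can be tried out with a short sketch like the one below. It uses base R only; the vector scores and the data frame df are hypothetical sample data made up for illustration, not part of the original overview.

```r
# Hypothetical sample data; the values are illustrative only.
scores <- c(12, NA, 7, 7, 30)

# sum() and mean() return NA when any element is missing, unless na.rm = TRUE.
total_raw   <- sum(scores)                 # NA
total_clean <- sum(scores, na.rm = TRUE)   # 56
avg_clean   <- mean(scores, na.rm = TRUE)  # 14

# round() sends exact halves to the even digit.
rounded <- round(c(0.5, 1.5, 2.5))         # 0 2 2

# sort() drops NA by default; na.last = TRUE keeps it at the end.
sorted_default <- sort(scores)                  # 7 7 12 30
sorted_keep_na <- sort(scores, na.last = TRUE)  # 7 7 12 30 NA

# order() returns positions rather than values, so it can reorder a data frame.
df <- data.frame(id = c("a", "b", "c", "d", "e"), score = scores)
df_sorted <- df[order(df$score, decreasing = TRUE), ]

# rank() averages tied ranks by default; ties.method = "min" gives both the lower rank.
ranks_avg <- rank(c(12, 7, 7, 30))                       # 3 1.5 1.5 4
ranks_min <- rank(c(12, 7, 7, 30), ties.method = "min")  # 3 1 1 4

# which() returns the positions where the condition is TRUE and skips NA.
hits <- which(scores > 10)                 # 1 5

# match() returns positions (NA when absent); %in% returns TRUE/FALSE.
positions <- match(c("b", "z"), c("a", "b", "c"))  # 2 NA
present   <- c("b", "z") %in% c("a", "b", "c")     # TRUE FALSE
```

In df_sorted the row with the missing score ends up last, because order() places NA at the end by default.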

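For the aggregating and merging functions, a minimal sketch under assumed data may help show how the join type and column matching behave. The data frames sales, managers, and extra are hypothetical and exist only for this example.

```r
# Hypothetical sample tables; names and values are illustrative only.
sales <- data.frame(
  region = c("north", "south", "north", "west"),
  amount = c(100, 50, 25, 10)
)
managers <- data.frame(
  region  = c("north", "south", "east"),
  manager = c("Ana", "Ben", "Cy")
)

# aggregate() with a formula: total amount per region, returned as a data frame.
totals_df <- aggregate(amount ~ region, data = sales, FUN = sum)

# tapply() does the same grouping but returns a named one-dimensional array.
totals_vec <- tapply(sales$amount, sales$region, sum)

# merge() is an inner join by default: regions without a match on both sides are dropped.
inner <- merge(sales, managers, by = "region")

# all.x = TRUE keeps every row of sales (a left join) and fills manager with NA.
left <- merge(sales, managers, by = "region", all.x = TRUE)

# rbind() on data frames matches columns by name, not by position.
extra <- data.frame(amount = 5, region = "south")
combined <- rbind(sales, extra)
```

Here inner has three rows (the west and east regions have no partner), while left keeps all four sales rows.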