This section covers R's fundamental data types and data structures. Nearly every data manipulation and analysis task in R builds on them.

Key Concepts

  1. Basic Data Types

    • Numeric
    • Integer
    • Character
    • Logical
    • Complex
  2. Data Structures

    • Vectors
    • Lists
    • Matrices
    • Arrays
    • Data Frames
    • Factors

Basic Data Types

Numeric

Numeric values store real numbers. By default, R stores any number typed without a suffix, including whole numbers such as 42, as a double-precision floating-point value.

# Example of numeric data type
num <- 42.5
print(num)
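To check how a number is stored, typeof() and class() can be called on it. The following is a minimal sketch using base R only; the variable names x and y are illustrative and do not appear elsewhere in this section.

```r
# A bare number literal is stored as a double, even when it looks whole
x <- 42
y <- 42.5

typeof(x)      # "double"
class(y)       # "numeric"
is.numeric(x)  # TRUE

# Floating-point arithmetic is approximate, so compare with a tolerance
(0.1 + 0.2) == 0.3                 # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE
```

Because doubles are approximate, all.equal() is usually a safer equality check than == for computed values.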

Integer

Integers are whole numbers stored in 32 bits. To create one, append L to the literal (for example, 42L); without the suffix, R stores the value as a double.

# Example of integer data type
int <- 42L
print(int)
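The effect of the L suffix can be seen by comparing types directly. This sketch uses only base R functions.

```r
typeof(42L)           # "integer"
typeof(42)            # "double"
is.integer(42)        # FALSE, because there is no L suffix

# as.integer() truncates toward zero rather than rounding
as.integer(42.9)      # 42

# Largest value an R integer can hold
.Machine$integer.max  # 2147483647
```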

Character

Character data types are used to store text or string values.

# Example of character data type
char <- "Hello, R!"
print(char)
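A few base R functions are commonly used with character values. The sketch below is illustrative; the variable name greeting is an assumption, not part of the original example.

```r
greeting <- "Hello, R!"

class(greeting)                  # "character"
nchar(greeting)                  # 9 (counts letters, punctuation, and the space)
toupper(greeting)                # "HELLO, R!"
paste("Hello", "R", sep = ", ")  # "Hello, R"
```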

Logical

Logical data types are used to store TRUE or FALSE values.

# Example of logical data type
logi <- TRUE
print(logi)
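Logical values usually come from comparisons rather than being typed directly. As a short sketch (the vector values is a made-up example), note that TRUE counts as 1 and FALSE as 0 in arithmetic:

```r
values <- c(3, 8, 1, 9)
above_two <- values > 2  # TRUE TRUE FALSE TRUE

sum(above_two)   # 3, the number of TRUE elements
mean(above_two)  # 0.75, the proportion of TRUE elements

# NA (missing) propagates unless the result is already determined
TRUE & NA        # NA
FALSE & NA       # FALSE
```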

Complex

Complex data types are used to store complex numbers.

# Example of complex data type
comp <- 3 + 4i
print(comp)
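Base R provides accessor functions for the parts of a complex number. A brief sketch, using the illustrative name z:

```r
z <- 3 + 4i

Re(z)    # 3, the real part
Im(z)    # 4, the imaginary part
Mod(z)   # 5, the modulus sqrt(3^2 + 4^2)
Conj(z)  # 3-4i, the complex conjugate

# sqrt(-1) returns NaN with a warning; convert to complex first
sqrt(as.complex(-1))  # 0+1i
```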

Data Structures

Vectors

Atomic vectors are the most basic data structure in R. Every element of a vector has the same type, and even a single value such as 42 is a vector of length one.

# Example of a numeric vector
num_vector <- c(1, 2, 3, 4, 5)
print(num_vector)

# Example of a character vector
char_vector <- c("a", "b", "c")
print(char_vector)
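Once a vector exists, elements are typically selected with square brackets, and arithmetic applies element by element. The following sketch assumes an illustrative vector named v:

```r
v <- c(10, 20, 30, 40, 50)

length(v)     # 5
v[1]          # 10 (R indexing starts at 1, not 0)
v[c(2, 4)]    # 20 40, selection by position
v[v > 25]     # 30 40 50, selection by logical condition
v * 2         # 20 40 60 80 100, element-wise arithmetic
```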

Lists

Lists can hold elements of different types, including other lists.

# Example of a list
my_list <- list(1, "a", TRUE, 1 + 4i)
print(my_list)
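List elements can be named, and the choice between single and double brackets matters when extracting them. A minimal sketch, where the list person and its fields are hypothetical:

```r
person <- list(name = "Jane", age = 25, scores = c(90, 85))

person$name             # "Jane", access by name
person[["scores"]]      # 90 85, double brackets return the element itself

# Single brackets return a smaller list, not the element
class(person["age"])    # "list"
class(person[["age"]])  # "numeric"
```

In practice, [[ ]] or $ is used to work with a value, and [ ] to keep a subset of the list.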

Matrices

Matrices are two-dimensional, homogeneous data structures: every element has the same type. By default, matrix() fills its values column by column.

# Example of a matrix
my_matrix <- matrix(1:9, nrow = 3, ncol = 3)
print(my_matrix)
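The fill order and indexing of a matrix can be illustrated with a small sketch. The names m_col and m_row are illustrative; byrow is a real argument of matrix().

```r
m_col <- matrix(1:6, nrow = 2)                # filled column by column (default)
m_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # filled row by row

m_col[1, ]      # 1 3 5, first row
m_row[1, ]      # 1 2 3, first row
dim(m_row)      # 2 3, rows then columns
m_row[2, 3]     # 6, element at row 2, column 3
t(m_row)[3, 2]  # 6, the same element after transposing
```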

Arrays

Arrays are multi-dimensional, homogeneous data structures.

# Example of an array
my_array <- array(1:8, dim = c(2, 2, 2))
print(my_array)
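Array elements are addressed with one index per dimension. As a brief sketch using the same shape as the example above (the name a is illustrative):

```r
a <- array(1:8, dim = c(2, 2, 2))

dim(a)      # 2 2 2
a[1, 2, 1]  # 3, row 1, column 2 of the first slice
a[, , 2]    # the second 2 x 2 slice, holding the values 5 to 8
```

Leaving an index empty, as in a[, , 2], selects everything along that dimension.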

Data Frames

Data frames are two-dimensional, heterogeneous data structures, similar to a database table or a spreadsheet. Each column is a vector with a single type, but different columns can have different types.

# Example of a data frame
my_data_frame <- data.frame(
  Name = c("John", "Jane", "Doe"),
  Age = c(23, 25, 28),
  Gender = c("M", "F", "M")
)
print(my_data_frame)
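Common ways to inspect and subset a data frame can be sketched as follows. The name people is illustrative, and the comment about character columns assumes R 4.0 or later, where data.frame() no longer converts strings to factors by default.

```r
people <- data.frame(
  Name = c("John", "Jane", "Doe"),
  Age = c(23, 25, 28),
  Gender = c("M", "F", "M")
)

str(people)            # compact summary of columns and their types
nrow(people)           # 3
ncol(people)           # 3
people$Age             # 23 25 28, one column as a vector
sapply(people, class)  # class of each column (Name is character in R >= 4.0)

# Rows where Age exceeds 24, keeping only the Name column
people[people$Age > 24, "Name"]  # Jane and Doe
```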

Factors

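Factors are used to handle categorical data and can be ordered or unordered. By default, factor() creates an unordered factor and sorts its levels alphabetically, so the example below has the levels high, low, medium.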

# Example of a factor
my_factor <- factor(c("low", "medium", "high", "medium", "low"))
print(my_factor)
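When the categories have a natural order, the levels can be given explicitly. The sketch below uses the real levels and ordered arguments of factor(); the variable names are illustrative.

```r
ratings <- c("low", "medium", "high", "medium", "low")

# Default: unordered, levels sorted alphabetically
sizes <- factor(ratings)
levels(sizes)  # "high" "low" "medium"

# Explicit level order, marked as ordered
sizes_ord <- factor(ratings, levels = c("low", "medium", "high"), ordered = TRUE)
levels(sizes_ord)            # "low" "medium" "high"
sizes_ord[1] < sizes_ord[3]  # TRUE, since low comes before high
as.integer(sizes_ord)        # 1 2 3 2 1, the underlying level codes
table(sizes_ord)             # counts per level: low 2, medium 2, high 1
```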

Practical Exercises

Exercise 1: Create a Vector

Create a numeric vector containing the numbers 10, 20, 30, 40, and 50.

# Solution
num_vector <- c(10, 20, 30, 40, 50)
print(num_vector)

Exercise 2: Create a List

Create a list containing a numeric value, a character string, and a logical value.

# Solution
my_list <- list(100, "R Programming", FALSE)
print(my_list)

Exercise 3: Create a Data Frame

Create a data frame with columns for Name, Age, and Score. Populate it with at least three rows of data.

# Solution
my_data_frame <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(24, 30, 28),
  Score = c(85, 90, 88)
)
print(my_data_frame)

Common Mistakes and Tips

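  • Mixing Data Types in Vectors: An atomic vector holds elements of one type only. If you mix types, R silently coerces every element to the most flexible type present (in increasing order: logical, integer, double, complex, character), so c(1, "a") becomes a character vector.
  • Using c() for Lists: Use list() rather than c() to build a list. Given only atomic values, c() returns an atomic vector and coerces them to a common type.
  • Data Frame Column Types: Each data frame column is a vector, so it has a single type. Mixing types within a column does not raise an error; R coerces the values silently (numbers mixed with text become character), which can break later calculations.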
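The coercion behavior described in these tips can be checked with a short sketch (the variable name mixed is illustrative):

```r
# Mixing types in c() coerces everything to the most flexible type
mixed <- c(1, "a", TRUE)
mixed         # "1" "a" "TRUE"
class(mixed)  # "character"

# Logicals mixed with numbers become numbers
c(1, TRUE, FALSE)  # 1 1 0

# list() keeps each element's own type
kept <- list(1, "a", TRUE)
sapply(kept, class)  # "numeric" "character" "logical"
```

Checking class() or str() after building a vector or data frame is a quick way to catch unintended coercion.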

Conclusion

In this section, we covered R's basic data types (numeric, integer, character, logical, and complex) and its core data structures (vectors, lists, matrices, arrays, data frames, and factors). In the next section, we will build on these to explore basic operations and functions in R.
