Structures, Unions, and Enums

Structures

struct

  • A struct is a collection of data fields that can be of different types.
  • The members are laid out in memory in declaration order; the compiler may insert padding between them.
  • The members of a struct can be accessed using the dot operator.

To define a struct, you use the struct keyword followed by the struct name and the members enclosed in curly braces.

struct Student {
    char name[10];
    int age;
    bool male;
};

After defining a struct, you can create a variable of that type:

  • Use the dot operator to access the members of the struct.
Student s1; // members are not initialized yet
strcpy(s1.name, "John"); // requires <cstring>; the name must fit in 9 characters plus '\0'
s1.age = 20;
s1.male = true;

// alternatively, you can initialize a struct using an initializer list:
Student s2 = {"Jane", 21, true};

Default values

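  • If you don’t initialize a local struct variable, its members hold indeterminate values, and reading them is undefined behavior.
  • Since C++11, you can write = <value> after a member declaration (a default member initializer) to give that member a default value.
    • Combining these defaults with the initializer-list syntax shown earlier requires C++14 or later.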
struct Student {
    char name[10] = "Unknown";
    int age = 0;
    bool male = false;
};

Then you can create a struct without initializing it:

Student s1; // all members are initialized to their default values

Structure padding

Structure padding is a technique used by compilers to align data in memory for efficient access.

  • Modern processors are designed to read data most efficiently when it’s aligned to its natural boundary.
    • For example, a 4-byte integer is most efficiently read when it starts at an address divisible by 4.
    • Structures themselves are typically aligned to the largest alignment requirement of any of their members.
  • The compiler adds extra bytes (padding) between structure members to ensure each member starts at an appropriate address.

An example

classDiagram
    class Struct1 {
        char c
        int i
        char d
    }
    class Memory1 {
        +0: c
        +1: padding
        +2: padding
        +3: padding
        +4: i
        +5: i
        +6: i
        +7: i
        +8: d
        +9: padding
        +10: padding
        +11: padding
    }
    Struct1 --> Memory1

    class Struct2 {
        int i
        char c
        char d
    }
    class Memory2 {
        +0: i
        +1: i
        +2: i
        +3: i
        +4: c
        +5: d
        +6: padding
        +7: padding
    }
    Struct2 --> Memory2

Struct1:

struct Inefficient {
    char c;   // 1 byte
    int i;    // 4 bytes
    char d;   // 1 byte
};

Struct2:

struct Efficient {
    int i;    // 4 bytes
    char c;   // 1 byte
    char d;   // 1 byte
};

You can use the sizeof operator to get the size of a struct:

std::cout << sizeof(Inefficient) << std::endl; // typically 12 (Struct1 in the diagram)
std::cout << sizeof(Efficient) << std::endl;   // typically 8 (Struct2 in the diagram)

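You don’t need to worry about structure padding in most cases, but it’s good to know that it exists: when you store many instances, ordering the members from largest to smallest is an easy way to reduce it.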

Array of structures

You can also have an array of structures:

Student studentsArray[10];
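// the size of the array is 10 * sizeof(Student)
// sizeof(Student) is typically 20: 10 (name) + 2 padding + 4 (age) + 1 (male) + 3 padding
std::cout << sizeof(studentsArray) << std::endl; // typically 10 * 20 = 200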

Similarly, you can have a vector of structures:

std::vector<Student> studentsVector(10);
std::cout << studentsVector.size() * sizeof(Student) << std::endl; // typically 10 * 20 = 200 bytes of element storage

To initialize a vector/array of structures, you can use an initializer list:

std::vector<Student> studentsVector = {
    {"John", 20, true},
    {"Jane", 21, false}
};

Or you can use a loop to initialize the vector/array.

union

  • A union is a user-defined type in which all members share the same memory location.
    • At any given time, a union can contain no more than one object from its list of members.
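    • The members of a union are alternative contents for the same storage; only the member written most recently holds a valid value.
  • The size of a union is at least the size of its largest member (it may be rounded up to satisfy alignment).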
  • The members of a union can be accessed using the dot operator.

To define a union, you use the union keyword followed by the union name and the members enclosed in curly braces.

union RecordType    // Declare a simple union type
{
    char   ch;
    int    i;
    long   l;
    float  f;
    double d;
};

int main()
{
    RecordType t;
    std::cout << sizeof(t) << std::endl; // typically 8, the size of the largest member
    t.i = 5; // t holds an int
    t.f = 7.25f; // t now holds a float; the int value is gone
}

When you need to store different types of data but only one at a time, using a union can save memory compared to using separate variables.

For example, if you want to store an IPv4 address, you can use a union to hold it either as a single 32-bit number or as four octets.

union IPAddress {
    uint32_t numeric;
    uint8_t octets[4];
};
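  • Represent the IP address in two different formats
  • Access the IP address in whichever format was stored last
  • Reading the member that was not written last (for example, writing numeric and then reading octets) is undefined behavior in standard C++, although many compilers support it as an extension; the octet order you see also depends on the machine’s byte order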

enum

  • An enum is a user-defined type in which all members are given constant values.
  • It provides a way to define a set of symbolic constants.
  • Unless you specify values explicitly, its members are assigned integer values starting from 0 and increasing by 1.
enum Color {
    RED,
    GREEN,
    BLUE
};

Color c = RED;
c += 1; // error: the int result of c + 1 does not convert back to Color implicitly

int color_int = c; // color_int is 0
color_int += 1; // color_int is 1
Color c2 = static_cast<Color>(color_int); // c2 is GREEN

Because the enum members are integer constants, you can use them as case labels in switch statements.

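int input;
std::cin >> input; // user input: 0, 1, or 2 (there is no operator>> for an enum)

switch (input) { // enumerators convert implicitly to int, so they work as case labels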
    case RED:
        std::cout << "Red" << std::endl;
        break;
    case GREEN:
        std::cout << "Green" << std::endl;
        break;
    case BLUE:
        std::cout << "Blue" << std::endl;
        break;
    default:
        std::cout << "Invalid color" << std::endl;
        break;
}