Function1 & Arrays

Function

Why functions?

Sometimes we want to repeat a set of instructions many times. Functions allow us to do that, and they also help us to:

Avoid repeating the same code over and over again.
Break down a complex task into smaller, more manageable pieces.
Increase code reusability.
Make code more readable and easier to understand.

What is a function?

A function is a block of code that performs some operation.

For example, to calculate the sum of two numbers, you can define a function sum that takes two parameters and returns their sum:

int sum(int a, int b)
{
    return a + b;
}

int is the return type of the function.
sum is the name of the function.
int a, int b are the parameters of the function.
The code between the braces, here return a + b;, is the body of the function.
The function can be called from other parts of the program.
The values passed to the function are the arguments, which should match the parameters in the function definition.

For example:

int main()
{
    int i = sum(10, 32);
    int j = sum(i, 66);
    cout << "The value of j is " << j << endl; // The value of j is 108
}

There’s no practical limit to function length, but good design aims for functions that perform a single well-defined task.

Function declaration

A minimal function declaration consists of

return_type functionName(parameter_list);

return_type is the type of the value the function returns.
- void if the function does not return a value.
- Since C++11, auto can be combined with a trailing return type (auto f() -> int). Since C++14, auto alone lets the compiler deduce the return type from the return statements, but this is rarely needed for simple functions.
functionName is the name of the function, which must begin with a letter or underscore and can’t contain spaces.
parameter_list is a parenthesis-delimited, comma-separated list of zero or more parameters.

Some examples:

int sum(int a, int b);

void printHello();

auto max(float a, float b) -> float; // -> is a trailing return type, introduced in C++11

double divide(double numerator, double denominator);

bool isEven(int number);

Function definition

A function definition consists of

return_type functionName(parameter_list)
{
    // function body
    return value;
}

Variables declared in the function body are local to the function.
The return statement is used to return a value from the function.
- If the return type is void, you can use return; to exit the function.
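
As a minimal sketch of the points above, the two functions below show a bare return; in a void function and a local variable that exists only inside its function. The names greet and absoluteValue are hypothetical and chosen only for illustration.

```cpp
#include <iostream>

// Hypothetical example: prints "Hello" the requested number of times.
void greet(int times)
{
    if (times <= 0) {
        return; // void function: a bare return exits early
    }
    for (int i = 0; i < times; i++) {
        std::cout << "Hello" << std::endl;
    }
}

// Hypothetical example: `result` is local and cannot be seen outside this function.
int absoluteValue(int n)
{
    int result = n;
    if (n < 0) {
        result = -n;
    }
    return result; // the returned value is copied back to the caller
}
```

Calling greet(0) prints nothing because the function returns before reaching the loop, while absoluteValue(-3) returns 3.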

Function parameters

The functions we’ve seen so far all pass their arguments by value.

The values of the arguments are copied to the parameters.
Changes to the parameters inside the function do not affect the original arguments.

int square3(int x)
{   
    x = 3;
    return x * x;
}

int main()
{
    int x = 5;
    cout << square3(x) << endl; // 9
    cout << x << endl; // 5
}

`inline` keyword

The inline keyword is a hint to the compiler to inline the function, that is, to replace the function call with the function body.

inline can improve performance when a small function is called frequently.
An inline function must be defined in every source file that calls it, so its definition is usually placed in a header file.
inline is only a hint; the compiler may choose to ignore it.

For example, suppose we want to calculate the square of a number in many places:

inline double square(double x) { return x * x; }

double x = square(5.0); // the call may be replaced with 5.0 * 5.0 if the compiler inlines it

Macros

Macros are a way to define a piece of code that can be reused throughout your program.

Macros are defined using the #define directive.
Macros are expanded by the preprocessor before the compilation process.

Object-like macros:
- #define PI 3.14159
Function-like macros:
- #define SQUARE(x) ((x) * (x))
For function-like macros, the preprocessor replaces the macro call with the macro body, substituting the arguments.
The preprocessor does not understand C++ syntax or semantics; it performs a purely textual replacement.
After the replacement, the code is compiled.

Some classic macro mistakes:

#define MAX(a, b) ((a) > (b) ? (a) : (b))

// x++ is evaluated twice!
int x = 12;
int result = MAX(x++, 10);

// no type checking: a double and an int are mixed silently
double x = 5.0;
int y = 10;
double result = MAX(x, y);

// expected max of (3 + 2) and (4 * 2)
int result = MAX(3 + 2, 4 * 2);
// works in our case, but not always: operator precedence can break a macro
// whose parameters are not parenthesized, for example, #define MAX(a, b) a > b ? a : b
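
The double evaluation can be observed directly. The sketch below compares the MAX macro with a small inline function; maxOf, xAfterMacro and xAfterFunction are hypothetical names introduced only for this comparison.

```cpp
#define MAX(a, b) ((a) > (b) ? (a) : (b))

// Hypothetical function equivalent of the macro, for int arguments only.
inline int maxOf(int a, int b) { return a > b ? a : b; }

// Returns the value of x after evaluating MAX(x++, 10) with x starting at 12.
int xAfterMacro()
{
    int x = 12;
    int result = MAX(x++, 10); // expands to ((x++) > (10) ? (x++) : (10))
    (void)result;              // result is 13 here; the cast silences unused-variable warnings
    return x;                  // x was incremented twice: 14
}

// Returns the value of x after evaluating maxOf(x++, 10) with x starting at 12.
int xAfterFunction()
{
    int x = 12;
    int result = maxOf(x++, 10); // the argument x++ is evaluated exactly once
    (void)result;                // result is 12 here
    return x;                    // x was incremented once: 13
}
```

With the macro, x ends up as 14 and the result is 13; with the function, x ends up as 13 and the result is 12, which is what the caller most likely intended.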

`inline` vs macros

Macros are expanded by the preprocessor, while inline functions are expanded by the compiler.
Macros do not perform type checking, while inline functions do.
Macros can result in multiple evaluations of their arguments, while inline functions do not.
Debugging macros is harder than debugging inline functions.
Macros are always expanded, while the compiler decides whether to inline an inline function.
…

Where to put functions?

In a source file, the declaration and the definition can be combined.

// main.cpp
...
// a function must be declared (here: defined) before it's used
int sum(int a, int b){
    return a + b;
}

int main(){
    int i = sum(10, 32);
    ...
}

Alternatively, declare the function first in the source file, then put the definition later in the same file.
- This is common in small projects where the function is used only in that file.

// main.cpp
...

// declaration before the first use
int sum(int a, int b);

int main(){
    int i = sum(10, 32);
    ...
}

// definition after the main function
int sum(int a, int b){
    return a + b;
}

With header files, put the declaration in the header file and the definition in a source file.

// sum.h
#pragma once

int sum(int a, int b);

// sum.cpp
#include "sum.h"

int sum(int a, int b){
    return a + b;
}

// main.cpp
#include "sum.h"

int main(){
    int i = sum(10, 32);
    ...
}

Why do we prefer to put the declaration in the header file and the definition in the source file?

When the implementation of the function changes, only that source file needs to be recompiled.
It avoids multiple definitions of the same function.
Modularity and encapsulation.
- Interface and implementation separation.
- Easy to change the implementation without changing the interface.

Arrays

Array definition

Arrays are sequences of identically typed elements.

The elements are stored in contiguous memory locations.
The size of the array is fixed at compile time.
The type of the elements must be the same.

An array declaration starts with the element type, followed by the array name, and then the size of the array in square brackets.

int arr[5];  // uninitialized local array, indeterminate values
int arr[5] = {1, 2, 3, 4, 5};  // initialized

`constexpr`

A constant expression is an expression that can be evaluated at compile time.

You can use constexpr to declare a constant variable.

constexpr double pi = 3.14159;
constexpr int r = 5;
constexpr double area = pi * r * r; // double, so the fractional part is not truncated

const is different from constexpr.

const declares a variable that cannot be modified after initialization; its value doesn’t necessarily need to be known at compile time.
constexpr is stricter than const: the value must be known at compile time.
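
The difference can be sketched as follows. The names runtimeValue, kRows, kCols, kCells and cellCount are hypothetical, and static_assert (available since C++11) is used here only to show that the compiler itself evaluates the expression.

```cpp
#include <cstddef>

// Hypothetical ordinary function: its result is only known at run time.
int runtimeValue(int seed) { return seed * 2; }

constexpr std::size_t kRows = 3;
constexpr std::size_t kCols = 4;
constexpr std::size_t kCells = kRows * kCols; // computed by the compiler
static_assert(kCells == 12, "kCells is a constant expression");

int cellCount()
{
    const int doubled = runtimeValue(21); // const: read-only, but computed at run time
    int grid[kCells] = {};                // OK: kCells is a constant expression
    // int bad[doubled];                  // not standard C++: doubled is not a constant expression
    grid[0] = doubled;
    return grid[0] + static_cast<int>(sizeof(grid) / sizeof(grid[0])); // 42 + 12
}
```

Here doubled is const but cannot be used as an array size, whereas kCells can.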

Array initialization

An array’s size must be a constant expression, so it must be known at compile time.

Some examples:

int arr[5] = {1, 2, 3, 4, 5}; // OK

constexpr size_t size = 10;
int arr[size]; // OK

You can omit the length of the array because it can be inferred from the number of elements in the braces at compile time.

int arr[] = {1, 2, 3, 4, 5}; // OK, size is 5

The variable-length array (VLA) is a feature introduced in C99. Some C++ compilers support it as an extension, but it is not standard C++.

size_t size = 10;
int arr[size]; // not standard C++: size is not a constant expression (some compilers accept it as an extension)

Accessing array elements

You can access array elements using the index operator [].
Array indexing starts at 0 in C++.
- For example, the first element is arr[0], the second is arr[1], and so on.
The index must be an integer expression.

int arr[5] = {1, 2, 3, 4, 5};
int a = arr[0]; // a is 1
int b = arr[4]; // b is 5

int i = 3;
int c = arr[i]; // c is 4

Bound checking

In C++, array bounds are not checked at runtime.

If you access an element outside the bounds of an array, the program will have undefined behavior.

int arr[5] = {1, 2, 3, 4, 5};
int a = arr[5]; // undefined behavior
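
Because nothing checks the index for you, one option is to check it yourself before indexing. The following is a minimal sketch under that assumption; elementOrFallback is a hypothetical helper, not a standard function.

```cpp
#include <cstddef>

// Hypothetical helper: returns arr[index] when the index is valid, otherwise `fallback`.
int elementOrFallback(std::size_t index, int fallback)
{
    int arr[5] = {1, 2, 3, 4, 5};
    constexpr std::size_t size = sizeof(arr) / sizeof(arr[0]); // number of elements: 5
    if (index >= size) {
        return fallback; // never evaluate arr[index] when index is out of bounds
    }
    return arr[index];
}
```

With this check, elementOrFallback(5, -1) returns -1 instead of reading past the end of the array.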

Multidimensional arrays

Multidimensional arrays are arrays of arrays.

The first dimension is the row, and the second dimension is the column.
C++ stores multidimensional arrays in row-major order.
- The elements are stored in a contiguous block of memory, row by row.

int arr[2][3] = 
{
    {1, 2, 3}, 
    {4, 5, 6}
};

You can access elements in a multidimensional array using multiple index operators.

int a = arr[0][1]; // a is 2

C++ requires all dimensions except the first to be specified.
- The compiler needs these sizes to compute where each element is stored.

int arr[][3] = {{1, 2, 3}, {4, 5, 6}}; // OK
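
The same rule applies when a two-dimensional array is passed to a function: the column count is part of the parameter type. The sketch below assumes a fixed column count of 3; kColumns, sumMatrix and flatIndex are hypothetical names.

```cpp
#include <cstddef>

constexpr std::size_t kColumns = 3; // hypothetical constant: the fixed column count

// The column count must appear in the parameter type; the row count is passed separately.
int sumMatrix(int m[][kColumns], std::size_t rows)
{
    int total = 0;
    for (std::size_t r = 0; r < rows; r++) {
        for (std::size_t c = 0; c < kColumns; c++) {
            total += m[r][c];
        }
    }
    return total;
}

// Position of m[r][c] inside the contiguous block, counted in elements (row-major order).
std::size_t flatIndex(std::size_t r, std::size_t c)
{
    return r * kColumns + c;
}
```

For the 2x3 array {{1, 2, 3}, {4, 5, 6}}, sumMatrix returns 21, and flatIndex(1, 0) is 3 because the second row starts right after the three elements of the first row.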

Array as function parameter

When you pass an array to a function, the function receives a pointer to the array’s first element.

void printArray(int arr[], size_t size)
{
    for (size_t i = 0; i < size; i++) {
        cout << arr[i] << " ";
    }
}

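As a parameter, int arr[] is equivalent to int *arr: it holds the address of the first element.
- We will talk about pointers later.
size_t size is the number of elements in the array. The function cannot recover it from arr, so it must be passed separately.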
If you change the elements in the function, the changes will be reflected in the original array.
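
A minimal sketch of that last point, using a hypothetical function doubleAll that modifies the caller's array in place:

```cpp
#include <cstddef>

// Hypothetical example: doubles every element in place.
// arr refers to the caller's array, so the caller sees the changes.
void doubleAll(int arr[], std::size_t size)
{
    for (std::size_t i = 0; i < size; i++) {
        arr[i] *= 2;
    }
}
```

After int values[3] = {1, 2, 3}; and doubleAll(values, 3);, the array values holds 2, 4, 6.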

Range-based for loop

C++11 introduced a range-based for loop to iterate over arrays.

int arr[5] = {1, 2, 3, 4, 5};

for (int elem : arr) {
    elem *= 2;
}

for (int elem : arr) {
    cout << elem << " ";
}
// 1 2 3 4 5

int elem is the element in the array.
arr is the array.
The loop variable elem is a copy of the array element.
- If you want to modify the elements, you can use int &elem instead of int elem.
- If you don’t want to modify the elements, you can use const int &elem.
int &elem is a reference to the element in the array, we will talk about references later.
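
The difference between a copy and a reference can be sketched with two small functions. Both names, copyTotal and doubledTotal, are hypothetical.

```cpp
// Loop variable is a copy: the array itself is not changed.
int copyTotal()
{
    int arr[5] = {1, 2, 3, 4, 5};
    for (int elem : arr) {
        elem *= 2; // modifies only the copy
    }
    int total = 0;
    for (const int &elem : arr) {
        total += elem;
    }
    return total; // 1 + 2 + 3 + 4 + 5 = 15
}

// Loop variable is a reference: the array elements are doubled.
int doubledTotal()
{
    int arr[5] = {1, 2, 3, 4, 5};
    for (int &elem : arr) {
        elem *= 2; // modifies the element in the array
    }
    int total = 0;
    for (const int &elem : arr) {
        total += elem;
    }
    return total; // 2 + 4 + 6 + 8 + 10 = 30
}
```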

Vector

Vector definition

Vector is a sequence container that represents a dynamic array.

The elements are stored in contiguous memory locations.
The elements are of the same type.
The size of the vector can be changed at runtime.

Note

vector is declared in the <vector> header and lives in the std namespace. As a result, you need to include the header and refer to it as std::vector.

For simplicity, the snippets in these lecture notes assume that you have written using namespace std;.

Initialization:

vector<int> vec1; // empty vector
vector<int> vec2(5); // 5 elements, all initialized to 0

vector<int> vec3(5, 2); // 5 elements, initialized with 2
vector<int> vec4{5, 2}; // 2 elements, initialized with 5 and 2

vector<int> vec5 = {1, 2, 3, 4, 5}; // 5 elements, initialized with 1, 2, 3, 4, 5
vector<int> vec6 {1, 2, 3, 4, 5}; // 5 elements, initialized with 1, 2, 3, 4, 5
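
The difference between parentheses and braces is easy to miss, so here is a small sketch that makes it observable. The function names sizeWithParens and sizeWithBraces are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Parentheses: (count, value) gives 5 elements, each equal to 2.
std::size_t sizeWithParens()
{
    std::vector<int> vec(5, 2);
    return vec.size();
}

// Braces: a list of elements, so the vector holds the two values 5 and 2.
std::size_t sizeWithBraces()
{
    std::vector<int> vec{5, 2};
    return vec.size();
}
```

sizeWithParens() returns 5 and sizeWithBraces() returns 2, even though the two declarations differ only in the kind of bracket.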

Accessing vector elements

You can access vector elements using the index operator [].
Vector indexing starts at 0.
The index must be an integer expression.

vector<int> vec = {1, 2, 3, 4, 5};
int a = vec[0]; // a is 1
int b = vec[4]; // b is 5

int i = 3;
int c = vec[i]; // c is 4

Use .front() and .back() to get the first and last element of the vector.

int a = vec.front(); // a is 1
int b = vec.back(); // b is 5

Inserting and removing elements

Fast operations: insertion and removal of elements at the end.

.push_back(): add an element to the end of the vector.

vector<int> vec = {1, 2, 3, 4, 5};
vec.push_back(6); // vec is {1, 2, 3, 4, 5, 6}

.pop_back(): remove the last element of the vector.

vector<int> vec = {1, 2, 3, 4, 5};
vec.pop_back(); // vec is {1, 2, 3, 4}
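
A common pattern is to start with an empty vector and grow it with push_back() inside a loop. The sketch below assumes a hypothetical function firstSquares that collects the first n square numbers.

```cpp
#include <vector>

// Hypothetical example: returns {1, 4, 9, ...} with n elements.
std::vector<int> firstSquares(int n)
{
    std::vector<int> squares; // starts empty
    for (int i = 1; i <= n; i++) {
        squares.push_back(i * i); // the vector grows by one element each time
    }
    return squares;
}
```

For example, firstSquares(4) returns a vector holding 1, 4, 9, 16.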

Useful methods

.size(): get the number of elements in the vector.
.empty(): check if the vector is empty.
.clear(): remove all elements from the vector.

vector<int> vec = {1, 2, 3, 4, 5};
cout << vec.size() << endl; // 5
cout << vec.empty() << endl; // 0 (false)
vec.clear();
cout << vec.empty() << endl; // 1 (true)
cout << vec.size() << endl; // 0
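
These methods combine naturally with .back() and .pop_back(). The following sketch uses a hypothetical function drainAndSum that removes elements from the end until the vector is empty; checking .empty() first matters because calling .back() or .pop_back() on an empty vector is undefined behavior.

```cpp
#include <vector>

// Hypothetical example: sums the elements while removing them from the back.
// vec is passed by value, so the caller's vector is left untouched.
int drainAndSum(std::vector<int> vec)
{
    int total = 0;
    while (!vec.empty()) {  // stop before calling back() on an empty vector
        total += vec.back();
        vec.pop_back();
    }
    return total;
}
```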

Copies are deep

When you assign a vector to another vector, a deep copy is made.

vector<int> vec1 = {1, 2, 3, 4, 5};
vector<int> vec2 = vec1; // deep copy
vec2[0] = 10;
cout << vec1[0] << endl; // 1
cout << vec2[0] << endl; // 10

However, you cannot use = to copy an array.

int arr1[5] = {1, 2, 3, 4, 5};
int arr2[5];
arr2 = arr1; // error
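
To copy an array you have to copy its elements one by one. A minimal sketch with a plain loop follows; copyArray is a hypothetical helper, and it assumes that both arrays have at least size elements.

```cpp
#include <cstddef>

// Hypothetical helper: copies `size` elements from src to dest.
// Assumes both arrays hold at least `size` elements.
void copyArray(const int src[], int dest[], std::size_t size)
{
    for (std::size_t i = 0; i < size; i++) {
        dest[i] = src[i];
    }
}
```

After copyArray(arr1, arr2, 5), changing arr2[0] does not affect arr1[0], because the two arrays occupy separate memory.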

Vector as function parameter

When a vector is passed to a function by value, a copy of the vector is created.

There is no need to pass the size of the vector; you can use .size() to get it.
The function works on this new copy.
Any changes made to the vector in the function do not affect the original vector.

void modifyVector(vector<int> vec)
{
    for (size_t i = 0; i < vec.size(); i++) {
        vec[i] *= 2;
    }
}

vector<int> vec = {1, 2, 3, 4, 5};
modifyVector(vec);
cout << vec[0] << endl; // 1

You can use pass-by-reference (add & in the parameter list) to modify the original vector.

void modifyVector(vector<int>& vec)
{
    vec[0] = 10;
}

vector<int> vec = {1, 2, 3, 4, 5};
modifyVector(vec);
cout << vec[0] << endl; // 10

In practice, we prefer to use pass-by-reference to avoid copying the vector. Copying a large vector can be expensive.

If you don’t want to modify the vector and still want to avoid copying, you can pass it by const reference.

void printVector(const vector<int>& vec)
{
    for (const auto elem : vec) {
        cout << elem << " ";
    }
}
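
The same pattern works for any function that only reads the vector. As a sketch, the hypothetical function average below takes its argument by const reference; returning 0.0 for an empty vector is an assumption made here to avoid dividing by zero.

```cpp
#include <vector>

// Hypothetical example: reads the vector without copying or modifying it.
double average(const std::vector<double>& values)
{
    if (values.empty()) {
        return 0.0; // assumption: an empty vector averages to 0
    }
    double total = 0.0;
    for (const double value : values) {
        total += value;
    }
    return total / static_cast<double>(values.size());
}
```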

Bound checking

The index operator [] does not perform bound checking.

vector<int> vec = {1, 2, 3, 4, 5};
int a = vec[10]; // undefined behavior

Alternatively, you can use .at() to get the element at a specific position. It throws an out_of_range exception if the index is out of bounds.

vector<int> vec = {1, 2, 3, 4, 5};
int a = vec.at(10); // throws an out_of_range exception

.at() performs bounds checking, so it can be slightly slower than the index operator [].
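
The exception thrown by .at() can be caught and handled. The sketch below assumes a hypothetical helper atOrFallback that returns a fallback value instead of letting the exception propagate; std::out_of_range is declared in the <stdexcept> header.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns vec.at(index), or `fallback` if the index is out of bounds.
int atOrFallback(const std::vector<int>& vec, std::size_t index, int fallback)
{
    try {
        return vec.at(index); // throws std::out_of_range for an invalid index
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```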

Multidimensional vectors

Multidimensional vectors are vectors of vectors.

vector<vector<int>> vec = 
{
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9}
};

cout << vec[1][2] << endl; // 6

Alternatively, you can create a 3x4 vector with every element initialized to 1.

vector<vector<int>> vec(3, vector<int>(4, 1)); // 3x4 vector, initialized with 1

`using` directive

We know that the using directive (using namespace std;) brings the names of a namespace into the current scope.

The using keyword has another use: since C++11, it can create an alias for a type.

For example, we can create an alias for std::vector<std::vector<int>>.

using Matrix = vector<vector<int>>;
// after this, Matrix is a synonym for vector<vector<int>>
Matrix mat(3, vector<int>(4, 1)); // 3x4 matrix, initialized with 1
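
An alias also makes function signatures easier to read. As a sketch, the hypothetical function transpose below takes and returns a Matrix; it assumes that the input is rectangular, that is, every row has the same length as the first one.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical example: swaps rows and columns.
// Assumes every row of m has the same number of elements as m[0].
Matrix transpose(const Matrix& m)
{
    if (m.empty()) {
        return Matrix{};
    }
    const std::size_t rows = m.size();
    const std::size_t cols = m[0].size();
    Matrix result(cols, std::vector<int>(rows, 0)); // cols x rows, filled with 0
    for (std::size_t r = 0; r < rows; r++) {
        for (std::size_t c = 0; c < cols; c++) {
            result[c][r] = m[r][c];
        }
    }
    return result;
}
```

Transposing the 2x3 matrix {{1, 2, 3}, {4, 5, 6}} gives a 3x2 matrix whose rows are {1, 4}, {2, 5} and {3, 6}.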

String

C-style strings

A C-style string is an array of characters terminated by a null character \0.
It can be declared using the following syntax:

char str[16] = {'H', 'e', 'l', 'l', 'o', '\0'};
char badStr[5] = {'H', 'e', 'l', 'l', 'o'}; // no null terminator
char goodStr[20] = {'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', '\0'};

You can use the strlen() function from the <cstring> header to get the length of a C-style string. It returns the number of characters before the first null terminator.

cout << strlen(goodStr) << endl; // 11

char str2[15] = {'H', 'e', 'l', 'l', 'o', '\0', 'W', 'o', 'r', 'l', 'd'};
cout << strlen(str2) << endl; // 5

Be careful that the length of the string and the length of the array are different.

The length of the string is the number of characters in the string excluding the null terminator.
The length of the array is the number of elements in the array.
- You can use the sizeof operator to get the size of the array in bytes, which for a char array equals the number of elements.

char str2[15] = {'H', 'e', 'l', 'l', 'o', '\0', 'W', 'o', 'r', 'l', 'd'};
std::cout << "str2: " << str2 << std::endl; // Hello
std::cout << "str2 size: " << strlen(str2) << std::endl; // 5
std::cout << "array size: " << sizeof(str2) << std::endl; // 15

You can also use (narrow) string literals to initialize a C-style string.

char str[] = "Hello"; // str is {'H', 'e', 'l', 'l', 'o', '\0'}
char str2[] = "Hello \"World\""; // you must use escape character for "

The null terminator is automatically added to the end of the string literal.

You can use wide string literals to represent wide characters.

wchar_t wide_str[] = L"Hello, 世界! π ≈ 3.14159";

// set the locale to UTF-8
std::wcout.imbue(std::locale("en_US.UTF-8"));
// use wcout to print wide string
std::wcout << wide_str << std::endl;

String manipulation

Copy: strcpy(char* dest, const char* src)
- Copies the C-string pointed to by src into the array pointed to by dest
Concatenate: strcat(char* dest, const char* src)
- Appends a copy of the C-string pointed to by src to the end of the C-string pointed to by dest
Compare: strcmp(const char* str1, const char* str2)
- Compares the C-string pointed to by str1 to the C-string pointed to by str2

Length-limited variants, which are safer when the size of the destination is known:

Copy: strncpy(char* dest, const char* src, size_t count)
- Copies at most count characters of the C-string pointed to by src into the array pointed to by dest; if src has count or more characters, dest is not null-terminated
Concatenate: strncat(char* dest, const char* src, size_t count)
- Appends at most count characters of the C-string pointed to by src to the end of the C-string pointed to by dest, followed by a null terminator
Compare: strncmp(const char* str1, const char* str2, size_t count)
- Compares at most the first count characters of the C-string pointed to by str1 and the C-string pointed to by str2
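
strncpy needs some care because it may leave the destination without a null terminator. One way to handle this, sketched below with a hypothetical helper copyTruncated, is to copy one character less than the destination size and write the terminator explicitly. The helper assumes that destSize is the full size of the destination array and is at least 1.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest, truncating if needed, and always null-terminates.
// Assumes destSize is the full size of the dest array and is at least 1.
void copyTruncated(char* dest, std::size_t destSize, const char* src)
{
    std::strncpy(dest, src, destSize - 1); // copies at most destSize - 1 characters
    dest[destSize - 1] = '\0';             // strncpy adds no terminator when src is too long
}
```

With char small[8];, the call copyTruncated(small, sizeof(small), "Hello, World") leaves the 7-character string "Hello, " in small, which strcmp() confirms by returning 0 when compared with that literal.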

String class

string is a class declared in the <string> header, and it’s in the std namespace.
As a result, you need to include the header and refer to it as std::string.

The string class provides many useful methods for string manipulation.

std::string str1 = "Hello";
std::string str2 = "World";
std::string str3 = str1 + " " + str2; // "Hello World"

std::cout << "str3: " << str3 << std::endl;
std::cout << "Length of str3: " << str3.length() << std::endl; // 11
std::cout << "str1 == str2: " << (str1 == str2) << std::endl; // 0 (false)

// Additional string operations
std::cout << "First character of str1: " << str1[0] << std::endl;
std::cout << "Substring of str3: " << str3.substr(0, 5) << std::endl;
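
The same operations can be wrapped in small functions. The sketch below uses two hypothetical helpers, greeting and firstWord; besides the operations shown above it relies on std::string::find, which returns std::string::npos when the character is not found.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example: builds a new string with operator+.
std::string greeting(const std::string& name)
{
    return "Hello, " + name + "!";
}

// Hypothetical example: returns the text before the first space,
// or the whole text if it contains no space.
std::string firstWord(const std::string& text)
{
    const std::size_t space = text.find(' '); // npos when there is no space
    return text.substr(0, space);             // a count of npos means "up to the end"
}
```

For example, greeting("World") returns "Hello, World!" and firstWord("Hello World") returns "Hello".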

Wide strings

wstring is a string class for wide characters.

std::wstring wstr = L"Hello 世界";
std::wcout.imbue(std::locale("en_US.UTF-8"));
std::wcout << wstr << std::endl;

Other types of strings:

std::u8string is a string class for UTF-8 encoding. (C++20)
std::u16string is a string class for UTF-16 encoding. (C++11)
std::u32string is a string class for UTF-32 encoding. (C++11)
