Data types
Bit and byte
A bit (binary digit) is the smallest unit of information in a computer.
- It is either 0 or 1.
- Bits are used to represent the most basic form of data, such as the on/off state of an electrical signal in a computer’s hardware.
A byte is a unit of data composed of 8 bits.
- Each byte can represent 256 different values (2^8).
- Bytes are used to represent a wide range of data types, characters, and symbols.
- Bytes are also the basic storage unit for files, memory, and network communication.
Integer types
- int is the most frequently used integer type, normally 4 bytes.
- unsigned int is an unsigned integer type; with 4 bytes it can represent non-negative numbers from 0 to 2^32 - 1.
The traditional ways to initialize an integer variable:
int a; // not recommended: a is left uninitialized
a = 0; // you should never forget to initialize a variable
int b = 0; // init b with 0
int c(0); // init c with 0
Since C++11, you can use uniform initialization for all variables:
int a {}; // init a with 0, also called zero initialization
int b {0}; // init b with 0
Signed and unsigned
- Both unsigned int and signed int are 4 bytes and can represent 2^32 numbers.
- unsigned int can represent values from 0 to 2^32 - 1.
- signed int can represent values from -2^31 to 2^31 - 1.
- The first bit is used to represent the sign of the number.
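The ranges above can be checked with std::numeric_limits from the <limits> header. This is a minimal sketch that assumes int is 4 bytes on your platform; the helper names are made up for illustration.

```cpp
#include <limits>

// Hypothetical helpers that report the range limits of int and unsigned int.
// The results are widened to long long so they can be compared without overflow.
constexpr long long signed_min() { return std::numeric_limits<int>::min(); }
constexpr long long signed_max() { return std::numeric_limits<int>::max(); }
constexpr unsigned long long unsigned_max() {
    return std::numeric_limits<unsigned int>::max();
}
```

On a platform with a 4-byte int, these return -2^31, 2^31 - 1 and 2^32 - 1 respectively.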
Overflow
Here’s an example of overflow:
#include <iostream>
int main() {
int a = 56789;
int result = a * a;
std::cout << "Result: " << result << std::endl;
return 0;
}
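The answer should be 3224990521, but the printed result is typically -1069976775.
56789 * 56789 > 2^31 - 1, so the multiplication overflows int. Signed integer overflow is undefined behavior in C++, so the exact result is not guaranteed.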
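One common way to avoid this overflow is to do the multiplication in a wider type. The sketch below assumes long long is wider than int, which holds on mainstream platforms; the function names are hypothetical.

```cpp
#include <limits>

// Multiply in a wider type so the product cannot overflow int.
// Assumes long long is wider than int (true on mainstream platforms).
long long square_wide(int a) {
    return static_cast<long long>(a) * a;
}

// Report whether a * a would fit into an int, without ever overflowing.
bool square_fits_in_int(int a) {
    return square_wide(a) <= std::numeric_limits<int>::max();
}
```

With a 4-byte int, square_wide(56789) gives the mathematically correct product, and square_fits_in_int(56789) reports that the result does not fit into int.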
More integer types
- short int for shorter integers
- long int for longer integers
- long long for even longer integers
The C++ standard specifies a minimum size for each type; the actual size can differ between compilers and systems. See here.
See also the Microsoft official documentation for the sizes of types.
sizeof operator
sizeof is an operator that returns the size of a type or a variable in bytes.
- Not a function, can be used with types and variables.
- The sizeof operator is evaluated at compile time.
int i = 0;
short s = 0;
cout << "sizeof(int)=" << sizeof(int) << endl;
cout << "sizeof(i)=" << sizeof(i) << endl;
cout << "sizeof(short)=" << sizeof(s) << endl;
cout << "sizeof(long)=" << sizeof(long) << endl;
cout << "sizeof(size_t)=" << sizeof(size_t) << endl;
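Because sizeof is evaluated at compile time, it can also be used in static_assert and in constexpr functions. The following sketch illustrates this; bit_count is a hypothetical helper, not a standard function.

```cpp
#include <climits>  // CHAR_BIT: number of bits in a byte
#include <cstddef>  // std::size_t

// Hypothetical helper: number of bits occupied by a type T.
template <typename T>
constexpr std::size_t bit_count() {
    return sizeof(T) * CHAR_BIT;
}

// sizeof is a compile-time constant, so it can appear in static_assert.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(short) <= sizeof(int), "short is never larger than int");
```

If one of the static_assert conditions were false, the program would fail to compile instead of failing at run time.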
Character types
- char is a single byte, which can represent a character. It is in fact an 8-bit integer.
- signed char is a signed 8-bit integer.
- unsigned char is an unsigned 8-bit integer.
- char is either signed or unsigned (depending on the compiler and system).
How to represent a character?
char is an 8-bit integer, so it can represent 2^8 = 256 different values. The ASCII table assigns characters to the numbers from 0 to 127; the values from 128 to 255 depend on the extended encoding in use. See here.
char c1 = 'C'; // the character C (ASCII code 67)
char c2 = 80; // decimal 80, the character P
char c3 = 0x50; // hexadecimal 0x50 = 80, also the character P
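Since a char is just a small integer, arithmetic on characters works directly. The sketch below assumes an ASCII-compatible character set; both function names are made up for this example.

```cpp
// Convert a lowercase ASCII letter to uppercase.
// Any other character is returned unchanged.
constexpr char to_upper_ascii(char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}

// Numeric code of a character.
constexpr int code_of(char c) {
    return static_cast<int>(c);
}
```

For example, code_of('P') is 80 under ASCII, which matches the c2 and c3 variables above.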
bool
A Boolean type that can have one of two values: true or false.
- true is represented by 1, false is represented by 0.
- bool is an integral type and converts to int implicitly.
- Boolean width: 8 bits instead of 1 (typically 1 byte).
- Any non-zero value is considered true, zero is considered false.
bool b1 = true;
int i = b1; // value of b1 is 1
bool b2 = -256; // value of b2 is 1, not recommended
bool b = (-256 != 0); // better choice
bool b3 = 0; // value of b3 is 0
Choose appropriate integer types
Consider an image with a resolution of 4226 x 2847. char is widely used for pixel values, with one byte per color channel. See RGB model.
The final memory usage is 4226 * 2847 * 3 bytes = 36,094,266 bytes, which is about 36 MB.
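The memory estimate can be written as a small function, which also shows what a wider pixel type would cost. This is only a sketch; image_bytes and its parameters are hypothetical names.

```cpp
#include <cstddef>

// Memory needed for an uncompressed image.
// bytes_per_channel is 1 for char-sized values and 4 for a typical int.
constexpr std::size_t image_bytes(std::size_t width, std::size_t height,
                                  std::size_t channels,
                                  std::size_t bytes_per_channel) {
    return width * height * channels * bytes_per_channel;
}
```

image_bytes(4226, 2847, 3, 1) is 36,094,266 bytes, while storing each channel in a 4-byte integer would need four times as much memory.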
Byte
Since C++17, std::byte is defined in the <cstddef> header.
- Using char makes you think you are dealing with characters
- Using std::byte makes you think you are dealing with bytes
Fixed width integer types
C++11 introduces fixed width integer types, which are guaranteed to be the same size across different compilers and systems. See here.
Types:
- int8_t for 8-bit integer
- int16_t for 16-bit integer
- int32_t for 32-bit integer
- int64_t for 64-bit integer
- uint8_t for 8-bit unsigned integer
- …
Some useful macros:
- INT8_MIN, INT8_MAX
- INT16_MIN, INT16_MAX
- INT32_MIN, INT32_MAX
- INT64_MIN, INT64_MAX
- UINT8_MAX (unsigned types have no MIN macro, because the minimum is always 0)
- …
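The fixed width types and their limit macros live in the <cstdint> header. The sketch below shows how an 8-bit unsigned value wraps around; wrap_add is a hypothetical helper written for this illustration.

```cpp
#include <cstdint>

// Add two 8-bit unsigned values. The sum is computed as int and then
// narrowed back to 8 bits, so the result wraps around modulo 256.
constexpr std::uint8_t wrap_add(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

static_assert(INT32_MAX == 2147483647, "int32_t always has the same range");
```

Unlike signed overflow, converting an out-of-range value to an unsigned type is well defined: wrap_add(250, 10) yields 4.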
size_t
Computer memory keeps increasing
- A 32-bit int was enough in the past to hold data lengths
- But now it is not.
- size_t is an unsigned integer type
- size_t is defined in the <cstddef> header.
- size_t is usually 32-bit or 64-bit, depending on the system.
Floating point types
- float for single precision floating point numbers, 32 bits
- double for double precision floating point numbers, 64 bits
- long double for extended precision floating point numbers, typically 80 bits on x86 (the size depends on the platform)
C++23 introduces fixed width floating point types, which are guaranteed to be the same size across different compilers and systems. (Support for these types is not widespread yet.)
- float16_t for 16-bit floating point numbers
- float32_t for 32-bit floating point numbers
- …
Range and accuracy
An example
// float.cpp
#include <iostream>
#include <iomanip>
using namespace std;
int main(){
float f1 = 1.2f;
float f2 = f1 * 1000000000000000; // 1.2e15
cout << std::fixed << std::setprecision(15) << f1 << endl;
cout << std::fixed << std::setprecision(1) << f2 << endl;
return 0;
}
- How many numbers in range [0, 1]?
- How many numbers can 32 bits represent?
- You want 1.2, but float can only represent 1.200000047683716.
Floating-point representation
float bit layout: 1 sign bit (b_31), 8 exponent bits (b_30 to b_23), and 23 fraction bits (b_22 to b_0).
The value of float is calculated as follows:
\[ (-1)^{b_{31}} \times 2^{\left(b_{30} b_{29} \ldots b_{23}\right)_2-127} \times\left(1 . b_{22} b_{21} \ldots b_0\right)_2 \]
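The three fields in the formula can be extracted from a real float by copying its bit pattern into an integer. This is a minimal sketch that assumes float uses the 32-bit IEEE 754 format; FloatParts and decompose are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

// The three bit fields of a 32-bit float (assumes IEEE 754 binary32).
struct FloatParts {
    std::uint32_t sign;      // b31
    std::uint32_t exponent;  // b30..b23, stored with a bias of 127
    std::uint32_t fraction;  // b22..b0
};

FloatParts decompose(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // copy the raw bit pattern
    return FloatParts{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

For -2.5f, which is -1.25 x 2^1, the sign is 1, the stored exponent is 1 + 127 = 128, and the fraction field encodes the binary digits .01 of 1.01.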
Floating-point vs. integer
- Represent values between integers
- A much greater range of values
- Floating-point operations can be slower than integer operations
  - double can be even slower than float
- Lose precision
Precision
Will f2 be greater than f1?
// precision.cpp
float f1 = 23400000000;
float f2 = f1 + 10; // but f2 == f1
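The reason f2 equals f1 is that neighboring float values near 2.34e10 are 2048 apart, so adding 10 rounds back to the same value. The sketch below measures that gap with std::nextafter; it assumes IEEE 754 single precision, and the function names are made up.

```cpp
#include <cmath>
#include <limits>

// Gap between x and the next representable float above it.
float gap_above(float x) {
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}

// True when adding delta to x leaves x unchanged after rounding to float.
bool addition_is_lost(float x, float delta) {
    const float sum = x + delta;
    return sum == x;
}
```

The gap grows with the magnitude of the value: near 1 it equals the machine epsilon, while near 2.34e10 it is 2^11 = 2048.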
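Can we use the == operator to compare two floating point numbers?
if (f1 == f2) // bad
if (fabs(f1 - f2) < numeric_limits<float>::epsilon()) // good
numeric_limits<float>::epsilon() is the machine epsilon for float, that is, the gap between 1 and the next representable value. A fixed tolerance of this size therefore works best for values close to 1.
You can change float to double to get the machine epsilon for double.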
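For values far from 1, a tolerance that scales with the operands is often more useful than a fixed epsilon. The following is one possible sketch, not a universal recipe; nearly_equal and the default factor of 4 epsilons are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Compare two floats with a tolerance that scales with their magnitude.
// The default factor of 4 epsilons is an arbitrary choice for illustration.
bool nearly_equal(float a, float b,
                  float rel_tol = 4 * std::numeric_limits<float>::epsilon()) {
    const float diff = std::fabs(a - b);
    const float scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= rel_tol * scale;
}
```

A purely relative tolerance becomes very strict when both values are close to zero, so real code often combines it with a small absolute tolerance.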
static_cast
static_cast is a cast operator that converts a value from one type to another type.
double d = static_cast<double>(f1) + 10;
- When you use f1 + 10 without static_cast, it will first do a float addition, then implicitly convert the result to double.
- static_cast forces f1 to be converted to double before the addition such that the result is a double.
- static_cast is also used for other types, like int to char, float to int etc.
- It’s a compile-time cast.
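The difference between the two orders of conversion can be observed with the value of f1 from the precision example. This sketch assumes IEEE 754 arithmetic in which float expressions are evaluated in single precision; the function names are hypothetical.

```cpp
// The addition is done in float; the result is converted to double afterwards.
double add_ten_as_float(float f) {
    const float sum = f + 10;  // the 10 is lost if f is large enough
    return sum;
}

// f is converted to double first, so the addition is done in double.
double add_ten_as_double(float f) {
    return static_cast<double>(f) + 10;
}
```

For f1 = 23400000000, add_ten_as_float returns the value of f1 unchanged, while add_ten_as_double returns a value that is exactly 10 larger.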
Literal
A literal is a program element that directly represents a value.
Integer literal
- Decimal: 123
- Octal: 0123
- Hexadecimal: 0x123
- Binary: 0b1010 (since C++14)
- unsigned: 123u or 123U
- long: 123l or 123L
- long long: 123ll or 123LL
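The notations in the list can be verified at compile time. The sketch below uses static_assert and decltype; the constant names are made up, and the binary literal requires C++14 or later.

```cpp
#include <type_traits>

// The prefix selects the base of the literal.
static_assert(0123 == 83, "octal 123 is 1*64 + 2*8 + 3");
static_assert(0x123 == 291, "hexadecimal 123 is 1*256 + 2*16 + 3");
static_assert(0b1010 == 10, "binary 1010 is 8 + 2");

// The suffix selects the type of the literal.
static_assert(std::is_same<decltype(123), int>::value, "no suffix: int");
static_assert(std::is_same<decltype(123u), unsigned int>::value, "u: unsigned int");
static_assert(std::is_same<decltype(123L), long>::value, "L: long");
static_assert(std::is_same<decltype(123LL), long long>::value, "LL: long long");

// Hypothetical constants: a leading 0 changes the value of a literal.
constexpr int octal_value = 0123;
constexpr int hex_value = 0x123;
```

Note the leading zero in 0123: it silently turns the literal into an octal number, which is a common source of bugs.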
Floating point literal
- float: 1.23f or 1.23F
- double: 1.23
- long double: 1.23l or 1.23L
- Exponential: 1.23e-4 or 1.23E-4, which is \(1.23 \times 10^{-4}\).
inf and nan
IEEE 754 floating point numbers can represent positive or negative infinity, and NaN (not a number).
- ±inf: infinity (Exponent=11111111, fraction=0)
- nan: not a number (Exponent=11111111, fraction!=0)
- \(1.0/0.0 =\) inf
- \(\log(0) =\) -inf
- \(\sqrt{-1} =\) nan
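The <cmath> header provides std::isinf and std::isnan to detect these special values. This is a minimal sketch assuming IEEE 754 arithmetic; Kind and classify are hypothetical names.

```cpp
#include <cmath>

enum class Kind { Finite, PosInf, NegInf, NaN };

// Classify a floating point value (assumes IEEE 754 arithmetic).
Kind classify(double x) {
    if (std::isnan(x)) return Kind::NaN;
    if (std::isinf(x)) return x > 0 ? Kind::PosInf : Kind::NegInf;
    return Kind::Finite;
}
```

A NaN never compares equal to anything, including itself, so x != x is true only for NaN. This is why a dedicated check such as std::isnan is needed.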
Arithmetic operators
Operator | Description | Example |
---|---|---|
+ | Addition | a + b |
- | Subtraction | a - b |
* | Multiplication | a * b |
/ | Division | a / b |
% | Modulus (remainder) | a % b |
++ | Increment | a++ or ++a |
-- | Decrement | a-- or --a |
+ | Unary plus | +a |
- | Unary minus | -a |
Operator precedence, from highest to lowest:
- a++, a--
- ++a, --a, +a, -a
- *, /, %
- +, -
- You can refer to this table.
- If you are not sure about the precedence, use parentheses!
For more details, see cppreference.
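A short sketch of how precedence and integer arithmetic behave in practice. The function names are made up for this example.

```cpp
// * binds tighter than +, so this is 2 + (3 * 4).
constexpr int without_parentheses() { return 2 + 3 * 4; }

// Parentheses override the default precedence.
constexpr int with_parentheses() { return (2 + 3) * 4; }

// Integer division truncates toward zero,
// and the remainder takes the sign of the dividend.
constexpr int int_quotient(int a, int b) { return a / b; }
constexpr int int_remainder(int a, int b) { return a % b; }
```

The first two functions return 14 and 20. For negative operands, -7 / 2 is -3 and -7 % 2 is -1, so (a / b) * b + a % b always equals a.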
Assignment operators
Operator | Description | Example |
---|---|---|
= | Assignment | a = b |
+= | Addition assignment | a += b |
-= | Subtraction assignment | a -= b |
*= | Multiplication assignment | a *= b |
/= | Division assignment | a /= b |
%= | Modulus assignment (remainder) | a %= b |
Increment and decrement operators:
- a++ and a-- are post-increment and post-decrement operators.
- ++a and --a are pre-increment and pre-decrement operators.
// not recommended
int a = 0;
int b = a++; // b = 0, a = 1
int c = ++a; // c = 2, a = 2
// recommended
int a = 0;
int b = a;
a++; // or a = a + 1;
...
Implicit type conversion
Implicit type conversion is done by the compiler, without an explicit request from the programmer.
- Widening conversion:
  - Any integer type except long long to double.
  - bool and char to any other built-in type.
  - short to int, int to long, long to long long.
  - float to double.
- Narrowing conversion:
  - Any floating point type to integer type.
  - Any integer type to bool.
  - …
Example:
// Implicit type conversion examples
// Promotion (widening conversion)
int i = 42;
double d = i; // int promoted to double
std::cout << "Promotion: int to double - " << d << std::endl;
char c = 'A';
int ascii = c; // char promoted to int
std::cout << "Promotion: char to int - " << ascii << std::endl;
// Coercion (narrowing conversion)
double pi = 3.14159;
int rounded = pi; // double coerced to int, fractional part lost
std::cout << "Coercion: double to int - " << rounded << std::endl;
int large = 1000;
char narrowed = large; // int coerced to char, possible data loss
std::cout << "Coercion: int to char - " << static_cast<int>(narrowed) << std::endl;
// Mixed-type arithmetic (promotion occurs)
int num = 5;
double result = num / 2; // int promoted to double after division
std::cout << "Mixed-type arithmetic: int/int as double - " << result << std::endl;
Signed - unsigned conversions
- A signed integer type and its unsigned counterpart always have the same bit width.
- When a signed - unsigned conversion happens,
- the bit pattern is the same,
- but the interpretation is different.
using namespace std;
unsigned short num = UINT16_MAX;
short num2 = num;
cout << "unsigned val = " << num << " signed val = " << num2 << endl;
// Prints: "unsigned val = 65535 signed val = -1"
// Go the other way.
num2 = -1;
num = num2;
cout << "unsigned val = " << num << " signed val = " << num2 << endl;
// Prints: "unsigned val = 65535 signed val = -1"
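The same reinterpretation happens silently when a signed and an unsigned value are compared, which is a frequent source of bugs. The sketch below shows one way to compare them by mathematical value; as_unsigned and safe_less are hypothetical helpers.

```cpp
// Convert a signed value to unsigned; negative values wrap around modulo 2^N.
constexpr unsigned int as_unsigned(int value) {
    return static_cast<unsigned int>(value);
}

// Compare a signed and an unsigned value by their mathematical values.
// A plain a < b would first convert a to unsigned, so -1 would look huge.
constexpr bool safe_less(int a, unsigned int b) {
    return a < 0 || static_cast<unsigned int>(a) < b;
}
```

as_unsigned(-1) is the largest unsigned int, so a direct comparison would claim that -1 is not less than 1u, whereas safe_less(-1, 1u) is true.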
Explicit type conversion
Explicit type conversion is done by the programmer, using the cast operator.
C-style cast:
(int) x; // old-style cast, old-style syntax
int(x); // old-style cast, functional syntax
The C-style cast is not recommended:
- It’s not type-safe.
- It’s not clear what the cast is doing.
C++ provides static_cast to cast between types.
- Syntax: static_cast<new_type>(expression)
- It’s a compile-time cast.
- The compiler reports an error if the cast is not possible.
double d = 1.58947;
int i = d; // warning C4244 possible loss of data
int j = static_cast<int>(d); // No warning.
string s = static_cast<string>(d); // error: cannot convert from double to std::string
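When a cast is not possible, or when truncation is not what you want, the standard library offers functions for the conversion instead. This is a small sketch using std::lround and std::to_string; the wrapper names are made up.

```cpp
#include <cmath>
#include <string>

// static_cast<int> truncates toward zero.
int truncate_to_int(double d) { return static_cast<int>(d); }

// std::lround rounds to the nearest integer instead.
long round_to_long(double d) { return std::lround(d); }

// A double cannot be cast to std::string;
// std::to_string performs the conversion.
std::string to_text(double d) { return std::to_string(d); }
```

For d = 1.58947, truncation gives 1 and rounding gives 2. The exact number of digits produced by std::to_string depends on the language standard in use.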
const type qualifier
- const is a type qualifier that specifies that the value of the variable cannot be changed.
- The variable must be initialized when declared.
- const can be applied to variables, parameters, and return types.
const int a = 10;
int b = a; // ok
a = 20; // error
const_cast
const_cast<type>(expression) can
- remove the const qualifier from a variable,
- add a const qualifier to a non-const variable.
Arithmetic conversions
Many binary operators cause implicit type conversions. The rules are:
- If either operand is of type long double, the other operand is converted to long double.
- Otherwise, if either operand is of type double, the other operand is converted to double.
- Otherwise, if either operand is of type float, the other operand is converted to float.
- Otherwise, if either operand is of type unsigned long / long / unsigned int, the other operand is converted to the most precise type that can hold both operands.
- Otherwise, both operands are converted to int. (On MSVC, it may be different.)
Example
Case 1: Addition
int a = 10;
double b = 20.5;
double c = a + b; // a is converted to double
Case 2: Integer division
int total = 7;
int count = 2;
// 'total' and 'count' are integers, integer division would occur
double average1 = total / count; // 3, wrong result
// To get a floating-point result, promote one operand to 'double'
double average2 = total / static_cast<double>(count); // 3.5, correct result
Case 3: Compound assignment
int i = 10;
double d = 3.5;
// 'i' is converted to 'double' before addition,
// then the result is assigned back to 'i' after truncation
i += d; // i is 13
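The result type of a mixed expression can be inspected at compile time with decltype, which is a handy way to test your understanding of the conversion rules. The sketch below checks a few combinations; the two average functions are hypothetical names that mirror Case 2.

```cpp
#include <type_traits>

// decltype reports the type of an expression without evaluating it.
static_assert(std::is_same<decltype(10 + 20.5), double>::value,
              "int + double is double");
static_assert(std::is_same<decltype(1.5f + 2), float>::value,
              "float + int is float");
static_assert(std::is_same<decltype(1.5f + 2.0), double>::value,
              "float + double is double");
static_assert(std::is_same<decltype('a' + 'b'), int>::value,
              "char + char is promoted to int");
static_assert(std::is_same<decltype(1 + 2u), unsigned int>::value,
              "int + unsigned int is unsigned int");

// Integer division happens before the conversion to double.
constexpr double average_truncated(int total, int count) {
    return total / count;
}

// Converting one operand first makes the division a double division.
constexpr double average_exact(int total, int count) {
    return total / static_cast<double>(count);
}
```

average_truncated(7, 2) returns 3.0 because the division is already finished when the conversion takes place, while average_exact(7, 2) returns 3.5.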
I’m lost in the type conversions
- In arithmetic operations, the compiler uses the higher-precision type as the result type.
- In assignment, the compiler converts the right-hand side to the type of the left-hand side.
auto keyword
The auto keyword lets the compiler deduce the type of the variable from the initializer.
auto a = 10; // a is int
auto b = 10.5; // b is double
auto c = 'a'; // c is char
auto d = "hello"; // d is const char*
Be careful:
auto a = 2; // a is int
a = 2.5; // a is still int and 2.5 is truncated to 2
Summary
C++ is a statically typed language: every variable has a type, and its type is determined at compile time.
After declaration, you can just use the variable, no need to write its type again.
// e.g.
int a = 10;
a = a + 10; // no need to write int before a and int before 10
C++ supposes the programmer is responsible for the correctness of the types.
You should always be aware of the types you are using.