CPU masks
Introduction
Cpumasks are a special way provided by the Linux kernel to store information about the CPUs in the system. The API for cpumask manipulation is declared in the include/linux/cpumask.h header file.
As the comment in include/linux/cpumask.h says: Cpumasks provide a bitmap suitable for representing the set of CPU's in a system, one bit position per CPU number. We already saw a bit about cpumasks in the boot_cpu_init function from the Kernel entry point part. This function marks the boot CPU as online, active, present and possible:
set_cpu_online(cpu, true);
set_cpu_active(cpu, true);
set_cpu_present(cpu, true);
set_cpu_possible(cpu, true);
Before we look at the implementation of these functions, let's consider each of these masks.
The cpu_possible mask is the set of CPU IDs which can be plugged in at any time during the life of that system boot; in other words, the mask of possible CPUs contains the maximum number of CPUs which are possible in the system. It holds at most NR_CPUS CPUs, a value which is set statically via the CONFIG_NR_CPUS kernel configuration option.
The cpu_present
mask represents which CPUs are currently plugged in.
The cpu_online mask represents a subset of cpu_present and indicates CPUs which are available for scheduling; in other words, a bit in this mask tells the kernel whether a processor may be utilized by the Linux kernel.
The last mask is cpu_active. Bits of this mask tell the Linux kernel whether a task may be moved to a certain processor.
All of these masks depend on the CONFIG_HOTPLUG_CPU configuration option: if this option is disabled, possible == present and active == online.
The implementations of the set_cpu_* functions shown above are very similar. Every function checks its second parameter. If it is true, it calls cpumask_set_cpu; otherwise it calls cpumask_clear_cpu.
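The pattern shared by these functions can be sketched in user space. The following is a minimal toy model, not kernel code: every name prefixed with toy_ is hypothetical, and each mask is a single unsigned long rather than a struct cpumask, so it only covers as many CPUs as an unsigned long has bits.

```c
#include <stdbool.h>

/* Hypothetical toy masks: one bit per CPU number, one word per mask. */
static unsigned long toy_possible_bits;
static unsigned long toy_present_bits;
static unsigned long toy_online_bits;
static unsigned long toy_active_bits;

/* Set bit `cpu` in `mask` when `value` is true, clear it otherwise. */
static void toy_assign_cpu(unsigned long *mask, unsigned int cpu, bool value)
{
    if (value)
        *mask |= 1UL << cpu;
    else
        *mask &= ~(1UL << cpu);
}

static void toy_set_cpu_possible(unsigned int cpu, bool possible)
{
    toy_assign_cpu(&toy_possible_bits, cpu, possible);
}

static void toy_set_cpu_present(unsigned int cpu, bool present)
{
    toy_assign_cpu(&toy_present_bits, cpu, present);
}

static void toy_set_cpu_online(unsigned int cpu, bool online)
{
    toy_assign_cpu(&toy_online_bits, cpu, online);
}

static void toy_set_cpu_active(unsigned int cpu, bool active)
{
    toy_assign_cpu(&toy_active_bits, cpu, active);
}

/* Return true when bit `cpu` is set in `mask`. */
static bool toy_cpu_test(const unsigned long *mask, unsigned int cpu)
{
    return (*mask >> cpu) & 1UL;
}

/* Marks one CPU in all four masks, in the same order as the calls above. */
static void toy_boot_cpu_init(unsigned int cpu)
{
    toy_set_cpu_online(cpu, true);
    toy_set_cpu_active(cpu, true);
    toy_set_cpu_present(cpu, true);
    toy_set_cpu_possible(cpu, true);
}
```

After toy_boot_cpu_init(0), bit 0 is set in all four toy masks. Calling toy_set_cpu_online(0, false) afterwards clears it only in the online mask.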
There are two ways to create a cpumask. The first is to use cpumask_t. It is defined as:
typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
It wraps the cpumask structure, which contains a single bitmask field, bits. The DECLARE_BITMAP macro takes two parameters:
- bitmap name;
- number of bits.
and creates an array of unsigned long with the given name. Its implementation is pretty simple:
#define DECLARE_BITMAP(name,bits) \
unsigned long name[BITS_TO_LONGS(bits)]
where BITS_TO_LONGS is defined as:
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
As we are focusing on the x86_64 architecture, unsigned long is 8 bytes in size, so one array element holds 64 bits. Assuming, for example, that NR_CPUS is 8, our array will contain only one element:
(((8) + (64) - 1) / (64)) = 1
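The arithmetic above can be checked with a small user-space sketch. The two macros are copied from the definitions shown earlier; BITS_PER_BYTE is defined locally as 8, and the cpumask_words helper is a hypothetical name used only for illustration.

```c
#include <stddef.h>

#define BITS_PER_BYTE 8
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))

/* Hypothetical helper: number of unsigned long words needed for nr_cpus bits. */
static size_t cpumask_words(size_t nr_cpus)
{
    return BITS_TO_LONGS(nr_cpus);
}
```

On a platform where long is 8 bytes, cpumask_words(8) and cpumask_words(64) both give 1, while cpumask_words(65) gives 2, because the division rounds up to a whole number of words.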
The NR_CPUS macro represents the maximum number of CPUs the kernel supports and depends on the CONFIG_NR_CPUS macro. It is defined in include/linux/threads.h and looks like this:
#ifndef CONFIG_NR_CPUS
#define CONFIG_NR_CPUS 1
#endif
#define NR_CPUS CONFIG_NR_CPUS
The second way to define a cpumask is to use the DECLARE_BITMAP macro directly together with the to_cpumask macro, which converts the given bitmap to struct cpumask *:
#define to_cpumask(bitmap) \
((struct cpumask *)(1 ? (bitmap) \
: (void *)sizeof(__check_is_bitmap(bitmap))))
We can see the ternary operator here, whose condition is true every time. The __check_is_bitmap inline function is defined as:
static inline int __check_is_bitmap(const unsigned long *bitmap)
{
return 1;
}
It returns 1 every time. We need it here for only one purpose: at compile time it checks that the given bitmap really is a bitmap, or in other words that it has type unsigned long *. So we just pass cpu_possible_bits to the to_cpumask macro to convert an array of unsigned long to struct cpumask *.
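The same compile-time check can be tried outside the kernel. The sketch below mirrors the macro shown above with hypothetical toy_ names and a one-word struct toy_cpumask, which is an assumption made to keep the example self-contained. Because the call sits inside sizeof, it is never executed. The compiler only type-checks its argument.

```c
/* Hypothetical stand-in for struct cpumask with a single word of bits. */
struct toy_cpumask {
    unsigned long bits[1];
};

/* Never called at run time: it exists so the compiler checks the argument type. */
static inline int toy_check_is_bitmap(const unsigned long *bitmap)
{
    (void)bitmap;
    return 1;
}

/* The condition is always 1, so the result is always (bitmap). */
#define toy_to_cpumask(bitmap) \
    ((struct toy_cpumask *)(1 ? (bitmap) \
                              : (void *)sizeof(toy_check_is_bitmap(bitmap))))

static unsigned long toy_online_bits[1];

/* Returns the bitmap viewed as a struct toy_cpumask pointer. */
static struct toy_cpumask *toy_online_mask(void)
{
    return toy_to_cpumask(toy_online_bits);
}
```

Passing something that is not an unsigned long array or pointer, for example an int array, to toy_to_cpumask should make the compiler report an incompatible pointer type, which is the whole point of the otherwise unused branch.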
cpumask API
Since we can define a cpumask with either of these methods, the Linux kernel provides an API for manipulating cpumasks. Let's consider one of the functions presented above, for example set_cpu_online. This function takes two parameters:
- Number of CPU;
- CPU status;
The implementation of this function looks like this:
void set_cpu_online(unsigned int cpu, bool online)
{
if (online) {
cpumask_set_cpu(cpu, to_cpumask(cpu_online_bits));
cpumask_set_cpu(cpu, to_cpumask(cpu_active_bits));
} else {
cpumask_clear_cpu(cpu, to_cpumask(cpu_online_bits));
}
}
First of all it checks the second parameter, online, and calls cpumask_set_cpu or cpumask_clear_cpu depending on it. Here we can see the second argument of cpumask_set_cpu being converted to struct cpumask *. In our case it is cpu_online_bits, which is a bitmap defined as:
static DECLARE_BITMAP(cpu_online_bits, CONFIG_NR_CPUS) __read_mostly;
The cpumask_set_cpu
function makes only one call to the set_bit
function:
static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
{
set_bit(cpumask_check(cpu), cpumask_bits(dstp));
}
The set_bit function takes two parameters too, and sets the given bit (first parameter) in memory (second parameter, the cpu_online_bits bitmap). We can see here that before set_bit is called, its two arguments are passed through:
- cpumask_check;
- cpumask_bits.
Let's consider these two macros. The first, cpumask_check, does nothing in our case and just returns the given parameter. The second, cpumask_bits, just returns the bits field of the given struct cpumask * structure:
#define cpumask_bits(maskp) ((maskp)->bits)
Now let's look at the set_bit implementation:
static __always_inline void
set_bit(long nr, volatile unsigned long *addr)
{
if (IS_IMMEDIATE(nr)) {
asm volatile(LOCK_PREFIX "orb %1,%0"
: CONST_MASK_ADDR(nr, addr)
: "iq" ((u8)CONST_MASK(nr))
: "memory");
} else {
asm volatile(LOCK_PREFIX "bts %1,%0"
: BITOP_ADDR(addr) : "Ir" (nr) : "memory");
}
}
This function looks scary, but it is not as hard as it seems. First of all it passes nr, the number of the bit, to the IS_IMMEDIATE macro, which just calls the GCC built-in function __builtin_constant_p:
#define IS_IMMEDIATE(nr) (__builtin_constant_p(nr))
__builtin_constant_p checks whether the given parameter is known to be a constant at compile time. As our cpu is not a compile-time constant, the else clause will be executed:
asm volatile(LOCK_PREFIX "bts %1,%0" : BITOP_ADDR(addr) : "Ir" (nr) : "memory");
Let's try to understand how it works step by step:
LOCK_PREFIX expands to the x86 lock instruction prefix. This prefix tells the CPU to occupy the system bus while the prefixed instruction is executed. This allows the CPU to synchronize memory access, preventing simultaneous access of multiple processors (or devices, the DMA controller for example) to one memory cell.
BITOP_ADDR casts the given parameter to volatile long *, dereferences it, and adds the +m constraint. + means that this operand is both read and written by the instruction. m shows that this is a memory operand. BITOP_ADDR is defined as:
#define BITOP_ADDR(x) "+m" (*(volatile long *) (x))
Next is the memory
clobber. It tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters).
Ir is the constraint on the nr operand: I allows an integer constant in the range 0 to 31, and r allows a general-purpose register.
The bts instruction sets the given bit in a bit string and stores the previous value of that bit in the CF flag. So we passed the CPU number, which is zero in our case, and after set_bit is executed, bit zero is set in the cpu_online_bits cpumask. It means that the first CPU is online at this moment.
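The effect of set_bit can be reproduced in user space without inline assembly. The sketch below is not the kernel's implementation: it swaps the lock-prefixed bts instruction for the GCC and Clang __atomic_fetch_or builtin, and the toy_ names and TOY_BITS_PER_LONG are hypothetical.

```c
#include <limits.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set bit `nr` in the bitmap that starts at `addr`. */
static void toy_set_bit(unsigned long nr, unsigned long *addr)
{
    unsigned long mask = 1UL << (nr % TOY_BITS_PER_LONG);

    /* Atomic read-modify-write on the word that contains the bit. */
    __atomic_fetch_or(&addr[nr / TOY_BITS_PER_LONG], mask, __ATOMIC_SEQ_CST);
}

/* Non-atomic read of bit `nr`; returns 1 if it is set, 0 otherwise. */
static int toy_test_bit(unsigned long nr, const unsigned long *addr)
{
    return (addr[nr / TOY_BITS_PER_LONG] >> (nr % TOY_BITS_PER_LONG)) & 1UL;
}
```

Calling toy_set_bit(0, bits) on a zeroed array sets the lowest bit of bits[0], which corresponds to marking CPU 0 in the walkthrough above.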
Besides the set_cpu_* API, cpumask of course provides other APIs for cpumask manipulation. Let's consider them briefly.
Additional cpumask API
cpumask provides a set of macros for getting the numbers of CPUs in various states. For example:
#define num_online_cpus() cpumask_weight(cpu_online_mask)
This macro returns the number of online CPUs. It calls the cpumask_weight function with the cpu_online_mask bitmap. The cpumask_weight function makes one call to the bitmap_weight function with two parameters:
- cpumask bitmap;
- nr_cpumask_bits, which is NR_CPUS in our case.
static inline unsigned int cpumask_weight(const struct cpumask *srcp)
{
return bitmap_weight(cpumask_bits(srcp), nr_cpumask_bits);
}
and counts the number of set bits in the given bitmap. Besides num_online_cpus, cpumask provides macros for all CPU states:
- num_possible_cpus;
- num_active_cpus;
- cpu_online;
- cpu_possible.
and many more.
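Counting set bits can be sketched as follows. This is only an illustration of what a weight function computes, not the kernel's bitmap_weight: the toy_bitmap_weight name is hypothetical, and the code relies on the GCC and Clang __builtin_popcountl builtin.

```c
#include <limits.h>

/* Count the set bits among the first `nbits` bits of `bitmap`. */
static unsigned int toy_bitmap_weight(const unsigned long *bitmap,
                                      unsigned int nbits)
{
    const unsigned int bits_per_long = sizeof(unsigned long) * CHAR_BIT;
    unsigned int full_words = nbits / bits_per_long;
    unsigned int remainder = nbits % bits_per_long;
    unsigned int weight = 0;
    unsigned int i;

    /* Whole words: every bit counts. */
    for (i = 0; i < full_words; i++)
        weight += __builtin_popcountl(bitmap[i]);

    /* Partial last word: mask off the bits beyond nbits. */
    if (remainder)
        weight += __builtin_popcountl(bitmap[full_words] &
                                      ((1UL << remainder) - 1));

    return weight;
}
```

For a one-word bitmap holding 0x0b (CPUs 0, 1 and 3), toy_bitmap_weight with nbits equal to 8 returns 3, which is the value a num_online_cpus style macro would report for that mask.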
Besides that, the Linux kernel provides the following API for the manipulation of cpumasks:
- for_each_cpu - iterates over every cpu in a mask;
- for_each_cpu_not - iterates over every cpu in a complemented mask;
- cpumask_clear_cpu - clears a cpu in a cpumask;
- cpumask_test_cpu - tests a cpu in a mask;
- cpumask_setall - sets all cpus in a mask;
- cpumask_size - returns size to allocate for a 'struct cpumask' in bytes;
and many many more...
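To give a feel for how an iteration helper such as for_each_cpu can be used, here is a minimal user-space sketch. It is an assumption-laden toy, not the kernel macro: TOY_NR_CPUS, toy_next_cpu, toy_for_each_cpu and toy_sum_cpu_ids are hypothetical names, and the mask is a single unsigned long.

```c
#define TOY_NR_CPUS 8 /* hypothetical upper bound on CPU numbers */

/* Return the first CPU >= start whose bit is set, or TOY_NR_CPUS if none. */
static unsigned int toy_next_cpu(unsigned int start, unsigned long mask)
{
    unsigned int cpu;

    for (cpu = start; cpu < TOY_NR_CPUS; cpu++)
        if (mask & (1UL << cpu))
            return cpu;

    return TOY_NR_CPUS;
}

/* Visit every CPU whose bit is set in mask, in ascending order. */
#define toy_for_each_cpu(cpu, mask)                 \
    for ((cpu) = toy_next_cpu(0, (mask));           \
         (cpu) < TOY_NR_CPUS;                       \
         (cpu) = toy_next_cpu((cpu) + 1, (mask)))

/* Example use: add up the numbers of all CPUs present in the mask. */
static unsigned int toy_sum_cpu_ids(unsigned long mask)
{
    unsigned int cpu;
    unsigned int sum = 0;

    toy_for_each_cpu(cpu, mask)
        sum += cpu;

    return sum;
}
```

With the mask 0x0a, which has bits 1 and 3 set, the loop body runs for CPU 1 and CPU 3 only, so toy_sum_cpu_ids returns 4.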