Many types of processor can execute instructions in an order different from the one issued by the compiler or assembler. On a uniprocessor system, out-of-order execution is transparent to the programmer, operating system and applications, because the processor must ensure that its own view of execution is self-consistent.
On multiprocessor systems, out-of-order execution can present a problem when locks are not used to guarantee atomicity of access, because loads and stores issued by any given processor can appear on the system bus (and thus become visible to other processors) in an unpredictable order.
mb_memory(), mb_read(), and mb_write() can be used to control the order in which memory accesses occur, and thus the order in which those accesses become visible to other processors. They can be used to implement “lockless” access to data structures where the necessary barrier conditions are well understood.
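As an illustration of the kind of lockless access described above, the following is a minimal sketch of a producer publishing a value through a flag. It is not taken from the kernel sources: struct mailbox, mailbox_publish() and mailbox_consume() are hypothetical names, and mb_write() and mb_read() are replaced by userland stand-ins built on C11 fences so that the sketch compiles outside the kernel. The flag is declared atomic_int only to keep the userland version well defined under the C11 memory model.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userland stand-ins (assumption): in the kernel these primitives are
 * provided by the system. Here they are approximated with C11 fences.
 */
static inline void mb_write(void) { atomic_thread_fence(memory_order_release); }
static inline void mb_read(void)  { atomic_thread_fence(memory_order_acquire); }

/* Hypothetical shared record: a payload guarded by a "ready" flag. */
struct mailbox {
	int payload;
	atomic_int ready;
};

/* Producer: store the payload, then publish the flag. */
static void
mailbox_publish(struct mailbox *box, int value)
{
	box->payload = value;
	mb_write();	/* payload store is ordered before the flag store */
	atomic_store_explicit(&box->ready, 1, memory_order_relaxed);
}

/* Consumer: returns true and fills *out only if the flag was seen set. */
static bool
mailbox_consume(struct mailbox *box, int *out)
{
	if (atomic_load_explicit(&box->ready, memory_order_relaxed) == 0)
		return false;
	mb_read();	/* flag load is ordered before the payload load */
	*out = box->payload;
	return true;
}
```

The write barrier and the read barrier must be used as a pair: without the barrier on the consumer side, the payload load could be satisfied before the flag load, even though the producer ordered its stores correctly.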
Memory barriers can be computationally expensive, as they are considered “serializing” operations and may stall further execution until the processor has drained internal buffers and re-synchronized.
The memory barrier primitives control only the order of memory access. They provide no guarantee that stores have been flushed to the bus, or that loads have been made from the bus.
The memory barrier primitives are guaranteed only to prevent reordering of accesses to main memory. They do not provide any guarantee of ordering when used with device memory (for example, loads or stores to or from a PCI device). To guarantee ordering of access to device memory, the bus_dma(9) and bus_space(9) interfaces should be used.