Question Details

No question body available.

Tags

c++ allocation

Answers (1)

April 29, 2025 Score: 3 Rep: 892 Quality: Medium Completeness: 60%

free attempts to be a lock-free stack of addresses, but it falls short in ABA protection. Thread can update the free pointer while another thread is in middle of storing (a stale) next pointer:

A: while (!free.compareexchangestrong(block->elements[0].next() = free.load(), &block->elements[BlockSize-1])) {}

A: x = load(free) B: y = load(free) store(next, x)

B: reorders top two items in the stack. A: x is now stale. CAS(free, x, addr)

You can use a 16-byte CAS and a counter that you increment on each attempted CAS. However, due to the standard library ABI, GCC/Clang can refuse to emit the CMPXCHG16B on x86-64. So staticassert(std::atomic::isalwayslockfree); would error, and you would get bad performance if you ignored this.

The last (portable-and-performant, but limited count of threads) option I can think of is to align the memory addresses to say 32-byte boundaries and use the 5 low zero bits of the address as an ABA counter. Non-portable, (but common) way is to use the high 8 or 16-bits of the virtual address for the ABA-tag.

Edit: I'd like add a note about the ABA counter. It is enough that each thread stores an per-thread unique tag with the CAS. Such a per-thread constant tag is likely better than a incrementing counter since it cannot overflow, which can silently break an otherwise working lock-free algorithm.

Why is your element buffer storing next pointers and data values in the same memory (and why are you not using a union for that purpose)?

Space-compaction, not that this matters much here. The next pointer is only needed for items that are in the stack. (i.e. free). AllocBlock() pushes a block of addresses at once. Alloc() pops a single address, after which the succeeding thread owns that piece of memory. (Push/pop requires ABA protection.) Free() then pushes an address onto the stack to be reused by Alloc().