Why No HTM Forward-Progress Guarantees?
Why are there no forward-progress guarantees for
hardware transactional memory (HTM)?
Only the HTM vendors know for sure, but here are a few possible reasons:
- HTM transactions cannot tolerate non-idempotent operations such
as I/O and system calls, so any forward-progress guarantees
can apply only to transactions confined to normal memory operations.
- HTM transactions
whose data falls into the same cache/TLB set and whose size
exceeds either the cache's or the TLB's associativity
(plus the size of any corresponding victim caches)
cannot possibly succeed at all, which eliminates
any possibility of a forward-progress guarantee for these
unfortunate transactions.
Any forward-progress guarantees must therefore be conditioned
on the transaction's cache footprint fitting within the
worst-case cache and TLB capacities.
- It is possible that some HTM implementations will be subject to
other hardware resource limits, for example,
store-buffer size.
- The fact that interrupts and exceptions on a given CPU will abort
transactions running on that CPU means
that a transaction cannot commit if its execution time exceeds
the duration of the interval between interrupts.
Any forward-progress guarantees must therefore be conditioned
on the transaction's duration.
(Sorry, no successful transactions containing infinite loops!)
- A production-quality system must handle a very large number of
arbitrary and mutually conflicting concurrent transactions of
varying shapes, sizes, and numbers of prior attempts.
The difficulty inherent in efficiently resolving the resulting
conflicts becomes apparent when you consider
the large number of schemes that have been proposed.
Perhaps current HTM implementations refrain from providing
forward-progress guarantees because it is extremely hard to do so
in the general case in a production-quality setting.
Until HTM implementations provide some sort of forward-progress guarantee,
HTM will be limited by its fallback code sequences.
For example, if the fallback code uses locking, then HTM will be just as
vulnerable to deadlock as is the fallback code.
Some speculate that HTM transactions will frequently succeed, and that
the fallback code may therefore use simpler and less-scalable locking
designs that are less prone to deadlock.
However, this speculation requires
demonstration, especially under overload conditions.
At the same time, it is important to note that even highly constrained
forward-progress guarantees could be quite useful.
For example, atomically pushing onto or popping from a stack or queue
requires touching only a few cache lines.
Furthermore, if an HTM update is nested within an RCU or hazard-pointer
read-side critical section, small changes to large data structures
can be handled by HTM implementations with small size guarantees.
Such guarantees could greatly ease validation of HTM software, because
it would not be necessary to separately validate fallback code.
Of course, the larger the guaranteed size the better, but even small
guarantees can be highly useful.
As always, infeasible perfection should be spurned in favor of
achievable goodness.