07 Index Locking & Latching 2018年2月25日 15:01 —

Motivation

Database Indexes
A data structure that improves the speed of data retrieval operations on a table at the cost of additional writes
and storage space. Indexes are used to quickly locate data without having to search every row in a table every
time a table is accessed. Indexes require different locking because the physical structure can change as long
as the logical contents are consistent.
Order Preserving Indexes
• A tree structure that maintains keys in some sorted order.
• Supports all possible predicates with O(log(n)) searches.
Hashing Indexes
• An associative array that maps a hash of the key to a particular record.
• Only support equality predicates with O(1) searches.
B-tree vs B+Tree
• The original B-Tree from 1972 stored values in all nodes in the tree [2].
• B-Tree was more memory efficient since each key only appears once in the tree.
• B+Tree only stores values in leaf nodes, and inner nodes only guide the search process.
• In practice, people use B+Trees over B-Trees because its easier to manage concurrent index access
  because values are only in the leaf nodes.

LOCKS VS. LATCHES

Locks

→ Protects the index’s logical contents from other txns.
→ Held for txn duration.
→ Need to be able to rollback changes.

Latches

→ Protects the critical sections of the index’s internal data structure from other threads.
→ Held for operation duration.
→ Do not need to be able to rollback changes.
 

LOCK-FREE INDEXES

  • Possibility #1: No Locks (relate to lock manager)
    → Txns don’t acquire locks to access/modify database.
    → Still have to use latches to install updates.
  • Possibility #2: No Latches (relate to compare-and-swap, lock free programming)
    → Swap pointers using atomic updates to install changes.
    → Still have to use locks to validate txns.
      This section focuses on order-preserving index log(n) access that supports all predicates.  

    LATCH IMPLEMENTATION

    Typical latches

    1. Mutes std::mutex
    2. Spinlock std::atmoic
    3. Queuebased spinlock (Mellor-Crummey and Scott lock)
    4. readwritelock  

      LATCH CRABBING

      Acquire and release latches on B+Tree nodes when traversing the data structure. A thread can release latch on a parent node if its child node considered safe.
      → Any node that won’t split or merge when updated.
      → Not full (on insertion)
      → More than half-full (on deletion)
      Search: Start at root and go down; repeatedly,
      → Acquire read (R) latch on child
      → Then unlock the parent node.
      Insert/Delete: Start at root and go down, obtaining write (W) latches as needed. Once child is locked, check if it is safe:
      → If child is safe, release all locks on ancestors.
       

      Better Latch Crabbing
      The problem with the previous latch crabbing approach is that it requires each thread to lock the root as the
      first step each time. This is a major bottleneck.
      A better approach is to optimistically assume that the leaf is safe [1].
      • Take R latches as you traverse the tree to reach leaf and verify.
      • If leaf is not safe, then fallback to previous algorithm.
      

      INDEX LOCKS

Crabbing does not protect from phantoms because we are releasing locks as soon as insert/delete operation
ends. There needs to be a way to protect the index’s logical contents from other transactions to avoid
phantoms.
Difference with index latches:
• Locks are held for the entire duration of a transaction.
• Only acquired at the leaf nodes.
• Not physically stored in index data structure.

Need a way to protect the index’s logical contents from other txns to avoid phantoms. Difference with index latches:

  • → Locks are held for the entire duration of a txn.
  • → Only acquired at the leaf nodes.
  • → Not physically stored in index data structure.  

INDEX LOCKING SCHEMES

  1. Predicate Locks
    Proposed locking scheme from System R.
    • → Shared lock on the predicate in a WHERE clause of a SELECT query.
    • → Exclusive lock on the predicate in a WHERE clause of any UPDATE, INSERT, or DELETE query.
    • Never implemented in any system
  2. Key-Value Locks
    Locks that cover a single key value. Need “virtual keys” for non-existent values.

  3. Gap Locks
    Each txn acquires a key-value lock on the single key that it wants to access. Then get a gap lock on the next key gap.

  4. Key-Range Locks A txn takes locks on ranges in the key space.

    → Each range is from one key that appears in the relation, to the next that appears.

    → Define lock modes so conflict table will capture commutativity of the operations available.

  5. Hierarchical Locking
    Allow for a txn to hold wider key-range locks with different locking modes.

    → Reduces the number of visits to lock manager.  

    Hierarchical locking essentially provides predicate locking without complications.
    → Index locking occurs only in the leaf nodes.
    → Latching is to ensure consistent data structure.