Notes on : 15-721 (2018) 08-09 OLTP Indexes Part I & II
08 Part I 2018年2月26日 16:00
- t-tree
- Lock free skip list(mainly talked about this)
- bwtree
09 Part II 2018年2月26日 23:29
Index is not a big problem for olap.
Working on large data, column store & sequential scan is more suitable
Index implementation issues
Memory Pools
Garbage Collection
Non-Unique Keys
Variable-length Keys
Prefix Compression
Memory pools:
Avoid calling malloc and free every time add/delete a node
Works for fixed size node
Retracting is not trivial. Say insert and then delete a bilion keys or shrinking a extendable hash table. There may be lots of holes. Memory needs to be repact. This is expensive.
Garbage collection
Need to know when it is safe to reclaim memory for deleted nodes.
Methods:
Reference counting
Epoch-based reclaimation
Hazard pointer ….
Reference counting:
Maintain a counter for each node to keep track of the number of threads that are accessing it.
• Increase counter before access
• Decrease when finished
• It's safe to delete if the node's counter is zero
Acctually performs bad on multi core cpus as compare and swap produces many cache coherence traffic
Epoch garbadge collection(best for in-memory index)
Maintain a global epoch counter that is periodicaly updated.
Keep track of what threads enter the index during an epoch and when they leave.
Mark the current epoch of a node when it is markded for deletetion
The node can be reclaimed once all threads(of current & preceeding epoch) leave that epoch.
Same as read copy update
The idea is:
Do inform epoch manager of what you want to do and when you finish doing.
Mark the one you want to delete.
Let the epoch manager do the deletetion once it find a marked node is not being access any more.
About non-unique indexes
Approach#1: duplicate keys. No change of node structure. Need value of a key during deletetion.
Approach#2: value lists. Keys appears once and each key points to its value list.
Variable length keys
Approach#1:pointers. Store pointers to real index keys. Nobody does this.
Approach#2:variable length nodes. Nodes in index vary in length. Need careful memory management. Nobody does this.
Approach#3:padding. Padding short keys to max length.
Approach#4:key map/indirection layer.
Embed an array of pointers that map to the key + value list within the node.
Approach#1 is using pointer to record attribute. Need to read a tuple for every compare from pages.
Approach#4 is using pointer to infos in the same index node. Pointers lives in key map and they point to key/value pairs lives in the same index node or same page.
Possible improvement is to add prefix of keys in the key map so that prefix provides more info during searching.
May use overflow chain if keys cannot fit in the same index node.
Prefix compression
Store a minimum prefix that is needed to correctly route probes into the index.
Intuiation :Since keys are sorted in lexicographical order, there will be a lot of duplicated prefixes.
Art(modern radix tree for multi-core cpu)
Uses digital representation of keys to examine prefixes 1-by-1 instead of comparing entire key.
Radix trees properties:
→ The height of the tree depends on the length of keys.
→ Does not require rebalancing
→ The path to a leaf node represents the key of the leaf
→ Keys are stored implicitly and can be reconstructed from paths.
The index supports four different internal node types with different capacities. Pack in multiple digits into a single node to improve cache locality.
Binary comparable keys
Not all attribute types can be decomposed into binary comparable digits for a radix tree.
→ Unsigned Integers: Byte order must be flipped for little endian machines.
→ Signed Integers: Flip two’s-complement so that negative numbers are smaller than positive.
→ Floats: Classify into group (neg vs. pos, normalized vs. denormalized), then store as unsigned integer.
→ Compound: Transform each attribute separately.
I think it is not mentioned but the key length must be varying.
About latch free:
HyPer’s ART is not latch-free.
→ The authors argue that it would be a significant amount of work to make it latch-free.
Approach #1: Optimistic Lock Coupling
Optimistic crabbing scheme where writers are not blocked on readers.
→ Writers increment counter when they acquire latch.
→ Readers can proceed if a node’s latch is available.
→ It then checks whether the latch’s counter has changed from when it checked the latch.
Approach #2: Read-Optimized Write Exclusion
About perfiling