2018年2月27日
16:22
Type representation
In-Memory Data Layout
Storage Models
System Catalogs
Some background info
About data organization
In memory databases can be taken as a large block of memory, say a very large array.
The memory is managed by OS as pages.
About THP :To speed up address translate, CPUS have TLB caches.
TLB cache size is limited and possibly needs to read translate info from memory which is not the databases want. In order to reduce the TLB cache misses, we can use larger page frame size. Or to use transparent huge pages that OS provides to the applications. THP will deal with split and combine of pages into or from a huge page. But it can also stall databases during the process. Almost no DBMS use this and advice turning off this option in OS.
Data types representation
Ingeter bigint smallint tinyint
c/c++ representation
Float/real vs numeric/decimal
Ieee-754/fixed-pointe decimals
VARCHAR/VARBINARY/TEXT/BLOB
→ Pointer to other location if type is ≥64-bits
→ Header with length and address to next location (if segmented), followed by data bytes.
TIME/DATE/TIMESTAMP → 32/64-bit integer of (micro)seconds since Unix epoch
Null value
Choice #1: Special Values
→ Designate a value to represent NULL for a particular data type (e.g., INT32_MIN).
Choice #2: Null Column Bitmap Header
→ Store a bitmap in the tuple header that specifies what attributes are null.
Choice #3: Per Attribute Null Flag
→ Store a flag that marks that a value is null.
→ Have to use more space than just a single bit because this messes up with word alignment.
In-memory layout
About word align
If the CPU fetches a 64-bit value that is not wordaligned, it has three choices:
→Execute two reads to load the appropriate parts of the data word and reassemble them.
→Read some unexpected combination of bytes assembled into a 64-bit word.
→Throw an exception
STORAGE MODELS
N-ary Storage Model (NSM)
The DBMS stores all of the attributes for a single tuple contiguously.
Ideal for OLTP workloads where txns tend to operate only on an individual entity and insertheavy workloads.
Use the tuple-at-a-time iterator model.
Choice #1: Heap-Organized Tables
→ Tuples are stored in blocks called a heap.
→ The heap does not necessarily define an order.
Choice #2: Index-Organized Tables
→ Tuples are stored in the primary key index itself.
→ Not quite the same as a clustered index.
Advantages
→ Fast inserts, updates, and deletes.
→ Good for queries that need the entire tuple.
→ Can use index-oriented physical storage.
Disadvantages
→ Not good for scanning large portions of the table and/or a subset of the attributes.
Decomposition Storage Model (DSM)
The DBMS stores a single attribute for all tuples contiguously in a block of data.
→ Sometimes also called vertical partitioning.
Ideal for OLAP workloads where read-only queries perform large scans over a subset of the table’s attributes.
Use the vector-at-a-time iterator model.
Choice #1: Fixed-length Offsets
→ Each value is the same length for an attribute.
Choice #2: Embedded Tuple Ids
→ Each value is stored with its tuple id in a column.
Advantages
→ Reduces the amount wasted work because the DBMS
only reads the data that it needs.
→ Better compression.
Disadvantages
→ Slow for point queries, inserts, updates, and deletes
because of tuple splitting/stitching.
Hybrid Storage Model
Single logical database instance that uses different storage models for hot and cold data.
Store new data in NSM for fast OLTP
Migrate data to DSM for more efficient OLAP
Choice #1: Separate Execution Engines
→ Use separate execution engines that are optimized for
either NSM or DSM databases.
Choice #2: Single, Flexible Architecture
→ Use single execution engine that is able to efficiently
operate on both NSM and DSM databases.
Separate Execution Engines
Run separate “internal” DBMSs that each only operate on DSM or NSM data.
→ Need to combine query results from both engines to appear as a single logical database to the application.
→ Have to use a synchronization method (e.g., 2PC) if a txn
spans execution engines.
Two approaches to do this:
→ Fractured Mirrors (Oracle, IBM)
→ Delta Store (SAP HANA)
Fractured Mirrors
Store a second copy of the database in a DSM layout that is automatically updated.
→ All updates are first entered in NSM then eventually copied into DSM mirror.
Delta Store
Stage updates to the database in an NSM table.
A background thread migrates updates from delta
store and applies them to DSM data.
CATEGORIZING DATA
Choice #1: Manual Approach
→ DBA specifies what tables should be stored as DSM.
Choice #2: Off-line Approach
→ DBMS monitors access logs offline and then makes decision about what data to move to DSM.
Choice #3: On-line Approach
→ DBMS tracks access patterns at runtime and then makes decision about what data to move to DSM.