SQL Server Hekaton In-memory tables: Understanding the Row Chains of Hash Indexes

Having a good understanding of how the hashing and row chains work will go a long way in helping you design for performance and diagnose performance and resource issues you may get once live. This post covers off some of the basics and hopefully will give you a working insight. We’ll start with a Hash index, consider a table made up of 6 columns, we have a composite hash index on {col1, col2, col3}. When we insert a row into the table hash functions are executed for each hash index, the row is then added to a row chain that is found through the hash bucket the result of the hash function points to. Let’s look at that in a step-wise fashion.

Step 1: We start with our source rows.

Step 2: For each hash index, concatenate the key columns to get a single binary value.

Step 3: Run the value through the hash function, the hash function will return a value within the range you defined using the BUCKET_COUNT value on the CREATE TABLE. Every time you run that same value through the hash function you always get the same result, it’s deterministic.

Step 4: The result of step 3 is actually a position in a 1 dimensional array termed the Mapping Table, that cell position is called a hash bucket and is actually a 8 byte memory pointer that points to the most recently modified/inserted row in the row chain. The Mapping Table is simply ( 8 bytes * the number of BUCKET_COUNT ) you defined on your create table – a mapping table exists for each hash index. The hash function is balanced i.e. it is designed to spread rows equally across the Mapping Table so you get a similar number of rows tracked per hash bucket – more on that later – it’s really important!

Step 5: The memory pointer (hash bucket) from Step 4 points to the actual data, ideally if your hash index is unique then there will be on average 1 row per hash bucket, but it depends entirely on how many hash buckets you defined on the create table and how many unique values are in the hash index. Anyway, that aside, the row chain starts here. Each hash bucket tracks it’s own row chain – a bit like multiple heaps, one heap per hash bucket.

Hash Collisions

I’m going to go into depth here because I cover it in another post of mine: http://dataidol.com/tonyrogerson/2013/11/25/what-is-hashing-using-modulus-to-partition-data.

A collision is simply when two different pieces of data e.g. ‘Tony’, ‘Joseph’ are passed through the hash function and that hash function returns the same value e.g. 12345.

The Row Chain

I’m going to first introduce the simple way of looking at a row chain, a row chain on a table that has a single hash index, I will then expand to cover how multiple hash indexes are handled.

The figure above shows a row structure, the header can be up to 88 bytes and comprises of an 8 byte field used internally for things like Garbage Collection, two 8 byte timestamp fields defining the life-span of the row and an 8 byte index memory pointer field for each index (hash or range up-to a maximum of 8 indexes) defined on the table.

The row chain is formed by using the 8 byte index memory pointer field as a forward only linked list, rows are segmented by hash bucket and as such, the last row in the row chain for the hash bucket the index pointer will point to NULL rather than to the first row in the next hash bucket, it’s done like that for performance reasons – basically, think about the overhead involved keeping that up-to-date where the first row in the row chain is always the most recent version (or inserted row) – more on that on a post dedicated to MVCC (Multi-Version Concurrency Control).

Row Chains from multiple Hash Indexes

There is only one copy of the data, the 8 byte index pointer (per index) stored in the row header is used to materialise the row chain – you have a row chain per index holding multiple row versions for modifications, rows where there is a hash collision or in the case of range indexes where the key columns have the same value.

The figure above reflects a table having two hash indexes, for clarity the Mapping Tables for both hash indexes are not shown nor are the timestamp that reflect version as part of the MVCC in Hekaton.

There are two hash indexes and therefore two row chains, Col3 and Col5 each have a hash index, the number between ( ) shows which hash bucket the row has been hashed to and therefore forms a group of rows pinned to that hash bucket.

The [Index Col3 Pointer] and [Index Col5 Pointer] when read from top down shows you the row chain, for instance memory location 1 of [Index Col3 Pointer] points to the next row in the hash bucket group at memory location 2, then location 5. The last row in the hash bucket group points to NULL.

The diagram below shows data organised by hash bucket, it clearly shows the row chain.

That concludes this introductory post to row chains for hash indexing, some more topics I’ll cover in the near future are how versioning works and how range indexing with BW-Tree’s work.

SQL Server Hekaton In-memory tables: Understanding the Row Chains of Hash Indexes

Trending Articles

Scuffham Amps - S-GEAR 2.6.0 VST, AAX, STANDALONE x86 x64 (R2R NO iLok2, +NO...

Practice Sheet of Right form of verbs for HSC Students

VHSE First (1st) Allotment 2025 - vhscap.kerala.gov.in

UNIVERSE LEAGUE – UNIVERSE LEAGUE – WAR (We Are Ready) – EP [iTunes Plus M4A]

City Hunter Teledrama – Episode 18 – 07th May 2016

Comment on Proposed Criteria for Identifying Predatory Conferences by Luke...

Bureau of Internal Revenue: Regional Offices (Directory)

Kendrick Lamar – Not Like Us (2024) [24Bit-88.2kHz] [PMEDIA] ⭐️

Inception 2010 Hindi Dual Audio 650MB BRRip 720p ESubs HEVC

East Hull MD admits sexual assaults after another victim comes forward

Download: Ziba Zako ft Rich Bizzy & General Kanene – Chikwati (Prod by: Bicko...

R. v. Sargeant, 2023 ONSC 6406 (CanLII)

Rajasthan Board 10th Result 2016 Roll No wise & Name Wise

Who’s been sentenced at Northampton Magistrates’ Court

मतलबी दोस्त स्टेट्स | Matlabi Dost Status in Hindi – Selfish Friends Status

Family cries out as traditional ruler allegedly abducts brother, extorts N2.5m

Long-Running Conflict In Springfield (MA) Gangland Sphere Has Manzi Family &...

Wondershare Filmora X v10.1.20.16 x64

Man arrested after fracas in flat

Man charged in ongoing Sexual Assault Investigation Derek Nyilas, 46, Faces...