Concurrent LSM store #20

Open
opened 2023-11-16 18:22:41 +01:00 by Jef Roosens · 2 comments

There's currently no protection against two parallel insertions modifying the trie structure in the same place at the same time, e.g. two threads trying to add a new child node to the same node simultaneously. This bug is currently not an issue, as the application is single-threaded, but I'd like to fix it for when I do want to add multithreading.

My only real idea for solving this issue right now would be adding a mutex to every single node in the trie, but this sounds very inefficient.

Jef Roosens added the
Kind/Bug
label 2023-11-16 18:22:41 +01:00

This issue boils down to a bigger issue I've been avoiding: updating the trie's internal structure while not requiring readers to lock the trie structure itself.

[This article](https://questdb.io/blog/concurrent-lockfree-datastructure-design-walkthrough/) has given me some ideas on how to accomplish this.

Instead of keeping track of only a single trie, we keep track of two. Readers can use the read trie without locks, as it's immutable. Writers use the other trie. Writing is serialized using a simple mutex: after locking it, a writer first updates the write trie (note that this only applies to the trie's internal structure, not the entries, as those are already locked themselves; the tries can simply share the entries). After updating the trie, the tries are swapped atomically. At this point, readers will start using the new trie.

The important part here is that writers can only start writing to the write trie after all of its readers have stopped using it, ensuring we don't write to a trie that's still being read. After a write has completed, the tries can be swapped immediately.

In order to track how many readers are currently reading the old trie, we could introduce some sort of reference-counting pointer, perhaps with a condition variable that is signalled when the count drops to 0.

Steps for writing:

  1. Lock the write mutex
  2. Wait until the write trie no longer has any readers
  3. Update the trie as usual
  4. Swap the tries atomically

Steps for reading:

  1. Read from the reader struct

The nice thing about this design is that it favors read speeds, which is ideal for Lander.
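The read and write steps above could be sketched in Rust along these lines. This is a minimal sketch, not Lander's actual code: `DoubleTrie` and its fields are hypothetical names, and a plain `HashMap` stands in for the trie (the mutex around each map exists only because of that stand-in; the real read trie would be immutable and lock-free to read).

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Condvar, Mutex};

// Hypothetical sketch: a HashMap stands in for the trie structure.
struct DoubleTrie {
    tries: [Mutex<HashMap<String, String>>; 2],
    read_idx: AtomicUsize,      // which copy readers currently use
    readers: Mutex<[usize; 2]>, // active readers per copy
    drained: Condvar,           // signalled when a count drops to 0
    write_lock: Mutex<()>,      // serializes writers (step 1)
}

impl DoubleTrie {
    fn new() -> Self {
        DoubleTrie {
            tries: [Mutex::new(HashMap::new()), Mutex::new(HashMap::new())],
            read_idx: AtomicUsize::new(0),
            readers: Mutex::new([0, 0]),
            drained: Condvar::new(),
            write_lock: Mutex::new(()),
        }
    }

    /// Reading: pin the current read copy, look up, unpin.
    fn get(&self, key: &str) -> Option<String> {
        let idx = {
            let mut counts = self.readers.lock().unwrap();
            let idx = self.read_idx.load(Ordering::Acquire);
            counts[idx] += 1; // count ourselves as a reader of this copy
            idx
        };
        // A real implementation reads without this inner lock; it's only
        // here because a mutable HashMap replaces the immutable trie.
        let result = self.tries[idx].lock().unwrap().get(key).cloned();
        let mut counts = self.readers.lock().unwrap();
        counts[idx] -= 1;
        if counts[idx] == 0 {
            self.drained.notify_all(); // wake a waiting writer
        }
        result
    }

    /// Writing, following the four steps: lock, drain, update, swap.
    /// The same update is then replayed on the old read copy so both
    /// tries stay in sync for the next write.
    fn insert(&self, key: &str, value: &str) {
        let _guard = self.write_lock.lock().unwrap(); // step 1
        let write_idx = 1 - self.read_idx.load(Ordering::Acquire);
        // Step 2: wait until the write copy has no readers left.
        let mut counts = self.readers.lock().unwrap();
        while counts[write_idx] > 0 {
            counts = self.drained.wait(counts).unwrap();
        }
        drop(counts);
        // Step 3: update the write copy.
        self.tries[write_idx]
            .lock()
            .unwrap()
            .insert(key.into(), value.into());
        // Step 4: swap atomically; readers now see the new copy.
        self.read_idx.store(write_idx, Ordering::Release);
        // Replay the write on the old copy once it has drained too.
        let old = 1 - write_idx;
        let mut counts = self.readers.lock().unwrap();
        while counts[old] > 0 {
            counts = self.drained.wait(counts).unwrap();
        }
        drop(counts);
        self.tries[old].lock().unwrap().insert(key.into(), value.into());
    }
}

fn main() {
    let t = DoubleTrie::new();
    t.insert("a", "1");
    assert_eq!(t.get("a"), Some("1".to_string()));
    t.insert("b", "2");
    assert_eq!(t.get("a"), Some("1".to_string()));
    assert_eq!(t.get("b"), Some("2".to_string()));
}
```

One consequence of replaying each write on the old copy is that every write happens twice, which is the price paid for the lock-free read path.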

Jef Roosens changed title from Possible race condition when inserting new entries to Concurrent LSM store 2023-12-09 11:21:44 +01:00
Jef Roosens added
Kind/Enhancement
and removed
Kind/Bug
labels 2023-12-09 11:22:05 +01:00

A different approach is called "crab latching".

Each node has a rwlock. When reading an element from the trie, we traverse down the trie by read-locking both the parent node and the child node. After having acquired the child node, we can release the parent node. We repeat this process until we reach our entry, which is already locked itself. Once we have the entry pointer, we can release the final child lock.

For writing, we write-lock each node on the path to the entry, only releasing the locks if we know for certain the nodes on the path will not need to be updated by our operation. For our trie, this is always the case, as inserts don't mutate the existing internal structure, meaning we can simply hold a write lock on a single node at a time, releasing the parent's lock once the child node has been locked.
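A minimal Rust sketch of this lock coupling, assuming a toy byte-keyed trie; `Node`, `lookup`, and `insert` are illustrative names, not Lander's API:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Toy trie node: every node carries its own rwlock.
struct Node {
    children: HashMap<u8, Arc<RwLock<Node>>>,
    entry: Option<String>,
}

impl Node {
    fn new() -> Arc<RwLock<Node>> {
        Arc::new(RwLock::new(Node {
            children: HashMap::new(),
            entry: None,
        }))
    }
}

/// Read path: read-lock the parent, acquire the child, then release
/// the parent before descending further.
fn lookup(root: &Arc<RwLock<Node>>, key: &[u8]) -> Option<String> {
    let mut current = Arc::clone(root);
    for &b in key {
        let next = {
            let node = current.read().unwrap(); // parent read lock
            Arc::clone(node.children.get(&b)?)  // child acquired
        }; // parent lock released here
        current = next;
    }
    let node = current.read().unwrap();
    node.entry.clone()
}

/// Write path: since nodes above the insertion point never need to be
/// restructured, one write lock at a time suffices; the parent's lock
/// is dropped as soon as the child is pinned.
fn insert(root: &Arc<RwLock<Node>>, key: &[u8], value: &str) {
    let mut current = Arc::clone(root);
    for &b in key {
        let next = {
            let mut node = current.write().unwrap(); // parent write lock
            Arc::clone(node.children.entry(b).or_insert_with(Node::new))
        }; // parent lock released here
        current = next;
    }
    current.write().unwrap().entry = Some(value.to_string());
}

fn main() {
    let root = Node::new();
    insert(&root, b"abc", "hello");
    assert_eq!(lookup(&root, b"abc"), Some("hello".to_string()));
    assert_eq!(lookup(&root, b"ab"), None);
}
```

Because locks are always taken parent-before-child, no two operations can ever acquire the same pair of locks in opposite order, which rules out deadlock.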

Jef Roosens added the
Project/Lander
label 2024-08-26 10:42:18 +02:00

Reference: Chewing_Bever/lander#20