fix rss feeds; move stuff around

Jef Roosens 2024-06-06 09:54:26 +02:00
parent cd10df9d32
commit 323b9e2e3c
Signed by: Jef Roosens
GPG key ID: B75D4F293C7052DB
9 changed files with 16 additions and 60 deletions

---
title: "Automating Minecraft Server Backups"
date: 2023-09-07
---
I started playing Minecraft back in 2012, after the release of version 1.2.5.
Like many gen Z'ers, I grew up playing the game day in, day out, and now 11
years later, I love the game more than ever. One of the main reasons I still
play the game is multiplayer, seeing the world evolve as the weeks go by with
everyone adding their own personal touches.
Naturally, as a nerd, I've grown the habit of hosting my own servers, as well
as maintaining instances for friends. Having managed these servers, I've
experienced the same problems that I've heard other people complaining about as
well: backing up the server.
{{< figure src="./the-village.jpg" title="Sneak peek of the village we live in" >}}
## The Problem
Like any piece of software, a Minecraft server instance writes files to disk,
and these files, a combination of world data and configuration files, are what
we wish to back up. The problem is that the server instance is constantly
writing new data to disk. This conflicts with the "just copy the files"
approach (e.g. `tar` or `rsync`), as these will often encounter errors because
they're trying to read a file that's actively being written to. Because the
server isn't aware it's being backed up, it's also possible it writes to a file
already read by the backup software while the other files are still being
processed. This produces an inconsistent backup with data files that do not
properly belong together.
There are two straightforward ways to solve this problem. One would be to
simply turn off the server before each backup. While this could definitely work
without too much interruption, provided the backups are scheduled at times when
no players are online, I don't find this to be very elegant.
The second solution is much more appealing. A Minecraft server can be
controlled using certain console commands, with the relevant ones here being
`save-off`, `save-all`, and `save-on`. `save-off` tells the server to stop
saving its data to disk, and cache it in memory instead. `save-all` flushes the
server's data to disk, and `save-on` enables writing to disk again. Combining
these commands provides us with a way to back up a live Minecraft server: turn
off saving using `save-off`, flush its data using `save-all`, back up the
files, and turn on saving again using `save-on`. With these tools at my
disposal, I started work on my own custom solution.
## My solution
After some brainstorming, I ended up with a fairly simple approach: spawn the
server process as a child process with the parent controlling the server's
stdin. By taking control of the stdin, we can send commands to the server
process as if we'd typed them into the terminal ourselves. I wrote the original
proof-of-concept over two years ago during the pandemic, but this ended up
sitting in a dead repository afterwards. However, a couple of months ago, some
new motivation to work on the project popped into my head (I started caring a
lot about our world), so I turned it into a fully fledged backup tool! The
project's called [alex](https://git.rustybever.be/Chewing_Bever/alex) and as
usual, it's open-source and available on my personal Gitea instance.
Although Alex is a lot more advanced now than it was a couple of months back,
it still functions on the same principle of injecting the above commands into
the server process's stdin. The real star of the show however is the way it
handles its backups, which brings us into the next section.
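alex itself is written in Rust, but the core trick (owning the child process's stdin and writing console commands to it) can be sketched in a few lines of C. This is a simplified illustration with made-up function names, not alex's actual code:

```c
#include <stdio.h>

/* Write one console command to the server's stdin, newline-terminated,
 * exactly as if an admin had typed it into the terminal.
 * Returns 0 on success. */
int send_command(FILE *server_stdin, const char *cmd) {
    if (fprintf(server_stdin, "%s\n", cmd) < 0)
        return -1;
    return fflush(server_stdin);
}

/* The backup sequence described above: stop autosaving, flush world
 * data to disk, (archive the files), then re-enable autosaving. */
int backup_sequence(FILE *server_stdin) {
    if (send_command(server_stdin, "save-off")) return -1;
    if (send_command(server_stdin, "save-all")) return -1;
    /* ... copy or archive the world files here ... */
    return send_command(server_stdin, "save-on");
}
```

In the real tool, the stream would be a pipe attached to the spawned server process's stdin (e.g. via `fork`/`exec` with `dup2`, or Rust's `Stdio::piped()`).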
## Incremental backups
You could probably describe my usual projects as overengineered, and Alex is no
different. Originally, Alex simply created a full tarball every `n` minutes
(powered by the lovely [tar-rs](https://github.com/alexcrichton/tar-rs)
library). While this definitely worked, it was *slow*. Compressing several
gigabytes of world files always takes some time, and this combined with shaky
hard drive speeds resulted in backups easily taking 5-10 minutes. Normally,
this wouldn't bother me too much, but with this solution, the Minecraft server
isn't writing to disk for the entire duration of this backup! If the server
crashed during this time, all this data would be lost.
This called for a better method: incremental backups. For those unfamiliar, an
incremental backup is a backup that only stores the changes that occurred since
the last backup. This not only saves a ton of disk space, but it also greatly
decreases the amount of data that needs to be compressed, speeding up the
backup process tremendously.
Along with this, I introduced the concept of "chains". Because an incremental
backup describes the changes that occurred since the last backup, it needs that
other backup in order to be fully restored. This also implies that the first
incremental backup needs to be based off a full backup. A chain defines a list
of sequential backups that all depend on the one before them, with each chain
starting with a full backup.
All of this combined resulted in the following configuration for backups: the
admin can configure one or more backup schedules, with each schedule being
defined by a name, a frequency, a chain length and how many chains to keep. For
each of these configurations, a new backup will be created periodically
according to the defined frequency, and this backup will be appended to the
current chain for that schedule. If the chain is full (as defined by the chain
length), a new chain is created. Finally, the admin can configure how many of
these full chains to keep.
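The chain bookkeeping boils down to two small rules: the first backup of every chain is a full backup, and a chain is rotated out once enough newer chains exist. A minimal model of this (hypothetical names, not alex's actual code):

```c
/* A backup schedule: how often to run, how many backups per chain,
 * and how many full chains to retain. */
typedef struct {
    int frequency_minutes;
    int chain_length;   /* backups per chain; 1 means full backups only */
    int chains_to_keep;
} schedule_t;

/* Given how many backups the current chain already holds, decide
 * whether the next backup starts a new chain (and is therefore full). */
int next_is_full(const schedule_t *s, int in_current_chain) {
    return in_current_chain == 0 || in_current_chain >= s->chain_length;
}

/* Total number of backups retained once the schedule reaches
 * steady state: every kept chain is completely filled. */
int max_backups(const schedule_t *s) {
    return s->chain_length * s->chains_to_keep;
}
```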
As an example, my server currently uses a dual-schedule system:
* One configuration is called "30min". As the name suggests, it has a frequency
of 30 minutes. It stores chains of length 48, and keeps 1 full chain. This
configuration allows me to create incremental backups (which take 5-10
seconds) every 30 minutes, and I can restore these backups in this 30-minute
granularity up to 24 hours back.
* The second configuration is called "daily", and this one simply creates a
full backup (a chain length of 1) every 24 hours, with 7 chains being stored.
This allows me to roll back a backup with a 24-hour granularity up to 7 days
back.
This configuration would've never been possible without incremental backups, as
the 30-minute backups would've simply taken too long otherwise. The required
disk space would've also been rather unwieldy, as I'd rather not store 48
multi-gigabyte backups per day. With the incremental backups system, each
backup after the initial full backup is only a few megabytes!
Of course, a tool like this wouldn't be complete without some management
utilities, so the Alex binary contains tools for restoring backups, exporting
incremental backups as a new full backup, and unpacking a backup.
## What's next?
There are still some improvements I'd like to add to Alex itself, notably making
Alex more aware of the server's internal state by parsing its logs, and making
restoring backups possible without having to stop the Alex instance (this is
rather cumbersome in Docker containers).
On a bigger scale however, there's another possible route to take: add a
central server component where an Alex instance can publish its backups to.
This server would then have a user management system to allow certain users of
the Minecraft server to have access to the backups for offline use. This server
could perhaps also show the logs of the server instance, as well as handle
syncing the backups to another location, such as an S3 store. This would make
the entire system more resistant to data loss.
Of course, I'm well aware these ideas are rather ambitious, but I'm excited to
see where this project might go next!
That being said, Alex is available as statically compiled binaries for `amd64`
and `arm64` [on my Gitea](https://git.rustybever.be/Chewing_Bever/alex). If
you're interested in following the project, Gitea recently added repository
[RSS feeds](https://git.rustybever.be/Chewing_Bever/alex.rss) ;)

---
title: "Designing my own URL shortener"
date: 2023-10-14
---
One of the projects I've always found to be a good choice for a side project is
a URL shortener. The core idea is simple and fairly easy to implement, yet it
allows for a lot of creativity in how you implement it. Once you're done with
the core idea, you can start expanding the project as you wish: expiring links,
password protection, or perhaps a management API. The possibilities are
endless!
Naturally, this post talks about my own version of a URL shortener:
[Lander](https://git.rustybever.be/Chewing_Bever/lander). In order to add some
extra challenge to the project, I've chosen to write it from the ground up in C
by implementing my own event loop, and building an HTTP server on top to use as
the base for the URL shortener.
## The event loop
Lander consists of three layers: the event loop, the HTTP loop and finally the
Lander-specific code. Each of these layers utilizes the layer below it, with
the event loop being the bottom-most layer. This layer directly interacts with
the networking stack and ensures bytes are received from and written to the
client. The book [Build Your Own Redis](https://build-your-own.org/redis/) by
James Smith was an excellent starting point, and I highly recommend checking it
out! This book taught me everything I needed to know to start this project.
Now for a slightly more technical dive into the inner workings of the event
loop. The event loop is the layer that listens on the listening TCP socket for
incoming connections and directly processes requests. In each iteration of the
event loop, the following steps are taken:
1. For each of the open connections:
   1. Perform network I/O
   2. Execute data processing code, provided by the upper layers
   3. Close finished connections
2. Accept a new connection if needed
The event loop runs on a single thread, and constantly goes through this cycle
to process requests. Here, the "data processing code" is a set of function
pointers passed to the event loop that get executed at specific times. This is
how the HTTP loop is able to inject its functionality into the event loop.
In the event loop, a connection can be in one of three states: `request`,
`response`, or `end`. In `request` mode, the event loop tries to read incoming
data from the client into the read buffer. This read buffer is then used by the
data processing code's data handler. In `response` mode, the data processing
code's data writer is called, which populates the write buffer. This buffer is
then written to the connection socket. Finally, the `end` state simply tells
the event loop that the connection should be closed without any further
processing. A connection can switch between `request` and `response` mode as
many times as needed, allowing connections to be reused for multiple requests
from the same client.
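The state machine and the injected function pointers can be sketched as follows. This is a stripped-down model (the I/O itself is omitted, and all names are made up for illustration, not taken from Lander's source):

```c
/* The three states a connection can be in. */
typedef enum { STATE_REQUEST, STATE_RESPONSE, STATE_END } conn_state;

typedef struct connection connection;

/* Function pointers supplied by the upper layer (the HTTP loop). */
typedef struct {
    /* Process buffered input; may switch the connection to STATE_RESPONSE. */
    void (*on_data)(connection *);
    /* Populate the write buffer; may switch back to STATE_REQUEST,
     * or to STATE_END if the connection should be closed. */
    void (*on_write)(connection *);
} handlers;

struct connection {
    conn_state state;
    const handlers *h;
    int requests_handled;
};

/* One iteration of the event loop's per-connection work. */
void conn_tick(connection *c) {
    switch (c->state) {
    case STATE_REQUEST:  c->h->on_data(c);  break;
    case STATE_RESPONSE: c->h->on_write(c); break;
    case STATE_END:      break; /* closed by the loop afterwards */
    }
}

/* A toy handler pair: answer one request, then wait for the next,
 * demonstrating how a connection is reused across requests. */
static void demo_on_data(connection *c)  { c->state = STATE_RESPONSE; }
static void demo_on_write(connection *c) {
    c->requests_handled++;
    c->state = STATE_REQUEST;
}
static const handlers demo = { demo_on_data, demo_on_write };
```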
The event loop provides all the necessary building blocks needed to build a
client-server type application. These are then used to implement the next
layer: the HTTP loop.
## The HTTP loop
Before we can design a specific HTTP-based application, we need a base to build
on. This base is the HTTP loop. It handles both serializing and deserializing
of HTTP requests & responses, along with providing commonly used functionality,
such as bearer authentication and reading & writing files to & from disk. The
request parser is provided by the excellent
[picohttpparser](https://github.com/h2o/picohttpparser) library. The parsed
request is stored in the request's data struct, providing access to this data
for all necessary functions.
The HTTP loop defines a request handler function which is passed to the event
loop as the data handler. This function first tries to parse the request,
before routing it accordingly. For routing, literal string matches or
RegEx-based routing is available.
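A route match of either kind can be sketched with POSIX regular expressions (a hedged, minimal illustration; Lander's actual routing table looks different):

```c
#include <regex.h>
#include <string.h>

/* A route pattern: either a literal path or a POSIX extended regex. */
typedef struct {
    const char *pattern;
    int is_regex;
} route;

/* Return 1 if the request path matches the route, 0 otherwise. */
int route_matches(const route *r, const char *path) {
    if (!r->is_regex)
        return strcmp(r->pattern, path) == 0;

    regex_t re;
    if (regcomp(&re, r->pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0; /* treat an invalid pattern as a non-match */
    int ok = regexec(&re, path, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```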
Each route consists of one or more steps. Each of these steps is a function
that tries to advance the processing of the current request. The return value
of these steps tells the HTTP loop whether the step has finished its task or if
it's still waiting for I/O. The latter instructs the HTTP loop to skip this
request for now, delaying its processing until the next cycle of the HTTP loop.
In each cycle of the HTTP loop (or rather, the event loop), a request will try
to advance its processing by as much as possible by executing as many steps as
possible, in order. This means that very small requests can be completely
processed within a single cycle of the HTTP loop. Common functionality is
provided as predefined steps. One example is the `http_loop_step_body_to_buf`
step, which reads the request body into a buffer.
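The step mechanism described above can be sketched like this (hypothetical names; a simplified model of the idea rather than Lander's real code):

```c
#include <stddef.h>

/* What a step reports back to the HTTP loop. */
typedef enum { STEP_DONE, STEP_WAITING } step_result;

typedef struct request request;
typedef step_result (*step_fn)(request *);

struct request {
    const step_fn *steps; /* NULL-terminated list of steps for the route */
    size_t current;       /* index of the next step to run */
};

/* Advance a request as far as possible in this cycle: run steps in
 * order until one reports it is still waiting on I/O, or none remain. */
void request_advance(request *r) {
    while (r->steps[r->current] != NULL) {
        if (r->steps[r->current](r) == STEP_WAITING)
            return; /* retry this same step next cycle */
        r->current++;
    }
}

/* Two toy steps: parsing always succeeds immediately, while reading
 * the body blocks until (simulated) I/O becomes ready. */
static int io_ready = 0;
static step_result step_parse(request *r)     { (void)r; return STEP_DONE; }
static step_result step_read_body(request *r) {
    (void)r;
    return io_ready ? STEP_DONE : STEP_WAITING;
}
```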
The HTTP loop also provides the data writer functionality, which will stream an
HTTP response to the write buffer. The contents of the response are tracked in
the request's data struct, and these data structs are recycled between requests
using the same connection, preventing unnecessary allocations.
## Lander
Above the HTTP loop layer, we finally reach the code specific to Lander. It
might not surprise you that this layer is the smallest of the three, as the
abstractions below allow it to focus on the task at hand: serving and storing
HTTP redirects (and pastes). The way these are stored however is, in my
opinion, rather interesting.
For our Algorithms & Datastructures 3 course, we had to design three different
trie implementations in C: a Patricia trie, a ternary trie and a "custom" trie,
where we were allowed to experiment with different ideas. For those unfamiliar,
a trie is a tree-like data structure used for storing strings. The keys used in
this tree are the strings themselves, with each character causing the tree to
branch off. Each string is stored at depth `m`, with `m` being the length of
the string. This also means that the search depth of a string is not bounded by
the size of the trie, but rather the size of the string! This allows for
extremely fast lookup times for short keys, even if we have a large number of
entries.
My design ended up being a combination of both a Patricia and a ternary trie: a
ternary trie that supports skips the way a Patricia trie does. I ended up
taking this final design and modifying it for this project by optimising it (or
at least trying to) for shorter keys. This trie structure is stored completely in
memory, allowing for very low response times for redirects. Pastes are served
from disk, but their lookup is also performed using the same in-memory trie.
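For reference, a plain ternary trie (without the Patricia-style skips of the final design) looks roughly like this. The names and structure are my own simplified sketch, not Lander's implementation:

```c
#include <stdlib.h>

/* A ternary trie node: smaller/larger siblings branch left/right,
 * the next character of the key continues through `eq`. */
typedef struct tnode {
    char ch;
    struct tnode *lo, *eq, *hi;
    const char *value; /* non-NULL if a key ends at this node */
} tnode;

static tnode *node_new(char ch) {
    tnode *n = calloc(1, sizeof(tnode));
    n->ch = ch;
    return n;
}

/* Insert key -> value; search depth is bounded by the key's length,
 * not by the number of stored entries. */
tnode *trie_insert(tnode *n, const char *key, const char *value) {
    if (n == NULL) n = node_new(*key);
    if (*key < n->ch)        n->lo = trie_insert(n->lo, key, value);
    else if (*key > n->ch)   n->hi = trie_insert(n->hi, key, value);
    else if (key[1] != '\0') n->eq = trie_insert(n->eq, key + 1, value);
    else                     n->value = value;
    return n;
}

/* Look up a key; returns its value or NULL if absent. */
const char *trie_get(const tnode *n, const char *key) {
    while (n != NULL) {
        if (*key < n->ch)        n = n->lo;
        else if (*key > n->ch)   n = n->hi;
        else if (key[1] != '\0') { n = n->eq; key++; }
        else                     return n->value;
    }
    return NULL;
}
```

In a URL shortener, the keys would be the short codes and the values the redirect targets, so a lookup touches at most as many nodes as the code has characters (plus sibling hops).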
## What's next?
Hopefully the above explanation provides some insight into the inner workings
of Lander. For those interested, the source code is of course available
[here](https://git.rustybever.be/Chewing_Bever/lander). I'm not quite done with
this project though.
My current vision is to have Lander be my personal URL shortener, pastebin &
file-sharing service. Considering a pastebin is basically a file-sharing
service for text files specifically, I'd like to combine these into a single
concept. The goal is to rework the storage system to support arbitrarily large
files, and to allow storing generic metadata for each entry. The initial
use case for this metadata would be storing the content type for uploaded files,
allowing this header to be correctly served when retrieving the files. This
combined with supporting large files turns Lander into a WeTransfer
alternative! Besides this, password protection and expiration of pastes are on
my to-do list as well. The data structure currently doesn't support removing
elements either, so this would need to be added in order to support expiration.
Hopefully a follow-up post announcing these changes will come soon ;)