From 5c7675dfbe0d7ff93272ec9596d0d82f082fa98a Mon Sep 17 00:00:00 2001
From: Chewing_Bever
Date: Sat, 8 Jun 2024 21:08:06 +0200
Subject: [PATCH] rieter: add devlog-1 post

---
 content/dev/rieter/devlog-1.md | 78 ++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)
 create mode 100644 content/dev/rieter/devlog-1.md

diff --git a/content/dev/rieter/devlog-1.md b/content/dev/rieter/devlog-1.md
new file mode 100644
index 0000000..1a9f627
--- /dev/null
+++ b/content/dev/rieter/devlog-1.md
@@ -0,0 +1,78 @@
+---
+title: "Rethinking the Vieter project"
+date: 2024-06-08
+---
+
+I've been meaning to recreate my Vieter project for a while. The codebase is
+full of technical debt, and I've grown dissatisfied with the language it was
+originally written in. That's where the Rieter project comes in: a full
+reimagining and reimplementation of the core ideas of the project, in Rust. I
+am however following a different mindset this time around.
+
+My plan is to develop the project in two stages. The first stage involves
+creating a well-designed general-purpose repository server. This includes
+serving and storing packages, as well as providing a REST API and web UI to
+interact with the repository packages. In this stage I'll also add mirroring
+functionality to allow a Rieter server to automatically maintain a local copy
+of another repository. This could be used to easily create another mirror for a
+distribution's servers, or perhaps to create a local mirror for faster
+downloads.
+
+Once the first stage is finished, we have a solid foundation on which we can
+build the second stage: the build system. This will involve redesigning the
+agent-server architecture that's currently used in Vieter, with the goal of
+completely replacing Vieter in due time.
+
+This post is the first in a hopefully plentiful series of devlogs for this
+project where I'll document my progress along the way.
+
+## Current progress
+
+The implementation of the repository server itself is almost done. A user can
+publish, request and remove packages for any number of repositories and
+architectures. Repositories are then further grouped into distributions,
+allowing a single server to be used for multiple distributions if need be (e.g.
+I would have `arch` and `endeavouros` as distributions on my
+personal server). A package's information is added to the database, and this
+data is then exposed via a paginated REST API.
+
+The only real hurdle left for a first release is concurrency, which brings with
+it a couple of problems. With the current implementation, it's possible for
+concurrent uploads of packages to corrupt the repository. The generation of the
+package archives happens inside the request handler for each upload, meaning
+multiple requests do duplicate work and can cause CPU usage spikes.
+The parsing of packages is also done inside the request handler, which once
+again causes the server to spike in CPU usage if multiple packages are uploaded
+in parallel. These things combined make concurrent uploads of packages a rather
+painful problem to deal with.
+
+My solution for these problems consists of two parts. First, I want to add a
+queueing system for new packages. Instead of parsing the packages directly in
+the request handler, they would get added to a queue, with the server then
+responding with a [`202
+Accepted`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/202). The
+actual parsing of the packages would be done asynchronously by a configurable
+number of worker threads.
+
+The second part involves serializing and deferring the generation of the package
+archives until needed. Instead of actually generating the package archives for
+each uploaded package, we simply notify some central worker thread that the
+repository has been altered. This worker would then generate the package
+archives, after ensuring the queue is empty and no new packages have arrived in
+the last `n` seconds. This pattern accounts for groups of packages being
+uploaded at once without needlessly stressing the server.
+
+By implementing these features, the server should be able to handle a large
+number of package uploads without using excessive resources, ensuring Rieter
+can scale to proper sizes.
+
+## First release
+
+Once this is implemented, the codebase should be ready for a 0.1.0 release!
+This version will already be usable as a fully-fledged repository server on
+which I can then build the other parts of the first stage.
+
+For the 1.0 release, I'll be adding a web UI, as this was something that I
+sorely missed in Vieter. Perhaps most exciting of all, automatic mirroring
+will also be added, which I'm definitely looking forward to! I hope to publish
+another post here soon, but until then, thanks for reading.
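
The two-part design described under "Current progress" (a queue drained by a configurable number of parser threads, plus a single worker that regenerates the archives only after uploads go quiet) could be sketched roughly as below, using only the Rust standard library. This is not Rieter's actual code: `parse_package`, `spawn_parsers` and `run_archive_worker` are hypothetical names, package paths stand in for uploaded files, and the sketch assumes that a quiet notification channel is a good enough proxy for "the queue is empty". The returned batch sizes exist only to make the behaviour observable.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for the CPU-heavy parsing of an uploaded package.
fn parse_package(path: &str) -> String {
    format!("parsed:{path}")
}

/// Spawns `workers` threads that drain the upload queue and send one
/// notification per parsed package to the archive worker.
fn spawn_parsers(
    workers: usize,
    uploads: Receiver<String>,
    changed: Sender<String>,
) -> Vec<JoinHandle<()>> {
    // A std Receiver cannot be cloned, so the workers share it behind a mutex.
    let uploads = Arc::new(Mutex::new(uploads));
    (0..workers)
        .map(|_| {
            let uploads = Arc::clone(&uploads);
            let changed = changed.clone();
            thread::spawn(move || loop {
                // The lock is released at the end of this statement,
                // so parsing itself runs in parallel across workers.
                let job = uploads.lock().unwrap().recv();
                match job {
                    Ok(path) => {
                        if changed.send(parse_package(&path)).is_err() {
                            break; // archive worker is gone
                        }
                    }
                    Err(_) => break, // upload queue closed
                }
            })
        })
        .collect()
}

/// Waits for "repository altered" notifications and regenerates the
/// archives once per burst, after `quiet` has passed without a new one.
/// Returns how many packages each regeneration covered.
fn run_archive_worker(changed: Receiver<String>, quiet: Duration) -> Vec<usize> {
    let mut batches = Vec::new();
    // Block until the repository is altered; stop when all senders are gone.
    while let Ok(_first) = changed.recv() {
        let mut pending = 1;
        loop {
            match changed.recv_timeout(quiet) {
                // More activity: keep waiting for another quiet period.
                Ok(_) => pending += 1,
                // Quiet period elapsed, or every sender has shut down.
                Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
            }
        }
        // The package archives would be rebuilt here, once for the whole burst.
        batches.push(pending);
    }
    batches
}
```

With channels created via `std::sync::mpsc::channel()`, a request handler would only push the uploaded file's path into the upload queue and answer `202 Accepted`. Sending five uploads in one burst through `spawn_parsers(2, ...)` and then running `run_archive_worker` results in a single regeneration covering all five packages, rather than five separate ones.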