---
title: "Rethinking the Vieter project"
date: 2024-06-08
---

I've been meaning to recreate my Vieter project for a while. The codebase is
full of technical debt, and I've grown dissatisfied with the language it was
originally written in. That's where the Rieter project comes in: a full
reimagining and reimplementation of the core ideas of the project, in Rust.
This time around, however, I'm taking a different approach.

My plan is to develop the project in two stages. The first stage involves
creating a well-designed general-purpose repository server. This includes
serving and storing packages, as well as providing a REST API and web UI to
interact with the repository packages. In this stage I'll also add mirroring
functionality to allow a Rieter server to automatically maintain a local copy
of another repository. This could be used to easily create another mirror for a
distribution's servers, or perhaps to create a local mirror for faster
downloads.

Once the first stage is finished, we'll have a solid foundation on which we can
build the second stage: the build system. This will involve redesigning the
agent-server architecture that's currently used in Vieter, with the goal of
completely replacing Vieter in due time.

This post is the first in a hopefully plentiful series of devlogs for this
project, where I'll document my progress along the way.

## Current progress

The implementation of the repository server itself is almost done. A user can
publish, request and remove packages for any number of repositories and
architectures. Repositories are then further grouped into distributions,
allowing a single server to be used for multiple distributions if need be (e.g.
I would have `arch` and `endeavouros` as distributions on my personal server).
A package's information is added to the database, and this data is then exposed
via a paginated REST API.

The only real hurdle left for a first release is concurrency, which brings with
it a couple of problems. With the current implementation, it's possible for
concurrent uploads of packages to corrupt the repository. The generation of the
package archives happens inside the request handler for each upload, meaning
concurrent requests do duplicate work and can cause CPU usage spikes. The
parsing of packages is also done inside the request handler, which once again
causes the server to spike in CPU usage if multiple packages are uploaded in
parallel. These things combined make concurrent uploads of packages a rather
painful problem to deal with.

My solution for these problems consists of two parts. First, I want to add a
queueing system for new packages. Instead of parsing the packages directly in
the request handler, they would get added to a queue, with the server then
responding with a
[`202 Accepted`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/202).
The actual parsing of the packages would be done asynchronously by a
configurable number of worker threads.

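To make the idea concrete, here is a minimal sketch of such a queue using only
the Rust standard library. It is not Rieter's actual code: `ParseJob`,
`ParseQueue`, `enqueue` and `shutdown` are hypothetical names, the real server
would likely sit behind an async web framework, and the `parse` callback stands
in for whatever the package parser ends up looking like.

```rust
use std::path::PathBuf;
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical job type: an uploaded package file waiting to be parsed.
pub struct ParseJob {
    pub path: PathBuf,
}

/// Status code the upload handler would return once a job is queued.
pub const ACCEPTED: u16 = 202;

/// A fixed-size pool of worker threads draining one shared queue.
pub struct ParseQueue {
    sender: Option<Sender<ParseJob>>,
    workers: Vec<JoinHandle<()>>,
}

impl ParseQueue {
    /// Spawns `n_workers` threads that each run `parse` on queued jobs.
    pub fn new<F>(n_workers: usize, parse: F) -> Self
    where
        F: Fn(ParseJob) + Send + Sync + 'static,
    {
        assert!(n_workers > 0, "at least one worker is required");
        let (sender, receiver) = mpsc::channel::<ParseJob>();
        // std's Receiver is single-consumer, so workers share it behind a mutex.
        let receiver = Arc::new(Mutex::new(receiver));
        let parse = Arc::new(parse);

        let workers = (0..n_workers)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                let parse = Arc::clone(&parse);
                thread::spawn(move || loop {
                    // The lock is held only while waiting for the next job,
                    // not while parsing it.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => parse(job),
                        Err(_) => break, // queue closed and drained
                    }
                })
            })
            .collect();

        ParseQueue { sender: Some(sender), workers }
    }

    /// What an upload handler would call: enqueue and return immediately.
    pub fn enqueue(&self, job: ParseJob) -> u16 {
        self.sender
            .as_ref()
            .expect("queue is open")
            .send(job)
            .expect("workers are alive");
        ACCEPTED
    }

    /// Closes the queue and waits for the workers to finish pending jobs.
    pub fn shutdown(mut self) {
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            worker.join().unwrap();
        }
    }
}
```

In this sketch the request handler does nothing more than call `enqueue`, so
the number of packages being parsed at once is capped by `n_workers` no matter
how many uploads arrive in parallel.
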
The second part involves serializing and delaying the generation of the package
archives until needed. Instead of actually generating the package archives for
each uploaded package, we simply notify some central worker thread that the
repository has been altered. This worker would then generate the package
archives, after ensuring the queue is empty and no new packages have arrived in
the last `n` seconds. This pattern accounts for groups of packages being
uploaded at once without needlessly stressing the server.

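The "wait until things have been quiet for `n` seconds" part is essentially a
debounce, which could be sketched as follows. Again, this is an illustration
under assumptions rather than the real implementation: `spawn_debounced` and
the `regenerate` callback are made-up names, and the extra check that the
parse queue is empty is left out to keep the example short.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Spawns a single worker that calls `regenerate` only once notifications
/// have been quiet for `quiet`. Returns the notifier and the thread handle.
pub fn spawn_debounced<F>(quiet: Duration, mut regenerate: F) -> (Sender<()>, JoinHandle<()>)
where
    F: FnMut() + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<()>();
    let handle = thread::spawn(move || {
        // Sleep until the repository is altered for the first time.
        while rx.recv().is_ok() {
            // Absorb further notifications until a full quiet period passes.
            loop {
                match rx.recv_timeout(quiet) {
                    Ok(()) => continue, // another change: restart the timer
                    Err(RecvTimeoutError::Timeout) => break,
                    // All notifiers are gone: flush what we have, then exit.
                    Err(RecvTimeoutError::Disconnected) => break,
                }
            }
            // One regeneration covers the whole burst of changes.
            regenerate();
        }
    });
    (tx, handle)
}
```

Because there is only one such worker, archive generation is serialized by
construction, and a burst of uploads that each send a notification results in
a single call to `regenerate` instead of one per package.
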
By implementing these features, the server should be able to handle a large
number of package uploads without using excessive resources, ensuring Rieter
can scale to larger repositories.

## First release

Once this is implemented, the codebase should be ready for a 0.1.0 release!
This version will already be usable as a fully fledged repository server on
which I can then build the other parts of the first stage.

For the 1.0 release, I'll be adding a web UI, as this was something that I was
sorely missing from Vieter. Perhaps most exciting of all, automatic mirroring
will also be added, which I'm definitely looking forward to! I hope to publish
another post here soon, but until then, thanks for reading.