Add configurable cron schedule #116

Closed
opened 2022-04-08 11:50:13 +00:00 by Jef Roosens · 4 comments

Some people might want their packages to be rebuilt more or less frequently, or at a different time.

It might be useful to implement our own simple version of cron which, in essence, would check every minute or so whether it can start any builds. This might open up opportunities to allow granular configuration of build timings for specific repos.
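
As a very rough sketch, the configuration could then look something like this (the field names are made up purely for illustration, not actual Vieter config keys):

```v
// Illustrative only: a global default schedule with an optional per-repo
// override. How the schedule string itself is interpreted is a separate question.
struct Config {
	default_schedule string = '0 3 * * *'
}

struct Repo {
	id int
	// empty means "fall back to the default schedule"
	schedule string
}

fn schedule_for(conf Config, repo Repo) string {
	if repo.schedule != '' {
		return repo.schedule
	}
	return conf.default_schedule
}

fn main() {
	conf := Config{}
	println(schedule_for(conf, Repo{id: 1}))
	println(schedule_for(conf, Repo{id: 2, schedule: '0 */6 * * *'}))
}
```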

Jef Roosens added this to the 0.3.0 milestone 2022-04-08 11:50:13 +00:00
Jef Roosens added the enhancement label 2022-04-08 11:50:13 +00:00

Writing our own cron daemon could be done using the following:

  1. On startup, request all repos from the API. For each repo, we either check if it has a custom schedule or use the default one. We calculate the next time the job should be run, & place it in a priority queue, sorted by how close the next execution is.
  2. We look at the top of the queue & sleep until that event occurs.
  3. When we awake, we verify that we have indeed slept long enough and, if so, start the build.
  4. After the build has finished, we repeat from step 2.

At some point, the daemon should also update the queue with the new list of repos, to make sure we keep up to date with the settings.
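
A minimal sketch of that loop, with the API call & schedule calculation stubbed out and a plain array scan standing in for the priority queue, could look like this:

```v
import time

// Illustrative stand-in for one scheduled build; in the real daemon, next_run
// would be calculated from the repo's custom or default schedule.
struct ScheduledBuild {
	repo_id int
mut:
	next_run time.Time
}

fn main() {
	// 1. Normally requested from the API; hard-coded here.
	mut queue := [
		ScheduledBuild{repo_id: 1, next_run: time.now().add_seconds(5)},
		ScheduledBuild{repo_id: 2, next_run: time.now().add_seconds(2)},
	]

	for {
		// Find the entry that is due first (a linear scan instead of a real pqueue).
		mut idx := 0
		for i in 1 .. queue.len {
			if queue[i].next_run < queue[idx].next_run {
				idx = i
			}
		}

		// 2. Sleep until that build is due.
		if queue[idx].next_run > time.now() {
			time.sleep(queue[idx].next_run - time.now())
		}

		// 3. Verify we have indeed slept long enough before starting the build.
		if time.now() < queue[idx].next_run {
			continue
		}
		println('starting build for repo ${queue[idx].repo_id}')

		// 4. Re-schedule the repo & repeat from step 2 (interval hard-coded here).
		queue[idx].next_run = queue[idx].next_run.add_seconds(10)
	}
}
```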

Jef Roosens added reference cron 2022-04-09 07:51:08 +00:00
Jef Roosens removed reference cron 2022-04-12 10:16:50 +00:00

To schedule jobs & periodically update the API, I have the following algorithm in mind:

We add a configuration variable `api_update_frequency` or something that defines how often the API should be contacted for updates, in minutes.

  1. At startup, we contact the API & request the repositories. From these repositories we create an initial pqueue, sorted by how early their next scheduled task is.
  2. We look at the topmost task & sleep either until the next task is due, or until we have to refresh the API. If the next task should be run now, we start it.
  3. Once we awake, we check whether it's time to update the API. If it is, we create a new pqueue & add all tasks from the original pqueue that should already be started. Then, we calculate all the schedules for the new repos & add them to the pqueue.
  4. Repeat steps 2-4 until something stops the program.
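
Sketched out below (again with a plain array instead of a real pqueue, and with hypothetical `fetch_repos()`/`next_run()` helpers standing in for the actual API call & schedule calculation):

```v
import time

// Hypothetical value: api_update_frequency would come from the configuration.
const api_update_frequency = 15 // minutes

struct Task {
	repo_id int
	due     time.Time
}

// Stub for the real API call returning the current list of repos.
fn fetch_repos() []int {
	return [1, 2, 3]
}

// Stub for calculating a repo's next scheduled build.
fn next_run(repo_id int) time.Time {
	return time.now().add_seconds(60)
}

fn build_queue(repos []int) []Task {
	mut tasks := []Task{cap: repos.len}
	for id in repos {
		tasks << Task{
			repo_id: id
			due: next_run(id)
		}
	}
	return tasks
}

fn main() {
	// 1. Initial queue from the API.
	mut queue := build_queue(fetch_repos())
	mut next_refresh := time.now().add_seconds(api_update_frequency * 60)

	for {
		// 2. Sleep until either the earliest task or the API refresh is due.
		mut wakeup := next_refresh
		for task in queue {
			if task.due < wakeup {
				wakeup = task.due
			}
		}
		if wakeup > time.now() {
			time.sleep(wakeup - time.now())
		}

		// 3. On refresh: keep the tasks that are already due, recalculate the
		// rest from the fresh repo list & reset the refresh timer.
		if time.now() >= next_refresh {
			overdue := queue.filter(it.due <= time.now())
			queue = build_queue(fetch_repos())
			queue << overdue
			next_refresh = time.now().add_seconds(api_update_frequency * 60)
		}

		// Tasks that are due would be started (or handed to a build slot) here.
		// 4. Repeat.
	}
}
```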

I also had some ideas for implementing concurrent builds. At startup, we could store a fixed-length array of threads that starts out empty, with the length being defined by a `max_concurrent_builds` variable. When we wish to schedule a job, we search for the next slot in the array that is either empty or contains a thread that has already finished. If none are present, we sleep for a certain amount of time before trying again.
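
Roughly, the slot search could look like the sketch below; `slot_finished()` is only a placeholder here, because how to actually detect a finished thread is exactly the question in the next comments:

```v
import time

// Hypothetical values/names: max_concurrent_builds would come from the config,
// & slot_finished() is a placeholder for "has the thread in this slot finished?".
const max_concurrent_builds = 4

fn slot_finished(slot int) bool {
	return true
}

// Return the index of a usable slot, or -1 if all slots are still busy.
fn find_slot(occupied []bool) int {
	for i in 0 .. max_concurrent_builds {
		if !occupied[i] || slot_finished(i) {
			return i
		}
	}
	return -1
}

fn main() {
	mut occupied := []bool{len: max_concurrent_builds}
	mut slot := find_slot(occupied)
	for slot == -1 {
		// No slot available; sleep for a bit before trying again.
		time.sleep(5 * time.second)
		slot = find_slot(occupied)
	}
	occupied[slot] = true
	println('scheduling build in slot $slot')
	// go build(...) would be spawned here.
}
```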


Apparently there's no function for only checking whether a thread has finished; you can only wait. Perhaps we could use a shared object instead that each thread then updates whenever it's done with a build.

```v
import time
import sync.stdatomic

fn tlog(msg string) {
    println('${time.now().format_ss_micro()} | $msg')
}

fn worker(shared_u64_ptr &u64) {
    // Flag the shared value as soon as the worker exits, whatever happens.
    defer { stdatomic.store_u64(shared_u64_ptr, 1) }
    // Simulate a long-running build.
    time.sleep(500 * time.millisecond)
}

fn main() {
    x := u64(0)
    shared_u64_ptr := &x
    tlog('start')
    go worker(shared_u64_ptr)
    // Poll the flag instead of joining the thread.
    for {
        if stdatomic.load_u64(shared_u64_ptr) == 1 { break }
        tlog('still not finished, sleeping ...')
        time.sleep(100 * time.millisecond)
    }
    tlog('done')
}
```

[This snippet](https://ptb.discord.com/channels/592103645835821068/592294828432424960/963742760618303529), courtesy of spytheman#4818 in the vlang Discord, could solve our problems. Because it's a u64, we could use more than 2 values, so 0 could mean 'free', 1 means 'running' & 2 means 'done' or something.
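
Building on that, the fixed-length slot array from earlier could then just be an array of `u64` status values. A rough sketch (the names & the fake build are mine; only the stdatomic calls come from the snippet above):

```v
import time
import sync.stdatomic

const max_concurrent_builds = 4

// Per-slot status values.
const status_free = u64(0)
const status_running = u64(1)
const status_done = u64(2)

// Stand-in for an actual package build; marks its slot as done when it exits.
fn build(slot_ptr &u64, repo_id int) {
	defer {
		stdatomic.store_u64(slot_ptr, status_done)
	}
	println('building repo $repo_id')
	time.sleep(500 * time.millisecond)
}

fn main() {
	mut slots := []u64{len: max_concurrent_builds, init: status_free}

	for repo_id in 0 .. 10 {
		// Look for a slot that is free or whose previous build has finished.
		mut slot := -1
		for slot == -1 {
			for i in 0 .. max_concurrent_builds {
				if stdatomic.load_u64(&slots[i]) != status_running {
					slot = i
					break
				}
			}
			if slot == -1 {
				// Everything is still running; wait before scanning again.
				time.sleep(100 * time.millisecond)
			}
		}

		stdatomic.store_u64(&slots[slot], status_running)
		go build(&slots[slot], repo_id)
	}

	// Give the last builds a moment to finish; a real daemon would keep running.
	time.sleep(time.second)
}
```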

Jef Roosens added reference cron 2022-04-14 19:58:06 +00:00