Re: A CI job inventory
From: Timothe Litt <litt_at_acm.org>
Date: Mon, 7 Feb 2022 18:41:54 -0500
Agree with the thrust of these comments.
Perhaps, rather than adding metadata, have a utility that each CI job
runs to update a database at setup and/or exit.
This would also automate updating descriptions, etc., as well as
entering new jobs, without a separate process.
It could give you actual runtimes, fail counts, etc.
The mechanics might be a bit involved, but not difficult. The utility
would probably have to send its updates to a server, since most CI
environments don't provide persistent storage, and you'd want data from
all environments in one place anyhow.
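For illustration, a minimal sketch of such a reporter (Python, standard
library only; the collector endpoint, record fields, and environment
variable names are all invented):

    # ci_report.py - run once at job setup and once at job exit.
    # The endpoint and all field/variable names here are hypothetical.
    import json
    import os
    import sys
    import time
    import urllib.request

    ENDPOINT = "https://example.org/ci-jobs/report"  # hypothetical collector

    def report(phase, status=None):
        record = {
            "ci": os.environ.get("CI_NAME", "unknown"),      # which CI service
            "job": os.environ.get("CI_JOB_NAME", "unknown"), # which job
            "phase": phase,                                  # "setup" or "exit"
            "status": status,                                # e.g. "pass"/"fail"
            "time": time.time(),
        }
        req = urllib.request.Request(
            ENDPOINT,
            data=json.dumps(record).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req, timeout=10)

    if __name__ == "__main__":
        # usage: python ci_report.py setup
        #        python ci_report.py exit pass
        report(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else None)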
Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.
On 07-Feb-22 18:07, Dan Fandrich via curl-library wrote:
> On Mon, Feb 07, 2022 at 11:10:39PM +0100, Daniel Stenberg via curl-library wrote:
>> In order to get a better overview and control of the jobs we run, I'm
>> proposing that we create and maintain a single file that lists all the jobs
>> we run. This "database" of jobs could then be used to run checks against,
>> and maybe to generate some tables or charts to help us make sure our CI
>> jobs really cover as many build combinations as possible, and perhaps it
>> can help us reduce duplication or too-similar builds.
> I suspect we will be able to count the time in hours before such a list
> diverges from the actual CI jobs being run because somebody forgot to update
> the master list properly. Such a list will be pure duplication of information
> already found in the CI configuration files, too. I would rather treat the CI
> files as the sources of truth and derive a dashboard by parsing those instead,
> to show the jobs that are *actually* being run. The downside to that, of
> course, is that you'd need to write code to parse 6 different CI configuration
> file formats, but the significant benefit is that you could always trust the
> dashboard.
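As a rough illustration of the parse-the-configs approach, here is a
sketch that lists the jobs defined in GitHub Actions workflows, one of
the several formats involved (Python with PyYAML; a sketch only, not
curl's actual tooling):

    # list the jobs actually defined in GitHub Actions workflow files
    import glob
    import yaml  # PyYAML

    def github_actions_jobs(repo_root="."):
        jobs = []
        for path in sorted(glob.glob(repo_root + "/.github/workflows/*.yml")):
            with open(path) as f:
                workflow = yaml.safe_load(f)
            if not isinstance(workflow, dict):
                continue
            for job_id, job in (workflow.get("jobs") or {}).items():
                # one job entry may fan out into many runs via a matrix
                matrix = (job.get("strategy") or {}).get("matrix")
                jobs.append((path, job_id, job.get("name", job_id), bool(matrix)))
        return jobs

    for path, job_id, name, has_matrix in github_actions_jobs():
        print(path, job_id, name, "[matrix]" if has_matrix else "")

The other five formats would each need their own reader, which is the
cost Dan mentions.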
>
> Another approach would be to add metadata to the different CI configuration
> files that the dashboard could read from each file in a consistent format, such
> as a specially-formatted comment, structured job title, and/or special
> environment variable definition. That makes parsing easier, but it means that
> people would need to remember to update the metadata when they update or add a
> job. The metadata could still fall out of date for that reason, but it's less
> likely to happen than with a separate, central job registry because the
> metadata will always be found along with the job configuration. It should also
> be relatively easy to at least count the number of jobs defined in each CI
> configuration file and flag those without a special metadata line (catching new
> uncategorised jobs).
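A sketch of that flagging step, assuming the metadata lives in a magic
"env" variable (the name CURL_CI_META is invented for illustration):

    # flag GitHub Actions jobs that carry no metadata variable
    import yaml  # PyYAML

    def uncategorised_jobs(workflow_path):
        with open(workflow_path) as f:
            workflow = yaml.safe_load(f) or {}
        missing = []
        for job_id, job in (workflow.get("jobs") or {}).items():
            env = job.get("env") or {}
            if "CURL_CI_META" not in env:  # hypothetical magic variable
                missing.append(job_id)
        return missing

    print(uncategorised_jobs(".github/workflows/ci.yml"))  # example path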
>
> Maybe a hybrid approach is best: read and parse as much job data as
> practical from the job name and "env" section of each CI configuration file
> (which should be pretty simple and stable to retrieve), and supplement that
> with additional data from a structured comment (or magic "env" variable), where
> necessary.
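A hybrid reader might look roughly like this (the "# ci-meta:" comment
syntax is made up; the job id, name, and env come straight from the
YAML):

    # merge what the YAML gives (job id, name, env) with any structured
    # "# ci-meta:" comments found in the raw file text
    import json
    import re
    import yaml  # PyYAML

    META_RE = re.compile(r"#\s*ci-meta:\s*(\{.*\})")

    def job_inventory(path):
        with open(path) as f:
            text = f.read()
        extra = {}
        for m in META_RE.finditer(text):
            extra.update(json.loads(m.group(1)))  # e.g. {"tls": "openssl"}
        workflow = yaml.safe_load(text) or {}
        inventory = []
        for job_id, job in (workflow.get("jobs") or {}).items():
            record = {"id": job_id, "name": job.get("name", job_id)}
            record.update(job.get("env") or {})
            record.update(extra)  # file-level metadata applied to each job
            inventory.append(record)
        return inventory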
>
> The only way I'd advocate for a new central job description file is if it could
> be used to mechanically generate the CI job files. That would mean there would
> be only one source of truth, but this approach would also be pretty impractical
> due to the complexity of many job configurations and the need to write 6
> different configuration file formats.
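For completeness, the generation idea in miniature, which also hints at
where it breaks down once jobs stop being this uniform (the central-file
schema is invented):

    # emit a GitHub Actions workflow from a central job description
    import yaml  # PyYAML

    central = {"jobs": [
        {"id": "openssl-debug", "os": "ubuntu-latest",
         "configure": "--with-openssl --enable-debug"},
    ]}

    def to_github_actions(central):
        jobs = {}
        for j in central["jobs"]:
            jobs[j["id"]] = {
                "runs-on": j["os"],
                "steps": [
                    {"uses": "actions/checkout@v4"},
                    {"run": "./configure %s && make && make test" % j["configure"]},
                ],
            }
        return yaml.safe_dump({"name": "CI", "on": ["push"], "jobs": jobs})

    print(to_github_actions(central))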
>
> Dan
Received on 2022-02-08