The winning strategy for all CI environments is a build system facsimile that works on your machine, your CI's machine, and your test/uat/production with as few changes between them as your project requirements demand.
I start with a Makefile. The Makefile drives everything. Docker (compose), CI build steps, linting, and more. Sometimes a project outgrows it; other times it does not.
But it starts with one unitary tool for triggering work.
The problem isn't CI/CD; the problem is "programming in configuration". We've somehow normalized a dev loop that involves `git commit -m "try fix"`, waiting 10 minutes, and repeating. Local reproduction of CI environments is still the missing link for most teams.
I tend to disagree with this as it seems like an ad for Nix/Buildkite...
If your CI invocations are anything more than running a script or a target on a build tool (make, etc.) where the real build/test steps exist and can be run locally on a dev workstation, you're making the CI system much more complex than it needs to be.
CI jobs should at most provide an environment and configuration (credentials, endpoints, etc.), as a dev would do locally.
This also makes your code CI agnostic - going between systems is fairly trivial as they contain minimal logic, just command invocations.
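As a rough sketch of that idea (not any particular project's setup), the real steps can live in one small runner that a developer and a CI job call the same way. The task names and the commands in `TASKS` are placeholders I made up for illustration; a Makefile target would serve the same purpose.

```python
import subprocess
import sys

# Hypothetical task table: the commands are placeholders for a project's real steps.
TASKS = {
    "lint": [["python", "-m", "compileall", "-q", "src"]],
    "test": [["python", "-m", "pytest", "-q"]],
    "ci": [
        ["python", "-m", "compileall", "-q", "src"],
        ["python", "-m", "pytest", "-q"],
    ],
}


def run_task(name, tasks=TASKS, runner=subprocess.call):
    """Run each command of a task in order and stop at the first failure.

    Returns 0 if every command succeeds, the failing command's exit code
    otherwise, or 2 if the task name is unknown.
    """
    if name not in tasks:
        print(f"unknown task: {name}", file=sys.stderr)
        return 2
    for cmd in tasks[name]:
        print("+ " + " ".join(cmd), flush=True)
        code = runner(cmd)
        if code != 0:
            return code
    return 0


def main(argv):
    """Entry point: the first argument selects the task, defaulting to 'ci'."""
    return run_task(argv[0] if argv else "ci")
```

With `sys.exit(main(sys.argv[1:]))` at the bottom of the file, the CI config shrinks to a single command invocation plus environment and credentials, which is what makes switching CI systems cheap.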
Killing engineering teams? Hyperbolic thread titles need to be killed. I find GitHub Actions to be just fine. I prefer it to Bitbucket and GitLab.
Good place to ask: I'm not comfortable with NPM-style `uses: randomAuthor/some-normal-action@1` for actions that should be included by default, like bumping version tags or uploading a file to the releases.
What's the accepted way to copy these into your own repo so you can make sure attackers won't update the script to leak my private repo and steal my `GITHUB_TOKEN`?
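One commonly recommended mitigation, short of forking or vendoring the action, is to pin every third-party `uses:` reference to a full 40-character commit SHA instead of a mutable tag like `@1`. Below is a minimal sketch of a check for that, assuming unquoted `uses:` values on a single line; the function name and the regex are my own, not part of any GitHub tooling.

```python
import re

# Matches `uses: owner/repo@ref` (optionally as a list item) on one line.
# Assumption: the value is not quoted and sits on the same line as `uses:`.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^@\s]+)@([^\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return (line_number, action, ref) for each `uses:` not pinned to a full SHA."""
    findings = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if not match:
            continue  # no `uses:` with an @ref here (local ./ actions have none)
        action, ref = match.groups()
        if action.startswith("docker://"):
            continue  # container image references are out of scope for this sketch
        if not FULL_SHA_RE.match(ref):
            findings.append((lineno, action, ref))
    return findings
```

Running this over the files in `.github/workflows/` as a lint step would flag `randomAuthor/some-normal-action@1` while accepting a reference pinned to a commit hash. A pinned SHA only protects against the tag being moved later; someone still has to read the code at that commit.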
Dead on. GitHub Actions is the worst CI tool I’ve ever used (maybe tied with Jenkins) and Buildkite is the best. Buildkite’s dynamic pipelines (the last item in the post) are so amazingly useful you’ll wonder how you ever did without them. You can do super cool things like have your unit test step spawn a test de-flaking step only if a test fails. Or control test parallelism based on the code changes you’re testing.
All of that on top of a rock-solid system for bringing your own runner pools which lets you use totally different machine types and configurations for each type of CI job.
Highly, highly recommend.
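To make the de-flaking idea concrete, here is a minimal sketch of a script the unit test step could run after collecting failures. It assumes the pipeline definition may be given as JSON with `steps`, `label`, and `command` keys; the `./ci/rerun-tests.sh` script and the test names are hypothetical.

```python
import json
import shlex


def deflake_pipeline(failed_tests):
    """Build a follow-up pipeline that re-runs only the failed tests.

    Returns a JSON string describing one extra step, or None when nothing
    failed, so the caller can skip uploading anything.
    """
    if not failed_tests:
        return None
    # Hypothetical retry script; shlex.join quotes test names safely for the shell.
    command = shlex.join(["./ci/rerun-tests.sh", *sorted(failed_tests)])
    step = {"label": "De-flake failed tests", "command": command}
    return json.dumps({"steps": [step]}, indent=2)
```

The test step would then pipe the result into `buildkite-agent pipeline upload` only when the function returns something, so a green run never schedules the extra step.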
Github being less and less reliable nowadays just makes this more true.
In the past week I have seen:
- actions/checkout inexplicably failing, sometimes succeeding on 3rd retry (of the built-in retry logic)
- release ci jobs scheduling _twice_, causing failures, because ofc the release already exists
- jobs just not scheduling. Sometimes for 40m.
I have been using it actively for a few years and putting aside everything the author is saying, just the base reliability is going downhill.
I guess Zig was right. Too bad they missed Buildkite; Codeberg hasn't been that reliable or fast in my experience.
Ian Duncan, I was imagining you on a stage delivering this as a standup comedy show on Netflix.
My pet peeve with Github Actions was that if I want to do simple things like make a "release", I have to Google for and install packages from internet randos. Yes, it is possible this rando1234 is a founding github employee and it is all safe. But why does something so basic need external JS? packages?
Things I dislike about GHA (on Enterprise Server)
* Workflows are only registered once pushed to main, impossible to test the first runs in a branch.
* MS/GH don't care as much about GHES as they do about github.com; I think they'd like to see it just die. Massive lack of feature parity.
* Labels: If any of your workflows trigger from a label, they ALL DO. You can't target labels only to certain workflows, they all run and then cancel, polluting your checks.
* Deployments: What is a deployment even doing? There is no management to deploy.
* Statefulness: No native way to store state between runs in the same workflow or PR, you would think you could save some sort of state somewhere but you have to manage it all yourself with manifests or something else.
I can go on
The log viewer thing is what baffles me most.
Back in... I don't know, 2010, we used Jenkins. Yes, that Java thingy. It was kind of terrible (like every CI), but it had a "Warnings Plugin". It parsed the log output with regular expressions and presented new warnings and errors in a nice table. You could click on them and it would jump to the source. You could configure your own regular expressions (yes, then you have two problems, I know, but it still worked).
Then I had to switch to GitLab CI. Everyone was gushing how great GitLab CI was compared to Jenkins. I tried to find out: how do I extract warnings and errors from the log - no chance. To this day, I cannot understand how everyone just settled on "Yeah, we just open thousands of lines of log output and scroll until we see the error". Like an animal. So of course, I did what anyone would do: write a little script that parses the logs and generates an HTML artifact. It's still not as good as the Warnings Plugin from Jenkins, but hey, it's something...
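For anyone who wants to try the same thing, a minimal sketch of such a script might look like the following. It assumes compiler-style diagnostics of the form `file:line[:col]: warning|error: message`; the regex and function names are illustrative and are not the script described above.

```python
import html
import re

# Assumed log format, e.g.:  src/main.c:42:7: warning: unused variable 'x'
DIAG_RE = re.compile(
    r"^(?P<file>[^\s:]+):(?P<line>\d+)(?::(?P<col>\d+))?:\s*"
    r"(?P<level>warning|error):\s*(?P<msg>.*)$"
)


def extract_diagnostics(log_text):
    """Return a list of dicts for every warning or error line in the log."""
    found = []
    for raw in log_text.splitlines():
        match = DIAG_RE.match(raw.strip())
        if match:
            found.append({
                "file": match["file"],
                "line": int(match["line"]),
                "level": match["level"],
                "msg": match["msg"],
            })
    return found


def to_html(diagnostics):
    """Render the diagnostics as a plain HTML table, escaping all log text."""
    rows = "\n".join(
        "<tr><td>{}</td><td>{}:{}</td><td>{}</td></tr>".format(
            html.escape(d["level"]), html.escape(d["file"]), d["line"],
            html.escape(d["msg"]),
        )
        for d in diagnostics
    )
    header = "<tr><th>Level</th><th>Location</th><th>Message</th></tr>"
    return "<table>\n" + header + "\n" + rows + "\n</table>"
```

Writing the result of `to_html(extract_diagnostics(log))` to a file and publishing it as a job artifact gives a table of warnings and errors, though without the jump-to-source links the Jenkins plugin had.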
I'm sure, eventually someone/AI will figure this out again and everyone will gush how great that new thing is that actually parses the logs and lets you jump directly to the source...
Don't get me wrong: Jenkins was and probably still is horrible. I don't want to go back. However, it had some pretty good features I still miss to this day.
Pretty sure someone at MS told me that Actions was rewritten by the team who wrote Azure DevOps. So bureaucracy would be a feature.
That aside, GH Actions doesn’t seem any worse than GitLab. I forget why I stopped using CircleCI. Price maybe? I do remember liking the feature where you could enter the console of the CI job and run commands. That was awesome.
I agree though that yaml is not ideal.
Agreed with absolutely all of this. Really well written. Right now at work we're getting along fine with Actions + WarpBuild but if/when things start getting annoying I'm going to switch us over to Buildkite, which I've used before and greatly enjoyed.
I agree with all the points made about GH actions.
I haven't used as many CI systems as the author, but I've used GH Actions, GitLab CI, and CodeBuild, and spent a lot of time with Jenkins.
I've only touched Buildkite briefly 6 years ago, at the time it seemed a little underwhelming.
The CI system I enjoyed the most was TeamCity, sadly I've only used it at one job for about a year, but it felt like something built by a competent team.
I'm curious what people who have used it over a longer time period think of it.
I feel like it should be more popular.
I hope the author will check out RWX -- they say they've checked out most CI systems, but I don't think they've tried us out yet. We have everything they praise Buildkite for, except for managing your own compute (and that's coming, soon!). But we also built our own container execution model with CI specifically in mind. We've seen one too many Buildkite pipelines that have a 10 minute Docker build up front (!) and then have to pull a huge docker container across 40 parallel steps, and the overhead is enormous.
> GitHub Actions is not good. It’s not even fine. It has market share because it’s right there in your repo
Microsoft being microsoft I guess. Making computing progressively less and less delightful because your boss sees their buggy crap is right there so why don't you use it
> But Everyone Uses It!
All of my customers are on bitbucket.
One of them does not even use a CI. We run tests locally and we deploy from a self hosted TeamCity instance. It's a Django app with server side HTML generation so the deploy is copying files to the server and a restart. We implemented a Capistrano alike system in bash and it's been working since before Covid. No problems.
The other one uses Bitbucket Pipelines to run tests after git pushes on the branches for preproduction and production and to deploy to those systems. They use Capistrano because it's a Rails app (with a Vue frontend). For some reason the integration tests don't run reliably on either the CI instances or Macs, so we run them only on my Linux laptop. It's been in production since 2021.
A customer I'm not working with anymore used Travis, and another one used something I don't remember. That one also ran a build there because they were using Elixir with Phoenix, so we were creating a release and deploying it. No mere file copying. That was the most unpleasant deploy system of the bunch. A lot of wasted time from a push to a deploy.
In all of those cases logs are inevitably long but they don't crash the browser.
At which point did someone force OP to use GH Actions ?
It's fantastic for simple jobs, I use it for my hobbyist projects because I just need 20 to 30 lines to build and deploy a web build.
Just because a bike isn't good for traveling in freezing weather doesn't mean no one should own a bike.
Pick the right tool for the job.
Plus CI/CD is the boring part. I always imagined GH Actions as a quick and somewhat sloppy solution for hobbyist projects.
Not for anything serious.
Pour one out for the memory of CruiseControl, the OG (?) granddaddy of all CI systems in the form we would recognise them today.
What I find hardest about CI offerings is that each one has a unique DSL that inevitably has edge cases that you may only find out once you’ve tried it.
You might face that many times using GitLab CI. Random things don’t work the way you think they should, and the worst part is you must learn their stupid custom DSL.
Not only that, there’s no way to debug the maze of CI pipelines but I imagine it’s a hard thing to achieve. How would I be able to locally run CI that also interacts with other projects CI like calling downstream pipelines?
This is roughly how I feel about cloudformation. May we please have terraform back? Ansible, even?
Personally I like Drone more than Buildkite. It's as close to a perfect CI system as I've seen; just complex enough to do everything I need, with a design so stripped-down it can't be simpler. I occasionally check on WoodpeckerCI to see if it's reached parity with Drone. Now that AI coding is a thing, hopefully that'll happen soon
I really wonder in which universe people are living. GitHub Actions was a godsend when it was first released and it still continues to be great. It has just the right amount of abstractions. I've used many CIs in the past and I'd definitely prefer GA over any of them.
Nice write up, but wondering now what nix proposes in that space.
I've never used nix or nixos but a quick search led me to nixops, and then realized v4 is entirely being rewritten in rust.
I'm surprised they chose rust for glue code, and not a more dynamic and expressive language that could make things less rigid and easier to amend.
In the clojure world BigConfig [0], which I never used, would be my next stop in the build/integrate/deploy story, regardless of tech stack. It integrates workflow and templating with the full power of a dynamic language to compose various setups, from dot/yaml/tf/etc files to ops control planes (see their blog).
I don't care if this is an advertisement for buildkite masquerading as a blog post or if this is just an honest rant. Either way, I gotta say it speaks a lot of truth.
nods. nods again. Yep, this is exactly why we left GitHub for GitLab two years ago. Not one moment of regret.
Still, I wonder who is still looking manually at CI build logs. You can use an agent to look for you, and immediately let it come up with a fix.
GHA is quite empowering for solo devs. I just dev on my tiny machine and outsource all heavy work to GHA, and basically let Claude rip on the errors, rinse repeat.
> I have mass-tested these systems so that you don’t have to, and I have the scars to show for it, and I am here to tell you: GitHub Actions is not good.
> Every CI system eventually becomes “a bunch of YAML.” I’ve been through the five stages of grief about it and emerged on the other side, diminished but functional.
> I understand the appeal. I have felt it myself, late at night, after the fourth failed workflow run in a row. The desire to burn down the YAML temple and return to the simple honest earth of #!/bin/bash and set -euo pipefail. To cast off the chains of marketplace actions and reusable workflows and just write the damn commands. It feels like liberation. It is not.
Ah yes, misery loves company! There's nothing like a good rant (preferably about a technology you have to use too, although you hate its guts) to brighten up your Friday...
That if anything was a fun read, explains why I’ve always heard that GitHub actions were only good for personal projects
I have not had this experience. It sounds like a bad process rather than being GitHubs fault. I’ve always had GitHub actions double checking the same checks I run locally before pushing.
I just can't stand using a build system tied to the code host. And that is really because I have an aversion to vendor lock-in.
webhooks to an external system was such a better way to do it, and somehow we got away from that, because they don't want us to leave.
webhooks are to podcasts as github actions are to the things that spotify calls podcasts.
I don't have much experience with GitHub Actions, but I'll say this does sound worse than Azure DevOps, which I did not imagine was possible. I've never liked any CI system, but ADO must be one of the lower circles of hell.
I think Github Actions is just a lead for Microsoft customers to use paid Azure DevOps. It is bad intentionally.
We started using Buildkite at $DAYJOB years ago and haven't looked back. Incredibly, GitHub Actions seems to have gotten _worse_ in the interim. Absolutely no regrets from switching.
I matured as an Engineer using various CI tools and discovering hands-on that these tools are so unreliable (pipes often failing inconsistently). I am surprised to find that there are better systems, and I'd like to learn more.
I agree with the gripes, but buildkite is not the answer
If I cannot fully self host an open source project, it is not a contender for my next ci system
I was excited for actions because it was “next to” my source code.
I (tend to) complain about actions because I use them.
Open to someone telling me there is a perfect solution out there. But today my actions fixes were not actions related. Just maintenance.
Is it great? No. Is it usually good enough? Yes. CI shouldn’t be a main quest for most engineers. Just get it rolling early and adjust as needed.
I'll be that guy.
For what boils down to a personal take, light on technicalities, this reads like an uncannily impersonal, prolonged attempt at dramatic writing.
If you believe the dates in this blog, it's totally different in tone, style, and wording to a safely distant 2021 post (https://www.iankduncan.com/personal/2021-10-04-garbage-in-ne...).
It made me feel paranoid just in about three paragraphs. I apologize to the author if I'm wrong but we all understand what my gut tells me.
I think this author would benefit from using the Refined GitHub browser extension, which fixes a lot of these problems.
RA the specified array and query polkit prior to k-mod in o-space. Xenosystem upload
#git --clone [URL]
Declarative (a la bazel and garnix) is obviously the way to go, but we're still living in the s̶t̶o̶n̶e̶ YAML age.
> If you’re a small team with a simple app and straightforward tests, it’s probably fine. I’m not going to tell you to rip it out.
> But if you’re running a real production system, if you have a monorepo, if your builds take more than five minutes, if you care about supply chain security, if you want to actually own your CI: look at Buildkite.
Goes in line with exactly what I said in 2020 [0] about GitHub vs self-hosting. Not a big deal for individuals, but for large businesses it's a problem if you can't push that critical change because your CI is down every week.
The internet makes me feel like the only person that doesn't mind Jenkins. Idk it just gets the job done ime.
> You’ve upgraded the engine but you’re still driving the car that catches fire when you turn on the radio.
And fixing the pyro-radio bug will bring other issues, for sure, so they won't, because someone's workflow will rely on the fact that turning on the radio sets the car on fire: https://xkcd.com/1172/
I think we can honestly remove the word Actions in the headline and still agree.
It used to be fast ish!
Now it's full ugh.
Happy user of GitLab CI here.
I see the appeal of GitHub for sharing open source - the interface is so much cleaner and easier to find all you are looking for (GitLab could improve there).
But for CI/CD GitHub doesn’t even come close to GitLab in the usability department, and that’s before we even talk about pricing and the free tiers. People need to give it a try and see what they are missing.
“Microsoft is where ambitious developer tools go to become enterprise SKUs“
It’s hard to remember, sometimes, that Microsoft was one of the little gadflies that buzzed around annoying the Big Guys.
I hate to say this. I can't even believe I am saying it, but this article feels like it was written in a different universe where LLMs don't exist. I understand they don't magically solve all of these problems, and I'm not suggesting that it's as simple as "make the robot do it for you" either.
However, there are very real things LLMs can do that greatly reduce the pain here. Understanding 800 lines of bash is simply not the bogeyman it was a few years ago. It completely fits in context. LLMs are excellent at bash. With a bit of critical thinking when they hit a wall, LLM agents are even great at GitHub Actions.
The scariest thing about this article is the number of things it's right about. Yet my uncharacteristic response to that is one big shrug, because frankly I'm not afraid of it anymore. This stuff has never been hard, or maybe it has. Maybe it still is for people/companies who have super complex needs. I guess we're not them. LLMs are not solving my most complex problems, but they're killing the pain of glue left and right.
> this is a product made by one of the richest companies on earth.
nit: no, it was made by a group of engineers that loved git and wanted to make a distributed remote git repository. But it was acquired/bought out then subsequently enshittified by the richest/worst company on earth.
Otherwise the rest of this piece vibes with me.
I've used many of the CI systems that the author has here, and I've done a lot of CircleCI and GitHub Actions, and I don't come to quite the same conclusions. One caveat though, I haven't used Buildkite, which the author seems to recommend.
Over the years CI tools have gone from specialist to generalist. Jenkins was originally very good at building Java projects and not much else, Travis had explicit steps for Rails projects, CircleCI was similarly like this back in the day.
This was a dead end. CI is not special. We realised as a community that in fact CI jobs were varied, that encoding knowledge of the web framework or even language into the CI system was a bad idea, and CI systems became _general workflow orchestrators_, with some logging and pass/fail UI slapped on top. This was a good thing!
I orchestrated a move off CircleCI 2 to GitHub Actions, precisely because CircleCI botched the migration from the specialist to generalist model, and we were unable to express a performant and correct CI system in their model at the time. We could express it with GHA.
GHA is not without its faults by any stretch, but... the log browser? So what, just download the file, at least the CI works. The YAML? So it's not-quite-yaml, they weren't the first or last to put additional semantics on a config format, all CI systems have idiosyncrasies. Plugins being Docker images? Maybe heavyweight, but honestly this isn't a bad UX.
What does matter? Owning your compute? Yeah! This is an important one, but you can do that on all the major CI systems, it's not a differentiator. Dynamic pipelines? That's really neat, and a good reason to pick Buildkite.
My takeaway from my experience with these platforms is that Actions is _pretty good_ in the ways that truly matter, and not a problem in most other ways. If I were starting a company I'd probably choose Buildkite, sure, but for my open source projects, Actions is good.