1. Don't use bash, use a scripting language that is more CI friendly. I strongly prefer pwsh.
2. Don't have logic in your workflows. Workflows should be dumb and simple (KISS) and they should call your scripts.
3. Having standalone scripts will allow you to develop/modify and test locally without having to get caught in a loop of hell.
4. Design your entire CI pipeline for easier debugging: put that print statement in, echo out the version of whatever (see the sketch after this list). You don't need it _now_, but your future self will thank you when you do need it.
5. Consider using third-party runners that have better debugging capabilities.
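To make points 2 and 4 concrete, here is a minimal sketch of what such a "dumb" workflow could look like; the `ci/build.ps1` path is just a placeholder for whatever script actually holds the logic:

```yaml
# Minimal sketch: the workflow stays dumb, all logic lives in a script you can run locally.
name: build
on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Point 4: echo tool versions up front so failures are easier to debug later.
      - run: pwsh --version
      # Point 2: the workflow just calls a script; no logic in the YAML itself.
      - run: pwsh ./ci/build.ps1
```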
It's not GitHub Actions' fault but the horrors people create in it, all under the pretense that automation is simply about wrapping a GitHub Action around something. Learn to create a script in Python or similar and put all the logic there, so you can execute it locally and port it to the next CI system when a new CTO arrives.
The way I deal with all these terrible CI platforms (there is no good one, merely lesser evils) is to do my entire CI process in a container and the CI tool just pulls and runs that. You can trivially run this locally when needed.
Of course, the platforms would rather have you not do that since it nullifies their vendor lock-in.
I like Github Actions and it is better than what I used before (Travis) and I think it solves an important problem. For OSS projects it's a super valuable free resource.
For me what worked wonders was adopting Nix. Make sure you have a reproducible dev environment and wrap your commands in `nix-shell --run`, or even better `nix develop --command`, or even better your most of your CI tasks derivations that run with `nix build` or `nix flake check`.
Not only does this make it super easy to work with Github Actions, also with your colleagues or other contributors.
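A rough sketch of what that can look like in a workflow, assuming the repo has a flake; the installer action shown is the commonly used community one (pin whatever release is current), and `ci/test.sh` is a placeholder:

```yaml
# Sketch: Nix provides the environment, the workflow just runs commands inside it.
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: cachix/install-nix-action@v27   # pin to whatever release is current
        with:
          extra_nix_config: "experimental-features = nix-command flakes"
      # The same command works locally and in CI because the flake pins the toolchain.
      - run: nix develop --command ./ci/test.sh
      # Or push the whole task into a derivation:
      - run: nix flake check
```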
I wrote a reply to someone here, but decided to make a top-level comment as well.
Note that I don't really use GitHub Actions much, but I have heard about its architecture.
From my understanding, a GitHub Actions workflow should just be a call to some bash or Python file. Bash has its issues, so I prefer Python.
I recommend taking a look at https://paulw.tokyo/standalone-python-script-with-uv/ and please tell me whether something like this might be perfect for Python scripts in GitHub Actions: the script automatically installs uv, pulls in all the Python dependencies (and, I think, even the runtime), and then executes the Python code, all while staying very manageable, and it can run locally as well.
The only issue I can see is "why go with something this complex when bash exists", or the performance cost of installing uv, but considering it's GitHub Actions, I feel the latter is ruled out.
Bash is good as well, but it has some severe limitations, and I feel Python is a good fit for something like this: its ecosystem is more mature, and you could even create web servers, have logs reported to your own server, or automate basically everything.
To me this script feels like the best of both worlds and something genuinely sane to build upon.
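As a hedged sketch of how that could look in a workflow: `ci/task.py` is a placeholder for a script that declares its dependencies as inline metadata (as in the linked post), and the install command is uv's documented installer script; a dedicated setup action for uv also exists if you prefer that:

```yaml
# Sketch: the workflow only bootstraps uv and hands off to a self-contained Python script.
jobs:
  task:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Install uv and put it on PATH for later steps.
      - run: |
          curl -LsSf https://astral.sh/uv/install.sh | sh
          echo "$HOME/.local/bin" >> "$GITHUB_PATH"
      # ci/task.py carries its dependencies as inline metadata, so this also runs locally.
      - run: uv run ci/task.py
```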
The main problem with Actions is the way they advertise its usage: "just put workflows together" is a horrible and non-debuggable way to do things. But even in the tech itself, caching is pretty stingy, which can slow dev builds for fairly simple projects, because every run will repeat some common work unless you have the cache perfectly configured (did you cover npm, Docker, etc. with cache keys correctly?).
Looking at these flaws, running workflows from a persistent VM of your own becomes pretty tempting, because you don't need to copy caches around and can easily SSH in.
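For illustration, this is the kind of cache-key bookkeeping being referred to, shown for npm only and following the usual `actions/cache` pattern; Docker layers and other ecosystems would each need their own equivalent:

```yaml
# One cache to get right per ecosystem: this only covers npm, not Docker layers etc.
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-npm-
      - run: npm ci
```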
As a founder of Depot [0], where we offer our own faster and cheaper GitHub Actions runners, I can assure everyone that this feeling is the majority and not the minority.
Sounds strange to say as someone whose product is built around making GitHub Actions exponentially faster in order to close those feedback loops.
But I can honestly say it's only really possible because the overall system with GitHub Actions is so poor. We discover new bottlenecks in the runner and control plane weekly. Things that you'd think would be simple are either non-existent, don't work, or have terrible performance.
I'm convinced there are better ways of doing things, and we are actively building ideas in that realm. So if anybody wants to learn more or just have a therapy session about what could be better with GitHub Actions, my email is in my bio.
First time I see jj being mentioned in a post not about jj. Made me very happy.
I actually built the last thing last weekend weirdly enough.
gg watch action
Finds the most recent or currently running action for the branch you have checked out. Among other things.
Would a tool like act help here? (https://github.com/nektos/act) I suppose orchestration that is hiding things from different processor architectures could also very well run differently online than offline, but still.
I recently, for Rust targets reasons, have decided to punt GitHub Actions AND GitHub. Opted for Radicle. Had to figure out my own CI.
Doc'd it here: https://revolveteam.com/blog/goa-radicle-ci/
For those who have experience, how does github actions compare to azure devops pipelines?
Would something like this help with the feedback loop?
At work we have a bunch of GitHub actions for integration testing, building models, publishing, reporting, and whatnot. It was horrible to maintain and look into whenever something went wrong, so I rewrote all the individual parts in Perl and hooked them together with pipes inside GHA, and it works wonders.
Also, GitHub actions itself just breaks sometimes, which is super annoying. A couple of weeks ago, half of all the macOS runner images broke when using GitHub's caching system. Seems like that would have been caught on GitHub's side before that happened, but what do I know!
Is any of this unique to GitHub Actions that does not happen on other cloud CI platforms?
I am surprised that these sort of declarative specs are so popular in certain domains. Essentially you always seem to be putting settings into the ether and hoping they interact with each other in the way you expect.
I prefer an api with documented contracts between its abstractions
So many engineers could put the hours spent debugging GH actions to use developing expertise to run their own CI. But people either don’t believe they can, can’t convince decision makers to let them try, or just want to fix their own problem and move on.
I was convinced GH actions was best practice and it was normal to waste hours on try-and-pray build debugging, until one day GH actions went down and I ran deploys from my laptop and remembered how much better life can be without it..
(Solo dev here - but opensource CI on an EC2 instance can be just as nice)
Guys,
GitHub Actions is a totally broken piece of s***!! I know about those broken loops because I've had to deal with them an incredible number of times.
I very often mention OneDev in my comments, and you know what? Robin solved this issue 3 years ago: https://docs.onedev.io/tutorials/cicd/diagnose-with-web-term...
You can pause your action, connect through a web terminal, and debug/fix things live until it works. Then, you just patch your action easily.
And that’s just one of the many features that make OneDev superior to pretty much every other CI/CD product out there.
GH Actions isn't great compared to other CI systems, but it's also not particularly worse until you get into the nitty gritty details.
The most important advice is probably to put as much code as possible into locally runnable scripts written in a cross-platform scripting language (e.g. Python or Node.js) to avoid 'commit-push-ci-failure' roundtrips.
Only use the GH Actions YAML for defining the runtime environment and job dependency tree.
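For example, something along these lines, where the YAML only declares the runtime environment and the dependency tree, and placeholder scripts like `ci/build.js` and `ci/test.js` carry the actual logic:

```yaml
# Sketch: YAML declares environment and dependency tree; locally runnable scripts do the work.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: node ci/build.js
  test:
    needs: build          # the only "logic" the YAML expresses is the job ordering
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: node ci/test.js
```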
If you wanted a better version of GitHub Actions/CI (the orchestrator, the job definition interface, or the programming of scripts to execute inside those jobs), it would presumably need to be more opinionated and have more constraints?
Who here has been thinking about this problem? Have you come up with any interesting ideas? What's the state of the art in this space?
GHA was designed in ~2018. What would it look like if you designed it today, with all we know now?
Of all the valid complaints about Github Actions or CI in general, this seems to be an odd one. No details about what was tried or not tried, but hard to see a `- run: go install cuelang.org/go/cmd/cue@latest` step not working?
I am not having fun with GitHub Actions right now! Why does everything have to be so hard?
I like being able to run self-hosted runners, that is a very cool part of GitHub Actions/Workflow.
I appreciate all the other advice about limiting my YAMLs to: 1) checkout, 2) call a script to do the entire task. I am already halfway there, just need to knuckle down and do the work.
I was dismayed that parallel tasks aren't really a thing in the YAML; I wanted to fan out a bunch of parallel tasks and found I couldn't do it. Now that I'm going to consolidate my build process into a single script I own, I can do the fanout myself.
Honestly part of the reason why I left my last job was because I had to extensively work with GitHub workflows and actions. Debugging was absolutely hell, especially with long running tasks (~1hr) that would fail with next to no debug or traceability. I offered many times to overhaul the system and make it easier to maintain, but we had no time budget. I now work for 30% less but am substantially happier in life without github. It sounds crazy, but I’m not exaggerating.
> For the love of all that is holy, don’t let GitHub Actions
> manage your logic. Keep your scripts under your own damn
> control and just make the Actions call them!
I mean, your problem was not `build.rs` here, and Makefiles did not solve it; wasn't your logic already in `build.rs`, which was called by Cargo via GitHub Actions? The problem was the environment setup? You couldn't get CUE on Linux ARM, and I assume when you moved to Makefiles you removed the need for CUE or something? So really the solution was something like Nix or Mise to install the tooling, so you have the same tooling/versions locally and on CI?
I've gotten to a point where my workflow YAML files are mostly `mise` tool calls (because it handles versioning of all tooling and has cache support) and webhooks, and still it is a pain. Also their concurrency and matrix strategies are just not working well, and sometimes you end up having to use a REST API endpoint to force cancel a job because their normal cancel functionality simply does not take.
There was a time I wanted our GH actions to be more capable, but now I just want them to do as little as possible. I've got a Cloudflare worker receiving the GitHub webhooks firehose, storing metadata about each push and each run so I don't have to pass variables between workflows (which somehow is a horrible experience), and any long-running task that should run in parallel (like evaluations) happens on a Hetzner machine instead.
I'm very open to hearing about nice alternatives that integrate well with GitHub but are more fun to configure.
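For what it's worth, a sketch of that "mostly mise tool calls" shape, assuming tasks named `lint` and `test` are defined in a `mise.toml` and using the community mise setup action (pin whatever version is current):

```yaml
# Sketch: mise pins the tool versions; the workflow only invokes mise tasks.
jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: jdx/mise-action@v2   # installs the tools pinned in mise.toml
      # Each task is defined in mise.toml and can be run identically on a laptop.
      - run: mise run lint
      - run: mise run test
```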
> i.e. a way that I could testbed different runs without polluting history of both Git and Action runs.
How about moving it to a separate repo and testing it there?
Keywords: reusable workflows/actions
I've always found things like AWS CodeBuild, or even just a self-hosted bare-bones Jenkins server, far easier to work with. What is the advantage GitHub Actions provides that makes people put up with it? The feedback seems almost universally negative.
As much as I hate GitHub Actions, I prefer it over Jenkins and others, because it is right there. I don't need to go and maintain my own servers, or even talk to a 3rd party to host the services.
I think the root problem here is scope creep in build tools, configuration management, and CI tools: don't make them do things they are not designed for / weak at. The issue isn't necessarily GitHub CI, which is fine if you keep it simple, but things getting to the stage where they are not simple anymore.
I've seen this over and over again over the years in projects using things like Ant, Maven, Gradle, Puppet, Ansible, etc. Invariably somebody tries to do something really complicated/clever in a convoluted way. They'll add plugins, more plugins, all sorts of complex configuration, etc. Your script complexity explodes. And then it doesn't work and you spend hours/days trying to make it do the right thing, fighting a tool that just wasn't designed to do what you're doing. The problem is using the wrong tool for the job. All these tools have a tendency to evolve into everything-tools, and they just aren't good at everything. Just because a tool has some feature doesn't mean it's a good idea to use it.
The author actually calls this out. Just write a bash script and run that instead. If that gets too complicated, pick something else more appropriate to the job: Python, whatever you like. The point is to pick something that's easy for you to run, test, and debug locally. Obviously people have different preferences here. If you are shoehorning complex fork/join behavior, conditional logic, etc. into a YAML CI build, maybe simplify your build. YAML just isn't suitable as a general-purpose programming language.
Externalizing the complex stuff to some script also has the benefit of that stuff still working if you ever decide to switch CI provider. Maybe you want to move to Gitlab. Or somebody decides Jenkins wasn't so bad after all. The same scripts will probably be easy to adapt to those.
One of my current github actions basically starts a vm, runs a build.sh script, stops the vm. I don't need a custom runner for that. I get to pick the vm type. I can do whatever I need to in my build.sh. I have one project with an actual matrix build. But that's about the most complex thing I do with github actions. A lot of my build steps are just inline shell commands.
And obligatory AI comment here, if you are doing all this manually, having scripts that run locally also means you can put claude code/codex/whatever to work fixing them for you. I've been working on some ansible scripts with codex today. Works great. It produces better ansible scripts than me. These tools work better if they can test/run what they are working on.
fyi i maintain a repo that accidentally tracks github actions cron reliability (https://www.swyx.io/github-scraping) - just runs a small script every hour.
i just checked and in 2025 there were at least 2 outages a month, every month: https://x.com/swyx/status/2011463717683118449?s=20 . not quite 3 nines.
In general, I've never really experienced the issues mentioned, but I also use Gitea with Actions rather than GitHub. I also avoid using any complex logic within an Action.
For the script getting run, there's one other thing. I build my containers locally, test the scripts thoroughly, and those scripts and container are what are then used in the build and deploy via Action. As the entire environment is the same, I haven't encountered many issues at all.
> For the love of all that is holy, don’t let GitHub Actions
> manage your logic. Keep your scripts under your own damn
> control and just make the Actions call them!
The pain is real. I think everyone that's ever used GitHub Actions has come to this conclusion. An ideal action has 2 steps: (1) check out the code, (2) invoke a sane script that you can test locally.
Honestly, I wonder if a better workflow definition would just have a single input: a single command to run. Remove the temptation to actually put logic in the actions workflow.
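As a thought experiment rather than an existing feature, that single-input idea could be approximated today with something like this (script-injection caveats aside; `./ci/run.sh` is a placeholder):

```yaml
# Thought experiment: a workflow whose only knob is the one command it runs.
on:
  workflow_dispatch:
    inputs:
      command:
        description: "The one command this workflow is allowed to run"
        required: true
        default: ./ci/run.sh

jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Note: interpolating an arbitrary input into run is an injection risk; this is a sketch.
      - run: ${{ inputs.command }}
```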
Standard msft absurdity. 8 years later there is still no local gh action runner to test your script before you commit, push, and churn through logs, and without some 3rd party hack, no way to ssh in and debug. It doesn't matter how simple the build command you write is, because the workflow itself is totally foreign technology to most, and no one wants to be a gh action dev.
Like most of the glaring nonsense that costs people time when using msft, this is financially beneficial to msft in that each failed run counts against paid minutes. It's a racket from disgusting sleaze scum who literally hold meetings dedicated to increasing user pain because otherwise the bottom line will slip fractionally and no one in redmond has a single clue how to make money without ripping off the userbase.
For the last decade I've been doing my CI/CD as simple .NET console apps that run wherever. I don't see why we switch to these wildly different technologies when the tools we are already using can do the job.
Being able to run your entire "pipeline" locally with breakpoints is much more productive than whatever the hell goes on in GH Actions these days.
A lot of the pain of GitHub Actions gets much better using tools like action-tmate: https://github.com/mxschmitt/action-tmate
As soon as I need more than two tries to get some workflow working, I set up a tmate session and debug things using a proper remote shell. It doesn't solve all the pain points, but it makes things a lot better.
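A minimal sketch of how that is typically wired up, with a placeholder `ci/build.sh` and the tmate step gated on failure so it doesn't hang successful runs:

```yaml
# Sketch: drop into a live shell on the runner only when something fails.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./ci/build.sh            # hypothetical script that is misbehaving
      - uses: mxschmitt/action-tmate@v3
        if: ${{ failure() }}          # only opens the tmate session after a failed step
```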
The issue is, OP is trying to use a matrix strategy when, with cross-compiling, they could avoid it. I have done it for https://github.com/anttiharju/compare-changes (which has nontrivial CI pipelines, but they could be a lot simpler for OP's needs).
Main issue is Rust. Writing catchy headlines about hating something may feel good, but a lot of people could avoid these pains if
- zig cc gets support for the new linker flag that Rust requires: https://codeberg.org/ziglang/zig/pulls/30628
- rust-lang/libc gets to 1.0, which removes the iconv issues on macOS: https://github.com/rust-lang/libc/issues/3248
You should try RWX! You can trigger runs from the CLI for way faster feedback loops. The automatic caching is surprisingly good too. https://www.rwx.com/
The love for GitHub Actions dissipated fast; it wasn't that long ago that we had to read about how amazing GitHub Actions were. What changed?
I know GitHub Actions won the war, but I think Bitbucket Pipelines are much nicer to work with. They just seem simpler and less fragile.
But almost every company uses GitHub, and changing to Bitbucket isn't usually viable.
Who doesn’t? I use it with Mise to have a very simple, locally tested way of running tasks.
I avoid actions for these exact reasons unless I can run the exact same build on another host.
And that’s where there’s a Mac Studio that sits sadly in the corner, waiting for a new check in so it has something to do.
> A word of explanation. I’m building tmplr for 4 platforms:
> Linux ARM
> macOS ARM
> Linux x86_64
> macOS x86_64
Oh, here we go again. Java was invented to solve that, "Write once, run everywhere" :] I.e. `int` means `i32` on all platforms, no `usize`.
Of course, avoid Actions if you expect tasks to complete promptly, or ever. Have we forgotten safe_sleep.sh? I don't think it was unique.
This is very validating as someone just trying out Actions and getting frustrated with the less than ideal UX.
I have spent the majority of my professional career waiting for ci to complete. It is hell.
Slap your problem in an agentic loop, this becomes the following single step:
1. Here's your goal "...", fix it, jj squash and git push, run gh pr checks --watch --fail-fast pr-link
This feels like a rage bait.
As a PM trying to understand what was happening, it took me a minute. Turning this into an analogy so that it might benefit other non-too-techies:
Original plan: You have an oven in another building that automatically bakes your cake. But the oven needs a mixer, and every time you bake, it has to wait 2-3 minutes for someone to bring the mixer from storage.
Problem: For one specific oven (Linux ARM), the mixer never arrives. So your cake fails. You keep trying different ways to get the mixer delivered. Each attempt: 2-3 minutes wait.
What you finally do: Stop waiting for the mixer to be delivered. Just mix the batter at home where you already have a mixer. Send the pre-mixed batter to the other building. Now the oven just bakes it - no waiting for the mixer.
Translation: Stop trying to generate files in GitHub Actions (where it takes 2-3 minutes each time). Generate them locally on your computer where you already have the tools. Upload the finished files. GitHub Actions just uses them.
Sometimes "pre-mix the batter at home" beats "wait for the mixer every single time."
This happens when there’s too much logic in the build server.
I think this post accurately isolates the single main issue with GitHub Actions, i.e. the lack of a tight feedback loop. Pushing and waiting for completion on what's often a very simple failure mode is frustrating.
Others have pointed out that there are architectural steps you can take to minimize this pain, like keeping all CI operations isolated within scripts that can be run locally (and treating GitHub Actions features purely as progressive enhancements, e.g. only using `GITHUB_STEP_SUMMARY` if actually present).
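For instance, a step along these lines, where `ci/report.sh` is a hypothetical locally runnable script that produces `report.md`, and the summary is written only when the variable actually exists:

```yaml
      - run: |
          ./ci/report.sh
          # Treat the step summary as a progressive enhancement: only use it if present.
          if [ -n "${GITHUB_STEP_SUMMARY:-}" ]; then
            cat report.md >> "$GITHUB_STEP_SUMMARY"
          fi
```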
Another thing that works pretty well to address the feedback loop pain is `workflow_dispatch` + `gh workflow run`: you still need to go through a push cycle, but `gh workflow run` lets you stay in development flow until you actually need to go look at the logs.
(One frustrating limitation with that is that `gh workflow run` doesn't actually spit out the URL of the workflow run it triggers. GitHub claims this is because it's an async dispatch, but I don't see how there can possibly be no context for GitHub to provide here, given that they clearly obtain it later in the web UI.)
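A minimal sketch of that setup, assuming a hypothetical `ci.yml` workflow and `ci/run.sh` script; the `gh` invocations are the standard CLI commands:

```yaml
# Sketch: add workflow_dispatch so the same workflow can be kicked off from the CLI.
name: ci
on:
  push:
  workflow_dispatch:

jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./ci/run.sh

# From a feature branch, without leaving the terminal:
#   gh workflow run ci.yml --ref my-branch
#   gh run watch   # pick the freshly started run and stream its progress
```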