
kgeist • 01/21/2025 • 4 replies

I already talked about it above.

Main problems with passing dependencies in function argument lists:

1) it pollutes the code and makes refactoring harder (a small change in one place must be propagated to all call sites in the dependency tree which recursively accept user ID/tenant ID and similar info)

2) it violates various architectural principles. For example, from the point of view of our business logic, there's no such thing as a "tenant ID"; it's an implementation detail for storing data more efficiently. If we relied only on function argument lists, we'd have to litter the actual business logic with infrastructure-specific references to tenant IDs and the like so that the underlying DB layer could figure out what to do.

Sure, it can be solved with constructor-based dependency injection (i.e. request-specific service instances are created for each request, and user ID/tenant ID & friends are stored as fields of those request-scoped instances). That's what we had before switching to contexts, but it resulted in excessive allocations and unnecessary memory pressure for our high-load services. In complex enterprise code, those dependency trees can be quite large -- and we ended up allocating huge dependency trees for each request. With contexts, we now have a single application-scoped service dependency tree, and request-specific stuff just comes inside contexts.

Both problems can be solved by trying to group and reuse data cleverly, and eventually you'll get back to square one with an implementation which looks similar to context.Context but which is not reusable/composable.

>Including logger.

We don't store loggers in ctx, they aren't request-specific, so we just use constructor-based DI.

Replies

TeMPOraL • 01/21/2025

I believe this problem isn't solvable under our current paradigm of programming, which I call "working directly on plaintext, single-source-of-truth codebase".

Tenant ID, cancellations, loggers, error handling are all examples of cross-cutting concerns. Depending on what any given function does, and what you (the programmer) are interested in at a given moment, any of them could be critical information or pure noise. Ideally, you should not be seeing the things you don't care about, but our current paradigm forces us to spell out all of them, at all times, hurting readability and increasing complexity.

On the readability/"clean code" front, our most advanced languages are operating on a Pareto frontier. We have whole fields of math being employed in service of packaging up common cross-cutting concerns so as to minimize the noise they generate. This is where all the magic monads come from, this is why you have to pay attention to the infectious colors of your functions, etc. Different languages make slightly different trade-offs here to make some concerns more readable, but since it's a Pareto frontier, that always makes some other aspects of the code less comprehensible.

In my not so humble opinion, we won't progress beyond this point until we give up on the paradigm itself. We need to accept that, at any given moment, a programmer may need a different perspective on the code, and we need to build tools to allow writing code from those perspectives. What we now call source code should be relegated to the role of intermediary/object code - a single source of truth for the bowels of the compiler, but otherwise something we never touch directly.

Ultimately, the problem of "context" is a problem of perspective, and should be solved by tooling. That is, when reading or modifying code, I should be able to ignore any and all context I don't care about. One moment, I might care about the happy path, so I should be able to view and edit code with all error propagation removed; at another moment, I might care about how all the data travels through the module, in which case I want to see the same code with every single goddamn thing spelled out explicitly, in the fashion GP is arguing to be the default. Etc.

Plaintext is fine. Single source of truth is fine. A single all-encompassing view of everything in a source file is fine. But they're not fine all together, all the time.


youerbt • 01/21/2025

> it violates various architectural principles, for example, from the point of view of our business logic, there's no such thing as "tenant ID"

I'm not sure I understand how hiding this changes anything. Couldn't you just not pass the "tenant ID" to the doBusinessLogic function, and pass it to the saveToDatabase function instead?
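A minimal Go sketch of what this suggestion might look like, assuming persistence happens only at the edge of the request handler. The function names come from the comment; the `Order` type, `handleRequest`, and all parameters are hypothetical.

```go
package main

import "fmt"

// Order is a hypothetical result of the business logic.
type Order struct {
	Item  string
	Total int
}

// doBusinessLogic knows nothing about tenants or storage.
func doBusinessLogic(item string, qty, unitPrice int) Order {
	return Order{Item: item, Total: qty * unitPrice}
}

// saveToDatabase is the only function that receives the tenant ID.
// It returns a description of the write instead of touching a real DB.
func saveToDatabase(tenantID string, o Order) string {
	return fmt.Sprintf("tenant=%s item=%s total=%d", tenantID, o.Item, o.Total)
}

// handleRequest wires the two together at the edge of the system.
func handleRequest(tenantID, item string, qty, unitPrice int) string {
	order := doBusinessLogic(item, qty, unitPrice)
	return saveToDatabase(tenantID, order)
}

func main() {
	fmt.Println(handleRequest("acme", "widget", 3, 5))
}
```

This only stays clean while `doBusinessLogic` never needs to read or write storage partway through; once it does, the tenant ID has to be threaded through it by some means.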


zbentley • 01/24/2025

I've worked on (and variously built and ripped out) systems like that, and I end up in the "more trouble than it's worth" camp here. Context-ish things do have considerable benefits, but the costs are also major.

If context isn't uniform and minimal, and people can add/remove fields for their own purposes, the context becomes a really sneaky point of coupling.

Adapting context-ful code from a request-response world to (for example) a parallel-batch-job world or continuous stream consumer world runs into friction: a given organization's idioms around context usually started out in one of those worlds, and don't translate well to others. If I'm a worker thread in a batch job working on a batch of "move records between tenant A and tenant B" work, but the business logic methods I'm calling to retrieve and store records are sensitive to a context field that assumes it'll be set in a web request (and that each web request will be made for exactly one tenant), what do I do? If your business is always going to be 99% request/response code, then sure, hack around the parts that aren't. But if your business does any continuous data pipeline wrangling, you rapidly end up with either a split codebase (request-response contextful vs "things that are only meant to be called from non-request-response code") or really thorny debugging around context issues in non-request-response code.

If you choose to deal with context thread-locally (or coroutine locally, or something that claims to be both but is in reality neither--looking at you, "contextlib"), that sneaky context mutation by the concurrency system multiplies the difficulties in reasoning about context behavior.

> it violates various architectural principles, for example, from the point of view of our business logic, there's no such thing as "tenant ID"

I think a lot of people lose sight of how incredibly useful explicit dependency management is because it's classed as "tight coupling" and "bad architecture" when it's nothing of the sort. I blame 2010s Java and dependency inversion/injection brainrot.

Business logic is rarely pure; most "business" code functions as transforming glue between I/O. The behavior of the business logic is fundamentally linked to _where_ (and often _how_ as well--e.g. is it in a database transaction?) it interacts with datastores and external services. "Read/write business code as if it didn't have side effects" is not a good approach if code is _primarily occupied with causing side effects_--and, in commercial software engineering, most of it is!

From that perspective, explicitly passing I/O system handles, settings, or whatnot everywhere can be a very good thing: when reading complex business logic, the presence (or absence) of those dependencies in a function call tells you what parts of the system will (or can) conduct I/O. That provides at-a-glance information into where the system can fail, where it can lag, what services or mocks need to be running to test a given piece of code, and at a high level what data flows it models (e.g. if a big business logic function receives an HTTP client factory for "s3.amazonaws.com/..." and a database handle, it's a safe bet that the code in question broadly moves data between S3 and the database).

While repetitive, doing this massively raises the chance of catching certain mistakes early. For example, say you're working on a complex businessy codebase and you see a long for-loop around a function call like "process_record(record, database_tenant_id, use_read_replica=True, timeout=5)"? That's a strong hint that there's an N+1 query/IO risk in that code, and the requirement that I/O system dependencies be passed around explicitly encodes that hint _semantically_.
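The hint described above can be sketched in Go as follows. The `DB` type here is a hypothetical stand-in that merely counts queries, and `processRecord`, `processAll`, and `Options` are illustrative names mirroring the quoted call, not a real API.

```go
package main

import (
	"fmt"
	"time"
)

// DB is a hypothetical database handle that counts the queries it serves.
type DB struct{ Queries int }

// Lookup simulates one round trip to the database.
func (db *DB) Lookup(tenantID, key string) string {
	db.Queries++
	return tenantID + ":" + key
}

// Options mirrors the explicit settings in the quoted call.
// The sketch does not act on them; a real handle would.
type Options struct {
	UseReadReplica bool
	Timeout        time.Duration
}

// processRecord takes the DB handle and tenant ID explicitly, so its
// signature alone says "this function can perform I/O".
func processRecord(db *DB, tenantID, record string, opts Options) string {
	return db.Lookup(tenantID, record)
}

// processAll calls processRecord once per record. The db argument inside
// the loop is the visible N+1 warning sign.
func processAll(db *DB, tenantID string, records []string, opts Options) []string {
	out := make([]string, 0, len(records))
	for _, r := range records {
		out = append(out, processRecord(db, tenantID, r, opts))
	}
	return out
}

func main() {
	db := &DB{}
	opts := Options{UseReadReplica: true, Timeout: 5 * time.Second}
	processAll(db, "acme", []string{"a", "b", "c"}, opts)
	fmt.Println("queries issued:", db.Queries)
}
```

Had `processRecord` pulled its handle out of a context instead, the loop would read as `processRecord(ctx, r)` and the one-query-per-record pattern would be much harder to spot in review.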

That kind of visibility is vastly superior to "pure" and uncluttered business logic that relies on context/lexicals to plumb IO around. Is the pure code less noisy and easier to interpret? Sure, but the results of that interpretation are so much less valuable as to be actively misleading.

Put another way: business logic is concerned with things like tenant IDs and database connections; obscuring those dependencies is harmful. Separation of concerns means that good business code is code that avoids mutating, or making decisions based on, the dependencies it receives--not that it doesn't receive them/use them/pass them around.

giancarlostoro • 01/21/2025

I have a feeling that if Context disappears, you'll just see "Context" become a common struct that is passed around. In Python, unlike in C# and Java, the first parameter of an instance method is the instance itself, conventionally named "self", so I could see something like this becoming the norm in Go.

