Hacker News

BoiledCabbage today at 3:10 AM

The mock discussion still misses the real solution, which is to refactor the code so that you have a function that simply reads the file and returns the json. It's essentially a wrapper around open and doesn't need to be tested.

Then have your main function take in that json as a parameter (or a class wrapping that json).

Then your code becomes the ideal code: stateless and with no interaction with the outside world. Then it's trivial to test, just like any other function that is simply inputs translated to outputs (ie pure).

Every time you see the need for a mock, your first thought should be "how can I take the 90% or 95% of this function that is pure and pull it out, and separate the impure portion (side effects and/or stateful) that now has almost no logic or complexity left in it and push it to the boundary of my codebase?"

Then the complex pure part you test the heck out of, and the stateful/side effectful impure part becomes barely a wrapper over system APIs.
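
A minimal sketch of that split in Python (names are made up for illustration: read_config is the thin impure wrapper, effective_timeout stands in for the pure part):

    import json

    def read_config(path):
        # Impure edge: barely a wrapper around open(), nothing here worth unit testing.
        with open(path) as f:
            return json.load(f)

    def effective_timeout(config):
        # Pure core: plain dict in, value out; test it directly, no mocks needed.
        base = config.get("timeout_seconds", 30)
        retries = config.get("retries", 0)
        return base * (retries + 1)

    # Production composes the two at the boundary:
    #   settings = read_config("app.json")
    #   timeout = effective_timeout(settings)
    # Tests just call effective_timeout({"timeout_seconds": 5, "retries": 2}) directly.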


Replies

maeln today at 10:30 AM

Funnily enough, I am preparing a simple presentation at work to speak about exactly that. The idea of separating "logic" from I/O and side effects is an old one and can be found in many architectures (like hexagonal architecture). There are plenty of benefits to doing this, but testing is a big one.

It should be obvious, but this is not something that seems to be taught in school or in most workplaces, and when it is, it's often through the lens of functional programming, which most just treat as a curiosity and not a practical thing to use at work. So I started to teach this simple design principle to all my junior devs, because it is actually quite easy to implement, does not need a complete shift of architecture or a big refactor when working on existing code, and is actually practical and useful.

amarant today at 9:20 PM

I've had great success swapping in an in-memory database via dependency injection and just running 100% of the application, end to end.

In the ideal case my tests start by writing some randomised data using the external API; I then update it (if applicable) using the external API, and finally read it, also using the external API, and compare the actual result with what I expected.

I use randomised data to avoid collisions with other tests, which might cause flakiness and/or prevent running the tests concurrently. I avoid having seed data in the database if at all possible.

It's the only approach I've found that can survive a major refactor of the codebase. Anything short of breaking the external API, which is typically a no-no anyway, shouldn't break these tests.

Doing a refactor and being able to rely on the test suite for finding bugs and inconsistencies is amazing. Of course they won't find 100% of all bugs, but this way at least you know that a failing test means there's a problem in your production code.
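
A rough sketch of that shape, assuming a hypothetical UserService with an injected store (Python):

    import uuid

    class InMemoryUserStore:
        # Stand-in for the real database, swapped in via dependency injection.
        def __init__(self):
            self._rows = {}

        def save(self, user_id, data):
            self._rows[user_id] = data

        def load(self, user_id):
            return self._rows[user_id]

    class UserService:
        # The application's external API; the store is a constructor dependency.
        def __init__(self, store):
            self._store = store

        def create_user(self, name):
            user_id = str(uuid.uuid4())
            self._store.save(user_id, {"name": name})
            return user_id

        def rename_user(self, user_id, name):
            self._store.save(user_id, {"name": name})

        def get_user(self, user_id):
            return self._store.load(user_id)

    def test_create_rename_read():
        service = UserService(InMemoryUserStore())
        name = "user-" + uuid.uuid4().hex   # randomised data, no seed rows, no collisions
        user_id = service.create_user(name)
        service.rename_user(user_id, name + "-renamed")
        assert service.get_user(user_id) == {"name": name + "-renamed"}

Write, update, and read all go through UserService, so refactoring the internals leaves the test untouched as long as that API holds.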

jeroenhd today at 12:19 PM

The risk of that approach is that you end up writing code that cannot deal with the real-world problems of I/O, such as timeouts, failed reads, jitter, and other weird behaviour.

Separating I/O from logic makes a lot of sense and makes tests much easier to write and code much easier to reason about, but you'll still need to implement some sort of mocking interface if you want to catch I/O problems.
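
For example, a small fake that can be told to fail lets you pin down the unhappy paths without touching a real filesystem (hypothetical names, Python):

    class FlakyFileReader:
        # Test double that simulates the failures a real read can hit.
        def __init__(self, fail_with=None, payload=b""):
            self._fail_with = fail_with
            self._payload = payload

        def read(self, path):
            if self._fail_with is not None:
                raise self._fail_with
            return self._payload

    def load_or_default(reader, path, default):
        # The behaviour under I/O failure is exactly what we want to test.
        try:
            return reader.read(path)
        except (TimeoutError, OSError):
            return default

    def test_timeout_falls_back_to_default():
        reader = FlakyFileReader(fail_with=TimeoutError("read timed out"))
        assert load_or_default(reader, "/etc/app.conf", b"{}") == b"{}"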

majormajor today at 4:53 AM

> Then the complex pure part you test the heck out of, and the stateful/side effectful impure part becomes barely a wrapper over system APIs.

In practice the issues I see with this are that the "side effect" part is usually either extensive enough to still justify mocking when testing it, or intertwined enough with your logic that it's hard to pull out all the "pure" logic. I rarely see 90-95% of functions being pure logic vs side effects.

E.g. for the first, you could have an action that requires several sequenced side effects, and then your "wrapper over APIs" still needs validation of calling the right APIs in the right order with the right params, for various scenarios. Enter mocks or fakes. (And sometimes people will get clever and say use pubsub or events for this, but... you're usually just making the full-system-level testing there harder, as well as introducing less determinism around your consistency.)

For the second, something like "do steps I and J. If the API you call in step J fails, unwind the change in I." Now you've got some logic back in there. And it's not uncommon for the branching to get more complex. Were you building everything in the system from first principles, you could try to architect something where I and J can be combined or consolidated in a way to work around this; when I and J are third party dependencies, that gets harder.
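
A sketch of that second case, with hypothetical ledger/payments collaborators (Python):

    class PaymentError(Exception):
        pass

    def place_order(ledger, payments, order):
        # Step I: write the ledger entry; step J: call the external payment API.
        entry_id = ledger.record_charge(order)
        try:
            payments.charge(order)
        except PaymentError:
            # Unwind step I if step J fails. This branch is real logic living in
            # the impure orchestration layer, so fakes or mocks come back to cover it.
            ledger.void(entry_id)
            raise
        return entry_id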

retrodaredevil today at 4:23 AM

I agree with you, however convincing an entire team of devs to explicitly separate the interface of impure parts of code is very difficult.

If you introduce a mocking library to the test portion of the codebase, most developers will start to use it as a way to shortcut any refactoring they don't want to do. I think articles like this that try to explain how to better use mocks in tests are useful, although I wish they weren't necessary.

cyanydeez today at 9:16 PM

now you have N+1 tests!

wry_discontent today at 4:13 PM

This sounds great until one of your other functions calls that function.

You're just describing dependency injection, but if you say that, people won't want to listen cause doing that all the time sucks.
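
The problem in one picture, with made-up names (Python):

    import json

    def read_config(path):
        with open(path) as f:
            return json.load(f)

    # Any function that calls the wrapper itself drags I/O back into its tests:
    def worker_count_bad():
        return read_config("app.json").get("workers", 1)

    # The dependency-injected version takes the already-loaded config instead,
    # which is why "pass the json in" and DI end up being the same advice:
    def worker_count(config):
        return config.get("workers", 1)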

01HNNWZ0MV43FF today at 5:45 AM

"sans-I/O" is one term for that style. I like it a lot but it's not a free lunch.
