I tend to mentally divide code into roughly two types: "computational" and "plumbing".
Computational code handles your business logic. This is usually in the minority in a typical codebase. What it does is quite well defined and usually benefits a lot from unit tests ("is this doing what we intended"). Happily, it changes less often than plumbing code, so unit tests tend to stay valuable and need little modification.
Plumbing code is everything else, and mainly involves moving information from place to place. This includes database access, moving data between components, conveying information from the end user, and so on. Unit tests here are next to useless because (a) you'd have to mock everything out, (b) this type of code seems to change frequently, and (c) its behaviour is less clearly defined.
What you really want to test with plumbing code is "does it work", which is handled by integration and system tests.
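To make the split concrete, here's a rough Python sketch (the refund rule and the db object are made up for illustration): the pure calculation is a natural unit-test target, while the handler around it is plumbing whose "does it work" question really belongs to an integration test.

# "Computational": well defined, stable, pays for its unit tests.
def pro_rata_refund(price_cents: int, days_used: int, days_total: int) -> int:
    remaining = max(days_total - days_used, 0)
    return price_cents * remaining // days_total

def test_pro_rata_refund():
    assert pro_rata_refund(3000, 10, 30) == 2000
    assert pro_rata_refund(3000, 45, 30) == 0   # over-used: nothing back

# "Plumbing": mostly ferries data between the request, the function above,
# and the database. Whether this actually works is an integration question.
def refund_endpoint(request, db):
    sub = db.subscriptions.get(request["subscription_id"])
    return {"refund": pro_rata_refund(sub.price_cents, sub.days_used, sub.days_total)}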
I've seen this concept called by many names including CQS[0] and Functional Core, Imperative Shell[1]. I'm just leaving this comment here for those that are interested in reading more.
Functional / imperative doesn't exactly map onto these two concepts. "Computational" code is often imperative, and integration code isn't always imperative (a lot of React code would fit in this box, for instance).
This is always a fun corner of programming terminology to me.
A ton of dense, mathy code like hash computation, de/serialization, sin/cos computation, etc. is usually best implemented in a memory efficient C-style way but lends itself to be used in a very functional way; inputs and outputs without any retained state or side effects.
I think that subtlety is hard to articulate and gets lost.
I agree, but if you watch the linked screencast you can see that it's just Gary Bernhardt's take on Hexagonal/Clean/Onion Architecture.
The idea that your business logic should be isolated from external dependencies (in his case, by making the code (pure) functional). That makes it easy to unit test the business logic, and your integration tests should be minimal (basically testing a single path to make sure everything is talking to each other).
Cheap integration tests is usually an oxymoron (when cheap refers to the tightness of the developer feedback loop).
Gary was coming from the land of Ruby-on-rails where a full set of integration tests could take hours. In that environment, structuring your code to enable easy testing of complex logic makes a lot of sense.
Likewise in a large enterprise environment, where integration testing across a (usually messy) set of interconnected dependencies is a pipe dream.
It's true that over-architecting is something to be wary of, but as usual, there's no one-size-fits-all answer.
One of the benefits of writing tests is that it makes it painfully obvious which parts of your codebase are poorly architected. Difficulty in writing tests is a code smell.
That's because unit tests couple tightly to your code. If you're trying to couple something additional to your already tightly coupled code it's gonna be painful.
It's a really expensive way of discovering that you wrote shit code.
"Computational" code that isn't by the vernacular definition of "functional"--and you can write functional code regardless of programming language--is something of a red flag to me.
Operate only on your inputs. Return all of your outputs. No side effects.
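A tiny sketch of the same calculation both ways, in Python (names invented):

import datetime

# Side-effecting version: reads the clock and prints, so a test has to patch both.
def print_age(birth_year):
    age = datetime.date.today().year - birth_year
    print(f"age is {age}")

# "Operate only on your inputs, return all of your outputs" version:
def age(birth_year: int, current_year: int) -> int:
    return current_year - birth_year

def test_age():
    assert age(1990, 2020) == 30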
Abstract code solves made-up problems, while concrete code solves real ones. Normally the best way to solve a real problem is by rewriting it as a series of made-up problems, and solving those made-up ones instead.
The made-up problems don't need to be purely computational. If you restrict them to pure ones, you'll lose a lot of powerful ones. They also don't need to fit functional programming well, but there is no loss of generality in imposing that restriction.
Also, the more abstract you make that code, the less it will need changing and the better unit tests will fit. At the extreme, once debugged it will never change. If your needs change too much, your concrete programs will simply stop using it and use something completely different instead.
I think the problem with this line of thinking is that, most often, the difficulty lies exactly in rewriting the real problem into the made-up problems. So, to have useful tests, you need to check if your made-up problem solutions actually solve the real problem, which is difficult to express in the first place.
For example, let's say you want to get some users from your DB in response to an HTTP call. We rewrite this problem in terms of crafting some SQL query, taking some data from the HTTP request to create that query. We can of course easily test that the code creates the query we designed, that the query contains the right information from the HTTP request, etc. But if we don't actually run the query on the actual DB with the actual users, we don't really know whether our query does the right thing, even if we know our code creates the query we intended. And if the DB changes tomorrow, our very abstract code that parametrizes a particular SQL query will still need to change, so our existing unit tests will be thrown away as well.
This is the kind of plumbing code the OP was talking about, and I don't think you can reduce the problem in any way to fix this (especially if the DB is an external entity).
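To make the point concrete, here's roughly the only unit test you can write for the query-building half (a hedged Python sketch; the table and parameter names are invented):

def build_user_query(params: dict):
    # Split "craft the SQL" out from "run it against the DB".
    sql = "SELECT id, name FROM users WHERE country = %s AND active = %s"
    return sql, (params["country"], True)

def test_query_carries_request_country():
    sql, args = build_user_query({"country": "NL"})
    assert "WHERE country = %s" in sql
    assert args == ("NL", True)

That proves we built the query we meant to build; only a test against a real (or realistic) database tells us whether the query returns the right users, and a schema change breaks both the code and this test together.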
There's nothing abstract about that query. You can easily confirm this by looking at how you described it exclusively in business terms.
Instead, it's the most concrete component in your comment, and it's not amenable to unit testing in any way.
> I tend to mentally divide code into roughly two types: "computational" and "plumbing".
I agree with this and would go even further. Divide your code into "stateless" functional code and "stateful" objects code.
Original OO was encapsulating things like device drivers that did I/O--it didn't represent data.
If you don't interleave your stateless business logic with your stateful persistence, it's easy to mock "objects" that do the plumbing, and all the meat of the program is unit tests.
Fwiw, the DI model (Guice, Spring, etc.) in modern Java/Scala shops closely hews to this, even if people don't mentally categorize it as such.
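A rough Python sketch of that separation (the repository and the rule are invented; in a Java/Scala shop the repo would typically arrive via Guice/Spring injection):

from unittest import mock

# Stateless business logic: a pure function of its inputs.
def overdue(invoices, today):
    return [i for i in invoices if i["due"] < today and not i["paid"]]

# Stateful plumbing: owns the I/O and is trivial to mock.
class InvoiceRepo:
    def load_for(self, customer_id): ...
    def save_reminders(self, reminders): ...

def send_reminders(repo: InvoiceRepo, customer_id, today):
    late = overdue(repo.load_for(customer_id), today)
    repo.save_reminders(late)
    return len(late)

def test_send_reminders_counts_only_late_unpaid():
    repo = mock.Mock(spec=InvoiceRepo)
    repo.load_for.return_value = [
        {"due": 1, "paid": False},   # late
        {"due": 9, "paid": False},   # not yet due
    ]
    assert send_reminders(repo, "c1", today=5) == 1
    repo.save_reminders.assert_called_once()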
> Divide your code into "stateless" functional code and "stateful" objects code.
IINM you are basically referring to the difference between static and instance methods in languages like C++ and Java.
Putting code that neither reads nor writes the object state into instance methods is a common mistake made in both those languages.
That said, both stateful and stateless code are good candidates for unit testing, especially when the code under test is a state machine rather than just a data encapsulation mechanism.
Nah brah. (I'm gonna say "brah" because I'm feeling especially salty)
Your _program_ should have the flow of a function. At the architectural level, who-the-ef cares about static vs instance methods in Java (I say this as a person with 23 years of Java experience). It has nothing to do with languages. You can do this in any language you want.
You want to have your inputs go through a process where you have (1) INPUT state transfer, (2) some computation F(INPUT), (3) some output and state transfer, or RESULT = F(INPUT).
If you do not have (1) or (3)--I hate to break it to you--but all your program does is burn CPU. If you don't have (2), your program does nothing at all.
The key thing with scalable systems is they manage complexity well. If you're at the level where you're worried about "static or instance methods", you're not dealing with how data changes in large systems at all. Those words are at the level of state within a language.
> At the architectural level, who-the-ef cares about static vs instance methods in Java
Who-the-ef should care is anyone who has to implement or maintain the code. After all, the debate at hand is what is worth unit testing, which very much concerns the programming language and the actual implementation. Don't know about you, but I both architect the system and write the code.
> If you do not have (1) or (3)--I hate to break it to you--but all your program does is burn CPU.
I haven't written production code that doesn't have (1) or (3) in my 25 years of programming, so not sure who you are talking to here.
> If you're at the level where you're worried about "static or instance methods", you're not dealing with how data changes in large systems at all. Those words are at the level of state within a language.
You have to tend to this stuff at both the generic data processing and language level. Using a given language's constructs for differentiating between stateful and stateless code is an important part of making the code document itself.
This kind of categorising I've always found to be orthogonal to what should really be the measure of "does it deserve a [unit] test?". I believe how much (if any) automated testing something deserves, at whatever level, is decided only by how valuable that thing is and by how bad the impact would be if it went wrong.
If you are writing a one-shot script to transmute data from one format to another for, say, an upgrade, I don't care whether you have unit tests as long as I am confident it has been manually tested to satisfaction. No repeatability, no regression requirement. There could be, and likely is, value in TDD, so tests might still be a thing if that is how you work. No objection there.
If you are developing the plumbing code that will ensure my system adheres to financial regulations and, if it were to break, land me in jail for negligence, you can be damn sure I'm demanding a test that will be run every time that system is built/deployed.
I wrote unit tests >10 years ago for formatting a string for postal codes that I know are still run to this day on every commit because if they get it wrong there is legal recourse for the company that owns that system.
It's also super quick to fix and failing at build is quicker and cheaper than failing in prod, even without the recourse. That test took me all of 1 minute to write. Bargain.
> I wrote unit tests >10 years ago for formatting a string for postal codes that I know are still run to this day on every commit because if they get it wrong there is legal recourse for the company that owns that system.
If it's critical for your business I'd categorize that as business logic, not plumbing code, well deserving of unit test coverage.
> I believe how much (if any) automated testing something deserves, at whatever level, is decided only by how valuable that thing is and by how bad the impact would be if it went wrong.
Unit tests and automated tests are two completely different concepts.
Yes, I agree as well. Our company uses Spring to write banking software and there is rarely a case that involves pure logic which can be separated from its dependencies. I used to try isolating code into separate methods that took no dependencies, but it just made the code harder to read. Now we just test by invoking the gRPC endpoints and include the DB (with rollback), and it works quite well.
The problem is that doBusinessLogic(a) is often entirely about transforming a into whatever the current DB accepts. Sure, you can write a test to check that b.Field_old == a["field"], but this buys you very little. The real question is whether you should have mapped a["field"] or a["oldFields"]["Field"] to b.Field_old, and your unit test is not going to tell you that; you need an integration test to actually verify that you made the right transformations and you're getting the correct responses.
By all means, if the transformation is non-trivial, and it is captured entirely in the logic of this method, not in the shape of the API and the DB, then you should unit test it (e.g. say you are enforcing some business rules, or computing some fields based on other fields). But if you're just passing data around, this type of testing is a waste of time (you have no reason to change the code if the API or DB don't change, so the tests will never fail) and brittle (changes in the API or in the DB will require changing both the code and the tests, so the tests failing doesn't help you find any errors you didn't know about).
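A minimal sketch of the kind of test I mean (field names invented):

def to_db_row(api_payload: dict) -> dict:
    # The entire "logic" is moving one field to another name.
    return {"field_old": api_payload["field"]}

def test_to_db_row():
    assert to_db_row({"field": "x"}) == {"field_old": "x"}

The test is just the mapping written a second time: it can only fail when the API or DB changes, at which point both the code and the test get rewritten together, so it never catches anything you didn't already know.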
> The real question is whether you should have mapped a["field"] or a["oldFields"]["Field"] to b.Field_old, and your unit test is not going to tell you that, you need an integration test to actually verify that you made the right transformations and you're getting the correct responses.
So I would argue you don't actually have business logic then. Your service is anemic, and you have a data transformation you need to do. I definitely think that you should do an integration test for that.
Moving JSON -> Postgres or whatever is something that you absolutely still can test, by checking the DML statement your DB library produces. It may be a silly test, but that's because if there's no business logic, it's a silly program _shrug_.
While it's bad form to reply to your own post, I might add that this is just what a function is in the large: you're viewing your whole program this way.
a <- readFromApi ( Input x )
b <- doBusinessLogic(a) ( f(x) )
c <- writeToPersistence(b) ( Output y = f(x) )
You can also imagine that there is more than one lookup from the DB, or service calls as I/O in different parts of the pipeline (g(f(x)), etc.), but it's always possible to have state pulled in explicitly and pushed down explicitly into business logic as an argument. It tends to make programs have flatter call stacks as well.
The amount of effort spent finding errors before you ship has to be related to the cost of fixing errors, including the consequences of the errors if they're found after you ship.
If errors in your system result in death, and if changes must go through an expensive and time consuming process to be approved, and then an expensive and time consuming process to be applied, you should spend a lot of time ensuring your design is sound, and your implementation matches your design. A good place for formal methods.
If you're writing server-side code, and deploy takes 5 minutes, you can be a cowboy for most things that won't leave a persistent mess or convince customers to leave.
If you're writing client-side code that needs to go through a pre-publication review, neither cowboy coding nor formal methods is a good choice.
Yes! I do something similar which is sometimes referred to as functional core imperative shell. My goal is to put as much code as possible in the computational/functional part. This part is easy to test since it's pure. The remaining plumbing/imperative part has much less code, less dependencies, and less logic, which as you say doesn't need unit tests anymore. It needs less dependency injection as well, which is a huge bonus.
You should still be careful that your pure logic is actually doing something by itself, rather than just massaging data from one external format to another external format.
A lot of code can be in this area, where it is absolutely unit testable but the unit tests are almost entirely useless, as the code only ever changes because the input or output types change, so the tests also need to change.
I think of this in terms of code that is 'authoritative' for its logic or not.
For example, a sorting method is authoritative - it is the ultimate definition of what sorting means. Also, a piece of code that validates some business rule defined in a document is the authority for that business rule.
But a piece of code that takes input from the user and passes it to some other piece of code is not authoritative for this transformation. The functionality of this kind of code is not defined by some spec, but by 'whatever the other piece of code wants to receive', which may be arbitrarily hard to define.
Depending on the complexity of the transformation, there may still be reasons to test parts of this code, at least to ensure that a new field here doesn't affect the way we transform that other field there, but often only small pieces of it are actually worth testing.
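A small Python sketch of the distinction (the shipping rule and the payment-service shape are made up):

# Authoritative: this function *is* the definition of the rule
# ("orders of 100 or more ship free"), so a unit test pins the spec down.
def shipping_cost(order_total: float) -> float:
    return 0.0 if order_total >= 100 else 4.95

def test_free_shipping_threshold():
    assert shipping_cost(100) == 0.0
    assert shipping_cost(99) == 4.95

# Not authoritative: this just reshapes input into whatever the payment
# service happens to want; its real "spec" lives in that other system.
def to_payment_request(form: dict) -> dict:
    return {"amountMinor": round(form["amount"] * 100), "currency": "EUR"}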
This has been a problem for me with my quarantine project. It's little more than a CRUD app: get some data, download it, display it on the screen. There's practically no business logic to it; the entire project is wiring up various XaaS. By the time I mocked everything that needed to be mocked, I'd have put more effort into mocking than the project itself.
I test the parts that are actually mine as best I can, but most of my debugging consists of driving it by hand.
This is sort of what I've done with some success in developing games. Games in general are grossly under-tested, but there are a few good reasons for that. Lots of systems can be effectively tested by just playing, and often it's tough to tease out units as small as you would in other types of programs for useful isolated testing.
What I've been doing is writing as many parts of the game as libraries as is possible, and then implementing the minimal possible usage of that library as a semi-automated test. For instance, our collision system is implemented as a library, and you can load up a "game" that has the simplest possible renderer, no sound, basic inputs, etc. and has a small world you can run around in that's filled with edge cases. This was vastly easier than trying to write automated tests for 3d collision code, and you get the benefit of testing the system in isolation, if not automatically. For other libraries like networking, the tests are much more automated, but they poke the library as a unit, rather than testing all the little bits and pieces individually.
I really wish I had come up with this; it neatly captures my experience of how unit tests are sometimes really useful (developing a benefits claims engine, which essentially did a bunch of complex calculations and then spat data out), whereas other times they just feel like a massive chore, with mocks and similar stuff that add little to no value and certainly should have been handled at a higher level (integration or system tests), but the powers that be wanted coverage.
> I tend to mentally divide code into roughly two types: "computational" and "plumbing".
I think of the "computational" type more as a "deterministic data transformation" type. That applies to transformations of any data whether text, images, or the state of a machine.
I think of plumbing as the movement of data without any transformation, or, if a transformation occurs, it occurs at an abstracted layer that must be unit tested itself independently.
My thoughts exactly. Unit tests are a huge help in computational-heavy portions of a project and are easy to write. The other areas of a project don't benefit as much and the tests are harder to write and keep maintained.
business logic does not mean it has to be for a business. It's more like calling the pointy part of a spear the "business end". It's the part that does the job.
Business Logic is a euphemism. It doesn't mean literally business logic, it means the 'core functionality of your code.' When you design software, you typically model some real world process or system in the abstract. Business logic is the core problem of your model. You can also call it model code, or core functionality. It all means the same thing - it's the important part of your app.
Using the old Asteroids arcade game [1] as an example: The business logic is how many lives the player has, what happens when you shoot asteroids (they break up, or disintegrate if they're small), what happens when you reach the edge of the map (you wrap around the other side), what kind of control scheme there is (there's momentum in asteroids, you don't stop on a dime) etc.
"domain logic" might be a better euphemism. Consider a library that encrypts text with AES-256. You might want unit tests that verify the IV, cypher block, plaintext and encrytped text (result) of that function. The method, "encrypt" might be your "business logic" that ought to be unit tested.
"Business logic" is just another name of "logic". E.g. something like "if X is even, then print 'fizz' otherwise print 'fuzz'" is considered business logic.
I can't believe I'm wasting my time on another testing debate.
Speaking as a formerly young and arrogant programmer (now I'm simply an arrogant programmer), there's a certain progression I went through upon joining the workforce that I think is common among young, arrogant programmers:
1. Tests waste time. I know how to write code that works. Why would I compromise the design of my program for tests? Here, let me explain to you all the reasons why testing is stupid.
2. Get burned by not having tests. I've built a really complex system that breaks every time I try to update it. I can't bring on help because anyone who doesn't know this code intimately is 10x more likely to break it. I limp to the end of this project and practically burn out.
3. Go overboard on testing. It's the best thing since sliced bread. I'm never going to get burned again. My code works all the time now. TDD has changed my life. Here, let me explain to you all the reasons why you need to test religiously.
4. Programming is pedantic and no fun anymore. Simple toy projects and prototypes take forever now because I spend half of my time writing tests. Maybe I'll go into management?
5. You know what? There are some times when testing is good and some times where testing is more effort than it's worth. There's no hard-set rule for all projects and situations. I'll test where and when it makes the most sense and set expectations appropriately so I don't get burned like I did in the past.
One of the dark arts of being an experienced developer is knowing how to calculate the business ROI of tests. There are a lot of subtle reasons why they may or may not be useful, including:
- Is the language you're using dynamic? Large refactors in Ruby are much harder than in Java, since the compiler can't catch dumb mistakes
- What is the likelihood that you're going to get bad/invalid inputs to your functions? Does the data come from an internal source? The outside world?
- What is the core business logic that your customers find the most value in / constantly execute? Error tolerances across a large project are not uniform, and you should focus the highest quality testing on the most critical parts of your application
- Test coverage != good testing. I can write 100% test coverage that doesn't really test anything other than physically executing the lines of code. Focus on testing for errors that may occur in the real world, edge cases, things that might break when another system is refactored, etc.
I now tend to focus on a black box logic coverage approach to tests, rather than a white box "have I covered every line of code" approach. I focus on things like format specifications, or component contract definitions/behaviour.
For lexer and parser tests, I tend to focus on the EBNF grammar. Do I have lexer test coverage for each symbol in a given EBNF, accepting duplicate token coverage across different EBNF symbol tests? Do I have parser tests for each valid path through the symbol? For error handling/recovery, do I have a test for a token in a symbol being missing (one per missing symbol)?
For equation/algorithm testing, do I have a test case for each value domain? For numbers: zero, a negative number, a positive number, min, max, and values that yield the min/max representable output (plus one above/below these to overflow).
I tend to organize tests in a hierarchy, so the tests higher up only focus on the relevant details, while the ones lower down focus on the variations they can have. For example, for a lexer I will test the different cases for a given token (e.g. '1e8' and '1E8' for a double token), then for the parser I only need to test a single double token format/variant as I know that the lexer handles the different variants correctly. Then, I can do a similar thing in the processing stages, ignoring the error handling/recovery cases that yield the same parse tree as the valid cases.
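As a toy illustration of the "cover each token variant at the lexer level" idea, in Python (the token pattern and names are invented):

import re

# Hypothetical pattern for a double literal.
DOUBLE = re.compile(r"\d+(\.\d+)?([eE][+-]?\d+)?")

def lex_double(text):
    return ("DOUBLE", text) if DOUBLE.fullmatch(text) else None

def test_double_token_variants():
    # one check per spelling variant of the same grammar symbol
    for variant in ["1e8", "1E8", "1.5e-3", "42"]:
        assert lex_double(variant) == ("DOUBLE", variant)

def test_not_a_double():
    assert lex_double("1e") is None  # exponent marker without digits

Once these pass, the parser tests only need one double variant, since the lexer variants are already pinned down.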
I think you missed an important one, which is: how much do bugs even matter?
A bug can be critical (literally life-threatening) or unnoticeable. And this includes the response to the bug and what it takes. When I write code for myself I tend to put a lot of checks and crash states rather than tests because if I'm running it and something unexpected happens, I can easily fix it up and run it again. That doesn't work as well for automated systems.
You should understand how to make those tests low effort: look for frameworks that help you develop those tests more easily, or frameworks that remove the requirement for you, e.g. Lombok for generating getters/setters. You only have to unit test code that you wrote.
High test coverage comes from a history of writing tests there. Sadly, people include feature and functional tests in the coverage.
There's an easier answer and that is - as an experienced programmer - don't write any tests for your 'toy project' - at least not at the start.
The missing bit in the discussion is 1) churn, and 2) a dev's ability to write fairly clean code.
Early-stage and 'toy' projects may change a lot, in fundamental ways. There may be total rewrites as you decide to change out technologies.
During this phase, it's pointless to try to 'harden' anything because you're not sure what it's entirely supposed to do, other than at a high level.
Trying Amazon DynamoDB, only to find a couple of weeks in that it's not what you need ... means it probably wouldn't make sense to run it through the gamut of tests.
Only once you've really settled on an approach, and you start to see the bits of code that look like they're not going to get tossed, does it make sense to start running tests.
Of course the caveat is that you'll need to have enough coding experience to move through the material quickly, in that no single bit of code is a challenge; it's just that 'getting it on the screen' takes some labour. The experience of 'having done it already many times' means you know it's 'roughly going to work'.
I usually try to 'get something working' before I think too hard about testing, otherwise you 3x the amount of work you have to do, most of which may be thrown out or refactored.
Maybe another way of saying it, is if a dev can code to '80% accuracy' - well, that's all you need at the start. You just want the 'main pieces to work together'. Once it starts to take shape, you've got to get much higher than that, testing is the way to do that.
This is the approach I take as well, and also think about it in terms of “setting things in stone”.
When you’re starting out a project and “discovering” the structure of it, it makes very little sense to lock things in place, especially when manual testing is inexpensive.
Once you have more confidence in your structure as it grows you can start hardening it, reducing the amount of manual testing you do along the way.
People that have hard and fast rules around testing don’t appreciate the lifecycle of a project. Different times call for different approaches, and there are always trade offs. This is the art of software.
I agree with all your points. Have you looked at any strongly typed functional language from the ML family, like OCaml, F#, or Rust, or something similar like Haskell?
If you do make a slight tweak somewhere, the compiler will tell you there's something broken in obscure place X that you would only find out about at runtime with, say, Ruby or Python.
THAT'S the winning formula. I've written so many tests in Python just ensuring a function's arguments are validated, rather than testing its core logic/process.
Not so fast. For some problems it's great, for other ones it's not.
Have you tried writing a numeric or machine learning core in Haskell? You'll notice that the type system just doesn't help you enforce correctness. Have you tried writing low-level IO? The logic is too complex to capture in types; if you try to use them you'll have a huge problem.
> Have you tried writing low-level IO? The logic is too complex to capture in types; if you try to use them you'll have a huge problem.
Rust's got a very Haskell-like type system, but it's a systems programming language. People are literally writing kernels in it. I think this is a pure-functional-is-a-bad-way-to-do-real-time-I/O thing, not a typing thing.
Hum... Pure functional is a bad way to do real time I/O, but my point was about types.
If you try to verify the kind of state machines that low level I/O normally use with Haskell-like types, you will gain a huge amount of complexity and probably end with more bugs than without.
Low-level I/O doesn't seem to have that much complexity, unless you're trying to handle all of the engineers' levels of abstraction at once.
Let's say you're writing a /dev/console driver for an RS-232 connection. Trying to represent "ring indicator", "parity failure", "invalid UTF-8 sequence", "keyboard interrupt", "hup" and "buffer full" at the same level in the type system will fail abysmally, but that's not a sensible way of doing it.
I could definitely implement this while leveraging the power of Rust's type system – Haskell would be a stretch, but only because it's side-effect free and I/O is pretty much all side-effects.
Really give it a go! It is beyond worldly. If you think Typescript is great, then ocaml/f# will make it look inferior.
If you're doing React + TypeScript, give ReasonML a go; it's syntax sugar on top of OCaml that compiles via BuckleScript. OCaml has the fastest compiler out there.
How's the tooling for that? Haskell has the "best" compiler and garbage tooling, which really ought to be built on top of the ol' Rolls-Royce engine it's rocking.
Meanwhile the plugins and IDE integrations for Reason/Ocaml and F# are ready to go from the start and work pretty well.
Just a data point: with my current team, everyone jumped right in, wrote code, and wrote tests from the start. The tests were integration tests that depended on the test database. Worked great at first, but then tests started failing sporadically as it grew. Turning off parallelism helped a bit, but not entirely. Stories started taking longer too, where features entailed broad changes - it felt like every story was leading to merge conflicts and interdependency, where one person didn't want to implement their fix until someone else finished something that would change the code they were going to work on.
So then I came along and said, "hey, why don't we have any unit testing?" and it turns out because it was pretty impossible to write unit tests with our code. So I refactored some code and gave a presentation on writing testable code - how the point of unit testing isn't just to have lots of unit tests, how it's more that it encourages writing testable code, and that the point of having testable code means that your codebase is then easier to change quickly.
I even showed a simple demonstration based off of four boolean parameters and some simple business logic, showing that if it were one function, you'd have to write 16 tests to test it exhaustively, but if you refactored and used mocking, you'd only have to write 12. That surprised people. Through that we reinforced some simple guidelines of how we'd like to separate our code, focusing on pure functions when possible, making layers mockable. We don't even have a need for a complicated dependency injection framework as long as we reduce the # of dependencies per layer.
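The arithmetic behind that 16-versus-12 figure, as a hedged Python sketch with made-up rules: one function over four booleans needs 2^4 = 16 exhaustive cases, but split into two two-boolean helpers it needs 4 + 4 for the helpers plus 2 x 2 for the composition with the helpers stubbed out.

from unittest import mock

# Made-up rules, purely to illustrate the combinatorics.
def eligible(active: bool, verified: bool) -> bool:
    return active and verified

def discounted(member: bool, coupon: bool) -> bool:
    return member or coupon

def final_price(base, active, verified, member, coupon):
    if not eligible(active, verified):
        raise ValueError("not eligible")
    return base - 10 if discounted(member, coupon) else base

# 4 cases each for eligible() and discounted() (not shown), then 2 x 2 here
# with both helpers stubbed out -- 12 tests instead of 16.
def test_final_price_against_stubbed_helpers():
    for elig in (True, False):
        for disc in (True, False):
            with mock.patch(f"{__name__}.eligible", return_value=elig):
                with mock.patch(f"{__name__}.discounted", return_value=disc):
                    if not elig:
                        try:
                            final_price(100, True, True, True, True)
                            raise AssertionError("expected ValueError")
                        except ValueError:
                            pass
                    else:
                        assert final_price(100, True, True, True, True) == (90 if disc else 100)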
Since that time we've separated our test suite into integration tests and unit tests, with instructions to rewrite integration tests to unit tests if possible. (Some integration tests are worthwhile, but most were just because unit tests were hard at that time.) We turned parallelism back on for the unit test suite. The unit tests aren't flaky, and now people are running the unit test suite in an infinite loop in their IDE. Over that time our codebase has gotten better structured, we have less interdependence and merge conflicts, morale has improved, velocity has gone up.
Anyway, according to this article it sounds like we've done basically the opposite of what we should have done.
Sorry, nothing that's so good for the general public, but the general gist is that the goal for a test is something that is simultaneously small, fast, and reliable.
And that by following those three principles, it kind of drives you to writing testable code. Because if you don't, you might have tests that are only small (simple integration tests), or only fast and reliable (testing unfactored code with lots of mocking) - and that the only way to do all three is by refactoring to write testable code that has good layer separation and therefore minimal mocking requirements.
There was stuff in there about how mutable state and concurrency leads to non-determinism and therefore unreliable tests, which is part of what justifies pushing towards pure functions that can be easily unit tested without mocking.
Only half your time? You're doing testing wrong if it doesn't take 80% of the time ;-)
I have a love-hate relationship with testing. Working for myself as a company of one, some of the benefits testing brings just don't apply. I have a suite of programs built in the style of your point (1). The programs were quick to market and hacked out whilst savings ran out, not knowing if I would make a single sale.
Sales came, customer requests came, new features were wanted, sales were promised "if the program could just do xyz". More things were hacked on. The promise of "I will go back and do this properly and tidy up this god unholy mess of code" slowly slipped away, until I stopped lying to myself that I would do it.
Yes, there was a phase of fix one problem, add another, but I have most of that in my head now and it has been a long time since that happened.
Not a single test. Developing the programs was "fun" and exciting. Getting requests for features in the morning and having the build ready by lunch kept customers happy.
Now I am redoing the apps as a web app for "reasons". This time I am doing it properly, testing from the start. I know exactly what the program should do and how to do it, unlike the first time, when I really had no idea. But still, I come to a point and realise the design is wrong and I hadn't taken something into consideration. Changing the code isn't so bad; changing the tests, O.M.G.
I am so fed up with the project that I do all I can to avoid it; it is 2 years late and I wish I had never started it. The codebase has excellent testing, mocks, no little hacks; engineering-wise I am proud of it. The tests have found little edge cases that would otherwise have been found by customers, so that was avoided. But there is no fun in it. No excitement. It is just a constant, drudging slog.
I am trying to avoid dismissing testing altogether, as I really want to see the benefit of it in a substantial production codebase. If I ever get there. At the moment, the codebase is the best-tested unused software ever written, IMO.
Well, then stop! Delete all the tests right now and do it however you want to do it.
The thing about testing that never really gets talked about is: what's the penalty for regressions? What are the consequences if you ship a bug so bad the whole system stops working?
Well, if you're building a thing that's doing hundreds of millions in revenue, that might be a big deal. But you? You're a team of one! You rollback that bad deploy and basically no one cares!
Your customers certainly don't care if you ship bugs. If it was something important enough where they REALLY cared, they wouldn't be using a company of one person.
So, go for it. Dismiss tests until you get to a point where you fear deploying because of the consequences. Then add the bare minimum of e2e tests you need to get rid of that fear, and keep shipping.
There is another cost: if you try to fix a bug and break something else. If your codebase becomes so brittle that you feel like you can't do anything without breaking something else, that makes it unbearable to keep going with that project.
Having said all that, I find it's fine to skip some unit tests when building your own project. It can be better to do the high-level tests (some integration, focused on the system) to make sure the major functionality works. In many cases, for an app that's not too complicated, you can just have a rough manual test plan. Then move to automated tests later on if the app gets popular, or the manual testing becomes too cumbersome.
It's still good to have a few unit tests for some tricky functions that do complicated things so you aren't spending hours debugging a simple typo.
Sure. My point wasn't really whether to write unit tests or not. It's more, do what works for you / your team to enable you to ship consistently. For the OP, spending all of their time writing tests clearly isn't working for them if they haven't shipped at all.
> Well, if you're building a thing that's doing hundreds of millions in revenue, that might be a big deal. But you? You're a team of one! You rollback that bad deploy and basically no one cares!
Human lives, customer faith in the product, GDPR violations, HIPAA violations, data, time/resources in space missions.
> But you? You're a team of one! You rollback that bad deploy and basically no one cares!
I somehow doubt that comparing this 'team of one project' to the Mars Climate Orbiter leads to any useful conclusions. It's a nice bit of hyperbole though!
Rollbacks can create data loss. Also, rollbacks are not always a viable option.
Anyway... this was to address the issue of a bug. I took the comment of "it's just a team of one" as a way of trying to justify not putting your engineering due diligence into delivering a product to the customer.
> Rollbacks can create data loss. Also, rollbacks are not always a viable option.
I've delivered a number of products (in the early days of my career) to clients where data loss happened and while not fun, it also didn't significantly harm the product or piss off said client. I saw my responsibility primarily to do the best I could and clearly communicate potential risks to the client.
> I took the comment of "it's just a team of one" as a way of trying to justify not putting your engineering due diligence into delivering a product to the customer.
That I do agree with, but 'due diligence' is a very vague concept. I guess honest communication about the consequence of various choices is perhaps the core aspect?
And of course 'engineering due diligence', in my opinion, includes making choices that might lead to an inferior result from a 'purely' engineering perspective.
> not putting your engineering due diligence into delivering a product to the customer.
Yes. This is exactly what this person should do. Stop worrying about arbitrary rules and just deliver the damn product already. A hacky, shitty, unfinished product in your customer's hands that can be iterated on beats one that never got shipped at all every day of the week.
LOL. I guess I was being a bit conservative with that estimate!
I've worked for myself as well and know what you mean. In my situation, I was able to save myself from testing by telling my customers "this is a prototype so expect some issues".
My observation around codebases that weren't written with/for unit tests is that they always end up being a monolith that you have to run all of in order to run any of. Having decent code coverage means that it's at least possible to run just that one function that fails on the second Tuesday of the month when that one customer in Albania logs in.
Your points are fine, but I do not see how they apply to the blog post.
Overall, the blog post says, unit tests take a long time to write compared to the value they bring - instead (or also) focus on more valuable automated integration tests / e2e tests because it is much easier than it was 10-20 years ago.
My point is that OP is in step 1 of 5. It's not to say there aren't any good thoughts there, but the overall diatribe comes from a place of inexperience so take their advice with a grain of salt.
I don't think OP is step 1. OP is not arguing against testing, although the title could lead one into thinking that. OP is arguing for better, more reasonable testing.
OP appears to be arguing what you call step 5 of 5. They're not even saying you should never unit test, only that it should be avoided where it doesn't make sense, and that this happens more often than step-3 people like to think. Furthermore, the main direction of the article is that it's arguing for integration testing as a viable replacement for unit testing in a lot of situations, which doesn't relate to your overall point at all.
Step 5 touches on what I like to call "engineering judgment".
One of the things that distinguishes great engineers is that they make good judgment calls about how to apply technology or which direction to proceed. They understand pragmatism and balance. They understand not to get infatuated with new technologies but not to close their minds to them either. They understand not to dogmatically apply rules and best practices but not to undervalue them either. They understand the context of their decisions, for example sometimes code quality is more important and other times getting it built and shipped is more important.
As in life, good and bad decisions can be the key determiner of where you end up. You can employ a department full of skilled coders and make a few wrong decisions and your project could still end up a failure.
Some people never develop good engineering judgment. They always see questions as black and white, or they can't let go of chasing silver bullet solutions, etc.
Anyway, it's one thing to understand how to do unit tests. It's another thing to understand why you'd use them, what you can and can't get out of them, what the costs are, and take into account all that to make good decisions about how/where to use them.
This. These days I write unit tests only for functions whose mechanism is not immediately clear. The tests serve as documentation, specification of corner cases, and assurance for me that the mechanism does what it was intended to do.
I keep tests together with the code, because of their documentation/specification value.
I do not write tests for functions which are compositions of library functions. I do not test pre/post-conditions (these are something different).
And I definitely do not try to have "100% test coverage".
Personally, I fast tracked through 2-4 out of sheer laziness but that's definitely my progression in regards to testing and pretty much everything related to code quality. It includes comments, abstraction, purity, etc...
More generally:
- Initially, you are victim of the Dunning–Kruger effect, standing proudly on top of Mount Stupid. You think you can do better than the pros by not wasting time on "useless stuff".
- Obviously, that's a fail. You realize the pros may have a good reason for working the way they do. So you start reading books (or whatever your favorite learning material is), and blindly follow what's written. It fixes your problems and replaces them with other problems.
- After another round of failure, you start to understand the reasoning behind the things written in the books. Now, you know to apply them when they are relevant, and become a pro yourself.
One thing I do religiously all the time is putting asserts everywhere. It's the only thing you can go crazy on. The rest is indeed always a balancing act.
Unit tests are not a goal, they are a tool. Striving for 100% test coverage is nonsense; not testing your software at all levels is bad. Middle ground and moderation are where it's at, not a black vs. white choice. Just like with every other tool, you should understand its strengths and weaknesses and apply it properly, not dogmatically, or it will bite you.
I read this complementary one the other day, and one thing that is readily apparent to me is that a lot of people have a lot of different opinions about testing (move to system tests! more unit tests! regression tests!), but many are not asking the zillion dollar question:
What are you testing for?
This is critical because it basically gives you immediately what you should and should not test, and how. While mindless, dogmatic, metric oriented testing is a waste, testing with higher intent and purpose is extremely useful.
An example: test that something working on current vX also works on vA to vW, and when vZ is out, have the answer ready. Or that a biz feature fulfills the requirements. Or that someone not as well versed in the intricate details of your piece of ownership will be confident in that piece still working after a simple fix when you're on vacation. It can be one, some, but probably not all.
With that in mind, what to test, what doesn’t make sense to test, and what to test against becomes more clear: should I mock this? or should I run it against some staging environment? Should I perform (yikes? not!) manual testing?
The answers are highly dependent on the piece of code being tested.
Tests are here to help you answer a question, if you aren’t sure what the question is then your tests will miss the point.
I feel like a lot of unit testing is just another form of bikeshedding. It's easily understood, everyone can talk about it, and you can spend a lot of time on it with no clear goal but feel like you're getting something done.
100% coverage of what exactly? Tests that go through all your lines of code without testing any of the logic are useless. If you want to be thorough, you need to do mutation testing, which is a technique that tests the quality of your unit tests by mutating your logic (changing a > to >=, a + to -, etc.) and then expecting at least one test to fail. If no test fails, that piece of logic wasn't tested.
Without that, it's entirely possible your high code coverage doesn't actually test anything meaningful. Also, this sort of logic is exactly the kind of stuff you want to unit test. All the standard plumbing boilerplate code is not something that needs to be unit tested. The logic does.
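A minimal Python illustration of a surviving mutant (the rule is made up):

def can_vote(age: int) -> bool:
    return age >= 18

# This executes every line of can_vote()...
def test_can_vote():
    assert can_vote(30) is True
    assert can_vote(5) is False

# ...yet a mutation tool that flips >= to > produces a mutant this suite
# doesn't kill, because the boundary age of 18 is never checked. Adding
# "assert can_vote(18) is True" kills it. (Tools such as mutmut or
# cosmic-ray automate generating and running such mutants for Python.)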
And it's not just that: for example, branch coverage only says that you branched, not how; a complex branch might need more than one test to be fully, or even decently, tested.
Right, essentially you have to be aware that you are testing the state space the code might end up in, which is very different from just hitting every line of code or every branch.
On that note it is a great tool during development to get a piece working without connecting it to the broader application. Test driven development gives a nice debugging context that is easier to work with. The code coverage and regression part comes as a nice bonus feature.
I'd also like to add that if you contribute code to an open-source project it is extremely beneficial to have iron-clad unit tests. Since there are so many devs, it would be easy for someone to accidentally break something you fixed already.
The benefit of 100% test coverage is that there are no more discussions about what to test and what not to. When in doubt, test it. In larger groups of developers there are otherwise ongoing discussions about whether A needs to be tested or not. I have seen culture wars around this, between people who don't want to test and, in the eyes of others, never test enough, and vice versa. Especially with a diverse development force of different ages, seniority, and cultural backgrounds.
It's often easier to just aim for 100% test coverage instead (excluding some categories of files).
EDIT: I would not and did not start with 100% unit testing. But if there are ongoing culture wars and discussions didn't lead to a workable compromise, 100% test coverage worked for me, and after some days test coverage was a non-issue.
> just aim for 100% test coverage instead (excluding some categories of files).
That's where the 'gaming' comes in.
The tests start just going through lines without hitting a single expect statement.
The ignore files start becoming battlegrounds in the PRs because people just exclude half the damn project.
We just have a simple rule... if you wrote code, you have to write coverage for it. If it breaks and your test doesn't catch the breakage, the bug fix goes back to you. Some people will ask "but what about what I'm working on now", you'll have to communicate that you feel your previous work was far more important.
> the bug fix goes back to you. Some people will ask "but what about what I'm working on now", you'll have to communicate that you feel your previous work was far more important.
this feels punitive, especially in the eyes of management. unless you're in a safety critical area where fully testing every code path is a hard requirement, people will eventually write bugs.
i'd rather work somewhere that recognizes defects occur and has a fast iterative process to push out new changes rather than one based on shame for having written a bug.
Even with 100% coverage, that doesn't mean you've found every bug that can possibly be found with unit tests. Your tests could always cover more inputs, more situations, etc.
Rather, you won't find bugs that you choose not to think of because you've let the "100%" number lull you into complacency, even though you know it's 100% of lines/branches, not 100% of inputs.
My problem in 40 years of programming is that I still create bugs, and the ones I create come from not thinking about edge cases or from wrong assumptions, not from being lulled into complacency by writing tests to meet a 100% number.
But personalities differ, and if being lulled into a false sense of security by writing towards a 100% number is a problem for you, I would be careful; I totally agree there.
I agree that fuzz testing + linting can help you. However, from my perspective, lower-level tests help you build trust in the software you're releasing.
If you are someone who games metrics for their own benefit, or you hire people who game metrics for theirs, then yes, I assume this metric is very easy to game, as are most metrics. Metric systems are not cheater-proof.
I like thinking about the trade-off between a simpler rule that is mostly right vs. decisions that require judgement and consensus. I think the simpler rule is usually the better side of the trade-off.
But in this case I think the cure might be worse than the disease. Tests for plumbing code often end up being brittle tests of methods getting called on mocks in the right order. People will notice that these require a lot of toil to keep them running as code changes while providing very little benefit in avoiding mistakes. People will rankle at being told that they must write these tests, which they can see are a waste of time.
I've done it both ways. I'm much happier with my work when I'm not trying to write tests that are tedious and don't seem to provide any value, in order to hit an arbitrary coverage metric. I suspect my teammates feel the same way, so on teams where I have input into the decision on this, I do not advocate for 100% coverage. It does make it harder to have the discussion of which tests should and shouldn't be written, but I think it's worth that cost.
Writing good testing code is harder than writing business code. Junior developers especially struggle with this, most often because many companies don't write enough tests for them to learn how to write good ones.
And if you're in an environment where this is a non-issue, I think that's great. Don't fix something that doesn't need to be fixed.
Yup. The debate over unit testing, being political, is a far bigger impediment to progress than the actual tests, which are a technical hurdle. It's essentially the same rationale as for linters and auto-formatters.
It has the side benefit that it forces devs to write testable code, which inclines them towards reasonably factored code.
"larger group of developers" is the phrase that caught my attention. Humans don't scale well. This is where microservices do become attractive. This service is owned by a small team, and that team makes these types of judgements. It may be very different from how other service is owned and maintained, and that's ok.
From my limited experience you get cross-team discussions about unit testing, especially if one microservice has too many bugs in the eyes of other teams, giving development a bad reputation or making working with that microservice hard. Especially if it breaks with releases and other teams get paged.
Ongoing culture wars are a recipe for demotivated staff.
Perhaps I am wrong, and I would not start with a 'diktat' for obvious reasons.
As a manager, you didn't have discussions about the necessary level of code coverage? I would be interested in how you managed unit testing without a 'diktat', and how it fit into integration testing and exploratory testing. What level did developers in your department usually find "adequate"? If you considered it too low, how did you raise test coverage as a manager without defining a coverage level?
Exactly, the 80-20 rule also applies to unit tests. I don't have 100% coverage on big projects, but everything I write that's meant to go to production is done with TDD anyway, so there are always enough tests to prevent a junior from breaking my stuff, and I save a lot of time because thanks to TDD I don't need to manually test much, and most of the time not at all; I just wait for user feedback. It's always much easier to have code working in real conditions if it already works in test conditions, not the other way around.
There is real-world research showing that 80/20 holds for unit tests: the last 20% hardly catches any issues and contributes very little to quality.
And, like other tools, their importance is relative to the entire tool-set. In many shops, tight schedules and management by product managers or people too removed from the code make you compromise every other principle of responsible, sane coding. When this happens, unit tests are your only shield from doom. If everybody knows how, and is allowed, to write sane, good code with reasonable time to build it, then unit tests are nice to have but not a must.
Actually it is pretty common to have 100% coverage with some extra redundancy too (where some things get accidentally covered multiple times). Striving for 100% is indeed nonsense, but having 100% coverage is usually accidental in clean code that you want to work, and merely a by-product of TDD.
I strive for working code. Sometimes I miss something in the TDD cycle and don’t have 100% and it is that which usually comes back to bite you.
I have never found 100% test coverage has bitten me, dogmatic or otherwise.
That heavily depends on the size of your codebase and perhaps also the language you are writing in. Writing in C++, for example, I often have switch statements in the form of:
switch(type) {
case X:
...
case Y:
...
default:
throw InternalException("Unsupported type!");
}
Now if all goes well the default case will never be covered. At some point I thought "why have this code if it's not supposed to run; let's rewrite this so we can get 100% code coverage!", and I ended up replacing the thrown exception with a debug-only assertion plus an 'unreachable' hint to the compiler.
Now we can get 100% code coverage... except the code is much worse. Instead of an easy-to-track down exception we now trigger either an assertion (debug) or weird undefined behaviour (release) when the "not supposed to happen" inevitably does happen because of e.g. new types being added that were not handled before.
Is worse code worth getting 100% code coverage? In my eyes, absolutely not. I think good code + testing should be able to reach at least 90% typically, likely 95%, but 100% is often not possible without artificially forcing it and messing up your code and/or making it much harder to change your code later on.
This behavior occurs in internal functions and is not triggerable by the user. The only way to trigger this behavior would be to create unit tests that test small internal functions by feeding them specifically invalid input. This is possible, but I would argue this falls under "dogmatically trying to reach 100% code coverage". Testing small internal functions adds very little value and is detrimental to a changing codebase. After adding these tests every single change you make to internals will result in you needing to hunt down all these tiny tests, which adds a big barrier to changes for basically no pay-off (besides a shiny "100%" badge on Github, of course).
As always, I think the answer here is more along the lines of "it depends." It's not that uncommon of a task to make an existing function more performant, and a well thought out test suite makes that leaps and bounds easier even for small, internal functions.
It's arguable that this is a programming bug and not really recoverable, so throwing doesn't make much sense.
You can be defensive to various degrees about assertions:
1. You can just use assert() to fail in Debug and do nothing in Release.
2. You can be more defensive and define your always_assert() to fail in Release as well.
3. You can double down on the UB with hints to the compiler and provide assume(), which explicitly compiles to UB when it's triggered in Release (using __builtin_unreachable() for example).
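A rough sketch of the three flavours applied to a "can't happen" branch (the macro definitions and the toy handle() function are illustrative, not from any particular codebase):

    #include <cassert>
    #include <cstdlib>

    // 2. Checked in Release builds as well.
    #define always_assert(cond) do { if (!(cond)) std::abort(); } while (0)

    // 3. Optimization hint: explicit UB if the condition is ever false in Release
    //    (GCC/Clang builtin).
    #define assume(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)

    int handle(int type) {
        switch (type) {
        case 0: return 1;
        case 1: return 2;
        default:
            assert(false);           // 1. fails in Debug, compiles to nothing in Release
            // always_assert(false); // 2. would abort in Release too
            // assume(false);        // 3. compiler treats this branch as unreachable
            return -1;
        }
    }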
About the organization of the if statement: I agree that the former is better, I would use assert(false) though.
Indeed it is a programming bug - but programming bugs happen. In my experience writing programs as if bugs will not happen is typically a bad idea :)
Throwing an exception here is basically free (just another switch case) and gives the user a semi-descriptive error message. When they then report that error message I can immediately find out what went wrong. Contrasting with a report about a segfault (with maybe a stacktrace), the former is significantly easier to debug and reason about.
assert_always would provide a similar report, of course. However, as we are writing a library, crashing is much worse than throwing an internal error. At worst an internal error means our library is no longer usable, whereas a crash means the host program goes down with it.
Better yet, omit that default case, so that in the future when you do add a new value to the enum, the compiler will warn you and force you to add a new case.
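For example (a sketch; with GCC/Clang's -Wswitch, which is part of -Wall, the unhandled enumerator gets flagged at compile time):

    enum class Color { Red, Green };

    int to_rgb(Color c) {
        switch (c) {              // deliberately no default case
        case Color::Red:   return 0xFF0000;
        case Color::Green: return 0x00FF00;
        }
        // If Color::Blue is added later, -Wswitch warns here that it isn't handled
        // (and -Werror=switch turns that into a hard error).
        return 0;                 // fallback for the "can't happen" path
    }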
But I agree with your general thesis that it's just not worth getting to 100% coverage.
If it didn't catch any bugs, either during initial development or through later changes, then it bit you via wasting your time. I don't think it's fair to say that having tests necessarily makes the code under test any cleaner.
And that's exactly why you need less tests in your project that uses a DB: there's no need to test the DB because it's covered by its own tests already.
Interactions with your DB are often the most fragile piece of your code, because there's an impedance mismatch between the language you're writing and SQL. Some languages/frameworks abstract this more safely than others, though.
Interaction, in general, is where many, if not most, errors lie. Unit testing verifies that things work in isolation. But if you code up your unit tests for two components that interact, but with different assumptions, then the unit tests alone won't do you any good: X generates a map like `%{name => string()}`, Y assumes X generates a map like `%{username => string()}`. Now, hopefully that won't happen if you're writing both parts at the same time, but things like this can happen. Now your unit tests pass because they're based on the assumptions of their respective unit of code, but put it together and boom!
Exactly, though I believe there's still a thin line between testing the interaction, and testing the db itself. Just like, the difference between testing some code, and testing the language itself.
Mocking networked services is something very much worth doing because you can then check you’ve set timeouts correctly, handling incomplete or junk responses gracefully etc. Those are the kind of hidden problems that can bite you on production deployments.
"mocking networked services" is exactly what you would do when testing clients.
And it doesn't have to be a static mock. It's not too hard to inject a fuzzer in your mock service response, although that's probably left to a separate testing routine, and not part of your unit test setup. But if you have no mock for your network service, you can't fuzz it either.
Perhaps the problem isn't mocking dependencies, but trying to hide the fact that GetSolarTimesAsync needs two pieces of data to work: a date and a location.
But the original signature is just this:
public async Task<SolarTimes> GetSolarTimesAsync(DateTimeOffset date)
That introduces a lot of complexity:
* The SolarCalculator needs to be able to work out its own location, so it needs a LocationProvider
* SolarCalculator needs to be IDisposable since it owns a LocationProvider
* The SolarCalculator will need more methods if it ever needs to calculate the times in a different location
* If fetching the location is slow, but the application needs to calculate times for multiple dates (e.g. to build up a table of times), then the SolarCalculator will need a method that takes in an array of dates to be efficient
But all that could be solved by making the function take all of the arguments it needs to return its value:
public SolarTimes GetSolarTimes(DateTimeOffset date, Location location)
No location provider needed, no IDisposable, just one efficient stand-alone method.
Unit testing this is now just:
var calculator = new SolarCalculator();
var actual = calculator.GetSolarTimes(new DateTimeOffset(...), new Location(...));
var expected = new SolarTimes(...);
actual.Should().BeEquivalentTo(expected);
...so, perhaps the issue isn't that unit testing is a bad idea, but that code which is hard to use in a unit test might also be hard to use in a wider application? And perhaps the fix is to make the code easier to use?
Completely agree. The author is trying to blame tests for code complexity. GetSolarTimes() is a simple function evaluating an equation - treat it as such.
If your code is broken down clearly into logic and plumbing, unit testing the logic becomes super easy. It allows you to construct software using blocks you have absolute confidence in. Unit testing plumbing is harder, and that's when integration testing shines.
I agree with you, and that's often my observation too: unit tests tend to show how decoupled and re-usable your code is. If a function gets hard to test with a unit test, that usually points towards an architectural issue.
If I got a dollar every time I started testing a piece of code, realized that it was waaaay too complex, and refactored the code so the tests were easier to write/understand... I would have a lot of dollars.
100% of the time, it was the right idea and the code became a lot better.
The author's tests are overly complex. Instead of gleaning the actual value of this insight, which is that you're not cleanly separating your inputs and your outputs, the author concludes that unit tests are a waste of time.
Nope. Unit tests are a tool, but writing proper unit tests and understanding the value they give you is an art and a science. It requires experience and deliberate design.
It seems that most devs (me included) learn at school to write pure functions, which is great. Then they come to the industry and all of a sudden the "parseXml" function takes an FTP port as a parameter... ("but in my case the XML was on an FTP server!")
Why is there no CS course that explains this kind of stuff?
I thought about this too when I read the article, but then I thought that it might not always be possible to rewrite the code like that in other cases. Maybe the example used in this article was just not the best one.
var actual = calculator.GetSolarTimes(new DateTimeOffset(...), new Location(...));
is JUST SIMPLER and better than
public async Task<SolarTimes> GetSolarTimesAsync(DateTimeOffset date)
with an internal location provider as a dependency because it's easier to test.
But I think that ignores the reason why DI containers were invented in the first place, and assumes that the solar calculator is just a simple entrypoint-type application rather than a component in a real application. You might have 20 layers of THING, somewhere inside which this solar time calculator lives and is used... and you still have to get Location from SOMEWHERE to pass it into the calculator.
So what happens when whatever uses the location provider to get the location and pass it along needs to be tested? And through how many layers of the stack do you need to pass Location before you realize that every test of every intermediate layer needs to know about location, but only for the purpose of passing it along?
I think it's a more nuanced case than you're making it seem. Beyond some level of complexity in an application, it becomes simpler to co-locate dependencies where they're actually used.
I disagree with the notion that making your code testable in isolation serves no other purpose than to write unit tests. It very specifically forces you to think about how and why each piece of code is coupled with other code, and generally requires you to make this coupling as loose as possible, to make testing in isolation possible. Loosely coupled code is also easier to reason about and easier to refactor. So testing doesn't just provide you with the value of tests, it also nudges you toward a saner architecture.
I strongly disagree for the reasons listed in the article; it induces the construction and testing of abstractions which exist solely to enable testing, and do not enable simpler reasoning.
Refactoring is even worse. Refactoring after you've split something up into multiple parts and tested their interfaces in isolation is far more work. Any refactoring worth a damn changes the boundaries of abstractions. I frequently find myself throwing away all the unit tests after a significant refactoring; only integration tests outside the blast radius of the refactoring survive.
> the construction and testing of abstractions which exist solely to enable testing
One of his examples from the article is injecting, IOC-style, the HttpClient instance into his LocationProvider class. He insists that this is a waste of time, and that the automated tests (if you have any at all), should be calling out to the remote service anyway. I can't disagree more! Hopefully you're configuring the automated tests to interact with a test/dev instance of the service and not the production instance (!). But what invariably happens is that the tests fail because the dev instance happened to be down when they ran. And they take a long time to run anyway, so everybody stops running them since they don't tell you anything useful anyway. This is even worse when the remote service is not a web service but a database: now you have to insert some rows before you run the test and then remember to delete them... and hopefully nobody else is running the same test at the same time! To be useful in any way, automated tests must be decoupled from external services, which means mocking, which means some level of IOC.
On the other hand, he also introduces the example of SolarCalculator mocking LocationProvider. I agree that that level of isolation is overkill and will unapologetically write my own SolarCalculator unit test to invoke a "real" LocationProvider with a mocked-out HttpClient, and I'll still call it a unit test. (On the other hand, the refactored design with the ILocationProvider really is better anyway.)
So I think the reason people argue about this is because they can't really agree on what constitutes a unit test. I'd rather step back from what is and isn't a unit test and focus on what I want out of a unit test: I want it to be fast, and I want it to be specific. If it fails, it failed because there's a problem with the code, and it should be very clear exactly what failed where. A bit of indirection to permit this is always worthwhile.
Maybe your units are too big.
Unit tests are tricky because it's about coming to a personal and team agreement on what a 'unit' of functionality is.
I find the same issue with throwing away tests when I'm writing small-scale integration tests with JUnit. Usually I'm mocking out the DB and a few web service calls, so those tests become more volatile because their surface is exposed more. But smaller, function- and class-level tests can have a really good ROI, and they do push you to design for testing, which makes everything a bit better imo.
It's normally the opposite. Unit tests are too small.
If you unit test all of the objects (because they're all public) and then refactor the organisation of those objects, all your tests break. Since you've changed the way objects talk to each other, all your mock assumptions go out the window.
If you define a small public api of just a couple of entry points, which you unit test, you can change the organisation below the public api quite easily without breaking tests.
Where to define those public APIs is a matter of skill: working out which objects work well together as a cohesive unit.
The notion of a public API is really more fluid in the context of internal codebases as well. It's important to maintain your contract for forwards/backwards compatibility when publishing a library to the world. When you can reliably rewrite every single caller of a piece of code, you don't have that problem.
I usually test whatever subset of code can be covered with less than about a dozen test cases. If it's larger, I test logical parts of it with mocks in the leaves. For small projects it's usually a single controller with only some mocks on the edge of the system (database, external APIs etc.). Refactoring code where there is one test suite per class can be a nightmare.
Unit tests test that the units do what they are supposed to do. Functional tests test that parts of the system do what it's supposed to do.
If you change the implementation for a unit, a small piece of code, then the unit test doesn't change; it continues to test that the unit does what it's supposed to do, regardless of the implementation.
If you change what the units are, like in a major refactor, then it makes sense that you would need whole new unit tests. If you have a unit test that makes sure your sort function works and you change the implementation of your sort, your unit test will help. If you change your system so that you no longer need a sort, then that unit test is no longer useful.
I don't see why the fact that a unit test is limited in scope as to what it tests makes it useless.
If a particular test never finds a bug in its lifetime (and isn't used as documentation either), you might as well not have written it, and the time would have been better spent on something else instead--like a new feature or a different test.
Of course, you don't know ahead of time exactly which tests will catch bugs. But given finite time, if one category of test has a higher chance of catching bugs per time spent writing it, you should spend more time writing that kind of test.
Getting back to unit tests: if they frequently need to be rewritten as part of refactoring before they ever catch a bug, the expected value of that kind of test becomes a fraction of what it would be otherwise. It tips the scales in favor of a higher-level test that would catch the same bugs without needing rewrites.
> If a particular test never finds a bug in its lifetime (and isn't used as documentation either), you might as well not have written it
That's like saying you shouldn't have installed fire alarms because you didn't wind up having a fire. Also, tests can both 1) help you write the code initially and 2) give a sense of security that the code is not failing in certain ways.
> It tips the scales in favor of a higher-level test that would catch the same bugs without needing rewrites.
Writing higher level tests that catch the same bugs as smaller, more focused tests is harder, likely super-linearly harder. In my experience, you get far more value for your time by combining unit, functional, system, and integration tests; rather than sticking to one type because you think it's best.
My comment went on to say that you don't know ahead of time exactly which tests will prove useful, so you can't just skip writing them altogether. The key point is that if you have evidence ahead of time that a whole class of tests will be less useful than another class (because they will need several rewrites to catch a similar set of bugs), that fact should inform where you spend your time.
To go with the fire alarm analogy and exaggerate a little, it would work like this: you could attempt to install and maintain small disposable fire alarms in the refrigerator as well as every closet, drawer, and pillowcase. I'm not sure if these actually exist, but let's say they do. You then have to keep buying new ones since the internal batteries frequently run out. Or, you could deploy that type mainly in higher-value areas where they're particularly useful (near the stove), and otherwise put more time and money in complete room coverage from a few larger fire alarms that feature longer-lasting batteries. Given that you have an alarm for the bedroom as a whole, you absolutely shouldn't waste effort maintaining fire alarms in each pillowcase, and the reason is precisely that they won't ever be useful.
There are side benefits you mentioned to writing unit tests, of course, like helping you write the API initially. There are other ways to get a similar effect, though, and if those provide less benefit during refactoring but you still have to pay the cost of rewriting the tests, that also lowers their expected value.
To avoid misunderstanding, I also advocate a mixture of different types of tests. My point is that, given the observation that unit tests depending on change-prone internal APIs tend to need more frequent rewrites, their expected value should be discounted accordingly, and that should affect how the mixture is allocated.
I get what you're saying and it makes sense to me.
> unit tests depending on change-prone internal APIs
This in particular is worth highlighting. I now tend not to write unit tests for things that just get data from one place and pass it to another, unless the code is complex enough that I'm worried it might not work or will be hard to maintain. And generally, I try to break out the testable part into a separate function (so it's get data + manipulate (testable) + pass data).
Not if you rewrite/change the tests first, since you know the code currently works and you are safe to refactor the tests. Equally you are safe to change the tests to define the new behaviour, and then follow on with changing the code to make it green.
The point was to change the code structure without changing the tests (possibly to enable a new feature or other change). The challenge being when the tests are at the wrong "level", probably by team policy IME. If you change the tests, how can you be sure the behavior really matches what it was before?
Agreed. I see tests as the double-entry accounting of programming: they let you make changes with a lot more confidence that you're only changing what you want and not some unexpected thing.
They're not for catching unknown bugs, they're for safer updates.
You often end up reimplementing the "consumer" module for your code in order to test. This is problematic because a) it's extra work, b) that fixture layer probably doesn't behave exactly like the real caller code, and c) now you have to keep those two implementations in sync.
There shouldn't be anything particularly complex in test code, limiting the extent of any "reimplementation". Moreover, if the test client is different from real clients and not "in sync" it's a good thing: unit tests that do something differently within the limits of documented acceptable behaviour expose assumptions and bugs.
For example, suppose outputs should be sorted in a certain way, but they are sorted if a client presents sorted inputs and not because they are actually checked and sorted: a test with random inputs can expose the hole.
Put another way, if your code is very difficult or complicated to unit test, you've probably abandoned best practices for the language you're writing in somewhere along the way, in the name of expediency.
Striving to make your code testable is almost always worth it. Someone might ask this guy to add some error handling to his code for example. :)
Then he will find out that code, however simple, that merely "works on my machine" (i.e. is proven to work in a single happy-path context) is painful to change. Writing code that runs in multiple contexts (composed as an app, or decomposed for testing) is intrinsically easier to work with and change.
Suppose you have a unit that makes a couple of api calls to set itself up, then performs a computation, then does some sort of storage. Thinking about how you would test this might lead you to an inversion of control approach, and you might isolate side effects. The storage provider might get passed in, both to make it easy to mock, and to reduce coupling and surprise.
Mainly it reduces surprise. If you call an interface with implicit dependencies, you won't know why it's breaking without debugging and making sure it sets up its dependencies properly. If you call a testable interface with explicit dependencies, you can mock out those dependencies to debug parts individually.
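A minimal sketch of what that might look like (all names made up): the computation works on plain values, and the storage side effect sits behind a small interface that gets passed in.

    #include <string>
    #include <vector>

    // The side-effecting dependency, behind a small explicit interface.
    struct Storage {
        virtual ~Storage() = default;
        virtual void save(const std::string& key, double value) = 0;
    };

    // The unit takes its dependency explicitly instead of reaching out for it.
    class Report {
    public:
        explicit Report(Storage& storage) : storage_(storage) {}

        void run(const std::vector<double>& samples) {
            double total = 0;
            for (double s : samples) total += s;  // the actual computation
            storage_.save("total", total);        // the isolated side effect
        }

    private:
        Storage& storage_;
    };

    // In tests, a trivial in-memory implementation stands in for the real thing.
    struct FakeStorage : Storage {
        double last_saved = 0;
        void save(const std::string&, double value) override { last_saved = value; }
    };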
Unit tests have a purpose, which is mostly to protect the programmer against future mistakes. Integration and system tests protect the user against current mistakes.
I've been on projects that focused almost exclusively on unit tests and on projects that focused almost exclusively on integration tests. The latter were far better at shipping actually working code, because most of the interesting problems occur at the boundaries between components. Testing each piece with layer after layer of mocks won't address those problems. Yay, module A always produces a correct number in pounds under all conditions. Yay, module B always does the right thing given a number in kilograms. Let's put them together and assume they work! Real life examples are seldom this obvious, but they're not far off. Also note that the prevalence of these integration bugs increases as the code becomes more properly modular and especially as it becomes distributed.
I firmly believe that integration tests with fault injection are better than unit tests with mocks for validating the current code. That doesn't mean one shouldn't write unit tests, but one should limit the time/effort spent refactoring or creating mocks for the sole purpose of supporting them. Otherwise, the time saved by fixing real problems more efficiently - a real benefit, I wouldn't deny - is outweighed by the time lost chasing phantoms.
Unit tests protect you against current mistakes. They're tied to the exact implementation.
"Right now my function X should call Y on it's dependency Z before it calls A on it's dependency B.
I know that my method should do this, because this is how I designed it now.
Let me write a test and expect exactly that."
Integration and system tests will tell you whether your code will still work in the future when you refactor.
"Okay, we rewrote the whole class containing the function. Does running my thing still end up writing ABC into that output file?"
If unit tests are tied to an exact implementation, they'll fail on correct behavior and that's definitely wrong. It shouldn't matter whether X calls Z:Y or B:A first, whether it calls them at all, whether it calls them multiple times, or whether it calls them differently. All that matters is that it gets the correct answer and/or has the same final effect.
Unit tests should be based on a module's contract, not its implementation. This is in fact exactly what's wrong with most unit tests, that they over-specify what code (and all of its transitive dependencies) must do to pass, while by their nature leaving real problems at module interfaces out of scope.
a) Most code in the wild doesn't have an explicit output and instead is orchestration code.
b) Even if you have an output, it's dependent on more complex input of arbitrary types.
Assume that there's a method that returns a value based on summing the outputs of method calls on its abstract dependencies.
To do dogmatically correct unit testing you'd pass in those two mocked dependencies, and have the mocks return canned values when the right methods are called on them.
Then you'd assert that B was called on A, that D was called on C, and that the method under test returns the sum of those returns.
As soon as you move into passing implementations of those 2 dependencies, to anyone dogmatic you're doing integration testing.
Even if the tester isn't being dogmatic, in a lot of cases these inputs are complex enough that building enough actual inputs that are consistent and realistic to cover all the cases is prohibitively costly, so they opt for mocks.
Now, suddenly you just have more code to maintain when making changes, but you feel good about yourself.
The interface on our object (O) that you are describing is:
O -> int
Your unit test is concerned with narrowing the interface above to:
O -> int // of specific value based on dependencies
If Os only dependencies are A and C, this can be rewritten to:
A -> C -> int // of specific value
Of course if we assume both A and C, themselves, have dependencies we can recursively rewrite the above until we have a very long interface, but instead you have opted to mock (M) them:
M(A) -> M(C) -> int // of specific value
You then take it a step further and mock the method calls on each to return a specific value:
M(A) -> int
M(C) -> int
becomes:
M(A) -> 3
M(C) -> 5
Okay. Now we can rewrite our interface to:
3 -> 5 -> int // of specific number
and our test to:
3 -> 5 -> 8
and make our assertion that the result is indeed the sum of the inputs (not to mention the ridiculous assertions that specific methods were called within the implementation). Yikes... No wonder OOP gets a bad rap. All that for what amounts to a `sum` function.
The designer of the above monstrosity could learn a lot from the phrase "imperative shell, functional core". It sounds like dogma until you are knee deep in trying to test the middle of a large object graph!
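For contrast, the "functional core, imperative shell" version of that example might look something like this (a sketch with made-up names): the core is a pure function you can test with plain values, and the shell is the only code that touches the dependencies.

    // Stand-ins for the two abstract dependencies (hypothetical names).
    struct DependencyA { int fetch_value() const { return 3; } };
    struct DependencyC { int fetch_value() const { return 5; } };

    // Functional core: trivially unit-testable, no mocks required.
    int sum(int a, int b) { return a + b; }

    // Imperative shell: the only place that talks to the dependencies.
    int combined_total(const DependencyA& a, const DependencyC& c) {
        return sum(a.fetch_value(), c.fetch_value());
    }

    // The unit test collapses to something like: assert(sum(3, 5) == 8);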
No, unit tests aren't tied to the current implementation, they're tied to the current interface. If your programming interface calls for multiple interdependent objects without central coordination, then yes, you should test that. But I would say that you've already started out with code that is too badly structured to allow for testing the units in isolation: you should be able to unit test A without relying on Z at all.
It's integration testing that validates that all your units still combine (integrate) into a working end product. That's not about testing your implementation nor your internal interfaces, that's about testing your program's inputs and outputs.
All tests protect the programmer against future mistakes. All tests are a protection against regressions.
But yes agreed, integration tests absolutely carry much more value than any unit tests might. Specifically because units tests tend to target things that are essentially implementation details.
The only time I'd say unit tests carry any value is if they're testing some especially important piece of business logic e.g. some critical computation. Otherwise, integration tests rank the highest in the teams I lead.
One interesting thing that is easy to notice about all of the examples in the article is that they are absolutely infested with objects.
I don't have anything against objects, per se, but I think they tend to make unit testing much more difficult to accomplish. The closer your code resembles pure functions, the easier it is to do dependency injection and unit testing.
> There isn't pure functions the moment you touch any kind of IO.
You can get pretty far with good abstractions and dependency injection. Go's io.Reader and io.Writer interfaces are a great example of this. The resulting functions aren't pure in a technical sense, but they're pretty easy to unit test nonetheless.
> Plus the same problem arises with modules instead of objects, which traditionally are even harder to customize.
Maybe you could elaborate. I really don't understand what you mean here.
From what I understand, modules just scope names, they don't maintain state. I don't see how they have the same problems as objects.
> Which goes back to the article's point of having to write code that is unit test friendly.
> Now architecture decisions have to integrate interfaces that wouldn't be needed otherwise.
You're not wrong.
But in the context of functions, that doesn't seem to me to be particularly onerous. If the worst I'm forced to do is change the type of my parameters to an interface instead of a concrete type, that seems like a pretty small price to pay for easy testability. Certainly a much smaller price than the examples in the article.
That's how a lot of great C code is written anyway. A C library should abstract out logging, allocation, and IO so that the client code can change them out if need be.
The fact that it makes unit testing easier is just icing on the cake.
Agreed, and this goes back to the initial thread: just because a language is more focused on functions doesn't automatically make testing better, unless the code was written with testing friendliness as part of the requirements.
For C, I've found it's not a test-friendliness thing though; the great C libraries were doing this before unit testing made its way into their codebases. They dependency-inject IO, memory allocation, and logging because they have no idea what you as the end user are going to be using for those. So you pass all that in on an env struct when you initialize the library.
You probably want it rigged up to your own logger instead of just blindly writing to stdout. You probably want the library's allocations tagged somehow on the heap so you can track down memory leaks. You probably don't want it doing IO directly, because of how many different way there are to do IO.
It's all more a function of how incredibly varied C environments are than design for testability. It just happens to be very testable as an aside.
In my experience mock objects can be brittle. A few sprinkled in judiciously can be ok, but once the density gets high enough, it starts to feel like the test becomes decoupled from the actual code it's supposed to test.
Agree. To add to this, many unit tests that you might have to do become obsolete with a strongly typed functional language. At that point you’re basically only integration testing the API boundaries / external interfaces.
Early in my career I saw a large legacy project that was riddled with bugs turned around after a senior developer insisted on having unit tests. No one else believed in the value of unit testing, so he added them on his own in his free time. Occasionally another developer would push up some code that broke the senior developer's tests, and he gradually got the upper hand because he now had proof that his tests were finding real problems.
Everyone started writing unit tests, and the code broke less. Developers became more confident in deploying, and eventually most PRs looked roughly the same: 10-20 line diff on the top, unit tests on the bottom. If there were no tests, the reviewer asked for tests. It became a fun and safe project to work on, rather than something we all feared might break at any moment.
I've since started insisting on having them as well, especially when I'm using dynamically typed languages. A lot of the tests I write in Python for example are already covered in a language like Go just by having the type system.
I programmed the first 10 years of my life in compiled statically typed languages (C, C++, Java, etc), then I needed to start programming in Ruby for production environments and initially I felt "naked"; I felt so insecure when building something and not having it compiling successfully. That's when I really got into Unit Tests, bugs as stupid as "vlue" instead of "value" typos can plague your codebase in languages like javascript, python, ruby, etc; and testing is the only way to find them (other than... in production errors).
Functional Code with no side effect should be unit tested.
Integration Code which glues various components together should have integration tests. If you feel like you need unit tests but have to create too many mocks, you have merged functional and integration code, separate them out.
We initially only had integration tests, because many people think they're better. I get it: itests use the real plumbing, so they're more representative of your runtime. But they're slooooow -- especially the tests that involve the DB (which is most of our itests).
So we started adding unit tests. Utesting code that wasn't written for utests is painful: you often need to choose between refactoring or just patching the hell out of it. The latter is highly undesirable, since it leads to verbose tests, failures when you move a module, and the inability to do blackbox testing.
But utests encourage our new code to be clean and readable. We've found that functional programming is much easier to test than object-oriented, and is easier for engineers to grok. We just sprinkle a little dependency injection and the whole thing works nicely.
Itests have their place, but utests lead to faster feedback and more readable code.
Unit tests are an easy path to fall down, because they're clearly easier to set up and write, require less effort to maintain, and execute more quickly.
But you don't realise their significant downside until after you attempt a major refactor - you begin to see that unit tests are testing at the layer that changes the most anyway.
Weird that you started using a functional approach, noticed that it’s easier to unit test, and drew the conclusion that unit testing is what led to more readable code. Consider that functional code is the source of readability. Also we don’t typically call it “dependency injection” in the functional world
You're absolutely right: functional code is the source of the readability. But writing unit tests incentivizes engineers to keep things functional.
What's a better term than "dependency injection"? What should I call an argument whose default is always used in production code, but is there to make passing a mock easy? I'm not trying to be snide -- I'm genuinely curious.
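Concretely, I mean something like this (a sketch; names made up), where the default is the real implementation and tests just pass a stub:

    #include <chrono>
    #include <functional>

    using Clock = std::function<std::chrono::system_clock::time_point()>;

    // Production callers never pass 'now'; tests pass a fixed fake clock.
    bool is_expired(std::chrono::system_clock::time_point deadline,
                    Clock now = [] { return std::chrono::system_clock::now(); }) {
        return now() > deadline;
    }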
I'm a massive fan of unit testing, but I mostly agree with the observations. However, I (mostly) disagree with the conclusion. The problems with unit testing I've seen come from the following anti-patterns, in various combinations.
1) The use of unit tests as the exclusive automated test type. ie; No functional tests, integration, etc.
2) Test doubles for most or every dependency, even purely functional dependencies like math libraries.
3) Not using the appropriate kind of test double for the test at hand. (Dummies vs Fakes, vs Spies, vs Stubs, vs Mocks)
4) The overuse of mocking libraries.
Mocking libraries have their place, but in my opinion they are used approximately a hundred, perhaps even a thousand, times more often than they should be. I use them to create test doubles in exactly three scenarios:
1) A dependency that does not have an interface, usually a third party library. This usually happens in one place only, and is used for writing the wrapper code test.
2) A dependency that has an incredibly large interface and/or dependency graph where building a set of stubs or spies is simply not worth the effort.
3) I want to test weird edge cases that aren't reachable any other way, such as theoretically unreachable code.
These should not be the majority of your unit tests!
Code is a liability. Unit Tests are code obviously and are no less prone to contain bugs than the code under test. And of course they require maintenance just like any other part of the codebase.
It feels like the industry has blindly pushed for unit testing everything and 80% or more code coverage as the gold standard.
I’ve given up arguing about the cost/benefit of unit tests at work. I feel that the software the teams I’ve worked on over the past couple of decades still produce about as many bugs as before unit testing came along. I’m not building pace makers or aviation software, mostly LOB applications.
Unit tests provide a false sense of security (especially to management.) Yes sometimes they help catch refactoring bugs, but at what cost?
The article omits a key point when talking about any practice: the context in which unit testing is performed. The size of the team, the type of company, the technology, and the impact of product defects.
For a startup with a small team and few customers building an MVP? Unit testing is overrated.
For a company with 50 engineers in 10 teams building a product, that moved $500,000/day in revenue? Unit testing could or could not be overrated.
For a company with 1,000 engineers working in the same repo, shipping a product that moves $50M in revenue per day? Unit testing is most likely underrated - and essential.
You cannot ignore how the organization works, and the cost of a defect that a unit test could have caught. I happen to work at the third type of organization, and while unit tests might not be the most efficient type of safety net, it is a very big one. We have other types of testing layers on top of unit: integration and E2E tests as well.
Also, one more fallacy in the article:
"If we look back, it’s clear that high-level testing was tough in 2000, it probably still was in 2009, but it’s 2020 outside and we are, in fact, living in the future. Advancements in technology and software design have made it a much less significant issue than it once was."
This is not true everywhere. High-level / E2E testing on native mobile applications in 2020 is just as bad as it was on the web in 2009.
> Unit testing is most likely underrated - and essential.
You are right, but it still doesn't mean aiming for high coverage. In the big company case you'll want to cover the interfaces and dependencies and less of your team's code.
I know that part of this will fall under "integration" but definitions are sneaky.
I think unit testing definitely has its place. I use it a lot; but I have learned to moderate my reliance on unit testing.
I tend to prefer test harnesses and manual (or automation-assisted) testing. I've sometimes written my own unit-testing frameworks, because the "canned" variants didn't give me what I needed.
The term "unit testing" is quite old. It seems to mean something different, these days, from what it used to mean.
As far as I'm concerned "not testing my software" is out of the question.
For most of my projects, the testing code is vastly greater than the actual product code.
Unit tests are insurance for later refactoring and library upgrades. This lets you avoid premature abstractions, as you can easily swap out lines of code and verify you didn't break the app. This is especially important when you aren't the person doing the future refactoring.
Tests are insurance for refactoring which doesn't change the interface that is being tested.
Refactoring usually changes interfaces. Things are factored differently. The clue is in the name.
The higher up the stack your test is, the more insurance it gives you for refactoring. The lower downs the stack it is, the more likely it is to be thrown away or heavily rewritten after refactoring.
No, refactoring shouldn't change public interfaces. The very definition of refactoring is rewriting code without changing interfaces.
> Things are factored differently.
internally
> In computer programming and software design, code refactoring is the process of restructuring existing computer code—changing the factoring—without changing its external behavior.
If you're developing a library, then refactoring shouldn't change public interfaces. If you're developing an application and you own all the code paths to the code, then refactoring could change public interfaces, as the external behavior here would be the UI.
If you literally agree with the "refactoring shouldn't change public interfaces" then we need a new word for "code improvement which doesn't change external behavior, which can mean UI", which is the more commonly needed term.
And then perhaps we could agree that "code improvement often changes public interfaces" and how this relates to unit tests.
Exactly this. It's mostly lost on engineers that unit tests are usually testing at the layer that changes the most anyway - though the pain is felt once any real refactoring effort begins.
It's a perennial point, but: unit tests are more about how you think about and rigorously approach a problem than about preventing regressions. I'd say if you wrote unit tests and then deleted them, your code would be better for it. Indeed, I often don't commit some of the tests I've written in order to write the code! I'll ship some subset of them. It's a notebook.
They serve as a form of living documentation for the code and help increase velocity in a build under the right conditions.
For example, you do need to know a certain function does what you think it does because the rest of the system isn't even in place yet. You might have to approach this from the outside, via integration, but the speed of doing this is quite slow. Versus a unit test.
This is not to mention refactoring!
The piece seems to be more about the value of mocking and how far to go with isolation, which is a slightly tangential issue. I agree that in particular styles of object-oriented programming this becomes absurd.
The article linked here provides a more convincing case, on the grounds that unit testing foregrounds the system as software, as opposed to the software as a useful and functional thing in the world, meeting user needs.
However, it omits that being able to _react_ to user needs is actually a central part of agility. Unit testing allows maximum reactivity to changing requirements without regressions in the code, and makes the code navigable. Changing customer needs mean your carefully built functional tests are going to be just as useless and rotted. This has been the case with the large suites of functional tests - say in something like Cucumber - that I've seen. Better to test at a lower level, which moves slightly less rapidly.
Unit testing, yes, it can be a waste of time. It depends on what you're doing. Unit testing fails when you sink more and more time into trying to make a test for something because 'duh, unit test everything'. The fact is, some code changes a lot and some code doesn't change much. Some code is also hard to unit test and some isn't. The intersection of code that doesn't change much and code that's hard to unit test should NOT be unit tested, especially if another form of testing works better. There's just no need to sweat over a test that won't run enough to justify the work that went into writing it.
The issue here is that developers don't have a sense for economics. Diminishing returns, marginal utility, and opportunity cost should be studied by ALL.
What’s with the recent HN push against unit tests? Yes you need other tests too, but they serve a purpose. You can’t build on bad foundations! And the search space of integration tests is larger so it’s much harder to have good coverage of non-happy paths
I think it's a critical mass of people experiencing the "unit tests pass but the code is broken" problem. When unit tests are used to test glue code you end up testing your mocks and nothing else. Mocks are often done using a framework which encourages one off implementations per unit test. This introduces a maintenance burden where all the mocks have to be kept in sync.
Unit tests are very useful but we somehow landed ourselves in a place where we have lots of line coverage in our unit tests but little confidence that the code actually works.
There are at least 3 different applications at work where the unit tests are green while the code is broken, or red when it's not. The cause is almost always the mocks. They either presume too much, making them fragile, or they are flat out incorrect. Despite having a lot of test coverage, there is little confidence for the developer that their change is correct.
In a sense the writers of the tests were "doing it wrong". The burst of articles on unit tests and their failure modes are a reaction to the prevalence of this in our industry.
Similarly, the unit tests never catch bugs and you're always changing them because of refactors or changing requirements. At some point it's like, 'hey, this set of tests has been rewritten 3 times and they never caught anything, so... does that mean we just wasted time the first two times?'
I've seen unit tests as documentation cause problems a few times. Maybe it works if you've got some sort of DSL where it's actually obvious what behavior is expected. More often, though, it's 10-50 lines of setup / mock code and then some number of asserts, and you're left trying to decide what the point is (oops, it turns out something got misinterpreted and the test is actually nonsense).
Finally, it seems like using the type system to design code where illegal states are not representable is starting to make some headway. Additionally, we're seeing increasingly powerful type systems make it into industry acceptable languages.
Even with type-driven development you still want unit tests in the form of property-based tests. Also, a nice way to resolve the 10-50 lines issue is to follow the Arrange, Act, Assert pattern, where you can look at the second block to see the actions. Move as much of the Arrange section as possible into a shared setup and it should be a bit clearer :)
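A sketch of the shape (GoogleTest-style macros; the Parser class is made up just to have something to test):

    #include <gtest/gtest.h>
    #include <sstream>
    #include <string>
    #include <vector>

    // Toy class under test.
    struct Parser {
        std::vector<std::string> parse(const std::string& text) const {
            std::vector<std::string> lines;
            std::istringstream in(text);
            for (std::string line; std::getline(in, line);)
                if (!line.empty()) lines.push_back(line);
            return lines;
        }
    };

    TEST(ParserTest, IgnoresBlankLines) {
        // Arrange: build the object under test and its inputs.
        Parser parser;
        const std::string input = "\n\n";

        // Act: the single behaviour being exercised.
        const auto items = parser.parse(input);

        // Assert: check only the observable outcome.
        EXPECT_TRUE(items.empty());
    }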
> people experiencing the "unit tests pass but the code is broken"
Arguing against unit testing is like arguing against type-safety (and, usually, anti-unit-testing people are anti-type-safety people, too). The presumption always seems to be that if it doesn't solve every problem, it's unnecessarily slowing things down.
In my experience it's the pro-unit-testing people who are more likely to be anti-type safety people. The argument against type safety typically goes something like this:
I already have to write unit tests, so why would I bother with types? They don't add any real value.
Both camps are wrong. Unit tests are an unambiguous good. Types are also an unambiguous good. Both have some rather common failure modes though and guarding against those failure modes is useful.
Unit tests of purely functional code where you only need to provide an input and validate an output provide tremendous value. The unit test can treat the code as a black box, and as a result the unit test is robust and resilient to changes in the black box while verifying that the box still produces the correct answer. Unit tests of code with hidden dependencies that need to be mocked require a lot more care to construct properly. Mock scripting frameworks encourage a number of bad habits: things like "how many times did this method get called?" or "always provide the same answer when this method gets called". The result is a hundred reimplementations of the same interface that are at best correct for the current version of the code they are testing and at worst completely incorrect reimplementations of whatever they are mocking. They all need to be kept in sync and maintained over time.
A shared in-memory Fake will in general provide more value and be less fragile over time, while also ensuring that your tests are actually testing the code and not the particular script you defined for the mock.
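For example (a sketch with a made-up interface): one hand-written Fake, shared by the whole test suite, instead of a fresh mock script in every test.

    #include <map>
    #include <optional>
    #include <string>

    // The dependency's interface, as production code sees it.
    struct UserStore {
        virtual ~UserStore() = default;
        virtual void put(const std::string& id, const std::string& name) = 0;
        virtual std::optional<std::string> get(const std::string& id) const = 0;
    };

    // The shared Fake behaves like a real (if simplistic) store, so tests get
    // genuine read-after-write behaviour rather than whatever a per-test mock
    // script was told to return.
    class InMemoryUserStore : public UserStore {
    public:
        void put(const std::string& id, const std::string& name) override {
            data_[id] = name;
        }
        std::optional<std::string> get(const std::string& id) const override {
            const auto it = data_.find(id);
            if (it == data_.end()) return std::nullopt;
            return it->second;
        }

    private:
        std::map<std::string, std::string> data_;
    };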
> (and, usually, anti-unit-testing people are anti-type-safety people, too)
This is the opposite of my experience. Most of the anti-unit testing people I've talked to are very much pro-type people. I wonder if anyone has done any studies that shows what the actual numbers look like.
> Arguing against unit testing is like arguing against type-safety
I disagree with this. Types (at least when they've been built on top of an actual logic) have the benefit of real costs and benefits. You can show what programs you are unable to write and you can show (mathematically) that certain failures won't happen.
Unit tests on the other hand are much more hand wavy. You can show that some refactors seem easier, but you can't prove it without a lot of data that has to be collected on a project by project basis. I'm not saying that I don't want unit tests if I'm doing a non-trivial refactor, but I am saying that that desire is more of an intuition thing. It's not like I can make any proofs around it like I can do with a type system.
you can have good experiences with unit tests or bad ones depending on the health of your codebase. Hard to test code is a smell that many people interpret as being a problem with the concept of unit testing.
Most of my experience is with React, and the majority of react devs I've seen don't unit test their code, because they don't write enough pure testable functions. As a result the community has leaned heavily on React Testing Library which is AMAZING for integration testing. Instead of checking that your function returns the right value, people will mount the component and then check to see that the right value is displayed in the rendered DOM. This obviously works, but writing the logic as a pure function and unit testing that function with a lot of different inputs gives you much more confidence.
It takes skill to know which parts to test with unit tests.
If you start striving for 100% unit test coverage, then you'll be testing whether the IDE's pre-generated setters and getters actually set and get the value. This adds zero value to the codebase, and you'll be testing things that, if they fail, will break half the world anyway.
Unit Tests are for algorithms, stuff that does something complex and not immediately obvious. Preferably deterministic, every time X goes in, X+Y comes out type of stuff.
Most tests should be either integration tests, testing the interfaces between different parts of the software or automated tests pretending to be the user, made with Robot Framework or something similar.
More like all languages are not equally easy to test. Just like all languages are not equally easy to write. It's incredibly easy to write mocks/spies in javascript without a single third party library. It's tedious to do that in Java.
There is also a corporate culture of tests that mandates bloated test frameworks, which leads to developers spending more time on writing tests than writing functional code.
That depends on if you are shipping tests to customers. Most companies are shipping code, not tests, so time spent on test is time not spent on code. You only want tests that make code easier to write.
You’re not just shipping code, you’re aiming to ship a correct program right? And tests are a part of that and will be on the list of risks the customer may be concerned about. It’s not just about making code easier, it’s about showing the code works as expected.
And the remaining 50%: making it easy for people to give you useful feedback, such as error messages giving a web link to a user-friendly bug tracker. Ideally the link already fills in the stacktrace and system info.
I always argue for this and get told, "that's not for the user. It adds screen bloat to the UI." I never understand why people say that. How about we make it easier for us to help people? I mean, at the core that is kind of our job.
This! Not just for the systems development side of things, but just as much for normal usage. I tend to do work for fairly small organizations (<500 ppl), but with layer upon layer of bureaucracy and management, not to mention cultural and geographic differences.
The main goal of the systems I work on is to provide technical documentation of complex industrial processes. If things break, it can be pricey and/or dangerous. Having good information is a must.
However, if a user sees that something is off or just plain missing in the sometimes 30+ year old documentation, the easiest way to deal with it is to make a note of it and adapt to it for his or her work. Reporting the problem back in order to get it fixed is....difficult. There probably is a process for it, you probably don't have an account where you can log the time spent on it.
Having a quick and low-threshold way to report problems would be of enormous value in the long run.
I'm a big fan of unit tests and from my perspective, this article misunderstand what unit tests are for. Unit tests verify that the central assumptions you've made about some module's behaviour hold water. Unit tests aren't supposed to "exercise user behaviour" or "verify business logic", it verifies the theoretical behaviour of a carefully isolated module under specific conditions. A well crafted suite of tests makes it easier to reason about your application's behaviour.
This is the correct way, but in my experience less-traveled programmers start striving for that elusive 100% coverage and end up writing unit tests for completely obvious stuff like constructors (which do nothing else than setting values) or setters/getters.
Developers always have to estimate the cost/benefit of how they're spending their time and overly strict coverage targets get in the way of that.
Test coverage targets make very little sense for a statically typed language like C#; they're almost a fool's errand. In dynamically typed languages, it's hard not to almost hit 100% if you're doing some honest TDD, just for example.
> There is no formal definition of what a unit is or how small it should be, but it’s mostly accepted that it corresponds to an individual function of a module (or method of an object).
And there-in lies the problem. Remove the idea that the unit is a single method/function.
I subscribe to the idea that a unit is a unit of functionality. Nothing to do with the code implementation.
Only mock where you're reaching out outside of your codebase (filesystem, network, operating system (time, for ex.))
You can still do unit tests for individual functions when you need to work on a complicated algorithm, but those functions should have no dependencies or side effects - pass in all the data you need
I've worked at really small startups (<20 engineers) and pretty large companies (>1k engineers).
At small companies we have absolutely been able to get away with 0 unit tests while maintaining agility - being able to do major refactors quickly even when working in dynamic languages, while maintaining a high level of quality, even when operating at reasonably large scale. The key is clear, well-written code and strong ownership from senior engineers who have a deep understanding of the code they own.
On the other hand, at large companies, extensive unit testing has been invaluable. Code bases are older, ownership changes hands frequently, new engineers join all the time, old ones move on to other projects, dozens of teams are calling each others code, refactoring is done by people who had no hand in writing the code in the first place, dependencies are higher and harder to track down completely, and it's not realistic to cover all important functionality with integration tests. Engineers often must rely on unit tests to prevent others from breaking their code, and to ensure that they are not breaking someone else's. Yes, unit tests can be highly problematic and costly to maintain, and they do add friction and time to initial development, but in these scenarios the benefits outweigh the costs considerably.
The units here are poorly designed - for example, I'd expect the LocationProvider to be responsible for choosing how to get the location, not the caller - and this makes them hard to test. The solution is fixing the design, not throwing out unit tests entirely.
The example is a straw man. He refers to a doubling in code size, but that's just because he's got hardly any code. In a real class, you still have one method (or two! or three!) on the interface but you've probably got 300 lines in the implementation. The only thing being "duplicated" is the external interface.
Might as well retitle this "Why I don't like the Interface Segregation Principle".
Engineering a good code base is very hard and unique to every project, and the mentality, especially in the .NET community, that "loosely coupled code is good" usually ends up over-engineering things and adding more complexity than necessary.
I've seen far too many cases where added complexity has killed the project's time budget because things just take so much longer to do. You need to maintain a healthy balance and actually evaluate whether building this particular system in a way where we can "swap it out" later on is a real use case.
Every project is unique and "unit testing all the things" might not be the solution for your project. Where I work we shifted our mentality when we highlighted this problem to only unit test bugs and write less decoupled code and in bigger services that are more mission critical we do integration testing instead. This works very well for us but as highlighted earlier, every project is unique and you should experiment what works for your particular team because what works for others might not work for you.
This was obvious to me as soon as I wrote my first unit tests. I was all about uncle Bob's philosophy till I realized it doesn't work and discourages creativity and problem solving. Now I remove most old unit tests as they are a liability. They break while the code they supposedly test works perfectly. Maintaining them is a waste of time. I do have some unit tests for specific algorithms written in functional languages. They can be helpful but are no substitute for proper testing that actually catches bugs, whatever way that's achieved. In software, people love fashion. Unit tests were and are to some extent fashion. Like microservices and other ideas that only apply to the top 0.1% of companies, unit testing is generally useless in apps. There is some use when used with specific algorithms where the inputs and outputs are always constant and known. Libraries, programming languages, etc. But for apps, they are mainly a liability.
I believe unit and integration tests, apart from checking whether code does what you think it does, serve both as a sort of executable documentation and, most importantly, as a way to highlight development intent. If you TDD your code iteratively, reflecting not only the unit in your code but also the intent in your tests, you get a much healthier testing base.
Code coverage is not really a good metric. Intent coverage is more interesting, but a whole lot more subjective and elusive. All in all, tests should be written not with your own self in mind, but with whoever might come later to maintain your code.
The article isn't about this, but in my experience, a lot of unit testing is driven by loosely typed languages, and tests for things that would be a compile time error in something stricter. Rust in particular has a way of making many unit tests redundant.
Like all things taught and learned by cargo-culting, unit tests have long lost the original intent.
No, a unit is not a function, or a method, or a class, or a file.
A unit is a clearly separated, non-trivial software component with a minimal and stable interface to the rest of the software system.
A unit is good if it requires little to no mocking and only a very small number of messages to test. It should also do something non-trivial - think small library size, not class size.
The unit should probably have a README which explains the small interface and the small number of necessary dependencies. The unit tests should largely treat the unit as a black box and be based on its promises in the README.
If module A is very expensive to fix and has a high probability of failing in production, well of course you want to have it rock solid and should be thoroughly tested.
You could build a priority index with something like:
priority for testing module A = cost of fixing A x probability A fails
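For illustration, a minimal C# sketch of that index; the module names, costs, and probabilities are entirely made up, and in practice you would estimate them from incident history or expert judgment.

using System;
using System.Linq;

public static class TestPriorityDemo
{
    // Hypothetical module data: cost of fixing a failure (e.g. in hours) and
    // the estimated probability that the module fails in production.
    public record Module(string Name, double CostOfFixing, double FailureProbability);

    public static void Main()
    {
        var modules = new[]
        {
            new Module("PaymentProcessing", 40, 0.30),
            new Module("ReportFormatting", 5, 0.10),
            new Module("AuditLogging", 15, 0.05),
        };

        // priority = cost of fixing x probability of failing
        foreach (var m in modules.OrderByDescending(m => m.CostOfFixing * m.FailureProbability))
            Console.WriteLine($"{m.Name}: priority {m.CostOfFixing * m.FailureProbability:F2}");
    }
}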
The other problem is getting management on board and demonstrating an ROI that justifies the time spent building these tests and using them. I personally failed at that and am still trying to figure out how to get them to understand the benefits. I've progressed, but it's one heck of an uphill battle for me.
* a 'journal' - at the date this test was written, this is how we expect the system to behave
* an ELI5 - if I'm trying to use this method, why am I passing all these complex objects?
Unit tests declare expected behaviour, and should make the developer think about their methods.
For example, why pass complex objects just to print a string or a count or similar? And why pass those objects? Could a method be generalised to take a lambda, or an interface instead? Could the method be pure? And so on.
Unit testing isn't overrated, just a bit misunderstood.
> Unit tests, as evident by the name, revolve around the concept of a “unit”, which denotes a very small isolated part of a larger system. There is no formal definition of what a unit is or how small it should be, but it’s mostly accepted that it corresponds to an individual function of a module (or method of an object).
Heh I feel like this is the crux of the issue. Because there's no standardized definition of what a unit is, people sometimes tend to choose the wrong unit to test.
I personally believe it is usually wrong to test a single function or method in a class. I tend to test the behavior of a whole class at the same time. Testing each individual method is too white-box, and makes your testing code too coupled to the implementation. Basically, don't test internal details (unless the internals are very complicated), just test the externally visible behavior, which is usually presented as a whole class.
I also don't agree with always mocking out dependencies. If your "dependency" is just an instance of a different class, then just use the real deal. Sure, your definition of a unit now encompasses not just your code but your dependency's code, but if your dependency's code affects the externally visible behavior of your code, it remains your responsibility if your dependency changes things and breaks your code. That's what abstraction means: you present an interface, and your client doesn't need to care about how it's implemented or what dependencies your code needs.
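As a rough sketch of that approach in C# with xUnit (ShoppingCart and PriceCalculator are hypothetical names, not from the article): the test exercises only the externally visible behavior of the class, using its real dependency rather than a mock.

using Xunit;

// Hypothetical production code: PriceCalculator is a plain dependency of ShoppingCart.
public class PriceCalculator
{
    public decimal Total(decimal unitPrice, int quantity) => unitPrice * quantity;
}

public class ShoppingCart
{
    private readonly PriceCalculator _calculator = new PriceCalculator();
    private decimal _total;

    public void Add(decimal unitPrice, int quantity) =>
        _total += _calculator.Total(unitPrice, quantity);

    public decimal Total => _total;
}

public class ShoppingCartTests
{
    [Fact]
    public void Adding_items_accumulates_the_total()
    {
        // The "unit" here is the externally visible behaviour of ShoppingCart,
        // exercised together with its real PriceCalculator dependency - no mocks.
        var cart = new ShoppingCart();

        cart.Add(unitPrice: 10.00m, quantity: 2);
        cart.Add(unitPrice: 2.50m, quantity: 4);

        Assert.Equal(30.00m, cart.Total);
    }
}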
Tests are most useful when they provide fast access to a code path for repeated testing during development, when the code path would otherwise be far removed from the normal course of user interaction with the software. This closes the debugging loop while you're developing new features, when it's applicable.
The second time, less often, when tests are useful is when making large changes with broad implications, and quickly verifying that everything still works and that you haven't overlooked some subsystem.
The least frequently useful application of tests is regression tests. 99 out of 100 regressions only happen once. If you add a regression test, it won't happen again (unless you overlooked something), but it was unlikely to happen again anyway.
Generally I write the first category of tests when I feel that it would be useful in solving the problem I'm working on. Then, when I finish the code, I commit the test, because why not? It's written. It'll never fail again, but hey. This creates a reasonably manageable collection of tests which, more often than not, test the more complex (and therefore more fragile) parts of the codebase, and are a decent representative sample of all of the subsystems. This provides sufficient test coverage to support the second case. The third case is so rare that it can be addressed on a case-by-case basis.
The most stable software is software which doesn't change. Keep your scope small and your complexity low, and don't be afraid to mark a finish line. This is more effective than exhaustive automated testing.
> focusing on unit tests is, in most cases, a complete waste of time.
Someone is pretty inexperienced by making that claim. Unit tests help to isolate the expectations and verification to very small levels.
> While these changes may seem as an improvement to some, it’s important to point out that the interfaces we’ve defined serve no practical purpose other than making unit testing possible.
No it simplified the responsibility of the class. It also simplified your tests as well. Now you don't have to have tests that test many different scenarios.
> Note that although ILocationProvider exposes two different methods, from the contract perspective we have no way of knowing which one actually gets called.
Tests don't verify how you use or call other classes/methods. You can do verification if you want via mocks.
> Unit tests have a limited purpose
Yes, as does integration tests, functional tests, system tests. Etc. You shouldn't be trying to do unit level testing via an integration test.
> For example, does it make sense to unit test a method that calculates solar times using a long and complicated mathematical algorithm? Most likely, yes.
It does if you want to verify that the functionality is setting up the request correctly.
> Does it make sense to unit test a method that sends a request to a REST API to get geographical coordinates? Most likely, not.
That's not a unit test. That's an integration test (if you use a mock) or a functional test (if you want to hit a live endpoint).
> Unit tests lead to more complicated design
It highlights that the original design was complicated or that the methods had side effects (which you should avoid). In his own example he separated the resources from the functionality.
> Unit tests are expensive
No they're not. Mocks do not belong in a unit test. That's for integration tests. If you have them there you have issues. They're cheap to write and quick to run.
> Unit tests rely on implementation details
It's all about how you write your code; if you're trying to lump everything together, that's what you get.
> Unit tests don’t exercise user behavior
Correct, unit tests don't. Functional/feature or above do.
I stopped reading after this. It feels like the guy is just trying to argue that he doesn't like testing.
My own take on this is that it is hard to crystallise why or what a "unit" is. My cop out is it comes down to experience - once you've done this enough and solved enough problems and worked on enough teams, you'll get an intuition on this as it is more art than science.
I find it to be some mixture of importance and complexity, while always balancing against the single responsibility principle. A simple `average` function might be trivial but if it's important to your business logic you probably want to test that separately, even if it is nested within a "unit".
I find that following some iteration of "functional core imperative shell" helps here as it helps keep your core business logic to being data transformations and transformations on data are easy to test and their concerns are easy to reason about.
This then helps me reason about what is "implementation" and what is a contract/interface which should be tested robustly.
I guess really the art of writing just enough unit tests is to identify the seams and boundaries of your abstractions in your codebase, and potentially accepting that business seams in your codebase may be different from the seams of your actual software domain -- the latter being possibly more granular.
I only use unit tests as a temporary critic when developing brand new abstractions from scratch. With complex implementations, it is incredibly easy to forget one out of hundreds of constraints and lose track of important capabilities as you go.
That said, the moment the abstraction is in a clean state, consistently passing all tests, and well integrated with existing logic, the unit tests are deprecated as far as I am concerned. I won't ever explicitly delete them, but I recognize that the unit tests are potentially just as flawed as what they are testing (the same developer wrote them, after all), and coming back into that abstraction after 6+ months means I'd probably just have to rewrite the tests from scratch as a mental exercise and in order to restore my own sanity.
I can recall at least one occasion where I wasted 2 days chasing down a failing unit test only to find out the test itself was flawed - in the worst way, random pass/fail based on a race condition between multiple threads that were part of the testing code. I think that's the biggest danger with unit testing. Other developers assuming the tests you wrote are foolproof and sending them on pointless errands.
Depends on what kind of software you are building. If you are a 5-man team trying to move very fast and deliver new features, I think rigorous unit testing can take enough time that it won't be worth the effort. But if you are working on something with a couple dozen other developers, where no one person might be able to fit the whole system in their head, I think unit testing is invaluable. A few days ago I saw another HN post that discounted testing as well, stating that you already need to know what to test in order to verify the behavior, but that is not even the purpose of tests in large projects maintained by large teams. Tests serve two purposes in such projects: they act as internal documentation for contributors that is far easier to keep up to date, because it complains loudly when it fails, and they detect breakage that would otherwise go completely unnoticed and end up shipping. This is invaluable for large projects and for software that is shipped to be installed, like libraries and installable binaries. If you are writing a service that you deploy in your own environment to serve some API requests then sure, too many tests will have diminishing returns.
I initially reacted very negatively to this article but decided to entertain the idea anyway and I have to say I'm convinced of the idea that they are overrated.
Now I'm not saying that they are unnecessary, and I definitely believe they are needed, but I do think they are overrated.
I've worked on many buggy systems that had very good unit test coverage. It was only with sufficient integration testing that we were able to prevent constant regressions.
Of course if your integration tests are lacking you'll have problems but this does not tell you anything about the value of unit tests, just that integration tests are obviously valuable.
Unit tests have their place but in my experience there are also a lot of places in most codebases where they don't deliver enough value compared to the effort put into both writing and maintaining them (they can be a huge PITA while refactoring). I agree that in many cases integration tests often are so much more useful in catching regressions.
> The primary goal is not to write tests which are as isolated as possible, but rather to gain confidence that the code works according to its functional requirements.
A very long article that leads to a trivial conclusion: unit testing is not a substitute for higher-level testing, and it carries overhead effort.
It seems to me the author is addressing cases where unit test coverage is blindly used as a dev project metric. PMs are often not familiar with code internals, but they need some assurance that the project is on track, so they try to collect insights from whatever output the automated tools provide.
Unit tests can have 100% coverage but still test the wrong thing. A common misuse is testing to the actual code, instead of testing the expected behavior.
Unit tests are a developer's tool, not a PM's metric. On the other hand, acceptance tests should be the common ground, which may be the integration tests or a whole separate suite altogether, tailored to user requirements.
When coding a function, one needs to test preconditions and assumptions prior to following the main logic. Unit tests serve the very same purpose by enforcing assumptions at the unit level, at the granularity that makes sense. No one needs to test trivial getters and setters, but one does need to ensure that objects remain in valid/known states. Thus unit tests untie the developer's hands to recraft the "unit" without fear that the pearls of correctness get lost.
Paradoxically, unit testing is a productivity tool, just as diagramming, or ... sketching prior to coding. Some developers can maintain a perfect mental picture of their code without any need for such tools. Some devs see design and implementations right away. If I were to inherit their codebase, I'd rather see their assumptions validated, so I don't need to do the coverage by eyeballing the code.
I believe there's only one kind of testing: black box testing. And that is broken into two types: integration and unit.
Tests that validate the interface of the big black box and assert that it does the right thing (integration tests), i.e. assert the behavior customers depend on.
That black box is made up of many smaller black boxes that talk to each other. So you test the boundaries of each of those smaller black boxes. (Unit tests)
The ROI for testing the big black box is much higher. But when something fails it’s hard to know what exactly caused it and how to fix it. If you have a decent number of tests for the smaller black boxes then you know what box needs to be fixed.
But all good tests are black box tests, i.e. they don't test the internals but the interface of using that box.
A well designed system is made up of small boxes that do one thing very well and work with other boxes. Once you’ve tested them you can forget about them and work on higher abstractions. They will continue doing what they promised.
Thorough unit tests are required for code that is open to unforeseeable new uses. The existing application may not provide enough coverage. Unit tests can exercise all the requirements of a module that are not yet being used now so that in the future, someone can do new things with that module without having to fix it first (thereby risking the existing application).
The argument that it's more productive to test modules through the application only applies to private, dedicated modules that will only ever be invoked in support of the use cases arising in the application.
Basically, it boils down to whether or not the piece of code is an independent product (where "product" could refer to something with "customers" internal to the organization).
In engineering, all building blocks that are separate products to be integrated into other products are rigorously tested on their own, whether they are integrated circuits, or steel cables or whatever.
I like assertions. They're a really good alternative to unit tests because they can be used in a real environment without the need for maintenance-intensive mocking etc. But assertions have a significant performance cost, especially when they involve consistency checks on large datastructures.
So, a pattern that I've found useful is heavyweight asserts that can be enabled / disabled through external means, in conjunction with high-coverage integration testing. This is really easy in some languages (c/c++, for example, using the preprocessor) and can be more fragile in languages like python (hence the plug above -- which isn't the best way, but is a way, to achieve this 'best of both worlds' testing).
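A minimal C# sketch of that pattern, assuming a hypothetical OrderBook class: the expensive consistency check is compiled in or out via an externally defined symbol, roughly the analogue of the C/C++ preprocessor approach described above.

using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical example: an order book whose internal consistency check is too
// expensive to run unconditionally in production.
public class OrderBook
{
    private readonly List<decimal> _bids = new List<decimal>();

    public void AddBid(decimal price)
    {
        _bids.Add(price);
        _bids.Sort((a, b) => b.CompareTo(a)); // keep bids sorted descending
        CheckInvariants();
    }

    // Calls to this method are removed by the compiler unless the HEAVY_CHECKS
    // symbol is defined externally (e.g. a DefineConstants entry in the test or
    // debug build), mirroring the enable/disable-through-external-means idea.
    [Conditional("HEAVY_CHECKS")]
    private void CheckInvariants()
    {
        for (var i = 1; i < _bids.Count; i++)
            if (_bids[i - 1] < _bids[i])
                throw new InvalidOperationException("bids must stay sorted in descending order");
    }
}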
The whole point of testing is to make sure you aren't breaking something when you add a feature, refactor, or delete old code. Its purpose is to speed up development. Excessive unit testing just creates a brittle test suite and adds more work without much benefit. It slows you down.
As a Rails dude focused on startups, iterating rapidly and what the user sees is what you care about. Therefore, I focus on integration tests that run the whole stack. That lets me mess with the implementation code without re-writing the test suite. At the same time, it provides regression protection and a good place to start troubleshooting. Plus, Rails already has tests for the "plumbing".
Tests should serve the developer, and speed up the iterative process, not add work to the project b/c of a dogmatic adherence to TDD or unit test all the things design.
Seeing Java examples after this does not surprise me. Testing is hard in Java and it forces you to change your design. Maybe better put - testing stateful objects is hard
I do a lot of dev in Ruby and testing there is super easy and powerful. Say what you will about monkey patching, open classes, and reflection, but it does make it very easy to write great testing libraries. I'd argue that testing libraries in Ruby lead to better design (e.g. using more methods, making things single purpose, thinking about the interface first can lead to better naming etc)
That said, I don't test as much as I used to. Unit testing is really good for testing algorithms (e.g. transform this complex JSON; rebalance load). Outside of that, some light end-to-end testing will catch most other things, and you can use staging and gradual rollouts to derisk bigger changes
Software is not all the same. Some people are writing Auto-Pilot software, others are writing some useful tool for a few people to speed up some data cleaning for a few months. There are no rules that you must do this for software when we're all doing different things.
I see unit tests as executable documentation. So it's nice when code you're getting familiar with has good tests. That said, I don't write them myself, except when the code is complicated and difficult to predict, but then I'm more likely to refactor it.
How much wasted effort is expended by testing policies? Consider, for instance, managing development by setting an arbitrary target % coverage for unit tests. As the team refactors code, unit tests will break, and the test coverage requirement will compel someone to revisit old tests and refactor them. Refactor more code and the same tests are breaking again! Someone must refactor the tests yet again to reflect the latest changes. Test coverage policies are very costly.
On the other hand, unit testing does present some benefits. Consider a function that cannot be easily unit tested. This tells you that it's probably too complex and would benefit by refactoring into more manageable parts. Also, business logic unit tests are beneficial.
More like unit testing in some languages is a freaking chore because it involves complex unit test frameworks.
Unit testing should be as easy as writing a few lines of code, just to ensure that an interface behaves as expected, but that's often not the case because many languages make it incredibly difficult, for various reasons, or because you end up writing walls of code (thus behaviour, ironically) with mocks, stubs, fakes and whatnot, just to test a single method...
"Testing experience" should absolutely be a core concern for any modern language. More generally the notion of "developer experience" should be a core design aspect in any new language and not be delegated to third party tools.
What is the language with the best testing experience?
Most systems in production already have key use cases that stand in for unit and system testing. Find those and use them after every new feature or change to the system.
I support an existing reporting system with several hundred tables and thousands of interdependent stored procedures. This system has a fairly often-used "global report" which touches nearly 90% of the tables. We run this user report after all changes to the database with a few swaps in parameters (history vs current etc). If this report and a few others "pass" (results are the same as previous reports), we know that the changes are safe. When the reports show differences, we confirm that the changes were intended. If not, something broke.
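A rough sketch of that "compare against the previous accepted run" check in C#; the paths and the runGlobalReport delegate are placeholders for whatever actually produces the report text in the system described above.

using System;
using System.IO;

// Sketch of a golden-report regression check. Names and paths are illustrative;
// runGlobalReport stands in for whatever executes the report and returns its output.
public static class ReportRegressionCheck
{
    public static bool ReportUnchanged(string parameters, Func<string, string> runGlobalReport)
    {
        var baselinePath = $"baselines/global-report-{parameters}.txt";
        var current = runGlobalReport(parameters);

        if (!File.Exists(baselinePath))
        {
            // First run: record the output as the accepted baseline.
            Directory.CreateDirectory("baselines");
            File.WriteAllText(baselinePath, current);
            return true;
        }

        // "Pass" means the report matches the previously accepted run exactly.
        return File.ReadAllText(baselinePath) == current;
    }
}

In practice you would run this once per parameter set (history vs current, etc.) after each change, and review any diff before accepting a new baseline.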
Giggle. How things evolve. "Unit Testing" came out of the Smalltalk work around 1997. I was there.
It was Kent Beck's brainchild. The first xUnit-style framework was SUnit.
What was a unit? Kent was never absolute about this; I always felt it was because, as a consultant, he wanted to peddle the theory far and wide. But the early examples all had a pretty strong trend towards "a unit is an object." Nowadays "what is an object" is pretty loose; I just finished some Dart tutorials where an object is a way of "organizing our code into smaller reusable pieces", which ironically I did in Fortran77 with common blocks and well factored files. But in the Smalltalk world, which was "objects all the way down", an object was a small amount of imperative behavior bound to data, where computational results were achieved via an approximation of the way cellular biology solves problems: lots of little clumps of data that achieve a larger result by sending messages to each other.
This process of turning behavior and algorithms into things, or reification, was sometimes easy and sometimes hard. An object for Point, obvious. An object for SortCollationPolicy, less so.
What Unit Tests did was help programmers design good objects. Beck said this in Extreme Programming Explained. He said that traditional QA departments would laugh themselves silly at what unit tests did, but that the value was that they drove good design, and that in a collaborative (pair programming) environment they helped communicate the design intent around objects to fellow developers. I did the Smalltalk koolaid fest for 20 years. I found Unit Testing to be immensely effective. It made my designs more cellular again and again. When my designs were solid, I had fewer bugs.
As a mechanical engineer, I still see similarities between unit tests and geometric dimensioning and tolerancing, a practice in the mechanical world that also swam against the current of conventional testing practices and left some shaking their head.
In todays world where OO design is more of a "small unit of organization" I'm not surprised that unit testing also seems meh.
These kinds of articles criticising unit testing always seem to come down to the way people go about composing their software and its design. The interface explosion and unnecessary abstractions presented in the first part are the nightmare people get into early in their software development journey. But that's a problem with design. Possibly unit testing led you down that path, or perhaps that path was shown to you by some advocate of unit testing. In the article even the final code seems too messy for my taste; it could be simpler and more modular, which would then make the testing more straightforward, and you'd end up with pretty much the same thing.
I've been saying this for years and suffice it to say it has been a career-limiting move for me. If any new developers are reading this be warned that these are considered dangerous ideas. Know your audience before sharing controversial opinions.
Unit testing reveals latent defects, defects that can't be triggered by the system as it currently exists. Each such defect is a problem waiting to pop out in the future, as the system changes and the latent defects become exposed.
I can imagine you can argue in favour of any statement, given your custom, ad-hoc, anecdotal use case.
However, in general terms, unit testing (together with a very clear domain design) are the most solid pillars of your software.
I use unit tests for complex units of code and depend on frameworks to do simple things correctly. Like I don't need to write unit tests for inserting things into a database. Unless there are complications around it.
> Does it make sense to unit test a method that sends a request to a REST API to get geographical coordinates? Most likely, not.
Isn't a "unit test" which sends a request to a REST API almost by definition an "integration test", or even a "live server smoke/staging test", since it is testing the integration between separate live systems (the requesting code and the server).
If anything, what should the unit tested if anything is the code that generates the request, which should be separate from the code that sends the request to the server. But only if that code is non-trivial.
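A small sketch of that split in C# with xUnit (GeocodeRequestBuilder is a hypothetical name): the request-building logic is a pure function that can be unit tested with plain strings, while actually sending the request stays in the plumbing layer and is covered by integration tests.

using System;
using Xunit;

// Hypothetical split: building the request is pure and cheap to unit test,
// while sending it is plumbing better covered by an integration test.
public static class GeocodeRequestBuilder
{
    public static string Build(string baseUrl, string query) =>
        $"{baseUrl}/geocode?q={Uri.EscapeDataString(query)}";
}

public class GeocodeRequestBuilderTests
{
    [Fact]
    public void Escapes_the_query_string()
    {
        var url = GeocodeRequestBuilder.Build("https://api.example.com", "Kyiv, Ukraine");

        Assert.Equal("https://api.example.com/geocode?q=Kyiv%2C%20Ukraine", url);
    }
}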
One argument made for unit testing which this post doesn't address is that it's easier to understand and debug unit test failures.
To which I say: yes it is, but quality debugging tools (which don't exist in many domains, and aren't used nearly as much as they could be where they do exist) can mitigate this issue. I'm talking about tools like https://rr-project.org and (self-promoting!) https://pernos.co.
I’m a software test engineer and spend quite a bit of time performing code reviews. One thing I always ask for is proof that the code works - this requires a combination of unit, integration, and e2e tests.
Unit tests are great to show that some code works in isolation, and then a few integration tests can cover the functionality. You should at a minimum cover each path through a function with boundary cases, which would be far too much effort to do via integration tests.
"however many find it encumbering and superficial."
Author doesn't want to write unit tests. There are two factions of people on the internet, those that write unit tests and those that don't like to. Nothing has changed in the last 15 years.
Whenever I write many unit tests I feel happier and more confident. If I skip them and catch up on writing unit tests later, I always find bugs in my code. I have never failed to find bugs when increasing test coverage from a low start.
Excellent, thought-provoking piece that comes at a good time for me. I've been looking at Python's unittest for a specific system I've been developing over the course of 12 years at work. It became clear recently that we need more automated tests. However, as I started to design a test suite in unittest, the scope of it became overwhelming. Looking at it as a suite of functional tests makes it more reasonable to write.
What's missing from the title of these articles is "to me" or "in the context of my project". The approach to testing a CRUD/line of business app is tested is different than the approach for testing something like EC2, which is also different than how a library/package is tested. Without context, it's really easy to argue exceptions or counter-examples to any opinion.
It's worth noting that what the author describes as functional testing was, in fact, what unit testing was originally intended to be. "Unit" referred to a unit of functionality, not a unit of code. But that was an ambiguous concept, and it was simpler and easier to just write tests for each unit of code (typically a method), and so that's the approach that has prevailed.
Unit tests are a tool to solve a problem. If you don't have the problem, don't use the tool.
If you can do major refactorings without impact on productivity, don't have a QA department doing manual tests, junior developers can deploy to production with confidence on their first day, and customers are happy about the quality of your product, don't solve a problem that isn't there.
One thing I've learned, is that unit testing seems much much more painful than it has to be when you are using too many classes.
When you are writing code as pure functions (i.e. stateless), it's actually much less painful. In the provided example, I would never write a class to curl a website and parse json.
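For example, something along these lines (the JSON shape and all names here are hypothetical): the parsing is a pure, stateless function that can be unit tested with a string literal, and whatever fetches the JSON stays separate.

using System.Text.Json;
using Xunit;

// Sketch of keeping the fetch (plumbing) separate from the parse (pure).
public record Coordinates(double Lat, double Lon);

public static class GeoParsing
{
    // Pure and stateless: easy to unit test with a string literal, no HTTP involved.
    public static Coordinates ParseCoordinates(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;
        return new Coordinates(
            root.GetProperty("lat").GetDouble(),
            root.GetProperty("lon").GetDouble());
    }
}

public class GeoParsingTests
{
    [Fact]
    public void Parses_latitude_and_longitude()
    {
        var result = GeoParsing.ParseCoordinates("{\"lat\": 50.45, \"lon\": 30.52}");

        Assert.Equal(50.45, result.Lat);
        Assert.Equal(30.52, result.Lon);
    }
}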
I used to think that unit tests are a waste of time. But they find tons of bugs in my code so I simply can't ignore the reality by posting some random untestable code and claim that it proves that tests are (most of the time - let's cover our bases folks!) a waste of time.
Unit tests are very important and very cheap. They save a huge amount of grief later on.
As with everything, the pros and cons should be weighed to reach a practical and effective approach. For example, there is rarely a need to test every single function.
> For example there is rarely a need to test every single function.
And that's where the bugs end up being.
People suck at writing code so it needs to be reviewed by peers and thoroughly tested. If you said that in an interview for code you'd written then I'd point to the door.
So you would write test cases for all the methods and constructors for a class like this:
class Foo {
    public Foo(initialValue) {
        this.value = initialValue;
    }
    public setValue(value) {
        this.value = value;
    }
    public getValue() {
        return this.value;
    }
}
What's the point, what are you testing? That the language's most basic operations still work?
I'm serious. Of course you don't test that language operations work; you're testing that a given method does what it's supposed to do. In this case your method sets the value of a property on the model. It doesn't matter that you're doing it via assignment - you could be doing it any other way. You want to test that, for a given model, calling that method changes the model's property to the given value. This way, if you change the implementation of setValue your test will still succeed. If it starts doing something else, the test will fail. And of course, this method can be used in your feature tests, so those will start failing too (but that's beside the point, I guess).
Of course it's also a balancing act - should you immediately write a test for this? I try to.
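For what it's worth, roughly what that looks like as a C# rendition of the Foo class above, with an xUnit test: the test pins down the promise of setValue rather than the fact that it happens to be an assignment.

using Xunit;

// A C# rendition of the Foo class above.
public class Foo
{
    private int _value;

    public Foo(int initialValue) => _value = initialValue;

    public void SetValue(int value) => _value = value;

    public int GetValue() => _value;
}

public class FooTests
{
    [Fact]
    public void SetValue_changes_the_value_returned_by_GetValue()
    {
        var foo = new Foo(1);

        foo.SetValue(42);

        // Stays green under any reimplementation of SetValue that keeps the promise.
        Assert.Equal(42, foo.GetValue());
    }
}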
It's always a question of balance and return of investment. It's good to aim for 100% coverage, but it's a diminishing return exercise and it may not be particularly valuable to reach it, so in practice it usually remains an aim.
> If you said that in an interview for code you'd written then I'd point to the door.
That's an extremely arrogant thing to say. Any experienced dev will understand my point, so...
Most programmers are arrogant and that's why they write terrible code full of bugs. I posted elsewhere that if your code isn't tested then it's broken. It was sitting at negative four when I last looked. That comment should be in the hundreds. If folk don't understand that then they need to look into their software development practices.
You aim to maximize code-path coverage with the minimum number of test cases. Test cases written correctly can be quickly reused with different input values. Design to test. Prove your code through automated unit testing. Skipping proper test development leads to terrible tech debt when you try to maintain the code. Error paths are generally the place that gets the least coverage, and those are where a lot of programs crash. This stuff is elementary.
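As a sketch of reusing one test across inputs, here is a parameterised xUnit test for a hypothetical shipping-cost rule, covering normal cases, boundaries, and the error path; the rule itself is made up for illustration.

using System;
using Xunit;

public static class Shipping
{
    // Hypothetical rule: free shipping from 50 upwards, otherwise a flat 5.
    public static int Cost(int orderTotal) =>
        orderTotal < 0 ? throw new ArgumentOutOfRangeException(nameof(orderTotal))
        : orderTotal >= 50 ? 0
        : 5;
}

public class ShippingTests
{
    [Theory]
    [InlineData(0, 5)]    // boundary: empty order still pays shipping
    [InlineData(49, 5)]   // just below the free-shipping threshold
    [InlineData(50, 0)]   // at the threshold
    [InlineData(200, 0)]  // well above it
    public void Cost_matches_the_shipping_rules(int orderTotal, int expected)
    {
        Assert.Equal(expected, Shipping.Cost(orderTotal));
    }

    [Fact]
    public void Negative_totals_are_rejected()
    {
        // Error path: the part that usually gets the least coverage.
        Assert.Throws<ArgumentOutOfRangeException>(() => Shipping.Cost(-1));
    }
}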
They are cheap per unit, but if you test a lot of units, it adds up.
There is a set of bugs that only higher-level testing will catch. There is another set of bugs that both higher and unit-level testing will catch. Then there is the set that only unit-level testing will catch.
How important is that last set? If a unit test fails and there is no user interaction to trigger that failure, is it really a bug?
Unit tests can be valuable if they help you during development of a unit, but most units are not that complex and should not require unit tests.
Cost includes the impact on further tests and maintenance.
The cost of testing and debugging increases as you go down the development/release cycle.
If you have unit tests then that helps integration tests (less issues, easier to investigate), system tests, etc. all the way down to dealing with bug reports from the field.
> but most units are not that complex and should not require unit tests.
> If you have unit tests then that helps integration tests (less issues, easier to investigate), system tests, etc. all the way down to dealing with bug reports from the field.
Not in my experience. I don't see how unit tests help me with bugs that show up in the field. If a bug shows up in the field, something should've caught it, which means testing failed.
I believe unit tests have a fundamental flaw: you're trying to solve a code problem by using code. Yes it can work out in the end. It also blows like code blows: waste of time, over-engineering, legacy code, too-big codebases, bugs, etc.
I think Unit Testing is good for when you have big teams and changes that can go across knowledge boundaries... so the tests can keep people in check from making changes that mess up other parts of the codebase inadvertently.
That’s all well and good until you need to upgrade some massive project to a new JDK or your Rails app from one major version to another. Then you will be wishing you had significant coverage.
I think the value of unit tests is to test the logic inside a software abstraction instead of relying on external dependencies, but I believe many unit tests are written just for the sake of unit testing.
Most of the problems defined in the article tend to go away with an interpreted language such as JS. In JS unit testing is super cheap and super easy, with very common patterns.
I read most of the article and I am not convinced. I'll stick with the unit tests. In my experience they improve code quality, maintainability and reduce bugs.
Wow, I keep reading these "testing is overrated" articles (second one this week), and I don't see how I would survive maintaining a 10K LOC Rails application without unit testing my models, services, finders, etc. I guess unit testing is overrated; long live unit testing!
Unit tests are the equivalent of a REPL for languages that weren't designed around one. You need unit tests to close the feedback loop; you can't simply write code for days without having any feedback, and some code simply does not have obvious, immediately visible feedback in the UI.
I'd suggest that someone who thinks that UTs are over-rated is over-rating their own ability. I'd also suggest that those are exactly the people whose code should get extra testing.
First of all, I also do not condone the idea that developers shouldn't worry about other tests than unit tests. I think the developers should be responsible for producing working code, end to end, and having automation and QA is just a nice bonus, additional verification.
Anyway, regarding unit testing: a lot of the objections to unit testing become clearer once we start talking about it in a formal way, ideally in functional terms.
Let's have a function y = f(x) that we want to test. In unit testing, we generate some examples (x1,y1), ... that we run this function on and compare the output.
If we have two functions, f and g, where z = g(f(x)), even if we unit tested each of them separately, we still can fail because what we didn't test was if the domain of g is indeed a subset of range of f. In fact, that cannot be unit tested, since unit tests only verify logic, not the domain and range of the functions.
That's the first objection to unit tests: there are holes in the integration. This is especially insidious because if two different people wrote the two functions, they can each have different assumptions about the domain and range, yet they won't detect it by unit testing, because they both wrote tests that only operate under their respective assumptions. (BTW, this also shows that code coverage metrics are meaningless, unless you can cover all the code executed down to the libraries, because you can always leak coverage through the data type and vice versa.)
Second objection to unit tests is how do you generate the test cases, in particular, the output? Well, the unit to be tested has to be reasonably small. This means more work (more mocking) and more holes in the integration assumptions, as above.
Personally, I believe property based testing is superior in all respects and should replace unit testing. Property based testing forces you to write down assumptions that you have written your code with, and it scales better because you decouple generation of test cases from the assumptions themselves.
So formally in property based testing, we would create a generator of input cases for x, and also a property - a function that verifies what the y looks like (possibly even given x). In fact, this approach completely subsumes unit testing, because for each of the test cases (x1,y1),.. as above we can just write a property that checks if the output of f is y1 given the x1, and so on.
However, property based testing is stronger. We could also produce a property that would state that input to g has to be in its domain. Then we could easily detect the above problem with composing f and g, just by verifying the properties for our generated test cases. So it resolves the first objection.
That's the beauty of the properties, they really test the data, not the logic. I believe that what we really need to test as developers is the assumptions that we put on the data structures we work with, rather than logic that the functions do. If you want to verify the logic, read the code you have written again.
The second objection to unit testing is also somewhat resolved, because we don't have to produce complete test output, we only verify some of its properties. Also, separating generation of inputs from properties lets you naturally reuse outputs from other parts of the program for testing. You only need to get the input generation right; the other things will sort of test themselves. So for instance to test g, we don't need to create an extra input generator, we can just live with the outputs of f. In essence, property based testing just makes tests themselves compose better.
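To make this concrete, a hand-rolled C# sketch of the idea (a library such as FsCheck does this far more thoroughly); F and G are hypothetical stand-ins for the composed functions discussed above, and the properties check both g's domain assumption and a simple invariant on its output.

using System;
using System.Linq;

public static class PropertyCheckDemo
{
    // f: normalises an angle into [0, 360)
    static double F(double x) => ((x % 360) + 360) % 360;

    // g: only defined for values in [0, 360) - its "domain assumption"
    static double G(double y)
    {
        if (y < 0 || y >= 360) throw new ArgumentOutOfRangeException(nameof(y));
        return Math.Sin(y * Math.PI / 180);
    }

    public static void Main()
    {
        var rng = new Random(0);

        // Generator: arbitrary doubles, including negatives and large magnitudes.
        var inputs = Enumerable.Range(0, 1000).Select(_ => (rng.NextDouble() - 0.5) * 1e6);

        foreach (var x in inputs)
        {
            var y = F(x);

            // Property 1: the range of f satisfies the domain assumption of g.
            if (y < 0 || y >= 360)
                throw new Exception($"f violated g's domain for x={x}: y={y}");

            // Property 2: g returns a sine value, so it must stay within [-1, 1].
            var z = G(y);
            if (z < -1 || z > 1)
                throw new Exception($"g produced an out-of-range result for y={y}: z={z}");
        }

        Console.WriteLine("1000 generated cases satisfied both properties.");
    }
}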
It always bothers me how little bang I get for the buck when writing unit tests. I need to come up with all these test examples, and they usually only cover a small piece of code. What often happens is that I actually know (in my head) the testing generator and the properties; it's just that instead of writing them up so that the computer would understand them as well, I write a few examples. It feels totally wrong.
Ideally, I would love to see a framework that would let some of the verification properties (from property based testing) live in the code as additional runtime assertions. I think that would be a much more practical approach than having a lot of unit tests, and it would tie nicely with defensive programming.
Well thought-out article with good points and deductions! I have a few counterpoints and I think the example is not the best way to argue against unit testing as there are better ways to implement the same features and write better unit tests without bloating the code.
Looking at the SolarCalculator class, I would go another way first and refactor it like so:
public class SolarCalculator
{
    public static SolarTimes GetSolarTimes(Location location, DateTimeOffset date) { /* ... */ }
}
1. Made the method static (make it a free function in other languages)
2. Take an explicit Location parameter
3. Return a SolarTimes object directly, not async, not a Task<SolarTimes>, and remove Async from the name
4. Drop the now unnecessary class LocationProvider member
This becomes more easily unit testable without any excess Arrange steps at the beginning.
public class SolarCalculatorTests
{
    [Fact]
    public void GetSolarTimes_ForKyiv_ReturnsCorrectSolarTimes()
    {
        // Arrange
        var location = new Location(50.45, 30.52);
        var date = new DateTimeOffset(2019, 11, 04, 00, 00, 00, TimeSpan.FromHours(+2));
        var expectedSolarTimes = new SolarTimes(
            new TimeSpan(06, 55, 00),
            new TimeSpan(16, 29, 00)
        );

        // Act (the method is now static, so it is called on the class, not an instance)
        var solarTimes = SolarCalculator.GetSolarTimes(location, date);

        // Assert
        solarTimes.Should().BeEquivalentTo(expectedSolarTimes);
    }
}
The GetSolarTimes function is now purely computational and has no plumbing at all (stealing terms from chimprich). I think the original author would also agree that unit testing SolarTimes GetSolarTimes(Location location, DateTimeOffset date) has none of the problems that unit testing async Task<SolarTimes> GetSolarTimesAsync(DateTimeOffset date) had.
(Added benefit: the interface is more flexible, it can be reused in more use cases without modification but that is not the point.)
I find that such a refactoring often solves the problem entirely. It might seem like we just swept the problem under the rug and forced the calling code to take on the complexity (and the tests, interfaces, mocks etc.) that we discarded, but in practice this is often not the case.
// Uses original async implicit-location interface
// (assumes an existing solarCalculator instance)
var solarTimes = await solarCalculator.GetSolarTimesAsync(date);
// Uses proposed non-async explicit-location interface
// (assumes an existing locationProvider instance)
var solarTimes = SolarCalculator.GetSolarTimes(await locationProvider.GetLocationAsync(), date);
The reason we can often get away with this in practice is that the complexity increase in the caller is small. We did not add additional state to the caller, and we did not push more testing/mocking complexity to the caller. The assumed locationProvider instance in the caller replaces the solarCalculator instance in the caller. If testing/mocking locationProvider is required, then solarCalculator would have needed testing/mocking too. We require the caller to test/mock something else, not something new.
If the original async Task<SolarTimes> GetSolarTimesAsync(DateTimeOffset date) interface is required nonetheless, it can be implemented as a pure "plumbing" function. As such I would agree that unit testing it would provide less value than integration testing. A simple pattern that can be applied here instead of an ILocationProvider interface and all the baggage that comes with it is using a Func<Task<Location>> or lambda instead. This allows both testing the instance with custom location providers and unhindered usage of the SolarCalculator class without always needing to inject a dependency.
public class SolarCalculator
{
    private readonly Func<Task<Location>> _locationProvider;

    // default constructor for normal usage
    public SolarCalculator() {
        var internalRealLocationProvider = new LocationProvider();
        _locationProvider = () => internalRealLocationProvider.GetLocationAsync();
    }

    // constructor for custom locations and testing
    public SolarCalculator(Func<Task<Location>> locationProvider) {
        _locationProvider = locationProvider;
    }

    // Gets solar times for the current location and the specified date
    public async Task<SolarTimes> GetSolarTimesAsync(DateTimeOffset date) {
        return GetSolarTimes(await _locationProvider(), date);
    }

    public static SolarTimes GetSolarTimes(Location location, DateTimeOffset date) { /* ... */ }
}
(Sorry, I haven't implemented the IDisposable pattern for internalRealLocationProvider, and I might have misplaced an async keyword or two because C# is not my most recent language.)
To support the "pyramid-driven" paradigm I argue that the most complicated part of this feature is the solar time calculation and it would well deserve a large test suite containing many test cases like GetSolarTimes_ForKyiv_ReturnsCorrectSolarTimes above (edge cases, diverse locations, etc.). Conversely, the higher-level functions don't need this level of testing. Since they contain no complex logic, I usually assume that if they work for one input, they will work for any other. Testing the async, automatic-location version with the simple Kyiv-based input is enough, there is no need to test it with midnight sun and all the same edge cases as the base function.
The point I'm making is that unit testing is not as overrated as the original example suggests. The code can be modularized better (not by making everything an interface), with a well-unit-testable "computational" part and a part which is mostly "plumbing". I agree with those saying that the second part benefits more from integration testing than unit testing, and I can agree with keeping them "as highly integrated as possible, while keeping their speed and complexity reasonable". But I insist that unit testing the first part, and writing it in a way that makes it unit testable, is important.
I totally agree with your refactoring. The code you wrote is simpler, less surprising, reusable, easily testable, and so on.
Unfortunately I think the article only shows that the OP made poor design choices (which he probably wouldn't have made if he had used TDD, ironically).
Even if the article is well written, the code shown in the three first blocks kind of invalidate the whole argumentation :/
I think what a lot of people tend to miss with this discussion is that TDD is Test-Driven development. It doesn't have to be unit tests. The point is that you think first about what a valid specification is and you write a test for it.
There's an anecdote about Dijkstra that I'll paraphrase:
Dijkstra was working on a problem where two programs running on a shared memory computer were not allowed to enter their critical sections at the same time. He tasked his graduate students with finding an algorithm that would guarantee this. A student would provide a program. Dijkstra would review it and find that it contained errors. The student would take the feedback and produce a longer, more complicated program.
Tired and unable to find more time to review the increasingly complicated programs, Dijkstra tasked his students with submitting, along with their program, a proof of its correctness. Dijkstra then needed only to verify that he understood the proof and that the program implemented it faithfully.
Shifting the burden of proof from the reviewer to the author made Dijkstra's work much easier and the programs more robust.
I'm not suggesting we need to start writing proofs. We're practical industry programmers who aren't working with such high-assurance software most of the time. However unit tests, weak as they are, are at least a form of proof. Proof by example. For trivial code where a few examples would suffice to convince you of its correctness I would say they are quite useful.
Overrated though? I don't think so. They're also useful as a design tool. The OP gives an example of testing business logic that makes a bunch of HTTP API calls. The author claims unit tests are of little value here.
Well how would you test that?
Me, I would defunctionalize the calls that execute the HTTP requests. I'd provide an interpreter that the user can run their program in. The production code could use an algebra that makes the HTTP requests. The test code could use an algebra that stores the requests made and returns canned responses.
It might seem like "extra code," but it gives us the ability to decouple our logic from how it's executed. Not only can we test this without having to mock our HTTP library (which is itself quite well tested), but this decoupling opens new avenues for managing our program. We can imagine affixing some logging actions to our algebras. We could write a development version of our interpreter algebra that logs out everything, and a production version which masks secrets and sensitive information from the logs.
I've met programmers who can "just write the code," and they manage well for themselves. However I've also worked with such programmers who don't. The former are quite rare. And working with either group is difficult to say the least. If I am reviewing a piece of code I need to check whether your thinking is sound: did you consider the essential properties, edge cases, and did you spend time proving that you've thought about them? I don't want to read 600 lines of code and try to understand it... it's too much. But a proof, even an incomplete hand-waving one, I can understand.
That being said... sorry for the wall of text. Unit tests aren't the be-all-and-end-all of testing. They're a beginning. And they can be a useful tool when you're starting out. Look to property based tests. Think about the tests before you write the code. And run the tests frequently and often.
This is comically bad. Way too long and myopic but the points aren’t even useful points.
> “ unit tests are only useful to verify pure business logic inside of a given function.”
Yep, that’s one of the most important things you need to verify. You should also verify integration test success, and then the unit test allows you to immediately observe where is the failure: is it pure business logic failure when all external factors were mocked? Or is it integration failure? Or is your dependency flaky & untrustworthy?
Without unit tests you can’t (a) develop business logic in isolation from the external resources it will integrate with or (b) easily isolate what is a business logic error vs what is an integration error.
It’s just so dumb to use language like saying unit tests “are only useful” for this. That’s a hugely valuable thing to be useful for!
> “ no practical purpose other than making unit testing possible.“
This is incredibly bad circular reasoning. You must first already agree that unit tests confer no value, only then is this considered a point by the author. But if unit tests do add value (and they really do) then refactoring to facilitate unit tests also adds value!
The point about testing a hidden implementation is mixed. On one hand the example used here is just a bad example. On the other hand, testing a hidden implementation can be a very good thing because it assists the act of development in the first place. The test helps the person writing the business logic by factoring their intent into a test that serves as a proof of correctness. Maybe it's debatable whether such tests should be removed like scaffolding when they are done, given that they cover a hidden implementation, but that's really a local decision for a team to make. Sometimes it's good to leave those tests of hidden implementations in place because they add extra protection against changes that can have unintended consequences.
* test your core, make sure your core is strong
* don't test your http api
* don't mock
* don't test writing and reading from the database
* don't complicate your code to make it testable
* ... unless you deem fit
Not the OP but I count integration testing as testing integrations /between/ systems - wherever your code depends on something outside your codebase. You can mock this for unit testing.
Don't mock things that are already in your codebase. Use the actual object.
A common argument against this I've heard is "But then when something goes wrong it is harder to figure out where the problem is". I have never actually experienced that myself, but I have, very often, experienced being reluctant to do any refactoring because I'd have to rip up all the unit tests, since they test only implementation details.
You definitely have some strong opinions about testing. I'd suggest being more specific so your guidance is more clear. "people who test what they think is appropriate" is really subjective and makes it seem like there's two camps: people who "get" the author's perspective, and those who don't. I've tried the "do" and "don't do" approach, but that's not enough information usually. Teach by examples. Lots of examples. Calling things BS isn't going to change someone's mind. I'd also re-enforce that these practices are for your codebase. Every project is different and has different challenges. There's a reasonable counter-example for each item in your TLDR, and "unless you deem fit" doesn't feel like it gives developers leeway given how strongly you state your opinions.
If you don't test your HTTP API (point 2), you may as well not test at all. You may do everything else and still fail your end APIs. Your users don't care about your core or your mocks. They want the end APIs to work.
The rest just seem arbitrary. Can you make a change and be confident it works ? The last point sums up everything. "Use your best judgement"
No, it is not overrated. The author is just using a not-so-handy definition of 'unit'. When a 'unit' is a single method/class/function, 'unit' testing is not in general a very wise thing to do. In most cases it is better to unit test a set of related classes/methods/functions as a whole, because the communication between these things also needs to be tested. In that case it is quite impossible to overrate the importance of unit testing. I also don't think the definition the author uses is the original definition of 'unit'.