Tuesday, May 28, 2013

Testing on the Toilet: Don’t Overuse Mocks

By Andrew Trenk

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

When writing tests for your code, it can seem easy to ignore your code's dependencies by mocking them out.

public void testCreditCardIsCharged() {
  paymentProcessor = new PaymentProcessor(mockCreditCardServer);
  // Script every collaborator the processor touches, including mocks that return other mocks.
  when(mockCreditCardServer.isServerAvailable()).thenReturn(true);
  when(mockCreditCardServer.beginTransaction()).thenReturn(mockTransactionManager);
  when(mockTransactionManager.getTransaction()).thenReturn(transaction);
  when(mockCreditCardServer.pay(transaction, creditCard, 500)).thenReturn(mockPayment);
  when(mockPayment.isOverMaxBalance()).thenReturn(false);
  // Exercise the code under test.
  paymentProcessor.processPayment(creditCard, Money.dollars(500));
  // Verify the interaction rather than the resulting state.
  verify(mockCreditCardServer).pay(transaction, creditCard, 500);
}

However, not using mocks can sometimes result in tests that are simpler and more useful.

public void testCreditCardIsCharged() {
  // No scripted behavior is needed; the test asserts on the resulting state.
  paymentProcessor = new PaymentProcessor(creditCardServer);
  paymentProcessor.processPayment(creditCard, Money.dollars(500));
  assertEquals(500, creditCardServer.getMostRecentCharge(creditCard));
}

Overusing mocks can cause several problems:

- Tests can be harder to understand. Instead of just a straightforward usage of your code (e.g. pass in some values to the method under test and check the return result), you need to include extra code to tell the mocks how to behave. Having this extra code detracts from the actual intent of what you’re trying to test, and very often this code is hard to understand if you're not familiar with the implementation of the production code.

- Tests can be harder to maintain. When you tell a mock how to behave, you're leaking implementation details of your code into your test. When implementation details in your production code change, you'll need to update your tests to reflect these changes. Tests should typically know little about the code's implementation, and should focus on testing the code's public interface.

- Tests can provide less assurance that your code is working properly. When you tell a mock how to behave, the only assurance you get with your tests is that your code will work if your mocks behave exactly like your real implementations. This can be very hard to guarantee, and the problem gets worse as your code changes over time, as the behavior of the real implementations is likely to get out of sync with your mocks.
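
To make the last point concrete, here is a minimal sketch of how a scripted dependency can drift from the real one. All of the names (BalanceChecker, RealBalanceChecker, ScriptedBalanceChecker, ChargeApprover) and the $1,000 limit are hypothetical and are not part of the example above. The hand-written ScriptedBalanceChecker stands in for a mock that was told to always return false, so the sketch runs without a mocking library.

```java
// Hypothetical dependency, invented for this sketch.
interface BalanceChecker {
  boolean isOverMaxBalance(long dollars);
}

// The real implementation: its rule may change over time.
class RealBalanceChecker implements BalanceChecker {
  private static final long MAX_BALANCE_DOLLARS = 1_000; // assumed limit

  @Override
  public boolean isOverMaxBalance(long dollars) {
    return dollars > MAX_BALANCE_DOLLARS;
  }
}

// Hand-written equivalent of a mock scripted to always return false.
class ScriptedBalanceChecker implements BalanceChecker {
  @Override
  public boolean isOverMaxBalance(long dollars) {
    return false;
  }
}

// Simplified code under test.
class ChargeApprover {
  private final BalanceChecker checker;

  ChargeApprover(BalanceChecker checker) {
    this.checker = checker;
  }

  boolean approve(long dollars) {
    return !checker.isOverMaxBalance(dollars);
  }
}

class MockDriftDemo {
  public static void main(String[] args) {
    long dollars = 5_000;
    // The scripted dependency says the charge is fine...
    System.out.println("scripted: "
        + new ChargeApprover(new ScriptedBalanceChecker()).approve(dollars)); // prints "scripted: true"
    // ...while the real dependency rejects it.
    System.out.println("real: "
        + new ChargeApprover(new RealBalanceChecker()).approve(dollars)); // prints "real: false"
  }
}
```

A test written against the scripted checker keeps passing no matter what rule the real checker enforces, which is the assurance gap described above.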

Signs that you're overusing mocks include mocking out more than one or two classes, or having a single mock specify how more than one or two methods should behave. If you find yourself mentally stepping through the code being tested in order to understand a test that uses mocks, then you're probably overusing mocks.

Sometimes you can't use a real dependency in a test (e.g. if it's too slow or talks over the network), but there may be better options than using mocks, such as a hermetic local server (e.g. a credit card server that you start up on your machine specifically for the test) or a fake implementation (e.g. an in-memory credit card server).
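
As a rough illustration of the fake option, here's a sketch of what an in-memory credit card server might look like. The CreditCardServer interface, its charge method, and the simplified PaymentProcessor are assumptions made up for this sketch (the real classes aren't shown in this post). Cards are plain strings and amounts are whole dollars to keep it self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: the real CreditCardServer API is not shown in this post.
interface CreditCardServer {
  boolean isServerAvailable();
  void charge(String creditCard, long dollars);
}

// A fake: a working, in-memory implementation intended only for tests.
class FakeCreditCardServer implements CreditCardServer {
  private final Map<String, Long> mostRecentCharge = new HashMap<>();

  @Override
  public boolean isServerAvailable() {
    return true;
  }

  @Override
  public void charge(String creditCard, long dollars) {
    if (dollars <= 0) {
      throw new IllegalArgumentException("charge must be positive");
    }
    mostRecentCharge.put(creditCard, dollars);
  }

  // Test-only accessor, mirroring getMostRecentCharge in the test above.
  long getMostRecentCharge(String creditCard) {
    Long charge = mostRecentCharge.get(creditCard);
    if (charge == null) {
      throw new IllegalStateException("no charge recorded for " + creditCard);
    }
    return charge;
  }
}

// Simplified stand-in for the code under test.
class PaymentProcessor {
  private final CreditCardServer server;

  PaymentProcessor(CreditCardServer server) {
    this.server = server;
  }

  void processPayment(String creditCard, long dollars) {
    if (!server.isServerAvailable()) {
      throw new IllegalStateException("server unavailable");
    }
    server.charge(creditCard, dollars);
  }
}

class FakeServerDemo {
  public static void main(String[] args) {
    FakeCreditCardServer creditCardServer = new FakeCreditCardServer();
    PaymentProcessor paymentProcessor = new PaymentProcessor(creditCardServer);
    paymentProcessor.processPayment("test-card", 500);
    System.out.println(creditCardServer.getMostRecentCharge("test-card")); // prints 500
  }
}
```

Because the fake has real (if simplified) behavior, the test only needs to call the code under test and check the resulting state, with nothing to script.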

For more information about using hermetic servers, see http://googletesting.blogspot.com/2012/10/hermetic-servers.html. Stay tuned for a future Testing on the Toilet episode about using fake implementations.

18 comments:

  1. "Sometimes you can't use a real dependency in a test (e.g. if it's too slow or talks over the network), but there may better options than using mocks"

    Often the easiest solution is to refactor interfaces. In this case, the PaymentProcessor probably shouldn't be searching so deep into the CreditCardServer's dependencies. One possibility is to create a TransactionCreditCardServer that wraps communication with the server in transactions. Then the software under test can ask the TransactionCreditCardServer for the payment directly, making the mocks much easier to set up.

    Replies
    1. Refactoring might help reduce the problem, but if you use mocks to test TransactionCreditCardServer then you still have the same problem. Plus you'll probably be adding complexity if you add lots of new classes just to make your mocks happy.

      Also, in many cases you won't be able to refactor the mocks away. In the above example there's only one dependency, but what if your code takes five dependencies? The mocking problem gets much worse, and it probably won't be easy to refactor all those away behind a simple interface.

  2. Much of the ugliness in this example is due to the mock object being a "partial mock"; see http://monkeyisland.pl/2009/01/13/subclass-and-override-vs-partial-mocking-vs-refactoring/ for a discussion of when to use partial mocks vs. subclass-and-override.

  3. Brilliant and concise article.

    The problem with mocks is they duplicate the logic of the code itself - instantly breaking DRY. Having to use mocks is a code smell and normally indicates the need to refactor.

    Replies
    1. Mocks work better if you don't put logic into them and just use them to get your code into a certain state (e.g. if you have special handling in your code when your dependency has an empty list, you can use a mock to tell the dependency to return an empty list so your test can execute that code).

      Mocks also work well for testing interactions
      (see http://googletesting.blogspot.com/2013/03/testing-on-toilet-testing-state-vs.html).

    2. I don't read any general "problem with mocks" into the article above at all. I have seen tests like the bad example above; those are simply attempts to write integration tests in a unit test style. However, mocking is a very useful technique for truly testing single units in isolation (for instance, how one of the 5 subsystems represented in the above test interacts with a single other subsystem).

      I would say the opposite. The need to use mocks is not a code smell. The *inability to use mocks properly to write readable concise tests* is a code smell. It means your code is too tightly coupled, and the contracts between the individual components are opaque and poorly understood, and only testable by throwing the whole kit-n-caboodle together, and in the event of a failure, hoping you have good enough logging to figure out exactly what went wrong.

  4. Also, it's called unit testing for a reason: testing dependencies is a no-no. Although I agree mocking shouldn't be overdone, having tests spill outside the unit is not good either. It has just as much potential as mocks to introduce unwanted side effects.

    Replies
    1. You can't test every class in isolation. You can do things like refactor your code so that it doesn't even know about the dependency (e.g. if you have business logic and database logic mixed together, you can move the business logic into a separate class that has no knowledge of the database, which makes it easier to test).

      But at some point you're going to test your code that uses a dependency, and you can't always use mocks just so you can ignore the dependency.

    2. So that's what end-to-end tests are for. I agree, unit tests, first and foremost, must compile in ~500ms, tops. Otherwise you can't hook them into the save function of your editor without noticing the lag, and you lose that tight feedback loop that makes unit tests so critical.

      Let end-to-end tests run for 20 minutes. You're only running them when you push to production, which is probably only three times a day for your team, anyway.

  5. Mocks are like hard drugs... the more you use, the more separated from reality everything becomes.

  6. The whole point of unit testing is that you are attempting to test a unit of functionality. A unit is usually understood as a class. However, sometimes the purpose of that class is to interact with files, databases, networks, other classes, etc., and dragging those into your test muddies the water of what a unit is. Mocking the interaction with the rest of the system is a valid use of mocking I believe.

    There is a scenario where mocking cannot be avoided: testing the operation of the program as a result of human interaction. I recently mocked a dialog box service so that I could simulate the user clicking 'yes' or 'no' after being shown a dialog. If you intend to do automated testing of human interaction, mocking is your only way.

    Replies
    1. A better definition of a "unit" would be a single class or a group of related classes. Otherwise if you do something like refactor some utility code out of a class into a helper class, that helper class becomes a separate unit, meaning what was previously an implementation detail now becomes something you would have to mock out, and this could easily lead to tests that are hard to maintain.

      But mocking can still get complex sometimes even if you're trying to mock something that's a separate unit, and it really only works well if you have a simple interface that's being mocked out (e.g. if the code under test calls a method that returns a list and your code then does something interesting with that list, you can tell the mock to return an empty list, a list with one element, etc). But if there are more complex interactions that require several mocks to talk to each other or require the mocks to return other mocks, it's a much better idea to use a fake, or possibly even the real thing if it's not too heavyweight.

  7. Doesn't the code being tested above involve a pretty clear Law Of Demeter violation? Code that reaches through 4 levels of dependencies (server -> transactionManager -> transaction -> payment) is going to be difficult to test by any means. If you're truly trying to test the integration between all those systems, then we're talking about an integration test, and mocking is obviously not appropriate there. If you're unit testing, the code should either be refactored not to reach through so many layers, or the component interactions should be tested piecemeal as true units.

    Replies
    1. There's really no such thing as a "violation" of the Law of Demeter since it's more of a best practice than a law, and it doesn't necessarily make sense to follow it in every case. And in some situations you might not have a choice if the API has already been designed and you're stuck with that design.

      Using mocks when what you really need is an integration test is already mentioned in the post. This is probably one of the most common cases of overuse of mocks, but similar problems can still happen in unit tests if you indiscriminately mock out all dependencies (e.g. mocking out a bunch of helper classes that are closely related to the class under test).

    2. Heh, can I lobby that we call it something other than a "law" if it can't be violated? (Joking) Anyway I was making reference to previous Google Testing Blog posts that advocate writing code with the LoD in mind:
      http://googletesting.blogspot.com/2008/07/how-to-write-3v1l-untestable-code.html
      http://googletesting.blogspot.com/2008/07/breaking-law-of-demeter-is-like-looking.html

      It seems like if you control the code in question, you could rewrite it with the above posts in mind to make it more easily testable. If you don't control the APIs, as you suggest, I think I'd write a Facade that I could control. Its implementation would be tested only as part of a full system test, and it would be simple enough that I was okay with that trade-off. The code on top of it would then talk to the Facade API, which I could control (and possibly write a fake of, etc.).

      I do agree that overmocking can get ugly fast, especially in the case of closely related dependencies like you mention. I guess the reason I initially bristled at your post is that I am somewhat of a test practices evangelist at my company, and a significant portion of the culture here is resistant to the use of *any kind of testing double whatsoever under any circumstance*. I wanted to clarify it so someone wouldn't use this post (erroneously, I hope) in defense of a never-use-doubles-ever philosophy.

  8. I suggest the book Growing Object Oriented Software Guided by Tests (GOOS) to see a good use of mocks and TDD.

    There are two styles of TDD: state-based (Kent Beck's book Test Driven Development by Example is a good example) and interaction-based (GOOS is a good example). As the names suggest, state-based TDD relies more on the state of objects to construct its tests, while interaction-based TDD relies more on the interactions (messages) between objects. Both have pros and cons. Having used both practices, today I tend to use interaction-based TDD more, but depending on the problem I also use state-based. And I know of very good programmers on both sides.


    For more information two good groups (IMO) are:

    TDD group
    http://groups.yahoo.com/neo/groups/testdrivendevelopment/info

    Growing Object-Oriented Software (the GOOS book group)
    https://groups.google.com/forum/#!forum/growing-object-oriented-software


  9. The immediate problem I see with this test is that it actually makes a $500 charge through the payment processor. Do you really want to have to cancel a charge every time you run your test suite? And what if there's some fee involved? I think those considerations would be enough to make me want to run a test suite as rarely as possible, which is exactly what you don't want from your test suite.

    Replies
    1. You definitely don't want your tests to talk to a real credit card server. The last paragraph of the post mentions what to do here: either start up a local version of the credit card server for the test (although this might not be feasible if the server is too big and complex), or use a fake in-memory version of the credit card server.

      That's the problem with mocks: they make it easy to write tests for your code, but it's also easy to use them improperly, which can make the tests useless and will likely slow down development too.

