Why exactly don't we stub that database?

Ok, when it came to unit tests I’ve been in the heavy-mocking-and-stubbing camp a few years ago. But then I’ve then seen our team (including myself) crash into the same issue over and over again: the fact that with mocks/stubs one does not test the real thing. One tests mocks and stubs which can easily get out of sync with the “real thing”, i.e. ActiveRecord models in our case.

So I’ve completely turned away from my previous position. Our tests became somewhat slower, hitting the database all the time, but they also were more robust, easier to read and easier to change. We could be pretty sure that our tests test the real application and things that were broken would actually be caught by the test suite.

Now while I’ve been watching Mike Perham’s talk about Scalable Ruby Processing with EventMachine something rang a bell for me. At some point Mike mentions that in order to write a non-blocking Postgres adapter he only needed to change a single method: execute on the Postgres adapter. For some reason the question popped into my mind if that couldn’t apply to mocks/stubs for unit tests as well.

Actually when we say that we “mock out the database” that’s not even close to what we’re really doing. In fact we are using an arbitrary object that pretends to behave the same way as the “real” one. That’s something completely different from really mocking the database if you think about it – and to me it seems this difference is the same difference that I’ve been experiencing when mocks became out of sync with the actual application: some logic in the model had changed but an according change to the mocks was missing. Because my “mocks” didn’t actually mock the database but the whole model.

So what if we’d use a record-replay approach instead that wraps around the adapter’s execute method and records responses from the database when our tests run for the first time. On subsequent requests it would not hit the database any more but actually mock it by replaying the results from the first run.

[Update:] VCR is a test helper library that does something similar for HTTP. The README says: “Record your test suite’s HTTP interactions and replay them during future test runs for fast, deterministic, accurate tests.”

Let’s assume we have this totally made up test. When we run it for the first time the following happens:

User.create!(:name => 'David')     # => pass the CREATE query to the db and record that it's in state abcd (hash from the query)
david = User.find_by_name('David') # => pass the FIND query to the db and record the result for state abcd
david.name = 'Josh'
david.save                         # => pass the UPDATE query to the db and record that it's in state 1234 (hash from the query)
david.reload                       # => pass the FIND query to the db and record the result for state 1234
assert_equal 'Josh', david.name    # => GREEN

Running the same test again would allow the execute mock to kick in, preventing the test from hitting the database:

User.create!(:name => 'David')     # => take a note that the db is in state abdc (hash from the CREATE query)
david = User.find_by_name('David') # => return the result previously recorded for this FIND query executed on state abdc
david.name = 'Josh'
david.save                         # => take a note that the db is in state 1234 (hash from the UPDATE query)
david.reload                       # => return the result previously recorded for this FIND query executed on state 1234
assert_equal 'Josh', david.name    # => GREEN

Obviously when we now make a change to either the application or test code that changes the interaction with the database then our recorded responses would become stale. In that case a query would get executed that has not been recorded before so we can stop the test suite from executing and ask the developer to re-record the results from the actual database first. We could even consider stopping the test suite, wiping out any results and re-run the test suite in recording mode automatically.

This way we would actually stub the database, not the model. Our tests would still go through all the low level ActiveRecord code that turns database result sets into model instances, which will be slower than using fake models but also will be much more robust and make our tests more useful.

The other question obviously is, how much run time could we spare this way? Obviously we’d need some kind of temporary place (maybe Redis?) to store those db states between test runs. So in the end we’d use one database to stub another one. Hu?

Let me know what you think.