A few thoughts on BDDon't

Note: apparently I wrote this blog post back in December but forgot to publish it. I am publishing it now regardless, together with the Python warnings post, and keeping the original create date from last year.

Kevin Dishman wrote an article titled BDDon't, in which he raises a few objections to BDD tools and the common practices around them. Since we recently talked about Gherkin-style testing at the Softwerkskammer Nuremberg user group after my polytesting talk, I'm now doubly prompted to blog some thoughts.


  • I've only used behave, so it's entirely possible it has different behavior from the tool implementation(s) Kevin used
  • I don't run the behave tests on the Selenium level
  • I have never worked for an airline, though I have flown often enough. To be consistent with Kevin's examples, I will use this domain to illustrate my points
  • I don't use BDD tests exclusively

I agree with Kevin's problem statements

  • Regular-expression mapping can become complex and error prone and should be avoided
  • Global state and step interdependencies are also error prone and should be avoided
  • If the business is really interested in participating in test specification, this would be a great opportunity for them to pair with someone familiar and comfortable with the test suites the team has created.

Despite that, I still like Gherkin tools

It forces separation of test, test glue, and app code

Writing readable tests is hard. On a team with a great culture and strong discipline this isn't a challenge, but on teams that are not yet experts in testing I find this constraint really helpful.

Gherkin docs can be useful beyond dev

As stated above, I've found that motivated stakeholders, BAs, etc. will be fine reading Java, Ruby, or Python test code. However, other people can get value from existing - and filterable/navigable (!) - Gherkin documents. For example, tech/customer support people could appreciate living documents: a new support person needs the same information as a new developer making changes in that area! Or imagine that your processes need to be audited - persuading an auditor to read natural-language text might be easier than getting them to navigate a codebase.

Don't have to test things on the full stack level

Even if Gherkin is used for acceptance testing, some algorithms/calculations can be tested on the unit/component level, never even touching the UI to provide the confidence the developers (business) need(s). As time goes on, it's even possible to transition tests up and down the test pyramid (thanks to Stefan Clepce for bringing behave stages to my attention)!

Reduce the pain for Gherkin

Regular-expression mapping - Parameter variance

Different abstraction levels

The problem of complex test data isn't unique to BDD tools - integrated tests written in any tool face the same problem. And the solutions are similar: in the tests where I care about the concrete values, I specify all the test values explicitly, and in tests where I don't care about those details and operate at a higher level of abstraction, I create sample (random) data, à la pytest fixtures.

E.g.: instead of

Given I have booked 2 flights

I might say

Given I booked a valid flight
And I booked another valid flight

Both of these steps can map to the same step implementation - whether via regex, multiple methods calling the same actual implementation, or some other way. Here I quite like behave's decorators, i.e. I can say

from behave import given

@given('I booked a valid flight')
@given('I booked another valid flight')
def booked_a_valid_flight(context):
    ...  # book a single flight using sample data
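A minimal sketch of what the body of such a shared step might look like. The make_sample_flight helper and the airport codes are hypothetical, made up for illustration; they are not part of behave or of Kevin's examples. The behave decorators are left out so the snippet runs on its own, and a SimpleNamespace stands in for behave's context object.

```python
import random
from types import SimpleNamespace

AIRPORTS = ["NUE", "FRA", "BUD", "LHR"]  # arbitrary sample values


def make_sample_flight(**overrides):
    """Return a plausible flight; callers override only the values they care about."""
    origin, destination = random.sample(AIRPORTS, 2)  # two distinct airports
    flight = {"origin": origin, "destination": destination, "passengers": 1}
    flight.update(overrides)
    return flight


def booked_a_valid_flight(context):
    # In a steps module this function would carry both @given decorators.
    if not hasattr(context, "bookings"):
        context.bookings = []
    context.bookings.append(make_sample_flight())


# Stand-in for behave's context object, for illustration only.
context = SimpleNamespace()
booked_a_valid_flight(context)  # "Given I booked a valid flight"
booked_a_valid_flight(context)  # "And I booked another valid flight"
print(len(context.bookings))
```

A test that does care about a concrete value would call make_sample_flight(origin="NUE") and leave the rest to the sample data.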

Or I might even test things in isolation, not even sharing step implementations - I might test that the OLTP model gets translated to the correct reporting model, and for the report I only test things on the report data model (see step tables below).

Relying on the application's own defaults

I have never booked a flight for an unaccompanied minor. Until reading Kevin's article, it hadn't even occurred to me that this is a use case to consider, even though I've booked quite a number of flights.

I'm sure there is an option to specify that. However, that option is not checked by default in the application's checkout - the customer has to request it explicitly to be able to specify such a scenario.

Thus for such a scenario, I likely would include an explicit step for booking with an unaccompanied minor.
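As a rough sketch of that idea: the BookingForm class, its fields, and the step wordings in the comments are all hypothetical, and the behave decorators are omitted so the snippet runs without behave installed. The point is only that the first step relies on the application's defaults, while the unaccompanied minor case needs its own explicit step.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class BookingForm:
    """Hypothetical checkout form; defaults mirror what the application pre-selects."""
    passengers: int = 1
    unaccompanied_minor: bool = False  # unchecked unless the customer asks for it


def start_booking(context):
    # "Given I start booking a flight": nothing beyond the application's defaults
    context.form = BookingForm()


def request_unaccompanied_minor(context):
    # "And I request the unaccompanied minor service": the explicit opt-in
    context.form.unaccompanied_minor = True


# Stand-in for behave's context object, for illustration only.
context = SimpleNamespace()
start_booking(context)
default_value = context.form.unaccompanied_minor
request_unaccompanied_minor(context)
print(default_value, context.form.unaccompanied_minor)
```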

Using step tables (beware, this is defaults in disguise!)

Behave supports step table data

Given I have booked a flight
   | type                     | return |
   | with unaccompanied minor | true   |

Technically, the above would need a header, but the step implementation could treat the header as the first row if that aids readability.

This can lead to more complex test code, but if each step corresponds to an actual action in the application (à la Page Object), then there should be no extra logic in the step handler - each line here corresponds to a form input (or HTTP POST) entry.
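A small sketch of such a logic-free mapping, using plain lists so it runs without behave. The table_to_form_entries helper is a made-up name. Inside a real step, the arguments would presumably come from context.table.headings and [row.cells for row in context.table].

```python
def table_to_form_entries(headings, rows):
    """Pair each row's cells with the column headings, one dict per table row."""
    return [dict(zip(headings, cells)) for cells in rows]


# The same data as the step table above, written out as plain lists.
headings = ["type", "return"]
rows = [["with unaccompanied minor", "true"]]

entries = table_to_form_entries(headings, rows)
print(entries)
```

Note that the cell values stay strings ("true", not True), just as they would arrive from a form input or an HTTP POST body.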

Global state

In behave, each step implementation method receives a Context object to store its state (changes) in. To quote the behave documentation:

This object is a place to store information related to the tests you're running. You may add arbitrary attributes to it of whatever value you need.

During the running of your tests the object will have additional layers of namespace added and removed automatically. There is a "root" namespace and additional namespaces for features and scenarios.

This differs from how it is implemented in, e.g., cucumber-jvm, and maybe that's why I haven't run into this issue (though external global state, such as a database, can still be shared).
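The layering described in the quote can be pictured with the standard library's ChainMap. This is only an analogy for the root/feature/scenario namespaces, not behave's actual implementation, and the keys and values are made up.

```python
from collections import ChainMap

root = ChainMap({"base_url": "http://localhost"})  # the "root" namespace
feature = root.new_child()    # layer added when a feature starts
feature["airline"] = "sample"
scenario = feature.new_child()  # layer added when a scenario starts
scenario["booking"] = {"return": True}

# Inside the scenario, values from every layer are visible.
visible_inside = ("base_url" in scenario, "airline" in scenario, "booking" in scenario)

# When the scenario ends its layer is dropped, and so is what the steps stored in it.
after_scenario = scenario.parents
print(visible_inside, "booking" in after_scenario, "airline" in after_scenario)
```

State written by one scenario's steps therefore never reaches the next scenario, while anything set at the feature or root level stays available.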

What do you think? I would love it if you left a comment - drop me an email at hello@zsoldosp.eu or tell me on Twitter!

