
Gherkin for Business Analysts

30 Jun


This article provides an overview of, and guide to, Gherkin. It should help BAs who have been asked to write scenarios, work in BDD teams, write feature files, or create acceptance tests using the Given/When/Then format.

I’ll provide examples of the Gherkin syntax, and explain why Gherkin is used & how it fits into BDD.

What is this Gherkin you speak of?

Gherkin is a language used to write acceptance tests. BAs use Gherkin to specify how they want the system to behave in certain scenarios.

My personal definition of Gherkin is: “A business readable language used to express the system’s behaviour. The language can be understood by an automation tool called Cucumber.”

It’s a simple language. There are 10 key words (e.g. Given, When, Then), which makes it understandable by the business. Gherkin can also be understood by an automation tool called Cucumber. That means Cucumber can interpret Gherkin and use it to drive automated tests, which links BA requirements to automated tests.

Below is an example of an acceptance test written in Gherkin. A BA may write the acceptance test independently, or as a team effort. Each scenario will test one example of the system’s behaviour:
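
The original example is not reproduced in this text. As a stand-in, here is a minimal sketch of a single scenario. The lockout rule and the wording of each step are illustrative assumptions, not taken from a real system, and in a real feature file the scenario would sit under a Feature title:

```gherkin
# Hypothetical example: one scenario = one example of the system's behaviour
Scenario: User is locked out after three incorrect passwords
  Given a registered user with two failed sign-in attempts
  When the user enters an incorrect password
  Then the account is locked
  And the user sees a message explaining how to unlock the account
```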

The system’s behaviour needs to match the acceptance tests/scenarios. A feature may have many scenarios that need to pass. For example with a login component: in one scenario (incorrect password 3 times) a user should be locked out. In another scenario (incorrect password 2 times) a user should see a warning error message etc.

The 10 key words of Gherkin are:

  • Given
  • When
  • Then
  • And
  • But
  • Scenario
  • Feature
  • Background
  • Scenario Outline
  • Examples

We’ll go through each key word soon.

Why use Gherkin?

There are two key reasons to use Gherkin:

  1. Gherkin allows Business Analysts to document acceptance tests in a language developers, QA & the business can understand (i.e. the language of Gherkin). By having a common language to describe acceptance tests, it encourages collaboration and a common understanding of the tests being run.

  2. Gherkin also links acceptance tests (GIVEN/WHEN/THEN) directly to automated tests. This is because Cucumber can understand Gherkin. Essentially it means if a BA changes an acceptance test – the developer’s underlying Cucumber test should fail and a red light should start flashing!! Therefore we can be confident that the system matches the BA’s specification. It’s an executable specification. It links requirements, tests and code together. It means the requirements are a living document that needs to be kept up to date – otherwise automated tests will fail! Similarly, if the documentation changes and the code doesn’t change – a test will fail, which is also good 🙂

As part of BDD, teams want to write many automated tests to improve their confidence in the product/releases. Teams want these tests to be understandable + valuable. Gherkin acceptance tests help with that!! Gherkin adds power to the acceptance tests being written by a BA – because they are directly executed as Cucumber automated tests.

Basic Syntax

Let’s go through the 10 key words.

  • Given, When, Then, Scenario
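
The article's original example is not reproduced in this text, so the following is a minimal sketch using only these four key words. The bank account, the amounts and the step wording are hypothetical:

```gherkin
# Hypothetical example using Scenario, Given, When and Then
Scenario: The one where the user has insufficient funds
  Given the user has a balance of £20 in their account
  When the user requests a withdrawal of £50
  Then the withdrawal is declined
```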

Above is a simple acceptance test. It uses 4 of the 10 key Gherkin words.

Given. This puts the system in a known state. It’s a set of key pre-conditions for a scenario (e.g. user has logged in, user has money in their account etc)

When. This is the key action a user will take. It’s the action that leads to an outcome

Then. This is the observable outcome. It’s what happens after the user makes that action

Scenario. This is used to describe the scenario & give it a title. We do this because a feature or user story will likely have multiple scenarios. Giving each scenario a title means people can understand what is being tested without having to read all the Given/When/Thens. A scenario title should be succinct. Some people follow “the Friends format”, e.g. “The one where the user has insufficient funds”.

  • And

This is used when a scenario is more complicated. It can be used in association with Given, When, or Then. Best practice is to avoid having lots of Ands. Having lots of Ands can indicate that a scenario contains unnecessary information – or that a scenario is in fact multiple scenarios.

Some people avoid using And in association with When, because this implies a scenario in fact needs to be broken down into multiple scenarios. Typically you only want one When (i.e. action) per scenario.
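
As a rough illustration, And can extend the Given and the Then while keeping a single When. The checkout steps below are made up for the example:

```gherkin
# Hypothetical example: And adds a second pre-condition and a second outcome
Scenario: Signed-in user pays with a saved card
  Given the user is signed in
  And the user has a saved payment card
  When the user confirms the order
  Then the order is placed
  And a confirmation email is sent to the user
```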

  • But

But is typically used in association with Then, to say that something shouldn’t happen as an outcome. I’ve literally never used this one!!
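
For completeness, a minimal sketch of how But might read. The sign-in steps are hypothetical:

```gherkin
# Hypothetical example: But states an outcome that should NOT happen
Scenario: Incorrect password is rejected
  Given a registered user on the sign-in page
  When the user enters an incorrect password
  Then an error message is shown
  But the user is not signed in
```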

  • Feature

Feature is used to give a title for the feature/piece of functionality. A feature contains lots of scenarios. For example “Sign in” might be a feature, or “push alerts”: it’s the title of a piece of functionality.

The same way that scenarios have titles, features have titles.

A feature file is a file that contains Acceptance Criteria (bullet points describing the rules / high level behaviour) & Scenarios (these are written in Gherkin; they test instances of the rules). Essentially it can be used to contain all the detail for a feature. It’s usually stored on GitHub & it’s basically a text file with an extension of .feature.
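
A feature file along these lines might look like the sketch below. The file name (say, sign_in.feature), the acceptance criteria and the scenarios are all illustrative assumptions. The free-text lines under the Feature title are treated as a description, which is one place the bullet point ACs can live:

```gherkin
# Hypothetical contents of sign_in.feature
Feature: Sign in
  Acceptance criteria:
  - A registered user can sign in with a valid email and password
  - Three incorrect passwords in a row lock the account

  Scenario: Successful sign in
    Given a registered user on the sign-in page
    When the user enters a valid email and password
    Then the user is signed in

  Scenario: Account is locked after three incorrect passwords
    Given a registered user with two failed sign-in attempts
    When the user enters an incorrect password
    Then the account is locked
```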

  • Background

This sets the context for all scenarios below it. If you find that scenarios have common Given/Ands, Background can be used to eliminate the repetition.

Background is run before each of your scenarios. Scenarios can still have Given/When/Thens.
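
A minimal sketch of the idea, with made-up account settings steps. The two Background steps run before each scenario, so neither scenario needs to repeat them:

```gherkin
# Hypothetical example: shared pre-conditions moved into Background
Feature: Account settings

  Background:
    Given the user is signed in
    And the user is on the account settings page

  Scenario: User changes their display name
    When the user saves a new display name
    Then the new display name is shown on their profile

  Scenario: User turns off email notifications
    When the user switches off email notifications
    Then no further notification emails are sent
```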

  • Scenario Outline / Examples

These are used together. They are used to combine a set of similar scenarios. Essentially you create a table and enter in values … rather than writing a scenario for each set of values. It can mean you have one scenario rather than 10 similar scenarios & makes the feature file much more readable.
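
To make that concrete, here is a sketch that folds three similar sign-in scenarios into one outline. The placeholder names, messages and lockout threshold are assumptions for illustration. Each row of the Examples table is run as its own scenario, with the values substituted into the angle-bracket placeholders:

```gherkin
# Hypothetical example: one outline, three rows = three scenarios
Feature: Sign in

  Scenario Outline: Response to consecutive incorrect passwords
    Given a registered user with <previous_failures> failed sign-in attempts
    When the user enters an incorrect password
    Then the user sees the "<message>" message
    And the account status is "<status>"

    Examples:
      | previous_failures | message               | status |
      | 0                 | Incorrect password    | active |
      | 1                 | One attempt remaining | active |
      | 2                 | Account locked        | locked |
```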

  • Other stuff

Tags can be used to group acceptance tests. They aren’t one of the 10 key words above, but Gherkin supports them and they are good practice. For example you can use @manual to identify manual acceptance tests. Or @javascript-disabled, @signed-in-users, @edge-case, @jira-103. A scenario can have multiple tags – and you can create your own tags.
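
A short sketch of tags in use. The feature and its steps are hypothetical; the tag names are the ones mentioned above. Tags go on the line above the thing they describe, and a tag placed on the Feature applies to every scenario in it:

```gherkin
# Hypothetical example: tags on a feature and on individual scenarios
@signed-in-users
Feature: Push alerts

  @edge-case @jira-103
  Scenario: User has disabled notifications at the device level
    Given the user is signed in
    And notifications are disabled in the device settings
    When a breaking news alert is published
    Then no push alert is delivered to the device

  @manual
  Scenario: Alert sound is audible on a locked device
    Given the user is signed in
    And the device is locked
    When a breaking news alert is published
    Then the alert sound plays
```

A test runner can then be asked to include or exclude scenarios by tag, for example to skip everything marked @manual in an automated run.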

Steps are the name of anything below the scenario title. It’s the steps that a test will run through for a scenario (e.g. your Given / When / Then)

How it fits into BDD

As part of BDD the developer will write a test before the code. That means the test will initially fail, because the developer hasn’t written the code yet. It ensures each piece of functionality has automated test coverage.

Tests should be behavioural in BDD. They should be high-level tests describing user functionality (i.e. not unit tests). Gherkin helps ensure behavioural tests are written.

Below is a typical BDD process:

The BA would write a feature file (includes bullet point ACs and Gherkin scenarios). This would be 3 Amigo’d with a developer/QA.

The developer would write step definitions for a scenario:

This would cause the test to fail because there is no code yet to implement the functionality. The developer then writes code that makes the system behave as specified. The test passes.

Now if anyone changes the Gherkin scenario – it should result in the test failing.

Automated BDD tests reduce manual testing; this means we can have greater confidence when performing regular releases. By using Gherkin, those automated tests can be understood by everyone. And the tests are hooked into the BA requirements.


Hopefully you can see the benefits of using Gherkin. I’ve tried to explain what Gherkin is, why it’s used & the key syntax, and hopefully the article provided a useful overview.

You should be able to take this little Gherkin quiz !!

Answers are below:

1 C

2 G

3 B

4 D

5 E

6 I

7 J

8 F

9 A

10 H

Emojination cards

1 Feb

Tired of planning poker? Use some Emojination!! Cards free to download below:



PDF cards

Powerpoint cards

Let me know your thoughts!!


Brian the BA learns about the INVEST criteria

31 Jan



Brian the BA – explains specification by example

29 Jan


User Story Smells

24 Jan


“User story smells” is a term used by Mike Cohn in User Stories Applied. It describes anti-patterns that happen when writing user stories. Mike Cohn provided a number of story smells.

With 9 years of Business Analysis experience, I decided to write my top 10 story smells. They are based on my observations. I’ve even created a game for people to try.



Smell 1 – Everything in a Sprint should be written as a user story

This seems to happen with less experienced agile teams. They use the story format for everything in a Sprint (e.g. As a developer … I want … So that).

Why is it bad? User stories are written from the perspective of end users. They ensure what you build is anchored on a user need. Technical tasks can be sub-tasks of user stories (preferred option), or just tasks that need to be done to keep the lights on (e.g. renew a cert).

User stories are one type of item in the product backlog. Other types of item include: bugs, tasks, epics and spikes. Items can be in a Sprint without being user stories. Don’t spend time thinking how a technical sub-task can fit into the user story format.


Smell 2 – Stories should be sliced by technology layer, because that’s how our development team will approach them

Teams can have different groups of developers (e.g. front end and backend developers). There can be pressure to slice stories accordingly, because each story will be done by a different development team. Another reason is that breaking it down by technology layer removes a dependency on other developer teams. This is an artefact of how the development team is split.

The problem with this approach is that technology slices do not produce a valuable deliverable for the end user. The front end slice must plug into the backend to add value. Vertical slices of functionality are preferred to horizontal technology slices. Vertical slices are much more likely to be potentially shippable.



Smell 3 – Stories don’t need acceptance criteria

This is a strange one – I’ve seen it before. The idea is that the BA/product team should not solutionise. They should present the user need/story to the developer and not come with a list of acceptance criteria/constraints.

The problem is – you need a clear outcome for a story. And there are often clear requirements from the business, or constraints to be considered. Just putting AS I … I WANT … SO THAT and leaving out the acceptance criteria means you won’t know when a ticket is done. It’s not specific enough.

Collaborative specifications, or collaborative specification reviews (e.g. 3 Amigos) work around this. Stories have to have acceptance criteria in order to be testable and closeable.



Smell 4 – The product owner is a user

One of the most common smells. The product owner is a proxy for the user, but 9 times out of 10 they’re not the end user of the service.

The user in a user story has to be an end user of the system. They can be personas/types of user (e.g. admin, front line staff, loyal user etc). Product Owners, BAs, members of the dev team are not the end users.

Writing AS A product owner I WANT something SO THAT value isn’t a user story.



Smell 5 – Acceptance criteria must specify how features look & behave

Some developers like lots of detail. And that’s OK … but generally speaking acceptance criteria specify behaviour (i.e. what the system does in certain scenarios). They don’t need to specify how it looks.

There can be times when describing how a feature looks is useful – or even necessary. Generally attaching a visual or link to a component library is sufficient.

A picture is worth a thousand words.



Smell 6 – System-wide NFRs should be written as user stories

NFRs are tricky. There are obviously NFRs that affect the end user e.g. system availability. They can convincingly be written in the user story format.

One problem with writing system-wide NFRs as user stories (e.g. availability, system backups) is that they cut across the entire system. It’s difficult to test these NFRs until the entire system is built. I prefer to have system wide NFRs either as “definition of done criteria” which get tested against each ticket, or as items for regression testing at the end of a release.

Story-specific NFRs might be written as ACs against a ticket (e.g. audit log for a reduction decision).



Smell 7 – Specifying what the user wants is enough!

I’ve seen several people excluding the 3rd line of a user story. It’s the reason why the user wants something – the 3rd line helps us to understand why we’re doing the work.

The 3rd line of the user story (So that … ) can be driven from user research, or observations, or data analytics etc. Either way we need to understand the why before we start to solve the problem. At the minimum a story needs to include “So that”. This helps with prioritisation.



Smell 8 – User stories should be incredibly detailed

User stories should specify the appropriate level of information. There’s a tendency from BAs, and sometimes from the development team, to try to put all the information they have into a ticket.

Having a ticket that is too detailed adds little value. It makes it likely that people will scan over the ticket and miss the most important information. An incredibly detailed ticket is not necessarily better than a less detailed ticket – it’s about having the appropriate level of information.

As a story is worked on it might be that more detail emerges. But a story should contain enough information for the team to develop and test it.



Smell 9 – User stories can depend on other stories in the Sprint

Ideally user stories should meet the INVEST criteria. That means each story should be independent.

Unless it’s agreed at Sprint planning & made visible on the ticket – all user stories should be independent. There may be cases where two dependent stories are brought into the same Sprint – however the goal should be that stories do not depend on other stories.



Smell 10 – Stories should be very small

This is more for teams that are using Gherkin & TDD, however some teams aim to have very small user stories. Almost at the level of a handful of scenarios.

One advantage of smaller user stories is that we can track progress in a Sprint to a more granular level. But a note of caution – small user stories are essentially a grouping of scenarios. They can make the Sprint board less manageable and in themselves deliver very little value to a user. For very small stories it is difficult to make them independent and valuable.


The game

Here’s a link to a game we created. It lists the 10 smells + 10 example bad user stories. See if you can match them:

It’s a great team exercise – with either a product or a BA team. It helps reiterate some of the key points above. And makes examples tangible.

Any smells I’ve missed? Enjoy!

Applying Build, Measure, Learn to Sprint Demos

25 Jul


Like most Scrum teams, we held a “Sprint Review Meeting” every two weeks. We would gather as a team to demo what was recently built & receive feedback. Although it was a great opportunity to showcase recent work, we identified a number of problems with “Sprint Review Meetings” for our mature product:

  1. Stakeholder attendance was poor. Stakeholders saw the Sprint Review Meetings as a technical show & tell. The demos often didn’t work fully & business value wasn’t necessarily communicated.
  2. Because developers demoed the work, it put disproportionate pressure on the development team. We presented recent work & we often had problems with test environments/connections/mock data etc.
  3. More generally – the development team wanted regular updates from the product team. Our retros identified a need for the product team to provide regular updates about recent features; did a recently released feature meet our hypothesis? What did we learn? Will we iterate? How did it impact our quarterly OKRs?
  4. Sprint Review Meetings felt like a conveyor belt. We would demonstrate work, get feedback about quality, and then watch it leave the factory. But we wanted to learn how customers actually used the new product. We wanted external as well as internal feedback.


Build, Measure, Learn (BML) sessions

To address the above issues, we replaced Sprint Review Meetings with “Build, Measure, Learn” sessions. As advocates of the Build, Measure, Learn approach – we were keen to review recently released features with the team. We launched features every 2 weeks – so the natural cadence was to report on features at the end of the following Sprint.

The basic format is simple:

  • When: every 2 weeks, at the end of the Sprint. Replaces the Sprint Review Meeting.
  • Who: Team (Product, Devs, UX) & Stakeholders.
  • Duration: 1 hour.


The session is divided into two sections:

  1. Build = demo from the development team about what was built during the Sprint. It’s a chance to get feedback from the Product Owner/Stakeholders.
  2. Measure/Learn = product reporting back on stats/usage/insights of recently launched features. Typically on features & changes launched 2 & 4 weeks ago. This provides an external feedback loop.

The Measure/Learn section became as valuable as the demo section. It also provided practical breathing space for setting up/fixing demos – if we had problems we would start off with the Measure/Learn section 😉


Build section

As with the Sprint Review meeting – this section was the development team demoing what was built during the Sprint.

This was an opportunity for product/stakeholders to provide feedback and ask any questions. Changes were noted by the BA and put on the product backlog.

It was also an opportunity to praise the team & celebrate success.


Measure/Learn section

In the Measure/Learn section the BA or Product Owner would cover the following areas:

  1. General product performance: how we are performing against quarterly goals/OKRs
  2. For each recently released feature:
    • Present the testable hypothesis
    • Present the actuals. Key trends/unexpected findings/verbatim feedback from the audience about the feature
    • Present key learnings/actions: Build a v2/pivot/stop at v1/kill the feature?
  3. Wider insights (optional):
    • Present recent audience research/lab testing
    • Present upcoming work that UX are exploring & get feedback on it



We found that BML sessions were a great replacement for Sprint Review Meetings. They ensured we kept the measurement & learning part of the lifecycle front and center in the team. The Measure/Learn section also ensured we reported back on business value regularly.

Main benefits:

  1. Learnings/insights about recently released features were shared with the team – this kept us focused on our original hypotheses and business value. It enabled us to discuss the learnings based on external audience feedback.
  2. Encouraged a shared sense of ownership about the end of Sprint session and the performance of features
  3. Increased stakeholder attendance & stakeholder engagement as there was a focus on audience feedback and KPIs
  4. We were still able to demo the newly developed features & get Product Owner/Stakeholder feedback

How Might We … brainstorm ideas

13 Jul



“How Might We …” is a group brainstorming technique we have used for more than 6 months to solve creative challenges. It originated with Basadur at Procter & Gamble in the 1970s, and is used by IDEO/Facebook/Google/fans of Design Thinking.

“How Might We …” is a collaborative technique to generate lots of solutions to a challenge. Our team modified the technique slightly to ensure that we also prioritise those solutions. More on that below …

In essence “How Might We …” frames problems as opportunity statements in order to brainstorm solutions. For example:

  • How Might We promote our new service to the audience?
  • How Might We improve our membership offering?
  • How Might We completely re-imagine the personalisation experience?
  • How Might We find a new way to accomplish our download target?
  • How Might We get users excited & ready for the Rio Olympics?

How Might We works well with a range of problem statements. Ideally the question shouldn’t be too narrow or broad.



How Might We sessions involve a mixture of participants: product (Product Owner/BA), technical (Developers/Tech Lead/QA) and stakeholders. The duration is 1 – 1.5 hours.

The format is:

  1. Scene setup (background/constraints/goals)
  2. Introduce the question (How Might We …)
  3. Diverge (generate as many solutions as possible)
  4. Converge (prioritise the solutions)


1. Scene Setup

Scene setup is about introducing the background, constraints, goals & ground rules of the How Might We session.

For example we held a session about: “How Might We get app users excited & ready for the Rio Olympics?” We invited 10 participants across product, technical and stakeholder teams. For 5 minutes we set up the scene. As part of scene setup:

  • Background: Rio 2016 is the biggest sporting event. We expect record downloads & app traffic. There will be high expectations. There will be hundreds of events & hours of live coverage.
  • Constraints: We want to deliver the best possible experience without building a Rio specific app.
  • Session goal: Generate ideas for new features & to promote current features.
  • Commitment: We will take the best ideas forward to explore further.


2. Introduce the question

The How Might We question is presented to participants and put on a wall/physical board.

The question shouldn’t be too restrictive; wording is incredibly important. Check the wording with others before the session. We circulate the question to participants ahead of the session – this allows them to generate some solutions before the meeting.

Framing the question in context/time will help. It makes the problem more tangible. For example:

“It’s 3 days before the Olympics. How Might We get users excited & ready for the Rio Olympics?”


3. Diverge

Use a technique like Crazy 8s to generate ideas. Give people 5-10 minutes to think of many solutions to the question.

These solutions are typically written on post-it notes. At the end of the 10 minutes we ask each participant to stand up and present their post-it note ideas to the group. Participants explain their ideas; common ideas are grouped together. For example:

Post it note ideas

With 10 participants you can generate 50 – 80 ideas. Once ideas are grouped together you can have 20 – 30 unique ideas.


4. Converge

We ask people to pick their favourite idea. It can be their own idea, or another person’s post-it note idea.

For 10-15 minutes they explore that idea in more detail. Participants can add notes/draw user flows/write a description about the idea.

At the end of that time, each participant is asked to present back their idea to the group. For example:

Idea example

Once each participant has presented their idea (10 people = 10 ideas), participants are invited to dot vote. Each participant has 3 votes to select their favourite 3 ideas.

Typically this is where a HMW ends ….

BUT we would often find ourselves in a position where the top voted idea was the most difficult to implement. The top ideas were often elaborate & had a cool factor – but were very complicated to build/offered limited business value. For example: “We could build VR into the app. It would offer all sports in immersive 3D and recommend videos based on the user’s Facebook likes”.

AND we found that stakeholders weren’t comfortable having an equal say (3 dot votes) to QA/developers in terms of the product proposition.

SO we implemented a further step to converge on more realistic options. We took the top voted ideas + any ideas that stakeholders were particularly keen on from the How Might We session. We allowed UX to explore these ideas in more detail. An example of a more refined idea is an Olympics-branded menu.

We took these ideas into the prioritisation session.



With the more refined ideas we held a prioritisation session with the key stakeholders (product owner, tech lead, primary stakeholders).

As a group we would rank these ideas in terms of business value and technical complexity (1-5). The business value was driven by a KPI or agreed mission. The technical complexity was an estimate of effort.

Complexity 5 = hard

Complexity 1 = easy

Impact 5 = high impact

Impact 1 = low impact

We would end up with a relative ranking of the top ideas. For example:

Cost Value example

The top left quadrant is tempting (high impact, low effort). The bottom right quadrant is not tempting (low impact, high effort).

We used the relative weightings & dot voting to select the best idea. We would go on to shape & build the best idea.