Archive | business analyst

Brian the BA learns about the INVEST criteria

31 Jan



Brian the BA – explains specification by example

29 Jan


User Story Smells

24 Jan


“User story smells” is a term used by Mike Cohn in User Stories Applied. It describes anti-patterns that happen when writing user stories. Mike Cohn provided a number of story smells.

With 9 years of business analysis experience, I decided to write my top 10 story smells. They are based on my observations. I’ve even created a game for people to try.



Smell 1 – Everything in a Sprint should be written as a user story

This seems to happen with less experienced agile teams. They use the story format for everything in a Sprint (e.g. As a developer … I want … So that).

Why is it bad? User stories are written from the perspective of end users. They ensure what you build is anchored on a user need. Technical tasks can be sub-tasks of user stories (preferred option), or just tasks that need to be done to keep the lights on (e.g. renew a cert).

User stories are one type of item in the product backlog. Other types of item include: bugs, tasks, epics and spikes. Items can be in a Sprint without being user stories. Don’t spend time thinking how a technical sub-task can fit into the user story format.


Smell 2 – Stories should be sliced by technology layer, because that’s how our development team will approach them

Teams can have different groups of developers (e.g. front end and backend developers). There can be pressure to slice stories accordingly, because each story will be done by a different development team. Another reason is that breaking it down by technology layer removes a dependency on other developer teams. This is an artefact of how the development team is split.

The problem with this approach is that technology slices do not produce a valuable deliverable for the end user. The front end slice must plug into the backend to add value. Vertical slices of functionality are preferred to horizontal technology slices. Vertical slices are much more likely to be potentially shippable.



Smell 3 – Stories don’t need acceptance criteria

This is a strange one – I’ve seen it before. The idea is that the BA/product team should not solutionise. They should present the user need/story to the developer and not come with a list of acceptance criteria/constraints.

The problem is – you need a clear outcome for a story. And there are often clear requirements from the business, or constraints to be considered. Just putting AS A … I WANT … SO THAT and leaving out the acceptance criteria means you won’t know when a ticket is done. It’s not specific enough.

Collaborative specifications, or collaborative specification reviews (e.g. 3 Amigos) work around this. Stories have to have acceptance criteria in order to be testable and closeable.



Smell 4 – The product owner is a user

One of the most common smells. The product owner is a proxy for the user, but 9 times out of 10 they’re not the end user of the service.

The user in a user story has to be an end user of the system. They can be personas/types of user (e.g. admin, front line staff, loyal user etc). Product Owners, BAs, members of the dev team are not the end users.

Writing AS A product owner I WANT something SO THAT value isn’t a user story.



Smell 5 – Acceptance criteria must specify how features look & behave

Some developers like lots of detail. And that’s OK … but generally speaking acceptance criteria specify behaviour (i.e. what the system does in certain scenarios). They don’t need to specify how it looks.

There can be times when describing how a feature looks is useful – or even necessary. Generally attaching a visual or link to a component library is sufficient.

A picture is worth a thousand words.



Smell 6 – System-wide NFRs should be written as user stories

NFRs are tricky. There are obviously NFRs that affect the end user e.g. system availability. They can convincingly be written in the user story format.

One problem with writing system-wide NFRs as user stories (e.g. availability, system backups) is that they cut across the entire system. It’s difficult to test these NFRs until the entire system is built. I prefer to have system wide NFRs either as “definition of done criteria” which get tested against each ticket, or as items for regression testing at the end of a release.

Story-specific NFRs might be written as ACs against a ticket (e.g. audit log for a reduction decision).



Smell 7 – Specifying what the user wants is enough!

I’ve seen several people excluding the 3rd line of a user story. It’s the reason why the user wants something – the 3rd line helps us to understand why we’re doing the work.

The 3rd line of the user story (So that … ) can be driven from user research, or observations, or data analytics etc. Either way we need to understand the why before we start to solve the problem. At the minimum a story needs to include “So that”. This helps with prioritisation.
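A few of the smells so far (3, 4 and 7) are mechanical enough to check automatically. Below is a rough sketch of such a check, not a tool I actually use. The `Story` class, the `find_smells` function and the list of non-user roles are all made up for illustration, and the check assumes stories follow the AS A … I WANT … SO THAT wording.

```python
import re
from dataclasses import dataclass, field

# Roles that are not end users of the system (Smell 4).
# This list is an assumption; adjust it for your own team.
NON_USER_ROLES = {"product owner", "developer", "business analyst", "ba", "tester"}

# Matches "As a <role> I want <want> [so that <why>]", case-insensitively.
STORY_PATTERN = re.compile(
    r"as an? (?P<role>.+?),? i want (?P<want>.+?)(?:,? so that (?P<why>.+))?$",
    re.IGNORECASE,
)


@dataclass
class Story:
    text: str
    acceptance_criteria: list[str] = field(default_factory=list)


def find_smells(story: Story) -> list[str]:
    """Return short labels for the smells detected in a story."""
    match = STORY_PATTERN.match(story.text.strip())
    if match is None:
        return ["not in AS A / I WANT / SO THAT format"]

    smells = []
    if match.group("role").strip().lower() in NON_USER_ROLES:
        smells.append("role is not an end user (Smell 4)")
    if not match.group("why"):
        smells.append("missing SO THAT (Smell 7)")
    if not story.acceptance_criteria:
        smells.append("no acceptance criteria (Smell 3)")
    return smells


if __name__ == "__main__":
    bad = Story("As a product owner I want a dashboard")
    for smell in find_smells(bad):
        print(smell)
```

A check like this only catches the obvious cases. Whether a story is valuable or appropriately detailed still needs a conversation.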



Smell 8 – User stories should be incredibly detailed

User stories should specify the appropriate level of information. There’s a tendency from BAs, and sometimes from the development team, to try to put all the information they have into a ticket.

Having a ticket that is too detailed adds little value. It makes it likely that people will scan over the ticket and miss the most important information. An incredibly detailed ticket is not necessarily better than a less detailed ticket – it’s about having the appropriate level of information.

As a story is worked on it might be that more detail emerges. But a story should contain enough information for the team to develop and test it.



Smell 9 – User stories can depend on other stories in the Sprint

Ideally user stories should meet the INVEST criteria. That means each story should be independent.

Unless it’s agreed at Sprint planning & made visible on the ticket – all user stories should be independent. There may be cases where two dependent stories are brought into the same Sprint – however the goal should be that stories do not depend on other stories.



Smell 10 – Stories should be very small

This is more for teams that are using Gherkin & TDD, however some teams aim to have very small user stories. Almost at the level of a handful of scenarios.

One advantage of smaller user stories is that we can track progress in a Sprint to a more granular level. But a note of caution – small user stories are essentially a grouping of scenarios. They can make the Sprint board less manageable and in themselves deliver very little value to a user. For very small stories it is difficult to make them independent and valuable.


The game

Here’s a link to a game we created. It lists the 10 smells + 10 example bad user stories. See if you can match them:

It’s a great team exercise – with either a product or a BA team. It helps reiterate some of the key points above. And makes examples tangible.

Any smells I’ve missed? Enjoy!


Bit of BA humour from me …

24 Jan



Try this at your next Agile meetup. Let me know your feedback!

24 Jan


New cartoon published

26 Nov

My latest cartoon was published on Modern Analyst. Very happy with it:

10 lessons from A/B testing

24 Aug


We implemented A/B testing into our product 6 months ago. During that time we conducted a variety of A/B tests to generate insights about our users’ behaviour. We learnt a lot about our specific product. More generally, we learnt about how to run valuable A/B tests.

Below is a Buzzfeed-esque TOP 10 LESSONS I learnt RUNNING A/B TESTS. It’s tips & tricks – plus things to avoid doing. It’s written from a product/BA perspective.




Lesson 1: A/B vs MVT testing


A/B and MVT testing are very similar. In fact the terms are sometimes used interchangeably.

A/B and MVT tests both serve up different experiences to the audience and measure which experience performs the best. They are both run with the same 3rd party tools (e.g. Optimizely, Maxymiser) and have similar experiment lifecycles.

The key difference between A/B and MVT tests is how many elements they vary to the audience.

A/B test

This is where you change one element of a page (e.g. the colour of a button). You might compare a blue button (challenger) against a red button (control) and examine what effect the button’s colour has on user behaviour. For example:

| Button Colour | Variant Name |
| Blue          | Challenger   |
| Red           | Control      |

Pros: Simple to build, faster results, easier to interpret results

Cons: Limited to one element of a user experience (e.g. button colour)

As a note – A/B tests aren’t limited to 2 variants. You could show a blue button, red button, purple button etc; as long as you change only one element of an experience (button colour) it’s an A/B test.

MVT test

This is where you change a combination of elements. You might compare changing the button colour and its text label. You would test all combinations of those changes and see what effect it has on user behaviour. For example:

| Button Colour | Text Copy  | Variant Name |
| Blue          | Click here | Challenger 1 |
| Blue          | Click      | Challenger 2 |
| Red           | Click here | Challenger 3 |
| Red           | Click      | Control      |

Pros: Greater insights, identifies the optimal user experience, more control

Cons: Longer to get results, more complex, requires more traffic
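To show why MVT tests need more traffic, here is a small sketch that lists every combination of the elements being changed. The `build_variants` helper and the element names are made up for illustration. In practice the testing tool generates the combinations for you.

```python
from itertools import product


def build_variants(elements: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return every combination of element values (a full-factorial design)."""
    names = list(elements)
    return [dict(zip(names, combo)) for combo in product(*elements.values())]


# The MVT example above: 2 button colours x 2 text labels = 4 variants.
mvt_variants = build_variants({
    "button_colour": ["Blue", "Red"],
    "text_copy": ["Click here", "Click"],
})

# An A/B test changes a single element, so the count grows linearly.
ab_variants = build_variants({"button_colour": ["Blue", "Red", "Purple"]})

if __name__ == "__main__":
    for variant in mvt_variants:
        print(variant)
    print(len(mvt_variants), "MVT variants,", len(ab_variants), "A/B variants")
```

Each extra element multiplies the number of variants, and every variant needs its own share of users.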

Which one to pick?

This depends on what you want to test & your testable hypothesis. In the early stages of running experiments you might start with A/B tests and then move onto MVT tests. This is because A/B tests are simpler to create & interpret. MVT tests are slightly more complex but provide greater product insight.

As an example: we ran an MVT experiment where we changed the promotional copy on a page and a CTA label. We thought both elements would impact the click-through rate. Looked at individually, the winning promotional copy was emotive copy and the best CTA was “Get started”. However, the optimal variant was descriptive copy with “Get started”. Why? Perhaps because the tone between the two elements was more aligned. If we had run this as 2 A/B tests then we wouldn’t have identified the optimal combination.


Lesson 2: Have a clear hypothesis


An experiment is designed to test a hypothesis. The purpose of an experiment is to make a change and analyse the effect. Tests need to have a clear reason and a measurable outcome.

When creating an A/B test it’s crucial to create a clear hypothesis. What is the problem you’re trying to solve? What are the success metrics? Why do you think this change will have an effect?

We use a variation of the Thoughtworks format to write testable hypotheses:

We predict that <change>

Will significantly impact <KPI/user behaviour>

We will know this to be true when <measurable outcome>

By having clearly defined hypotheses we can:

  1. Compare the merits of different hypotheses and select the most valuable one first. For example if hypothesis 1 predicts a 5% uplift in a KPI and hypothesis 2 predicts a 50% uplift in the same KPI, then we would test hypothesis 2 first.
  2. Agree the success metric upfront before starting development. For example if changing the mobile navigation is the test, what are the success metrics: more users clicking on the menu button, more items in the menu being clicked, increased usage and retention of brand new users? Having clear success metrics/goals is key when trying to identify the winning variant later on.
  3. Ensure the test is focussed on solving a user problem or improving a KPI that matters to the product. We don’t want to run tests simply because we can – they need to solve problems and offer benefits. The above format aligns each test with business KPIs/user problems.
  4. Make it incredibly easy for anyone to generate a hypothesis. The Thoughtworks format means that anyone in our team can generate a hypothesis. Some of the best ideas we’ve had are from “non-creatives” such as QA.

Note – we often put a “background” section with research in the testable hypothesis (e.g. how many people currently use a feature, industry average, user feedback etc).
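If you keep hypotheses in a backlog tool or spreadsheet, the format above maps neatly onto a small record. This is only a sketch: the `Hypothesis` class, the `predicted_uplift` field and the `prioritise` helper are hypothetical names, and ranking purely by predicted uplift is a simplification of how we actually compared hypotheses.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    change: str
    kpi: str
    measurable_outcome: str
    predicted_uplift: float  # e.g. 0.05 for a 5% uplift (assumed field, used for ranking)

    def render(self) -> str:
        """Write the hypothesis out in the three-line format."""
        return (
            f"We predict that {self.change}\n"
            f"Will significantly impact {self.kpi}\n"
            f"We will know this to be true when {self.measurable_outcome}"
        )


def prioritise(hypotheses: list[Hypothesis]) -> list[Hypothesis]:
    """Order hypotheses so the largest predicted uplift is tested first."""
    return sorted(hypotheses, key=lambda h: h.predicted_uplift, reverse=True)


if __name__ == "__main__":
    backlog = [
        Hypothesis("changing the CTA label", "click-through rate",
                   "click-through rate rises by 5%", 0.05),
        Hypothesis("simplifying the mobile navigation", "click-through rate",
                   "click-through rate rises by 50%", 0.50),
    ]
    print(prioritise(backlog)[0].render())
```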


Lesson 3: Forecast sample size


When designing an A/B experiment it’s crucial to calculate the sample size. You will need to forecast the sample size required to detect the MDE (Minimal Detectable Effect). This forecast will inform:

  1. Whether you can run the experiment (do you have enough users?)
  2. The maximum number of variants you can create
  3. What proportion of the audience will need to be in the experiment
  4. Potentially the experiment duration (e.g. it will take 2 weeks to get that many users)

There are several tools online to help you forecast. Without upfront forecasting you run the risk of creating an experiment that will never reach an outcome.

For example: imagine your product has 100k weekly users. You plug in the numbers and forecast that each variant requires 22k users to detect a 0.05 statistical effect size. That means you should build no more than 4 variants, otherwise you won’t detect a significant result. At least 44% of users need to be in the experiment (22% see a variant, 22% see a control). If the change is radical, based on these numbers you may only want to create one variant; this is because you don’t want to show the experiment/significant UX changes to a large proportion of the audience.
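The arithmetic in that example can be sketched in a few lines. The sample size function below uses the standard normal-approximation formula for comparing two conversion rates. It is not the calculator we used, and the baseline rate, significance level (`alpha`) and power are assumptions you would replace with your own. The two helper functions reproduce the "no more than 4 variants" and "at least 44%" figures from the example.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate: float, mde: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Users needed per variant to detect an absolute uplift of `mde`
    over `baseline_rate` with a two-sided two-proportion z-test."""
    p1 = baseline_rate
    p2 = baseline_rate + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


def max_arms(weekly_users: int, users_per_arm: int, weeks: int = 1) -> int:
    """Largest number of arms (control included) the audience can fill."""
    return (weekly_users * weeks) // users_per_arm


def min_traffic_share(weekly_users: int, users_per_arm: int,
                      arms: int = 2, weeks: int = 1) -> float:
    """Share of the audience needed to fill `arms` arms."""
    return arms * users_per_arm / (weekly_users * weeks)


if __name__ == "__main__":
    # Figures from the example: 100k weekly users, 22k users per variant.
    print(max_arms(100_000, 22_000))
    print(min_traffic_share(100_000, 22_000))
    # Hypothetical inputs: 10% baseline conversion, 2 point absolute uplift.
    print(sample_size_per_variant(baseline_rate=0.10, mde=0.02))
```

Running the experiment for more weeks (the `weeks` argument, my addition) is the other lever if you don’t have enough users in a single week.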


Lesson 4: More variants the better

Optimizely ran an analysis of their customers’ successful A/B tests. What they found was interesting. The more variants run in an experiment (up to a limit), the more likely you are to find an effect. Why?

One reason is that if you ask UX to create 2 variants they may create two similar visuals. If you ask them to create 8 there might be greater differences between them. It’s likely with 2 variants you’re playing it safe. The Optimizely results suggest running about 5 variants in a test.


Lesson 5: Implement a health metric

The purpose of a health metric is to ensure that an experiment doesn’t maximise one KPI (the experiment’s primary goal) to the detriment of other KPIs. Popular health metrics include: average weekly visits, content consumption, session duration etc. Essentially health metrics are key business KPIs you don’t want to see go down during an experiment. If the health metric fails, then you pull the experiment early, or do not release the winning variant.

For example: imagine you have 3 variants of a sign-in prompt. One variant of the prompt is non-dismissible. If your primary goal is to maximise sign-ins then this variant will win. However the variant could be so annoying that it reduces overall user engagement with the product. Your health metric ensures you don’t maximise sign-ins to the detriment of core product KPIs (e.g. average weekly sessions).

In our case – the BA worked with stakeholders/the product owner to identify & track the health metrics. The health metrics will vary depending on the product.
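As a rough sketch, the health metric acts as a guardrail that is checked before the primary goal. The `decide` function and the 2% tolerance below are assumptions for illustration. The actual threshold is something to agree with stakeholders.

```python
def decide(primary_uplift: float, health_change: float,
           max_health_drop: float = 0.02) -> str:
    """Judge a variant against the control.

    Both inputs are relative changes versus the control, e.g. 0.30 means
    +30% and -0.10 means -10%. `max_health_drop` is an assumed tolerance.
    """
    # The health metric is checked first: a failed guardrail overrides any win.
    if health_change < -max_health_drop:
        return "stop: health metric failed"
    if primary_uplift > 0:
        return "candidate winner"
    return "no improvement"


if __name__ == "__main__":
    # Hypothetical numbers for the non-dismissible prompt: sign-ins up,
    # average weekly sessions down.
    print(decide(primary_uplift=0.30, health_change=-0.10))
```

A real decision would also check that both changes are statistically significant, which this sketch leaves out.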



Lesson 6: Get management buy in

Based on experience, I recommend getting management buy-in early on. A/B testing is a significant culture change. It challenges the idea that a Product Owner/UX/Managers know what the best user experience is. It replaces gut decisions with data-based decisions. Essentially A/B testing can transition a team from a HIPPO culture (HIghest Paid Person’s Opinion) to a data-driven culture.


There are a variety of tactics for getting management buy-in for A/B testing:

  1. Ensure the 1st A/B test you run offers real business value. Don’t run a minor/arbitrary change as your 1st test. Try to solve an important problem or turn the dial on a key business KPI. Even better if the result might challenge existing beliefs.
  2. Reiterate the benefits of A/B testing. These include:
    • Increasing collaboration by empowering the team to generate their own hypotheses, which can be delivered as “small bets”
    • Increasing openness by encouraging a data-driven culture to decision making, rather than a HIPPO culture
    • Increasing innovation by learning more about user behaviour and adapting the product
    • Increasing innovation because delivering changes to a sub-set of the live audience means you can experiment more and take more risks
    • Challenging assumptions and decisions to create a more valuable product. Gut feelings can be wrong
    • Small bets are better than big bets. They are less risky & can have significant user benefits
    • Empowering the team to improve the quality of solutions
  3. Create experiments in collaboration with the entire team so that it’s not seen as a threat to the PO/UX
  4. Create a fun testing environment. Get people to place bets on the winner.


Lesson 7: Assumptions can be wrong


We’ve had several examples of where our assumptions about user behaviour were wrong.

Our 1st A/B test was a prompt. We thought it would increase usage of a new service. We were so confident about it as an in-app notification that we were going to make it a re-usable component. We actually had 3 more prompts on the roadmap.

What did we find out with an A/B test? The prompt significantly reduced general usage of the app. It was a dramatic drop in usage. The results challenged our assumptions and changed our roadmap.

By having a control group that we could compare against & by serving the experiment to a sub-set of the audience we were able to challenge our assumptions early & with a relatively small subset of users.
We never put the prompt live. Test your assumptions.


Lesson 8: Broadly it’s a 6-step process

This is a slight simplification – below is the typical lifecycle of an experiment.


STEP 1 – Business goals

Identify the business goals (KPIs) and significant user problems for your product.

STEP 2 – Generate hypotheses

Generate testable hypotheses to solve these goals/problems. Prioritise the most valuable tests.

STEP 3 – Create the test

  • Work with UX & developers to create n variants
  • Forecast the number of users required for the MDE
  • Decide on traffic allocation (e.g. 50% see A, 50% see B)
  • Identify target conditions (e.g. only signed in users, only 10% of users)
  • Implement conversion goals (one primary and optional secondary goals)
  • Implement the health check
  • Set the statistical significance level

STEP 4 – Run the experiment

  • Run the experiment for at least 1 business cycle
  • Actively monitor it
  • Potentially ramp up number of users

STEP 5 – Analyse results

  • Review the performance of variants
  • Analyse the health check
  • Identify winner
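The testing tool reports significance for you, but it helps to know roughly what sits behind "identify winner". Below is a sketch of a two-proportion z-test, one common way to compare conversion rates. I’m not claiming it is the method any particular tool uses, and the conversion numbers in the usage example are invented.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion
    rate between control (a) and challenger (b)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Invented numbers: 22k users per arm, 10% vs 11% conversion.
    z, p = two_proportion_z_test(conv_a=2200, n_a=22_000,
                                 conv_b=2420, n_b=22_000)
    significance_level = 0.05  # set in step 3, before the test runs
    print(f"z={z:.2f}, p={p:.4f}, significant={p < significance_level}")
```

A significant primary goal is only half the story. The winner also has to pass the health check.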

STEP 6 – Promote the winner

  • Promote the winner to 100% of the audience
  • Learn the lessons
  • Archive the experiment


Lesson 9: Make testing part of the process


When we started A/B testing we committed to running 3 tests in the first quarter. It was a realistic target. It meant we were either developing a test, or analysing the results of a test (tests typically ran for 2 weeks). The more tests we ran, the easier they were to create.

Getting into a regular cycle is important in the early stages. For any feature or change you should ask “Could we A/B test that?”

I have seen several teams “implement A/B testing” and only run 1-2 tests. The key to getting value from A/B testing is to make it part of the product development lifecycle.


Lesson 10: There’s a community out there …

There’s a huge number of resources out there.


I learnt a huge amount from Olivier Tatard, Sibbs Singh, Sam Brown, Toby Urff and the folks at Optimizely. Big thanks also to the rest of the app team, we all went on the journey together.

If you made it down this far then you get 10 bonus points.