Part 1 and Part 2 of this series provided how-tos and practical tips for creating acceptance tests for web apps. This final post reflects on some of the broader topics behind our acceptance tests.
Aims and drivers of our tests
In my experience, and that of my colleagues, acceptance tests have some clear drivers and aims. They should act as ‘safety rails’, by analogy similar to crash barriers at the sides of roads, that keep us from straying too far from the right direction. Our tests need to ensure development doesn’t break essential functionality. The tests must also provide early warning, preferably within minutes of relevant changes being made to the code.
My advice for developing acceptance tests for Web applications: start simple, keep them simple, and find ways to build and establish trust in your automation code. One of the maxims I use when assessing the value of a test is to think of ways to fool my test into giving erroneous results. Then I decide whether the test is good enough or whether we need to add safeguards to the test code to make it harder to fool. I’m pragmatic and realise that all my tests are imperfect; I prefer to make tests ‘good enough’ to be useful where essential preconditions are embedded into the test. Preconditions should include checking for things that invalidate assumptions for that test (for example, the logged-in account is assumed to have administrative rights) and checking for the appropriate system state (for example, to confirm the user is starting from the correct homepage and has several items in the shopping basket).
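As an illustration, here is a minimal sketch (in Java, using JUnit and WebDriver) of a test that checks its preconditions before doing any real work; the URL and the ‘basket-item’ class name are invented for the example:

import static org.junit.Assert.assertTrue;

import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class ShoppingBasketTest {
    @Test
    public void checkoutShowsDeliveryOptions() {
        WebDriver driver = new FirefoxDriver();
        try {
            driver.get("http://shop.example.com/basket");
            // Precondition: this test assumes the basket already contains items.
            int itemCount = driver.findElements(By.className("basket-item")).size();
            assertTrue("Precondition failed: expected items in the basket, found "
                    + itemCount, itemCount > 0);
            // ... the real test steps follow, confident the assumptions hold.
        } finally {
            driver.quit();
        }
    }
}

If the precondition fails, the test reports the broken assumption directly instead of producing a confusing failure further down the line.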
The value of the tests, and their ability to act as safety rails, is directly related to how often failing tests are a "false positive." Too many false positives, and a team loses trust in their acceptance tests entirely.
Acceptance tests aren’t a ‘silver bullet.’ They don’t solve all our problems or provide complete confidence in the system being tested (real life usage generates plenty of humbling experiences). They should be backed up by comprehensive automated unit tests and tests for quality attributes such as performance and security. Typically, unit tests should comprise 70% of our functional tests, integration tests 20%, and acceptance tests the remaining 10%.
We need to be able to justify the benefits of the automated tests and understand both the return on investment (ROI) and Opportunity Cost – the time we spend on creating the automated tests is not available to do other things, so we need to ask whether we could spend our time better. Here, the intent is to consider the effects and costs rather than provide detailed calculations; I typically spend a few minutes thinking about these factors as I’m deciding whether to create or modify an automated test. As code spends the vast majority of time in maintenance mode, living on for a long time after active work has ceased, I recommend assessing most costs and benefits over the life of the software. However, opportunity cost must be considered within the period I’m actively working on the project, as that’s all the time I have available.
Unlike tests for traditional web sites, where the content tends not to change once a page has loaded, tests for web applications need to cope with highly dynamic content that may change several times a second, sometimes in hard-to-predict ways, triggered by factors outside our control.
As web applications are highly dynamic, the tests need to detect relevant changes, wait until the desired behaviour has occurred, and interrogate the application state before the system state changes again. There is a window of opportunity for each test where the system is in an appropriate state to query. The changes can be triggered by many sources, including user input, such as a test script clicking a button; clock-based events, such as a calendar reminder that is displayed for 1 minute; and server-initiated changes, such as when a new chat message is received.
The tests can simply poll the application, trying to detect relevant changes or timeouts. If the test only looks for expected behaviour, it might spend a long time waiting in the event of problems. We can improve the speed and reliability of the tests by checking for problems, such as error messages.
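Here is a minimal sketch of such a polling helper, intended to sit inside a test class, assuming the reply appears in an element with ID ‘reply’ and errors appear in an element with ID ‘error-message’ (both IDs are invented for the example):

import static org.junit.Assert.fail;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

// Polls for up to 30 seconds; returns as soon as the reply arrives,
// and fails fast if the application reports an error instead.
private void waitForReplyOrError(WebDriver driver) throws InterruptedException {
    long deadline = System.currentTimeMillis() + 30 * 1000;
    while (System.currentTimeMillis() < deadline) {
        if (!driver.findElements(By.id("error-message")).isEmpty()) {
            fail("Application reported an error: "
                    + driver.findElement(By.id("error-message")).getText());
        }
        if (!driver.findElements(By.id("reply")).isEmpty()) {
            return;  // Desired behaviour has occurred; interrogate the state now.
        }
        Thread.sleep(500);  // Poll twice a second.
    }
    fail("Timed out waiting for the reply to appear");
}

Checking for the error element on every pass means a problem is reported within half a second rather than only after the full 30-second timeout.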
Browser-based UI tests are relatively heavy-weight, particularly if each test has to start from a clean state, such as the login screen. Individual tests can take seconds to execute. While this is much faster than a human could execute a test, it’s much slower than a unit test (which takes milliseconds). There is a trade-off between optimizing tests by reducing the preliminary steps (such as bypassing the need to log in by using an authentication cookie) and maintaining the independence of the tests – the system or the browser may be affected by earlier tests. Fast tests make for happier developers, unless the test results prove to be erroneous.
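As one example of such a shortcut, here is a hedged sketch of how a login bypass might look with WebDriver, assuming the application accepts a session cookie called SESSION_TOKEN (the domain, URLs, and cookie name are invented):

import org.openqa.selenium.Cookie;
import org.openqa.selenium.WebDriver;

/** Skips the login screen by installing a session cookie for a test account. */
public static void bypassLogin(WebDriver driver, String sessionToken) {
    // WebDriver only accepts cookies for the domain of the current page,
    // so load a cheap page on the right domain first.
    driver.get("http://mail.example.com/robots.txt");
    driver.manage().addCookie(new Cookie("SESSION_TOKEN", sessionToken));
    driver.get("http://mail.example.com/inbox");
}

Tests that use this shortcut run faster, but they no longer exercise the login page, so at least one test should still take the long way in.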
As with other software, automated tests need ongoing nurturing to retain their utility, especially when the application code is changed. If each test contains information on how to obtain information, such as an xpath expression to get the count of unread email, then a change to the UI can affect many tests and require each of those tests to be changed and retested. By applying good software design practices, we can encapsulate the ‘how’ from the rest of our tests. That way, if the application changes, we only need to change how we get the email count in one piece of code, instead of having to change it in every piece of code.
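For example, the ‘how’ of reading the unread count might be captured in a small helper class like this sketch (the ‘unread-count’ ID is hypothetical):

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

/** Encapsulates how the unread-email count is obtained from the UI. */
public class InboxPage {
    private final WebDriver driver;

    public InboxPage(WebDriver driver) {
        this.driver = driver;
    }

    /** If the UI changes, only this locator needs to be updated. */
    public int getUnreadEmailCount() {
        String text = driver.findElement(By.id("unread-count")).getText();
        return Integer.parseInt(text.trim());
    }
}

Tests then call new InboxPage(driver).getUnreadEmailCount() and never mention the locator themselves.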
Practical tests
Lots of bugs are discovered by means other than automated testing – they might be reported by users, for example. Once these bugs are fixed, the fixes must be tested. The tests must establish whether the problem has been fixed and, where practical, show that the root cause has been addressed. Since we want to make sure the bug doesn’t resurface unnoticed in future releases, having automated tests for the bug seems sensible. Create the acceptance tests first, and make sure they expose the problem; then fix the bug and run the tests again to ensure the fix works. Antony Marcano is one of the pioneers of acceptance tests for bugs.
Although this article focuses on acceptance tests, I’d like to encourage you to consider creating smaller tests when practical. Smaller tests are more focused, run significantly faster, and are more likely to be run sooner and more often. We sweep through our acceptance tests from time to time and replace as many as we can with small or medium tests. The remaining acceptance tests are more likely to be maintained because we know they’re essential, and the overall execution time is reduced – keeping everyone happy!
Further information
A useful tutorial on xpath: http://www.zvon.org/xxl/XPathTutorial/General/examples.html
Google Test Automation Conference (GTAC) 2008: The value of small tests: http://www.youtube.com/watch?v=MpG2i_6nkUg
GTAC 2008: Taming the Beast - How to Test an AJAX Application: http://www.youtube.com/watch?v=5jjrTBFZWgk
Part 1 of this article contains an additional long list of excellent resources.
Part 1 of this series provided practical how-tos to create acceptance tests. Read on to learn how to make your tests more useful.
Once we have some automated acceptance tests, they must be run, without delay, as often as appropriate to answer the concerns of the team. We may want to run a subset of the tests after each change of the code. The process of running tests can be automated and integrated with a source control system that continuously builds the code and runs various automated tests. If the acceptance tests are sufficiently fast and can run unattended, they should be included in the tests run by the continuous build system. One challenge for our acceptance tests at Google is to enable the web browser to run without appearing on screen; our machines don’t typically have a physical or logical GUI. Utilities such as vnc and xvfb can host the web browser and enable us to run our acceptance tests. A useful guide on test automation is the book Pragmatic Project Automation by Mike Clark.
Fluid writing of test automation code smooths over obstacles, coping with the interface twixt web application and your code. When the application has been designed with testing in mind, hooks exist; keyboard shortcuts are proffered; and debug data is available for the asking. Hooks include IDs on key elements such as the search field, enabling tests to identify the correct element quickly, unambiguously, and correctly even as the layout of the UI changes.
Tests that repeat exactly the same steps using the same parameters tread a well-worn path through the application and may side-step some nearby bugs which we could find by changing a couple of parameters in the tests. Ways to change the tests include using external data sources and using random values for number of repetitions, sleeps, number of items to order, etc. You need to be able to distinguish between tests that fail because they are flaky and those that report valid failures in the software being tested, so make sure the tests record the parameters they used in sufficient detail to enable the test to be re-run consistently and predictably.
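One simple way to do this, sketched below as a fragment that might sit in a test’s set-up, is to derive the random values from a seed that is printed with the test output, so a failing run can be repeated with exactly the same values:

import java.util.Random;

// Record the seed so that a failing run can be reproduced exactly.
long seed = System.currentTimeMillis();
System.out.println("Test data seed: " + seed);
Random random = new Random(seed);

int itemsToOrder = 1 + random.nextInt(5);     // order between 1 and 5 items
long pauseMillis = 200 + random.nextInt(800); // vary the pause between actions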
Browsers differ between one provider and another and between versions. Your application may be trouble-free in one browser, yet entirely unusable in another. Make sure your automated tests execute in each of the major browsers used by your users; our list typically includes Internet Explorer, Firefox, Safari, Opera, and Chrome. Tools such as Selenium RC (http://seleniumhq.org/) and WebDriver (http://code.google.com/p/webdriver/) support most of the browsers, and if your tests are designed to run in parallel, you may be able to take advantage of parallel test execution frameworks such as Selenium Grid.
Many web applications are now used on mobile phones such as the iPhone or G1. While there are some early versions of WebDriver for these devices, you may find emulating these devices in a desktop browser is sufficient to give you the confidence you need. Firefox’s excellent extensions and profiles make such testing easy to implement. Safari’s development tools can be used to specify the parameters you need, such as which device to emulate. Here’s an example of how to configure Firefox in WebDriver to emulate a version 1.1 iPhone.
private static final String IPHONE_USER_AGENT_V1_1 =
    "Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420.1 "
    + "(KHTML; like Gecko) Version/3.0 Mobile/3B48b Safari/419.3";

/**
 * Returns a WebDriver instance with settings to emulate
 * an iPhone V1.1.
 */
public static WebDriver createWebDriverForIPhoneV1_1() {
    final String emptyString = "";
    FirefoxProfile profile = new FirefoxProfile();
    // Blank out headers that would otherwise confuse the web server.
    profile.setPreference("general.appversion.override", emptyString);
    profile.setPreference("general.description.override", emptyString);
    profile.setPreference("general.platform.override", emptyString);
    profile.setPreference("general.vendor.override", emptyString);
    profile.setPreference("general.vendorsub.override", emptyString);
    profile.setPreference("general.appname.override", "iPhone");
    profile.setPreference("general.useragent.override", IPHONE_USER_AGENT_V1_1);
    WebDriver webDriver = new FirefoxDriver(profile);
    return webDriver;
}
The user-agent string can be found online in many cases or captured from a tame web server that records the HTTP headers. I use http://www.pycopia.net/webtools/headers, which even emails the values to me in a format I can easily adapt to use in my test code.
Robust tests can continue to operate correctly even when things change in the application being tested or in the environment. Web applications use HTML, so try to add IDs and CSS classes to relevant elements of the application. Although these additions potentially increase the size of the page, they enable easier and more consistent identification, navigation, and selection of the user interface.
Try to avoid brittle identifiers, such as xpath expressions that rely on positional data. For example, /div[3]/div[1] becomes unreliable as soon as the position of either element changes, and the resulting failures can be hard to diagnose because nothing in the test itself appears to have changed.
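To illustrate the difference, assuming a WebDriver instance called driver and an invented ‘search-field’ ID:

// Brittle: breaks as soon as the surrounding layout changes.
WebElement searchBox = driver.findElement(
        By.xpath("/html/body/div[3]/div[1]/form/input[2]"));

// Robust: keeps working while the element keeps its ID, wherever it moves.
WebElement searchField = driver.findElement(By.id("search-field"));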
Add guard conditions that assert your assumptions are still accurate. Design the tests to fail if any of the assumptions prove false. If possible, make the tests fail at compile time to provide the earliest possible feedback.
Try to only make positive assertions. For example, if you expect an action to cause an item to be added to a list, assert that after the action the list contains the expected value, not that the list has changed size (because other functionality may affect the size). Also, if it's not something your test is concerned about, don't make assertions about it.
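A small sketch of the difference, using a hypothetical basketItemTitles() helper that returns the titles currently shown in the basket:

// Avoid: asserting on the size, which other functionality can also change.
// assertEquals(previousSize + 1, basketItemTitles().size());

// Prefer: assert that the specific item we added is now present.
assertTrue("Basket should contain the book that was just added",
        basketItemTitles().contains("Bulletproof Ajax"));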
Help your tests to help others by being informative. Use a combination of meaningful error messages and more detailed logs to help people to tell whether the tests are working as intended and, if problems occur, to figure out what’s going wrong.
Taking screenshots of the UI when a problem occurs can help to debug the issue and disambiguate between mismatches in our assumptions vs. problems in the application. It’s not an exact science: screenshots are seldom recorded at exactly the same time as the interaction with the application; typically they’re recorded afterwards, and the application may have changed in the interim period, no matter how short that period is.
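Recent versions of WebDriver expose a TakesScreenshot interface; a small helper along these lines can be called whenever a test fails (whether it works depends on the browser and the WebDriver version in use):

import java.io.File;

import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;

/** Saves a screenshot of the current browser window, if the driver supports it. */
public static File captureScreenshot(WebDriver driver) {
    if (driver instanceof TakesScreenshot) {
        return ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
    }
    return null;  // Some drivers cannot take screenshots.
}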
Debug traces are useful for diagnosing acute problems, and range from simple debug statements like ‘I made it to this point’ to dumps of the entire state of values returned from the application by our automation tool. In comparison, logging is intended for longer-term tracking of behaviour which enables larger-scale thinking, such as enabling a test to be reproduced reliably over time.
Good error messages should say what’s expected and include the actual values being compared. Here are two examples of combinations of tests and assert messages, the second more helpful than the first:
1. int actualResult = addTwoRandomOddNumbers();
assertTrue("Something wrong with calculation", actualResult % 2 == 0);
2. int actualResult = addTwoRandomOddNumbers(number1, number2);
assertTrue(String.format(
    "Adding two odd numbers [%d] and [%d] should return an even result. Calculated result = %d",
    number1, number2, actualResult),
    actualResult % 2 == 0);
Vint Cerf coined the phrase bit-rot to reflect the decay of usefulness or availability of software and data stored on computers. In science, half-life is a measurement of the decay of radioactivity over time, and is the period taken for the radioactivity to reduce by 50%. Similarly, our tests are likely to suffer from bit-rot and will become less useful over time as the system and its use change.
The only cure for bit-rot is prevention. Encourage the developers to adopt and own the tests.
As our tests get bigger and more complex, let’s add unit tests to help ensure our acceptance tests behave as expected. Mock objects are one practical way to reliably automate the tests, with several good and free frameworks available for common programming languages. I suggest you create unit tests for more involved support functions and for ‘driver’ code, rather than for the tests themselves.
If you think creating automated tests for a web application is hard, try using the web site with accessibility software such as a screen reader to learn just how inaccessible some of our web applications are! Screen readers, like automated tests, need ways to interrogate, interact with, and interpret the contents of web applications. In general, increasing the accessibility of a site can improve testability and vice-versa. So while you’re working hard with the team to improve the testability, try to use the site with a screen reader. Here's one example: Fire Vox, a screen reader for Firefox. http://firevox.clcworld.net/about.html
The third and final post of this series will reflect on the aims and challenges of acceptance tests.
Acceptance tests must meet the needs of several groups, including the users and the developers. Long-lived tests must be written in the language of each group, using terms users will recognize and a programming language and style in which the developers are competent.
We create tests by modelling the purpose of a test from the user’s perspective: send a message, order a book, etc. Each test is decomposed into individual actions: to send a message, a user must be logged in, select the compose message icon, specify one or more recipients, type a minimum of either a subject or a message, then select Send. From this list of actions, create a skeleton in the programming language of choice and create a method name that reflects each action. Show these to both the users and programmers and ask them to tell you what they think each step represents. Now is a great time to refine the names and decide which methods are appropriate: before you’ve invested too much time in the work. If you wait until later, your natural protective instincts will make it harder for you to accept good suggestions and make useful changes.
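Here is a sketch of what such a skeleton might look like in JUnit for the ‘send a message’ example; every name is a first draft to be reviewed, and the method bodies are deliberately empty:

import org.junit.Test;

public class SendMessageTest {
    @Test
    public void sendMessageWithSubjectOnly() {
        loginAsValidUser();
        selectComposeMessage();
        specifyRecipient("friend@example.com");
        typeSubject("Lunch on Friday?");
        selectSend();
        verifyMessageAppearsInSentItems();
    }

    // The names below are the part to review with users and programmers
    // before investing time in implementing them.
    private void loginAsValidUser() { /* to be implemented */ }
    private void selectComposeMessage() { /* to be implemented */ }
    private void specifyRecipient(String recipient) { /* to be implemented */ }
    private void typeSubject(String subject) { /* to be implemented */ }
    private void selectSend() { /* to be implemented */ }
    private void verifyMessageAppearsInSentItems() { /* to be implemented */ }
}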
For each method, we need to work out how to implement it in code. How could an automated test select the compose message icon? Do alternative ways exist? An understanding of HTML, CSS, and JavaScript will help you if you plan to use browser automation tools. All the visible elements of a web application are reflected in the Document Object Model (DOM) in HTML, and they can be addressed in various ways: the directions from the root of the document to the element using xpath; unique identifiers; or characteristics possessed by the elements, such as class names, attributes, or link text. Some examples of these addressing options are shown in the Navigation Options illustration below. (Notes: navigation using xpath is much slower than using IDs; and IDs should be unique.)
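As a rough sketch of these addressing options in WebDriver (the element names are invented, and a driver instance is assumed):

// By ID – usually the fastest and most reliable, provided IDs are unique.
WebElement searchField = driver.findElement(By.id("search-field"));

// By class name or link text – uses characteristics of the element.
WebElement firstResult = driver.findElement(By.className("search-result"));
WebElement helpLink = driver.findElement(By.linkText("Help"));

// By xpath – directions from the root of the document; flexible but slower.
WebElement sendButton = driver.findElement(By.xpath("//button[@name='send']"));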
Some actions can be initiated using JavaScript running in the browser. For devices such as the iPhone, changes in orientation when the phone is rotated are triggered this way (see Handling Orientation Events in the Safari Reference Library).
Typically, automated web application tests use JavaScript, either directly or indirectly, to interact with the web application being tested.
Utilities such as recording tools can help reduce the effort required to discover how to interact with the web application. The open-source test automation tool Selenium (http://seleniumhq.org/) includes a simple IDE record and playback tool that runs in the Firefox browser. Recorded scripts can help bootstrap your automated tests. However, don’t be tempted to consider the recorded scripts as automated tests: they’re unlikely to be useful for long. Instead, plan to design and implement your test code properly, using good software design techniques. Read on to learn how to use the PageObject design pattern to design your test code.
Two of the tools I find most useful are Firebug (http://getfirebug.com/), a Swiss Army knife for the Web Browser, and Wireshark (http://www.wireshark.org/), a network protocol analysis tool with a distinguished pedigree. Firebug is extremely useful when learning how to interact with a web application or debug mysterious problems with your tests when they seem to be misbehaving. I encourage you to persist when learning to use these tools – it took me a while to get used to their foibles, but I wouldn’t be without either of them these days.
Several years of experience across multiple project teams have taught us that the tests are more likely to survive when they’re familiar and close to the developers. Use their programming language, put them in their codebase, use their test automation framework (and even their operating system). We need to reduce the effort of maintaining the tests to a minimum. Get the developers to review the automated tests (whether they write them or you do) and actively involve them when designing and implementing the tests.
Typically, our acceptance tests use the xUnit framework; for example, JUnit for Java projects (see http://www.junit.org/). A good source of inspiration for creating effective tests is Gerard Meszaros’ work (see http://www.xunitpatterns.com).
By using effective test designs, we can make tests easier to implement and maintain. The initial investment is minor compared to the benefits. One of my favourite designs is called Page Objects (see PageObjects on the Google Code site). A PageObject represents part or all of a page in a web application – something a user would interact with. A PageObject provides services to your test automation scripts and encapsulates the nitty-gritty details of how these services are performed. By encapsulating the nitty-gritty stuff, many changes to the web application, such as the reordering or renaming of elements, can be reflected in one place in your tests. A well-designed PageObject separates the ‘what’ from the ‘how’.
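A minimal sketch of a PageObject for a login page (the element IDs, and the InboxPage class it returns, are invented for the example):

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

/** Represents the login page: the 'what' a user can do, hiding the 'how'. */
public class LoginPage {
    private final WebDriver driver;

    public LoginPage(WebDriver driver) {
        this.driver = driver;
    }

    /** Logs in and returns the page the application lands on next. */
    public InboxPage loginAs(String username, String password) {
        driver.findElement(By.id("username")).sendKeys(username);
        driver.findElement(By.id("password")).sendKeys(password);
        driver.findElement(By.id("signIn")).click();
        return new InboxPage(driver);  // another PageObject, not shown here
    }
}

A test then reads as new LoginPage(driver).loginAs("test-account", "secret"), and if the sign-in button is renamed, only LoginPage needs to change.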
// Given I have a valid user account and am at the login page,
// When I enter the account details and select the Enter button,
// Then I expect the inbox to be displayed with the most recent email selected.
The previous code consists of three programming comments that are easy for users to read. The actual programming code is entered immediately below each comment. Programming concepts such as literate programming are intended to make the code almost as readable as the textual comments.
Isolate things that change from those that don’t. For example, separate user account data from your test code. The separation makes changes easier, faster, and safer to implement, compared to making updates in the code for each test.
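For example, account details might be read from a properties file that lives alongside the tests rather than inside them; the file name and keys below are invented:

import java.io.FileReader;
import java.io.IOException;
import java.util.Properties;

/** Loads test account details from a file kept alongside the tests. */
public static Properties loadTestAccounts() throws IOException {
    Properties accounts = new Properties();
    FileReader reader = new FileReader("test-accounts.properties");
    try {
        accounts.load(reader);
    } finally {
        reader.close();
    }
    return accounts;
}

A test then calls loadTestAccounts().getProperty("admin.username"), and adding or changing accounts never touches the test code.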
Writing automated tests may be easy for some of you. In my case, I started with some simple example tests and tweaked them to suit my needs. I received boosts from working with more experienced practitioners who were able to correct my course and educate me in how to use various tools effectively. I recommend pairing with one of the developers of the software to be tested when you face a new testing requirement. Their intimate knowledge of the code and your understanding of the tests can form a potent combination. For instance, by working with one of the developers on a recent project, we were able to implement bi-directional injection of JSON messages and capture the responses from the server to test a key interaction between the server and client that was causing problems in production.
I encourage you to try out examples, tweak them, experiment, and plunge in to writing your first automated tests. Learn about AJAX – it underpins the web applications. And learn from more experienced practitioners – I’ve added some links at the end of the article to some of the people I respect who write great acceptance tests, including Antony Marcano and Alan Richardson.
Part 2 of this series helps you create more specialized tests (for example, to emulate mobile web browsers) and gives advice on how to increase the utility and effectiveness of your tests.
Further information
Intermediate work products
‘The intermediate work products have only one real purpose in life: “to help the team make their next move”.’ ‘An intermediate work product might be measured for “sufficiency” — was it sufficient to remind, inform or inspire? Any amount of effort or detail beyond sufficiency is extraneous to the purpose of the team and the purpose of the work product.’ Cooperative game manifesto for software development (Alistair Cockburn)
Cooperative game manifesto for software development at http://alistair.cockburn.us
JUnit info
JUnit in Action, available from Manning Publications Co. (2nd edition, early access, or 1st edition)
JUnit Recipes, by J. B. Rainsberger with Scott Stirling, available from Manning Publications Co.
Firebug info
Introduction to Firebug on Estelle Weyl’s blog, "CSS, JavaScript and XHTML Explained"
Firebug tutorials in the Firebug Archive at Michael Sync's blog
Fun with Firebug Tutorial on the Google Code site
WebDriver info
webdriver on the Google Code site
AJAX resources
Bulletproof Ajax – an incredibly good book on how to write good AJAX code. It starts with the basics and builds reliably and clearly from good foundations. The DOM manipulation code is relevant for implementing your acceptance tests in tools such as WebDriver.
Building a web site with Ajax – again, a book that starts simple and builds a simple application step by step.
Acceptance tests are more A+S than T+G (Antony Marcano, in his blog at testingReflections.com)
A+S => Activities + Specific
T+G => Tasks + General
Alan Richardson: any and everything. For example, see "A generalised model for User Acceptance Testing" and "A little abstraction when testing software with Selenium-RC and Java", both at the Evil Tester blog.