PAC and ARIA-AT: Helping to Define and Evaluate Screen Reader Support Expectations

April 16, 2023

Introduction

In 2023, getting web accessibility right is still considered too difficult by many, particularly when compared with the facilities provided on some native platforms. Ensuring that a user interface is not only technically conformant with standards, but also incorporates best practices to make it inclusive, approachable and welcoming to all audiences is an undeniable technical challenge.

Often, designers and developers alike find that the web platform does not provide out-of-the-box facilities for a design pattern or widget that they've deemed appropriate for their use case. When provisions do exist, they cannot always be styled to meet requirements, may not be fully accessible, and/or may not perform consistently across browsers. The Accessible Rich Internet Applications (WAI-ARIA) suite of web standards is designed so that, when custom components do need to be built, they can be made accessible. But support for ARIA roles, states and properties within user agents is far from guaranteed, frequently leading to stakeholder decisions such as:

  • overestimation of browser/access technology support for a given pattern;
  • underestimation of that support, leading to the creation of an alternative that may not adequately meet perceived business requirements; or
  • exhaustive testing of a proposed approach with access technology to determine viability.

All three of these are problematic in some way. Overestimation leads to potentially inaccessible experiences, and shifts the burden onto access technology and/or browser vendors. Underestimation results in compromise, attaching negative connotations to accessibility work. And testing is a specialized skill in itself, requiring significant investment of time and knowledge (or money if a third-party team is engaged to carry it out), as well as the creation of one or more prototypes against an approach which may ultimately be determined to be unworkable.
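To make the kind of custom component in question concrete, here is a minimal sketch, written in TypeScript against the DOM, of a two-state toggle button exposed through ARIA roles, states and properties. The element ID, label and handler names are purely illustrative.

    // Minimal sketch of a custom two-state toggle button exposed via WAI-ARIA.
    // The ID, label and handler below are illustrative, not a recommended pattern.
    const toggle = document.createElement("div");
    toggle.id = "mute-toggle";
    toggle.textContent = "Mute";
    toggle.setAttribute("role", "button");        // role: exposed as a button
    toggle.setAttribute("aria-pressed", "false"); // state: initially not pressed
    toggle.tabIndex = 0;                          // make the element keyboard focusable

    function flip(): void {
      const pressed = toggle.getAttribute("aria-pressed") === "true";
      toggle.setAttribute("aria-pressed", String(!pressed));
    }

    // A native <button> would provide activation behavior for free; a custom
    // element must wire up both pointer and keyboard activation itself.
    toggle.addEventListener("click", flip);
    toggle.addEventListener("keydown", (event: KeyboardEvent) => {
      if (event.key === "Enter" || event.key === " ") {
        event.preventDefault();
        flip();
      }
    });

    document.body.append(toggle);

Whether the role, name and pressed state set up here are actually conveyed to users depends entirely on the access technology and browser in play, which is precisely the support question described above.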

Many of these problems have been solved in the browser space. There are entire databases of tests targeted at the various web platform features, exercising use cases ranging from common to unusual. Automation is in place to rerun tests at regular intervals or when triggered by a new browser release, and various projects like Can I use... exist to collate and present the data.

Historically, the same could not be said for access technology. PAC is proud to be involved in efforts to effect change in this area, with funding provided by Meta. This blog post describes what we're doing and how, with links to more information and ways to get involved.

ARIA and Assistive Technologies (ARIA-AT) Project

The ARIA-AT project, managed by the corresponding community group, has an ambitious mandate. From the linked home page, it seeks to:

... empower equal inclusion by realizing interoperability for AT users.

In other words, the project seeks to achieve some of the benefits fulfilled by web browser interoperability testing, but for access technologies (AT). Starting with screen readers, expectations for accessibility semantics are proposed, reviewed and agreed upon within the community group, and then tested.

In later phases, feedback is sought directly from access technology vendors, allowing them to have their say on the correctness of those proposals. Along the way, the generated data can facilitate bug reports against access technologies. This translates into direct improvements for disabled users and greater support levels and flexibility for web content authors.

Keep reading for more details on PAC's role within the project, how we go about defining expected screen reader behavior without dictating conformity in software responses, and how we're contributing to workstreams to scale up the amount of testing being performed.

Proposing AT Behavior Expectations

The goal of the ARIA-AT Project is to propose and test all expectations for all accessibility semantics. We're looking for agreement on the expectations that need to be met, and for them to be re-tested every time an AT or browser vendor makes a change that could affect how the semantics are conveyed. But how exactly do we define what those expectations should be in a testable way?

1. Start with an Example

We begin with a working implementation of the semantic(s) being targeted by a given test plan. To date, these have been pulled from the ARIA Authoring Practices Guide (APG). In the future we plan to bring samples in from additional sources, and maybe develop our own for more targeted testing.

At this stage, we also assess whether any modifications to the example are required to match our objectives. For instance, an APG page may include two sets of radio buttons with a thematic connection between them. We only need to test one such grouping, and so one of them can be removed. Likewise, we cut out any page content that won't play a role in our testing, such as design pattern documentation.

2. Assess Current Support Levels

Next, we manually evaluate the example's behavior with all of the AT/browser combinations currently being targeted by the project, and carry out background research as appropriate. This is a critical step; among other things it allows us to:

  • examine terminology used within existing AT responses;
  • note down which information is conveyed by different ATs, and any additional or missing pieces that we think our tests should include; and
  • judge how complex and/or contentious our proposed expectations are likely to be.

3. Establish Testing Tasks

A test plan consists of a collection of tests, with each test embodying a task a user is likely to need to carry out. From the behavior within the example, and the results of our manual evaluation, we derive a list of these tasks and their requirements. For instance, when testing a two-state toggle button, our testing tasks would be:

  • Navigate to the button from different directions, by moving both the system focus and the screen reader cursor, in all supported states.
  • Toggle the button between its supported state permutations by activating it.
  • Read the button in all of its supported states.

Once these tasks have been enumerated, we assemble all of the following for each one:

  • which screen reader mode(s) the task should be carried out in;
  • the user's starting point within the example;
  • the command(s) or command sequence(s) a tester must carry out (which differ across screen readers);
  • any relevant instructions for the tester;
  • JavaScript code needed to set up the page state before testing; and
  • reference links relating to the semantics being targeted by the test (e.g. one or more links to roles, states and/or properties within the WAI-ARIA specification).
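By way of illustration, the sketch below models one such test as a TypeScript object. The field names, command sequences and setup helper are placeholders of our own, not the project's actual test format.

    // Hypothetical shape for a single test within a test plan; field names
    // are illustrative and do not mirror the real ARIA-AT test format.
    interface ToggleButtonTest {
      task: string;                       // the user task this test embodies
      mode: "reading" | "interaction";    // screen reader mode to use
      setupScript: string;                // JavaScript run against the example beforehand
      instructions: string[];             // additional guidance for the tester
      commands: Record<string, string[]>; // per-screen-reader command sequences
      references: string[];               // links to the relevant WAI-ARIA semantics
    }

    const navigateToPressedToggle: ToggleButtonTest = {
      task: "Navigate forwards to a pressed toggle button",
      mode: "reading",
      setupScript: "setButtonToPressedState()", // illustrative helper name
      instructions: ["Start with focus on the link before the button."],
      commands: {
        "NVDA": ["Down Arrow", "B"],
        "JAWS": ["Down Arrow", "B"],
      },
      references: [
        "https://www.w3.org/TR/wai-aria-1.2/#button",
        "https://www.w3.org/TR/wai-aria-1.2/#aria-pressed",
      ],
    };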

4. Assert Expected Behavior

Once a tester has carried out a command or command sequence within a test, they respond to a series of assertions about the access technology's response. For instance, if navigating to a toggle button named "Mute" in the pressed state, we would assert that:

  • the role of "toggle button" is conveyed;
  • the name "Mute" is conveyed; and
  • the state of "pressed" is conveyed.

Despite the quotes in these examples, we're not pursuing absolute uniformity in how all access technologies convey information. For example, a screen reader may present a role of "button" as "btn" in braille, but the full word "button" in speech. Or, one screen reader may choose to convey a checked radio button as being "selected", because that fits with platform convention.

Localization is also a factor. A screen reader configured to output information in Spanish will use different terminology from one configured for a French speaker. The response to some assertions, therefore, is open to tester interpretation, while others, such as the name of a control defined within the example, must match verbatim.

Finally, each assertion is marked as either:

  • required, for information that we feel must be conveyed; or
  • optional, for aspects of behavior that we feel are more of a "nice to have".

Tester responses to required assertions are the primary factor in determining support levels. However, testers also provide a record of the exact screen reader output, and can choose to indicate the presence of unexpected or undesirable outcomes such as excess speech verbosity.
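As a rough illustration, the assertions for the pressed "Mute" button described above might be captured as follows. The structure and priority labels are a sketch of our own, not the project's actual assertion format.

    // Illustrative assertion records for navigating to a pressed "Mute" toggle
    // button. The exact wording a screen reader uses may legitimately vary.
    type Priority = "required" | "optional";

    interface Assertion {
      statement: string; // what the tester judges the screen reader output against
      priority: Priority;
    }

    const muteButtonAssertions: Assertion[] = [
      { statement: "Role 'toggle button' is conveyed", priority: "required" },
      { statement: "Name 'Mute' is conveyed", priority: "required" },
      { statement: "State 'pressed' is conveyed", priority: "required" },
      // A hypothetical "nice to have": a hint about how to operate the control.
      { statement: "Hint about how to toggle the button is conveyed", priority: "optional" },
    ];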

See it in Action

To see the outcome of these steps within our existing test plans, consult the test plan preview facility.

Test Administration

Together with the compilation of test plans, PAC is responsible for overall test administration within the ARIA-AT application. This includes, but is not limited to:

  • Adding test plans to the test queue, assigning testers to them, and resolving conflicts between result sets. We also run through all test plans at least once, with every AT/browser combination being targeted.
  • Publishing test reports, and shepherding test plans through the various phases within the project's Working Mode.
  • Liaising with access technology vendors to make them aware of which test plans are ready to review, facilitate the feedback process, and respond to any suggestions or concerns they raise.

Scaling up Testing via Automation

PAC is grateful to all testers, both within and outside of the ARIA-AT Community Group, who have volunteered their time to run through test plans and provide data. Particularly for more complex semantics, this can be a slow and arduous process, with a great deal of care needed to ensure that the provided responses are accurate.

Given this context, and the eventual need for test plan runs to be re-executed following new browser and AT releases, human testing is a bottleneck of which the project is only too aware. To help scale testing efforts, an automated approach is being actively explored, based on an AT Driver API standard being created by Bocoup.

For our part, PAC has created a proof-of-concept implementation of this standard for the NVDA screen reader. So far, the implementation includes:

  • Base client/server architecture and protocol session, command, response and event handling.
  • Retrieval and modification of screen reader settings.
  • Capture of NVDA speech output to be sent to remote clients.
  • Actuation of keypresses and other input gestures.

In developing the first implementation of this standard, we are hoping to prove the potential of the specification, pave the way for other AT vendors to incorporate it into their own products, and see it utilized for semi- and fully-automated result gathering within the ARIA-AT project. Beyond that, we believe that the potential use cases for this technology are varied and wide-ranging.
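To give a flavour of what driving a screen reader over such a protocol might look like, here is a hedged TypeScript sketch of a client session built on the ws package. The port, command names and message shapes are placeholders of our own and are not taken from the AT Driver specification.

    // Hypothetical AT Driver-style client session. The URL, command names and
    // message shapes below are placeholders, not the specification's vocabulary.
    import WebSocket from "ws";

    const socket = new WebSocket("ws://localhost:4382"); // placeholder endpoint
    let nextId = 1;

    // Send a JSON command, tagging it with an id so its response can be matched.
    function send(method: string, params: Record<string, unknown>): number {
      const id = nextId++;
      socket.send(JSON.stringify({ id, method, params }));
      return id;
    }

    socket.on("open", () => {
      send("session.new", {});                          // start a protocol session
      send("settings.set", { speechRate: 100 });        // change a screen reader setting
      send("interaction.pressKeys", { keys: ["Tab"] }); // actuate a keypress
    });

    // Responses answer commands by id; events (such as captured speech) arrive
    // unsolicited and let a harness record what the screen reader actually said.
    socket.on("message", (data: WebSocket.Data) => {
      const message = JSON.parse(data.toString());
      if ("id" in message) {
        console.log("response to command", message.id, message);
      } else {
        console.log("event", message.method, message.params);
      }
    });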

Credits

Once again, our thanks go to Meta for providing funding for this work, as well as avenues of ongoing support with vendor outreach, logistics and more. Our thanks also to:

  • Bocoup, for developing the key technology and infrastructure that the project relies on to succeed;
  • the W3C for hosting much of that infrastructure; and
  • all the various community group, working group and task force members who play a key role in driving this work forward.

Internally at PAC, our involvement is a team effort, and we wish to call attention to all of the incredible people who continue to be instrumental in making it happen:

  • Isabel (Isa) Del Castillo has been the primary test author on the majority of ARIA-AT test plans. She has single-handedly written over 5,000 screen reader behavior assertions, itself a hugely specialized task, as well as managing the day-to-day administration tasks described earlier in this post.
  • Job van Achterberg has brought his considerable development expertise to bear on the NVDA implementation of the automation standard, providing key insight into the development of the standard itself along the way. He has also contributed accessibility fixes to the ARIA-AT testing application, and worked on internal tooling that plays an essential role within the test writing process.
  • Sam Shaw has juggled all of the moving pieces of our multiple work streams with aplomb, helping to keep us all on track and not forget our deadlines. He has also kept admirable meeting minutes of highly technical subjects, maintained our test plan tracking and reporting data, and eased the often-complex task of arranging meetings that suit everyone's calendars.
  • James Scholes has diligently managed the overall project direction and relationships with other ARIA-AT stakeholders and external access technology vendors, as well as generously sharing his wealth of deep subject-matter expertise, lived experience, and technical acumen.

Additional Reading

For more information on the ARIA-AT project and concepts presented here, please consult the following resources: