False positives (and false negatives) are a part of automatic accessibility testing that is often not discussed enough. A recent comparison of top automatic accessibility testing tools is further proof of this. The study reveals a lot of consequences, and I wanted to add my own experiences and reflections on it.
What do false positives mean for monitoring and auditing accessibility?
Well – it means that we need to be careful when choosing tools and careful when interpreting their results. For monitoring I prefer tools that avoid false positives, while for auditing I prefer tools that surface more potential issues.
If I used the same tool for both monitoring and auditing – it is obvious that the ratio of false positives could make the situation look worse than it really is.
It is also obvious that a lot of false positives mean we need to spend a lot more time investigating them, documenting them, and ideally also reporting them to the tool vendors (so that they can fix their rules and we get better tools).
A lot of tools don’t really allow manual interventions that dismiss reported issues (once a human has verified them as not being issues), so we risk having the statistics skewed across the whole site. That is also a big problem, especially for monitoring: even when we know the tool made a mistake, we can’t make it ignore that mistake in the future. That steals time and resources, and sometimes even causes people to “fix” what was not really broken at all…
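Some engines do give us at least a partial workaround here. As a minimal sketch (assuming an axe-core integration such as @axe-core/playwright, and a made-up list of rule IDs that a human reviewer has already confirmed as false positives on a given site), a team could exclude those rules so they stop polluting the monitoring trend:

```typescript
// A minimal sketch assuming @axe-core/playwright; other axe-core integrations expose similar options.
import { test, expect } from "@playwright/test";
import AxeBuilder from "@axe-core/playwright";

// Hypothetical rule IDs that a human reviewer has already verified as false positives on this site.
const verifiedFalsePositives = ["color-contrast", "landmark-one-main"];

test("homepage has no unreviewed accessibility violations", async ({ page }) => {
  await page.goto("https://example.com/");

  const results = await new AxeBuilder({ page })
    // Skip rules a human has already verified as false positives,
    // so they do not keep skewing the monitoring statistics.
    .disableRules(verifiedFalsePositives)
    .analyze();

  expect(results.violations).toEqual([]);
});
```

This kind of suppression only works when the tool exposes it, it has to be documented, and it still needs periodic human review – which is exactly the point.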
What do false positives mean for accessibility overlays?
It seems that overlays will not just go away. I am still not convinced they do more good than harm, but I also need to think about them from the business perspective, so here are some thoughts (hopefully helping decision makers make the right choice – not using an overlay, but doing accessibility properly).
I am not familiar with the details of accessibility overlays, but I can speculate that they start with automatic accessibility testing and then try to remediate the findings. This automatically means that the many false positives from the detection phase may lead overlays to “fix” something that is not broken at all. Yet another downside of handing the responsibility for accessibility over to scripts (and/or artificial intelligence).
Tools are just tools and need human interpretation
I cannot stress this enough – it should be obvious that we can’t blindly trust the tools when they deviate this much, when they generate so much noise compared to the real accessibility issues. We need humans with proper training and experience to use the tools, and to help make the tools better at the same time.
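The tools themselves already admit this. As a minimal sketch (assuming plain axe-core run in a browser context; the logging is only illustrative), the results object separates out checks the engine could not decide on its own – and that “incomplete” bucket is exactly where human interpretation has to step in:

```typescript
// A minimal sketch assuming axe-core run directly in the browser (e.g. from a test harness).
import axe from "axe-core";

async function reviewPage(): Promise<void> {
  const results = await axe.run(document);

  // Definite findings still deserve a human sanity check before being reported as defects.
  console.log(`Violations flagged by the engine: ${results.violations.length}`);

  // "incomplete" holds checks the engine could not decide on its own –
  // this bucket explicitly requires a human review.
  for (const item of results.incomplete) {
    console.log(`Needs manual review: ${item.id} – ${item.help}`);
  }
}

reviewPage();
```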
Tools start with good rules, and I wish more resources were invested in projects like the Accessibility Conformance Testing rules (ACT Rules) instead of forking new tools that try to outdo existing ones just for the sake of marketing. False positives should be prevented at all costs, especially when tools are used for regular monitoring, where we rely on trend observations instead of checking the details manually (as we do in audits).
Nevertheless – we need more people, and we need to spread awareness and knowledge, so that we can improve the rules and the tools, and ideally also prevent accessibility issues long before automatic tools are ever run on our products.