Software testing at Facebook and Google in 2018
June 26, 2018
If you’re researching best practices for software testing, Google and Facebook are among the biggest trendsetters in QA. So how are the giants ensuring software quality in 2018?
Let’s see if the Google Testing Blog and Facebook Engineering Blog can help us answer this question. This post will piece together the bits of first-hand info from these two resources, as well as comments from Googlers and Facebookers. Without further ado, read on for a high-level overview of how Google tests software in 2018 or jump to the section covering Facebook.
Software testing at Google
One peculiar thing about Google is the massive codebase it owns. InformationIsBeautiful estimates that Google maintains about 2 billion lines of code for its internet services. To draw a comparison, it takes ~50 million lines of code to operate the Large Hadron Collider.
So how can a company keep software testing efficient at a scale of ~2,000,000,000 LOC? According to Google’s Jeff Listfield, the company was running ~4.2 million tests as of April 2017. Out of that number, only 1.5% (or ~63,000 tests) had flaky runs. Impressive, isn’t it? Here’s how they do it.
Software testing strategy at Google
At Google, ensuring test coverage for code is mostly the job of the developers who authored that code. There are no dedicated testing departments, either. Instead, software testing falls under the jurisdiction of the branch known as Engineering Productivity (Eng Prod).
That doesn’t mean, however, that there are no testers. Quite the contrary, Google has two testing-related roles: Test Engineer and Software Engineer, Tools and Infrastructure.
Test Engineers (TEs)
Test engineers’ main focus is to keep Google’s continuous automation testing infrastructure efficient and effective. The TE role encompasses knowledge sharing, fostering best practices, and tool development. TEs’ goals include:
- Test prioritization (i.e. determining what features need better test coverage).
- Writing test scripts and building testable user journeys that developers can use to test their own code.
- Development of in-depth QA knowledge and expertise in product teams.
- Finding weak spots in the product codebase.
- Finding new ways to break software and identify bugs.
- Fostering communication between product stakeholders to ensure successful releases.
Software Engineers, Tools and Infrastructure (SETIs)
SETIs are engineers building, orchestrating, and continually improving tools, frameworks, and packages used in software development and testing. Originally known as Software Engineers in Test, this role expanded beyond product testing. To mark this move, Google rebranded the SET role to Software Engineers, Tools and Infrastructure (SETIs) in 2016.
The key responsibilities of SETIs include:
- Development of tools for automated testing, software development, performance monitoring, etc.
- Collaboration with TEs aimed to streamline communication and foster the adoption of best practices across product teams at Google.
- Collaboration with other specialists at conferences (e.g. the Google Test Automation Conference).
Code coverage at Google
In a gigantic company like Google, code coverage varies across project teams. According to Marko Ivanković from Google Zürich, attitudes towards code coverage also vary across projects. With this in mind, it makes more sense to look at how Googlers achieve healthy code coverage for their products.
Based on data shared in 2014, Google leverages an internal opt-in system that automatically measures code coverage on a daily and per-commit basis:
- For daily measurements, developers use the Google build system. The system supports a range of measurement tools, including JaCoCo and Emma for Java, Gcov for C++, and Coverage.py for Python. It gathers and analyzes data that help developers identify gaps in code coverage and explore weekly, monthly, and yearly trends.
- For per-commit measurements, there’s an internal tool that diffs code and highlights uncovered areas.
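As a rough illustration of the kind of per-line execution data that tools like Coverage.py record under the hood, here is a toy line tracer built on Python's standard library. The `classify` function is an invented example, and real coverage tools are far more thorough:

```python
import sys

def executed_lines(func, *args):
    """Toy per-line coverage tracer: records which lines of `func` run.
    Tools like Coverage.py collect the same kind of data, far more robustly."""
    hits = set()
    first = func.__code__.co_firstlineno

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            # store line numbers relative to the `def` line
            hits.add(frame.f_lineno - first)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return hits

def classify(n):            # an invented function under measurement
    if n > 0:
        return "positive"
    return "non-positive"   # relative line 3: only runs for n <= 0

hits = executed_lines(classify, 5)
# for n=5 the final return never executes, so it shows up as a coverage gap
```

Aggregated over every commit and test run, gaps like this are exactly what a per-commit diffing tool can highlight to a reviewer.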
Automation testing tools at Google
Knowing that Google’s testers are really software engineers building test frameworks and tools, what are these frameworks and tools? Here’s a short list of the well-known testing solutions built or influenced by Google:
- Selenium. No, Google didn’t build Selenium, but Jason Huggins (the creator of Selenium) worked at Google in 2007 on Selenium RC. Besides, it was the Google Test Automation Conference 2009 where Google and ThoughtWorks agreed to merge Selenium and WebDriver into Selenium 2.0. What followed is now history.
- Protractor. Originally developed for end-to-end testing of Angular apps, Protractor is one of the most popular automation frameworks. Needless to say, engineers from Google use Protractor for many products and play an important part in the development of this framework.
- Espresso. A UI testing framework for Android that also ships with a record-playback tool (Espresso Test Recorder).
- EarlGrey. In addition to UI testing for web applications and Android, Google taps into functional UI testing for iOS with EarlGrey. At Google, this framework is integral to the UI testing of Google apps for iOS, including YouTube, Play Music, Google Calendar, Google Translate, etc.
- GoogleTest. Products that use this C++ test framework include the Chrome browser, Chrome OS, and the computer vision library OpenCV.
- Google Test Case Manager is test management software that the search giant uses internally.
- OSS-Fuzz is Google’s solution for the fuzz testing of open-source software.
- Martian proxy is a library for building programmable HTTP proxies for testing purposes. As pointed out on the project’s GitHub page, Martian proxy isn’t really an official Google product. Rather, it’s just code that Google happens to own.
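A common thread in that list: Selenium, Protractor, and their relatives all converge on the W3C WebDriver protocol, which is plain HTTP plus JSON. A minimal sketch of the two core requests follows; the endpoint paths and payload shapes come from the public spec, while the session id and selector are made up for illustration:

```python
import json

def new_session_request(browser: str):
    """Build the payload for POST /session, which starts a browser session
    (per the W3C WebDriver spec)."""
    return "/session", {
        "capabilities": {"alwaysMatch": {"browserName": browser}}
    }

def find_element_request(session_id: str, css: str):
    """Build the payload for POST /session/{id}/element, which locates
    a single element using a CSS selector."""
    return f"/session/{session_id}/element", {
        "using": "css selector",
        "value": css,
    }

path, body = new_session_request("chrome")
wire = json.dumps(body)  # this JSON is what actually goes over the wire
```

Because every compliant browser driver understands these same requests, a test written against one browser can, in principle, run against any other, which is precisely why Google and Facebook both invest in the standard.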
Manual testing at Google
As the TE role description states, Test Engineers at Google aren’t manual testers. Rather, they act as specialists whose guidance developers use when automating tests. As a result, manual testing is never a go-to method for catching regressions or covering standard functionality.
Still, no company can exclude manual testing from its production process. Google is no exception to this rule. Specifically, manual testing has its place in the following cases:
- Google uses manual tests to identify non-trivial problems with their products (i.e. exploratory testing).
- Manual testing is a “Plan B” method for web testing when HTML lacks proper tagging (e.g. IDs). Developers typically resort to manual testing if adding (or retrofitting) proper tagging isn’t an option, or if the structure of a web app is unstable. Cases like this mostly occur in the early stages of product development.
- Judging by LinkedIn, Google outsources parts of its testing efforts to overseas development shops. In some cases, outsourced jobs involve manual testing.
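The missing-IDs problem from the second point above is easy to see in code. In the invented fragment below there is no id attribute to target, so a script has to fall back on structure or visible text, both of which break as soon as the markup or copy changes. The fragment and the fallback strategy are illustrative assumptions, sketched with the standard library rather than a real WebDriver client:

```python
import xml.etree.ElementTree as ET

# An invented page fragment with no id attributes to target
page = """<form>
  <input name="q" type="text" />
  <button class="btn">Search</button>
  <button class="btn">Clear</button>
</form>"""

root = ET.fromstring(page)

# Fallback locator: visible text instead of a stable id. Renaming the
# button to "Find" would silently break this lookup.
search_btn = next(el for el in root.iter("button") if el.text == "Search")
```

When every locator in a suite is this fragile, hand-checking the page can genuinely be cheaper than maintaining the automation, which is the trade-off the bullet above describes.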
Software testing at Facebook
Even in 2018, people still talk about “moving fast and breaking things”, the motto that Mark Zuckerberg officially ditched four years ago.
The hacker-celebrating approach gave way to a new mantra — “Move Fast and Build Things”. A few years later, this changed, too, to “Move Fast With Stable Infra”, and later to “Move Fast”. Needless to say, breaking things is no longer on the Facebook to-do list.
Today, the developers at Menlo Park maintain a gigantic web platform that exceeds 60 million lines of code. Moreover, Facebook is the third most-visited website of 2018, after Google and YouTube. So what sort of infrastructure and testing strategy allows Facebook to assure the quality of its software?
Testing, delivery, and QA at Facebook
Much like Google, Facebook shows little strategic interest in hiring teams of dedicated automation engineers or manual testers. At Facebook, it’s mostly the developers’ job to assure the quality of their code.
The core software delivery at Facebook builds on a “push from master” approach, meaning there’s no release branch acting as a safety net. This strategy enables fast continuous delivery, yet it requires rigorous testing. To make this possible, Facebook relies on an array of automated tests, including the search-based Sapienz and the static analyzer Infer. Peer reviews and linting are also a must for all new code.
Dogfooding is another process that’s integral to software production at Facebook. Developers first push changes to Facebook employees. If there are no regressions at this stage, the changes will ship to 2–3% of production. If there are no issues, the changes ship to 100% of production. There’s also Flytrap for user reports, should bugs slip into production.
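The staged rollout described above hinges on deterministically assigning each user to a traffic bucket. Here is a minimal sketch of such a percentage gate; the hashing scheme and user ids are assumptions for illustration, not Facebook's actual implementation:

```python
import hashlib

def in_rollout(user_id: str, percent: float) -> bool:
    """Deterministically place a user in the first `percent` of traffic.
    Hashing makes the decision stable across requests and servers."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 0xFFFFFFFF
    return bucket < percent / 100.0

# Stage two of the pipeline above: ship the change to ~3% of production
users = [f"user{i}" for i in range(10000)]
canary = [u for u in users if in_rollout(u, 3)]
```

Because the bucket is derived from a hash rather than a random draw, a user who saw the change once keeps seeing it, and widening the rollout from 3% to 100% is a configuration change rather than a redeploy.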
Code coverage and automation of the Facebook website
Whether code coverage is sufficient at Facebook is open to question. In June 2018 alone, Facebook bugs made headlines on two separate occasions. On June 7, the company admitted having accidentally changed sharing settings to “public” for 14 million users. On June 22, about 3% of app beta-testers (i.e. end users) received an automatic email containing sensitive business information from Facebook.
Still, blunders are inevitable given how huge and fast-paced Facebook is. There’s also quite a bit of legacy architecture (and technical debt) that Facebook has inherited from its move-fast-and-break-things years. Here’s how the social media giant deals with all of this:
- Watir and WebDriver handle a large part of the browser automation efforts at Facebook. This covers the regression testing of routine user journeys, including the privacy-related stuff. As mentioned by Facebook’s Steve Grimm in 2011, Facebook had tons of tests focusing on what data should be visible to users.
- For the mostly-PHP-based core, Facebook maintains thousands of unit and integration tests written with the PHPUnit framework. Automation tools run these tests on schedule and collect test run data. Besides, developers use these tests to catch bugs introduced with new code.
- There are other test frameworks that Facebook maintains for its many services and products. Namely, there’s an internal C++ framework for non-user-facing projects. For the open-source projects, developers mostly utilize open-source testing solutions like JUnit or Jest.
- According to Jackson Gabbard (Engineer at Facebook London), the company puts a heavy focus on automated metrics collection and analysis. The internally-developed framework shows Facebookers what parts of the application their new code will impact.
- In the past few years, Facebook has been stepping up its automation game with AI-based testing tools Sapienz and Infer. There’s also BrowserLab for client-side performance testing and the Mobile Device Lab for running automated tests on real devices. We’ll cover these and other solutions in the next section.
Automation testing tools built by Facebook
The 2018 state of QA at Facebook is, basically, Facebook testing its software with its own tools. Or, at least, the company is moving in that direction. Here’s a short list of the software testing tools that Facebook has built (or bought) in the past few years:
- Sapienz. First released in 2017, Sapienz is a search-based dynamic code analyzer for Android. The tool applies metaheuristic search techniques to programmatically interact with the UI, create models of the system under test, and generate test sequences. Whereas most dynamic analyzers have to comb through ~15,000 actions to find a crashing bug, Sapienz only needs 150–200 interactions.
- Infer. A static code analyzer, Infer scans through code without running it, detecting bugs like memory leaks and null pointer exceptions on iOS and Android. Facebook bought the technology in 2013 and open-sourced it in 2015. Today, Infer is integral to the testing of Facebook, Instagram, WhatsApp, and Messenger, as well as Spotify and Uber. Mozilla, Sky, and Marks & Spencer are among other well-known brands working with this tool.
- BrowserLab. As Facebook (and most of the Internet) moved more logic to the client side, there emerged a need for granular monitoring of browser rendering speeds. This is exactly what BrowserLab does. Launched in 2016, this tool detects performance regressions, even if the delay is as minute as 20 milliseconds.
- Facebook Mobile Device Lab. Located at the Prineville data center, the Mobile Device Lab powers the automated testing of mobile and web applications on real mobile devices. In a nutshell, the Lab is a center with mobile device racks and a sophisticated infrastructure for running tests on these devices.
- WebDriver. Just like Google, the Facebook engineering team is among the active contributors to the W3C WebDriver standard.
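To make the Sapienz entry above concrete: the tool searches for an action sequence that crashes the app, then minimizes it to a short reproduction. The toy app model, action set, and crash condition below are pure inventions for illustration; the real system runs multi-objective evolutionary search against actual Android builds:

```python
import random

ACTIONS = ["tap_menu", "tap_search", "type_text", "rotate", "back"]

def run_app(sequence):
    """Toy system under test: crashes if the user rotates the screen while
    the search box is open (an invented bug for this example)."""
    search_open = False
    for action in sequence:
        if action == "tap_search":
            search_open = True
        elif action == "back":
            search_open = False
        elif action == "rotate" and search_open:
            return "crash"
    return "ok"

def find_crash(max_tries=2000, length=20, seed=0):
    """Random search for a crashing sequence (Sapienz uses a smarter,
    evolutionary search, but the goal is the same)."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        seq = [rng.choice(ACTIONS) for _ in range(length)]
        if run_app(seq) == "crash":
            return seq
    return None

def shrink(seq):
    """Greedy minimization: repeatedly drop actions that aren't needed to
    reproduce the crash, mimicking Sapienz's short repro sequences."""
    changed = True
    while changed:
        changed = False
        for i in range(len(seq)):
            candidate = seq[:i] + seq[i + 1:]
            if run_app(candidate) == "crash":
                seq = candidate
                changed = True
                break
    return seq

crash_seq = shrink(find_crash())
```

The minimization step is what turns a long random sequence into a two-action reproduction a developer can act on, and it is a big part of why short, focused crash reports beat raw monkey testing.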
Manual testing at Facebook
Based on the comment from Jackson Gabbard (linked above), Facebook took huge steps towards automated software testing in 2013. Still, you can’t really automate everything. Things like exploratory testing and ad hoc testing are manual in all companies. Facebook is no exception to this rule.
Takeaways: ROI-driven automation, UI-centric tests
Looking at the examples of Google and Facebook testing their products, what actionable insights can smaller companies implement? On a high level, there are two things that Google and Facebook share:
- No dedicated staff for running tests. At both companies, developers own the quality of their own code. This means fewer silos in QA, as well as entrenching of quality culture into the everyday work of engineers. In an even more practical sense, this can also mean better ROI of automated tests. There are no added costs of maintaining a team of engineers whose sole purpose is hand-coding tests. Instead, Google uses the expertise of test engineers for more strategic matters.
- UI testing comes to the fore. Both companies invest heavily in UI testing. This includes both contributions to the WebDriver standard and the development of UI testing tools like Protractor and Jest. Moreover, Facebook’s AI-driven Sapienz analyzes programs by interacting with the GUI, eliminating the bugs that end users are most likely to face.
These two takeaways are the reasons our team is developing Screenster. We believe that automation tools should test software from the end-user perspective, which makes UI testing crucial.
Another crucial point is that test automation shouldn’t require a staff of dedicated coders. Routine coding is never a good use of engineering talent, and Google’s approach to testing proves it.
If you share this vision of automated testing, here’s how we implement it in Screenster:
- Intelligent visual testing. An automation tool should look underneath UI screenshots, into the underlying DOM structure of the UI. Screenster does exactly that. Our platform scans the DOM and matches individual elements and content to how they render on a screen.
- Moving beyond record playback. One reason record-playback was popular back in the day is the simplicity it brought to test creation. That said, it sucked in pretty much everything else. To make record-playback work for modern-day UI testing needs, we added automated verification of all on-page elements and eliminated the need to tinker with auto-generated code.
- Self-healing locators. Even at Google, backfilling IDs is a problem that makes people fall back on manual testing. Our solution for this problem is a clever algorithm that generates self-healing locators for each element. Should you move or rename any element on your page, Screenster will still be able to find it during the next test run.
- Seeing beyond the happy path. No team will ever be able to write enough WebDriver tests to cover the complete UI. In fact, most Selenium or Protractor tests will only focus on separate touchpoints in a user journey and ignore everything else. When automating and running tests, Screenster will detect bugs in all elements, even if you’re not targeting them explicitly.
- Forget about thresholding. What most people call automated visual testing is, effectively, screenshot comparison based on pixel matching, with a manually set tolerance for mismatches. Say, is 5% of mismatched pixels good enough for a test to pass? But what if that 5% represents a missing button?
Screenster doesn’t depend on thresholding because it doesn’t match page screenshots. Instead, it runs pixel-perfect comparison for separate elements, eliminating false positives without sacrificing precision. While doing so, the tool will sort out the visual noise caused by anti-aliasing.
- Smooth learning curve. The core functionality of Screenster is accessible to non-technical people. With Screenster, you can start automating UI tests earlier in the development cycle, keeping your process lean and agile.
- Tons of added awesomeness. Screenster automatically determines optimal waits, detects dynamic content, and recognizes dates. It supports multiple screen resolutions, runs in the cloud or on a local server, and doesn’t make you install apps or plugins. There’s literally no setup pain, and the tool fully integrates with Jira and CI solutions like Jenkins, TeamCity, Travis, etc.
- It’s really easy to try it out. Screenster has a free (no strings attached) online demo that you can try right now. Click on the button below and try Screenster for your web application.