Testing in the Real World
When it comes to software and quality, the term “testing” comes up very often and in many different contexts, which has led people to interpret it in many different ways. When you hear “testing the product”, it could mean anything from turning the product on and seeing that it doesn’t catch fire, to rigorous test cycles involving hundreds of people losing sleep to ensure they have covered all the bases.
Historically, the world of software testing has had a bad reputation, mostly stemming from examples of testing done the wrong way.
The “QA Division”
One such example is the invention of the “QA Division”: traditional companies recruit large divisions of people who focus only on the QA aspects of their products. QA engineers receive the full technical specifications and requirements of the product, then use that knowledge extensively to make sure it works well under the designed scenarios. These are diligent, hard-working professionals with the ability to quickly learn new feature sets and patiently carry out even the most mundane set of tasks repeatedly to make sure no regressions appear between versions.
In time, engineers realized that there was more to their products than just a pile of usage scenarios. They began to realize that there are many intricate inner workings of software, as well as subtle bugs that need to be addressed.
The world of software quality has vastly improved over the past few decades in response to these concerns, and many methodologies for testing these subtleties were gradually introduced, such as unit testing, white-box testing, and even more rigid philosophies like test-driven development (TDD). This strengthened testing on the R&D division’s side and somewhat narrowed the gap through which bugs can slip.
This continuous improvement process affects the profile needed to fill QA testing positions. As developers improve their tools for catching bugs during the development cycle, QA divisions can use their allotted time to test other important matters, like overall user experience and adherence to documentation, which are demanding tasks in themselves. In turn, this leads QA personnel to drift farther away from the low-level technical details and focus on the product as a whole. In many types of products this is fine, but it leaves dangerous blind spots: the kinds of bugs that can happen only when the product is tested as a whole, and especially in production.
The Challenges of Testing Enterprise Storage
At INFINIDAT we have a big challenge on our hands. We need to test a complex product involving both advanced software and state-of-the-art hardware. Our demanding constraints mean that we have non-deterministic states to track, as well as background and deferred processes to chase down to completion. We also have a dazzling amount of storage capacity to verify, making sure not even a single bit of a customer’s precious data is harmed.
Tests carried out in the development (or code-crunching) stage are limited, since they have to finish relatively quickly (otherwise developers don’t run them), so they provide relatively basic coverage compared to what is needed to address the above demands. This basically means that the “QA Division” approach just won’t cut it here; if we were to implement it, we would face serious consequences.
We realized pretty early on what this means for our testing needs: we need automation. Developing automation capabilities requires a broadened skill set, so we need full-fledged software developers to write our testing automation.
It takes software engineers to understand the implications of long-running software, to fully grasp concepts like race conditions and resource leaks, and to figure out what can hinder long-lasting software processes undertaking critical workloads.
We couldn’t hire just any software engineer of course. We are constantly looking for people with “mischievous” thinking — people with a tendency to break things, and find more ways to abuse already-built mechanisms. Especially complicated mechanisms.
We also hit another key realization when we started out — we are all human, and as humans we get distracted easily.
In storage, a great many details are required just to use the product. Provisioning a host to a storage controller means meddling with switches, networking, SCSI, system-level administration, log management, workload generation and whatnot. This means that even the most basic test imaginable in storage (take a volume, write to it, read it back and make sure the data is there) takes hundreds of lines of code, and we haven’t even started explaining what “make sure the data is there” means.
At INFINIDAT, we strongly believe that when skilled software engineers with mischievous minds spend their mental energy on mundane tasks, they lose their edge, and only get to the “important” part when they’re already mentally exhausted.
This is why we have two teams of software engineers working in tandem. In parallel to our automation developers, we have a team of infrastructure developers to complement them. Their task is to build a robust testing framework on top of which tests can be written. It is their responsibility to make sure that a test that spanned hundreds of lines of code in the past can be written in just four lines:
volume = system.volumes.create()
with provisioning_volume(volume) as p:
    p.write(data)
    assert p.read() == data
This is actual Python code that can be executed as a test on our infrastructure — and even those who don’t actually know Python can understand the gist of what it does.
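Under the hood, helpers like provisioning_volume are naturally built as context managers, so that host-side cleanup happens even when a test fails midway. Below is a minimal, self-contained sketch of the pattern using toy in-memory stand-ins; the class names and method bodies are illustrative assumptions, not the actual INFINIDAT framework:

```python
from contextlib import contextmanager


class _Volume:
    """Toy in-memory volume; the real framework talks to actual storage."""

    def __init__(self):
        self._data = b""

    def write(self, data):
        self._data = data

    def read(self):
        return self._data


class _Volumes:
    def create(self):
        # A real implementation would allocate a volume on the storage array.
        return _Volume()


class System:
    def __init__(self):
        self.volumes = _Volumes()


@contextmanager
def provisioning_volume(volume):
    # A real implementation would map the volume to a host (switch zoning,
    # SCSI rescan, multipath setup) before yielding, and unmap it afterwards.
    try:
        yield volume
    finally:
        pass  # host-side cleanup would happen here


system = System()
volume = system.volumes.create()
with provisioning_volume(volume) as p:
    p.write(b"some data")
    assert p.read() == b"some data"
```

The point of the pattern is that all the plumbing lives in the infrastructure layer, while the test itself stays four lines long.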
Some people would look at this four-line test and say, “That’s great! Now we can hire undergraduate students to do the testing, because it’s so easy!” Those are likely the same people who still have the “QA Division” concept deeply embedded in their minds.
We look at it differently. Now that we can write short code for the mundane tasks, let’s use it to write even more complex code that tests delicate matters like race conditions and parallel operations. This gives our automation developers the freedom they need to bring out their full potential.
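To illustrate what such a test can look like once the plumbing is hidden, here is a hedged sketch (the Counter class is a hypothetical stand-in for a shared resource, not part of our framework): the test spends its lines hammering the same object from many threads and then checking an invariant.

```python
import threading


class Counter:
    """Stand-in for a shared resource; a real test would target a volume."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # the property under test: no update may be lost
            self.value += 1


def test_parallel_increments():
    counter = Counter()
    # Fire 100 concurrent updates at the same object.
    threads = [threading.Thread(target=counter.increment) for _ in range(100)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The invariant: every concurrent update was applied.
    assert counter.value == 100


test_parallel_increments()
```

The interesting logic (concurrency plus an invariant check) fits in a handful of lines, which is exactly the freedom the infrastructure layer is meant to buy.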
About Rotem Yaari
Rotem is a software engineer and leads the INFRADEV (Infrastructure Development) team at INFINIDAT. This team is tasked with developing the advanced in-house infrastructure used for development and testing of INFINIDAT’s product line. A testing and quality assurance expert, Rotem has been designing and developing the tools and infrastructure required to thoroughly test storage systems for the last ten years.