Do you have the Basics of Functional Tests?

It is the most common test type. As such, almost every coder has learned it by example; alas, very few understand it in depth.

The internet is so full of code snippets that too few take the effort to understand more than the minimum it takes to make them work in a project. This has created a very low bar: so much so that even good and talented authors of recommended test tools ship examples that demonstrate how their tool is used, yet mislead on how to write a good test.

SUT stands for System-Under-Test, which is a generalization of Unit-Under-Test, Class-Under-Test, Endpoint-Under-Test, Component-Under-Test, Service-Under-Test, Software-Under-Test, etc. They are all systems. So yeah: SUT.

Functional tests check whether a working SUT provides the result expected of it.

Consider a mathematical approach: any SUT works in the context of a set of parameters and yields a result, like a mathematical function; hence the name. That’s all.
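For illustration, here is a minimal sketch of a functional test of a simple add function, assuming a Jest-style runner where describe, it, and expect are globals; it cares only that the parameters yield the correct result.

// the SUT: works on its parameters and yields a result, like a mathematical function
function add(a: number, b: number): number {
  return a + b;
}

describe('add', () => {
  it('yields the sum of its two parameters', () => {
    // correct result: PASS; anything else: FAIL
    expect(add(2, 3)).toBe(5);
  });
});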

The discussion about functional code and the distinction between returned values and side-effects is important and tempting, but it is not our current discussion!

Strictly speaking, functional tests are not interested in how much time the SUT took to reach the expected result, how many resources it occupied to do so, or whether it did its job efficiently; only that the result is correct.
That’s strictly speaking.

In practice, most test tools come with built-in features outside the domain of functional testing, e.g. timeout mechanisms. But when you ask whether the SUT did it in time, you are asking how long it took, and that is a performance measure.
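As a hedged sketch of that blur (Jest-style; Jest accepts a per-test timeout as an optional third argument to it), declaring a timeout quietly turns a purely functional check into a partial performance check. The slowAdd below is hypothetical, used only for illustration.

// slowAdd is a hypothetical async SUT, used only for illustration.
async function slowAdd(a: number, b: number): Promise<number> {
  await new Promise((resolve) => setTimeout(resolve, 100));
  return a + b;
}

// Functionally we only care that the result is 5, but the 500 ms timeout
// sneaks in a performance criterion: the same correct answer delivered
// after 500 ms would FAIL the case.
it('slowAdd yields the correct sum', async () => {
  expect(await slowAdd(2, 3)).toBe(5);
}, 500);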

What does a test runner run? Everything. Functional, performance, coverage, load, bench: anything that runs.

That’s why test runners are, at their core, task managers: your test-case tree is your task list. You can run them all (as in CI), or run selected tasks when you must (as when debugging a single test-case).
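For example, assuming a Jest- or Mocha-style runner where .only and .skip are available, selecting tasks from the tree can look like this sketch:

describe('the whole suite runs in CI', () => {
  // While debugging, .only narrows the task list to a single case;
  // .skip parks a case without deleting it. Remove both before committing.
  it.only('the one case I am currently debugging', () => {
    expect(2 + 3).toBe(5);
  });

  it.skip('a case temporarily taken off the task list', () => {
    expect(0 + 0).toBe(0);
  });
});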

When a case depends on other cases being run before it, it’s called case sliding. The upside: in very heavy systems, while you’re at it, you just might check one more pesky thing…
The downside: it doesn’t give a reliable status. When a case fails, all the cases that come after it probably won’t even run, or will run and give misleading failures. A sketch of the anti-pattern follows below.
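A minimal sketch of case sliding (the names are hypothetical): the second case silently depends on state left behind by the first.

// If 'creates a session' fails or is filtered out, 'uses the session'
// either does not run or fails for a misleading reason.
let session: { id: number } | undefined;

it('creates a session', () => {
  session = { id: 1 };
  expect(session.id).toBe(1);
});

it('uses the session created by the previous case', () => {
  // relies entirely on the previous case having assigned `session`
  expect(session!.id).toBe(1);
});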

At its heart, a test-case states that the actual result = the expected result. No, that’s not an assignment. That’s a mathematical statement.
It evaluates to true or false, which correlates to PASS or FAIL.
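In code, that statement is exactly what an assertion evaluates; a sketch, Jest-style:

it('the actual result equals the expected result', () => {
  const actual = 2 + 3;   // what the SUT produced
  const expected = 5;     // what the spec requires

  // Not an assignment: a proposition that is either true or false.
  // toBe evaluates actual === expected; true reports PASS, false reports FAIL.
  expect(actual).toBe(expected);
});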

The accurate, correct result

To be able to tell whether a function worked as expected, we need to know what result we’re expecting.

A common misbelief about the expected result is that it is a single value. IMHO, this is a legacy of early programming languages which, inspiring each other, tried to define functions by the type of their returned value and thus required functions to return a single value. Anything outside this returned value ended up being called a side-effect, which in terms of testing is a cheap cheat and a loophole: it lets us ignore things the function does even though they are definitely being done by our code, and therefore must be accounted for and tested.

Another common workaround is that a single returned value may be a data structure. But then you still need to verify every part of that structure you’re interested in; thus, a data structure is a set of results, just like the set made of the returned value plus any other values assigned to globals, delivered by reference, or passed as a message by the SUT.
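For instance, in this sketch with hypothetical names, a single returned data structure is really several results asserted together:

// A hypothetical SUT that returns a structure: one call, several results.
function describeUser(name: string) {
  return { name, slug: name.toLowerCase(), length: name.length };
}

it('describes a user', () => {
  const result = describeUser('Ada');
  // every field we care about is its own expected value in the set of results
  expect(result).toEqual({ name: 'Ada', slug: 'ada', length: 3 });
});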

Now that we have acknowledged that a result can consist of several values, we’ll go up to the business level: the expected result often includes a few objectives.

Consider the following example:
A successful registration means:
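The concrete objectives below are illustrative, not a quote of any real spec: say, a user record exists, a welcome email is queued, and the caller gets a success status. A self-contained sketch:

// Hypothetical, in-memory stand-ins for a registration flow.
const userStore = new Set<string>();
const mailQueue: Array<{ to: string }> = [];

async function registerUser(email: string) {
  userStore.add(email);            // objective: a user record exists
  mailQueue.push({ to: email });   // objective: a welcome email is queued
  return { status: 201 };          // objective: the caller gets a success status
}

it('a successful registration meets all of its objectives', async () => {
  const outcome = await registerUser('ada@example.com');

  expect(outcome.status).toBe(201);
  expect(userStore.has('ada@example.com')).toBe(true);
  expect(mailQueue).toContainEqual({ to: 'ada@example.com' });
});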

Causality describes the relation between cause and effect. Strictly speaking, parameters propagate through the SUT and together they cause a result.

Parameters, in their wider sense, can be an input, an inner state, or an outer state.
E.g.:
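A sketch of the three kinds, with hypothetical names:

// input: passed explicitly by the caller
function add(a: number, b: number) { return a + b; }

// inner state: held by the SUT itself
class Counter {
  private count = 0;              // an inner-state parameter
  increment() { return ++this.count; }
}

// outer state: read from outside the SUT's own scope
function isWeekend(now: Date = new Date()) {
  const day = now.getDay();       // the current time is an outer-state parameter
  return day === 0 || day === 6;
}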

Obviously, a SUT can work with any combination of the three.

Some parameters are totally in our control, like the a and b of add(a, b). Some are not in our hands, like the current time, which is constantly moving, or an answer from an external system that is not in our control.

If we cannot set them ourselves, we should at least be able to sample them. When the sampling itself incurs a change of state, the test code must account for that too.
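A sketch of sampling a parameter we cannot set (the clock), using a hypothetical createRecord SUT:

// The result depends on a parameter we do not control: the current time.
function createRecord(name: string) {
  return { name, createdAt: Date.now() };
}

it('stamps the record with the current time', () => {
  const before = Date.now();        // sample the uncontrollable parameter...
  const record = createRecord('Ada');
  const after = Date.now();         // ...and sample it again after the act

  expect(record.name).toBe('Ada');
  expect(record.createdAt).toBeGreaterThanOrEqual(before);
  expect(record.createdAt).toBeLessThanOrEqual(after);
});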

Powered by AAA

This is a theorem whose purpose is to increase maintainability.

It directs code-organization around the following test-concerns:

Arrange: a failure here means the test-code and/or any of the helpers it uses are broken; the test-code did not even get to interact with the SUT.

Act: a failure here will mostly mean that the SUT is not resilient or is unstable. When the test-code does not communicate well with the SUT, we expect the SUT to communicate that clearly in the errors it throws.

Assert: a failure here means the SUT does not meet its spec.
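A sketch of a test-case organized around these concerns (Jest-style, with a hypothetical Counter as the SUT); the comments mark where each kind of failure would point:

class Counter {
  private count = 0;
  increment() { return ++this.count; }
}

it('increments the counter', () => {
  // Arrange: a failure here means the test-code or its helpers are broken
  const counter = new Counter();

  // Act: a failure here points at the SUT being unstable or not resilient
  const result = counter.increment();

  // Assert: a failure here means the SUT does not meet its spec
  expect(result).toBe(1);
});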

The AAA is found in many sources on the web.
Let’s add a small contribution to it:

The fourth A: a failure here is a bug in the Test-Code.

This trinity does not deal with divinity; rather, the triangle of SUT, Test-Code, and Spec makes a test suite self-explanatory and increases its maintainability.

The SUT: often it is a part of the production-code, but it can be a test-helper or the test itself.

The Test-Code: the element that interacts with the SUT and measures its effects. This includes any test-helpers, test data-fixtures, and the implementation of the test-cases with their setup/teardown and their assertions.

The Spec: this part was suppressed from tests for too long. Systems like JUnit and the rest of the (X)Unit family originally let coders sneak the specs into the names of test-methods and classes, which was very limiting; the rest was either truncated and lost, or in the better cases snuck into comments where it was not apparent in reports. Although we have evolved since, the entire industry still bears the scars of this past.

Often tests are very clear once written, but become gotcha-filled head-scratchers, even for their author, when she meets them again in the future.

When a coder comes across a failing test, she needs to exercise judgment. When the intent of the test is not clear, the coder has to speculate too much in order to judge between the test and the SUT:

The spec helps to judge between the SUT and the Test-Code. Usually the spec agrees with one of them, indicating that the other should be made to adhere. But I’ve seen cases where the SUT and the Test-Code agree, and the title is wrong (a poor copy-paste).

It is not so much about the verbosity as it is about the intent to leave clues for all your future readers:

Please note: the specs appear in all of them, while the code appears only in the last. Make sure your breadcrumbs get to ALL of them.
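A sketch of leaving such breadcrumbs (the spec wording and the registerUser stub are hypothetical), so the spec travels from the code into every report:

// A stub SUT, only to keep the sketch self-contained.
async function registerUser(email: string) {
  return email === 'taken@example.com' ? { status: 409 } : { status: 201 };
}

describe('registration', () => {
  describe('when the email is already taken', () => {
    it('rejects the request with a conflict status', async () => {
      // The titles above carry the spec; a failure report reads as a sentence:
      // "registration > when the email is already taken > rejects the request with a conflict status"
      const outcome = await registerUser('taken@example.com');
      expect(outcome.status).toBe(409);
    });
  });
});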

I confess that when the entire test-case exists for a single assertion, even I am tempted to do all four A’s in a single it(..). But I do make sure to leave enough info about what I mean the test to do, and as part of what context.

This is worth a post of its own, but I would not come clean if I did not mention it.

Often a set of test-cases requires the same setup and teardown. Given the recommended test verbosity demonstrated above, a naïve implementation will result in copy-pastes with poor maintainability.

The wise pack the repetitive code into test-helpers that assist with these repeating setup/teardown tasks. The even wiser feature a helper API that manages a complete test-context with setup and teardown hooks, providing the test-code with a context ready for the Assert stage.

A test-helper becomes a test-factory when it features a single API that expects arguments that declare not only how to perform any required setup and teardown of the test context, but also what has to be asserted as part of what spec. Calling it generates a complete test-case that appears in the report with readable titles.
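A minimal sketch of such a factory; every name here is hypothetical, and the split of arguments into setup, act, assert, and teardown is one possible shape, not a prescribed API:

// One declarative call produces a complete, readable test-case.
type CaseSpec<T> = {
  title: string;                                   // the spec, visible in every report
  setup: () => Promise<T> | T;                     // Arrange
  act: (ctx: T) => Promise<unknown> | unknown;     // Act
  assert: (result: unknown, ctx: T) => void;       // Assert
  teardown?: (ctx: T) => Promise<void> | void;     // optional cleanup
};

function defineCase<T>(spec: CaseSpec<T>) {
  it(spec.title, async () => {
    const ctx = await spec.setup();
    try {
      const result = await spec.act(ctx);
      spec.assert(result, ctx);
    } finally {
      await spec.teardown?.(ctx);
    }
  });
}

// Usage: declaring values is all it takes to get a full case in the report.
defineCase({
  title: 'add yields the sum of its parameters',
  setup: () => ({ a: 2, b: 3 }),
  act: ({ a, b }) => a + b,
  assert: (result) => expect(result).toBe(5),
});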

Once you get proficient at producing tests, then when specs change, it often becomes easier to write new tests than to refactor existing ones.

Given that you kept your test-cases isolated (no case sliding), kept your AAA(A) organized, and placed reusable code in appropriate helpers, writing a case should not take much time — it is a matter of declaring values.

It is hard to delete a case when you like its aesthetics. But if it has lost relevance and/or does not add information or coverage — get rid of it!

The goal of a test-suite is to affirm that our software works as expected. But beyond this desired affirmation, when it is denied we want to know what went wrong and what must be fixed.
The first level of defense is pointing out what went wrong by organizing our test code into recognizable segments, whose goals are defined by the AAA(A) theorem.
To reinforce this, we bring the spec into the test and communicate the goals of each test-case in the suite, so that when they fail we know what the failure points at without needing to read the test, and when they are refactored we know to preserve these goals and maintain coverage.
