Best practices for working with AI validations 
Using AI for validations is a quantum leap in automated testing. This approach allows you to focus on the essence of the validation rather than on the complex structure of the screen and the correlation between multiple graphical user interface (GUI) elements that might contain the answer. With AI validations, validations are logical, closer to the business requirements, and can combine almost any aspect appearing on the screen. To learn what you can and cannot do with AI validations and understand the value of the AI assistant as part of your daily operation for creating robust testing scenarios, review the following best practices.
In Scriptless Mobile, AI-based validations are available through AI commands. For Appium testing, AI-based validations are available through AI functions.
Basic concepts: Question vs. validation
Make sure you understand the difference between a question and a validation:
- Question (or query): An open-ended question where the answer can be any value. For example:
  - What’s the account balance?
  - What’s the color of the button?
  - What’s the name appearing at the top left?
- Validation: A special type of question with a binary answer that can be PASS or FAIL. When it comes to automated testing, a validation or assertion is the primary way to decide whether something is as expected. For example:
  - Is the account balance equal to $22?
  - Is the color of the login button red?
  - Does the user name appear at the top left?
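The distinction can be sketched in a few lines of Python. This is an illustration only: the names `Verdict` and `validate` are hypothetical and are not part of any Perfecto API, and the exact-match comparison is an assumption made for simplicity. The sketch shows that a question yields an open value, whereas a validation compares that value with an expectation and yields PASS or FAIL.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    FAIL = "FAIL"


def validate(actual: str, expected: str) -> Verdict:
    """Turn an open answer into a binary verdict by comparing it to an expectation."""
    return Verdict.PASS if actual.strip() == expected.strip() else Verdict.FAIL


# A question yields an open value (hypothetical answer read from the screen).
answer = "$22"

# A validation pins that value to an expectation and yields PASS or FAIL.
print(validate(answer, "$22").value)
print(validate(answer, "$25").value)
```

A test script can branch on a verdict, but it cannot branch reliably on an open answer, which is why validations rather than questions are the building block of an automated check.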
Limitations
The following limitations apply to AI-powered validations:
- AI validation models are not perfect. Just like humans, they make mistakes. Before assigning a value to an AI-powered command in Scriptless Mobile, always test the validation using Perfecto's AI assistant.
- AI validation models are not connected to the world or to your backend. If the information does not appear in the context of the device screen and is not public domain knowledge, the validation fails.
Context, meaning, and semantic understanding
Applications are designed and implemented to address human needs, not to address the needs of automated testing. AI validations work in a similar way: They first analyze the semantic and logical meaning of the requested validation and then try to match it with the screen content. But because AI validations are a combination of a Large Language Model (LLM) and a Visual Language Model (VLM), they can also create and calculate complex relationships such as “the sum of all internal funds.”
Traditional functional test automation uses and is aligned with actual elements in the graphical user interface (GUI) on the open device screen, such as buttons, text entry fields, and drop-down menus. A test developer needs to assess the screen structure and create a logical meaning between the different GUI elements and the required logic under test.
In the following example screenshot, the value $15,678.62 is a text entry on the screen. It is up to the tester to connect this value to its meaning, which is “available balance.” Most functional test automation achieves this by directly addressing the object that represents this value, for example, //text[3]. In some cases, if the developer was kind enough to assist the testing efforts, it is possible to address the test element with better identification, for example, //text[@id="balance"].
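To make the two addressing styles concrete, here is a minimal sketch that uses Python's standard `xml.etree.ElementTree` module. The page source is hypothetical and heavily simplified (a real device object tree looks different), and the lookups use the limited XPath subset that ElementTree supports rather than an Appium driver.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page source; real device object trees differ.
PAGE_SOURCE = """
<screen>
  <text>Account</text>
  <text>Available balance</text>
  <text id="balance">$15,678.62</text>
</screen>
"""

root = ET.fromstring(PAGE_SOURCE)

# Positional lookup: breaks if another text element is added above it.
by_position = root.find(".//text[3]")

# Identifier lookup: survives layout changes, but needs developer support.
by_id = root.find(".//text[@id='balance']")

print(by_position.text, by_id.text)
```

Both lookups return the same element here, but neither expresses that the value means "available balance." That link exists only in the tester's head.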
The testing scenario gets more complex when the association is not as straightforward as the “available balance,” for example when the value “$92.00” has multiple associations, such as:
- The 3rd transfer appearing on the screen
- The 1st transfer from “internal fund transfer from …2425”
- The highest positive value that is also smaller than $100
- The value of “any” transaction

The list can go on. Addressing the value "$92.00" is more complex because it is unlikely to have a clear identifier. Most likely, it is addressed as a child object or as the 3rd row and 2nd column of the table shown in the previous screenshot. This covers only the first association above and none of the others.
Unlike computer code, humans approach a device screen semantically: They break what they see down into logical meaning and match the logical, semantic understanding with visual representation. In the above example, if the need is to get the first transfer from “internal fund…,” our brains filter all rows that match this statement and then assess the value logically associated with it based on visual relationships, that is, the value on the same line. Our brain does not really care if $92.00 is a child of the 3rd row, a separate table that just looks like the same table, or not even a table but a list of groups instead. AI validations use the same approach.
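The difference between positional and semantic addressing can be sketched as follows. The rows are hypothetical stand-ins and do not reproduce the screenshot, and the `Row` class is an assumption made for illustration. The point is that three different descriptions can resolve to the same value, and only the first one depends on where the row sits.

```python
from dataclasses import dataclass


@dataclass
class Row:
    description: str
    amount: float


# Hypothetical rows; the screenshot's actual content is not reproduced here.
rows = [
    Row("Card payment", -45.10),
    Row("Salary deposit", 2500.00),
    Row("Internal fund transfer from …2425", 92.00),
    Row("Internal fund transfer from …2425", 310.00),
]

# "The 3rd transfer appearing on the screen": purely positional.
third = rows[2].amount

# "The 1st transfer from internal fund transfer ...": filter by meaning,
# then read the amount on the same row.
first_internal = next(
    r.amount for r in rows
    if r.description.lower().startswith("internal fund transfer")
)

# "The highest positive value that is also smaller than $100".
highest_under_100 = max(r.amount for r in rows if 0 < r.amount < 100)

print(third, first_internal, highest_under_100)
```

If a new row is inserted at the top, the positional lookup silently returns a different amount, while the two semantic lookups still return the intended one.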
Perfecto's AI Assistant
Unlike the element-based checks in traditional functional automated testing, the free and open text used for visual validations is convenient, but it also opens the door to uncertainty, ambiguity, and other aspects that can reduce the robustness of your testing scenario. The example questions below relate to the following screenshot, which shows the menu of a pizza restaurant. They illustrate that asking the right questions is one of the biggest challenges with free and open text validations.
- "Does the pizza cost $80?" (ambiguous question)
- "Is there a pizza that costs $80?" (yes)
- "Do all pizzas cost $80?" (no)
- "Does the Golden corn pizza cost $80?" (yes)
- "Does the Capsicum pizza cost $80?" (no)
Perfecto's AI Assistant in Scriptless Mobile serves the purpose of helping you ask the right questions and translating them into valid PASS/FAIL validations that allow you to craft more robust scripts. However, just like humans, when the AI Assistant does not understand what you have in mind, it might respond in a way that does not meet your requirements.
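The ambiguity in "Does the pizza cost $80?" comes down to which reading is intended: any pizza, all pizzas, or one named pizza. The following sketch spells out the three readings. The `menu` dictionary is an assumption that contains only the two pizzas named in this section, with the prices quoted for them.

```python
# Prices for the two pizzas named in this section; other menu items are omitted.
menu = {"Golden corn": 80, "Capsicum": 140}

# "Is there a pizza that costs $80?" -> existential reading
any_costs_80 = any(price == 80 for price in menu.values())

# "Do all pizzas cost $80?" -> universal reading
all_cost_80 = all(price == 80 for price in menu.values())

# "Does the Capsicum pizza cost $80?" -> one named item
capsicum_costs_80 = menu["Capsicum"] == 80

print(any_costs_80, all_cost_80, capsicum_costs_80)
```

The same screen produces different verdicts depending on the reading, so a robust validation states the reading explicitly.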
Unlike the AI Visual Checkpoint command, which is a transactional function (1 in, 1 out), the AI Validation Assistant is a discussion-driven tool, aiming to create optimized validations that you can then use in a function call. The following table illustrates a sample conversation between a user and the AI Validation Assistant.
Role | Question (input) | AI response (result or suggestion) | Explanation |
---|---|---|---|
User | Does the pizza cost $80? | | The user asks a fair but ambiguous question. |
AI Assistant | | PASS: There are pizzas that cost $80. | The assistant concluded that the user asked about “any pizza.” |
AI Assistant | | Are there pizzas that cost $80 on the menu? | The assistant suggested a clearer validation based on its understanding. |
User | And what about the Capsicum one? | | The user added information to the discussion (note that this is not a full validation but part of the previous context). |
AI Assistant | | FAIL: The Capsicum pizza costs $140. | With the additional information, the assistant created a contextual, semantic sentence and corrected its answer. |
AI Assistant | | Is the Capsicum pizza price equal to $80? | The AI Validation Assistant crafts a suggestion that meets the entire semantic request from the discussion. You can then use this suggestion in the transactional function call (as the value of the Validation parameter in the AI Visual Checkpoint command). |
In addition, the AI Assistant:
- Transforms generic questions into a proper validation format. For example, the Assistant translates “How much does the Capsicum pizza cost?” (answer: $140) into “Is the Capsicum pizza price equal to $140” (answer: PASS).
- Corrects grammar and typos, making sure your test scenario looks great. For example, the Assistant enhances “Are capsicum pizza cost $140?” to “Does the Capsicum pizza cost $140?”.
- Works in almost any language. This is true for the AI Visual Checkpoint command as well. If English is not your language of choice, you may phrase the AI validation in any other language. Although not mandatory, for best results, consider using the same language as the application under test.
Do's and don'ts
To make the most out of AI-based validations and avoid common pitfalls, keep the following do's and don'ts in mind.
Do's
When phrasing AI validations, do the following:
- Be specific
- Use objective and clear definitions
- Ask about elements that are clearly shown on the device screen
The following table lists examples of good validations along with explanations. The examples are based on the online shoe shop illustrated in the following screenshot.
Validation question | Why this is strong wording |
---|---|
Do the triple black sneakers cost $601? | Includes a clear reference to the item on the screen. |
Does the count of all favorites equal 1? | The indication between favorite and non-favorite items on the screen is clear. |
Is Nike the first in the filter list options? | The screen includes a clear indication of the filter list and where Nike appears in it. |
Does the ad match the current selection? | The ad on the screen is for a Nike shoe and the current selection is also Nike. |
Don'ts
When phrasing AI validations, avoid the following:
- Asking subjective questions that include vague words like "nice," "clear," or "understood"
- Assuming the AI understands unclear icons or uncommon app behavior
- Asking about edge cases involving obscured or blurry objects on the device screen because they might not be well understood
The following table lists examples of validations to avoid, along with explanations. The examples are based on the online shoe shop illustrated in the previous section.
Validation question | Why this is weak wording |
---|---|
Are there 2 black shoes? | It's not clear whether the second shoe is considered black or white and whether the shoe in the ad should be counted in the validation. |
Is the triple black shoe much more expensive than the others? | It's unclear what "much" means, and the cost of the lower shoes is hidden. |
Are there 5 filter buttons? | The 5th button is not shown in full, so it might or might not be counted. |
Does the Air Jordan shoe cost $196? | This is ambiguous because it's not clear which Air Jordan shoe the question refers to. |
Are the triple black cheaper than at Amazon? | The answer is unknown. The AI model likely does not have the knowledge about pricing at Amazon, or its knowledge is not up to date. |
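Some of these pitfalls can be caught before a validation ever reaches the AI model. The following sketch is a hypothetical pre-check, not a Perfecto feature: the function name and the word list are assumptions, with the words taken from the guidance and table above. It only flags subjective wording and cannot detect issues such as hidden or partially shown elements.

```python
import re

# Vague words taken from the guidance above, plus "much" from the table.
# The list and the function name are illustrative, not a Perfecto feature.
VAGUE_WORDS = {"nice", "clear", "understood", "much"}


def find_vague_words(validation: str) -> list[str]:
    """Return the vague words found in a validation prompt, in order of appearance."""
    words = re.findall(r"[a-z']+", validation.lower())
    return [w for w in words if w in VAGUE_WORDS]


print(find_vague_words("Is the triple black shoe much more expensive than the others"))
print(find_vague_words("Do the triple black sneakers cost $601?"))
```

A check like this could run as part of a review step for new test scenarios, with the word list extended to match your own team's wording habits.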
Need help?
If you need help with a complex validation, share it with us—we love a challenge! Reach out to Perfecto Support and include a screenshot with the exact prompt you used and information about what you want to validate or assert.