In recent years, organizations have invested significantly in structuring their testing processes to ensure continuous releases of high-quality software. But much of that streamlining doesn’t carry over when artificial intelligence enters the equation. Since testing AI is more challenging, organizations are now in dire need of a different approach to keep up with the rapidly increasing inclusion of AI in the systems being built. AI technologies are primarily used to enhance our experience with these systems by improving efficiency and solving problems that would otherwise require human intelligence. Although the high complexity of AI systems increases the possibility of errors, we have been able to successfully implement our AI testing strategies to deliver the best software testing services to our clients. So in this AI Testing Tutorial, we’ll be exploring the various ways we can handle AI Testing effectively.
Let’s start this AI Testing Tutorial with a few basics before heading over to the strategies. The fundamental thing to know about machine learning and AI is that you need data, a lot of data. Since data plays a major role in the testing strategy, you have to divide it into three parts, namely the training set, the development set, and the test set. The next step is to understand how the three data sets work together to train a neural network before testing your AI-based application.
Deep learning systems are developed by feeding large amounts of data into a neural network. The data is fed into the neural network in the form of a well-defined input and an expected output. After feeding data into the neural network, you wait for the network to give you a set of mathematical formulae that can be used to calculate the expected output for most of the data points that you feed it.
For example, suppose you were creating an AI-based application to detect deformed cells in the human body. The computer-readable images that are fed into the system make up the input data, while the defined output for each image forms the expected result. Together, they make up your training set.
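To make the three-way division concrete, here is a minimal sketch of how labelled samples could be split. The function name `split_dataset`, the 70/15/15 ratio, and the file names are our own assumptions for illustration; the right proportions depend on how much data you have.

```python
import random

def split_dataset(samples, train_frac=0.7, dev_frac=0.15, seed=42):
    """Shuffle samples and split them into training, development, and test sets."""
    if not 0 < train_frac < 1 or not 0 <= dev_frac < 1 or train_frac + dev_frac >= 1:
        raise ValueError("fractions must leave room for a non-empty test set")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # everything left over
    return train, dev, test

# Hypothetical labelled data: (image file name, expected label)
samples = [(f"cell_{i:03d}.png", "deformed" if i % 4 == 0 else "normal")
           for i in range(100)]
train_set, dev_set, test_set = split_dataset(samples)
print(len(train_set), len(dev_set), len(test_set))
```

Shuffling before splitting avoids accidentally putting all samples of one kind into a single set, and the fixed seed means every test run sees the same split.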
Difference between Traditional systems and AI systems
It is always smart to understand any new technology by comparing it with the previous technology. So we can use our experience in testing the traditional systems to easily understand the AI systems. The key to that lies in understanding how AI systems differ from traditional systems. Once we have understood that, we can make small tweaks and adjustments to the already acquired knowledge and start testing AI systems optimally.
Traditional Software Systems
Traditional software is deterministic, i.e., it is pre-programmed to provide a specific output based on a given set of inputs.
The accuracy of the software depends upon the developer’s skill and is deemed successful only if it produces an output in accordance with its design.
All software functions are designed based on loops and if-then concepts to convert the input data to output data.
When any software encounters an error, remediation depends on human intelligence or a coded exit function.
Now let’s contrast AI systems with traditional systems so that we can structure the testing process around the differences.
AI Systems
Artificial intelligence/machine learning is non-deterministic, i.e., the algorithm can behave differently across runs since it is continuously learning.
The accuracy of AI learning algorithms depends on the training set and data inputs.
Different input and output combinations are fed to the machine based on which it learns and defines the function.
AI systems have self-healing capabilities whereby they resume operations after handling exceptions/errors.
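Because the output of an AI system can vary between runs, an exact-match assertion is often replaced with a tolerance check over several runs. The sketch below shows one way to do that. The function `train_and_score` is a hypothetical stand-in that only simulates run-to-run variation, and the thresholds are assumptions, not recommended values.

```python
import random
import statistics

def train_and_score(seed):
    """Hypothetical stand-in for one training run; returns an accuracy in [0, 1]."""
    rng = random.Random(seed)
    return 0.90 + rng.uniform(-0.02, 0.02)  # simulated run-to-run variation

def check_stability(scores, min_mean=0.85, max_spread=0.05):
    """Pass when the mean is high enough and the runs do not differ too much."""
    mean = statistics.mean(scores)
    spread = max(scores) - min(scores)
    return mean >= min_mean and spread <= max_spread

scores = [train_and_score(seed) for seed in range(10)]
print("stable:", check_stability(scores))
```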
These differences give us a clear idea of what has to change when it comes to testing an AI-based application. Now let’s focus on the various testing strategies in the next phase of this AI Testing Tutorial.
Testing Strategy for AI Systems:
It is better not to use a generic approach for all use cases, and that is why we have decided to give specific test strategies for specific functionalities. So it doesn’t matter if you are testing standalone cognitive features, AI platforms, AI-powered solutions, or even testing machine learning-based analytical models. We’ve got it all covered for you in this AI Testing Tutorial.
Testing standalone cognitive features
Natural Language Processing:
1. Test for ‘precision’ – the fraction of relevant instances among the total instances retrieved by the NLP system, e.g., how many of the results returned for a keyword are actually relevant.
2. Test for ‘recall’ – the fraction of relevant instances retrieved over the total number of relevant instances available.
3. Test for true positives (TPs), true negatives (TNs), false positives (FPs), and false negatives (FNs). Confirm that FPs and FNs are within the defined error/fallout range.
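The three checks above can be sketched with a few lines of Python. The helper names and the sample labels are our own; fallout is taken here in its usual sense of the false positive rate, FP / (FP + TN).

```python
def confusion_counts(expected, predicted, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for exp, pred in zip(expected, predicted):
        if pred == positive:
            if exp == positive:
                tp += 1
            else:
                fp += 1
        else:
            if exp == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def ratio(numerator, denominator):
    """Safe division that returns 0.0 when there is nothing to divide by."""
    return numerator / denominator if denominator else 0.0

# Hypothetical labels: 1 = relevant, 0 = not relevant
expected = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(expected, predicted)
precision = ratio(tp, tp + fp)  # relevant among retrieved
recall = ratio(tp, tp + fn)     # retrieved among relevant
fallout = ratio(fp, fp + tn)    # false positive rate
print(precision, recall, fallout)  # prints 0.75 0.75 0.25
```

A test would then compare `fallout` (and the FN rate, if needed) against the error range agreed for the project.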
Speech recognition inputs:
1. Conduct basic testing of the speech recognition software to see whether the system recognizes speech inputs.
2. Test for pattern recognition to determine if the system can identify when a unique phrase is repeated several times in a known accent and whether it can identify the same phrase when repeated in a different accent.
3. Test how speech translates to the response. For example, a query of “Find me a place where I can drink coffee” should not generate a response with coffee shops and driving directions. Instead, it should point to a public place or park where one can enjoy coffee.
Optical character recognition:
1. Test the OCR and Optical word recognition basics by using character or word input for the system to recognize.
2. Test supervised learning to see if the system can recognize characters or words from printed, written or cursive scripts.
3. Test deep learning, i.e., check whether the system can recognize the characters or words from skewed, speckled, or binarized documents.
4. Test constrained outputs by introducing a new word in a document that already has a defined lexicon with permitted words.
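The constrained-output check in the last step can be sketched as a comparison between the recognized text and the permitted lexicon. The function name, the lexicon, and the sample text are assumptions for illustration; a real test would take the text from the OCR engine under test.

```python
import re

def out_of_lexicon_words(recognized_text, lexicon):
    """Return the recognized words that are not in the permitted lexicon."""
    words = re.findall(r"[A-Za-z']+", recognized_text.lower())
    allowed = {word.lower() for word in lexicon}
    return sorted({word for word in words if word not in allowed})

# Hypothetical lexicon and OCR output for a document with one new word
lexicon = {"invoice", "total", "amount", "due"}
recognized = "Invoice total amount overdue"
print(out_of_lexicon_words(recognized, lexicon))  # prints ['overdue']
```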
Image recognition:
1. Test the image recognition algorithm through basic forms and features.
2. Test supervised learning by distorting or blurring the image to determine the extent of recognition by the algorithm.
3. Test pattern recognition by replacing cartoons with real images, e.g., showing a real dog instead of a cartoon dog.
4. Test deep learning scenarios to see if the system can find a portion of an object in a large image canvas and complete a specific action.
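As a rough sketch of the distortion test, the snippet below adds increasing amounts of noise to tiny grayscale images and records how accuracy changes. The `toy_classifier` is a hypothetical stand-in for the model under test, and random noise is used here in place of a real blur filter.

```python
import random

def add_noise(image, level, rng):
    """Return a copy of a grayscale image with noise of +/- level, clipped to 0..255."""
    return [[min(255, max(0, px + rng.randint(-level, level))) for px in row]
            for row in image]

def toy_classifier(image):
    """Hypothetical stand-in for the model under test: labels by mean brightness."""
    pixels = [px for row in image for px in row]
    return "bright" if sum(pixels) / len(pixels) >= 128 else "dark"

def accuracy_under_noise(labelled_images, classify, levels, seed=0):
    """Map each noise level to the share of images still classified correctly."""
    rng = random.Random(seed)
    report = {}
    for level in levels:
        correct = sum(classify(add_noise(img, level, rng)) == label
                      for img, label in labelled_images)
        report[level] = correct / len(labelled_images)
    return report

labelled_images = [([[200] * 4 for _ in range(4)], "bright"),
                   ([[40] * 4 for _ in range(4)], "dark")]
print(accuracy_under_noise(labelled_images, toy_classifier, levels=[0, 20, 60]))
```

The level at which accuracy starts to drop gives the extent of recognition that step 2 asks for.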
Testing AI platforms:
Now we will be focusing on the various strategies for algorithm testing, API integration, and so on in this AI Testing Tutorial as they are very important when it comes to testing AI platforms.
Algorithm testing:
1. Check the cumulative accuracy of hits (true positives and true negatives) over misses (false positives and false negatives).
2. Split the input data into a set for learning and a set for evaluating the algorithm.
3. If the algorithm uses ambiguous datasets in which the output for a single input is not known, then the software should be tested by feeding a set of inputs and checking if the output is related. Such relationships must be soundly established to ensure that the algorithm doesn’t have defects.
4. If you are working with an AI that involves neural networks, you have to check how well it has learned the mathematical formulae from its training. The network’s results on the training data that you fed it show how well the neural network algorithm fits that data.
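Step 3 above describes what is often called metamorphic testing: when the correct output for a single input is unknown, you check that the outputs for related inputs are related in the expected way. Here is a minimal sketch. The risk model, its features, and the relation "one more claim must not lower the score" are all hypothetical.

```python
def find_violations(model, base_inputs, transform):
    """Return the inputs whose transformed version scores lower than the original."""
    return [x for x in base_inputs if model(transform(x)) < model(x)]

def toy_risk_model(features):
    """Hypothetical stand-in for the algorithm under test."""
    return 0.1 * features["claims"] + 0.01 * features["age"]

def add_one_claim(features):
    """Related input: same customer with one additional claim."""
    return {**features, "claims": features["claims"] + 1}

base_inputs = [{"age": 30, "claims": 0}, {"age": 45, "claims": 2}]
violations = find_violations(toy_risk_model, base_inputs, add_one_claim)
print("violations:", violations)  # prints violations: []
```

An empty list means the relation held for every input; any entry points at an input worth investigating as a possible defect.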
The Development set
However, the training set alone is not enough to evaluate the algorithm. In most cases, the neural network will correctly determine deformed cells in images that it has seen several times. But it may perform differently when fed with fresh images. The algorithm for determining deformed cells will only get one chance to assess every image in real-life usage, and that will determine its level of accuracy and reliability. So the major challenge is knowing how well the algorithm will work when presented with a new set of data that it isn’t trained on.
This new set of data is called the development set. It is the data set that determines how you modify and adjust your neural network model. If the network performs well on both the training and development sets, it is good enough for day-to-day usage.
But if the network doesn’t do well on the development set, you need to tweak the neural network model and train it again using the training set. After that, you need to evaluate the new performance of the network using the development set. You could also have several neural networks and select one for your application based on its performance on your development set.
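Selecting one of several candidate networks by development-set performance can be sketched as follows. The candidates here are simple threshold functions standing in for trained networks, and the data is made up for illustration.

```python
def accuracy(model, dataset):
    """Share of (input, expected) pairs the model gets right."""
    return sum(model(x) == expected for x, expected in dataset) / len(dataset)

def select_model(candidates, dev_set):
    """Score every candidate on the development set and return the best one."""
    scores = {name: accuracy(model, dev_set) for name, model in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical candidates standing in for trained networks
candidates = {
    "threshold_3": lambda x: x >= 3,
    "threshold_5": lambda x: x >= 5,
    "threshold_7": lambda x: x >= 7,
}
dev_set = [(1, False), (2, False), (4, False), (5, True), (6, True), (8, True)]

best_name, dev_scores = select_model(candidates, dev_set)
print(best_name)  # prints threshold_5
```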
API integration:
1. Verify the input request and response from each application programming interface (API).
2. Conduct integration testing of API and algorithms to verify the reconciliation of the output.
3. Test the communication between components to verify the input, the response returned, and the response format & correctness as well.
4. Verify request-response pairs.
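Verifying a request-response pair usually comes down to checking the response format and the correctness of its fields. The sketch below assumes a JSON response with a `label` and a `confidence` field; that contract and the `fake_predict_endpoint` function are hypothetical and only stand in for a real API call.

```python
import json

REQUIRED_FIELDS = {"label": str, "confidence": float}  # assumed response contract

def validate_response(raw_body):
    """Return a list of problems found in a JSON response body (empty means valid)."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["response is not a JSON object"]
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for {field}")
    confidence = payload.get("confidence")
    if isinstance(confidence, float) and not 0.0 <= confidence <= 1.0:
        problems.append("confidence out of range")
    return problems

def fake_predict_endpoint(request_body):
    """Hypothetical stand-in for the real API call."""
    request = json.loads(request_body)
    return json.dumps({"label": "deformed", "confidence": 0.93,
                       "image_id": request["image_id"]})

response = fake_predict_endpoint(json.dumps({"image_id": "cell_001"}))
print(validate_response(response))  # prints []
```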
Data source and conditioning testing:
1. Verify the quality of data from the various systems by checking their data correctness, completeness & appropriateness along with format checks, data lineage checks, and pattern analysis.
2. Test for both positive and negative scenarios.
3. Verify the transformation rules and logic applied to the raw data to get the output in the desired format. The testing methodology/automation framework should function irrespective of the nature of the data, be it tables, flat files, or big data.
4. Verify if the output queries or programs provide the intended data output.
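A small part of the data quality checks, completeness and format, can be sketched like this. The field names, the date format, and the sample records are assumptions for illustration.

```python
import re

DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")  # assumed format: YYYY-MM-DD

def check_records(records, required_fields):
    """Return (row index, field, problem) for every completeness or format issue."""
    issues = []
    for index, record in enumerate(records):
        for field in required_fields:
            value = record.get(field)
            if value is None or value == "":
                issues.append((index, field, "missing"))
        date = record.get("visit_date")
        if date and not DATE_PATTERN.match(date):
            issues.append((index, "visit_date", "bad format"))
    return issues

# Hypothetical records: one clean row (positive) and two faulty rows (negative)
records = [
    {"patient_id": "P1", "visit_date": "2023-01-05"},
    {"patient_id": "", "visit_date": "05/01/2023"},
    {"patient_id": "P3"},
]
for issue in check_records(records, ["patient_id", "visit_date"]):
    print(issue)
```

Feeding both clean and deliberately faulty rows covers the positive and negative scenarios from step 2.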
System regression testing:
1. Conduct user interface and regression testing of the systems.
2. Check for system security, i.e., static and dynamic security testing.
3. Conduct end-to-end implementation testing for specific use cases like providing an input, verifying data ingestion & quality, testing the algorithms, verifying communication through the API layer, and reconciling the final output on the data visualization platform with the expected output.
Testing of AI-powered solutions:
In this part of the AI Testing Tutorial, we will be focusing on strategies to use when testing AI-powered solutions.
RPA testing framework:
1. Use open-source automation or functional testing tools such as Selenium, Sikuli, Robot Class, AutoIT, and so on for multiple purposes.
2. Use a combination of pattern, text, voice, image, and optical character recognition testing techniques with functional automation for true end-to-end testing of applications.
3. Use flexible test scripts with the ability to switch between machine language programming (which is required as an input to the robot) and high-level language for functional automation.
Chatbot testing framework:
1. Maintain the configurations of basic and advanced semantically equivalent sentences with formal & informal tones, and complex words.
2. Generate automated scripts in Python for execution.
3. Test the chatbot framework using semantically equivalent sentences and create an automated library for this purpose.
4. Automate an end-to-end scenario that involves sending a request to the chatbot, getting a response, and finally validating the response action against the accepted output.
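Here is a minimal sketch of such a script. The `toy_chatbot` function is a hypothetical keyword-based stand-in for the chatbot under test, and the intent name and sentences are made up; the point is the structure of a library of semantically equivalent sentences.

```python
def toy_chatbot(message):
    """Hypothetical stand-in for the chatbot under test; returns an intent name."""
    text = message.lower()
    if any(phrase in text for phrase in ("balance", "how much money")):
        return "check_balance"
    return "fallback"

def run_equivalence_suite(bot, suite):
    """Return (sentence, expected, actual) for every variant with the wrong intent."""
    failures = []
    for expected_intent, variants in suite.items():
        for sentence in variants:
            actual = bot(sentence)
            if actual != expected_intent:
                failures.append((sentence, expected_intent, actual))
    return failures

# Formal, informal, and complex wordings of the same request
suite = {
    "check_balance": [
        "What is my account balance?",
        "hey, how much money do I have?",
        "Kindly apprise me of my remaining funds.",
    ],
}
failures = run_equivalence_suite(toy_chatbot, suite)
print(len(failures), "failure(s)")  # prints 1 failure(s)
```

The formal third sentence is missed by the toy bot, which is exactly the kind of gap this library is meant to expose.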
Testing ML-based analytical models:
Analytical models are built by the organization for the following three main purposes.
Historical data analysis and visualization.
Predicting the future based on past data.
Prescribing course of action from past data.
Three steps of validation strategies are used while testing the analytical model:
1. Split the historical data into test & train datasets.
2. Train and test the model based on generated datasets.
3. Report the accuracy of the model for the various generated scenarios as well.
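The three validation steps can be sketched with a simple predictive model. We use a hand-written least-squares line fit and mean absolute error purely for illustration; the sales history is made up, and a real analytical model would be validated the same way with its own metric.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x

def mean_absolute_error(model, points):
    """Average absolute gap between predicted and actual values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)

# Hypothetical history: (month index, units sold)
history = [(m, 100 + 5 * m + (3 if m % 2 else -3)) for m in range(1, 25)]

# Step 1: split into train and test (time-ordered, so the test data is "future")
train, test = history[:18], history[18:]
# Step 2: train on one part, test on the other
model = fit_line(train)
# Step 3: report the error on both parts
print("train MAE:", round(mean_absolute_error(model, train), 2))
print("test MAE:", round(mean_absolute_error(model, test), 2))
```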
All types of testing are similar:
It’s natural to feel overwhelmed after seeing such complexity. But a tester who is able to see through the complexity will find that the foundation of testing is quite similar for both AI-based and traditional systems. What we mean is that the specifics might be different, but the processes are almost identical.
First, you need to determine and set your requirements. Then you need to assess the risk of failure for each test case before running the tests and determining whether the weighted aggregated results are at or above a predefined level. After that, you need to run some exploratory testing to find biased results or bugs, as in regular apps. Like we said earlier, you can master AI testing by building on your existing knowledge.
With all that said, we know that an AI-based system can produce different outputs for the same input when it is run again and again, since the ML algorithm is a learning algorithm. Also, most applications today have some type of machine learning functionality to improve how they interact with users. AI inclusion on a much larger scale is inevitable, as we humans will stop at nothing until the software we create has human-like functionalities. So it’s necessary for us to adapt to the progress of this AI revolution.
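The weighted aggregated result can be sketched as a risk-weighted pass rate compared against a predefined level. The weights, results, and the 0.85 threshold below are assumptions for illustration.

```python
def weighted_pass_rate(results):
    """results: list of (risk_weight, passed). Returns the weighted share that passed."""
    total = sum(weight for weight, _ in results)
    if total == 0:
        raise ValueError("at least one test case needs a positive weight")
    return sum(weight for weight, passed in results if passed) / total

# Hypothetical test results: higher weight means higher risk of failure
results = [(5, True), (3, True), (1, False), (1, True)]
REQUIRED_LEVEL = 0.85  # assumed predefined level

score = weighted_pass_rate(results)
print(score, score >= REQUIRED_LEVEL)  # prints 0.9 True
```

Weighting by risk means one failing low-risk case does not block a release, while a failing high-risk case pulls the score down sharply.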
We hope that this AI Testing Tutorial has helped you understand AI algorithms and their nature, so that you can tailor your own test strategies and test cases to your needs. Applying out-of-the-box thinking is crucial for testing AI-based applications. As a leading QA company, we always implement state-of-the-art strategies and technologies to ensure quality, irrespective of whether the software is AI-based or not.