Can Generative AI Write a Good Test Approach?

In today's rapidly evolving technological landscape, the realm of artificial intelligence (AI) continues to push the boundaries of what was once thought possible. One fascinating application of AI is generative AI, a branch that focuses on the creation of original and coherent content.

While generative AI has made impressive strides in various domains such as music, art, and text generation, a compelling question arises: Can generative AI write a good test approach?

Testing plays a pivotal role in software development, ensuring the reliability and quality of applications. In this blog post, we delve into the potential of generative AI to revolutionise the way we approach testing and explore the challenges and opportunities it presents. Join us as we embark on a thought-provoking exploration of the intersection between generative AI and test methodologies.

(Intro written by ChatGPT)

ChatGPT and other generative AI content creators are all the rage across the industry, with many heralding them as the future of software development. But can they be used to drive a meaningful and useful test approach? Could we use them to give insights into testing and quickly create testing documentation for teams?

I’ve been looking to answer these questions by experimenting with these tools and reviewing the results.

Fig 1. GLaDOS – The original testing AI.

What is Generative AI?

For anyone who isn’t aware, Generative AI tools learn from existing online content in order to create new content. You ask for something (a picture, a story, a song) and the tool looks for similar content and algorithmically creates something based on what it finds.

This is mostly used across the creative industries: we see things like TV episodes partially written by Generative AI, and we’re now seeing a push in technology organisations to explore whether these tools can be used for software development.

In testing, there are a number of posts about using Generative AI to support automation efforts or to create scripted tests. It’s also being used to support the creation of better code and content, with tools reviewing, summarising, or reformatting things to be more concise.

Let’s write a test approach

Experiment 1: Ask ChatGPT to write a test approach using very basic prompts.

Vid 1. ChatGPT writing a test approach for Experiment 1

I used a very minimal prompt to generate a test approach:

“Write a test approach for a web app using exploratory testing and automation.”

The approach created is very high level, capturing the areas mentioned (automation and exploratory testing), but it gives no insight into how these can be done.

Exploratory Testing: Conduct exploratory testing sessions to uncover defects, usability issues, and unexpected behaviours. Testers will explore the application without predefined test scripts, utilising their expertise and intuition. Focus on areas such as user workflows, boundary conditions, error handling, and compatibility across different devices and browsers.

We can see that there’s a basic level of understanding about exploratory testing, mostly that it’s unscripted manual testing focused on business and end-to-end workflows. There’s no content around uncovering meaningful information related to risks, critical analysis, pushing testing left, user sessions, or testing closer to the code (or requirements).

Automation Framework Design: Select an appropriate automation framework and toolset based on the web application's technology stack and requirements. Design an automation framework that supports modular and maintainable test scripts, data-driven testing, and reporting. Consider using tools like Selenium WebDriver, Cypress, or Puppeteer for browser automation.

The automation section appears to be similarly focused on end-to-end testing, suggesting a range of tools rather than helping us to drive out what might be useful. There’s no discussion of closer-to-the-code automation (unit, integration), nor anything about best practices such as matching the testing language to the coding language.
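
To illustrate the gap, here’s a minimal sketch of the kind of closer-to-the-code test the approach never mentions: a Jest unit test written in the same language as the application (the formatPrice function is invented for the example).

```typescript
// A hypothetical pure function from the web app's codebase.
export function formatPrice(pence: number): string {
  if (!Number.isFinite(pence) || pence < 0) {
    throw new RangeError(`Invalid price in pence: ${pence}`);
  }
  return `£${(pence / 100).toFixed(2)}`;
}

// The unit tests live next to the code, in the same language as the app.
describe('formatPrice', () => {
  it('formats whole pence values as pounds', () => {
    expect(formatPrice(500)).toBe('£5.00');
  });

  it('rejects negative values', () => {
    expect(() => formatPrice(-1)).toThrow(RangeError);
  });
});
```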

Test Execution: Execute both manual exploratory testing and automated test scripts as per the defined test plan. Conduct exploratory testing sessions to identify any new defects or issues that may have been missed during scripted testing. Execute automated tests in different environments, browsers, and devices to ensure compatibility and consistent behaviour.

The test execution section gives us some details about the why of testing: running compatibility checks and using exploratory testing to complement automation. Again this is pretty simple, and it doesn’t give a view of regression testing for confirming acceptance criteria alongside exploration to find new information related to risks. There’s an assumption here that testing === bug hunting, which is a pretty basic view of testing.
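
On the “different environments, browsers, and devices” point, the response stops at the instruction. In practice that might look something like the sketch below: a Cypress config that switches target environments per run (the URLs and environment names are assumptions for illustration).

```typescript
import { defineConfig } from 'cypress';

// Hypothetical per-environment URLs; real values would live in CI config.
const baseUrls: Record<string, string> = {
  dev: 'http://localhost:3000',
  staging: 'https://staging.example.com',
};

export default defineConfig({
  e2e: {
    setupNodeEvents(_on, config) {
      // Pick the target with: cypress run --env target=staging --browser firefox
      config.baseUrl = baseUrls[(config.env.target as string) ?? 'dev'];
      return config;
    },
  },
});
```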

Overall we get a very basic test approach which, not surprisingly, is not tailored to our needs. The approach shows a basic knowledge of both automation and exploratory testing but doesn’t give insights as to why we use them, beyond increasing coverage and finding more bugs.

Experiment 2: Ask ChatGPT to write a test approach using more tailored prompts.

Vid 2. ChatGPT writing a test approach for Experiment 2

In this experiment I provided more details about the team and about the testing:

Generate a test approach focused on integration between a web front end and database using REST APIs.

  • The team will be an agile team with an embedded tester.
  • Use full stack automation including JEST for unit testing and Cypress for API based tests.
  • Use exploratory testing to uncover risks beyond what is documented in the User Stories.
  • The application will have a load of 1000 users an hour.
  • The application will be deployed to the cloud.
  • The front end will need to reach AA standards for accessibility.

Generally, the response from ChatGPT regurgitates the things that I’d told it, but in long form. It does add some sections (reporting and environment setup) by itself and relates them back to what I’d told it.

Test Environment Setup:
- Set up a test environment that closely resembles the production environment, including the cloud infrastructure and database configuration.
- Configure the necessary test data in the database to support integration testing scenarios.

Although this tells me about the cloud and the integration testing that I’d specified, there’s nothing about needing data for load testing or data to support various negative cases. It also presupposes a single environment, as opposed to environments for different customer setups or even a development or staging environment.
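
A team-written approach would usually make that data concern concrete. As a minimal sketch of what seeding awkward data might look like, assuming a hypothetical /api/orders endpoint and Node 18+ for the global fetch:

```typescript
// seed-data.ts: run before a test cycle to populate the environment.
// The endpoint and payload shape are invented for the example.
const API = process.env.API_URL ?? 'http://localhost:3000';

const seedOrders = [
  { customer: 'Ada Lovelace', totalPence: 1999 }, // happy path
  { customer: '', totalPence: 1999 },             // missing name (negative case)
  { customer: 'Õlèg 😀', totalPence: 0 },         // awkward characters, zero value
];

for (const order of seedOrders) {
  const res = await fetch(`${API}/api/orders`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(order),
  });
  console.log(`Seeded ${order.customer || '(blank)'}: HTTP ${res.status}`);
}
```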

API Testing:
- Utilise Cypress or another API testing tool to create tests that validate the integration between the web front end and the database via REST APIs.
- Test various API endpoints for correct data retrieval, manipulation, and storage.
- Validate response codes, response payloads, and data consistency.

The API section is pretty good: it gives some more context around what to look for and, with some work, could be made meaningful. It’s still a basic understanding of this testing, though, and doesn’t provide insights into why we’d use this approach; it’s mostly saying “do this because you told me to say to do it”.
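
For what it’s worth, those bullet points do translate into workable tests. Here’s a minimal Cypress sketch of the kind of API test it’s describing, validating status codes, payloads, and data consistency (the /api/orders endpoint is invented for the example):

```typescript
describe('Orders API integration', () => {
  it('stores and returns an order consistently', () => {
    cy.request('POST', '/api/orders', { customer: 'Ada', totalPence: 1999 })
      .then((create) => {
        expect(create.status).to.eq(201); // created
        return cy.request('GET', `/api/orders/${create.body.id}`);
      })
      .then((read) => {
        expect(read.status).to.eq(200);
        expect(read.body.customer).to.eq('Ada'); // data consistency
      });
  });

  it('rejects a malformed order', () => {
    cy.request({
      method: 'POST',
      url: '/api/orders',
      body: { totalPence: -1 },
      failOnStatusCode: false, // we expect a 4xx, so don't fail the test on it
    }).its('status').should('be.within', 400, 499);
  });
});
```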

Load Testing:
- Simulate a load of 1000 users per hour using load testing tools such as JMeter or Gatling.
- Measure the system's response time, throughput, and resource utilisation under this load.
- Identify any performance bottlenecks or scalability issues and address them accordingly.

Interestingly, the load testing section doesn’t suggest reusing the API tests with something like k6. My assumption is that it generally only knows about unit and end-to-end based load testing, so it suggested tools around that (I think we could use Jest, which I’d specified for unit testing, rather than JMeter here).
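
To show what reuse could look like, here’s a minimal k6 sketch hitting the same hypothetical endpoint as the API tests, with the stated 1000 users an hour modelled as a constant arrival rate (k6 scripts are JavaScript):

```javascript
import http from 'k6/http';
import { check } from 'k6';

// Model the stated load: 1000 users an hour, as a constant arrival rate.
export const options = {
  scenarios: {
    hourly_load: {
      executor: 'constant-arrival-rate',
      rate: 1000,
      timeUnit: '1h',
      duration: '1h',
      preAllocatedVUs: 20, // VUs k6 keeps ready to hit the target rate
    },
  },
};

export default function () {
  // Same hypothetical endpoint as the API tests, reused under load.
  const res = http.get('https://staging.example.com/api/orders');
  check(res, { 'status is 200': (r) => r.status === 200 });
}
```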

Accessibility Testing:
- Employ accessibility testing tools and techniques to ensure the front end adheres to AA standards.
- Verify that the application is navigable using keyboard-only controls.
- Test for proper use of alt tags, proper labelling of form elements, and appropriate colour contrast.

The other non-functional area, accessibility, again adds some additional context, but this is very limited. It recommends basic visual accessibility tests and nothing about operability, understandability, or robustness. This can be dangerous if people take it as confirmation that a limited understanding of a type of testing can be justified as correct.
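
Even the automatable slice of AA needs more than the response suggests. As one possibility (my choice, not ChatGPT’s), the cypress-axe plugin can run axe-core’s WCAG A and AA rules within the same suite; a minimal sketch:

```typescript
// Assumes the cypress-axe plugin and axe-core are installed;
// this import usually lives in cypress/support/e2e.ts.
import 'cypress-axe';

describe('Accessibility', () => {
  it('has no detectable WCAG A/AA violations on the home page', () => {
    cy.visit('/');
    cy.injectAxe();
    cy.checkA11y(undefined, {
      runOnly: { type: 'tag', values: ['wcag2a', 'wcag2aa'] },
    });
  });
});
```

Automated checks like this only catch a subset of AA issues; keyboard-only and screen reader sessions are still needed for the operable and understandable parts.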

Exploratory Testing:
- Conduct exploratory testing to uncover risks beyond what is documented in the user stories.
- Perform ad-hoc testing to identify integration issues, edge cases, and potential usability problems.
- Document any defects, unexpected behaviours, or areas of improvement.

This same danger is seen in how exploratory testing is written about: here it’s advocating ad hoc testing, and there are no details around risk analysis or uncovering information through planned exploration. The only reason it references risks and going beyond the acceptance criteria is because I told it that in the query.

Experiment 3: Ask ChatGPT to write a specific part of a test approach by giving it more prompts.

Vid 3. ChatGPT writing a test approach section for Experiment 3

In this experiment I asked for a single section of the approach and provided details about the team:

Write a roles and responsibilities part of a test approach that details how specific agile team members will contribute to exploratory testing.

  • The team includes product owner, developers, designer, scrum master and test champion.
  • The product is a UI front end only.
  • Exploratory testing will look to uncover information related to risks outside of the acceptance criteria of a ticket.
  • All team members are expected to contribute to exploratory testing, the test champion will provide coaching on how to do this.
  • The product owner will have final say on whether issues found through testing will be fixed.

This was the best response back from ChatGPT; whether this was because I asked for a smaller section or because it was an easier thing to write, I’m not sure.

Product Owner:
- Understand the goals and objectives of the exploratory testing approach.
- Collaborate with the team to identify potential risks and areas to explore beyond the acceptance criteria of a ticket.
- Review the issues and bugs reported during exploratory testing and make the final decision on whether they should be fixed or addressed.
- Provide guidance on prioritising the identified issues based on their impact on the product's overall value and quality.

This is something we can use! It even adds a prioritisation element that I hadn’t told it about.

Developers:
- Actively participate in exploratory testing sessions to identify potential risks and uncover issues that may not be explicitly specified in the acceptance criteria.
- Collaborate with the test champion and other team members to share knowledge, insights, and ideas for exploring the application from a technical standpoint.
- Take ownership of investigating and debugging any technical issues identified during exploratory testing.
- Collaborate with the test champion to develop test automation scripts or frameworks to support efficient and effective exploratory testing.

I think this is good too, especially the part about insights and ideas for exploring the application from a technical standpoint. It also suggests working automation in to support exploration, which is generally a good idea (see the sketch below).
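
As a hypothetical illustration of automation supporting exploration, a custom Cypress command could seed the app into an interesting state so a session starts where the risk is (the test-support endpoint is invented):

```typescript
// cypress/support/commands.ts: a hypothetical custom command that uses a
// made-up test-support endpoint to drop the app into a named state, so an
// exploratory session starts at the risky bit instead of at a login form.
// (A TS project would also declare seedSession on Cypress.Chainable.)
Cypress.Commands.add('seedSession', (scenario: string) => {
  cy.request('POST', '/api/test-support/seed', { scenario });
  cy.visit('/dashboard');
});

// During a session: cy.seedSession('order-with-failed-payment');
```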

Okay, how about the role of the tester then?

Test Champion:
- Act as the primary advocate and coach for exploratory testing within the team.
- Provide training and guidance to team members on the principles, techniques, and best practices of exploratory testing.
- Collaborate with the product owner, developers, and designer to identify areas of the product that need special attention during exploratory testing.
- Conduct exploratory testing sessions, lead discussions, and provide insights into test coverage and potential risks.
- Assist in the documentation of issues, bugs, and feedback generated during exploratory testing.
- Collaborate with the development team to develop and maintain a comprehensive test automation strategy to support exploratory testing efforts.

Not bad; this is pretty good too. It covers the collaboration and coaching that’s required from a team (understanding that a test champion is a coach).

Overall these roles and responsibilities are a good starting point and could be included in a test approach with minimal rework.

Conclusions and Thoughts

At present, Generative AI can write a basic test approach, but there’s a lack of meaning behind it. It also comes with the biases of what’s documented online, meaning that if a lot of articles think exploration is ad hoc, then that’s what it’ll suggest.

It’s basic garbage in, garbage out: if a lot of the articles and content online are basic in their understanding, then that’s what ChatGPT will spit out. It’s not able to say “oh, here’s a modern take on agile testing” or filter what’s on the web to give you what you really need to know. It can only tell you what the prevailing trends are, based on the number of articles saying the same thing.

By giving more prompts and telling the tool more about the type of testing we want, we can get it to say more. But that means we basically have to write a test approach ourselves and give it that to reformat.

What works better is using Generative AI to create a section of an approach, like roles and responsibilities, where there’s more general consensus on how it should look.

I’ll have to do more research into ChatGPT and see if I can get my prompts to produce something more useful. What about you? Have you had any successes using Generative AI?
