September 2025

Tech Talk

Software testing in the age of AI

By Michael E. Duffy

Now that anyone can program, thanks to programming-savvy chatbots like ChatGPT and Claude, it’s important to remember that software should be accurate, reliable and safe to use. That’s the realm of engineering.

My number one method for ensuring accuracy, reliability and safety in software is testing. From experience, I can say that it can be tedious to write a comprehensive set of tests, often resulting in more test code than code being tested. In fact, one development methodology, Test-Driven Development, starts by writing the tests, and then the software.

Typically, software engineers talk about the testing pyramid: At the bottom are unit tests, which, as the name implies, test individual functional units of code. At the top are end-to-end tests, which test the overall operation of the system. Integration tests, which test how units interact, are in the middle.

Let’s say I want to write software that adds two numbers together. It should work correctly whether either of the two numbers is positive, negative or zero. It should give the same result regardless of order. In some languages, like Python and JavaScript, it’s possible to provide values that aren’t numbers; this should obviously fail loudly. A competent software engineer will write a set of unit tests to ensure that the software functions as expected for all types of values.

Alas, even software engineers are human, and sometimes fail to provide complete test coverage. As it turns out, this is an area where AI coding assistants can really shine. ChatGPT identified 26 distinct tests that it would write in order to test my hypothetical addition function (written in JavaScript). It even included some that I would not have thought of, involving the smallest and largest numbers that JavaScript can represent. If I had asked it to, ChatGPT would have written the code, a large part of which is simply boilerplate. Of course, I would have to review the code it wrote to make sure that each test was actually testing the thing it purported to check, but that’s a much easier task.

Perhaps the biggest value of tests is that they increase your confidence that a change to software hasn’t broken anything. For example, if I decide to improve a piece of code or fix a bug, existing tests let me know immediately if my improvement or fix has introduced a new problem.

In fact, when a bug is discovered in existing software, a good software engineer will write a test that demonstrates the bug before fixing the problem. The code can only be termed “fixed” if all the tests, including the one that exercises the problem, run without error.

End-to-end tests typically start out very simply, exercising the basic functions of a system. Tests are added to cover problems that are encountered in actual use. As the suite of available tests grows over time, confidence that the software is, in fact, reliable and accurate also increases.

For me, that is one of the issues with so-called “vibe coding.” As most people practice it, vibe coding is an unstructured approach to software development. In general, testing a vibe-coded app means seeing if it does mostly what you want it to do, rather than ensuring that it gives correct results in all circumstances, and fails unambiguously when it can’t. Obviously, “it works for me” is fine for personal applications or quick demonstrations. I’ve done it myself. But for production-quality software, that sort of uncertainty is simply unacceptable.

The good news is that AI can be used effectively to set up a basic framework of tests that establish correct operation. You describe to your chosen AI tool how the software should behave, and the AI creates the tests that ensure it behaves that way. You can then test the software, vibe-coded or otherwise, against that framework. When a problem arises, AI can generate a new test case, and AI can provide a suggested fix that passes all tests.

Software development is undergoing a huge upheaval due to the advances in AI-assisted programming. I feel it myself. Regardless, testing remains the primary method of delivering software that is accurate and safe to use. Fortunately, AI can assist with testing as well, which bodes well for the quality of future software.

…

A brief addendum to last month’s column about my wife’s PC malfunction, which, at press time, I was unable to resolve. I finally gave up and decided to wipe my wife’s computer clean and reinstall Windows. Before I did that, though, I needed to open up the BIOS setup on her laptop in order to boot from a recovery USB. I noticed that the same problem (can’t type the same character twice in a row) occurred in the BIOS setup, indicating that it had nothing to do with Windows. So, on a hunch, I just saved the current BIOS setup, making no changes, and rebooted normally. Presto! Problem solved. I still have no idea what really went wrong, but I’m grateful to have avoided a clean install. My wife was pretty happy, too.
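For readers curious what the unit tests described in the addition example might look like, here is a minimal sketch. It is written in Python rather than the JavaScript mentioned in the column, and the names `add` and `AddTests` are hypothetical, chosen only for illustration. It covers the cases the column lists (positive, negative and zero values, order independence, non-numbers failing loudly, and very large values) and is not the set of 26 tests ChatGPT proposed.

```python
import sys
import unittest


def add(a, b):
    """Hypothetical addition function: returns a + b and rejects non-numbers."""
    for value in (a, b):
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"add() expects numbers, got {type(value).__name__}")
    return a + b


class AddTests(unittest.TestCase):
    def test_positive_negative_and_zero(self):
        self.assertEqual(add(2, 3), 5)
        self.assertEqual(add(-2, -3), -5)
        self.assertEqual(add(-2, 3), 1)
        self.assertEqual(add(0, 0), 0)
        self.assertEqual(add(7, 0), 7)

    def test_order_does_not_matter(self):
        for a, b in [(1, 2), (-4, 9), (0, -6), (2.5, 1)]:
            self.assertEqual(add(a, b), add(b, a))

    def test_non_numbers_fail_loudly(self):
        # Each bad value must raise in either argument position.
        for bad in ("1", None, [1], True):
            with self.assertRaises(TypeError):
                add(bad, 1)
            with self.assertRaises(TypeError):
                add(1, bad)

    def test_extreme_values(self):
        largest = sys.float_info.max  # largest finite float Python can represent
        self.assertEqual(add(largest, 0), largest)
        self.assertEqual(add(largest, -largest), 0.0)
        # Float addition that overflows yields infinity rather than an error.
        self.assertEqual(add(largest, largest), float("inf"))
```

Saved to a file, the tests can be run with `python -m unittest`. When a bug turns up later, the practice described in the column would be to add one more test method that reproduces it before changing `add`.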

Michael E. Duffy is a senior software engineer and lives in Sonoma County. He has been writing about technology and business for NorthBay biz since 2001.


