Measurement matters. Make your AI count.
Is your AI providing reliable results? Is it missing important data from your data set? Or is it hallucinating — as we’ve seen it can do when it quotes and cites articles, legal cases, or other sources that do not exist?
Validation of results is important with any type of technology; even carpenters measure both before and after they use a saw. Verification (confirming that software is built to spec) and validation (measuring results) are standard in software development. The latter is more important than ever with AI, a black-box technology that can take account of exponentially more features and weight their importance, creating increasingly opaque models. The interwoven elements and processes that affect results (modeling, data, and training, for example) may generate surprising outcomes that are difficult to trace back to a root cause. Validation is ultimately in the hands of the user. The literature provides a multiplicity of benchmarks and metrics for assessing results, only some of which are truly enlightening.
It helps to remember that AI – regardless of the specific application incorporating the technology – is fundamentally grounded in a search capability. To be effective, AI has to find the information that is responsive to your query and weight it appropriately. Generative AI, going further, then crafts a combination of words, in a particular style, to articulate those search results.
Seen in this light, assessing the output of an AI application is rooted in the tried-and-true ways we’ve been using to assess the effectiveness of search systems for the past fifty years.
At a fundamental level, this assessment typically involves sampling the data population to estimate what the search should find, comparing the system’s actual results against that estimate to establish its effectiveness, and then using the findings to recalibrate the system. This iterative process is repeated until the system is consistently retrieving the expected results.
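To make the sampling idea concrete, here is a minimal Python sketch. It assumes the two classic search-effectiveness measures, precision (how much of what was retrieved is relevant) and recall (how much of what is relevant was retrieved), which are one common choice among several. The function names, the `is_relevant` reviewer callback, and the document IDs are all hypothetical and used only for illustration.

```python
import random


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items the system actually found
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def estimate_from_sample(population_ids, retrieved_ids, is_relevant,
                         sample_size, seed=0):
    """Estimate precision and recall from a random sample of the population.

    is_relevant is a hypothetical callback standing in for a human reviewer
    who labels each sampled document as relevant or not.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = rng.sample(sorted(population_ids), sample_size)
    retrieved = set(retrieved_ids)
    sample_relevant = [d for d in sample if is_relevant(d)]
    sample_retrieved = [d for d in sample if d in retrieved]
    return precision_recall(sample_retrieved, sample_relevant)
```

In practice you would draw a sample, have reviewers label it, compute the estimates, adjust the data or the search, and repeat with a fresh sample until the numbers stabilize at an acceptable level. Larger samples give tighter estimates at the cost of more review effort.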
If you are relying on the results for your business, this validation is important. You may find that data enhancement, tailoring of the search, or perhaps a change in the model is warranted to improve your results. After all, your uses are individual and your goals are unique. Used effectively, AI is a potentially transformative technology, but you need the requisite expertise to know if it is supplying the optimal results.
WOS AI Solutions can help. WOS has that data assessment expertise along with decades of providing skilled technical resources from local communities.
We look forward to connecting with you.