AI Chat playground accuracy metrics
Bob Gastineau
on 19 Nov 2024
Latest activity Reply by Walter Roberson
on 4 Dec 2024
I was curious to try out your new AI Chat Playground.
The first screen that popped up made the statement:
"Please keep in mind that AI sometimes writes code and text that seems accurate, but isnt"
Can someone elaborate on what exactly this means with respect to the AI Chat Playground's integration with the MATLAB tools?
Are there any accuracy metrics for this integration?
OK, so there are no metrics on MATLAB-plus-AI-LLM accuracy?
Let me understand what you are attempting to do here:
- Integrate MATLAB math and scientific tools with a third-party AI tool, where it is known that the AI LLM does not generate accurate results.
- There are no published metrics on your AI integration.
- There is no third-party qualification of the accuracy and performance of the tool.
- You expect the community to debug your tool with no resolution on accuracy in sight?
Is it realistic to expect engineers to accept and use a tool that does not generate accurate results? Your MATLAB AI LLM integration apparently generates random garbage that needs to be fact-checked.
@Robert Gastineau, the statement describes the fact that large language models are non-deterministic and can generate responses that are inaccurate. For example, MATLAB code in a response may appear correct, in that there are no obvious issues such as malformed statements. However, it can contain function names, properties, or arguments that are made up (hallucinations). Having the code editor in the AI Chat Playground allows you to quickly analyze and run the code to verify whether it is correct.
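As a quick sanity check before running model-generated code, you can ask MATLAB whether each function the model calls actually exists on the path. A minimal sketch, where `madeUpSolver` is a deliberately fictitious name standing in for a hallucinated function:

```matlab
% Check whether functions referenced in generated code are real.
% exist(...,'file') returns a nonzero code if MATLAB can locate the
% function, and 0 if no such file or built-in is found.
exist('trapz', 'file')         % nonzero: trapz is a shipping function
exist('madeUpSolver', 'file')  % 0: hypothetical hallucinated name

% 'which' reports where a function is defined, which also helps spot
% an unintended shadowing file on your own path.
which trapz
```

Running the snippet in the Playground's editor surfaces a hallucinated call immediately, since MATLAB will also raise an "Unrecognized function or variable" error the moment such a name is executed.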
Language models are improving quickly and these types of issues are becoming less frequent.