bleuEvaluationScore Argument 1 must be a tokenizedDocument scalar.

Hi
I am trying to use the bleuEvaluationScore example, and it works fine with the simple example text.
When I try to use my own text, I get the error "Argument 1 must be a tokenizedDocument scalar."
Any idea why I get this error? I am using version R2023b under Windows 11.
Brahim

Answers (3)

Sahas on 5 Sep 2024 at 14:05
Edited: Sahas on 5 Sep 2024 at 14:50
Hi,
As I understand it, you are calculating the BLEU score with the “bleuEvaluationScore” function. The example on the MathWorks documentation page runs without problems, but you get an error when you use your own text.
The error “Argument 1 must be a tokenizedDocument scalar” refers to an incorrect data type for the “candidate” input argument of “bleuEvaluationScore”. Make sure that “candidate” is a single (scalar) “tokenizedDocument”. Avoid the "strsplit" function, as it results in incompatible data types.
The documentation of “bleuEvaluationScore” function states that, “If candidate is not a tokenizedDocument scalar, then it must be a row vector representing a single document, where each element is a word.”
Refer to the following MathWorks documentation link for more information: https://www.mathworks.com/help/textanalytics/ref/bleuevaluationscore.html
Here is the sample code snippet I used to reproduce the error:
% Example text
referenceText = "The quick brown fox jumps over the lazy dog.";
candidateText = "The fast brown fox leaps over the lazy dog.";
% If you use "strsplit", you will get the same error.
% Check the resulting data type in MATLAB's Workspace:
% referenceText = strsplit(referenceText)
% candidateText = strsplit(candidateText)
referenceDoc = tokenizedDocument(referenceText)
candidateDoc = tokenizedDocument(candidateText)
% Calculate BLEU score
score = bleuEvaluationScore(candidateDoc, referenceDoc);
% Display the BLEU score
disp(score);
Hope this helps!

Brahim HAMADICHAREF on 6 Sep 2024 at 2:39
Instead of a simple sentence like "The quick brown fox jumps over the lazy dog.",
I use article titles, for example:
"Low-temperature water-gas shift reaction over Au/CeO2 catalysts"
I have 84 of them.
Then I use another article title as the reference:
"Remarkable Performance of Ir1/FeOx Single-Atom Catalyst in Water Gas Shift Reaction"
function newTest
clc
clear
close all
str = [
"Low-temperature water-gas shift reaction over Au/CeO2 catalysts" ; ...
"Comparative studies of low-temperature water-gas shift reaction over Pt/CeO2, Au/CeO2, and Au/Fe2O3 catalysts" ; ...
"Low temperature water-gas shift: in situ DRIFTS-reaction study of ceria surface area on the evolution of formates on Pt/CeO2fuel processing catalysts for fuel cell applications" ; ...
"Water gas shift reaction on carbon-supported Pt catalysts promoted by CeO2" ; ...
"Fabrication of Pt/CeO2 nanofibers for use in water-gas shift reaction" ; ...
"Comparative study on nano-sized 1 wt\% Pt/Ce0.8Zr0.2O2 and 1 wt\% Pt/Ce0.2Zr0.8O2 catalysts for a single stage water gas shift reaction" ; ...
"Simultaneous water gas shift and methanation reactions on Ru/Ce0.8Tb0.2O2-x based catalysts" ; ...
];
disp(['str has ' num2str(size(str, 1)) ' entries'])
%
strRef = [ "Remarkable Performance of Ir1/FeOx Single-Atom Catalyst in Water Gas Shift Reaction" ];
documents = tokenizedDocument(str)
disp('class(documents)')
class(documents)
size(strRef)
references = tokenizedDocument(strRef)
disp('class(references)')
class(references)
score = bleuEvaluationScore(documents, references)

Brahim HAMADICHAREF on 6 Sep 2024 at 2:40
class(documents) is 'tokenizedDocument'.
class(references) is also 'tokenizedDocument'.
I do not understand the error!
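A possible explanation, going by the wording of the error message and the documentation for "bleuEvaluationScore": the class is correct, but the size is not. Calling tokenizedDocument on a string array with several entries returns one document per entry, so "documents" here is a tokenizedDocument array rather than a scalar, while the first argument must be a single document. Below is a minimal sketch of one way around this, assuming the goal is one BLEU score per title against the same reference. Only two of the titles are used for brevity, and the names "scores" and "k" are illustrative.

```matlab
% Two of the candidate titles from the post above. tokenizedDocument(str)
% returns one document per string, so "documents" is a 2-by-1 array.
str = [
    "Low-temperature water-gas shift reaction over Au/CeO2 catalysts"
    "Water gas shift reaction on carbon-supported Pt catalysts promoted by CeO2"
    ];
strRef = "Remarkable Performance of Ir1/FeOx Single-Atom Catalyst in Water Gas Shift Reaction";

documents  = tokenizedDocument(str);
references = tokenizedDocument(strRef);

% The class alone does not reveal the problem; the size does.
disp(isscalar(documents))   % false when str has more than one entry

% Score each candidate separately: documents(k) is a scalar tokenizedDocument.
scores = zeros(numel(documents), 1);
for k = 1:numel(documents)
    scores(k) = bleuEvaluationScore(documents(k), references);
end
disp(scores)
```

With all 84 titles in "str", the same loop would produce an 84-by-1 vector of scores. Checking size(documents) or isscalar(documents) before the call is a quick way to confirm whether this is the cause.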

Release

R2023b
