How do I properly format text to be used in deep learning text generation?

I'm currently following the 'Generate Text Using Deep Learning' example, but using a different piece of text.
I don't understand where this part of the code comes from:
I understand what it does, as I can see it in the text, but where does \x2403 come from? The reason I ask is that in my text, everywhere there is an apostrophe, whether in a word like can't or where there are quotes, this symbol shows up: Ô ...
Later on, when I try to train, I get this error:
Error using trainNetwork (line 165)
Invalid training data. Labels must not contain undefined values.
Error in txtgen (line 73)
net = trainNetwork(XTrain,YTrain,layers,options);
I'm not sure if this is related, but either way I don't think the Ô should be there...

Answers (1)

Harshit Jain on 29 Mar 2019
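Values of the form \x0002 are Unicode values for the respective characters, written as hexadecimal code points: \x0002 is the start-of-text control character and \x2403 is the "symbol for end of text" character. You can read more about Unicode characters here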
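
As a rough illustration of the point above, here is a minimal sketch of how such escapes turn into characters with compose, and how you might list the code points in your own text to find an unexpected character such as Ô. The variable names and the sample string are hypothetical and not taken from the example code. Whether the Ô comes from the file being read with a different encoding than it was saved in is only a guess on my part, and I can't say from the post whether it is what triggers the trainNetwork error.

```matlab
% Characters written as hexadecimal Unicode escapes.
startOfTextCharacter = compose("\x0002");   % U+0002, start-of-text control character
endOfTextCharacter   = compose("\x2403");   % U+2403, symbol for end of text

% double() on a char returns its code point.
startCode = double(char(startOfTextCharacter));   % 2
endCode   = double(char(endOfTextCharacter));     % 9219, i.e. hex 2403

% Hypothetical sample standing in for text read from a file,
% with U+00D4 (the Ô from the question) where an apostrophe should be.
sampleText = ['can' char(212) 't'];

% List every distinct character together with its code point in hex.
codes = unique(double(sampleText));
for k = 1:numel(codes)
    fprintf('%s  U+%04X\n', char(codes(k)), codes(k));
end

% Replace the unexpected character with a plain apostrophe.
cleanText = strrep(sampleText, char(212), '''');
```

Running the loop on your real text should show which code points are present, so you can decide which ones to replace before building the training data.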
