Extract regexp tokens with regexpPattern

With regexp I could extract the tokens of my capture groups via
regexp("abcd3e", "\w+(\d)+\w", "tokens")
ans = 1×1 cell array
{["3"]}
The result is a cell array. With the new regexpPattern and extract functions, the return values are usually strings or string arrays, which I prefer.
Question: Is there an analogue of the above regexp call, something like extract("abcd3e", regexpPattern("\w+(\d)+\w"), "tokens")? This syntax obviously does not work in R2023b, but is there a standard way to rewrite such patterns so that they return my tokens?
Thanks,
Jan
EDIT: This is just a toy example. I do not only want to extract digits, which could be done with digitsPattern. Ideally, I'd like to understand how to translate the regexps directly.
To show a more realistic example:
str = [
"42652Z_HEX"
"42652X"
"42652Y"
"42652Z"
"42652GYRO-X_HEX"
"42652GYRO-Y_HEX"
"42652GYRO-Z_HEX"
"42351Temp_HEX"
"42652Temp_HEX"
"42652GYRO-X"
"42652GYRO-Y"
"42652GYRO-Z"
"42351Temp"
"42652Temp"
];
res = string(regexp(str, "\d+(?:GYRO-)?([XYZ])?.*", "tokens"))
res = 14×1 string array
    "Z"
    "X"
    "Y"
    "Z"
    "X"
    "Y"
    "Z"
    ""
    ""
    "X"
    "Y"
    "Z"
    ""
    ""
% how to get the same result with matches and regexpPattern?

2 Comments

If you just want to extract numbers between letters -
str = "abcd3e57xyz";
out = extract(str, digitsPattern)
out = 2×1 string array
    "3"
    "57"
Thanks for your answer.
No, I do not only want to extract numbers, it's a toy example. I'd like to translate the regexps which already exist into the new regexpPattern - if possible. The regexp might get more complicated than the shown one. I'll edit my question accordingly.


Answers (1)

I realize that this is not really an answer to your question, but I just wanted to make sure you are aware that one option is to wrap the string function around the regexp:
string(regexp("abcd3e fghi4j", "\w+(\d)+\w", "tokens"))
ans = 1×2 string array
"3" "4"
Also, if you are guaranteed to have only one match, you could do
regexp("abcd3e", "\w+(\d)+\w", "tokens","once")
ans = "3"
but that's somewhat fragile coding, I would say.
I'm not yet sure if there is a more "direct" way with more recent functions.
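One possible direction with the pattern functions, given only as a sketch under assumptions: extract has no "tokens" option, but a capture group can sometimes be emulated by moving the surrounding context into lookBehindBoundary and lookAheadBoundary, so that only the token remains as the match. The snippet below assumes the token is a single digit with one letter on each side. That is narrower than the original \w, which also accepts digits and underscores, so it approximates the toy regexp rather than translating it exactly.

```matlab
% Sketch only: emulate the capture group (\d) with boundary patterns,
% so that extract returns just the "token" as a string.
% Assumption: exactly one letter must precede and follow the digit.
str = "abcd3e";
pat = lookBehindBoundary(lettersPattern(1)) ...   % context before, not extracted
    + digitsPattern(1) ...                        % the part to keep
    + lookAheadBoundary(lettersPattern(1));       % context after, not extracted
tok = extract(str, pat)                           % a string, not a cell array
```

This does not carry over mechanically to every regexp. Optional groups and string arrays in which some elements have no match would need extra handling, for example a loop over the elements.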

2 Comments

Your updated question clarifies that my answer is not what you are looking for, but I'll leave it here anyway. :-)
Thank you very much for your answer.
Yes, I updated the question to clarify a bit, sorry.
There were cases in the past where I could not cast to string; I'll need to check why. In fact, that's not a terrible solution, but I'm simply wondering how to use the new regexpPattern properly, and maybe I'm missing something.
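A guess at why the cast can fail (an assumption, not confirmed in this thread) is that some elements have no match, or a different number of matches, so the nested cell array is ragged. A minimal sketch that sidesteps this by looping and keeping only the first capture group of the first match is shown below. The variable names and the extra "no digits here" element are made up for illustration.

```matlab
% Sketch: first capture group per element, returned as a string array.
% Elements without a match, or with an unmatched optional group, stay "".
str  = ["42652Z_HEX"; "42652GYRO-X"; "42351Temp"; "no digits here"];
expr = '\d+(?:GYRO-)?([XYZ])?.*';

res = strings(size(str));                  % preallocate with "" entries
for k = 1:numel(str)
    % char input makes regexp return a cell array of char vectors
    tok = regexp(char(str(k)), expr, 'tokens', 'once');
    if ~isempty(tok) && ~isempty(tok{1})
        res(k) = string(tok{1});           % keep the first capture group only
    end
end
disp(res)
```

The result always has the same size as str, which makes it easier to use downstream than a nested cell array.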


Release: R2023b

Asked: 29 Feb 2024

Commented: 29 Feb 2024
