I have an HTML web page saved as a plain-text file. Below is an example of the portion of the text I am interested in:
I want to search the text document for every occurrence of the "<a href='\some\ " pattern and extract the text between the tokens, i.e.
MATLAB has regexp, match and tags, but I am struggling to pick out the string cleanly. Ideally, I would like to search the document and return a cell array of strings listing all of the matches. Here is my current code:
urls = regexp(str, 'href=(\S+)(\s*)$', 'tokens', 'lineAnchors');
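Note that with 'lineAnchors' the trailing `$` only matches when the href value runs to the end of a line. To show the kind of result I am after, here is a minimal sketch that assumes every href value is wrapped in single quotes, as in the pattern above. The sample string and the page names in it are made up for illustration and are not taken from the real file:

```matlab
% Hypothetical sample text (assumption); the real input is the saved web page.
str = sprintf('<a href=''\\some\\page1''>One</a>\n<a href=''\\some\\page2''>Two</a>');

% Capture everything between the single quotes that follow href=.
% [^'']* means "any run of characters that are not a single quote".
tokens = regexp(str, 'href=''([^'']*)''', 'tokens');

% regexp returns one 1x1 cell per match; flatten into a cell array of strings.
urls = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);
```

If the real file uses double quotes or unquoted values, the character class would need to change accordingly.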