[~, txt, ~] = xlsread(...
split_fields = regexp(txt(:,1), ',', 'split');
split_fields will now be a cell array with one entry per text line in the first column. Each entry is itself a cell array of character vectors, and the number of character vectors can differ from one original row to the next.
You are going to have difficulty figuring out what each of the portions means on any one row. We can see from your samples that the patterns include:
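As a rough sketch of how to inspect that ragged result, the following counts the fields on each row. The two sample lines are made up for illustration and stand in for txt(:,1); the strtrim step is an addition, used because splitting on ',' leaves the leading blank on each piece.

```matlab
% Hypothetical sample lines standing in for txt(:,1) from xlsread
txt = {'Smith, A Title, Wiley, New York'; ...
       'Jones, Another Title, Oxford'};

% One cell per row, each holding a cell array of character vectors
split_fields = regexp(txt(:,1), ',', 'split');

% Remove the blank that follows each comma
split_fields = cellfun(@strtrim, split_fields, 'UniformOutput', false);

% Number of comma-separated pieces on each row
nfields = cellfun(@numel, split_fields);
```

Rows with the same nfields value can then be examined together, for example split_fields(nfields == 3).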
- name, title, publisher, city
- name, title, edition, publisher
- name, title, city
- name, title, publisher
- name, name, title, publisher
The "name, title, city" line is arguably a "name, title, publisher" line, but there are multiple publishers in that city, with "Oxford University Press" only being the best known of them; we would have to assume that an abbreviated publisher name took precedence over a non-abbreviated city.
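One way to encode that assumption is a lookup list: if the last field matches a known publisher name or abbreviation, treat it as a publisher, otherwise as a city. This is only a sketch; known_publishers and the sample row are hypothetical, and you would have to build the real list from your own data.

```matlab
% Hypothetical list of publisher names and abbreviations
known_publishers = {'Oxford', 'Wiley'};

% Hypothetical three-field row, already split and trimmed
fields = {'Smith', 'A Title', 'Oxford'};

% Publisher abbreviation takes precedence over a city of the same name
last_field = fields{end};
if ismember(last_field, known_publishers)
    kind = 'publisher';
else
    kind = 'city';
end
```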
You got lucky that none of the titles in your samples happened to include commas: you should assume that commas in titles will occur in your larger dataset.
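To illustrate the problem, here is a made-up line whose title contains a comma. After splitting, it has four pieces and cannot be told apart, by count alone, from a "name, title, edition, publisher" or "name, title, publisher, city" line.

```matlab
% Hypothetical line: the title is 'A Title, With a Comma'
sample_line = 'Author A, A Title, With a Comma, Some Publisher';

% Split on commas and trim the blanks, as for the real data
parts = strtrim(regexp(sample_line, ',', 'split'));

% The title has been broken into two pieces
n = numel(parts);
```

Here n is 4 even though the line really follows the three-part "name, title, publisher" pattern.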