I am doing a bibliometric analysis and, in particular, I have to search for article titles in the references of the citing papers. Here is my code:
The code works fairly well, but the data I can export from Scopus is not perfect. Article titles are not consistent, so an exact match does not always work. Here are two examples:
Real article name: 'Biomethane production from different crop systems of cereals in Northern Italy'
Article name in the reference: 'Biomethane production from different crop systems of cereals in Nothern Italy'
Real article name: 'Methodology for the realisation of accelerated structural tests on tractors'
Article name in the reference: 'Methodology for the realization of accelerated structural tests on tractors'
As you can see, the two titles in each pair differ by a single character. Since I have more than 20,000 papers and fixing them by hand would be time-consuming, is there a way to programmatically search for very similar strings? Note that the strings may also differ in length.
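To make the goal concrete, here is a minimal sketch of the kind of approximate matching I have in mind. It assumes Python and its standard-library `difflib` module, neither of which is a requirement of the question. The helper name `find_similar_title` and the `0.9` similarity cutoff are hypothetical choices for illustration, and the cutoff would probably need tuning on real Scopus data.

```python
import difflib


def find_similar_title(title, candidates, cutoff=0.9):
    """Return the candidate most similar to `title`, or None if none reaches `cutoff`.

    Hypothetical helper: the name and the default cutoff are illustrative assumptions.
    """
    # Compare lowercased strings so that case differences do not count as mismatches,
    # but remember the original spelling of each candidate.
    lowered = {c.lower(): c for c in candidates}
    matches = difflib.get_close_matches(
        title.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


real_titles = [
    "Biomethane production from different crop systems of cereals in Northern Italy",
    "Methodology for the realisation of accelerated structural tests on tractors",
]
reference_titles = [
    "Biomethane production from different crop systems of cereals in Nothern Italy",
    "Methodology for the realization of accelerated structural tests on tractors",
]

for ref in reference_titles:
    print(ref, "->", find_similar_title(ref, real_titles))
```

This handles both a substituted character and a missing one, since the similarity score does not require equal lengths. Comparing every reference against every title this way is likely to be slow for tens of thousands of papers, so a faster approach would be welcome.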