Answer to: Are there better algorithms for computing list similarity than this one I wrote, or corrections?
Score: 0
Your example name looks to be Spanish. I know nothing about that.
In English names:
- First names are often reduced to nicknames, so if you are searching for Richard Brown, you would also need to search your document for Dick Brown.
- Multiple names (e.g. "Jose Alfredo Plaza Navarro") are often reduced to two. The reduction is not random. The last name is usually kept. Some first names are more often kept than others (e.g. Richard Alphonse Brown and Alphonse Richard Brown will BOTH become Richard Brown).
- The last name is most often kept, preceded by a title and an initial. Richard Brown will often be seen as Mr R Brown. You will also need to include Mr. R. Brown and similar. I suggest you strip out any '.' before doing any comparisons.
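The points above could be sketched roughly as follows. This is only an illustration under assumptions: the `NICKNAMES` and `TITLES` tables are tiny hypothetical stand-ins (a real list would be much longer), `normalize_name` and `names_match` are names I made up, and the code assumes English name order with the surname last.

```python
import re

# Hypothetical, deliberately tiny lookup tables; real ones would be far larger.
NICKNAMES = {"dick": "richard", "rick": "richard", "bob": "robert", "bill": "william"}
TITLES = {"mr", "mrs", "ms", "miss", "dr"}


def normalize_name(raw):
    """Lowercase, strip '.', drop titles, and expand known nicknames."""
    cleaned = raw.replace(".", " ").lower()
    # Runs of letters only (Unicode-aware), so punctuation and digits are ignored.
    tokens = re.findall(r"[^\W\d_]+", cleaned)
    tokens = [t for t in tokens if t not in TITLES]
    return [NICKNAMES.get(t, t) for t in tokens]


def names_match(target, candidate):
    """Surnames must be equal; each given name in the candidate must match
    some given name in the target, either in full or as a single initial.
    Given names missing from the candidate are not held against it."""
    t, c = normalize_name(target), normalize_name(candidate)
    if not t or not c or t[-1] != c[-1]:
        return False
    target_given, candidate_given = t[:-1], c[:-1]

    def matches(tok):
        return any(
            g == tok or (len(tok) == 1 and g.startswith(tok))
            for g in target_given
        )

    # Note: a candidate with no given names at all ("Mr Brown") passes here,
    # which is a weak match and a likely source of false positives.
    return all(matches(tok) for tok in candidate_given)
```

With these tables, `names_match("Richard Brown", "Dick Brown")` and `names_match("Richard Alphonse Brown", "Mr. R. Brown")` both return `True`, while `names_match("Richard Brown", "Robert Brown")` returns `False`.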
I expect Spanish names will show different but equivalent modifications.
"I want the windows with the most common elements with respect to the target name to score higher"
I think this is a mistake. You should not penalize a missing middle name so much.
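One way to soften that penalty, as a minimal sketch: weight the surname and the first given name heavily and treat middle names as a small bonus. The function name `window_score` and the weights 0.6 / 0.3 / 0.1 are my own arbitrary assumptions, not anything from your code, and the sketch assumes both lists are already lowercased tokens with the surname last.

```python
def window_score(target, window):
    """Score a window of tokens against the target name tokens.

    Assumed convention: target[-1] is the surname, target[0] the primary
    given name, and anything in between is a middle name.
    """
    if not target:
        return 0.0
    present = set(window)
    surname, given, middles = target[-1], target[0], target[1:-1]
    if surname not in present:
        return 0.0  # without the surname, do not count it as a match at all
    score = 0.6  # hypothetical weight for the surname
    if len(target) > 1 and given in present:
        score += 0.3  # hypothetical weight for the first given name
    if middles:
        # Middle names only add a small bonus; their absence costs little.
        score += 0.1 * sum(m in present for m in middles) / len(middles)
    return score
```

For the target `["richard", "alphonse", "brown"]`, the window `["richard", "brown"]` scores about 0.9 against about 1.0 for the full name, so dropping the middle name costs little.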
What are you intending to do about false positives?
Question
Parent Entity
Score: 2 • Views: 54
Site: stackoverflow