Do fuzzy matching (something like fuzzywuzzy maybe) to see if the words line up (allowing for wrong words). You'll need to work out how to use scoring to work out how well aligned the two lists are. - Source: Reddit / 2 months ago
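A minimal sketch of the scoring idea in the comment above, assuming the two texts have already been split into word lists. It uses the standard library's `difflib.SequenceMatcher` as a stand-in for fuzzywuzzy, so the scores will not necessarily match what that library reports; the function name `word_alignment_score` and the sample sentences are made up for illustration.

```python
from difflib import SequenceMatcher

def word_alignment_score(expected, heard):
    """Return a 0-100 score for how well two word lists line up.

    Hypothetical helper: compares whole words (not characters), so a
    single wrong word only costs one position in the alignment.
    """
    matcher = SequenceMatcher(None, expected, heard, autojunk=False)
    return round(100 * matcher.ratio())

# Illustrative data: one word ("fox" vs "box") was heard wrongly.
expected = "the quick brown fox jumps".split()
heard = "the quick brown box jumps".split()
score = word_alignment_score(expected, heard)
print(score)
```

A threshold on this score (which value works is something to tune on real data) can then decide whether the two lists count as aligned.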
Convert the original lines to full furigana and do a fuzzy match. (For reference, the original line is 貴方がこれまでに得てきた力、存分に発揮してくださいね。) You can do a regional search using the initial scene data (E60) first, and if the confidence is low, go for a slower full search. - Source: Reddit / 5 months ago
It's now known as "thefuzz", see https://github.com/seatgeek/fuzzywuzzy. - Source: Reddit / 11 months ago
You can have a look at this library to use fuzzy search instead of looking for plaintext muck: https://github.com/seatgeek/fuzzywuzzy. - Source: Reddit / over 1 year ago
To deal with comparing the strings, I found FuzzyWuzzy's ratio function, which returns a score from 0 to 100 for how similar the strings are. - Source: Reddit / over 1 year ago
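To illustrate what a 0-100 similarity score looks like, here is a small sketch built on the standard library's `difflib.SequenceMatcher`. It is an approximation for illustration, not FuzzyWuzzy's own code, and `similarity_ratio` is a hypothetical name.

```python
from difflib import SequenceMatcher

def similarity_ratio(a: str, b: str) -> int:
    """Return an integer from 0 (nothing in common) to 100 (identical)."""
    # SequenceMatcher.ratio() is 2*M/T, where M is the number of matched
    # characters and T is the combined length of both strings.
    return round(100 * SequenceMatcher(None, a, b).ratio())

print(similarity_ratio("abcd", "abcd"))  # identical strings
print(similarity_ratio("abcd", "abcf"))  # one differing character
print(similarity_ratio("abc", "xyz"))    # no characters in common
```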
I used fuzzywuzzy [1], a Python-based fuzzy string matching calculator based on Levenshtein's edit distance, for a parking enforcement product I built. The product used an iOS client to capture license plates. The app would capture a single plate many times, then de-duplicate using an edit-distance threshold, matching plates within a lookback time period. Then among those plates, send the one with the highest... - Source: Hacker News / almost 2 years ago
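The de-duplication step described above might be sketched roughly as follows. This is not the commenter's implementation: the function names, the distance threshold of 1, and the 30-second lookback are all assumptions chosen for the example, and the plate readings are invented.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def group_reads(reads, max_distance=1, lookback_seconds=30):
    """Group (timestamp_seconds, plate) reads, sorted by time.

    A read joins an existing group when it is within `max_distance`
    edits of that group's latest read and no more than
    `lookback_seconds` after it. Both limits are hypothetical.
    """
    groups = []
    for ts, plate in reads:
        for group in groups:
            last_ts, last_plate = group[-1]
            if (ts - last_ts <= lookback_seconds
                    and levenshtein(plate, last_plate) <= max_distance):
                group.append((ts, plate))
                break
        else:
            groups.append([(ts, plate)])
    return groups

# Invented readings: two noisy captures of one plate, another plate,
# and the first plate again long after the lookback window has passed.
reads = [(0, "ABC123"), (2, "ABC128"), (5, "XYZ999"), (100, "ABC123")]
groups = group_reads(reads)
print(groups)
```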
Probably something as simple as loading it into memory, then calculating the Damerau–Levenshtein distance for each pair and setting some threshold below which records are considered duplicates. If you want a fancier similarity calculator, Python is probably better equipped in terms of ready-made libraries. - Source: Reddit / almost 2 years ago
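A rough sketch of that pairwise approach, assuming the records fit in memory as a list of strings. Note that it implements the optimal string alignment variant of Damerau–Levenshtein (adjacent transpositions count as one edit, but no substring is edited twice), which is simpler than the unrestricted version. The names `osa_distance` and `find_duplicates`, the threshold of 1, and the sample records are illustrative assumptions.

```python
from itertools import combinations

def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            # Adjacent transposition, e.g. "ab" -> "ba".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

def find_duplicates(records, threshold=1):
    """Return every pair of records at most `threshold` edits apart."""
    return [(x, y) for x, y in combinations(records, 2)
            if osa_distance(x, y) <= threshold]

records = ["jon smith", "jno smith", "jane doe"]  # invented sample data
duplicates = find_duplicates(records)
print(duplicates)
```

Comparing every pair is quadratic in the number of records, so this only suits data sets small enough for that to be acceptable.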
FuzzyWuzzy has an easy-to-use implementation: https://github.com/seatgeek/fuzzywuzzy. - Source: Reddit / almost 2 years ago
After the above process, I was left to deal with minor differences in author names, like initials, spacing, typos, etc. There were too many entries to adjust manually, so I used fuzzywuzzy to determine if two entries were close enough. This trick would fail with authors such as, say, R.J. Barker and R.J. Parker, but I didn't find such entries during a sample manual sanity check from these groupings. - Source: Reddit / almost 2 years ago
Iterate separately over the two files, get the single DNA in a variable and the 180 in a list (make sure they are stored as string). You can use fuzzywuzzy to get the most similar match. - Source: Reddit / about 2 years ago
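The "most similar match" step could look something like the sketch below, which again substitutes the standard library's `difflib.SequenceMatcher` for fuzzywuzzy. The helper name `best_match` and the short sequences are invented for the example; real input would be the single sequence and the list of 180 read from the two files.

```python
from difflib import SequenceMatcher

def best_match(query, candidates):
    """Return (candidate, score) for the candidate most similar to query.

    Scores run from 0 to 100. autojunk is disabled because difflib's
    "popular element" heuristic can distort results on long sequences
    drawn from a small alphabet such as A/C/G/T.
    """
    def score(candidate):
        matcher = SequenceMatcher(None, query, candidate, autojunk=False)
        return round(100 * matcher.ratio())

    winner = max(candidates, key=score)
    return winner, score(winner)

query = "ACGTACGT"                                   # invented sequence
candidates = ["TTTTTTTT", "ACGTACGA", "GGGGCCCC"]    # invented list
match, match_score = best_match(query, candidates)
print(match, match_score)
```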
If you really need a number as a percentage, then FuzzyWuzzy could help you with your problem. This lib also uses the Levenshtein distance at its core but gives the convenient output of a match percentage. - Source: Reddit / about 2 years ago