Calculate the edit distance between two strings and see exactly which characters were inserted, deleted, or substituted to transform one into the other.
Each cell shows the minimum edits to transform the first i characters of A into the first j characters of B. The bottom-right cell is the final distance.
Levenshtein distance — also called edit distance — is the minimum number of single-character operations needed to transform one string into another. It was proposed by Soviet mathematician Vladimir Levenshtein in 1965 and remains one of the most widely used string similarity metrics in computer science.
The three allowed operations are:
cat → cats: insert s)cats → cat: delete s)cat → bat: substitute c → b)kitten → sitting| Step | Operation | Result |
|---|---|---|
| 1 | Substitute k → s |
sitten |
| 2 | Substitute e → i |
sittin |
| 3 | Insert g at end |
sitting |
Edit distance = 3. Similarity = \(1 - \frac{3}{7} \approx 57%\)
The standard algorithm fills an \((n+1) \times (m+1)\) matrix where \(n\) and \(m\) are the lengths of the two strings. The recurrence is:
\[dp[i][j] = \begin{cases} \max(i,j) & \text{if } \min(i,j) = 0 \\ dp[i-1][j-1] & \text{if } s[i] = t[j] \\ 1 + \min(dp[i-1][j],\, dp[i][j-1],\, dp[i-1][j-1]) & \text{otherwise} \end{cases}\]
The three terms in the min represent deletion, insertion, and substitution respectively. Enable "Show DP matrix" above to see the full matrix for your inputs.
Time complexity is \(O(nm)\) and space complexity is \(O(nm)\), though space can be reduced to \(O(\min(n,m))\) with a rolling array.
Spell checking — When a user types a word not found in a dictionary, a spell checker finds words within a small edit distance (typically 1–2) and ranks them as suggestions. teh has distance 1 from the, making it the top suggestion.
DNA sequence alignment — Bioinformatics tools use edit distance variants to identify mutations between genetic sequences. A substitution corresponds to a point mutation; an insertion or deletion is an indel.
Fuzzy string search — Search engines and autocomplete systems use edit distance to return results even when queries contain typos. Searching for javascrpit still surfaces JavaScript results because the distance is only 2.
Plagiarism detection — Comparing document sections using edit distance or derived metrics helps identify paraphrased or slightly modified copied text.
Record linkage / deduplication — Matching records across databases where names may be spelled differently (Jon Smith vs John Smyth) uses edit distance to identify likely duplicates.
Version control diffs — Tools like diff use edit-distance-based algorithms to compute the minimum set of line insertions and deletions that transform one file into another.
What's the difference between Levenshtein distance and Hamming distance?
Hamming distance only counts substitutions and only works on strings of equal length. Levenshtein distance handles strings of any length and counts insertions, deletions, and substitutions. Levenshtein is strictly more general.
What does a distance of 0 mean?
The strings are identical.
How is similarity percentage calculated here?
Similarity is \(\left(1 - \frac{d}{\max(|A|, |B|)}\right) \times 100%\) where \(d\) is the edit distance. This normalizes the distance by the length of the longer string, giving a value between 0% (completely different) and 100% (identical).
Is Levenshtein distance case-sensitive?
Yes — this calculator treats uppercase and lowercase as distinct characters, so Cat and cat have a distance of 1. Convert both strings to the same case before comparing if you want case-insensitive results.
What are the limitations of edit distance?
Edit distance treats all characters equally, so it doesn't account for phonetic similarity, semantic meaning, or common keyboard-adjacency errors. For those use cases, algorithms like Jaro-Winkler (better for short names), Soundex (phonetic), or cosine similarity (semantic) may be more appropriate.
What is Damerau-Levenshtein distance?
A variant that adds a fourth operation: transposition of two adjacent characters. This means ab → ba costs 1 instead of 2. It's useful for catching common typing errors like accidentally swapping two letters.
This website may contain affiliate links. If you click on an affiliate link and make a purchase, we may receive a small commission at no additional cost to you.