Problem

Given two strings word1 and word2, return the minimum number of operations required to convert word1 to word2.

You have the following three operations permitted on a word:

  • Insert a character
  • Delete a character
  • Replace a character

https://leetcode.com/problems/edit-distance/

Example 1:

Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation:
horse -> rorse (replace ‘h’ with ‘r’)
rorse -> rose (remove ‘r’)
rose -> ros (remove ‘e’)

Example 2:

Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation:
intention -> inention (remove ‘t’)
inention -> enention (replace ‘i’ with ‘e’)
enention -> exention (replace ‘n’ with ‘x’)
exention -> exection (replace ‘n’ with ‘c’)
exection -> execution (insert ‘u’)

Constraints:

  • 0 <= word1.length, word2.length <= 500
  • word1 and word2 consist of lowercase English letters.

Test Cases

1
2
class Solution:
def minDistance(self, word1: str, word2: str) -> int:
solution_test.py
1
2
3
4
5
6
7
8
9
10
11
12
13
import pytest

from solution import Solution
from solution2 import Solution as Solution2


@pytest.mark.parametrize('word1, word2, expected', [
("horse", "ros", 3),
("intention", "execution", 5),
])
@pytest.mark.parametrize('sol', [Solution(), Solution2()])
def test_solution(sol, word1, word2, expected):
assert sol.minDistance(word1, word2) == expected

Thoughts

动态规划的经典问题。

定义 dp(i, j) 表示子字符串 word1[:i]word2[:j] 的编辑距离。

如果 word1ᵢ = word2ⱼ,那么这一对字符不用编辑,距离与 word1[:i-1]word2[:j-1] 的距离相等。

否则就可以尝试不同的编辑操作:

  • word1[:i]word2[:j-1] 基础上,给 word1 在 i 之后插入 word2ⱼ
  • 删除 word1ᵢ,变成 word1[:i-1]word2[:j]
  • word1[:i-1]word2[:j-1] 基础上,把 word1ᵢ 改成 word2ⱼ

所以状态转移方程为:

dp(i,j)={dp(i1,j1)if word1i=word2j1+min{dp(i,j1)(insert)dp(i1,j)(delete)dp(i1,j1)(replace)otherwisedp(i,j)=\begin{cases} dp(i-1,j-1) & \text{if }word1_i=word2_j \\ 1+\min\begin{cases} dp(i,j-1) & \text{(insert)} \\ dp(i-1,j) & \text{(delete)} \\ dp(i-1,j-1) & \text{(replace)} \end{cases} & \text{otherwise} \end{cases}

初始值 dp(i, 0) = idp(0, j) = j。最终 word1 和 word2 的编辑距离为 dp(m, n)(其中 m、n 分别为 word1、word2 的长度)。

时间复杂度 O(m * n),空间复杂度 O(m * n)

不过因为最多只需要临近的三个值,就可以只保存 dp 的一行,空间复杂度降为 O(min{m, n})

Code

O(min{m, n}) Space

solution.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
class Solution:
def minDistance(self, word1: str, word2: str) -> int:
if len(word1) < len(word2): word1, word2 = word2, word1
n = len(word2)
dp = [j for j in range(n + 1)]
for i, c1 in enumerate(word1, 1):
p_dp, dp[0] = dp[0], i
for j, c2 in enumerate(word2, 1):
if c1 == c2:
p_dp, dp[j] = dp[j], p_dp
else:
p_dp, dp[j] = dp[j], 1 + min(dp[j-1], dp[j], p_dp)

return dp[n]

O(m * n) Space

solution2.py
1
2
3
4
5
6
7
8
9
10
11
12
13
class Solution:
def minDistance(self, word1: str, word2: str) -> int:
m = len(word1)
n = len(word2)
dp = [[i + j for j in range(n + 1)] for i in range(m + 1)]
for i, c1 in enumerate(word1, 1):
for j, c2 in enumerate(word2, 1):
if c1 == c2:
dp[i][j] = dp[i-1][j-1]
else:
dp[i][j] = 1 + min(dp[i][j-1], dp[i-1][j], dp[i-1][j-1])

return dp[m][n]