Problem

You are given a list of strings of the same length words and a string target.

Your task is to form target using the given words under the following rules:

  • target should be formed from left to right.
  • To form the iᵗʰ character (0-indexed) of target, you can choose the kᵗʰ character of the jᵗʰ string in words if target[i] = words[j][k].
  • Once you use the kᵗʰ character of the jᵗʰ string of words, you can no longer use the xᵗʰ character of any string in words where x <= k. In other words, all characters to the left of or at index k become unusuable for every string.
  • Repeat the process until you form the string target.

Notice that you can use multiple characters from the same string in words provided the conditions above are met.

Return the number of ways to form target from words. Since the answer may be too large, return it modulo 10⁹ + 7.

https://leetcode.com/problems/number-of-ways-to-form-a-target-string-given-a-dictionary/

Example 1:

Input: words = ["acca","bbbb","caca"], target = "aba"
Output: 6
Explanation: There are 6 ways to form target.
“aba” -> index 0 (“acca”), index 1 (“bbbb”), index 3 (“caca”)
“aba” -> index 0 (“acca”), index 2 (“bbbb”), index 3 (“caca”)
“aba” -> index 0 (“acca”), index 1 (“bbbb”), index 3 (“acca”)
“aba” -> index 0 (“acca”), index 2 (“bbbb”), index 3 (“acca”)
“aba” -> index 1 (“caca”), index 2 (“bbbb”), index 3 (“acca”)
“aba” -> index 1 (“caca”), index 2 (“bbbb”), index 3 (“caca”)

Example 2:

Input: words = ["abba","baab"], target = "bab"
Output: 4
Explanation: There are 4 ways to form target.
“bab” -> index 0 (“baab”), index 1 (“baab”), index 2 (“abba”)
“bab” -> index 0 (“baab”), index 1 (“baab”), index 3 (“baab”)
“bab” -> index 0 (“baab”), index 2 (“baab”), index 3 (“baab”)
“bab” -> index 1 (“abba”), index 2 (“baab”), index 3 (“baab”)

Constraints:

  • 1 <= words.length <= 1000
  • 1 <= words[i].length <= 1000
  • All strings in words have the same length.
  • 1 <= target.length <= 1000
  • words[i] and target contain only lowercase English letters.

Test Cases

1
2
class Solution:
def numWays(self, words: List[str], target: str) -> int:
solution_test.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
import pytest

from solution import Solution


@pytest.mark.parametrize('words, target, expected', [
(["acca","bbbb","caca"], "aba", 6),
(["abba","baab"], "bab", 4),

(
["cbabddddbc","addbaacbbd","cccbacdccd","cdcaccacac","dddbacabbd","bdbdadbccb","ddadbacddd","bbccdddadd","dcabaccbbd","ddddcddadc","bdcaaaabdd","adacdcdcdd","cbaaadbdbb","bccbabcbab","accbdccadd","dcccaaddbc","cccccacabd","acacdbcbbc","dbbdbaccca","bdbddbddda","daabadbacb","baccdbaada","ccbabaabcb","dcaabccbbb","bcadddaacc","acddbbdccb","adbddbadab","dbbcdcbcdd","ddbabbadbb","bccbcbbbab","dabbbdbbcb","dacdabadbb","addcbbabab","bcbbccadda","abbcacadac","ccdadcaada","bcacdbccdb"],
"bcbbcccc",
677452090
),
(["abba","baab"], "babbbb", 0),
])
@pytest.mark.parametrize('sol', [Solution()])
def test_solution(sol, words, target, expected):
assert sol.numWays(words, target) == expected

Thoughts

设 words 中每个 word 的长度为 n,target 长度为 m。显然若 m > n 则怎么都拼不出 target,返回 0。

首先统计 words 中每个字符位置 j(0 ≤ j < n),不同字符 c("a" ≤ c ≤ "z")的可选次数,记为 occ(j, c)。其值为 words 中,jᵗʰ 字符是 c 的 word 个数。比如 Example 1 中 words = ["acca","bbbb","caca"] 可以得到如下的 occ 表:

c 0 1 2 3
“a” 1 1 0 2
“b” 1 1 1 1
“c” 1 1 2 0

对于 target 中的每个位置 i(0 ≤ i < m),最多可以从 word 的 n' = n - m + 1 个字符位置(即 [i, i + n - m + 1])中选择,因为要给 i 前后的其他位置留下选择的空间。对于 i(0 ≤ i < m)和 j(i ≤ j ≤ i + n - m + 1),可选的 word 数量为 occ(j, target[i])

为了方便,可以把每个 i 的可选区间左对齐,得到一个 m 行 n' 列的 grid(如下图)。用 j' 表示可选区间的第 j' 个位置,其对应于第 (i + j')ᵗʰ 字符,格子内的数值为 occ(i+j', target[i])。。

对每一行做出选择,但每一行选的 j' 都不能小于上一行所选的 j'。这就相当于从上图右边 grid 的左上角,只能向右或向下移动,最终走到右下角,看有多少种走法。跟 62. Unique Paths 是一样的,只不过每个格子都带上了「系数」。

看其中任意一条路径对应多少种选择。为了方便,在 grid 上方加一个空行,并将起点定为空行的第一个格子,这样一共需要向下走 m 次。每次向下走到第 i 行就表示为 target 的第 i 个字符确定了字符的来源(如此时位于 j' 列,则 target 的第 i 个字符是从其第 j' 个可选位置的某个 word 来的)。而向右走只是为了能够在右边某一列向下走做的准备,不代表实际的选择。这样一条路径对应的选择数量,就是每次向下走时经过的格子的数字之积。

如下图的蓝色路径,三次向下走到达的格子坐标分别为 (0, 0)(1, 0)(2, 1),对应于 occ(0+0, "a") = 1occ(1+0, "b") = 1occ(2+1, "a") = 2,共 1 * 1 * 2 = 2 个选择。

把所有路径的选择数累加起来,就是最终的结果。

定义 dp(i, j') 表示到 (i, j') 格子的所有路径的选择数总和。要走到 (i, j'),可以从上边 (i-1, j') 或者左边 (i, j'-1) 过来。从上边过来的时候,需要乘以 (i, j') 格子的数字(occ(i+j, target[i]));从左边过来时乘以 1 即可。所以:

dp(i,j)=dp(i1,j)×occ(i+j,target[i])+dp(i,j1)dp(i,j')=dp(i-1,j')\times occ(i+j, target[i])+dp(i,j'-1)

为了简化边界的处理,可以定义 dp(i, -1) = 1(即上方增加的空行),dp(-1, j') = 0。最终结果为 dp(m - 1, n' - 1)

62. Unique Paths 类似,dp 在递推过程中只需要保留最新的一行(或一列)即可。

时间复杂度 O(W * n + m * (n - m)),空间复杂度 O(C * n + (n - m)),其中 W 是 words 的长度,C 是各不相同的字符数(题目中最多 26,可以认为是常数)。

Code

solution.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
from collections import defaultdict


class Solution:
def numWays(self, words: list[str], target: str) -> int:
m = len(target)
n = len(words[0])
if m > n: return 0

occurs: list[dict[str, int]] = [defaultdict(int) for _ in range(n)]
for word in words:
for i, c in enumerate(word):
occurs[i][c] += 1

n -= m - 1
dp = [1] * (n + 1)
dp[0] = 0
MOD = 1_000_000_007
for i in range(m):
for j in range(1, n+1):
freq = occurs[i+j-1][target[i]]
dp[j] = (dp[j-1] + dp[j] * freq) % MOD

return dp[n]