Problem

You are given two string arrays words1 and words2.

A string b is a subset of string a if every letter in b occurs in a including multiplicity.

  • For example, "wrr" is a subset of "warrior" but is not a subset of "world".

A string a from words1 is universal if for every string b in words2, b is a subset of a.

Return an array of all the universal strings in words1. You may return the answer in any order.

https://leetcode.com/problems/word-subsets/

Example 1:

Input: words1 = ["amazon","apple","facebook","google","leetcode"], words2 = ["e","o"]
Output: ["facebook","google","leetcode"]

Example 2:

Input: words1 = ["amazon","apple","facebook","google","leetcode"], words2 = ["l","e"]
Output: ["apple","google","leetcode"]

Constraints:

  • 1 <= words1.length, words2.length <= 10⁴
  • 1 <= words1[i].length, words2[i].length <= 10
  • words1[i] and words2[i] consist only of lowercase English letters.
  • All the strings of words1 are unique.

Test Cases

1
2
class Solution:
def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]:
solution_test.py
1
2
3
4
5
6
7
8
9
10
11
12
13
import pytest

from solution import Solution
from solution2 import Solution as Solution2


@pytest.mark.parametrize('words1, words2, expected', [
(["amazon","apple","facebook","google","leetcode"], ["e","o"], ["facebook","google","leetcode"]),
(["amazon","apple","facebook","google","leetcode"], ["l","e"], ["apple","google","leetcode"]),
])
@pytest.mark.parametrize('sol', [Solution(), Solution2()])
def test_solution(sol, words1, words2, expected):
assert sorted(sol.wordSubsets(words1, words2)) == sorted(expected)

Thoughts

因为 universal word 能够包含 words2 中的所有单词,可以先汇总 words2 所有单词的信息。

对于 words2 中的每个单词,统计其中每个字符的出现次数。然后把每个单词的字符次数合并,对于任何一个字符,取最大的出现次数(相当于 Python collections.Counterunion 操作(operator |)。

然后遍历 words1 的每一个单词,看是否包含汇总之后的所有字符的次数。

时间复杂度 O(m + n),额外的空间复杂度 O(1)

Code

solution.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
from collections import defaultdict


class Solution:
def wordSubsets(self, words1: list[str], words2: list[str]) -> list[str]:
max2 = lambda a, b: a if a >= b else b
occurs: dict[str, int] = defaultdict(int)
for word in words2:
counter: dict[str, int] = defaultdict(int)
for c in word:
counter[c] += 1

for c, cnt in counter.items():
occurs[c] = max2(occurs[c], cnt)

result: list[str] = []
for word in words1:
counter: dict[str, int] = defaultdict(int)
for c in word:
counter[c] += 1

for c, cnt in occurs.items():
if counter[c] < cnt:
break
else:
result.append(word)

return result

collections.Counter 会慢一些:

solution2.py
1
2
3
4
5
6
7
8
9
10
11
from typing import Counter


class Solution:
def wordSubsets(self, words1: list[str], words2: list[str]) -> list[str]:
occurs: dict[str, int] = Counter()
for word in words2:
counter = Counter(word)
occurs |= counter

return [word for word in words1 if Counter(word) >= occurs]