Problem

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

https://leetcode.com/problems/top-k-frequent-elements/

Example 1:

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Example 2:

Input: nums = [1], k = 1
Output: [1]

Constraints:

  • 1 <= nums.length <= 10^5
  • -10^4 <= nums[i] <= 10^4
  • k is in the range [1, the number of unique elements in the array].
  • It is guaranteed that the answer is unique

Follow up: Your algorithm’s time complexity must be better than O(n log n), where n is the array’s size.

Test Cases

1
2
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
solution_test.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
import pytest

from solution import Solution


@pytest.mark.parametrize('nums, k, expected', [
([1,1,1,2,2,3], 2, [1,2]),
([1], 1, [1]),
])
class Test:
def test_solution(self, nums, k, expected):
sol = Solution()
result = sol.topKFrequent(nums, k)
assert sorted(result) == sorted(expected)

Thoughts

用哈希表统计每个数字的出现次数。

然后用大小为 k 的最小堆,可以找到刚好第 k 大的词频(以及对应的数字),

numsn 个元素,其中各不相同的数字有 d 个。

时间复杂度 O(n + d log k),空间复杂度 O(d)

小优化点:堆内元素少于 k 个时不用维护堆的结构,第 k 个元素入堆时做一次堆的构建即可。

Code

solution.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
class Solution:
def topKFrequent(self, nums: list[int], k: int) -> list[int]:
counting: dict[int, int] = {} # counting[i]: freq of number i.
for v in nums:
counting[v] = counting.get(v, 0) + 1

min_heap = [None] * k
leaf = k >> 1

def shift_down(pos: int):
while pos < leaf:
left = (pos << 1) + 1
right = left + 1
child = right if right < k and min_heap[right][1] < min_heap[left][1] else left
if min_heap[pos][1] > min_heap[child][1]:
min_heap[pos], min_heap[child] = min_heap[child], min_heap[pos]
pos = child
else:
return

for i, item in enumerate(counting.items()):
if i < k:
min_heap[i] = item
if i == k - 1:
# Build heap.
for j in range(leaf - 1, -1, -1):
shift_down(j)
elif item[1] > min_heap[0][1]:
# Replace heap top with v.
min_heap[0] = item
shift_down(0)

return [v for v, _ in min_heap]