Problem

Given a string s, return the number of different non-empty palindromic subsequences in s. Since the answer may be very large, return it modulo 10⁹ + 7.

A subsequence of a string is obtained by deleting zero or more characters from the string.

A sequence is palindromic if it is equal to the sequence reversed.

Two sequences a₁, a₂, ... and b₁, b₂, ... are different if there is some i for which aᵢ != bᵢ.

https://leetcode.com/problems/count-different-palindromic-subsequences/

Example 1:

Input: s = "bccb"
Output: 6
Explanation: The 6 different non-empty palindromic subsequences are 'b', 'c', 'bb', 'cc', 'bcb', 'bccb'.
Note that 'bcb' is counted only once, even though it occurs twice.

Example 2:

Input: s = "abcdabcdabcdabcdabcdabcdabcdabcddcbadcbadcbadcbadcbadcbadcbadcba"
Output: 104860361
Explanation: There are 3104860382 different non-empty palindromic subsequences, which is 104860361 modulo 10⁹ + 7.

Constraints:

  • 1 <= s.length <= 1000
  • s[i] is either 'a', 'b', 'c', or 'd'.

Test Cases

1
2
class Solution:
def countPalindromicSubsequences(self, s: str) -> int:
solution_test.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
import pytest

from solution import Solution


@pytest.mark.parametrize('s, expected', [
("bccb", 6),
("abcdabcdabcdabcdabcdabcdabcdabcddcbadcbadcbadcbadcbadcbadcbadcba", 104860361),

("a", 1),
("ab", 2),
])
@pytest.mark.parametrize('sol', [Solution()])
def test_solution(sol, s, expected):
assert sol.countPalindromicSubsequences(s) == expected

Thoughts

看起来是 647. Palindromic Substrings 的进阶版,从 substring 升级为 subsequence,而且要去重,不过字符种类从所有英文小写字母缩减到只有 'a''b''c''d' 四个。

考虑任意一个子区间 [i, j]0 ≤ i ≤ j < n),定义 dp(i, j) 表示子字符串 s[i:j+1] 内各不相同的回文子序列总数。

关键在于如何避免重复。技巧是确定首(及尾)字母,首字母不同的一定是不同的回文。

不妨以 “a” 为例。先看 s[i:j+1] 内是否有 “a”,没有则跳过。如果有,那么单个字母 “a” 就是回文,计数加一。如果不止一个,那么 “aa” 也是回文,计数再加一。然后找到最左边和最右边的 “a”,设下标分别为 first 和 last(i ≤ first < last ≤ j),给 s[first+1:last] 中所有各不相同回文两头各加一个 “a”,仍然都是各不相同的回文。

可见 s[i:j+1]以 “a” 开头 的回文总数为:

{0for as[i:j+1]1for !as[i:j+1]dp(first+1,last1)+2\begin{cases} 0 & \text{for }`a\text{\textquoteright}\notin s[i:j+1] \\ 1 & \text{for }\exists!`a\text{\textquoteright}\in s[i:j+1] \\ dp(first+1,last-1)+2 \end{cases}

第三个式子的 + 2,一个代表 “a”,一个代表 “aa”。

同理可以得到 s[i:j+1] 中分别以 “b”、“c”、“d” 开头的回文总数,累加即可。

易知这样可以找出来所有各不相同的回文,既不会有重复,也不会遗漏。

自顶向下计算的话,可以用带缓存的递归。另外可以事先扫描 s,对于每个位置 i,记录四个字母在 s[i:] 最左边分别的位置,以及在 s[:i+1] 最右边分别的位置,以便对于指定的区间 [i, j],可以用常数时间得到某个字母的 first 和 last。

整体时间复杂度为 O(C * n²),空间复杂度 O(n²)。其中 C 是 s 中可能出现的各不相同的字符总数,本题中为 4 可以认为是常数。

Code

solution.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
from functools import cache


class Solution:
def countPalindromicSubsequences(self, s: str) -> int:
MOD = 1_000_000_007
CHARS = 'abcd'
n = len(s)

c_lasts = {c: [-1] * n for c in CHARS}
c_firsts = {c: [n] * n for c in CHARS}
c_lasts[s[0]][0] = 0
c_firsts[s[n-1]][n-1] = n-1
for i in range(1, n):
j = n - i - 1
for c in CHARS:
c_lasts[c][i] = c_lasts[c][i-1]
c_firsts[c][j] = c_firsts[c][j+1]

c_lasts[s[i]][i] = i
c_firsts[s[j]][j] = j

@cache
def dp(left: int, right: int) -> int:
if left > right: return 0

cnt = 0
for c in CHARS:
first = c_firsts[c][left]
last = c_lasts[c][right]
if first > right or last < left:
continue

cnt += 1 # The single-char palindromic `c`.
if first < last:
cnt += dp(first + 1, last - 1) + 1 # Plus the empty subsequence.

return cnt % MOD

return dp(0, n - 1)