Problem

Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.

Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary tree can be serialized to a string and this string can be deserialized to the original tree structure.

Clarification: The input/output format is the same as how LeetCode serializes a binary tree. You do not necessarily need to follow this format, so please be creative and come up with different approaches yourself.

https://leetcode.com/problems/serialize-and-deserialize-binary-tree/

Example 1:

Input: root = [1,2,3,null,null,4,5]
Output: [1,2,3,null,null,4,5]

Example 2:

Input: root = []
Output: []

Constraints:

  • The number of nodes in the tree is in the range [0, 10^4].
  • -1000 <= Node.val <= 1000

Test Cases

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
# Definition for a binary tree node.
# class TreeNode(object):
# def __init__(self, x):
# self.val = x
# self.left = None
# self.right = None

class Codec:

def serialize(self, root):
"""Encodes a tree to a single string.

:type root: TreeNode
:rtype: str
"""


def deserialize(self, data):
"""Decodes your encoded data to tree.

:type data: str
:rtype: TreeNode
"""


# Your Codec object will be instantiated and called as such:
# ser = Codec()
# deser = Codec()
# ans = deser.deserialize(ser.serialize(root))
solution_test.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
import collections

import pytest

from solution import Codec
from utils import TreeNode


def build_tree(values: list[int | None]) -> TreeNode | None:
virtual = TreeNode(None)
queue = collections.deque()
queue.append((virtual, 'left'))
for val in values:
node, branch = queue.popleft()
if val is None:
continue

child = TreeNode(val)
setattr(node, branch, child)
queue.append((child, 'left'))
queue.append((child, 'right'))

return virtual.left


def trees_equal(root1: TreeNode | None, root2: TreeNode | None) -> bool:
if root1 == root2:
return True

if root1.val != root2.val:
return False

return trees_equal(root1.left, root2.left) and trees_equal(root1.right, root2.right)


@pytest.mark.parametrize('values', [
([1,2,3,None,None,4,5]),
([]),

([1,2,3]),
([1,None,2,3]),
([5,4,7,3,None,2,None,-1,None,9]),
])
class Test:
def test_solution(self, values):
ser = Codec()
deser = Codec()
root = build_tree(values)
data = ser.serialize(root)
new_root = deser.deserialize(data)
assert trees_equal(root, new_root)
assert data == ser.serialize(new_root)

Thoughts

序列化的时候按照某个顺序遍历二叉树,依次打印访问到的节点的数据。可以用前序(pre-order,NLR)、中序(in-order,LNR)、后序(post-order,LRN)、层序(level-order)。

需要记录一定的分隔符(如对应位置的空节点),在反序列化的时候才能确定边界。

序列化时候几种遍历顺序都很容易实现,主要看反序列化的时候怎么比较方便。

LeetCode 用的是层序,并把空的子节点用 null 记录占位。层序在序列化和反序列化的时候,用队列辅助循环处理。

用前序遍历(NLR)。如果左右子节点为空,需要记占位符,直接用空字符串表示。节点值之间用逗号分割。序列化和反序列化的时候,都用栈辅助,避免递归。

比如上边 example 1 中的二叉树,会序列化为 "1,2,,,3,4,,,5,"

Code

solution.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
from utils import TreeNode

# Definition for a binary tree node.
# class TreeNode(object):
# def __init__(self, x):
# self.val = x
# self.left = None
# self.right = None


class Codec:

def serialize(self, root):
"""Encodes a tree to a single string.

:type root: TreeNode
:rtype: str
"""
parts = []
stack = []
while root or stack:
if root:
stack.append(root)
parts.append(str(root.val))
root = root.left
else:
parts.append('')
root = stack.pop().right

return ','.join(parts)


def deserialize(self, data):
"""Decodes your encoded data to tree.

:type data: str
:rtype: TreeNode
"""
nodes = [TreeNode(int(s)) if s else None for s in data.split(',')]
virtual = TreeNode(None)
stack = [virtual]
root = None
for node in nodes:
if root:
stack.append(root)
root.left = root = node
else:
stack.pop().right = root = node

return virtual.right


# Your Codec object will be instantiated and called as such:
# ser = Codec()
# deser = Codec()
# ans = deser.deserialize(ser.serialize(root))

可以看到序列化和反序列化的操作逻辑是完全一致的,唯一的区别就是前者在遍历过程中把节点的值读出来,后者把获取到的值和父子关系写到节点上。

另外反序列化的时候,通过一个指向根节点的虚拟节点,可以简化边界值的处理逻辑。