Disjoint-Set Union: Union-Find (1)

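Disjoint Sets:

　　We all know what a set is: a collection of distinct elements.

　　Let us first look at disjoint sets:

　　Disjoint sets are a collection of sets in which no two sets intersect, that is, no element belongs to more than one of them. (The Chinese term is 「分离集」.)

　　Disjoint sets have a rather special structure. Computer scientists studied their properties closely and designed an elegant data structure that performs set operations on them quickly.

　　Because the sets in such a collection never overlap, operations such as intersection and difference need no attention: their results are obvious. Only three operations remain to be considered: union, find, and split.

　　union merges two sets into one. find determines which set a given element belongs to. split removes an element from the set that contains it.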

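Partitioning a set:

　　A set S is partitioned into subsets S1, S2, …, Sk that satisfy the following conditions:

　　1) S1 ∪ S2 ∪ … ∪ Sk = S

　　2) Si ∩ Sj = ∅ (i ≠ j)

　　Example: S = {1,2,3,4,5,6};

　　Partition 1: {1,2}, {3,4}, {5,6};

　　Partition 2: {1,2,3,4,5}, {6};

　　……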
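　　The two conditions can be checked mechanically. The sketch below is only an illustration: the function name is_partition, the bound MAX_N, and the representation of S as {1, …, n} are assumptions made here, and the check also rejects empty blocks, which is the usual convention but is not stated above.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_N 64  /* hypothetical upper bound on |S| for this sketch */

/* S = {1, ..., n}. blocks[b] points to the sizes[b] elements of block b.
 * Returns true when the k blocks are non-empty, pairwise disjoint,
 * and together cover every element of S. */
bool is_partition(int n, int k, const int *const blocks[], const int sizes[]) {
    assert(n >= 0 && n <= MAX_N);
    bool seen[MAX_N + 1] = { false };
    int covered = 0;
    for (int b = 0; b < k; b++) {
        if (sizes[b] == 0) return false;       /* empty block (assumed not allowed) */
        for (int j = 0; j < sizes[b]; j++) {
            int x = blocks[b][j];
            if (x < 1 || x > n) return false;  /* not an element of S */
            if (seen[x]) return false;         /* seen twice: condition 2 fails */
            seen[x] = true;
            covered++;
        }
    }
    return covered == n;                       /* condition 1: the union is S */
}
```

　　With S = {1,…,6}, both Partition 1 and Partition 2 pass this check, while {1,2}, {2,3} fails because 2 appears twice.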

Binary relations:

　　S × S is the set of all ordered pairs of elements of S (the Cartesian product).

　　Example: if S = {a,b,c}, then S × S = {(a,a),(a,b),(a,c),(b,a),(b,b),(b,c),(c,a),(c,b),(c,c)}.

　　A binary relation R on a set S is any subset of S × S; R(x,y) means that (x,y) is "in the relation".

Three Properties:

　　1) A relation R on a set S is reflexive if R(a,a) holds for all a in S.

　　For example, let S = {1, 2, 3}.

　　Consider {<1, 1>, <1, 2>, <1, 3>, <2, 2>, <2, 3>, <3, 3>}.

　　It is reflexive because <1, 1>, <2, 2>, and <3, 3> are all in the relation.

　　2) A relation R on a set S is symmetric if and only if, for any a and b in S, R(a,b) implies R(b,a).

　　For example, let S = {1, 2, 3}.

　　The relation {<1, 1>, <2, 2>, <3, 3>} is symmetric.

　　3) A relation R on a set S is transitive if, for all a, b, c in S, R(a,b) and R(b,c) together imply R(a,c).

　　For example, let S = {1, 2, 3}.

　　The relation {<1, 2>, <2, 3>, <1, 3>} is transitive.
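　　The three properties translate directly into loops over a boolean matrix. This is a minimal sketch, assuming S = {1, 2, 3} is stored at indices 0..2 and r[a][b] is true exactly when R(a,b) holds; the function names are chosen here for illustration.

```c
#include <stdbool.h>

#define N 3  /* S = {1, 2, 3}, stored at indices 0..2 */

/* R(a,a) for every a */
bool is_reflexive(bool r[N][N]) {
    for (int a = 0; a < N; a++)
        if (!r[a][a]) return false;
    return true;
}

/* R(a,b) implies R(b,a) */
bool is_symmetric(bool r[N][N]) {
    for (int a = 0; a < N; a++)
        for (int b = 0; b < N; b++)
            if (r[a][b] && !r[b][a]) return false;
    return true;
}

/* R(a,b) and R(b,c) imply R(a,c) */
bool is_transitive(bool r[N][N]) {
    for (int a = 0; a < N; a++)
        for (int b = 0; b < N; b++)
            for (int c = 0; c < N; c++)
                if (r[a][b] && r[b][c] && !r[a][c]) return false;
    return true;
}
```

　　Applied to the three example relations above, the first is reflexive and transitive but not symmetric (it contains <1, 2> without <2, 1>), the second has all three properties, and the third is transitive only.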

Equivalence relations:

　　A binary relation R is an equivalence relation if R is reflexive, symmetric, and transitive.

　　Suppose P = {S1,S2,…,Sn} is a partition of S.

　　　– Define R(x,y) to mean that x and y are in the same Si.

　　　– Then R is an equivalence relation.

　　Conversely, suppose R is an equivalence relation over S.

　　　– Consider a set of sets S1,S2,…,Sn where

　　　(1) x and y are in the same Si if and only if R(x,y)

　　　(2) every x is in some Si

　　　– Then this set of sets is a partition of S.

　　Example: let S = {a,b,c,d,e}.

　　　– One partition: {a,b,c}, {d}, {e}

　　　– The corresponding equivalence relation:

　　 (a,a), (b,b), (c,c), (a,b), (b,a), (a,c), (c,a), (b,c), (c,b), (d,d), (e,e)
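　　The step from a partition to its equivalence relation can be made concrete. In the sketch below, block_of, related, and count_pairs are names introduced for illustration; the partition is the one from the example, encoded by giving each element the index of its block.

```c
#include <stdbool.h>

#define N_ELEMS 5  /* S = {a, b, c, d, e} */

/* block_of[x - 'a'] is the index of the block containing x:
 * {a, b, c} -> 0, {d} -> 1, {e} -> 2 */
static const int block_of[N_ELEMS] = { 0, 0, 0, 1, 2 };

/* R(x,y) holds exactly when x and y lie in the same block */
bool related(char x, char y) {
    return block_of[x - 'a'] == block_of[y - 'a'];
}

/* Count the ordered pairs (x,y) of S x S that are in the relation */
int count_pairs(void) {
    int count = 0;
    for (char x = 'a'; x < 'a' + N_ELEMS; x++)
        for (char y = 'a'; y < 'a' + N_ELEMS; y++)
            if (related(x, y)) count++;
    return count;
}
```

　　A block of size k contributes k² ordered pairs, so this partition yields 3² + 1 + 1 = 11 pairs, which matches the list above.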

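Union-Find (并查集):

　　Union-find is a tree-based data structure for handling merge and query operations on disjoint sets. In practice it is usually represented as a forest, in which each tree holds one set.

　　In many problems involving a collection of N elements, we start by putting each element in its own singleton set, then merge, in some order, the sets whose elements belong to the same group, repeatedly querying along the way which set a given element belongs to. Such problems look simple, but the data volume is typically very large. A naive data structure tends to need more memory than is available, and even when it fits, the running time is far too high. Union-find is well suited to these problems.

　　Union-find is widely used in problems about paths, networks, and graph connectivity.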

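　　Union-find operations:

　　1) Create: build the initial partition, in which each element normally forms a set by itself, for example:

{a}, {b}, {c}, …, and choose a representative element for each subset.

　　2) Find: look up an element and return the representative of the subset that contains it.

　　3) Union: merge two smaller subsets into one larger subset.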


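The union-find data structure:

Up-tree structure:

　　We start from a forest of single-node trees. Each union then links the root of one tree to the root of another.

　　up is an array, index identifies a node, and up[index] holds the parent of that node. Initially every up[index] is set to 0 or -1, which marks the node as a root (the code below uses 0, so nodes are numbered from 1). After a union, up[index] of a non-root node holds its parent; following the parent links upward leads to the root, which is the representative element of the subset.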

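/* up[i] is the parent of node i; up[i] == 0 marks a root (nodes are numbered from 1). */
int find(int x) {
    while (up[x] != 0) {   /* climb until we reach a root */
        x = up[x];
    }
    return x;
}

/* x and y must be distinct roots. Named unite because union is a reserved word in C/C++. */
void unite(int x, int y) {
    up[y] = x;             /* hang the tree rooted at y under x */
}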
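
　　To see how the plain up-tree behaves, here is a self-contained sketch. MAX_N, make_sets, and depth are helpers assumed for this illustration, and the merge is named unite because union is a reserved word in C. Unlike the version above, this unite looks up the roots itself, so it accepts arbitrary elements at the cost of two finds.

```c
#include <assert.h>

#define MAX_N 100            /* hypothetical capacity for this sketch */

static int up[MAX_N + 1];    /* up[i] == 0 marks a root; nodes are 1..n */

/* Create: every element starts as the root of its own single-node tree */
void make_sets(int n) {
    assert(n >= 0 && n <= MAX_N);
    for (int i = 1; i <= n; i++)
        up[i] = 0;
}

/* Find: follow parent links up to the root */
int find(int x) {
    while (up[x] != 0)
        x = up[x];
    return x;
}

/* Union: link the root of y's tree under the root of x's tree.
 * Returns 1 if two sets were merged, 0 if x and y were already together. */
int unite(int x, int y) {
    int rx = find(x), ry = find(y);
    if (rx == ry) return 0;
    up[ry] = rx;
    return 1;
}

/* Number of links between x and its root */
int depth(int x) {
    int d = 0;
    while (up[x] != 0) {
        x = up[x];
        d++;
    }
    return d;
}
```

　　Calling make_sets(6) and then unite(i + 1, i) for i = 1, …, 5 builds the chain 1 → 2 → … → 6, so depth(1) is 5. This is the degenerate shape that makes a single find cost O(n).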

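　　Clearly, the worst case for find is O(n) and the worst case for union is O(1); m finds and n-1 unions therefore take O(m*n) in the worst case. How can this be improved?

　　Optimization strategies:

　　1) Union by size: always link the smaller tree under the larger one. Find then takes O(logn), and m finds and n-1 unions take O(m*logn) for the finds plus O(n) for the unions in the worst case;

　　2) Path compression: during a find, link the nodes on the search path directly to the root.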

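Union by size:

　　Here up[root] is no longer 0 or -1; it stores the negated size of the subset. A node is a root exactly when up[index] is negative.

　　Worst case for find: O(logn)

　　Proof: with union by size, an up-tree of height h contains at least 2^h nodes, which can be shown by induction. A tree with n nodes therefore has height at most log2(n).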
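
　　The induction can be written out as follows. This is a sketch of the standard argument rather than something taken from the original text; s(T) and h(T) are notation introduced here for the number of nodes and the height of a tree T.

```latex
\textbf{Claim.} With union by size, every tree $T$ satisfies $s(T) \ge 2^{h(T)}$.

\textbf{Base case.} A single node has $s(T) = 1 = 2^{0}$ and $h(T) = 0$.

\textbf{Inductive step.} Let $T$ be obtained by linking $T_1$ under the root of $T_2$,
where $s(T_1) \le s(T_2)$ and both trees satisfy the claim. Then
\[
  s(T) = s(T_1) + s(T_2), \qquad h(T) = \max\bigl(h(T_1) + 1,\; h(T_2)\bigr).
\]
\begin{itemize}
  \item If $h(T) = h(T_2)$, then $s(T) \ge s(T_2) \ge 2^{h(T_2)} = 2^{h(T)}$.
  \item If $h(T) = h(T_1) + 1$, then
        $s(T) \ge 2\,s(T_1) \ge 2 \cdot 2^{h(T_1)} = 2^{h(T)}$.
\end{itemize}
Since $s(T) \le n$, it follows that $2^{h(T)} \le n$, that is, $h(T) \le \log_2 n$.
```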

　　Alternatively, trees can be merged by height, linking the shorter tree under the taller one.
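
　　The union-by-size rule described above can be sketched as follows. The names make_sets, unite, set_size, and MAX_N are assumptions made for this illustration; the only convention taken from the text is that a root stores the negated size of its set.

```c
#include <assert.h>

#define MAX_N 100            /* hypothetical capacity for this sketch */

/* Root: up[r] = -(number of elements in its set).
 * Non-root: up[i] = index of the parent (always > 0, nodes are 1..n). */
static int up[MAX_N + 1];

void make_sets(int n) {
    assert(n >= 0 && n <= MAX_N);
    for (int i = 1; i <= n; i++)
        up[i] = -1;          /* n singleton sets, each of size 1 */
}

int find(int x) {
    while (up[x] > 0)        /* a positive entry is a parent link */
        x = up[x];
    return x;
}

/* Merge the sets containing x and y; returns the root of the merged set */
int unite(int x, int y) {
    int rx = find(x), ry = find(y);
    if (rx == ry) return rx;
    if (up[rx] > up[ry]) {   /* values are negative: larger value = smaller set */
        int t = rx; rx = ry; ry = t;
    }
    up[rx] += up[ry];        /* rx is the larger set: add the sizes */
    up[ry] = rx;             /* link the smaller tree under the larger one */
    return rx;
}

int set_size(int x) {
    return -up[find(x)];
}
```

　　With this rule, the sequence unite(i + 1, i) for i = 1, …, 5 no longer produces a chain: after the first merge every new singleton is linked under the existing root, so no node ends up more than one link away from it.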

Path compression:

　　The code makes the idea clear:

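int find(int i) {
    /* find the root: a root stores a negative value (the negated set size) */
    int r = i;
    while (up[r] > 0)
        r = up[r];

    /* compress the path: point every node between i and r directly at r */
    if (i == r)
        return r;
    int old_parent = up[i];
    while (old_parent != r) {
        up[i] = r;
        i = old_parent;
        old_parent = up[i];
    }
    return r;
}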

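　　A single find still takes O(logn) in the worst case, but repeated finds keep flattening the trees: with union by size and path compression together, m finds and n-1 unions take nearly O(m+n) in the worst case. More precisely, the bound is O((m+n)·α(n)), where α is the inverse Ackermann function, which grows so slowly that it is effectively constant.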
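
　　Putting both optimizations together, one typical use is counting the connected components of an undirected graph. The following is a minimal sketch under stated assumptions: count_components, make_sets, unite, MAX_N, and the edge-list format are hypothetical names and choices made here, and find uses a slightly shorter compression loop than the version above.

```c
#include <assert.h>

#define MAX_N 100            /* hypothetical capacity for this sketch */

static int up[MAX_N + 1];    /* root: -(set size); non-root: parent index */

void make_sets(int n) {
    assert(n >= 0 && n <= MAX_N);
    for (int i = 1; i <= n; i++)
        up[i] = -1;
}

/* Find with path compression */
int find(int i) {
    int r = i;
    while (up[r] > 0)        /* first pass: locate the root */
        r = up[r];
    while (i != r) {         /* second pass: relink the path to the root */
        int next = up[i];
        up[i] = r;
        i = next;
    }
    return r;
}

/* Union by size; returns 1 if two different sets were merged */
int unite(int x, int y) {
    int rx = find(x), ry = find(y);
    if (rx == ry) return 0;
    if (up[rx] > up[ry]) {   /* make rx the root of the larger set */
        int t = rx; rx = ry; ry = t;
    }
    up[rx] += up[ry];
    up[ry] = rx;
    return 1;
}

/* Vertices are 1..n; edges[k] = {u, v} is an undirected edge */
int count_components(int n, int m, int edges[][2]) {
    make_sets(n);
    int components = n;      /* every successful merge removes one component */
    for (int k = 0; k < m; k++)
        components -= unite(edges[k][0], edges[k][1]);
    return components;
}
```

　　For 6 vertices and the edges (1,2), (2,3), (4,5), the components are {1,2,3}, {4,5}, and {6}, so count_components returns 3.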

Time: 2024-10-11 06:13:56
