disjoint set

MAKE-SET(x) creates a new set whose only member (and thus representative) is x. Since the sets are disjoint, we require that x not already be in some other set.

UNION(x, y) unites the dynamic sets that contain x and y, say S_x and S_y, into a new set that is the union of these two sets. We assume that the two sets are disjoint prior to the operation. The representative of the resulting set is any member of S_x ∪ S_y, although many implementations of UNION specifically choose the representative of either S_x or S_y as the new representative. Since we require the sets in the collection to be disjoint, conceptually we destroy sets S_x and S_y, removing them from the collection S. In practice, we often absorb the elements of one of the sets into the other set.

FIND-SET(x) returns a pointer to the representative of the (unique) set containing x.

Linked-list implementation

In the worst case, a simple linked-list implementation of the UNION procedure, in which every element holds a pointer back to its set object, requires an average of Θ(n) time per call because we may be appending a longer list onto a shorter list; we must update the pointer to the set object for each member of the longer list. Suppose instead that each list also includes the length of the list (which we can easily maintain) and that we always append the shorter list onto the longer, breaking ties arbitrarily. With this simple weighted-union heuristic, a single UNION operation can still take Ω(n) time if both sets have Ω(n) members. It can be shown, however, that a sequence of m MAKE-SET, UNION, and FIND-SET operations, n of which are MAKE-SET operations, takes O(m + n lg n) time.
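The weighted-union heuristic can be sketched in Java as follows. This is a minimal illustration, not the author's code: the names WeightedListSets, SetObject, and Element are hypothetical, and union returns the number of back-pointers it rewrote only so that the cost is visible.

```java
// Sketch of the linked-list representation with the weighted-union heuristic.
// All class and method names here are illustrative assumptions.
final class WeightedListSets {
    // The set object: head and tail of the member list, plus its length.
    static final class SetObject {
        Element head;
        Element tail;
        int length;
    }

    static final class Element {
        final String name;
        Element next;      // next member in the same list
        SetObject set;     // back-pointer to the set object
        Element(String name) { this.name = name; }
    }

    static Element makeSet(String name) {
        Element x = new Element(name);
        SetObject s = new SetObject();
        s.head = x;
        s.tail = x;
        s.length = 1;
        x.set = s;
        return x;
    }

    // The representative is the first member of the list: O(1).
    static Element findSet(Element x) {
        return x.set.head;
    }

    // Appends the shorter list onto the longer one.
    // Returns how many back-pointers were updated.
    static int union(Element x, Element y) {
        SetObject longer = x.set;
        SetObject shorter = y.set;
        if (longer == shorter) {
            return 0; // already in the same set
        }
        if (longer.length < shorter.length) {
            SetObject tmp = longer;
            longer = shorter;
            shorter = tmp;
        }
        int updated = 0;
        for (Element e = shorter.head; e != null; e = e.next) {
            e.set = longer;
            updated++;
        }
        longer.tail.next = shorter.head;
        longer.tail = shorter.tail;
        longer.length += shorter.length;
        return updated;
    }
}
```

Because only members of the shorter list are touched, an element's back-pointer changes only when its set at least doubles in size, which is where the n lg n term in the bound comes from.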

Tree implementation

Heuristics to improve the running time

So far, we have not improved on the linked-list implementation. A sequence of n - 1 UNION operations may create a tree that is just a linear chain of n nodes. By using two heuristics, however, we can achieve a running time that is almost linear in the total number of operations m.
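To see how a linear chain can arise, here is a small Java sketch of a forest with no heuristics at all. The class NaiveForest, its array-based layout, and the depth helper are assumptions made for illustration only.

```java
// A disjoint-set forest with neither union by rank nor path compression.
// Elements are the integers 0..n-1; parent[i] == i marks a root.
final class NaiveForest {
    final int[] parent;

    NaiveForest(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
    }

    int findSet(int x) {
        while (parent[x] != x) {
            x = parent[x];
        }
        return x;
    }

    // Always hangs the root of x's tree under the root of y's tree.
    void union(int x, int y) {
        parent[findSet(x)] = findSet(y);
    }

    // Number of parent links between x and its root.
    int depth(int x) {
        int d = 0;
        while (parent[x] != x) {
            x = parent[x];
            d++;
        }
        return d;
    }
}
```

Calling union(i - 1, i) for i = 1, ..., n - 1 hangs each old root under a fresh singleton, so element 0 ends up n - 1 links away from the root and a FIND-SET on it costs linear time.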

The first heuristic, union by rank, is similar to the weighted-union heuristic we used with the linked-list representation. The obvious approach would be to make the root of the tree with fewer nodes point to the root of the tree with more nodes. Rather than explicitly keeping track of the size of the subtree rooted at each node, we shall use an approach that eases the analysis. For each node, we maintain a rank, which is an upper bound on the height of the node. In union by rank, we make the root with smaller rank point to the root with larger rank during a UNION operation.
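As a rough illustration of union by rank on its own, the sketch below uses an array-based forest and deliberately leaves out path compression so that the effect of the rank rule is visible. The class name RankedForest and the depth helper are hypothetical.

```java
// Union by rank only (no path compression), on elements 0..n-1.
final class RankedForest {
    final int[] parent;
    final int[] rank;

    RankedForest(int n) {
        parent = new int[n];
        rank = new int[n];          // all ranks start at 0
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
    }

    int findSet(int x) {
        while (parent[x] != x) {
            x = parent[x];
        }
        return x;
    }

    void union(int x, int y) {
        int rx = findSet(x);
        int ry = findSet(y);
        if (rx == ry) {
            return;                 // same set, nothing to do
        }
        if (rank[rx] < rank[ry]) {  // make rx the root of larger rank
            int tmp = rx;
            rx = ry;
            ry = tmp;
        }
        parent[ry] = rx;            // smaller rank points to larger rank
        if (rank[rx] == rank[ry]) {
            rank[rx]++;             // equal ranks: the surviving root grows by one
        }
    }

    int depth(int x) {
        int d = 0;
        while (parent[x] != x) {
            x = parent[x];
            d++;
        }
        return d;
    }
}
```

With this rule, the sequence union(i - 1, i) for i = 1, ..., n - 1 no longer builds a chain: every element ends up at most one link away from the root.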

The second heuristic, path compression, is also quite simple and highly effective. As shown in Figure 21.5, we use it during FIND-SET operations to make each node on the find path point directly to the root. Path compression does not change any ranks. (The more nodes that point directly to the root, the more likely a later FIND-SET reaches the root in a single step.)
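Path compression can also be written without recursion. The sketch below is an iterative two-pass variant, offered as an alternative rather than as the method the text prescribes: the first loop finds the root, and the second repoints every node on the find path to it. The class name CompressingForest is an assumption.

```java
// Iterative two-pass path compression on an array-based forest.
final class CompressingForest {
    final int[] parent;

    // Takes an initial parent array; parent[i] == i marks a root.
    CompressingForest(int[] initialParent) {
        parent = initialParent.clone();
    }

    int findSet(int x) {
        // Pass 1: walk up to the root.
        int root = x;
        while (parent[root] != root) {
            root = parent[root];
        }
        // Pass 2: make every node on the find path point at the root.
        while (parent[x] != root) {
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }
}
```

Starting from the chain 0 -> 1 -> 2 -> 3 -> 4, a single findSet(0) leaves every node pointing directly at 4. The iterative form also avoids deep recursion on very long chains.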

When we use both union by rank and path compression, the worst-case running
time is O(m α(n)), where α(n) is a very slowly growing function, which we define
in Section 21.4. In any conceivable application of a disjoint-set data structure,
α(n) ≤ 4; thus, we can view the running time as linear in m in all practical situations.
Strictly speaking, however, it is superlinear. In Section 21.4, we prove this
upper bound.

package disjoint_sets;

// A disjoint-set structure can be built on linked lists or on a forest of trees;
// this class uses the tree (forest) representation.
public class disjoint_set {
    private static class Node {
        private Node p;             // parent pointer; a root points to itself
        private int rank;           // upper bound on the height of this node
        private final String name;

        public Node(String na) {
            p = this;
            rank = 0;
            name = na;
        }
    }

    // Merge the sets that contain x and y.
    public static void union(Node x, Node y) {
        link(findset(x), findset(y));
    }

    // Link two roots using union by rank.
    public static void link(Node x, Node y) {
        if (x == y) {
            return;                 // same set already; do not inflate the rank
        }
        if (x.rank > y.rank) {
            y.p = x;
        } else if (y.rank > x.rank) {
            x.p = y;
        } else {
            // Equal ranks: pick x as the new root and increase its rank.
            y.p = x;
            x.rank = x.rank + 1;
        }
    }

    // Return the representative of x's set, compressing the path on the way back.
    public static Node findset(Node x) {
        if (x != x.p) {
            x.p = findset(x.p);     // path compression
        }
        return x.p;
    }

    // Print the names on the path from x up to its root.
    public static void print(Node x) {
        System.out.println(x.name);
        if (x != x.p) {
            print(x.p);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        union(a, b);
        union(b, c);
        union(a, d);
        print(d);                   // prints d, then its root a
    }
}

Time: 2024-08-02 06:37:27
