[POJ3294] Life Forms (suffix array)

Problem link

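Find the longest substrings that occur in more than half of the given strings; if there are several, print them in lexicographic order.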

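Binary search on the substring length k. For each candidate k, split the height array into groups of consecutive suffixes whose height is at least k, then check whether some group contains suffixes from more than half of the strings.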
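The grouping test can be sketched on its own as follows. This is only an illustration under stated assumptions, not the solution from this post: the suffix array is built naively by sorting (fine for tiny inputs only), and the names `Joined`, `join`, `naive_sa`, `naive_height`, `feasible` and `longest` are hypothetical helpers introduced here.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Concatenated text plus per-position bookkeeping (hypothetical helper type).
struct Joined {
    std::vector<int> text;    // characters and separators
    std::vector<int> owner;   // index of the source string, -1 for separators
    std::vector<int> remain;  // characters left in the source string from here
};

// Join the strings; separator i has the value 256 + i, so all separators are
// distinct and cannot equal any character.
Joined join(const std::vector<std::string>& strs) {
    Joined j;
    for (int i = 0; i < (int)strs.size(); i++) {
        int L = (int)strs[i].size();
        for (int q = 0; q < L; q++) {
            j.text.push_back((unsigned char)strs[i][q]);
            j.owner.push_back(i);
            j.remain.push_back(L - q);
        }
        j.text.push_back(256 + i);
        j.owner.push_back(-1);
        j.remain.push_back(0);
    }
    return j;
}

// Naive suffix array: sort the start positions by comparing whole suffixes.
std::vector<int> naive_sa(const std::vector<int>& t) {
    std::vector<int> sa(t.size());
    for (int i = 0; i < (int)sa.size(); i++) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return std::lexicographical_compare(t.begin() + a, t.end(),
                                            t.begin() + b, t.end());
    });
    return sa;
}

// height[i] = length of the common prefix of suffixes sa[i-1] and sa[i].
std::vector<int> naive_height(const std::vector<int>& t, const std::vector<int>& sa) {
    std::vector<int> h(sa.size(), 0);
    for (int i = 1; i < (int)sa.size(); i++) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < (int)t.size() && b + k < (int)t.size() && t[a + k] == t[b + k]) k++;
        h[i] = k;
    }
    return h;
}

// True if some substring of length k occurs in more than half of the n strings.
bool feasible(int k, int n, const Joined& j,
              const std::vector<int>& sa, const std::vector<int>& height) {
    std::vector<bool> seen(n, false);
    int cnt = 0;
    for (int i = 0; i < (int)sa.size(); i++) {
        if (i == 0 || height[i] < k) {  // a new group starts at this suffix
            std::fill(seen.begin(), seen.end(), false);
            cnt = 0;
        }
        int o = j.owner[sa[i]];
        // Count each source string once, and only if k characters fit in it.
        if (o >= 0 && j.remain[sa[i]] >= k && !seen[o]) { seen[o] = true; cnt++; }
        if (cnt > n / 2) return true;
    }
    return false;
}

// Binary search for the largest feasible length (0 if there is none).
int longest(const std::vector<std::string>& strs) {
    Joined j = join(strs);
    std::vector<int> sa = naive_sa(j.text), h = naive_height(j.text, sa);
    int lo = 1, hi = 0, best = 0;
    for (const std::string& s : strs) hi = std::max(hi, (int)s.size());
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (feasible(mid, (int)strs.size(), j, sa, h)) best = mid, lo = mid + 1;
        else hi = mid - 1;
    }
    return best;
}
```

The binary search is valid because feasibility is monotone: if a substring of length k occurs in more than half of the strings, so does its prefix of length k - 1. For example, `longest({"abcdabcdefgh", "efgh"})` evaluates to 4, since "efgh" is the longest substring present in both strings.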

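One small detail: when looking for the longest substring common to all of the strings, the separator characters used to join the strings can all be the same, but that does not work for this problem. (Why? Think it over.)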
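One possible explanation, offered here as a hint rather than taken from the original post: a substring that must occur in all strings has a group containing a suffix of the last string, and no separator follows the last string, so the shared prefix cannot contain a separator. When only more than half of the strings are required, a group can avoid the last string, and a shared prefix may then run straight across identical separators. The small sketch below makes this visible; `join_with` and `lcp_at` are hypothetical helper names.

```cpp
#include <string>
#include <vector>

// Join the strings with either one shared separator ('#') or a distinct
// separator per boundary (256 + i, which cannot equal any character).
std::vector<int> join_with(const std::vector<std::string>& strs, bool distinct) {
    std::vector<int> t;
    for (size_t i = 0; i < strs.size(); i++) {
        for (unsigned char c : strs[i]) t.push_back(c);
        if (i + 1 < strs.size())  // no separator after the last string
            t.push_back(distinct ? 256 + (int)i : (int)'#');
    }
    return t;
}

// Length of the common prefix of the suffixes starting at offsets a and b.
int lcp_at(const std::vector<int>& t, size_t a, size_t b) {
    size_t k = 0;
    while (a + k < t.size() && b + k < t.size() && t[a + k] == t[b + k]) k++;
    return (int)k;
}
```

With the strings "xa", "b", "ya", "b", "zz", the suffixes at offsets 1 and 6 share the prefix "a#b#" (length 4) when the separators are identical, which is not a substring of any single input string. With distinct separators they share only "a" (length 1).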

Code

#include <cstdio>
#include <cstring>
#include <algorithm> // std::swap
#define N 101001

int len, n, m, max_num;
int buc[N], x[N], y[N], sa[N], rank[N], height[N], belong[N], pos[N];
unsigned char s[N]; // unsigned so that separator values >= 128 are valid bucket indices
char a[N];
bool f[101];

// Build the suffix array of s[0..len) by prefix doubling with radix sort.
inline void build_sa()
{
    int i, k, p;
    for(i = 0; i < m; i++) buc[i] = 0;
    for(i = 0; i < len; i++) buc[x[i] = s[i]]++;
    for(i = 1; i < m; i++) buc[i] += buc[i - 1];
    for(i = len - 1; i >= 0; i--) sa[--buc[x[i]]] = i;
    for(k = 1; k <= len; k <<= 1)
    {
        // y: suffixes ordered by their second key (rank of the suffix k positions later).
        p = 0;
        for(i = len - 1; i >= len - k; i--) y[p++] = i;
        for(i = 0; i < len; i++) if(sa[i] >= k) y[p++] = sa[i] - k;
        // Stable counting sort by the first key.
        for(i = 0; i < m; i++) buc[i] = 0;
        for(i = 0; i < len; i++) buc[x[y[i]]]++;
        for(i = 1; i < m; i++) buc[i] += buc[i - 1];
        for(i = len - 1; i >= 0; i--) sa[--buc[x[y[i]]]] = y[i];
        std::swap(x, y);
        p = 1, x[sa[0]] = 0;
        for(i = 1; i < len; i++)
        {
            // Positions past the end get rank -1 instead of reading stale data.
            int r1 = sa[i - 1] + k < len ? y[sa[i - 1] + k] : -1;
            int r2 = sa[i] + k < len ? y[sa[i] + k] : -1;
            x[sa[i]] = y[sa[i - 1]] == y[sa[i]] && r1 == r2 ? p - 1 : p++;
        }
        if(p >= len) break;
        m = p;
    }
}

// height[i] = longest common prefix of suffixes sa[i - 1] and sa[i].
inline void build_height()
{
    int i, j, k = 0;
    for(i = 0; i < len; i++) rank[sa[i]] = i;
    for(i = 0; i < len; i++)
    {
        if(!rank[i]) continue;
        if(k) k--;
        j = sa[rank[i] - 1];
        // Check the bounds before comparing characters.
        while(i + k < len && j + k < len && s[i + k] == s[j + k]) k++;
        height[rank[i]] = k;
    }
}

// Is there a substring of length k that occurs in more than half of the strings?
// Start positions of all such substrings are stored in pos[1..pos[0]], in suffix order.
inline bool check(int k)
{
    pos[0] = 0;
    int i, cnt = 1;
    memset(f, 0, sizeof(f));
    f[belong[sa[0]]] = 1;
    for(i = 1; i < len; i++)
        if(height[i] < k)
        {
            // A new group starts here.
            cnt = 1;
            memset(f, 0, sizeof(f));
            f[belong[sa[i]]] = 1;
        }
        else if(!f[belong[sa[i]]])
        {
            // First suffix of this string in the current group.
            f[belong[sa[i]]] = 1;
            cnt++;
            if(cnt == (n >> 1) + 1) pos[++pos[0]] = sa[i];
        }
    return pos[0] > 0;
}

inline void solve()
{
    int i, j, l = 1, r = len, mid, ans = 0;
    while(l <= r)
    {
        mid = (l + r) >> 1;
        if(check(mid)) max_num = pos[0], ans = mid, l = mid + 1;
        else r = mid - 1;
    }
    if(ans)
    {
        // A failed check writes only pos[0], so pos[1..max_num] still holds
        // the result of the last successful check, which was for ans.
        for(i = 1; i <= max_num; putchar('\n'), i++)
            for(j = pos[i]; j < pos[i] + ans; j++)
                putchar(s[j]);
        putchar('\n');
    }
    else
    {
        putchar('?');
        putchar('\n');
        putchar('\n');
    }
}

int main()
{
    int i, k, h;
    // Stop on a terminating 0 or at end of input.
    while(scanf("%d", &n) == 1 && n)
    {
        h = 0;
        m = 256;
        len = 0;
        memset(belong, -1, sizeof(belong));
        for(i = 0; i < n; i++)
        {
            scanf("%s", a);
            for(k = 0; a[k] != '\0'; k++) belong[len] = i, s[len++] = a[k];
            belong[len] = i;
            // Distinct separators; starting at 128 keeps them clear of the
            // lowercase letters (starting at 0 would reach 'a' once n >= 99).
            s[len++] = 128 + h++;
        }
        len--; // drop the separator after the last string
        build_sa();
        build_height();
        solve();
    }
    return 0;
}

Time: 2024-08-17 19:58:38
