POJ 3450 Corporate Identity: Longest Common Substring of All Strings

Description

Beside other services, ACM helps companies to clearly state their “corporate identity”, which includes company logo but also other signs, like trademarks. One of such companies is Internet Building Masters (IBM), which has recently asked ACM for a help with their new identity. IBM do not want to change their existing logos and trademarks completely, because their customers are used to the old ones. Therefore, ACM will only change existing trademarks instead of creating new ones.

After several other proposals, it was decided to take all existing trademarks and find the longest common sequence of letters that is contained in all of them. This sequence will be graphically emphasized to form a new logo. Then, the old trademarks may still be used while showing the new identity.

Your task is to find such a sequence.

Input

The input contains several tasks. Each task begins with a line containing a positive integer N, the number of trademarks (2 ≤ N ≤ 4000). The number is followed by N lines, each containing one trademark. Trademarks will be composed only from lowercase letters, the length of each trademark will be at least 1 and at most 200 characters.

After the last trademark, the next task begins. The last task is followed by a line containing zero.

Output

For each task, output a single line containing the longest string contained as a substring in all trademarks. If there are several strings of the same length, print the one that is lexicographically smallest. If there is no such non-empty string, output the words “IDENTITY LOST” instead.

Sample Input

3
aabbaabb
abbababb
bbbbbabb
2
xyz
abc
0

Sample Output

abb
IDENTITY LOST

Source

CTU Open 2007

The statement is long, but the task is simply to find the longest common substring of a set of strings.

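The program is sped up by a property of KMP: in time linear in the lengths of the two strings, it can find the longest prefix of one string that occurs as a substring of another. Every common substring is a prefix of some suffix of the first trademark, so for each suffix of the first trademark we run KMP against every other trademark, keep the shortest matched prefix length, and take the best result over all suffixes (breaking ties by lexicographic order).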
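The prefix-matching step can be looked at in isolation. The following is a minimal sketch, not the program's exact code: the names `buildFailure` and `longestPrefixIn` are made up for illustration, and `std::string` is used instead of the fixed-size buffers of the full solution.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// fail[i] = length of the longest proper prefix of p[0..i]
// that is also a suffix of p[0..i] (the usual KMP failure table).
std::vector<int> buildFailure(const std::string& p) {
    std::vector<int> fail(p.size(), 0);
    for (std::size_t i = 1, j = 0; i < p.size(); ) {
        if (p[i] == p[j]) fail[i++] = static_cast<int>(++j);
        else if (j > 0) j = fail[j - 1];   // fall back to a shorter border
        else fail[i++] = 0;
    }
    return fail;
}

// Length of the longest prefix of p that occurs somewhere in text.
// The scan stops early once the whole of p has been matched.
int longestPrefixIn(const std::string& p, const std::string& text) {
    if (p.empty()) return 0;
    const std::vector<int> fail = buildFailure(p);
    std::size_t j = 0;   // number of characters of p currently matched
    int best = 0;        // longest match seen so far
    for (std::size_t i = 0; i < text.size() && j < p.size(); ) {
        if (text[i] == p[j]) {
            ++i;
            ++j;
            best = std::max(best, static_cast<int>(j));
        } else if (j > 0) {
            j = fail[j - 1];   // reuse the part already matched
        } else {
            ++i;
        }
    }
    return best;
}
```

For the first sample, calling `longestPrefixIn("abb", "bbbbbabb")` reports that the whole prefix `abb` is found, while the full first trademark `aabbaabb` only matches its first letter inside `abbababb`.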

#include <stdio.h>
#include <string.h>

const int MAX_N = 4001;
const int MAX_L = 201;
char sameStr[MAX_L], dict[MAX_N][MAX_L];
int nxTbl[MAX_L];
int len[MAX_N];

template<typename T1, typename T2>
inline bool equ(T1 t1, T2 t2) { return (T1)t1 == (T1)t2; }

void getNextTble(int *t, char *s, int len)
{
	t[0] = 0;
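	for (int i = 1, j = 0; i < len; ) // j is the index of the next prefix character to compare
	{
		if (s[i] == s[j]) t[i++] = ++j; // after ++j, j is the number of matched prefix characters
		else if (j > 0) j = t[j-1]; // fall back to the next candidate position to compare
		else t[i++] = 0;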
	}
}

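// Returns the length of the longest prefix of chs (length len1) that occurs in s (length len2).
// Relies on the global nxTbl having been built for chs.
int getLongestPre(char *chs, char *s, int len1, int len2)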
{
	int len3 = 0;
	int i = 0, j = 0;
	for (; i < len2 && j < len1; )
	{
		if (equ(s[i], chs[j]))
		{
			i++, j++;
			if (j > len3) len3 = j;
		}
		else if (j > 0) j = nxTbl[j-1];
		else i++;
	}
	return len3;
}

bool smaller(char *s1, char *s2, int L)//lexicographically smaller
{
	for (int i = 0; i < L; i++)
	{
		if (s1[i] < s2[i]) return true;
		else if (s1[i] > s2[i]) return false;
	}
	return false;
}

int main()
{
	int N;
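	while (scanf("%d", &N) == 1 && N)
	{
		for (int i = 0; i < N; i++)
		{
			scanf("%200s", dict[i]); // gets() is unsafe and no longer part of the standard; each trademark is one lowercase word of at most 200 characters
			len[i] = strlen(dict[i]);
		}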
		char *p = dict[0];
		char *pRes = NULL;
		int L = 0;
		for (; len[0]; len[0]--, p++)
		{
			getNextTble(nxTbl, p, len[0]);
			int tmp = len[0];
			for (int i = 1; i < N && tmp; i++)
			{
				tmp = getLongestPre(p, dict[i], tmp, len[i]);
			}
			if (tmp > L || (equ(tmp, L) && smaller(p, pRes, L))) // longer match, or same length but lexicographically smaller
			{
				L = tmp;
				pRes = p;
			}
		}

		if (L)
		{
			for (int i = 0; i < L; i++) putchar(pRes[i]);
			putchar('\n');
		}
		else puts("IDENTITY LOST");
	}
	return 0;
}
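One way to sanity-check the program above on small inputs is to compare its answers with a brute-force reference. The sketch below is only an illustration: `bruteForceLCS` is a hypothetical helper name, it returns an empty string where the program would print IDENTITY LOST, and it is far too slow for the full limits.

```cpp
#include <string>
#include <vector>

// Hypothetical reference checker: tries every substring of the first word,
// longest first, and returns the lexicographically smallest one that occurs
// in all other words. Returns "" when no non-empty common substring exists.
std::string bruteForceLCS(const std::vector<std::string>& words) {
    if (words.empty()) return "";
    const std::string& first = words[0];
    for (std::size_t len = first.size(); len > 0; --len) {
        std::string best;
        bool found = false;
        for (std::size_t start = 0; start + len <= first.size(); ++start) {
            const std::string cand = first.substr(start, len);
            bool inAll = true;
            for (std::size_t k = 1; k < words.size() && inAll; ++k)
                inAll = words[k].find(cand) != std::string::npos;
            // Among candidates of this length keep the smallest one.
            if (inAll && (!found || cand < best)) {
                best = cand;
                found = true;
            }
        }
        if (found) return best;   // the first length that works is the longest
    }
    return "";
}
```

On the sample input, `bruteForceLCS({"aabbaabb", "abbababb", "bbbbbabb"})` yields `abb` and `bruteForceLCS({"xyz", "abc"})` yields the empty string, which agrees with the expected output.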
Date: 2024-10-22 10:04:55
