HDOJ Problem 4691 Front compression (suffix array + RMQ longest common prefix)

Front compression

Time Limit: 5000/5000 MS (Java/Others)    Memory Limit: 102400/102400 K (Java/Others)

Total Submission(s): 1652    Accepted Submission(s): 604

Problem Description

Front compression is a type of delta encoding compression algorithm whereby common prefixes and their lengths are recorded so that they need not be duplicated. For example:

Input        Compressed output
myxophyta    0 myxophyta
myxopod      5 od
nab          0 nab
nabbed       3 bed
nabbing      4 ing
nabit        3 it

The size of the input is 43 bytes, while the size of the compressed output is 40. Here, every space and newline is also counted as 1 byte.

Given the input, each line of which is a substring of a long string, what are the sizes of the input and of the corresponding compressed output?

Input

There are multiple test cases. Process to the End of File.

The first line of each test case is a long string S made up of lowercase letters, whose length doesn't exceed 100,000. The second line contains an integer 1 ≤ N ≤ 100,000, which is the number of lines in the input. Each of the following N lines contains two integers 0 ≤ A < B ≤ length(S), indicating that that line of the input is substring [A, B) of S.

Output

For each test case, output the sizes of the input and corresponding compressed output.

Sample Input

frcode
2
0 6
0 6
unitedstatesofamerica
3
0 6
0 12
0 21
myxophytamyxopodnabnabbednabbingnabit
6
0 9
9 16
16 19
19 25
25 32
32 37

Sample Output

14 12
42 31
43 40
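
The sample outputs can be reproduced with a small brute-force sketch that works on explicit lines instead of (A, B) ranges. The names `frontCompressionSizes` and `digitCount` are hypothetical, and the character-by-character prefix comparison is only meant for small inputs, not for the full constraints.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Number of decimal digits needed to print v; 0 prints as the single digit "0".
static int digitCount(std::size_t v) {
    int d = 1;
    while (v >= 10) { v /= 10; ++d; }
    return d;
}

// Returns {plain size, compressed size} in bytes for the given lines.
// A plain line costs its length plus a newline. A compressed line costs
// digits(p) + space + (length - p) + newline, where p is the length of the
// prefix shared with the previous line (0 for the first line).
std::pair<long long, long long>
frontCompressionSizes(const std::vector<std::string>& lines) {
    long long plain = 0, packed = 0;
    std::string prev;
    for (const std::string& cur : lines) {
        std::size_t p = 0;
        while (p < prev.size() && p < cur.size() && prev[p] == cur[p]) ++p;
        plain += static_cast<long long>(cur.size()) + 1;
        packed += digitCount(p) + 1 + static_cast<long long>(cur.size() - p) + 1;
        prev = cur;
    }
    return {plain, packed};
}
```

Calling `frontCompressionSizes({"myxophyta", "myxopod", "nab", "nabbed", "nabbing", "nabit"})` gives 43 and 40, matching the third sample.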

Author

Zejun Wu (watashi)

Source

2013 Multi-University Training Contest 9


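Once the statement is understood, the solution is straightforward: for each line, subtract the length of the longest common prefix it shares with the previous line, then add the number of digits in that prefix length, plus one byte for the space and one byte for the newline.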
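
The per-line cost described above can be sketched as a single function. The name `compressedLineBytes` is hypothetical. It assumes the LCP of the two suffixes of S that start at the left ends of the previous and current substrings is already known, and clamps it because the suffixes may keep matching past the end of either substring.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Bytes occupied by one compressed line.
// prevLen, curLen: lengths of the previous and current substrings
//                  (prevLen is 0 for the first line).
// suffixLcp: LCP of the suffixes of S starting at the two left endpoints.
long long compressedLineBytes(int prevLen, int curLen, int suffixLcp) {
    assert(prevLen >= 0 && curLen > 0 && suffixLcp >= 0);
    // The shared prefix cannot be longer than either substring.
    int p = std::min(suffixLcp, std::min(prevLen, curLen));
    int digits = static_cast<int>(std::to_string(p).size());
    // digits of p + one space + remaining characters + one newline
    return static_cast<long long>(digits) + 1 + (curLen - p) + 1;
}
```

For the second sample, the line "unitedstates" following "united" has a suffix LCP of 21 (both start at position 0), which is clamped to 6, so the line costs 1 + 1 + 6 + 1 = 9 bytes.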

AC code

#include<stdio.h>
#include<string.h>
#include<algorithm>
#include<iostream>
#define min(a,b) ((a)>(b)?(b):(a))
#define max(a,b) ((a)>(b)?(a):(b))
using namespace std;
char str[103030];
int sa[103030],Rank[103030],rank2[103030],height[103030],c[103030],*x,*y,len;
int n;
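// One stable counting-sort pass: orders the suffixes listed in y by their key x and writes the result to sa.
void cmp(int n,int sz)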
{
    int i;
    memset(c,0,sizeof(c));
    for(i=0;i<n;i++)
        c[x[y[i]]]++;
    for(i=1;i<sz;i++)
        c[i]+=c[i-1];
    for(i=n-1;i>=0;i--)
        sa[--c[x[y[i]]]]=y[i];
}
// Prefix-doubling construction of the suffix array of s[0..n-1]; sz is the initial alphabet size.
void build_sa(char *s,int n,int sz)
{
    x=Rank,y=rank2;
    int i,j;
    for(i=0;i<n;i++)
        x[i]=s[i],y[i]=i;
    cmp(n,sz);
    int len;
    for(len=1;len<n;len<<=1)
    {
        int yid=0;
        for(i=n-len;i<n;i++)
        {
            y[yid++]=i;
        }
        for(i=0;i<n;i++)
            if(sa[i]>=len)
                y[yid++]=sa[i]-len;
        cmp(n,sz);
        swap(x,y);
        x[sa[0]]=yid=0;
        for(i=1;i<n;i++)
        {
            if(y[sa[i-1]]==y[sa[i]]&&sa[i-1]+len<n&&sa[i]+len<n&&y[sa[i-1]+len]==y[sa[i]+len])
                x[sa[i]]=yid;
            else
                x[sa[i]]=++yid;
        }
        sz=yid+1;
        if(sz>=n)
            break;
    }
    for(i=0;i<n;i++)
        Rank[i]=x[i];
}
// Kasai's algorithm: height[i] is the LCP of the suffixes ranked i-1 and i.
void getHeight(char *s,int n)
{
    int k=0;
    for(int i=0;i<n;i++)
    {
        if(Rank[i]==0)
            continue;
        k=max(0,k-1);
        int j=sa[Rank[i]-1];
        while(s[i+k]==s[j+k])
            k++;
        height[Rank[i]]=k;
    }
}
// Sparse table: minv[i][j] is the minimum of height[i..i+2^j-1]; lg[i] is floor(log2(i)).
int minv[103010][20],lg[103030];
void init_lg()
{
    int i;
    lg[1]=0;
    for(i=2;i<102020;i++)
    {
        lg[i]=lg[i>>1]+1;
    }
}
void init_RMQ(int n)
{
    int i,j,k;
    for(i=1;i<=n;i++)
    {
        minv[i][0]=height[i];
    }
    for(j=1;j<=lg[n];j++)
    {
        for(k=0;k+(1<<j)-1<=n;k++)
        {
            minv[k][j]=min(minv[k][j-1],minv[k+(1<<(j-1))][j-1]);
        }
    }
}
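// LCP of the suffixes starting at text positions l and r.
int lcp(int l,int r)
{
    // A suffix compared with itself shares its whole length.
    if(l==r)
    {
        return len-l;
    }
    l=Rank[l];
    r=Rank[r];
    if(l>r)
        swap(l,r);
    // The answer is the minimum of height[l+1..r], covered by two overlapping blocks.
    l++;
    int k=lg[r-l+1];
    return min(minv[l][k],minv[r-(1<<k)+1][k]);
}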
// Number of decimal digits of x; 0 is printed as the single digit "0".
__int64 fun(__int64 x)
{
    __int64 ans=0;
    if(x==0)
        return 1;
    while(x)
    {
        x/=10;
        ans++;
    }
    return ans;
}
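int main()
{
    // Each test case: the string S, the line count, then one [l, r) pair per line.
    while(scanf("%s",str)!=EOF)
    {
        int t;
        len=strlen(str);
        // Pass len+1 so the terminating '\0' acts as a sentinel smaller than every letter.
        build_sa(str,len+1,256);
        getHeight(str,len);
        scanf("%d",&t);
        init_lg();
        init_RMQ(len);
        // ans1: size of the plain input, ans2: size of the compressed output.
        __int64 ans1,ans2;
        ans1=ans2=0;
        int l,r;
        scanf("%d%d",&l,&r);
        ans1+=(r-l)+1; // text + newline
        t--;
        ans2+=(r-l)+3; // "0" + space + text + newline
        while(t--)
        {
            int a,b;
            scanf("%d%d",&a,&b);
            ans1+=(b-a)+1;
            // The shared prefix cannot be longer than either substring.
            __int64 p;
            p=min(lcp(l,a),min((b-a),(r-l)));
            ans2+=(b-a)-p+2+fun(p); // remaining text + space + newline + digits of p
            l=a;
            r=b;
        }
        // __int64 and %I64d are MSVC/MinGW-style 64-bit extensions.
        printf("%I64d %I64d\n",ans1,ans2);
    }
    return 0;
}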
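
The key fact behind `lcp` is that the LCP of two suffixes equals the minimum of the height array between their ranks. The following is a minimal sketch that checks this on small strings. The struct name `NaiveLcp` is hypothetical, the suffix array is built by plain sorting rather than prefix doubling, and no sentinel character is appended, so ranks here are one lower than in the solution above.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

struct NaiveLcp {
    std::string s;
    std::vector<int> sa, rnk, height;

    explicit NaiveLcp(const std::string& str) : s(str) {
        int n = static_cast<int>(s.size());
        sa.resize(n); rnk.resize(n); height.assign(n, 0);
        std::iota(sa.begin(), sa.end(), 0);
        // Naive construction: sort suffix start positions lexicographically.
        std::sort(sa.begin(), sa.end(), [&](int a, int b) {
            return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
        });
        for (int i = 0; i < n; ++i) rnk[sa[i]] = i;
        // height[i] is the LCP of the suffixes ranked i-1 and i.
        for (int i = 1; i < n; ++i) height[i] = direct(sa[i - 1], sa[i]);
    }

    // Character-by-character LCP of the suffixes starting at a and b.
    int direct(int a, int b) const {
        int n = static_cast<int>(s.size()), k = 0;
        while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
        return k;
    }

    // LCP via the height array: minimum of height[lo+1..hi] over the rank interval.
    int query(int i, int j) const {
        if (i == j) return static_cast<int>(s.size()) - i;
        int lo = rnk[i], hi = rnk[j];
        if (lo > hi) std::swap(lo, hi);
        return *std::min_element(height.begin() + lo + 1, height.begin() + hi + 1);
    }
};
```

Here `query` scans the range linearly. The solution above answers the same range-minimum question in O(1) with a sparse table, which is what makes 100,000 queries feasible.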

Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.

Time: 2024-10-13 03:05:37
