LA 4513: representing string suffixes with hashes

https://icpcarchive.ecs.baylor.edu/index.php?option=com_onlinejudge&Itemid=8&page=show_problem&category=&problem=2514&mosmsg=Submission+received+with+ID+1623942

Dr. Ellie Arroway has established contact with an extraterrestrial civilization. However, all efforts to decode their messages have failed so far because, as luck would have it, they have stumbled upon a race of stuttering aliens! Her team has found out that, in every long enough message, the most important words appear repeated a certain number of times as a sequence of consecutive characters, even in the middle of other words. Furthermore, sometimes they use contractions in an obscure manner. For example, if they need to say bab twice, they might just send the message babab, which has been abbreviated because the second b of the first word can be reused as the first b of the second one.

Thus, the message contains possibly overlapping repetitions of the same words over and over again. As a result, Ellie turns to you, S.R. Hadden, for help in identifying the gist of the message.

Given an integer m, and a string s, representing the message, your task is to find the longest substring of s that appears at least m times. For example, in the message baaaababababbababbab, the length-5 word babab is contained 3 times, namely at positions 5, 7 and 12 (where indices start at zero). No substring appearing 3 or more times is longer (see the first example from the sample input). On the other hand, no substring appears 11 times or more (see example 2).

In case there are several solutions, the substring with the rightmost occurrence is preferred (see example 3).
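
The overlap rule is easy to get wrong, so here is a minimal sketch that lists every start index of a word, counting overlapping matches. The helper name `occurrences` is made up for illustration and is not part of the original solution.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): every start index at
// which `word` occurs in `s`, with overlapping occurrences counted.
std::vector<int> occurrences(const std::string& s, const std::string& word) {
    assert(!word.empty());
    std::vector<int> starts;
    // Restart the search one character after the previous match,
    // not after its end, so that overlapping matches are found too.
    for (std::string::size_type p = s.find(word);
         p != std::string::npos;
         p = s.find(word, p + 1)) {
        starts.push_back(static_cast<int>(p));
    }
    return starts;
}
```

Calling `occurrences("baaaababababbababbab", "babab")` gives the positions 5, 7 and 12 quoted in the statement; the matches at 5 and 7 share three characters.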

Input

The input contains several test cases. Each test case consists of a line with an integer m (m ≥ 1), the minimum number of repetitions, followed by a line containing a string s of length between m and 40 000, inclusive. All characters in s are lowercase characters from "a" to "z". The last test case is denoted by m = 0 and must not be processed.

Output

Print one line of output for each test case. If there is no solution, output none; otherwise, print two integers in a line, separated by a space.
The first integer denotes the maximum length of a substring appearing at least m times; the second integer gives the rightmost
possible starting position of such a substring.

Sample Input

3
baaaababababbababbab
11
baaaababababbababbab
3
cccccc
0

Sample Output

5 12
none
4 2
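
A brute-force reference is handy for cross-checking a hashing solution on small inputs. The sketch below tries every length from longest to shortest and counts exact substrings in a `std::map`, so it cannot suffer hash collisions. It is far too slow for strings of length 40 000, and the name `bruteForce` and the `{0, -1}` encoding of "none" are assumptions made for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical reference solver (illustration only).
// Returns {length, rightmost start}, or {0, -1} when the answer is "none".
std::pair<int, int> bruteForce(int m, const std::string& s) {
    assert(m >= 1);
    const int n = static_cast<int>(s.size());
    for (int len = n; len >= 1; --len) {          // longest length first
        std::map<std::string, int> count;          // exact substring -> occurrences so far
        int best = -1;
        for (int i = 0; i + len <= n; ++i) {
            // i only increases, so the last qualifying i is the rightmost start.
            if (++count[s.substr(i, len)] >= m) best = i;
        }
        if (best >= 0) return std::make_pair(len, best);
    }
    return std::make_pair(0, -1);
}
```

On the three sample cases it returns {5, 12}, {0, -1} and {4, 2}, matching the sample output above.
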
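/**
UVa LA 4513, hashing
(the "white book", p. 225)
Problem:
          A stammering alien repeats many strings while speaking. Given one sentence spoken by the alien, find the longest substring that appears at least m times;
          if one exists, print its length and the largest starting position of such a substring.
Approach:
          Binary search on the answer L, then check whether some substring of length L appears at least m times. This is monotone: if a substring of length L
          appears at least m times, so does its prefix of length L-1. The check is simple: going left to right, compute the hash of the length-L substring at every
          starting position; as soon as some hash value occurs at least m times, L is feasible.
          Note that the loop condition of the binary search needs care.

*/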
#include <stdio.h>
#include <string.h>
#include <iostream>
#include <algorithm>
using namespace std;
typedef unsigned long long  LL;

const int maxn=40000+10;
const int hashseed=31;
int n,m,pos;

LL Hash[maxn];///Hash[i]: hash of the length-L substring starting at i
LL seed[maxn],has[maxn];///seed[i]: i-th power of the seed; has[i]: hash of the suffix starting at i
int R[maxn];

int cmp(const int& a,const int& b)
{
    return Hash[a]<Hash[b] || (Hash[a]==Hash[b] && a<b);
}

int ok(int L)
{
    int c=0;
    pos=-1;
    for(int i=0;i<n-L+1;i++)
    {
        R[i]=i;
        Hash[i]=has[i]-has[i+L]*seed[L];
    }
    sort(R,R+n-L+1,cmp);///sort by hash value; for equal hashes the smaller index comes first
    for(int i=0;i<n-L+1;i++)
    {
        if(i==0||Hash[R[i]]!=Hash[R[i-1]])c=0;
        if(++c>=m)pos=max(pos,R[i]);
    }
    return pos>=0;
}

int main()
{
    char s[maxn];
    while(~scanf("%d%*c",&m))
    {
        if(m==0)break;
        scanf("%s",s);
        n=strlen(s);
        has[n]=0;
        for(int i=n-1;i>=0;i--)
              has[i]=has[i+1]*hashseed+(s[i]-'a');
        seed[0]=1;
        for(int i=1;i<=n;i++)
            seed[i]=seed[i-1]*hashseed;
        if(!ok(1))
        {
            printf("none\n");
        }
        else
        {
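            ///The binary search needs a little care: the candidate lengths are 1..n and we want the largest feasible one, so r starts at n+1 and the loop runs while r-l>1 (l is always feasible, r never is)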
            int l=1,r=n+1;
            while(r-l>1)
            {
                int mid=l+(r-l)/2;
                if(ok(mid))
                    l=mid;
                else
                    r=mid;
            }
            ok(l);
            printf("%d %d\n",l,pos);
        }
    }
    return 0;
}
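
The key step in `ok` is the identity `Hash[i]=has[i]-has[i+L]*seed[L]`. Since `has[i]` is the sum of `(s[k]-'a')*31^(k-i)` over all k ≥ i, subtracting `has[i+L]*31^L` cancels every term from position i+L onward and leaves only the L characters starting at i. The sketch below repackages that computation in a small struct and compares it with a hash computed directly from the substring. The names `SuffixHash`, `sub` and `directHash` are made up for this example. All arithmetic wraps modulo 2^64, so equal hashes do not strictly prove equal strings; the solution above accepts that risk.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef unsigned long long ULL;
const ULL kSeed = 31;  // same seed as `hashseed` in the solution above

// Hypothetical wrapper around the has[] / seed[] arrays of the solution.
struct SuffixHash {
    std::vector<ULL> has;   // has[i]: hash of the suffix starting at i
    std::vector<ULL> seed;  // seed[i]: kSeed to the power i
    explicit SuffixHash(const std::string& s)
        : has(s.size() + 1, 0), seed(s.size() + 1, 1) {
        const int n = static_cast<int>(s.size());
        for (int i = n - 1; i >= 0; --i) has[i] = has[i + 1] * kSeed + (s[i] - 'a');
        for (int i = 1; i <= n; ++i) seed[i] = seed[i - 1] * kSeed;
    }
    // Hash of the length-L substring starting at i, in O(1).
    ULL sub(int i, int L) const {
        assert(i >= 0 && L >= 0 && i + L < static_cast<int>(has.size()));
        return has[i] - has[i + L] * seed[L];
    }
};

// The same polynomial evaluated directly on a standalone string.
ULL directHash(const std::string& t) {
    ULL h = 0;
    for (int k = static_cast<int>(t.size()) - 1; k >= 0; --k) h = h * kSeed + (t[k] - 'a');
    return h;
}
```

For the first sample string, `sub(5, 5)`, `sub(7, 5)` and `sub(12, 5)` all equal `directHash("babab")`, which is why the three occurrences end up adjacent after sorting by hash.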

Time: 2024-12-25 07:03:22
