Word splitting & trie & tree DP: LA 3942 Remember the Word

https://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=505&page=show_problem&problem=4147

3942 - Remember the Word

Neal is very curious about combinatorial problems, and now here comes a problem about words. Knowing that Ray has a photographic memory and this may not trouble him, Neal gives it to Jiejie.

Since Jiejie can't remember numbers clearly, he just uses sticks to help himself. Allowing for Jiejie's only 20071027 sticks, he can only record the remainders of the numbers divided by total amount of sticks.

The problem is as follows: a word needs to be divided into small pieces in such a way that each piece is from some given set of words. Given a word and the set of words, Jiejie should calculate the number of ways the given word can be divided, using the words in the set.

Input

The input file contains multiple test cases. For each test case: the first line contains the given word whose length is no more than 300 000.

The second line contains an integer S, 1 ≤ S ≤ 4000.

Each of the following S lines contains one word from the set. Each word will be at most 100 characters long. There will be no two identical words and all letters in the words will be lowercase.

There is a blank line between consecutive test cases.

You should proceed to the end of file.

Output

For each test case, output the number, as described above, from the task description modulo 20071027.

Sample Input

abcd
4
a
b
cd
ab

Sample Output

Case 1: 2

Problem: given a set of words and one long string, count the number of ways the long string can be split into words from the set.

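Approach:

dp[i] = sum(dp[i+len(x)]), with dp[L] = 1

dp[i] is the number of ways to decompose the suffix s[i..L], i.e. the string that starts at character i. The answer is dp[0].

x ranges over the words in the set that are prefixes of s[i..L]. Inserting all the words into a trie finds every such x in at most 100 steps per position, because no word is longer than 100 characters.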
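To make the recurrence concrete before the trie comes in, here is a minimal sketch that evaluates it directly by trying every word at every position. The function name `count_splits_naive` is hypothetical (it does not appear in the original post), and the sketch assumes every word is non-empty. It is far too slow for the real limits and is only meant as a reference for small inputs.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of the original solution: evaluates
// d[i] = sum of d[i + len(x)] over all words x that are a prefix of the
// suffix starting at i, by brute force.
int count_splits_naive(const std::string& text,
                       const std::vector<std::string>& words) {
    const int MOD = 20071027;
    const int L = static_cast<int>(text.size());
    std::vector<int> d(L + 1, 0);
    d[L] = 1;  // the empty suffix has exactly one decomposition
    for (int i = L - 1; i >= 0; --i) {
        for (const std::string& w : words) {
            // compare() returns 0 only when the w.size() characters starting
            // at i exist and equal w, so i + w.size() never exceeds L here.
            if (text.compare(i, w.size(), w) == 0) {
                d[i] = (d[i] + d[i + w.size()]) % MOD;
            }
        }
    }
    return d[0];
}
```

On the sample, `count_splits_naive("abcd", {"a", "b", "cd", "ab"})` works backwards as d[4] = 1, d[3] = 0, d[2] = d[4] = 1, d[1] = d[2] = 1, d[0] = d[1] + d[2] = 2, matching the two splits a-b-cd and ab-cd.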

// LA3942 Remember the Word
#include<cstring>
#include<vector>
using namespace std;

const int maxnode = 4000 * 100 + 10;
const int sigma_size = 26;

// Trie over the alphabet of lowercase letters
struct Trie {
  int ch[maxnode][sigma_size];
  int val[maxnode];
  int sz; // total number of nodes
  void clear() { sz = 1; memset(ch[0], 0, sizeof(ch[0])); } // initially there is only the root node
  int idx(char c) { return c - 'a'; } // index of character c

  // Insert string s with attached value v. Note that v must be nonzero, because 0 means "this node is not a word node"
  void insert(const char *s, int v) {
    int u = 0, n = strlen(s);
    for(int i = 0; i < n; i++) {
      int c = idx(s[i]);
      if(!ch[u][c]) { // node does not exist
        memset(ch[sz], 0, sizeof(ch[sz]));
        val[sz] = 0;  // intermediate nodes carry the value 0
        ch[u][c] = sz++; // create a new node
      }
      u = ch[u][c]; // walk down
    }
    val[u] = v; // the node of the string's last character carries the value v
  }

  // Find the words that are prefixes of s with length at most len
  void find_prefixes(const char *s, int len, vector<int>& ans) {
    int u = 0;
    for(int i = 0; i < len; i++) {
      if(s[i] == '\0') break;
      int c = idx(s[i]);
      if(!ch[u][c]) break;
      u = ch[u][c];
      if(val[u] != 0) ans.push_back(val[u]); // found a prefix
    }
  }
};

#include<cstdio>
const int maxl = 300000 + 10; // maximum length of the text string
const int maxw = 4000 + 10;   // maximum number of words
const int maxwl = 100 + 10;   // maximum length of each word
const int MOD = 20071027;

int d[maxl], len[maxw], S;
char text[maxl], word[maxwl];
Trie trie;

int main() {
  int kase = 1;
  while(scanf("%s%d", text, &S) == 2) {
    trie.clear();
    for(int i = 1; i <= S; i++) {
      scanf("%s", word);
      len[i] = strlen(word);
      trie.insert(word, i);
    }
    memset(d, 0, sizeof(d));
    int L = strlen(text);
    d[L] = 1;
    for(int i = L-1; i >= 0; i--) {
      vector<int> p;
      trie.find_prefixes(text+i, L-i, p);
      for(int j = 0; j < (int)p.size(); j++) // cast avoids a signed/unsigned comparison
        d[i] = (d[i] + d[i+len[p[j]]]) % MOD;
    }
    printf("Case %d: %d\n", kase++, d[0]);
  }
  return 0;
}
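The solution above stores each word's index in the trie and looks its length up in `len[]`. As an alternative sketch, the length can be read off the depth reached while walking the trie, so neither the index nor the temporary vector is needed. The names `SmallTrie` and `count_splits` are hypothetical, and the sketch assumes the input contains lowercase letters only, as the problem statement guarantees.

```cpp
#include <array>
#include <string>
#include <vector>

// Hypothetical compact trie (not the struct from the post above).
struct SmallTrie {
    std::vector<std::array<int, 26>> next;  // child indices, 0 means "no child"
    std::vector<bool> is_word;              // true if a word ends at this node
    SmallTrie() : next(1, std::array<int, 26>{}), is_word(1, false) {}

    void insert(const std::string& s) {
        int u = 0;
        for (char ch : s) {
            int c = ch - 'a';
            if (!next[u][c]) {
                next[u][c] = static_cast<int>(next.size());  // index of the new node
                next.push_back(std::array<int, 26>{});
                is_word.push_back(false);
            }
            u = next[u][c];
        }
        is_word[u] = true;
    }
};

// Counts the decompositions of text modulo 20071027 using the same
// right-to-left recurrence, with d[j + 1] taken directly from the walk depth.
int count_splits(const std::string& text, const std::vector<std::string>& words) {
    const int MOD = 20071027;
    SmallTrie trie;
    for (const std::string& w : words) trie.insert(w);
    const int L = static_cast<int>(text.size());
    std::vector<int> d(L + 1, 0);
    d[L] = 1;
    for (int i = L - 1; i >= 0; --i) {
        int u = 0;
        for (int j = i; j < L; ++j) {
            u = trie.next[u][text[j] - 'a'];
            if (!u) break;  // no word in the set continues along this path
            if (trie.is_word[u]) d[i] = (d[i] + d[j + 1]) % MOD;
        }
    }
    return d[0];
}
```

The inner walk stops as soon as the trie has no matching child, so it takes at most as many steps as the longest word. Using 0 as the "no child" marker is safe because the root is never the child of another node.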

Copyright notice: this article is the blogger's original work and may not be reproduced without the blogger's permission.

Time: 2024-10-12 21:39:16
