Cracking the Coding Interview, Chapter 17: Moderate, Problem 14

2014-04-29 00:20

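Problem: You are given a long string and a dictionary. If you are allowed to split the string into fragments, some fragments will be found in the dictionary and others will not. Design a segmentation algorithm that keeps the unrecognized fragments to a minimum. The code below measures this by the total number of characters that fall in unrecognized fragments.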

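Solution: A dynamic programming algorithm that trades space for time. First, check whether each substring is in the dictionary. There are O(n^2) substrings, so this takes O(n^2) lookups; each lookup hashes a string of up to n characters, so the total work can reach O(n^3). A trie can speed this step up and improve the time by one order, but I did not write one and lazily used an <unordered_set> as the dictionary instead. The second step is the dynamic programming itself, where dp[i] is the minimum number of unrecognized characters in the prefix ending at position i. There are O(n) states, and each state tries every earlier split point, so this step takes O(n^2) time. For people who read code, the code says it more clearly than prose, so please see the code.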
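As a compact restatement of the dynamic programming step, the recurrence can be written as follows. The cost symbol c(a, b) is notation introduced here for illustration only and does not appear in the code; it is the number of unrecognized characters charged to the fragment data[a..b].

```latex
c(a, b) =
\begin{cases}
0 & \text{if } \mathrm{data}[a..b] \text{ is in the dictionary} \\
b - a + 1 & \text{otherwise}
\end{cases}
\qquad
dp[i] = \min\Bigl( c(0, i),\ \min_{0 \le j < i} \bigl( dp[j] + c(j + 1, i) \bigr) \Bigr)
```

The first term covers the case where the whole prefix data[0..i] is a single fragment, and the inner minimum tries every position j as the end of the previous fragment. The answer for a string of length n is dp[n - 1].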

Code:

// 17.14 Given a dictionary of words and a long string, cut the string into
// pieces, some of which may not be in the dictionary, so that as few
// characters as possible end up in pieces that are not in the dictionary.
// Dynamic programming is a good thing, but it trades space for time.
// Input per test case: the string, a word count n, then n dictionary words.
#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

int main()
{
    string data;
    unordered_set<string> dict;
    vector<vector<bool> > contains;
    vector<int> dp;
    int i, j;
    string s;
    int n;
    int tmp;

    while (cin >> data >> n) {
        for (i = 0; i < n; ++i) {
            cin >> s;
            dict.insert(s);
        }
        n = (int)data.length();

        // contains[i][j]: is data[i..j] (both ends inclusive) in the dictionary?
        contains.resize(n);
        for (i = 0; i < n; ++i) {
            contains[i].resize(n);
        }
        for (i = 0; i < n; ++i) {
            s = "";
            for (j = i; j < n; ++j) {
                s.push_back(data[j]);
                contains[i][j] = (dict.find(s) != dict.end());
            }
        }

        // dp[i]: minimum number of unrecognized characters in data[0..i].
        dp.resize(n);
        for (i = 0; i < n; ++i) {
            // Treat the whole prefix as one fragment first.
            dp[i] = contains[0][i] ? 0 : i + 1;
            // Then try every split where the last fragment is data[j + 1..i].
            for (j = 0; j < i; ++j) {
                tmp = dp[j] + (contains[j + 1][i] ? 0 : i - j);
                dp[i] = dp[i] < tmp ? dp[i] : tmp;
            }
        }

        cout << dp[n - 1] << endl;

        // Reset all state before the next test case.
        for (i = 0; i < n; ++i) {
            contains[i].clear();
        }
        contains.clear();
        dp.clear();
        dict.clear();
    }

    return 0;
}
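The solution text mentions that a trie could speed up the dictionary check but leaves it unwritten. The following is a minimal sketch of that idea, not the author's code. The names `TrieNode`, `trieInsert`, and `minUnrecognized` are hypothetical, and the sketch assumes the string and the dictionary words use only the lowercase letters 'a' to 'z' (dictionary words containing other characters are ignored). It fills the table forward from each start position, which yields the same minimum as charging each unrecognized fragment its length.

```cpp
#include <algorithm>
#include <array>
#include <memory>
#include <string>
#include <vector>

// Hypothetical helper type: a minimal trie over the letters 'a'..'z'.
struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, 26> next;
    bool isWord = false;
};

// Inserts one word into the trie rooted at `root`.
void trieInsert(TrieNode &root, const std::string &word) {
    TrieNode *cur = &root;
    for (char c : word) {
        if (c < 'a' || c > 'z') return;  // outside the assumed alphabet: ignore the word
        auto &child = cur->next[c - 'a'];
        if (!child) child = std::make_unique<TrieNode>();
        cur = child.get();
    }
    cur->isWord = true;
}

// Returns the minimum number of characters of `data` left outside dictionary words.
int minUnrecognized(const std::string &data, const std::vector<std::string> &words) {
    TrieNode root;
    for (const std::string &w : words) trieInsert(root, w);

    const int n = static_cast<int>(data.size());
    // best[i]: minimum number of unrecognized characters in data[0..i-1].
    // n is a safe upper bound because at most every character is unrecognized.
    std::vector<int> best(n + 1, n);
    best[0] = 0;
    for (int i = 0; i < n; ++i) {
        // Option 1: leave data[i] unrecognized.
        best[i + 1] = std::min(best[i + 1], best[i] + 1);
        // Option 2: walk the trie to find every dictionary word starting at i.
        const TrieNode *cur = &root;
        for (int j = i; j < n; ++j) {
            const char c = data[j];
            if (c < 'a' || c > 'z') break;
            cur = cur->next[c - 'a'].get();
            if (!cur) break;  // no dictionary word has data[i..j] as a prefix
            if (cur->isWord) best[j + 1] = std::min(best[j + 1], best[i]);
        }
    }
    return best[n];
}
```

Each trie walk stops as soon as no dictionary word can match, so one start position costs at most the length of the longest word, and no substring is ever rebuilt or hashed. With the words looked, just, like, her, and brother, calling `minUnrecognized` on "jesslookedjustliketimherbrother" returns 7, since only "jess" and "tim" are left unrecognized.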

