LeetCode: Word Break II Solution Report

Word Break II
Given a string s and a dictionary of words dict, add spaces in s to
construct a sentence where each word is a valid dictionary
word.

Return all such possible sentences.

For example, given
s = "catsanddog",
dict = ["cat", "cats", "and", "sand", "dog"].

A solution is ["cats and dog", "cat sand dog"].

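Solution 1 (DFS):
Let's keep cutting!

This problem is similar in spirit to the previous one, Word Break, but that one was solved with DP and this one uses DFS.
A quick review of the difference between DP and DFS:
DP works bottom-up, while DFS works top-down.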
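
To make the contrast concrete, here is a minimal sketch for the simpler yes/no question from Word Break (can s be segmented at all?). The class name BreakDirections and the method names bottomUp and topDown are made up for this illustration and do not appear in the original code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class BreakDirections {
    // Bottom-up DP: ok[i] is true when the prefix s[0, i) can be segmented.
    // Small prefixes are solved first and larger ones are built from them.
    static boolean bottomUp(String s, Set<String> dict) {
        boolean[] ok = new boolean[s.length() + 1];
        ok[0] = true; // the empty prefix is trivially segmentable
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 0; j < i && !ok[i]; j++) {
                ok[i] = ok[j] && dict.contains(s.substring(j, i));
            }
        }
        return ok[s.length()];
    }

    // Top-down DFS: start from the whole string and recurse into suffixes.
    // memo[start] caches whether the suffix s[start, len) can be segmented.
    static boolean topDown(String s, Set<String> dict, int start, Boolean[] memo) {
        if (start == s.length()) {
            return true;
        }
        if (memo[start] != null) {
            return memo[start];
        }
        boolean result = false;
        for (int end = start + 1; end <= s.length() && !result; end++) {
            result = dict.contains(s.substring(start, end))
                    && topDown(s, dict, end, memo);
        }
        memo[start] = result;
        return result;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<String>(
                Arrays.asList("cat", "cats", "and", "sand", "dog"));
        String s = "catsanddog";
        System.out.println(bottomUp(s, dict));
        System.out.println(topDown(s, dict, 0, new Boolean[s.length() + 1]));
    }
}
```

Both calls answer the same question; only the order in which the subproblems are visited differs.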

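In the DFS for this problem, we define the recursion as follows:
Make one cut in the string, leaving i characters on the left and len - i characters on the right, for i from 1 to len.
If the left part is a dictionary word and the right part can be word-broken, prepend the left part to every sentence computed for the right part and return the resulting new list.
1. Base case:
When the input string is empty, return a single empty solution (a list containing ""). This matters: without it, a cut whose left part is the entire string would never produce a sentence, and the recursion would return nothing.
2. When recursing, i must start from 1, because the problem has to be split into two parts with a non-empty left part. If the left part had length 0, the right part would be the same string again and the recursion would never terminate.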

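Memoization:
To speed up the DFS we should add memoization, so that a string that has already been solved is not computed again. For example:
apple n feng
app len feng
If both of these splits exist, the string "feng" is computed repeatedly, at least twice here. We use a HashMap to store the solutions for each string, which avoids the repeated work.
Without it, this problem exceeds the time limit.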
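
The effect of the cache can be observed with a small counting experiment. The sketch below is only an illustration: the class name MemoDemo, the counter fengExpansions, and the input "applenfeng" with the dictionary {apple, app, n, len, feng} are assumptions chosen to match the two splits above, not part of the original solution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MemoDemo {
    // How many times the suffix "feng" was expanded rather than served from the cache.
    static int fengExpansions = 0;

    // Returns every sentence for s. Pass memo == null to disable caching.
    static List<String> dfs(String s, Set<String> dict, Map<String, List<String>> memo) {
        if (memo != null && memo.containsKey(s)) {
            return memo.get(s);
        }
        if (s.equals("feng")) {
            fengExpansions++;
        }
        List<String> result = new ArrayList<String>();
        if (s.isEmpty()) {
            result.add(""); // base case: exactly one empty sentence
        }
        for (int i = 1; i <= s.length(); i++) {
            String left = s.substring(0, i);
            if (!dict.contains(left)) {
                continue;
            }
            for (String right : dfs(s.substring(i), dict, memo)) {
                result.add(right.isEmpty() ? left : left + " " + right);
            }
        }
        if (memo != null) {
            memo.put(s, result);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<String>(
                Arrays.asList("apple", "app", "n", "len", "feng"));

        fengExpansions = 0;
        dfs("applenfeng", dict, null);
        System.out.println("without memo: " + fengExpansions);

        fengExpansions = 0;
        dfs("applenfeng", dict, new HashMap<String, List<String>>());
        System.out.println("with memo: " + fengExpansions);
    }
}
```

On this input the suffix "feng" is expanded twice without the cache and once with it. On inputs with many overlapping splits the gap grows quickly.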

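Solution 2 (dfs2):
Following the solution at http://blog.csdn.net/fightforyourdream/article/details/38530983, we again use the recursion template that this blog has used many times. On its own it exceeds the time limit on LeetCode; it only passed after adding a check on entering the DFS that tests whether the remaining string can be word-broken at all. This is a DFS + pruning solution.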

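Solution 3 (dfs3):

Thanks to the explanation at http://fisherlei.blogspot.com/2013/11/leetcode-wordbreak-ii-solution.html, we can add a boolean array in which b[i] records whether the substring from i to len can be word-broken. If the DFS over that substring finds no sentence at all, the substring cannot be word-broken, so we record false in the array. The next time the DFS reaches that position, it returns immediately. This pruning also reduces the amount of work.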

package Algorithms.dp;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WordBreak2 {
    public static void main(String[] strs) {
        String s = "aaaaaaaaaaaaaaaaaaaaaaa";
        Set<String> dict = new HashSet<String>();
        dict.add("bin");
        dict.add("apple");
        dict.add("app");
        dict.add("le");
        dict.add("aaaaaa");
        dict.add("aaaaa");
        dict.add("aaaa");
        dict.add("aaa");
        dict.add("aa");
        dict.add("a");
        dict.add("aaaaaaa");
        dict.add("aaaaaaaa");
        dict.add("aaaaaaaaa");

        System.out.println("Test");

        // Stopwatch is a helper class from this project's Algorithms.permutation package.
        Algorithms.permutation.Stopwatch timer1 = new Algorithms.permutation.Stopwatch();

        // Solution 2: recursion template plus pruning
        List<String> list = wordBreak(s, dict);

        System.out.println("Computing time with DFS2: "
                + timer1.elapsedTime() + " millisec.");

        Algorithms.permutation.Stopwatch timer2 = new Algorithms.permutation.Stopwatch();

        // Solution 1: memoization with a HashMap
        wordBreak1(s, dict);

        System.out.println("Computing time with DFS1: "
                + timer2.elapsedTime() + " millisec.");

        Algorithms.permutation.Stopwatch timer3 = new Algorithms.permutation.Stopwatch();

        // Solution 3: DFS + pruning with a flag array
        // http://fisherlei.blogspot.com/2013/11/leetcode-wordbreak-ii-solution.html
        wordBreak3(s, dict);

        System.out.println("Computing time with DFS3: "
                + timer3.elapsedTime() + " millisec.");

        //System.out.println(list.toString());
    }

    // Solution 1: DFS with memoization
    public static List<String> wordBreak1(String s, Set<String> dict) {
        HashMap<String, List<String>> map = new HashMap<String, List<String>>();
        if (s == null || s.length() == 0 || dict == null) {
            return null;
        }

        return dfs(s, dict, map);
    }

    // Returns every sentence that s can be broken into; map caches the result per string.
    public static List<String> dfs(String s, Set<String> dict, HashMap<String, List<String>> map) {
        if (map.containsKey(s)) {
            return map.get(s);
        }

        List<String> list = new ArrayList<String>();
        int len = s.length();

        if (len == 0) {
            // Base case: the empty string has exactly one (empty) solution.
            list.add("");
        } else {
            // i is the length of the left part
            for (int i = 1; i <= len; i++) {
                String sub = s.substring(0, i);

                // The left part must be a dictionary word.
                if (!dict.contains(sub)) {
                    continue;
                }

                // Split the string in two and compute the word break of the right part.
                List<String> listRight = dfs(s.substring(i, len), dict, map);

                // Skip this cut when the right part cannot be broken.
                if (listRight.size() == 0) {
                    continue;
                }

                // Prepend the left part to each solution of the right part.
                for (String r: listRight) {
                    StringBuilder sb = new StringBuilder();
                    sb.append(sub);
                    if (i != len) {
                        // No space is needed when the right part is empty.
                        sb.append(" ");
                    }
                    sb.append(r);
                    list.add(sb.toString());
                }
            }
        }

        map.put(s, list);
        return list;
    }

    // Solution 2: the usual recursion template, plus pruning.
    public static List<String> wordBreak(String s, Set<String> dict) {
        if (s == null || s.length() == 0 || dict == null) {
            return null;
        }

        List<String> ret = new ArrayList<String>();

        // Words chosen so far along the current sequence of cuts
        List<String> path = new ArrayList<String>();

        dfs2(s, dict, path, ret, 0);

        return ret;
    }

    // DFS template: index is the start of the part of s that is still uncut.
    public static void dfs2(String s, Set<String> dict,
            List<String> path, List<String> ret, int index) {
        int len = s.length();
        if (index == len) {
            // index reached the end: join the path into one sentence.
            StringBuilder sb = new StringBuilder();
            for (String str: path) {
                sb.append(str);
                sb.append(" ");
            }
            // remove the last " "
            sb.deleteCharAt(sb.length() - 1);
            ret.add(sb.toString());
            return;
        }

        // Without this check the solution exceeds the time limit: when the rest
        // cannot be broken, we return right away. This is arguably just a trick,
        // though, and the approach is still not great.
        if (!iswordBreak(s.substring(index), dict)) {
            return;
        }

        for (int i = index; i < len; i++) {
            // Mind the indices: left is s[index..i], so its length runs from 1 to len - index.
            String left = s.substring(index, i + 1);
            if (!dict.contains(left)) {
                // No need to recurse when the left part is not in the dictionary.
                continue;
            }

            path.add(left);
            dfs2(s, dict, path, ret, i + 1);
            path.remove(path.size() - 1);
        }
    }

    public static boolean iswordBreak(String s, Set<String> dict) {
        if (s == null) {
            return false;
        }

        int len = s.length();
        if (len == 0) {
            return true;
        }

        boolean[] D = new boolean[len + 1];

        // Initialize the DP. D[0] has to be true: when the left part has length 0
        // and the right part is the whole string and a dictionary word, the result
        // can only be true if the left part counts as breakable. So the empty
        // string is treated as breakable.
        D[0] = true;

        // D[i] says whether the prefix of length i can be word-broken.
        for (int i = 1; i <= len; i++) {
            // Split the prefix in two; j is the length of the left part.
            // It works when the left part can be broken and the right part is a dictionary word.
            D[i] = false;
            for (int j = 0; j < i; j++) {
                if (D[j] && dict.contains(s.substring(j, i))) {
                    // One valid split is enough, so we can break: this prefix qualifies.
                    D[i] = true;
                    break;
                }
            }
        }

        return D[len];
    }

    // Solution 3: a different pruning, using a flag array.
    public static List<String> wordBreak3(String s, Set<String> dict) {
        if (s == null || s.length() == 0 || dict == null) {
            return null;
        }

        List<String> ret = new ArrayList<String>();

        // Words chosen so far along the current sequence of cuts
        List<String> path = new ArrayList<String>();

        int len = s.length();

        // Note: allocate len + 1 entries, otherwise canBreak[i + 1] goes out of bounds.
        boolean[] canBreak = new boolean[len + 1];
        for (int i = 0; i < len + 1; i++) {
            canBreak[i] = true;
        }

        dfs3(s, dict, path, ret, 0, canBreak);

        return ret;
    }

    // DFS template; canBreak[i] becomes false once s[i..len) is known to have no solution.
    public static void dfs3(String s, Set<String> dict,
            List<String> path, List<String> ret, int index,
            boolean[] canBreak) {
        int len = s.length();
        if (index == len) {
            // index reached the end: join the path into one sentence.
            StringBuilder sb = new StringBuilder();
            for (String str: path) {
                sb.append(str);
                sb.append(" ");
            }
            // remove the last " "
            sb.deleteCharAt(sb.length() - 1);
            ret.add(sb.toString());
            return;
        }

        // If the rest cannot be broken, exit directly.
        if (!canBreak[index]) {
            return;
        }

        for (int i = index; i < len; i++) {
            // Mind the indices: left is s[index..i], so its length runs from 1 to len - index.
            String left = s.substring(index, i + 1);
            if (!dict.contains(left) || !canBreak[i + 1]) {
                // Skip when left is not in the dictionary or the rest is known to be unbreakable.
                continue;
            }

            path.add(left);

            int beforeChange = ret.size();
            dfs3(s, dict, path, ret, i + 1, canBreak);
            // This is the key pruning: if the recursive call added no sentence,
            // the suffix starting at i + 1 cannot be broken, so mark it false.
            if (ret.size() == beforeChange) {
                canBreak[i + 1] = false;
            }
            path.remove(path.size() - 1);
        }
    }
}

Comparison and testing:

Here are the running times of the solutions:

Test

Computing time with DFS1: 8300.0 millisec.
Computing time with DFS2: 5720.0 millisec.
Computing time with DFS3: 5468.0 millisec.

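As the timings show, the third method is the fastest of the three, and it is the one recommended for interviews. Another option is to run a DP first, before generating any results, that computes which substrings can be word-broken; this also cuts down the work.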
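
That last idea can be sketched as follows. This is a minimal illustration under assumptions, not code from the original post: the class name WordBreakPrecheck and the array name breakable are hypothetical, and the DP is run over suffixes so that the DFS can test any cut position directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WordBreakPrecheck {
    static List<String> wordBreak(String s, Set<String> dict) {
        List<String> ret = new ArrayList<String>();
        int len = s.length();
        if (len == 0) {
            return ret;
        }

        // breakable[i] is true when the suffix s[i, len) can be segmented.
        boolean[] breakable = new boolean[len + 1];
        breakable[len] = true; // the empty suffix is trivially segmentable
        for (int i = len - 1; i >= 0; i--) {
            for (int j = i + 1; j <= len && !breakable[i]; j++) {
                breakable[i] = breakable[j] && dict.contains(s.substring(i, j));
            }
        }

        dfs(s, dict, 0, breakable, new ArrayList<String>(), ret);
        return ret;
    }

    static void dfs(String s, Set<String> dict, int index, boolean[] breakable,
            List<String> path, List<String> ret) {
        if (index == s.length()) {
            ret.add(String.join(" ", path));
            return;
        }
        for (int end = index + 1; end <= s.length(); end++) {
            // Only follow a cut when the remainder is known to be segmentable.
            if (!breakable[end]) {
                continue;
            }
            String word = s.substring(index, end);
            if (!dict.contains(word)) {
                continue;
            }
            path.add(word);
            dfs(s, dict, end, breakable, path, ret);
            path.remove(path.size() - 1);
        }
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<String>(
                Arrays.asList("cat", "cats", "and", "sand", "dog"));
        System.out.println(wordBreak("catsanddog", dict));
    }
}
```

Because the table is filled once up front, the DFS never enters a branch that cannot reach the end of the string, so every recursive call contributes to at least one sentence.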


Time: 2024-10-13 10:43:52
