Repeated DNA Sequences: Solution

Question

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].

Solution -- Bit Manipulation

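The straightforward idea is to store every 10-letter substring in a set. Time complexity is O(n) and space complexity is O(n). Looking at the constant factor of the space cost, though: a Java char is 2 bytes, so each 10-letter substring needs 20 bytes of character data, which gives about 20n bytes in total (not counting per-object overhead).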
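As a baseline for comparison, the set-of-substrings idea can be sketched as follows. This is a minimal illustration, not the solution developed in this post; the class name SubstringSetBaseline and the method name findRepeated are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SubstringSetBaseline {
    // Hypothetical helper: returns every 10-letter substring that occurs more than once.
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();      // substrings seen at least once
        Set<String> reported = new HashSet<>();  // substrings already put in the result
        List<String> result = new ArrayList<>();

        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // Set.add returns false when the element was already present,
            // so this reports a window exactly once, on its second occurrence.
            if (!seen.add(window) && reported.add(window)) {
                result.add(window);
            }
        }
        return result;
    }
}
```

Every window is kept as a full String here, which is where the 20 bytes of character data per substring come from.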

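If we represent each DNA substring by an integer instead, the space is cut down to about 4n bytes. There are only four possible letters, so each letter fits in 2 bits, and a 10-letter substring fits in 20 bits of a 4-byte int.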
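To make the encoding concrete, here is a small sketch that packs a window into an int, 2 bits per letter. The class name DnaEncoding and the methods code and encode are hypothetical names chosen for this illustration; the mapping A=0, C=1, G=2, T=3 is the one used in the solution.

```java
class DnaEncoding {
    // Maps one nucleotide to its 2-bit code.
    static int code(char c) {
        switch (c) {
            case 'A': return 0; // 00
            case 'C': return 1; // 01
            case 'G': return 2; // 10
            case 'T': return 3; // 11
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Packs the letters of window into an int, first letter in the highest bits.
    // A 10-letter window uses exactly 20 bits.
    static int encode(String window) {
        int value = 0;
        for (int i = 0; i < window.length(); i++) {
            value = (value << 2) | code(window.charAt(i));
        }
        return value;
    }
}
```

For example, "AAAAACCCCC" becomes the bit pattern 00 00 00 00 00 01 01 01 01 01, which is 341, and "TTTTTTTTTT" becomes twenty 1-bits, which is 1048575 (2^20 - 1). Distinct 10-letter windows always map to distinct integers, so comparing integers is equivalent to comparing the substrings.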

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> result = new ArrayList<String>();

        int len = s.length();
        if (len < 10) {
            return result;
        }

        // Each nucleotide maps to a 2-bit code.
        Map<Character, Integer> map = new HashMap<Character, Integer>();
        map.put('A', 0);
        map.put('C', 1);
        map.put('G', 2);
        map.put('T', 3);

        Set<Integer> temp = new HashSet<Integer>();  // hashes seen at least once
        Set<Integer> added = new HashSet<Integer>(); // hashes already in result

        int hash = 0;
        for (int i = 0; i < len; i++) {
            // Each of A, C, G, T fits in 2 bits, so shift left by 2 and append.
            hash = (hash << 2) + map.get(s.charAt(i));
            if (i < 9) {
                continue; // the first full 10-letter window ends at index 9
            }
            // Keep only the low 20 bits, i.e. the last 10 letters.
            hash = hash & ((1 << 20) - 1);

            if (temp.contains(hash) && !added.contains(hash)) {
                result.add(s.substring(i - 9, i + 1));
                added.add(hash); // track added
            } else {
                temp.add(hash);
            }
        }

        return result;
    }
}
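The key step in the loop is the rolling update: shifting left by 2 and masking to 20 bits drops the oldest letter and appends the newest one in constant time. The sketch below isolates that step to show it gives the same value as encoding the new window from scratch. RollingWindowDemo, encode, and slide are hypothetical names for this illustration, and the input is assumed to contain only A, C, G, and T.

```java
class RollingWindowDemo {
    // Twenty 1-bits: room for exactly ten 2-bit letters.
    static final int MASK = (1 << 20) - 1;

    // 2-bit code of a nucleotide: A=0, C=1, G=2, T=3 (valid input assumed).
    static int code(char c) {
        return "ACGT".indexOf(c);
    }

    // Encodes a whole window from scratch, first letter in the highest bits.
    static int encode(String window) {
        int value = 0;
        for (int i = 0; i < window.length(); i++) {
            value = (value << 2) | code(window.charAt(i));
        }
        return value;
    }

    // Slides a 10-letter window one position to the right:
    // the shift makes room for the new letter, the mask discards the oldest one.
    static int slide(int hash, char next) {
        return ((hash << 2) | code(next)) & MASK;
    }
}
```

For the string "TAAAAAAAAAC", the first window "TAAAAAAAAA" encodes to 3 << 18. Sliding in 'C' shifts the leading T's bits above bit 19, the mask removes them, and the result is 1, the same as encoding "AAAAAAAAAC" directly.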

Time: 2024-12-24 10:39:24
