LeetCode-Repeated DNA

关于位操作符。如<<, value << num ,其中,num指定要位移值value移动的位数,每左移一个位,高阶位都被移出(直接丢掉),并用0填充右边。。

道理明明很简单啊,老娘我高中就会了好不。。可是不知怎么用啊,coding水平太渣,做题太少,从来木有用过。Σ( ° △ °|||)︴

如,在LeetCode中遇到一个Repeat DNA的题目,刚开始很自然的想到用subString()的方法,无奈每个人的DNA序列成千上万,导致时间、空间需求太大,不符合要求。于是,百度之,大神门说要用到Hash映射的方法。涨姿势啊!!

众所周知,在计算机中,符号01串移位的时间总会要比字符串比较的时间复杂度低得多,而在DNA序列中,只会出现A,C,G,T四个符号,因此可以考虑将字符串对应成01串,这样只需“00”,"01","10","11"即可表示以上四个字符,10个字符的比较用一个32位的整数即可表示!!这样比较10位的String变成了比较一个int型整数啊!!神奇

但是,怎么把不同的字符串表示对应成整数呢?一些大神门描述说用一种叫做“字典”的方式,其实在我看来,无非就是Hash加移位麽,具体描述如下:

首先,将四个字符‘A-0,C-1,G-2,T-3’放入一张HashMap中,这样,计算机中的表示即2位的01串。

1 HashMap<Character, Integer> co = new HashMap<Character, Integer>();
2 co.put(‘A‘, 0);
3 co.put(‘B‘, 1);
4 co.put(‘C‘, 2);
5 c0.put(‘D‘, 3);

第二,那么10位字符串怎么对应成一个32位的整数呢?开始时百思不得其解,其实也很简单,用一个for循环即可。(PS:要自己亲自演示一遍,否则还是云里雾里),这里,首先把前十个对应成整数:

1 Integer key = 0;
2 for(int i=0; i<10; i++){
3  key = (key << 2) + co.get(s.charAt(i));
4 }//计算完for循环,得到的数值是341,正好对应00^000101010101,即前十个字符AAAAACCCCC!

第三,明白了这个原理,后面的步骤就很简单了。再使用一张HashMap,记录目前只出现一次的连续10字符串,若当前字符串出现在map中,则是repeat,放入最终结果表list中。
第四,还有一个问题,往后移动一个char时,计算方法如同上面的for循环中的计算公式。

好啦啦,大功告,,回家碎觉觉^_^

时间: 2024-11-08 21:33:42

LeetCode-Repeated DNA的相关文章

[LeetCode]Repeated DNA Sequences

题目:Repeated DNA Sequences 给定包含A.C.G.T四个字符的字符串找出其中十个字符的重复子串. 思路: 首先,string中只有ACGT四个字符,因此可以将string看成是1,3,7,20这三个数字的组合串: 并且可以发现{ACGT}%5={1,3,2,0};于是可以用两个位就能表示上面的四个字符: 同时,一个子序列有10个字符,一共需要20bit,即int型数据类型就能表示一个子序列: 这样可以使用计数排序的思想来统计重复子序列: 这个思路时间复杂度只有O(n),但是

[LeetCode]Repeated DNA Sequences,解题报告

目录 目录 前言 题目 Native思路 二进制思路 AC 前言 最近在LeetCode上能一次AC的概率越来越低了,我这里也是把每次不能一次AC的题目记录下来,把解题思路分享给大家. 题目 All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to

LeetCode() Repeated DNA Sequences 看的非常的过瘾!

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA. Write a function to find all the 10-letter-long seq

Leetcode: Repeated DNA Sequence

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA. Write a function to find all the 10-letter-long seq

[LeetCode] Repeated DNA Sequences hash map

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA. Write a function to find all the 10-letter-long seq

[LeetCode]Repeated DNA Sequences Total

题意:题目意思很简单就是有一个由 A C G T 组成的字符串,要求找出字符窜中出现次数不止1次的字串 思路1: 遍历字符串,用hashmap存储字串,判断即可 代码1: public List<String> findRepeatedDnaSequences(String s) { List<String> rs = new LinkedList<String>(); Map<String, Integer> map = new HashMap<St

leetcode 204/187/205 Count Primes/Repeated DNA Sequences/Isomorphic Strings

一:leetcode 204 Count Primes 题目: Description: Count the number of prime numbers less than a non-negative number, n 分析:此题的算法源码可以参看这里,http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes 代码: class Solution { public: int countPrimes(int n) { // 求小于一个数n的素数个

【LeetCode】187. Repeated DNA Sequences

Repeated DNA Sequences All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA. Write a function to find all

Repeated DNA Sequences

package cn.edu.xidian.sselab.hashtable; import java.util.ArrayList;import java.util.HashSet;import java.util.List;import java.util.Set; /** *  * @author zhiyong wang * title: Repeated DNA Sequences * content: *  All DNA is composed of a series of nuc

Leetcode:Repeated DNA Sequences详细题解

题目 All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA. Write a function to find all the 10-letter-long