Repeated DNA Sequences

package cn.edu.xidian.sselab.hashtable;

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 *
 * @author zhiyong wang
 * title: Repeated DNA Sequences
 * content:
 *  All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T,
 *  for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
 *  Write a function to find all the 10-letter-long sequences (substrings)
 *  that occur more than once in a DNA molecule.
 *
 * For example,
 *
 * Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",
 * Return
 * ["AAAAACCCCC", "CCCCCAAAAA"].
 *
 */
public class RepeatedDNASequences {

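    // I was careless and got several things wrong:
    // (1) I started writing code before thinking the approach through.
    // (2) I did not handle the special cases (null or too-short input) first.
    // (3) The substring range on each iteration is (i, i+10); I carelessly hard-coded (0, 10).
    // (4) I only checked whether the set contains the substring and forgot to add it to the set.
    // (5) I did not consider whether a substring that repeats several times would be inserted more than once.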
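    public List<String> findRepeatedDNASequences(String s){
        List<String> list = new ArrayList<String>();
        // Check for null before calling s.length() to avoid a NullPointerException.
        if(s == null || s.length() < 10) return list;
        int length = s.length();
        Set<String> set = new HashSet<String>();
        for(int i=0;i<length-9;i++){
            String temp = s.substring(i, i+10);
            // Report a window the first time it repeats; otherwise remember it.
            if(set.contains(temp) && !list.contains(temp)){
                list.add(temp);
            }else{
                set.add(temp);
            }
        }
        return list;
    }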
    public static void main(String[] args) {
        RepeatedDNASequences r = new RepeatedDNASequences();
        // Prints [] because neither 10-letter window of this input repeats.
        System.out.println(r.findRepeatedDNASequences("CAAAAAAAAAC"));
    }
}
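
The class above calls list.contains(temp) for every repeated window, which scans the result list each time. A possible variant, sketched here under my own assumptions rather than taken from the original solution, keeps a second set for the repeats. The class name RepeatedDnaTwoSets and the method name findRepeated are hypothetical, and LinkedHashSet is my choice so that results come back in the order each repeat is first detected.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class name, used only for this sketch.
class RepeatedDnaTwoSets {

    // Returns every 10-letter substring that occurs more than once in s.
    static List<String> findRepeated(String s) {
        if (s == null || s.length() < 10) {
            return new ArrayList<>();
        }
        Set<String> seen = new HashSet<>();           // every window met so far
        Set<String> repeated = new LinkedHashSet<>(); // windows met at least twice
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // Set.add returns false when the element is already present,
            // so a false result means this window has been seen before.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        // Example input from the problem statement.
        // Prints [AAAAACCCCC, CCCCCAAAAA]
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
    }
}
```

Because a set ignores duplicate insertions, a window that appears three or more times is still reported once, which covers point (5) in the author's notes without the extra list lookup.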

Time: 2024-09-30 15:04:18
