【leetcode】Repeated DNA Sequences（middle）★

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

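Approach:

At first I used a hash table to store every string that had appeared once, and that exceeded the memory limit. Then I tried the simplest possible loop, and that exceeded the time limit. I could not solve it, so I looked at the answers.

The expert's method represents each 10-character string as a single integer, which amounts to encoding the string. Each letter is represented by a 2-bit binary number. For each letter in turn, the accumulated value is shifted left by two bits and the new letter's code is ORed into the freed low bits. Ten letters need only 20 bits, so the whole window fits in a 32-bit int.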
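
To make the encoding concrete, here is a minimal sketch of the packing step on its own. The helper names `nucleotideCode` and `encode` are hypothetical (they are not part of the original solution), and treating any non-ACGT character as A is an assumption of this sketch.

```cpp
#include <cstdint>
#include <string>

// Map one nucleotide to its 2-bit code: A=00, C=01, G=10, T=11.
// Assumption of this sketch: any other character is treated as A.
inline std::uint32_t nucleotideCode(char c) {
    switch (c) {
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return 0;
    }
}

// Pack a short sequence (at most 16 letters) into an integer, 2 bits per letter.
// Each step shifts the accumulated value left by two bits and ORs in the new code.
inline std::uint32_t encode(const std::string& seq) {
    std::uint32_t v = 0;
    for (char c : seq) {
        v = (v << 2) | nucleotideCode(c);
    }
    return v;
}
```

For example, `encode("ACGT")` packs the bits 00 01 10 11, which is 27, and `encode("AAAAACCCCC")` is 341 (five zero codes followed by 01 01 01 01 01).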

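// The expert's method: represent each 10-character string as one integer, i.e. encode the string
// Needs <algorithm>, <string>, <unordered_set>, <vector> and using namespace std, as in the LeetCode environment
    vector<string> findRepeatedDnaSequences3(string s) {
        unordered_set<int> words;  // codes of all windows seen so far
        vector<string> ans;
        char map[26] = {0};        // stack array: zero-initialized, nothing to free
        map['A' - 'A'] = 0;  // A, C, G, T are encoded as binary 00, 01, 10, 11
        map['C' - 'A'] = 1;
        map['G' - 'A'] = 2;
        map['T' - 'A'] = 3;

        for(size_t i = 0; i + 9 < s.length(); i++) // try every start position
        {
            int v = 0;
            for(size_t j = i; j < i + 10; j++)
            {
                // Each letter maps to a 2-bit number: shift left by two bits
                // to make room, then OR in the code of the new letter
                v <<= 2;
                v |= map[s[j] - 'A'];
            }
            // If this code was seen before and the substring is not yet in the answer, add it
            if(words.find(v) != words.end() && find(ans.begin(), ans.end(), s.substr(i, 10)) == ans.end())
            {
                ans.push_back(s.substr(i, 10));
            }
            else
            {
                words.insert(v);
            }
        }

        return ans;
    }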
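
The function above rebuilds the integer from scratch for every start position and searches `ans` linearly. As a possible refinement (not the expert's code, just a sketch under the same 2-bit encoding), the code can be updated incrementally: shift in the new letter and mask off everything above the low 20 bits, so each step costs O(1). The name `findRepeatedDnaSequencesRolling` and the second set `reported` are hypothetical choices for this illustration, and non-ACGT characters are treated as A by assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

std::vector<std::string> findRepeatedDnaSequencesRolling(const std::string& s) {
    const std::size_t kLen = 10;
    const std::uint32_t kMask = (1u << (2 * kLen)) - 1;  // keeps the low 20 bits
    std::vector<std::string> ans;
    if (s.size() < kLen) return ans;

    std::unordered_set<std::uint32_t> seen;      // codes seen at least once
    std::unordered_set<std::uint32_t> reported;  // codes already pushed into ans
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // 2-bit code of the incoming letter (assumption: anything else counts as A)
        std::uint32_t code = 0;
        switch (s[i]) {
            case 'C': code = 1; break;
            case 'G': code = 2; break;
            case 'T': code = 3; break;
            default:  code = 0; break;
        }
        // Shift in the new letter; the mask drops the letter that left the window
        v = ((v << 2) | code) & kMask;
        if (i + 1 < kLen) continue;  // the first full window ends at index kLen - 1

        // insert().second is false when the code was already present
        if (!seen.insert(v).second && reported.insert(v).second) {
            ans.push_back(s.substr(i + 1 - kLen, kLen));
        }
    }
    return ans;
}
```

On the example input "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT" this returns "AAAAACCCCC" followed by "CCCCCAAAAA". Using a second set instead of searching the answer vector keeps the duplicate check at expected constant time.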

My two attempts that did not pass

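// Hash table: exceeded the memory limit
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> ans;
        unordered_set<string> hash;  // substrings seen exactly once so far

        if(s.length() < 10) return ans;

        // i + 10 <= s.length() so that the last window is also checked
        for(size_t i = 0; i + 10 <= s.length(); i++)
        {
            string sub = s.substr(i, 10);
            // Already reported: skip
            if(find(ans.begin(), ans.end(), sub) != ans.end())
            {
                continue;
            }
            if(hash.count(sub) == 0)
            {
                hash.insert(sub);
            }
            else
            {
                // Second occurrence: move it from the set to the answer
                hash.erase(sub);
                ans.push_back(sub);
            }
        }
        return ans;
    }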

    // Plain search: exceeded the time limit
    vector<string> findRepeatedDnaSequences2(string s) {
        vector<string> ans;
        if(s.length() < 10) return ans;

        for(size_t i = 0; i + 10 <= s.length(); i++)
        {
            string sub = s.substr(i, 10);
            // Already reported: skip
            if(find(ans.begin(), ans.end(), sub) != ans.end())
            {
                continue;
            }
            // Look for another occurrence starting after position i
            else if(s.find(sub, i + 1) != s.npos)
            {
                ans.push_back(sub);
            }
        }

        return ans;
    }

Time: 2024-12-09 01:37:24
