All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example, given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return ["AAAAACCCCC", "CCCCCAAAAA"].
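As a point of comparison, the problem statement can be solved directly by counting every 10-letter window in a hash map. The sketch below is only a baseline under that assumption; the function name findRepeatsBaseline is hypothetical and does not appear in the original post.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical baseline (name invented for illustration): count each
// 10-letter window and report it the moment it is seen a second time.
std::vector<std::string> findRepeatsBaseline(const std::string& s) {
    std::vector<std::string> result;
    std::unordered_map<std::string, int> seen;  // window -> occurrences so far
    for (std::size_t i = 0; i + 10 <= s.size(); ++i) {
        std::string window = s.substr(i, 10);
        // Push exactly once per window: when its count goes from 1 to 2.
        if (++seen[window] == 2) {
            result.push_back(window);
        }
    }
    return result;
}
```

This version stores every window as a full string, so it uses more memory than a solution that packs each window into an integer key, but it is easy to check against the example input.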
Implementation:
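#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> vs;
        // A repeat needs at least two 10-letter windows, i.e. 11 letters.
        if (s.size() < 11) return vs;
        // One counter per 20-bit key (10 letters x 2 bits). Keys run from 0 to
        // 0xfffff inclusive, so the table needs 1 << 20 entries.
        vector<char> hash(1 << 20, 0);
        unsigned int flag = 0;
        // Pre-load the first 9 letters. (c - 'A' + 1) % 5 maps A, C, G, T
        // to the distinct 2-bit codes 1, 3, 2, 0.
        for (int i = 0; i < 9; i++) {
            flag = flag << 2 | (s[i] - 'A' + 1) % 5;
        }
        for (size_t i = 9; i < s.size(); i++) {
            // Shift in the next letter and keep only the low 20 bits.
            flag = (flag << 2 | (s[i] - 'A' + 1) % 5) & 0xfffff;
            // Report a sequence the second time it is seen; the counter stops
            // at 2 so the char cannot wrap around on long inputs.
            if (hash[flag] < 2 && hash[flag]++ == 1) {
                vs.push_back(s.substr(i - 9, 10));
            }
        }
        return vs;
    }
};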
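The key step in the code above is the expression (s[i] - 'A' + 1) % 5, which turns each nucleotide into a 2-bit code so that a 10-letter window fits in 20 bits. The following is a minimal sketch of that encoding in isolation; the helper names encodeBase and encodeWindow are hypothetical and were introduced here only for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: same arithmetic as the solution.
// 'A' -> 1, 'C' -> 3, 'G' -> 2, 'T' -> 0 (four distinct 2-bit values).
inline std::uint32_t encodeBase(char c) {
    return static_cast<std::uint32_t>((c - 'A' + 1) % 5);
}

// Hypothetical helper: packs a window into an integer key, 2 bits per
// letter, keeping only the low 20 bits (the most recent 10 letters).
std::uint32_t encodeWindow(const std::string& window) {
    std::uint32_t key = 0;
    for (char c : window) {
        key = ((key << 2) | encodeBase(c)) & 0xFFFFFu;
    }
    return key;
}
```

Because there are 4^10 = 1,048,576 possible 10-letter sequences, the keys cover 0 through 0xFFFFF, and two different windows never share a key. A lookup table indexed by the key therefore needs exactly 1 << 20 entries.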
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
Time: 2024-10-19 04:30:46