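Problem: The task is simple. Given a string made up of the characters A, C, G, and T, find every 10-character substring that occurs more than once in the string.
Approach 1: Walk through the string, store each 10-character substring in a HashMap together with its occurrence count, then return the substrings whose count is greater than 1.
Code 1: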
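import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Count every 10-character window, then collect the ones seen more than once.
public List<String> findRepeatedDnaSequences(String s) {
    List<String> rs = new LinkedList<String>();
    Map<String, Integer> map = new HashMap<String, Integer>();
    // The loop body never runs when s has fewer than 10 characters.
    for (int i = 0; i <= s.length() - 10; ++i) {
        String substr = s.substring(i, i + 10);
        if (map.containsKey(substr)) {
            map.put(substr, map.get(substr) + 1);
        } else {
            map.put(substr, 1);
        }
    }
    for (Map.Entry<String, Integer> en : map.entrySet()) {
        if (en.getValue() > 1) {
            rs.add(en.getKey());
        }
    }
    return rs;
}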
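Approach 2: Because the string contains only A, G, C, and T, each character can be encoded in 2 bits. A 10-character substring then fits in the low 20 bits of an int, so the map can be keyed by integers instead of strings, which saves storage space.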
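To make the 2-bit packing concrete, here is a minimal sketch of encoding a single window into an int and decoding it back. The class name `DnaEncodingSketch`, the helper names, and the code assignment A=0, C=1, G=2, T=3 are assumptions chosen for illustration, not something fixed by the problem.

```java
final class DnaEncodingSketch {
    // Assumed code assignment: A=0, C=1, G=2, T=3 (any 2-bit assignment works).
    static int code(char base) {
        switch (base) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a DNA base: " + base);
        }
    }

    // Packs a window of at most 16 bases into an int, 2 bits per base,
    // with the first base ending up in the highest used bits.
    static int encode(String window) {
        int packed = 0;
        for (int i = 0; i < window.length(); i++) {
            packed = (packed << 2) | code(window.charAt(i));
        }
        return packed;
    }

    // Reverses encode for a window of the given length.
    static String decode(int packed, int length) {
        char[] bases = {'A', 'C', 'G', 'T'};
        char[] out = new char[length];
        for (int i = length - 1; i >= 0; i--) {
            out[i] = bases[packed & 3]; // the lowest 2 bits hold the last base
            packed >>>= 2;
        }
        return new String(out);
    }
}
```

With this assignment, ten T characters set all 20 low bits, so the largest possible key is (1 << 20) - 1.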
Code 2:
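import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Encodes each character as a 2-bit value and slides a 20-bit window over s.
 * @param s the DNA string
 * @return every 10-character substring that occurs more than once
 */
public List<String> findRepeatedDnaSequences(String s) {
    List<String> rs = new LinkedList<String>();
    if (s.length() < 10) return rs;
    // Codes 0-3, so that each character really fits in 2 bits.
    Map<Character, Integer> c2i = new HashMap<Character, Integer>();
    c2i.put('A', 0);
    c2i.put('C', 1);
    c2i.put('G', 2);
    c2i.put('T', 3);
    Map<Integer, Integer> res = new HashMap<Integer, Integer>(); // encodings seen so far
    Set<Integer> added = new HashSet<Integer>();                 // encodings already reported
    int target = 0;
    for (int i = 0; i < s.length(); i++) {
        target = (target << 2) + c2i.get(s.charAt(i));
        if (i < 9) continue;                // the window is not yet 10 characters long
        target = target & ((1 << 20) - 1);  // keep only the low 20 bits
        if (res.containsKey(target) && !added.contains(target)) {
            rs.add(s.substring(i - 9, i + 1));
            added.add(target);
        } else {
            res.put(target, 1);
        }
    }
    return rs;
}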
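For quick experimentation outside an online judge, the rolling-encoding idea can be wrapped in a standalone class. The following is a sketch under stated assumptions rather than a reference solution: the class name `RepeatedDnaSketch` and method name `findRepeated` are hypothetical, the codes A=0, C=1, G=2, T=3 are assumed, and two sets stand in for the map, since only membership matters.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RepeatedDnaSketch {
    // Assumed 2-bit codes: the position of the base in "ACGT".
    static int code(char base) {
        int c = "ACGT".indexOf(base);
        if (c < 0) throw new IllegalArgumentException("not a DNA base: " + base);
        return c;
    }

    static List<String> findRepeated(String s) {
        List<String> result = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();      // encodings of windows seen so far
        Set<Integer> reported = new HashSet<>();  // encodings already in the result
        int mask = (1 << 20) - 1;                 // low 20 bits = 10 bases
        int window = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new base; the mask drops the base that left the window.
            window = ((window << 2) | code(s.charAt(i))) & mask;
            if (i < 9) continue;                  // fewer than 10 bases read so far
            // seen.add returns false when the encoding was already present.
            if (!seen.add(window) && reported.add(window)) {
                result.add(s.substring(i - 9, i + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [AAAAACCCCC, CCCCCAAAAA]
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
    }
}
```

Results appear in the order in which each repeat is first detected, which differs from the HashMap iteration order of Approach 1.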
Time: 2024-11-07 06:18:16