Repeated DNA Sequences The DNA sequence is composed of a series of nucleotides abbreviated as 'A'
, 'C'
, 'G'
, and 'T'
.
- For example,
"ACGAATTCCG"
is a DNA sequence.
When studying DNA, it is useful to identify repeated sequences within the DNA.
Given a string s
that represents a DNA sequence, return all the 10
-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.
Example 1:
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT" Output: ["AAAAACCCCC","CCCCCAAAAA"]
Example 2:
Input: s = "AAAAAAAAAAAAA" Output: ["AAAAAAAAAA"]
Constraints:
1 <= s.length <= 105
s[i]
is either'A'
,'C'
,'G'
, or'T'
.
Repeated DNA Sequences Solutions
✅Time: O(10n) = O(n)
✅Space: O(10n) = O(n)
C++
class Solution {
public:
vector<string> findRepeatedDnaSequences(string s) {
unordered_set<string> ans;
unordered_set<string_view> seen;
const string_view sv(s);
for (int i = 0; i + 10 <= s.length(); ++i) {
if (seen.count(sv.substr(i, 10)))
ans.insert(s.substr(i, 10));
seen.insert(sv.substr(i, 10));
}
return {begin(ans), end(ans)};
}
};
Java
class Solution {
public List<String> findRepeatedDnaSequences(String s) {
Set<String> ans = new HashSet<>();
Set<String> seen = new HashSet<>();
for (int i = 0; i + 10 <= s.length(); ++i) {
final String seq = s.substring(i, i + 10);
if (seen.contains(seq))
ans.add(seq);
seen.add(seq);
}
return new ArrayList<>(ans);
}
}
Python
class Solution:
def findRepeatedDnaSequences(self, s: str) -> List[str]:
ans = set()
seen = set()
for i in range(len(s) - 9):
seq = s[i:i + 10]
if seq in seen:
ans.add(seq)
seen.add(seq)
return list(ans)
Watch Tutorial
Checkout more Solutions here