Leetcode

187. Repeated DNA Sequences

Leeter 2021. 10. 23. 02:03

Description:

The DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'.

  • For example, "ACGAATTCCG" is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

 

'''
[1] using two set)

idea::
1. slide window (length==10)
2-1. if window string appears for the first time, add it to first_set
2-2. else(window string in first_set) add it answer set

code::
(below)

-T/C: O(n), n==len(s) # for sliding O(n), for finding or adding string in hashset O(1)
-S/C: O(n) # for two set. O(n) + O(n)
'''

class Solution:
    def findRepeatedDnaSequences(self, s: str) -> List[str]:
        first_set=set()
        answer_set=set()
        
        for i in range(0,len(s)-10+1):
            string_window=s[i:i+10]
            
            if string_window in first_set:
                answer_set.add(string_window)
            else:
                first_set.add(string_window)
                
        return list(answer_set)