Python script to get all the distinct substrings from a string

Recently I came across an interview question:

Print out all the substrings of a given string.?

Now, the catch in the question is for strings like “ababc”. If you are blindly adding the substrings to a storage you have, you will end up with copies of “ab”, which is not what we want. Now, what is the best method to avoid duplicates..??
Ans: hash tables.. 😉

Thus, using the power of Python (i.e. the sets available in python) I have written the following script which takes the string as an input and returns the set of the substrings.

def substr(string):
    while True:
        for i in range(len(string)-j+1):
        if j==len(string):
    return a

Also, to print the set sequentially, we can easily convert the returned set to an ordered list. For example, here I have sorted the list based on the length of the substrings.

def prnt(seta):
    print l
    ## return l # if you want to return this object too

The normal execution on the commandline looks like this:

>>> x
 set(['a', 'ayus', 'h', 'yus', 'ayush', 'us', 'ush', 'ay', 's', 'sh', 'u', 'yush', 'y', 'yu', 'ayu'])
>>> prnt(x)
 ['a', 'h', 's', 'u', 'y', 'us', 'ay', 'sh', 'yu', 'yus', 'ush', 'ayu', 'ayus', 'yush', 'ayush']

time O(n**2) ## worst but this method provides us with the substrings also
memory O(2**n) ## worst when all characters in the string are different

Explore posts in the same categories: Python

Tags: , , ,

You can comment below, or link to this permanent URL from your own site.

3 Comments on “Python script to get all the distinct substrings from a string”

  1. wehnsdaefflae Says:

    you know you can also do this with set compehension. for example like that:

    test_set = {‘a’, ‘ayus’, ‘h’, ‘yus’, ‘ayush’, ‘us’, ‘ush’, ‘ay’, ‘s’, ‘sh’, ‘u’, ‘yush’, ‘y’, ‘yu’, ‘ayu’}
    new_set = {x[i:j] for x in test_set for i in range(len(x)) for j in range(len(x)+1) if len(x[i:j]) > 0}
    print new_set

    # returns set([‘a’, ‘ayus’, ‘h’, ‘yus’, ‘ayush’, ‘us’, ‘ush’, ‘y’, ‘s’, ‘sh’, ‘u’, ‘yush’, ‘ay’, ‘yu’, ‘ayu’])

  2. A much cleaner and probably more efficient solution must I suggest:

    “`def get_all_substrings(text):
    length = len(text)
    return {text[i:j+1] for i in range(length) for j in range(i, length)}“`

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: