Posts tagged ‘python’

Threading in Python

April 15, 2012

Recently, in an effort to learn something new in Python, I decided to write a small introduction to threading.

Python's default installation ships with modules for threading, among them thread and threading.

From Python Docs:

Note
The thread module has been renamed to _thread in Python 3.0. The 2to3 tool will automatically adapt imports when converting your sources to 3.0; however, you should consider using the high-level threading module instead.

So when writing scripts that use threads, prefer the threading module over thread.

To save some time, I will skip the basic concepts of threads; Wikipedia covers them well.

Now for the first program.

#!/usr/bin/env python

import time
import thread

def myfunction(string, sleeptime, max_count, *args):

    counter = 0
    ## To manage I/O
    time.sleep(0.2)
    while counter < max_count:
        print "{}. {}".format(counter, string)
        counter += 1
        time.sleep(sleeptime)
        #sleep for a specified amount of time.

if __name__=="__main__":

    print "thread Started : {}".format(thread.start_new_thread(myfunction,("Thread No:1", 2, 10)))
##    thread.exit_thread
##
    ## this can be omitted
    while 1:
        pass

 

In the above script, a new thread is started with the function myfunction. The arguments for the function are passed to start_new_thread() as a tuple (remember to pack the arguments you want to pass into a tuple). start_new_thread() returns the identifier of the thread it started, which is printed here.
Something I often notice in threaded programs is the use of time.sleep(). Here it only spaces out the output on the terminal; it does not really synchronize the threads. In actual backend scripts, the sleep function would not prove useful for that (I may be wrong!).
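As a rough sketch of what the docs recommend, the same idea can be written with the higher-level threading module. This assumes Python 3 (the listing above is Python 2); the names repeat_message and results are made up for illustration. Calling join() takes the place of the busy `while 1: pass` loop.

```python
import threading
import time

results = []  # hypothetical: collects the lines instead of only printing them

def repeat_message(message, sleeptime, max_count):
    """Print `message` max_count times, pausing sleeptime seconds between prints."""
    for counter in range(max_count):
        line = "{}. {}".format(counter, message)
        print(line)
        results.append(line)
        time.sleep(sleeptime)

worker = threading.Thread(target=repeat_message, args=("Thread No:1", 0.01, 3))
worker.start()
worker.join()  # wait for the thread to finish instead of spinning in a loop
```

The main thread simply blocks in join() until the worker is done, so no CPU time is burned while waiting.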

The last example was just an introduction. To step up a level, let’s calculate the Fibonacci series from a thread.

Code:

#Fibonacci threader

import time, thread, threading

def fib(n):
    a, b = 0, 1
    while a

The above script uses both the thread and threading modules. The thread module is used to create threads, and the threading module is used to get information about the threads currently running in the process.

Here fib(n) is actually a generator function: calling it returns a generator iterator, so we can iterate over the Fibonacci numbers as they are produced. After the required number of Fibonacci numbers have been generated, thread.exit_thread() is called, which exits the running thread silently.

After creating the thread the script prints the information of the running threads. (Execute to see)
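The listing above is cut off, so the following is not the original script, only a minimal sketch of the idea it describes: a Fibonacci generator consumed inside a thread, with threading.enumerate() listing the live threads. It assumes Python 3 and the threading module; fib_worker and collected are hypothetical names, and the worker ends by returning rather than by calling an exit function.

```python
import threading

def fib(n):
    """Yield the Fibonacci numbers smaller than n."""
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a + b

collected = []  # hypothetical container for the worker's output

def fib_worker(limit):
    # Iterate over the generator; each value is computed only when asked for.
    for value in fib(limit):
        collected.append(value)

worker = threading.Thread(target=fib_worker, args=(50,), name="fib-thread")
worker.start()
print(threading.enumerate())  # the main thread, plus the worker if it is still running
worker.join()
print(collected)
```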

Finally, here is the code for the (quite famous) Producer-Consumer problem, which also shows how to use locks.

Code:


#!/usr/bin/env python

import time
import thread

## Implementing consumer-producer problem using threads and locks

product = []

def producer(lock, produce_time, lim, *args):

    pr_val = 0

    while True:

        print "Producing.."
        time.sleep(produce_time)
        print "Produced {}".format(pr_val)

        lock.acquire_lock()
        print "P: Lock ACK"
        product.append(pr_val)
        print "Added product {}".format(pr_val)
        lock.release_lock()
        print "P: Release ACK"

        pr_val += 1

        if pr_val > lim:
            break

def consumer(lock, consume_time, waiting_time, lim, *args):

    con_val = 0
    got_product = False

    while True:

        lock.acquire_lock()
        print "C: Lock ACK"

        try:
            con_val = product.pop()
            print "Retrieved value {}".format(con_val)
            got_product = True
        except IndexError:
            print "No produce!"
            got_product = False

        lock.release_lock()
        print "C: Release ACK"

        if got_product:
            print "Consuming.. {}".format(con_val)
            time.sleep(consume_time)
        else:
            print "Waiting for produce"
            time.sleep(waiting_time)

        if con_val == lim:
            break

if __name__=="__main__":

    lock=thread.allocate_lock()
    max_produce = 3
    thread.start_new_thread(producer,(lock, 1, max_produce))
    thread.start_new_thread(consumer,(lock, 2, 1, max_produce))

    # Required for commandline output
    while 1:
        pass

 

The above code creates a lock to be used by the consumer and producer to acquire the produce-line. Then we define the units to be produced. When the threads are started, the consumer and producer are also told the consume-time and produce-time as arguments (these values are implemented in the program using the time.sleep() function).

The consumer thread starts by acquiring the lock on the produce-line and trying to take a product from it. If there is no produce yet, it prints a message; otherwise it picks up the produce. Either way, it then releases the lock. The consumer then consumes the produce (or waits a little) and goes back to the start of the loop.

The producer thread starts by producing an item. Then it acquires the lock on the produce-line, adds the produce to it, releases the lock and starts producing the next item.
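The same steps can be sketched with threading.Lock used as a context manager, which releases the lock even if the body raises. This is an illustration under assumptions, not a port of the script above: it targets Python 3, takes items from the front of the list, and the consumed list is a hypothetical addition to record what was taken.

```python
import threading
import time

product = []             # shared "produce-line"
consumed = []            # hypothetical: records what the consumer took
lock = threading.Lock()

def producer(count, produce_time):
    for value in range(count):
        time.sleep(produce_time)      # simulate the work of producing
        with lock:                    # acquired here, released on leaving the block
            product.append(value)

def consumer(count, waiting_time):
    while len(consumed) < count:
        with lock:
            value = product.pop(0) if product else None
        if value is None:
            time.sleep(waiting_time)  # nothing produced yet, wait and retry
        else:
            consumed.append(value)

threads = [
    threading.Thread(target=producer, args=(4, 0.01)),
    threading.Thread(target=consumer, args=(4, 0.005)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(consumed)
```

Joining both threads again avoids the endless loop at the end of the main program.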

What I have not covered: threads can also be created and defined using classes. I could not cover that in this post. Good resources on it are available from IBM and devshed.
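For a taste of the class-based style, here is a minimal sketch assuming Python 3 and the threading module. SquareThread is a made-up example class; the only real requirement is to subclass threading.Thread and override run().

```python
import threading

class SquareThread(threading.Thread):
    """Hypothetical example class: squares a number in its own thread."""

    def __init__(self, number):
        super().__init__()
        self.number = number
        self.result = None

    def run(self):
        # run() is what start() executes in the new thread
        self.result = self.number ** 2

workers = [SquareThread(n) for n in range(5)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print([w.result for w in workers])
```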

Github repository for it : https://github.com/ayushgoel/PythonScripts/tree/master/learning_threading

Also, Python (CPython, actually) is known to be not very good at threading because of the GIL (Global Interpreter Lock), which lets only one thread execute Python bytecode at a time. Google for more information. 😛

Python script to bring all files from subfolders to main folder

January 3, 2012

A recurring chore for me is getting all the photos I transfer from my phone into one folder. The phone transfer creates a subfolder for each date it has a pic for.

This was the case until I discovered the module shutil. (No, I had known about it for around 4 months, but I was too lazy to actually write the code 😛 )

This code I have written is meant to be cross-platform. In case of any discrepancies, please do tell.

import shutil
import os

# copy all the files in the subfolders to main folder

# The current working directory
dest_dir = os.getcwd()
# The generator that walks over the folder tree
walker = os.walk(dest_dir)

# the first walk would be the same main directory
# which if processed, is
# redundant
# and raises shutil.Error
# as the file already exists

rem_dirs = walker.next()[1]

for data in walker:
    for files in data[2]:
        try:
            shutil.move(data[0] + os.sep + files, dest_dir)
        except shutil.Error:
            # still to be on the safe side
            continue

# clearing the directories
# from which we have just removed the files
for dirs in rem_dirs:
    shutil.rmtree(dest_dir + os.sep + dirs)


 

Since the code is all documented, I will skip the explanation here.
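If you would rather try the idea on a throwaway folder first, the same logic can be wrapped in a function. This is only a sketch assuming Python 3; flatten and the demo file names are hypothetical, and the temporary directory stands in for the real photo folder.

```python
import os
import shutil
import tempfile

def flatten(dest_dir):
    """Move every file found below dest_dir into dest_dir, then delete the subfolders."""
    walker = os.walk(dest_dir)
    top_level_dirs = list(next(walker)[1])   # skip the main folder itself
    moved = []
    for folder, _subdirs, files in walker:
        for name in files:
            try:
                shutil.move(os.path.join(folder, name), dest_dir)
                moved.append(name)
            except shutil.Error:
                continue                      # a file of that name already exists
    for name in top_level_dirs:
        shutil.rmtree(os.path.join(dest_dir, name))
    return sorted(moved)

# Try it on a throwaway directory rather than real photos.
demo_dir = tempfile.mkdtemp()
for day in ("2012-01-01", "2012-01-02"):
    os.mkdir(os.path.join(demo_dir, day))
    with open(os.path.join(demo_dir, day, day + ".jpg"), "w") as handle:
        handle.write("fake image")
moved_files = flatten(demo_dir)
remaining = sorted(os.listdir(demo_dir))
shutil.rmtree(demo_dir)
```

One thing to keep in mind: a file that is skipped because of a name clash is deleted together with its subfolder, so rename duplicates before running this on anything you care about.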

Please comment, share or like the post. (Good for my writing spirit 🙂 )

Some tricks of Python

December 31, 2011

Zen of Python

First of all, see the Zen of Python 😉

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

List Comprehensions

These are one of the smartest things to happen in Python. They build up a list for you pretty simply and in ONE line.

>>> x
[]
>>> for i in range(20):
    x.append(10 * i) ## append elements to list x

>>> print x ## use the list
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190]

Now this can be done (pretty easily) by:


>>> print [10 * i for i in range(20)]
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190]

A better example would be


>>> [i**2 for i in xrange(10) if i%2 == 0]
[0, 4, 16, 36, 64]
## this gives a list containing squares of all even numbers in range of [0, 10)
## means including 0 and not including 10

>>> [ord(i) for i in raw_input()]
1wed342k
[49, 119, 101, 100, 51, 52, 50, 107]

Now, a more memory-efficient solution to the above exists with generators:


>>> for j in (i**2 for i in xrange(10) if i%2 == 0):
    print j

0
4
16
36
64

Generator Expression

A generator expression does not calculate all the values at once (as a list comprehension does). It calculates each value as and when it is required.
A generator expression looks very similar to a list comprehension (just a change in brackets, [] -> () ).
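To watch the laziness happen, here is a small sketch (Python 3 assumed). noisy_square is a hypothetical helper that records every call in a list, so you can see when the work is actually done.

```python
def noisy_square(i, log):
    """Square i and record that the computation happened (log is a plain list)."""
    log.append(i)
    return i ** 2

list_log, gen_log = [], []

squares_list = [noisy_square(i, list_log) for i in range(5)]   # runs immediately
squares_gen = (noisy_square(i, gen_log) for i in range(5))     # nothing runs yet

print(list_log)   # all five inputs already processed
print(gen_log)    # still empty
first = next(squares_gen)   # only now is the first square computed
print(first, gen_log)
```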

dict and set comprehensions

Python 2.7 also provides dict and set comprehensions:


>>> {i: chr(i) for i in range(48, 58)}  ## a dict formed
{48: '0', 49: '1', 50: '2', 51: '3', 52: '4', 53: '5', 54: '6', 55: '7', 56: '8', 57: '9'}
>>> {chr(i) for i in range(48, 58)}     ## a set formed
set(['1', '0', '3', '2', '5', '4', '7', '6', '9', '8'])

P.S. the above code snippets have been pasted from IDLE python shell. If similar statements are used in scripts the results are not printed.

zip function:


>>> print zip.__doc__
zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]
Return a list of tuples, where each tuple contains the i-th element
 from each of the argument sequences. The returned list is truncated
 in length to the length of the shortest argument sequence.

The above function can be used to transpose a matrix!


>>> a = [[1, 2], [4, 3], [5, 6]]
>>> zip(*a)
[(1, 4, 5), (2, 3, 6)]

Splat Operator

The (*a) means unpack the list/tuple named a, i.e. this operator is valid on both tuples and lists.
It can be used only in function call arguments.
This * is also called the splat operator 😀 .
Dictionaries respond differently and have got an extra operator (**):


>>> a = {'a': 1, 'b': 2, 'c': 3}
>>> def splat(*args):
    print args

>>> splat(a)
({'a': 1, 'c': 3, 'b': 2},)
>>> splat(*a)
('a', 'c', 'b')
>>> splat(**a)
Traceback (most recent call last):
File "<pyshell#75>", line 1, in <module>
splat(**a)
TypeError: splat() got an unexpected keyword argument 'a'

>>> def splat(a, b, c):
    print a, b, c

>>> splat(a)
Traceback (most recent call last):
File "<pyshell#79>", line 1, in <module>
splat(a)
TypeError: splat() takes exactly 3 arguments (1 given)

>>> splat(*a)
a c b
>>> splat(**a)
1 2 3
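The session above can be condensed into a short sketch. It assumes Python 3.7 or later, where dicts keep insertion order (so the keys come out as a, b, c) and zip() returns an iterator that needs list() around it; show is a hypothetical function.

```python
def show(a, b, c):
    """Return the three arguments as a tuple so the unpacking is easy to inspect."""
    return (a, b, c)

values = [1, 2, 3]
mapping = {'a': 1, 'b': 2, 'c': 3}

from_list = show(*values)      # list items become positional arguments
from_keys = show(*mapping)     # a single * on a dict unpacks its keys
from_items = show(**mapping)   # ** unpacks keys as names and values as arguments

matrix = [[1, 2], [4, 3], [5, 6]]
transposed = list(zip(*matrix))  # each row becomes one argument to zip
print(from_list, from_keys, from_items, transposed)
```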

OK, this is something I had been getting wrong for a long time.
When you have a list named, say, my_list and you need the index while looping over it, you usually do


>>> my_list = [1, 4, 5, 2, 3, 6]
>>> for i in range(len(my_list)):
    print i, my_list[i]

0 1
1 4
2 5
3 2
4 3
5 6

Now, this is something they call “unpythonic” (though I like it 😛 , even if it's dirty, at least it works 😉 ).

Enumerate

But there is a better way to it:


>>> for index, val in enumerate(my_list):
    print index, val

0 1
1 4
2 5
3 2
4 3
5 6

(Ok, now I also agree, it was “unpythonic” 😀 )

Reversing

Now, remember one thing: every sliceable sequence (lists and strings, to be precise) can be reversed simply by


>>> a
[1, 4, 5, 2, 3, 6]
>>> a[::-1]
[6, 3, 2, 5, 4, 1]
>>> b = "this will be reversed"
>>> b[::-1]
'desrever eb lliw siht'

Though reversed(sequence) is quicker to call, since it returns a lazy iterator instead of building a reversed copy 😛


>>> import timeit
>>> a = timeit.Timer('a[::-1]', 'a = [2, 4, 1, 6]')
>>> a.timeit()
0.7627712499545058
>>> a = timeit.Timer('reversed(a)', 'a = [2, 4, 1, 6]')
>>> a.timeit()
0.5815445977713054
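One caveat about the timing above: reversed(a) only creates an iterator, while a[::-1] builds a whole new list, so the two statements do different amounts of work. A fairer comparison, sketched here for Python 3 with an arbitrarily chosen list size, also times list(reversed(a)). The numbers depend on the machine, so none are quoted.

```python
import timeit

setup = "a = list(range(1000))"
slice_time = timeit.timeit("a[::-1]", setup=setup, number=1000)
lazy_time = timeit.timeit("reversed(a)", setup=setup, number=1000)
full_time = timeit.timeit("list(reversed(a))", setup=setup, number=1000)
print(slice_time, lazy_time, full_time)   # timings vary by machine

a = [2, 4, 1, 6]
same_result = list(reversed(a)) == a[::-1]   # both give the same reversed list
```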

itertools module

This is my favorite module, mostly for testing purposes.
These are the functions I use most to generate tests. You can find the other functions in the module itself.


>>> import itertools as it
>>> print list(it.combinations('asd', 2))         ## all the combinations
[('a', 's'), ('a', 'd'), ('s', 'd')]
>>> print list(it.combinations_with_replacement('asd', 2)) ## combinations + repeated values
[('a', 'a'), ('a', 's'), ('a', 'd'), ('s', 's'), ('s', 'd'), ('d', 'd')]
>>> print list(it.permutations('asd', 2))         ## permutations
[('a', 's'), ('a', 'd'), ('s', 'a'), ('s', 'd'), ('d', 'a'), ('d', 's')]
>>> print list(it.product('asd', 'def'))          ## product of sequences
[('a', 'd'), ('a', 'e'), ('a', 'f'), ('s', 'd'), ('s', 'e'), ('s', 'f'), ('d', 'd'), ('d', 'e'), ('d', 'f')]

P.S. The above functions actually return an iterator (the one with a next() function), but to show the use, I have put out all the values in the form of a list.

 

This was one of my best finds on internet:


>>> import antigravity

Just give it a try 😉

 

And at last, something to laugh.. 😀


>>> from __future__ import braces
SyntaxError: not a chance (<pyshell#106>, line 1)

 

Do remember to comment, like or share the post. 🙂

The new dropbox API

October 22, 2011

Just going through the feeds I found out that Dropbox had given out new APIs. Luckily, they had better support for python this time.

Going along, I have written a script (or an introduction you can say) to these APIs in Python.

I have tried to document the code as much as possible.

Do remember to install oauth, setuptools and simplejson, as these are not included in a default Python 2.7 installation.


## author: Ayush Goel
## Python 2.7 used
## get the Dropbox new API from https://www.dropbox.com/developers/
## mail: ayushgoel111@gmail.com

## do remember to install oauth, setuptools, simplejson
## not included by default in Python 2.7 installation
## http://pypi.python.org/pypi/setuptools
## http://pypi.python.org/pypi/oauth/1.0.1
## http://pypi.python.org/pypi/simplejson/

# Include the Dropbox SDK libraries
from dropbox import client, rest, session

# Get your app key and secret from the Dropbox developer website
APP_KEY =    'INSERT_APP_KEY_HERE'
APP_SECRET =  'INSERT_SECRET_HERE'

# ACCESS_TYPE should be 'dropbox' or 'app_folder' as configured for your app
ACCESS_TYPE = "app_folder" #'INSERT_ACCESS_TYPE_HERE'
sess = session.DropboxSession(APP_KEY, APP_SECRET, ACCESS_TYPE)

request_token = sess.obtain_request_token()

while True:
  url = sess.build_authorize_url(request_token)
  print "url:", url
  import webbrowser
  webbrowser.open_new_tab(url)
  print "You have been redirected to the authorization page."
  print "Please do the authorization within 5 minutes else the URL would expire."
  print "Press ENTER here once you are done. To create the url again, enter any character"
  #print "Please visit this website and press the 'Allow' button, then hit 'Enter' here."
  s=raw_input()
  if s=='':
    break

# This will fail if the user didn't visit the above URL and hit 'Allow'
access_token = sess.obtain_access_token(request_token)

client = client.DropboxClient(sess)
print "linked account:", client.account_info()

## The client object is what is required for the whole
## app buildup by anyone

## anyways, lets have a look at some things of interest of
## all those tokens we just saw

## the url we produced above
print  url
#'https://www.dropbox.com/1/oauth/authorize?oauth_token=n8yyjthf92hv1g5'

print sess.API_CONTENT_HOST
# 'api-content.dropbox.com'
print sess.API_HOST
# 'api.dropbox.com'
print sess.API_VERSION
# 1
print  sess.WEB_HOST
# 'www.dropbox.com'
print  sess.is_linked()
# True
print  sess.locale
# nothing None
print  sess.root
# 'sandbox'
print  sess.signature_method.get_name()
# 'PLAINTEXT'

## Very important to NOTICE: every token we generated
## has two unique identifiers (key, secret)

print  request_token.key
# 'n8yyjthdgff92hvasafd1g5'
print  request_token.secret
# 'qu1dfozfgafeg1hmwrijwum'
print  request_token.verifier
# None
print  access_token.key
# 'd2rdsfjjfgzgd8hwc3j9kiiof'
print  access_token.secret
# 'wfsdfjwglhho0odek2jqby44'

print  access_token.callback_confirmed
# None
print  access_token.get_callback_url()
# None

## file_create_folder() creates a folder in the app folder given to you.
## The folder name is passed as an argument
## a dict is returned giving details of the folder
## errors are raised otherwise (see the documentation)

print client.file_create_folder("folder1")
##        {
##        u'size': u'0 bytes',
##        u'rev': u'104721f34',
##        u'thumb_exists': False,
##        u'bytes': 0,
##        u'modified': u'Fri, 21 Oct 2011 19:48:25 +0000',
##        u'path': u'/folder1',
##        u'is_dir': True,             # memory management
##        u'icon': u'folder',
##        u'root': u'app_folder',      # authorization
##        u'revision': 1
##        }

## client.put_file("C:\Users\Ayush\Desktop\dropify.txt","folder1/")
##
##Traceback (most recent call last):
##  File "<pyshell#31>", line 1, in <module>
##    client.put_file("C:\Users\Ayush\Desktop\dropify.txt","folder1/")
##  File "C:\Users\Ayush\Applications\python_pkg\dropbox-python-sdk-1.2\dropbox-1.2\dropbox\client.py", line 147, in put_file
##    return RESTClient.PUT(url, file_obj, headers)
##  File "C:\Users\Ayush\Applications\python_pkg\dropbox-python-sdk-1.2\dropbox-1.2\dropbox\rest.py", line 142, in PUT
##    return cls.request("PUT", url, body=body, headers=headers, raw_response=raw_response)
##  File "C:\Users\Ayush\Applications\python_pkg\dropbox-python-sdk-1.2\dropbox-1.2\dropbox\rest.py", line 109, in request
##    raise ErrorResponse(r)
##ErrorResponse: [400] {u'path': u"Path 'C:\\Users\\Ayush\\Desktop\\dropify.txt' can't contain \\"}

## correct way of uploading a file
f=open("C:\Users\Ayush\Desktop\dropify.txt")
print client.put_file("folder1/dropify.txt",f)

##        {
##        u'size': u'34 bytes',
##        u'rev': u'204721f34',
##        u'thumb_exists': False,
##        u'bytes': 34,
##        u'modified': u'Fri, 21 Oct 2011 19:57:39 +0000',
##        u'path': u'/folder1 (1)',
##        u'is_dir': False,
##        u'icon': u'page_white',
##        u'root': u'app_folder',
##        u'mime_type': u'application/octet-stream',
##        u'revision': 2
##        }

##read a file from dropbox
## we are actually reading the same file we just
## uploaded. It's a text document
a=client.get_file("folder1/dropify.txt")

print a.fileno()
# 588

## get headers of the file we are reading
## a list of tuples is returned
## difference between tuples and lists will be covered later

print a.getheaders()
##        [
##        ('content-length', '34'),
##        ('accept-ranges', 'bytes'),
##        ('server', 'dbws'),
##        ('connection', 'keep-alive'),
##        ('etag', '2n'),
##        ('pragma', 'public'),
##        ('cache-control', 'max-age=0'),
##        ('date', 'Fri, 21 Oct 2011 19:59:28 GMT'),
##        ('content-type', 'text/plain; charset=ascii')
##        ]

print a.read()
## the contents
## please don't do this on your "big files"
## may slow down or clog your app as memory requirements would go very high

## some additional data about the file
print a.reason
# 'OK'
print a.status
# 200
print  a.strict
# 0
print  a.version
# 11
print  a.chunk_left
# 'UNKNOWN'
print  a.chunked
# 0
print  a.begin()
# None
print  a.read()
# ''

## another list of headers
s=a.getheaders()

# I am actually printing the headers beautifully 🙂
for i in s:
    print "%15s%s%20s"%(i[0]," : ", i[1])

## content-length :                   34
##  accept-ranges :                bytes
##         server :                 dbws
##     connection :           keep-alive
##           etag :                   2n
##         pragma :               public
##  cache-control :            max-age=0
##           date : Fri, 21 Oct 2011 19:59:28 GMT
##   content-type : text/plain; charset=ascii

print  client.metadata('/')
##{
##        u'hash': u'00d3e63a8e91467dddaf18d04b206e57',
##        u'thumb_exists': False,
##        u'bytes': 0,
##        u'path': u'/',
##        u'is_dir': True,
##        u'icon': u'folder',
##        u'root': u'app_folder',
##        u'contents': [
##                {
##                        u'size': u'0 bytes',
##                        u'rev': u'104721f34',
##                        u'thumb_exists': False,
##                        u'bytes': 0,
##                        u'modified': u'Fri, 21 Oct 2011 19:48:25 +0000',
##                        u'path': u'/folder1',
##                        u'is_dir': True,
##                        u'icon': u'folder',
##                        u'root': u'dropbox',
##                        u'revision': 1
##                },
##                {
##                        u'size': u'34 bytes',
##                        u'rev': u'204721f34',
##                        u'thumb_exists': False,
##                        u'bytes': 34,
##                        u'modified': u'Fri, 21 Oct 2011 19:57:39 +0000',
##                        u'path': u'/folder1',
##                        u'is_dir': False,
##                        u'icon': u'page_white',
##                        u'root': u'dropbox',
##                        u'mime_type': u'application/octet-stream',
##                        u'revision': 2
##                }
##        ],
##        u'size': u'0 bytes'
##}

And yes, I have tested this on my machine, so I am sure it is working..

So, put your coding caps on and go get your own keys from dropbox.

And yes, don’t worry about the .key and .secret, they are scrambled and tampered with.. 😉

Retrieving files from URLs

October 20, 2011

This script was written by me a long time back. I documented it a little so that it’s easy to understand what it’s trying to do.

## python 3.x compliant
## author: Ayush Goel

import urllib.request as ur

file_url=input('Enter the file URL you want to be downloaded: ')
file_name=input('Enter the path where you want the file to be saved(/enter): ')

if file_name=='':
    ## if no location provided, we get ourselves a default one

    ## change the location of download as suited for you
    ## this one worked on my Win7 machine
    file_name='C:\\Users\\Ayush\\Downloads\\'+file_url.split('/')[-1]

try:
    ## try to retrieve the file using the URL
    ur.urlretrieve(file_url,filename=file_name)

except ValueError:
    ## urls like : "edoc.ub.uni-muenchen.de/7505/1/Fischer_Johannes.pdf"
    ## headers like http:// https:// are missing
    print("The URL could not be parsed.. please provide the complete url, including the protocol name (http,https..)")

except IOError:
    ## the url given ain't to a file.
    ## It might be a forwarding URL, we would need the actual file url
    ## (URLError is a subclass of IOError in Python 3, so network failures land here too)
    print("We are facing issues with the url you provided")

I have handled a few error cases. If you find any others, comment here or PM me.

Some good python interview questions

June 17, 2011

Some questions I had from somewhere. Since I feel somewhat capable, I will try to answer them here..

Edit: After all I got that “somewhere” 🙂 The writer is a fellow python enthusiast and has given me the permission to use his questions on my blog (though the answers are to be mine.. :). Thank you Ronak..

Edit: Thanks a lot David Lawrence (Endophage) for your eminent input to modify this post.

1. Name five modules that are included in python by default (many people come searching for this, so I included some more examples of modules which are often used)

datetime (used to manipulate dates and times)
re (regular expressions)
urllib, urllib2 (handle many HTTP things)
string (a collection of string constants, for example all lowercase letters)
itertools (permutations, combinations and other useful iterables)
ctypes (from python docs: create and manipulate C data types in Python)
email (from python docs: A package for parsing, handling, and generating email messages)
__future__ (record of incompatible language changes; for example, the division operator behaves differently, and much better, when imported from __future__)
sqlite3 (handles SQLite databases)
unittest (from python docs: Python unit testing framework, based on Erich Gamma’s JUnit and Kent Beck’s Smalltalk testing framework)
xml (xml support)
logging (defines logger classes; lets Python log details by severity level)
os (operating system support)
pickle (similar to json; can write most Python data structures to external files)
subprocess (from docs: This module allows you to spawn processes, connect to their input/output/error pipes, and obtain their return codes)
webbrowser (from docs: Interfaces for launching and remotely controlling Web browsers.)
traceback (Extract, format and print Python stack traces)

2. Name a module that is not included in python by default

mechanize
django
gtk

Many others can be found on PyPI.

3. What is __init__.py used for?

It declares that the given directory is a package. #Python Docs (From Endophage’s comment)

4. What is pass used for?

pass does nothing. It is used as a placeholder where the syntax requires a statement but there is nothing to do yet. For example:
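A quick way to see this in action is to build a tiny package on the fly and import it. This is a sketch assuming Python 3; demo_pkg and GREETING are made-up names. Since Python 3.3 a directory without __init__.py can also be imported as a namespace package, but __init__.py is still where package-level code like the constant below lives.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
package_dir = os.path.join(root, "demo_pkg")      # hypothetical package name
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write("GREETING = 'hello from demo_pkg'\n")

sys.path.insert(0, root)                          # make the new package findable
demo_pkg = importlib.import_module("demo_pkg")    # runs demo_pkg/__init__.py
print(demo_pkg.GREETING)
```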

class abc():
    pass

5. What is a docstring?

A docstring is the documentation string for a function (modules and classes can have one too). It can be accessed by

function_name.__doc__

it is declared as:

def function_name():
    """your docstring"""

Writing documentation for your programs is a good habit and makes the code more understandable and reusable.
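A complete, runnable version of the two snippets above might look like this (Python 3 assumed; area is a hypothetical function used only for illustration):

```python
def area(width, height):
    """Return the area of a width x height rectangle."""
    return width * height

print(area.__doc__)   # prints the docstring exactly as written
```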

6. What is list comprehension?

Creating a list by doing some operation over data that can be accessed using an iterator. For eg:

>>> import string
>>> [ord(i) for i in string.ascii_uppercase]
[65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]
>>>

7. What is map?

map applies the function given as the first argument to every element of the iterable given as the second argument. If the function takes more than one argument, pass one iterable per argument.  #Follow the link to know more similar functions
For example:

>>> a='ayush'
>>> map(ord,a)
[97, 121, 117, 115, 104]
>>> print map(lambda x, y: x*y**2, [1, 2, 3], [2, 4, 1])
[4, 32, 3]
>>> help(map)
Help on built-in function map in module __builtin__:

map(...)
map(function, sequence[, sequence, ...]) -> list

Return a list of the results of applying the function to the items of
the argument sequence(s).  If more than one sequence is given, the
function is called with an argument list consisting of the corresponding
item of each sequence, substituting None for missing values when not all
sequences have the same length.  If the function is None, return a list of
the items of the sequence (or a list of tuples if more than one sequence).

#Python Docs
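For readers on Python 3, the behaviour differs slightly from the Python 2 help text quoted above: map() returns a lazy iterator, so it needs list() around it to see the values, and it stops at the shortest iterable instead of padding with None. A small sketch of the same examples:

```python
codes = list(map(ord, 'ayush'))
products = list(map(lambda x, y: x * y ** 2, [1, 2, 3], [2, 4, 1]))

# With iterables of unequal length, Python 3 stops at the shortest one.
shortest = list(map(lambda x, y: (x, y), [1, 2, 3], ['a']))
print(codes, products, shortest)
```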

8. What is the difference between a tuple and a list?

A tuple is immutable, i.e. it cannot be changed in place; operations on it produce new tuples. A list is mutable, so it can be changed in place.

tuple initialization: a = (2,4,5)
list initialization: a = [2,4,5]

The methods provided by each type are also different. Check them out yourself.
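A short sketch of the difference (Python 3 assumed; the variable names are only for illustration):

```python
numbers_list = [2, 4, 5]
numbers_tuple = (2, 4, 5)

numbers_list[0] = 10          # fine: lists can be changed in place

try:
    numbers_tuple[0] = 10     # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

# "Operating on" a tuple builds a new tuple and leaves the old one untouched.
longer_tuple = numbers_tuple + (6,)
print(numbers_list, numbers_tuple, longer_tuple, tuple_changed)
```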

9. Using various python modules convert the list a to generate the output ‘one, two, three’

a = ['one', 'two', 'three']
Ans:   ", ".join(a)
>>>help(str.join)
Help on method_descriptor:
 join(...)
 S.join(iterable) -> string
 Return a string which is the concatenation of the strings in the
 iterable.  The separator between elements is S.

10. What would the following code yield?

word = 'abcdefghij'
print word[:3] + word[3:]

Ans: ‘abcdefghij’ will be printed.
This is called string slicing. The two slices meet at index 3, so they are ‘abc’ and ‘defghij’. The ‘+’ operator concatenates strings, so joining the two slices gives back ‘abcdefghij’.

11. Optimize these statements as a python programmer.

word = 'word'
print word.__len__()

Ans:

word = 'word'
print len(word)

12. Write a program to print all the contents of a file

Ans.

try:
    with open('filename','r') as f:
        print f.read()
except IOError:
    print "no such file exists"

13. What will be the output of the following code

a = 1
a, b = a+1, a+1
print a
print b

Ans.
2
2

The second line is a simultaneous assignment: the whole right-hand side is evaluated first, so the new value of a is not used when computing b = a+1.

This is why swapping two values is as easy as:

a,b = b,a

😀

14. Given the list below remove the repetition of an element.
All the elements should be unique
words = ['one', 'one', 'two', 'three', 'three', 'two']

Ans:
A bad solution would be to iterate over the list, check for copies somehow and then remove them!

One of the best solutions I can think of right now:

a = [1,2,2,3]
list(set(a))

set is another type available in Python, in which duplicates are not allowed. It also provides useful set operations (like union and difference). Note that converting to a set does not preserve the original order of the elements.
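When the order of the words matters, list(set(...)) is not enough, because a set does not remember the order in which items were added. One possible order-preserving variant is sketched below (Python 3 assumed; unique_in_order is a hypothetical helper name):

```python
def unique_in_order(items):
    """Return the items with duplicates dropped, keeping first occurrences in order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:      # set membership tests are fast
            seen.add(item)
            result.append(item)
    return result

words = ['one', 'one', 'two', 'three', 'three', 'two']
print(unique_in_order(words))
```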

15. Iterate over a list of words and use a dictionary to keep track of the frequency(count) of each word. for example

{'one':2, 'two':2, 'three':2}

Ans:

>>> def dic(words):
  a = {}
  for i in words:
    try:
      a[i] += 1
    except KeyError: ## the famous pythonic way:
      a[i] = 1       ## Halt and catch fire
  return a

>>> a='1,3,2,4,5,3,2,1,4,3,2'.split(',')
>>> a
['1', '3', '2', '4', '5', '3', '2', '1', '4', '3', '2']
>>> dic(a)
{'1': 2, '3': 3, '2': 3, '5': 1, '4': 2}

Without using a try-except block:

>>> def dic(words):
  data = {}
  for i in words:
    data[i] = data.get(i, 0) + 1
  return data

>>> a
['1', '3', '2', '4', '5', '3', '2', '1', '4', '3', '2']
>>> dic(a)
{'1': 2, '3': 3, '2': 3, '5': 1, '4': 2}

PS: collections.defaultdict can do the same job. Much of the collections module is written in Python, but defaultdict itself (like the normal dict) is implemented in C in CPython, so do not assume either one is faster; use the timeit module to compare the two.
David and I have saved you the work of checking it: see the files on github. Change the data file to test different data.
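For comparison, this is roughly what the same count looks like with the collections module (a Python 3 sketch; which variant is faster is something to measure with timeit on your own data, not something this snippet shows):

```python
from collections import Counter, defaultdict

words = '1,3,2,4,5,3,2,1,4,3,2'.split(',')

counts_default = defaultdict(int)   # missing keys start at int() == 0
for word in words:
    counts_default[word] += 1

counts_counter = Counter(words)     # does the whole loop in one call

print(dict(counts_default))
print(counts_counter.most_common(1))
```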

16. Write the following logic in Python:
If a list of words is empty, then let the user know it’s empty, otherwise let the user know it’s not empty.

Ans.

Can be checked by a single statement (pythonic beauty):

print "The list is empty" if len(a)==0 else "The list is not empty"

>>> a=''
>>> print "'The list is empty'" if len(a)==0 else "'The list is not empty'"
'The list is empty'
>>> a='asd'
>>> print "'The list is empty'" if len(a)==0 else "'The list is not empty'"
'The list is not empty'
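An arguably more idiomatic check relies on the fact that empty containers are falsy, so len() is not needed at all. A minimal sketch, assuming Python 3; describe is a hypothetical helper:

```python
def describe(words):
    """Return a message saying whether the list of words is empty."""
    return "The list is empty" if not words else "The list is not empty"

print(describe([]))
print(describe(['one', 'two']))
```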

17. Demonstrate the use of exception handling in python.

Ans.

try:
  import mechanize as me
except ImportError:
  import urllib as me

## here you have at least 1 module imported as me.
This is used to check whether the user's computer has the third-party libraries we need. If not, we fall back to a library from Python's default installation. Quite useful when updating software.
PS: This is just one of the uses of try-except blocks. You can see good uses of them in APIs.
Also note that if we do not name the exception to be matched, the except block catches any error raised in the try block.
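Beyond try and except, Python also has else (runs only when no exception was raised) and finally (runs in every case). The sketch below assumes Python 3; safe_divide is a hypothetical function that records which clauses ran.

```python
def safe_divide(a, b):
    """Return a / b, or None when b is zero; also report which clauses ran."""
    steps = []
    try:
        result = a / b
    except ZeroDivisionError:
        steps.append("except")     # only reached when the division fails
        result = None
    else:
        steps.append("else")       # only reached when nothing was raised
    finally:
        steps.append("finally")    # always reached
    return result, steps

print(safe_divide(6, 3))
print(safe_divide(1, 0))
```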

18. Print the length of each line in the file ‘file.txt’ not including any whitespaces at the end of the lines.

with open("file.txt", "r") as f1:
  for line in f1:
    print len(line.rstrip())

rstrip() is a string method which strips whitespace characters (spaces, tabs, newlines) from the right end of the string.

19. Print the sum of digits of numbers starting from 1 to 100 (inclusive of both)

Ans.

print sum(range(1,101))

range() returns a list to the sum function containing all the numbers from 1 to 100. Please see that the range function does not include the end given (101 here).

print sum(xrange(1, 101))

xrange() returns a lazy xrange object rather than a list: the numbers are produced on demand, which is lighter on memory.
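The question's wording can also be read as asking for the sum of the digits of those numbers, which is a different result. A short sketch covering both readings, written for Python 3, where range() is already lazy and xrange() no longer exists:

```python
# Sum of the numbers 1..100
total = sum(range(1, 101))

# Sum of the individual digits of the numbers 1..100
digit_total = sum(int(ch) for n in range(1, 101) for ch in str(n))

print(total)        # 5050
print(digit_total)  # 901
```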

20. Create a new list that converts the following list of number strings to a list of numbers.

num_strings = ['1','21','53','84','50','66','7','38','9']

Ans.
use a list comprehension

 >>> [int(i) for i in num_strings]
[1, 21, 53, 84, 50, 66, 7, 38, 9]

#num_strings should not contain any non-integer string, else a ValueError would be raised. A try-except block can be used to notify the user of this.

Another one suggested by David using map:

>>> map(int, num_strings)
[1, 21, 53, 84, 50, 66, 7, 38, 9]
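One possible way to handle bad input is to catch the ValueError per item and report what was rejected. The helper to_ints and the sample input are assumptions for illustration only:

```python
def to_ints(strings):
    numbers, rejected = [], []
    for s in strings:
        try:
            numbers.append(int(s))
        except ValueError:
            # Keep the offending string so the user can be told about it
            rejected.append(s)
    return numbers, rejected

numbers, rejected = to_ints(['1', '21', 'x7', '38'])
print(numbers)   # [1, 21, 38]
print(rejected)  # ['x7']
```

Note that on Python 3 map() returns an iterator, so wrap it as list(map(int, num_strings)) to get a list.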

21. Create two new lists one with odd numbers and other with even numbers
num_strings = [1,21,53,84,50,66,7,38,9]

Ans:

>>> odd=[]
>>> even=[]
>>> for i in num_strings:
...     even.append(i) if i%2==0 else odd.append(i)

## all odd numbers in list odd
## all even numbers in list even

Though if only one of the lists were required, a list comprehension would do:

even = [i for i in num_strings if i%2==0]
odd = [i for i in num_strings if i%2==1]

But using this approach when both lists are required would not be efficient, since it iterates over the list twice!
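A single-pass version can be wrapped in a small function. This is only a sketch and the name split_odd_even is my own:

```python
def split_odd_even(numbers):
    odd, even = [], []
    for n in numbers:
        # Pick the target list first, then append once
        (even if n % 2 == 0 else odd).append(n)
    return odd, even

odd, even = split_odd_even([1, 21, 53, 84, 50, 66, 7, 38, 9])
print(odd)   # [1, 21, 53, 7, 9]
print(even)  # [84, 50, 66, 38]
```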

22. Write a program to sort the following integers in list

nums = [1,5,2,10,3,45,23,1,4,7,9]

nums.sort() # lists have a built-in method, sort(), which sorts in place and returns None
sorted(nums) # sorted() is a built-in function which returns a new sorted list

Python uses TimSort for applying this function. Check the link to know more.
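The difference between the two calls is easy to trip over, so here is a quick sketch of it, written for Python 3 but behaving the same on Python 2:

```python
nums = [1, 5, 2, 10, 3, 45, 23, 1, 4, 7, 9]

ordered = sorted(nums)                   # new list, nums is untouched
descending = sorted(nums, reverse=True)  # same, largest first
result = nums.sort()                     # sorts nums in place, returns None

print(ordered)
print(descending)
print(result)
```

Because sort() returns None, writing nums = nums.sort() throws the list away.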

23. Write a for loop that prints all elements of a list and their position in the list.
Printing using String formatting

(Thanks endophage for correcting this)

>>> for index, data in enumerate(asd):
...     print "{0} -> {1}".format(index, data)

0 -> 4
1 -> 7
2 -> 3
3 -> 2
4 -> 5
5 -> 9

#OR

>>> asd = [4,7,3,2,5,9]

>>> for i in range(len(asd)):
...     print i+1,'-->',asd[i]

1 --> 4
2 --> 7
3 --> 3
4 --> 2
5 --> 5
6 --> 9

24. The following code is supposed to remove numbers less than 5 from list n, but there is a bug. Fix the bug.

n = [1,2,5,10,3,100,9,24]

for e in n:
  if e<5:
    n.remove(e)
  print n

## removing e while iterating shifts the remaining elements left, so the element right after each removed one is skipped (2 survives here). Instead it should be:

a=[]
for e in n:
  if e >= 5:
    a.append(e)
n = a

OR again a list comprehension: 😉

n = [i for i in n if i >= 5]

OR use filter

n = filter(lambda x: x >= 5, n)
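To see the bug in action, the sketch below runs the faulty loop on a copy of the list and compares it with the filtered version. The variable names buggy and fixed are mine:

```python
n = [1, 2, 5, 10, 3, 100, 9, 24]

buggy = list(n)          # work on a copy so n stays intact
for e in buggy:
    if e < 5:
        buggy.remove(e)  # shifts later items left; the next one is skipped

fixed = [e for e in n if e >= 5]

print(buggy)  # [2, 5, 10, 100, 9, 24]
print(fixed)  # [5, 10, 100, 9, 24]
```

The 2 survives in the buggy result because it slid into the slot the loop had just finished with.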

25. What will be the output of the following

def func(x,*y,**z):
    print z

func(1,2,3)

Ans.

Here the output is :

{}  #Empty Dictionary

x is a normal positional parameter, so it takes 1.
y collects the remaining positional arguments as a tuple, so it takes (2, 3).
z collects keyword (named) arguments, so it does not receive anything here.
Thus the given answer.
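Returning all three parameters makes it easier to see where each argument lands. A small sketch, with the function changed to return instead of print so that it runs on Python 3 as well:

```python
def func(x, *y, **z):
    # x: first positional, y: tuple of the rest, z: dict of keyword arguments
    return x, y, z

print(func(1, 2, 3))    # (1, (2, 3), {})
print(func(1, 2, a=3))  # (1, (2,), {'a': 3})
```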

26. Write a program to swap two numbers.

a = 5
b = 9

As I mentioned earlier too, just use tuple unpacking:
a,b = b,a

27. What will be the output of the following code

class C(object):
    def __init__(self):
        self.x = 1

c=C()
print c.x
print c.x
print c.x
print c.x

Ans.

All the outputs will be 1, since the value of the object’s attribute (x) is never changed.

1
1
1
1

x is an instance attribute set in __init__, and attributes in Python are public by default.
Thus it can be accessed directly.

28. What is wrong with the code

func([1,2,3]) # explicitly passing in a list
func()        # using a default empty list

def func(n = []):
#do something with n

print n

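Ans. This would result in a NameError. As written, func is called before it is defined, so the very first line already fails. Even with the definition moved to the top, the variable n is local to the function func and can’t be accessed outside it, so printing it won’t be possible.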

Edit: An extra point for interviews given by Shane Green and Peter: “””Another thing is that mutable types should never be used as default parameter values. Default parameter value expressions are only evaluated once, meaning every invocation of that method shares the same default value. If one invocation that ends up using the default value modifies that value–a list, in this case–it will forever be modified for all future invocations. So default parameter values should limited to primitives, strings, and tuples; no lists, dictionaries, or complex object instances.”””
Reference: Default argument values
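The shared default is easiest to believe once you see it happen. In this sketch the names append_bad and append_good are made up for illustration; the None default is the usual workaround:

```python
def append_bad(item, bucket=[]):
    # The same list object is reused on every call without a bucket
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # Create a fresh list per call when none is supplied
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = list(append_bad(1))   # snapshot after the first call
second = list(append_bad(2))  # snapshot after the second call

print(first)            # [1]
print(second)           # [1, 2]
print(append_good(1))   # [1]
print(append_good(2))   # [2]
```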

29. What all options will work?

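a.
n = 1
print n++   ## SyntaxError: no such operator in python (++)

b.
n = 1
print ++n   ## runs, but prints 1: ++n is just two unary pluses, nothing is incremented

c.
n = 1
print n += 1  ## SyntaxError: += is a statement and can not be used inside print

d.
int n = 1
print n = n+1 ## SyntaxError: no type declarations in python, and assignment can not be done in print command like this

e.
n =1
n = n+1      ## will work

So only e actually increments n; b runs but leaves n unchanged.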

30. In Python function parameters are passed by value or by reference?

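Ans. Neither, strictly speaking. I first answered "by value" (check if you want to, I also did the same 😉), but it is somewhat more complicated than that (thanks David for pointing it out). Python passes arguments by assignment: the function receives a reference to the same object the caller has. Rebinding the parameter inside the function does not affect the caller, but mutating a mutable object such as a list does. Explaining it all here won’t be possible. Some good links that would really help you understand how things work: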

Stackoverflow

Python memory management

Viewing the memory
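A small experiment shows both sides of this behaviour. The function names rebind and mutate are only for illustration:

```python
def rebind(items):
    # Points the local name at a new list; the caller's list is untouched
    items = [99]

def mutate(items):
    # Changes the very object the caller passed in
    items.append(99)

data = [1, 2]

rebind(data)
after_rebind = list(data)

mutate(data)
after_mutate = list(data)

print(after_rebind)  # [1, 2]
print(after_mutate)  # [1, 2, 99]
```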

31.Remove the whitespaces from the string.

s = 'aaa bbb ccc ddd eee'

Ans.

''.join(s.split())
## join without spaces the string after splitting it

OR

filter(lambda x: x != ' ', s)

32. What does the below mean?

s = a + '[' + b + ':' + c + ']'

It looks like a string is being built by concatenation, for example 'x[1:2]' when a, b and c are 'x', '1' and '2'. Nothing much more can be said without knowing the types of the variables a, b, c. Also, if any of a, b, c is not a string, a TypeError would be raised. This is because of the string constants ('[', ':', ']') used in the statement.

33. Optimize the below code

def append_s(words):
  new_words=[]
  for word in words:
    new_words.append(word + 's')
  return new_words

for word in append_s(['a','b','c']):
  print word

The above code adds a trailing s after each element of the list.

def append_s(words):
  return [i+'s' for i in words] ## another list comprehension 😀

for word in append_s(['a','b','c']):
  print word

34. If given the first and last names of bunch of employees how would you store it and what datatype?

Ans. Best stored in a list of dictionaries.
dictionary format:  {'first_name':'Ayush','last_name':'Goel'}
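A short sketch of how such a list could be used. The second employee is a made-up entry, and sorting by last name is just one example of what the structure makes easy:

```python
employees = [
    {'first_name': 'Ayush', 'last_name': 'Goel'},
    {'first_name': 'Jane', 'last_name': 'Doe'},  # hypothetical entry
]

# Sort by last name and build display names
by_last_name = sorted(employees, key=lambda e: e['last_name'])
full_names = ["{} {}".format(e['first_name'], e['last_name']) for e in by_last_name]

print(full_names)  # ['Jane Doe', 'Ayush Goel']
```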

Since most of the code here gets messed up, I have created a repo on github named Python(blog) which lists all the required code.

Up-vote/share the post if you liked it. Thanks!

Creating the whole “folder tree” via Python

April 7, 2011

Please read the code only if you know something about Python 3.x. It has been written specifically for Python 3.0.

I created this script long back, when I wanted to add all the music file names I had to a database. It would have been tedious to write them all myself. 😦

But as we know, Python always comes to the rescue for this kind of work.

import os

# names of everything in the current working directory
a = os.listdir(os.getcwd())
x = input('Write to file(1) or print(2)?')
if x == '1':
    # text mode translates '\n' to the platform's line ending
    with open('directory.txt', 'w', encoding='utf-8') as f:
        for i in a:
            f.write(i + '\n')
elif x == '2':
    for i in a:
        print(i)
else:
    print('wrong input... exiting...')

Thus as you can see, the os module can easily get this whole data in seconds. (This work is really easy for Linux users, as they can use a shell command such as ls to get the data and then redirect it to a text file. But I haven’t seen such liberties in Windows 😦 )

The above script is a cool one, but it does only the simplest job. What if we need all the files, including those inside the folders?

import os

def folder(a):
    '''extract likely folders from a list of names
    (heuristic: no dot in the name, so dotted folders are skipped)'''
    return [i for i in a if '.' not in i]

d = {}  # dictionary to store the folder tree: path -> list of entries

def path(pth):
    '''recursively call this func to
    generate the folder tree'''
    d[pth] = os.listdir(pth)
    for i in folder(d[pth]):
        try:
            path(os.path.join(pth, i))
        except OSError:
            # not a directory after all, or not readable: skip it
            continue

def show_dict():
    '''print the dictionary'''
    for i in d:
        print(i)
        for j in d[i]:
            print('  ' + j)

def print_dict():
    '''write the dictionary to a file'''
    with open('directory.txt', 'w', encoding='utf-8') as f:
        for i in d:
            f.write(i + '\n')
            for j in d[i]:
                f.write('  ' + j + '\n')

if __name__ == '__main__':
    path(os.getcwd())
    x = input('Write to file(1) or print(2)?')
    if x == '1':
        print_dict()
    elif x == '2':
        show_dict()
    else:
        print('wrong input... exiting...')

The above program is a really good upgrade of the previous script, as it builds the whole tree of the files in a folder by recursing into the folders inside it.
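For comparison, the standard library's os.walk() does the recursion for you and tells files and folders apart without guessing from the name. This is an alternative sketch, not the script above; the function name build_tree is my own:

```python
import os

def build_tree(root):
    """Map every directory under root to the sorted names it contains."""
    tree = {}
    # os.walk yields (directory path, sub-directory names, file names)
    for dirpath, dirnames, filenames in os.walk(root):
        tree[dirpath] = sorted(dirnames + filenames)
    return tree

# Example usage (walks the current directory, so it may take a while):
# for directory, names in build_tree(os.getcwd()).items():
#     print(directory)
#     for name in names:
#         print('  ' + name)
```

Directories that cannot be read are skipped silently by os.walk() unless you pass an onerror callback.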

