Thursday, October 31, 2013

Exploring Python

Overview

I've been playing with Python recently and have come to appreciate many of its features, some of which I want to share with you, as I did with my development team. This is not a beginner's tutorial; it is an informal set of impressions as I compare Python to some of the languages I am more familiar with.

Python is a free-to-use, general-purpose programming language. It was designed as a sort of middleware language between the shell and the system, so it was initially aimed at DevOps engineers and system administrators, and it's no surprise that it is available out of the box on most Linux distributions. However, it has evolved into much more than that, to the point where entire enterprise-level applications are built on top of it.

Python source is compiled to byte code that is then interpreted by a virtual machine (similar in idea to the Java JVM), and it runs on Linux, Windows, and Mac. There are also alternative implementations that let you mix Python with other platforms, such as Jython (running on the Java JVM) and IronPython (running on the .NET CLR).

Overall, Python is a dynamically-typed language with elegant and concise syntax, a powerful data structure library, garbage collection, and a big community of developers supporting it. It was ranked 8th in the TIOBE index of popular programming languages at the time of writing. It offers the best of both worlds: you can use it for proof-of-concept applications (fast prototyping) or to build entire systems. It also has ample support from PaaS cloud providers such as Heroku and Google App Engine.

How's that for a summary...

Scoping

Python modules are very similar to namespaces in other languages. Like all Python code, a module executes top to bottom the first time it is imported or read in. At that point, the module's functions, classes, and variables become available through the module's namespace.

One thing that distinguishes Python from many other programming languages is that imports are ordinary statements, so you can include modules conditionally by wrapping them in if-else statements; this is typical of scripting languages. In addition, you can choose to read in the entire module or just pieces of it, for instance:

import logging

or

from logging import *

Both read in Python's built-in logging module. The first form binds the name logging, so you reach its contents as logging.Logger and so on; the second copies all of the module's public names into your script. More specifically, I can type:

from logging import Logger

which will bring just the Logger class into the scope of your script.
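
A conditional import can be sketched roughly as follows. The choice of StringIO and json here is only an illustration (they are not part of the discussion above); the point is that the import statements sit inside ordinary if/else and try/except blocks.

```python
import sys

# Imports are ordinary statements, so they can depend on a runtime condition.
if sys.version_info[0] >= 3:
    from io import StringIO          # Python 3 location
else:
    from StringIO import StringIO    # Python 2 location

# A common variant: fall back gracefully when an optional module is missing.
try:
    import json
except ImportError:
    json = None   # callers must check for None before using it

text_buffer = StringIO()
text_buffer.write(u"imported conditionally")
print(text_buffer.getvalue())
```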

Functions also create scope. A function's local scope is created when the function is called and destroyed when the function returns (unless you use generators, more on this soon). Each recursive invocation gets its own local namespace.

Variables assigned at the top level of a script are global to that module: once you assign a value to a variable at the beginning of your script, it is accessible from that moment on, and functions can read it. If a variable is assigned inside a function or class, however, it is local to that function or class. There is a caveat: Python supports the keyword global. Because assigning to a name inside a function would normally create a new local variable, you declare the name with a global statement inside the function before assigning to it. If you do this, the assignment rebinds the globally declared variable instead of defining a new local one. Abusing this keyword, however, can make code really hard to follow, so I would reserve it for edge cases.
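
To make the difference concrete, here is a minimal sketch; the names counter, assign_local, and assign_global are made up for the illustration.

```python
counter = 0  # module-level (global) name

def assign_local():
    # Plain assignment creates a new local name; the global is untouched.
    counter = 100
    return counter

def assign_global():
    global counter   # declare that assignments target the module-level name
    counter += 1
    return counter

assign_local()
print(counter)   # still 0
assign_global()
print(counter)   # now 1
```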

Functions

Functions are objects in Python, very similar to JavaScript in this respect. This is really nice if you like to code in a functional programming style: functions can be passed to other functions, returned from functions, and assigned to variables. As in any other language, a function is a block of code with its own local scope that performs a repeatable task. You can define your own with the keyword def and a name. That name acts as a reference (alias) to the function object.
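
As a small sketch of what "functions are objects" means in practice (shout, apply_twice, and make_adder are hypothetical names used only here):

```python
def shout(text):
    return text.upper() + "!"

def apply_twice(func, value):
    # func is just an object that happens to be callable
    return func(func(value))

def make_adder(n):
    # returns a new function object that remembers n
    def add(x):
        return x + n
    return add

alias = shout                            # bind the same function to another name
print(alias("hello"))                    # HELLO!
print(apply_twice(make_adder(3), 10))    # 16
```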

Additionally, Python supports the concept of an anonymous function, in this case called a lambda. Lambdas don't have a return statement; their body is a single expression whose value is returned. Like functions, they can be assigned and passed around in exactly the same way. I would not use lambdas to replace functions; they serve different purposes. A lambda complements a function well when you need to apply a small, specific operation to a set of values, e.g. compute prime numbers, increment, decrement, etc. Typically, you will see lambdas used in conjunction with Python's built-in functions filter, map, and reduce. For example, the following sieves the multiples of 2 through 7 out of a list of numbers:

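nums = range(2, 100)   # start at 2 so that only primes are left
for i in range(2, 8):
    # keep i and the non-multiples of i; i=i and list() make this
    # behave the same on Python 3, where filter is lazy
    nums = list(filter(lambda x, i=i: x == i or x % i, nums))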
Another interesting feature of Python is the ability to create functions that accept a variable number of positional and keyword arguments. You will find something similar in languages like PHP and JavaScript. In Python, you specify that a function accepts a variable number of arguments with the *args parameter, which binds to a tuple of arguments of any length. In addition, you can accept an arbitrary set of keyword arguments with the **kwargs parameter, which binds to a dictionary. The names "args" and "kwargs" are just conventions; it's the "*" and "**" that tell the language to behave this way. A method will look like this:


def myFunc(self, *args, **kwargs):
    pass  # args is a tuple, kwargs is a dict
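
A short sketch of how the two markers behave, both when defining and when calling a function (describe is a hypothetical function for this illustration):

```python
def describe(first, *args, **kwargs):
    # args collects extra positional arguments into a tuple,
    # kwargs collects extra keyword arguments into a dict.
    return first, args, kwargs

result = describe(1, 2, 3, color="red")
print(result)

# The same markers unpack a sequence or a dict at the call site.
positional = (1, 2, 3)
keywords = {"color": "red"}
same_result = describe(*positional, **keywords)
print(same_result == result)
```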


Decorators

This is one of my favorite language features in Python. What you would need to do in other programming languages to get something like this is complicated and laborious. As the name suggests, decorators allow you to wrap or "decorate" function and method invocations. Let me provide an example of a function tracer:

import functools
import logging

# configure the logger
FORMAT = 'WARN %(message)s'
logging.basicConfig(format=FORMAT)
logger = logging.getLogger('decorators_2')


count = 0  # global definition outside of function

def trace(myfunc):
    '''
    Function tracer
    '''
    @functools.wraps(myfunc)  # keep the wrapped function's name and docstring
    def inner_func(*args, **kwargs):
        global count
        logger.warning('Trace ' + str(count) + ': entering...')
        result = myfunc(*args, **kwargs)   # invokes original function
        logger.warning('Trace: leaving\n')
        count += 1
        return result   # pass the original return value through
    return inner_func

@trace
def some_func_1():
    print('some_func_1')

@trace
def some_func_2():
    print('some_func_2')

def run():
    some_func_1()
    some_func_2()
    some_func_1()

if __name__ == '__main__':
    run()



This snippet shows how I can log when a function is entered and just before it returns. For things such as debugging or tracing this can be very useful.

You can provide multiple levels of decoration and wrapping, although it can actually get pretty hard to trace and debug. Basically, when a decorated function is defined, the Python interpreter calls your decorator with the function object and rebinds the function's name to whatever the decorator returns; from then on, every call goes through the wrapper first. To do something like this in Java, you would probably have to use AspectJ and set up a tracing aspect with all of your pointcuts defined. Then you would have to recompile the code using the AspectJ compiler: definitely something to think about twice before implementing.
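
Stacking can be sketched like this; shout and exclaim are hypothetical decorators invented for the illustration. The decorator closest to the function is applied first, and the @ syntax is shorthand for rebinding the name by hand.

```python
import functools

def shout(func):
    @functools.wraps(func)          # keep the original name and docstring
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper

def exclaim(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs) + "!"
    return wrapper

@shout      # applied second (outermost)
@exclaim    # applied first (innermost)
def greet(name):
    return "hello " + name

# Without the @ syntax, the same wrapping is written by hand:
def greet_plain(name):
    return "hello " + name
greet_plain = shout(exclaim(greet_plain))

print(greet("ana"))         # HELLO ANA!
print(greet_plain("ana"))   # HELLO ANA!
```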

Classes and Inheritance

Python is an object-oriented language, which means it has support for polymorphism, inheritance, encapsulation, and abstraction.

Polymorphism

Since Python is a dynamically typed language, we sometimes take polymorphism for granted. In other words, we don't care as much about type-checking here as we do in other languages. In Python, aside from some reflection work or tests, there is rarely a real need to check for types in production-level code.

You will rely on duck-typing for polymorphic behavior. This mechanism is different from other languages such as Java where polymorphic behavior (called inclusion polymorphism) can only happen with method invocation on classes belonging to the same inheritance path or related by some common super class. Also, you don't have to worry about parametric polymorphism and concepts such as variance, co-variance, and bounded quantification, so understanding this in Python is much easier.


class AlbacotRanger(object):
    def quack(self):
        print("Quack like an Albacot Ranger Duck!")

class AnconaDuck(object):
    def quack(self):
        print("Quack like an Ancona Duck!")

# module-level function: it works with any object that has quack()
def quackAsADuck(typeOfDuck):
    typeOfDuck.quack()

alba = AlbacotRanger()
ancona = AnconaDuck()
quackAsADuck(alba)
quackAsADuck(ancona)

In the example above, you can see that classes AlbacotRanger and AnconaDuck do not share a common super class that defines the method quack( ). In Java, this would not be allowed. In Python, the interpreter looks up the method on the object at runtime, and it only cares that a method quack( ) exists at that moment. Also remember that in Python you can actually add and remove method and variable definitions at runtime, so checking types up front tells you little. Under duck-typing: if it quacks like a duck and acts like a duck, it is a duck.

In addition, if both classes were to belong to a parent class, say Duck, that defines a method quack( ), then you should expect this to work as well, in a similar manner to Java.

Inheritance

Python supports inheritance just like many other general-purpose programming languages do. Its support for multiple inheritance, however, is not limited to extending multiple interfaces; you can actually extend multiple concrete classes, which is pretty insane. This topic can get pretty intense. Multiple inheritance is not a trivial problem to solve; you can think of it as recursive member resolution starting from the child and working its way up the inheritance tree. In other words, a method or member variable is looked up in the derived class and, if not found, recursively in the base classes until reaching the root of all classes, object. For all intents and purposes, think of every method in Python as being virtual.

To keep up with the times, Python 2.2 changed its support for classes in a way that more closely resembles other OO languages out there. These are called "new-style classes," and they are the ones that derive from object. In "old-style" classes, method resolution is done very simply: a depth-first, left-to-right class scan; whereas in new-style classes method resolution is a bit trickier. This was done to account for multiple inheritance and to support cooperative calls to super( ). super( ) is only available in new-style classes.

In new-style classes (since Python 2.3), method resolution is done using the C3 linearization algorithm, also known as the C3 Method Resolution Order (MRO), which was originally proposed for the Dylan language. Being a true multiple inheritance language, Python deals with the inheritance diamond issue by linearizing the class hierarchy into a single search order that preserves the left-to-right ordering of base classes and visits every class before its parents. For extensive details on this, you should read the article on the Python 2.3 method resolution order (see the resources below).
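
A minimal diamond makes the linearized order visible. The class names below are made up for the sketch; the order itself can be inspected through each class's __mro__ attribute.

```python
class Base(object):
    def who(self):
        return "Base"

class Left(Base):
    def who(self):
        return "Left"

class Right(Base):
    def who(self):
        return "Right"

class Child(Left, Right):   # the diamond: both parents share Base
    pass

# Base appears only once, after both of its subclasses.
print([cls.__name__ for cls in Child.__mro__])
# ['Child', 'Left', 'Right', 'Base', 'object']

print(Child().who())   # Left, the first match in the MRO
```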

Behind the scenes, Python stores this order in a tuple called __mro__ on each class, which you are not supposed to mess with unless you know what you are doing. It is recomputed whenever the class's bases change. A call to super( ) returns a proxy object that delegates method calls to the next class in the MRO, which is useful for calling base methods that have been overridden in derived classes. In Python 2, super( ) takes the current class as its first argument and an optional second argument, either an instance or a class, that determines whose MRO is searched. To properly design your classes for cooperative calls to super( ), visit the article called "Python's super( ) considered super!" in the resources section.
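
Here is a rough sketch of cooperative calls in a diamond, using the explicit two-argument form of super( ) so that it runs on both Python 2 and 3. The classes and the log attribute are hypothetical.

```python
class Base(object):
    def __init__(self):
        self.log = ["Base"]

class Left(Base):
    def __init__(self):
        super(Left, self).__init__()    # next class in the MRO, not always Base
        self.log.append("Left")

class Right(Base):
    def __init__(self):
        super(Right, self).__init__()
        self.log.append("Right")

class Child(Left, Right):
    def __init__(self):
        super(Child, self).__init__()
        self.log.append("Child")

# Every __init__ runs exactly once, in reverse MRO order.
print(Child().log)   # ['Base', 'Right', 'Left', 'Child']
```

Note that inside Left, super( ) delegates to Right when the instance is a Child, because Right follows Left in Child's MRO.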

A word of caution: old-style classes are gone in Python 3, so you should never use them in production-level code. Stick to "new-style."


Encapsulation

One of the core principles when designing good APIs is to avoid exposing unnecessary internal state. Unfortunately, the notion of making things "private," as in other languages, does not exist in Python. But because encapsulation is such a common and recommended practice, Python has limited support for it through name mangling.

If you follow the practice of prefixing variable and method names with at least two underscores (and no more than one trailing underscore), then Python will textually replace that identifier with one that includes the class name. For instance, variable __foo will be replaced with _classname__foo; this will kind of "hide" access to that variable and avoid collisions with identifiers defined in subclasses. This happens irrespective of the syntactic position of the identifier, so long as it occurs within the class definition.
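
Name mangling is easy to observe in a small sketch; Account and its __balance attribute are hypothetical names.

```python
class Account(object):
    def __init__(self):
        self.__balance = 10      # actually stored as _Account__balance

    def balance(self):
        return self.__balance    # mangled the same way inside the class

acct = Account()
print(acct.balance())                # 10
print(hasattr(acct, "__balance"))    # False: no attribute with that exact name
print(acct._Account__balance)        # 10: hidden, but not truly private
```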

Another part of encapsulation is the ability to make state read-only. Since there is no "final" property concept in Python, you can implement read-only behavior or copy-on-access by using the built-in property decorator, and the abc module adds an abstract variant of it. The next section expands on this.
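
As a minimal sketch, a read-only attribute can be built with the built-in property decorator by defining a getter and no setter. Circle and radius are made-up names for the illustration.

```python
class Circle(object):
    def __init__(self, radius):
        self._radius = radius    # "private" by convention only

    @property
    def radius(self):
        # getter only: no setter is defined, so assignment fails
        return self._radius

c = Circle(2)
print(c.radius)   # 2

try:
    c.radius = 5
except AttributeError:
    print("radius is read-only")
```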


Abstraction

Abstract classes are implemented significantly differently in Python than in other programming languages. They are not part of the core syntax, but are supported via the abc module, which stands for Abstract Base Classes.

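Classes become abstract when they declare a class attribute __metaclass__ = abc.ABCMeta (in Python 3 the same thing is written as a metaclass=abc.ABCMeta keyword in the class statement). With this, the metaclass takes over creation of the class and adds extra metadata and functionality.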

As you would expect, ABCs can be subclassed directly (by a regular inheritance statement), or you can use the register( ) method that abc.ABCMeta provides to declare unrelated concrete classes as subclasses of your ABC, effectively making them "virtual subclasses" of your ABC, which is pretty unique to Python. If you then perform the issubclass( ) test, it comes out positive.

The difference between using register( ) and normal inheritance is that the ABC will not appear in the Method Resolution Order (MRO) of the registered classes, so they inherit nothing from it and calls to super( ) that refer to a method in your ABC are not possible.
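
The sketch below shows that a registered class passes issubclass( ) but gains nothing from the ABC. It assumes Python 3, where abc.ABC is a convenience base class whose metaclass is ABCMeta; Drawable and Square are hypothetical names.

```python
import abc

class Drawable(abc.ABC):       # Python 3 spelling; its metaclass is ABCMeta
    def describe(self):
        return "drawable"

class Square(object):          # unrelated concrete class
    pass

Drawable.register(Square)      # Square becomes a "virtual subclass"

print(issubclass(Square, Drawable))     # True
print(Drawable in Square.__mro__)       # False: not part of the MRO
print(hasattr(Square(), "describe"))    # False: nothing was inherited
```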

So, in a way, an Abstract Base Class is like a template for the derived class. Typically, we are used to classifying inheritance relationships semantically as IS-A, but that is not necessarily the case here: any class can be registered with your ABC, a "virtual" IS-A.

Examples of abstract classes are present in the collections module and the numbers module. 

Let's take a look at a short example:



from abc import ABCMeta, abstractmethod

class MyIterable(object):
    def __getitem__(self, index):
        pass

    def __len__(self):
        pass

    def get_iterator(self):
        return iter(self)

class BaseIterable(object):
    # Python 2 syntax; Python 3 uses class BaseIterable(metaclass=ABCMeta)
    __metaclass__ = ABCMeta

    @abstractmethod
    def __iter__(self):
        while False:
            yield None

    def get_iterator(self):
        return self.__iter__()

def run():
    # make MyIterable a "virtual subclass" of BaseIterable
    BaseIterable.register(MyIterable)
    print("Is Subclass" if issubclass(MyIterable, BaseIterable) else "Not a subclass")

if __name__ == '__main__':
    run()


In the example above, I registered a MyIterable class with a BaseIterable abstract base class (I like to preserve IS-A). MyIterable does not inherit any functionality from BaseIterable; registration only records the relationship in the ABC, and that is why the issubclass( ) test passes.

Furthermore, ABCs can also declare abstract methods and abstract properties. The @abc.abstractmethod decorator marks a method as abstract, meaning it must be overridden by concrete classes. If your ABC declares at least one abstract method or property, it cannot be instantiated directly. Even though abstract methods may contain implementation code, a class derived from an ABC cannot be instantiated unless all abstract methods and properties have been overridden; otherwise, you will get a TypeError. You can always invoke the base class method by calling super( ). Finally, the @abc.abstractmethod decorator only affects classes that have been derived via regular inheritance; "virtual subclasses" created via register( ) are not affected by it.
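
The rules above can be sketched as follows. This assumes Python 3 syntax (abc.ABC and the zero-argument super( )); Shape, Unfinished, and Square are hypothetical classes.

```python
import abc

class Shape(abc.ABC):
    @abc.abstractmethod
    def area(self):
        return 0   # abstract methods may still carry implementation code

class Unfinished(Shape):
    pass           # does not override area(), so it stays abstract

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        base = super().area()          # the abstract body is still reachable
        return base + self.side ** 2

try:
    Unfinished()
except TypeError:
    print("Unfinished cannot be instantiated")

print(Square(3).area())   # 9
```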

Unlike other programming languages, you can also define abstract properties by using the @abc.abstractproperty decorator. In its long form, it takes as input the functions that define get, set, and delete behavior for a property. As with @abc.abstractmethod, using it requires your class's metaclass to be ABCMeta. Abstract properties are less common than abstract methods, so perhaps an example will provide a better explanation. With this you can easily create read-only properties, as such:


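import abc

class Base(object):
    __metaclass__ = abc.ABCMeta  # Python 2 syntax

    def value_getter(self):
        return 'Should never see this'

    def value_setter(self, newvalue):
        return

    # long form: pass the getter and setter functions explicitly
    value = abc.abstractproperty(value_getter, value_setter)

class Impl(Base):
    def __init__(self):
        self._x = 'concrete value'

    # a plain property with only a getter overrides the abstract one
    @property
    def value(self):
        return self._x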
Using the long form allows you to pass in the getter and setter functions. 

Generators

Generators are a powerful tool for creating iterators. By using the yield keyword, you can produce (and return) data piece by piece as you compute results, thereby generating a new iterator. In this example I am processing a list of names and generating an iterator over the names that start with a given letter:

users = ['Luis', 'Camilo', 'Marta', 'Lucia', 'Natasha', 'Dave', 'John', 'Mitch', 'Lalo']

def findNameStartsWith(users, letter):
    for (index, user) in enumerate(users):
        if user[0] == letter:
            yield (index, user)


def run():
    # Find matching users and process them; reading the module-level
    # list needs no global statement
    matches = findNameStartsWith(users, 'L')

    for (index, name) in matches:
        print('Found: ' + name + ' at position ' + str(index))

Basically, a generator function is any function that uses yield. Calling it returns a generator object without running the body; each yield hands a value back to the caller yet preserves the state of the function at that point, so that execution can resume from there the next time a value is requested.
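
The pause-and-resume behavior can be seen by driving a generator by hand with next( ). This is a small sketch; countdown and the events list are made up for the illustration.

```python
events = []

def countdown(n):
    events.append("started")     # runs only when the first value is requested
    while n > 0:
        yield n                  # pause here and hand n to the caller
        n -= 1                   # resume here on the next request
    events.append("finished")

gen = countdown(3)               # nothing has run yet
print(events)                    # []

first = next(gen)                # runs the body up to the first yield
print(first, events)

rest = list(gen)                 # drains the remaining values
print(rest, events)
```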

Last note

This post summarizes my impressions of using Python as compared to other, more popular programming languages. Even though it was designed to be a middleware language, Python is super powerful and offers many good features that make it a compelling language for the enterprise. It has support for the desktop and the web, where it's more popular. Python on the web can be implemented in multiple ways: you can use an Apache extension like mod_python to run Python within an Apache process; however, the norm nowadays is to use a WSGI-compatible web framework. WSGI (Web Server Gateway Interface) is a standard interface between web servers and Python web applications. Its best-known adopter is the Django framework, which in my opinion is as brilliant for Python as Ruby on Rails is for the Ruby community.
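
To give a feel for what the WSGI interface looks like underneath a framework, here is a bare-bones sketch of an application callable. The names application and fake_start_response and the /demo path are hypothetical; a framework such as Django hides all of this from you.

```python
def application(environ, start_response):
    # environ is a dict of CGI-style request variables supplied by the server
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from " + path + "\n").encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]   # an iterable of byte strings

# Exercise the callable directly, playing the role of the web server.
captured = {}

def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers

chunks = application({"PATH_INFO": "/demo"}, fake_start_response)
print(captured["status"])
print(b"".join(chunks))
```

For local experiments, the standard library's wsgiref.simple_server.make_server can serve a callable like this one.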


Resources

  1. python.org
  2. http://en.wikipedia.org/wiki/Polymorphism_(computer_science)
  3. http://rhettinger.wordpress.com/2011/05/26/super-considered-super/
  4. http://www.youtube.com/watch?v=E_kZDvwofHY
  5. http://www.youtube.com/watch?v=23s9Wc3aWGY
  6. http://tech.blog.aknin.name/2010/04/02/pythons-innards-introduction
  7. https://developers.google.com/appengine/docs/python
  8. http://docs.python.org/2/tutorial/classes.html
  9. http://www.python.org/download/releases/2.3/mro/
  10. http://docs.python.org/2/library/functions.html#super
