thoughtwisps One commit at a time

Hello and welcome to thoughtwisps! This is a personal collection of notes and thoughts on software engineering, machine learning and the technology industry and community. For my professional website, please see race-conditions. Thank you for visiting!

Python Functions : def statement

This is an on-going series of notes on writing functions in Python. Most of these posts are based on Mark Lutz’s excellent book Learning Python (5th Edition)

A function is a single group of statements that makes our Python code more reusable and structured! So there is every reason to learn how to write functions effectively and to know the various ways in which functions can be created in Python.

The def statement

The most common way to create functions in Python is using def. The def statement simply creates the function object and assigns it to the name written after def. For example,

def average(a,b):
    """
    Computes the average of a and b and returns it
    """
    
    return (a+b)/2.0
    
print average(3,4)
>>> 3.5

The name average points to the block of code that computes the average of the two numbers. However, we can do all sorts of things to a function’s name. For example, we can

  • store it in a list
python_functions=[]
python_functions.append(average)

#call the function from the list and print the result
print python_functions[0](6,5)
>>> 5.5
  • assign it to another name and use the name to refer to it in the future
average_of_two_numbers=average
print average_of_two_numbers(9,8)
>>> 8.5

def is executable

… which means that any function you write will not exist until the Python interpreter reads the def statement that first creates the function. We can easily test this by writing two functions and calling the second function from the first function.


def print_geometric_series():
    """
    Prints a geometric series in a nice way
    :return None:
    """
    geometric_series=compute_geometric_series(2,2,10)
    for integer in geometric_series:
        print integer

print_geometric_series()

def compute_geometric_series(initial, quotient, n):
    """
    Returns a geometric series of length n
    :param initial:
    :param quotient:
    :param n:
    :return:
    """
    return [initial*quotient**x for x in range(n)]
    
>>>NameError: global name 'compute_geometric_series' is not defined

As we can see from the NameError that Python throws, the function compute_geometric_series does not exist at the time when print_geometric_series calls it. If you are coming from a Java background, this may seem bizarre, because of course in Java, the order in which the methods appear in the file doesnot matter.

Random Bezier Curves with Bokeh

Recently, I have been exploring Bokeh, a Python library for creating beautiful data visualizations in the vein of d3.js. Not only does the library provide a good API for creating anything from histograms to chloropleths, but it also provides support for interactive graphs with real-time data updates, which is especially fascinating for anyone working with streaming data such as stock prices.

One of my current side-projects involves creating a dynamically updating map of the London Underground and for that I need to create a map-like visualization of the relative positions of the stations. My initial experiments with using only circles to indicate the positions were less than satisfying (see the pseudomap below).

London Tube Map

To make the map more realistic, I need to connect the circles together using either lines or curves. Enter Bokeh’s Bezier curve functionality. I am still in the process of figuring out exactly how to create a beautiful tube map using Bezier curves and circles. My primary obstacle lies in understanding the mathematics behind the construction of the Bezier and how this translates to controlling the curvature. For anyone struggling with similar questions, here is a very good introduction to Bezier curves. For now, I have just been experimenting by creating random Bezier curves using Bokeh. Below is an example of Bezier curves with random start and endpoints.

Random Bezier Curve using Bokeh

The entire Python snippet used to create the image is given below.

#import the requires libraries
from bokeh.plotting import *
import numpy.random as np

#set the output HTML file, which will contain the graph
output_file('test-tube-map.html')

#generate random start, end and control points
start_point_x=list(np.random_sample(10))
start_point_y=list(np.random_sample(10))
end_point_x=list(np.random_sample(10))
end_point_y=list(np.random_sample(10))

first_control_point_x=[(a+b)/2 for a in start_point_x for b in end_point_x]
first_control_point_y=[(a+b)/2 for a in start_point_y for b in end_point_y]
second_control_point_x=[2*(a+b)/3 for a in start_point_x for b in start_point_x]
second_control_point_y=[2*(a+b)/3 for a in start_point_y for b in end_point_y]

#create the Bokeh plot object
bokeh_plot=figure(title="Some funky Bezier curves")

#add the Bezier curves to the plot object
bokeh_plot.bezier(start_point_x, start_point_y, end_point_x, end_point_y, first_control_point_x, first_control_point_y,
                  second_control_point_x, second_control_point_y)

Programming braindump : Week 09/02/2015

The Braindump

  • I went to a great event organized by Women in Data and Pyladies London. The evening was sponsored by Man group’s AHL fund and included talks by AHL quant developers and quant analysts. All in all, it was a great evening and I am now quite fascinated about the hedge fund world and in particular, the software development that goes into making pricing models.

  • At work, I am working on improving a source-to-source compiler, which transforms Java code into C# and JavaScript code. Needless to say, I am learning a whole lot about compilers, Abstract Syntax Trees and and about the JVM. There might be a blog post in the near future about compilers - they are truly fascinating!

  • I have been reading Cal Newport’s blog. I first came across Newport’s writings in college and was quite taken with his debunking of the “follow your passion” -thesis for career success. Although I read his advice on being successful at college, various issues in my personal life got in the way of implementing these strategies. I’d like to these strategies another try.

  • I started working on a data simulation project related to the London underground. Hopefully, I will be able to give an update post soon.

  • I finally started using Pycharm and I can say that my Python programming productivity has increased. On the other hand, people say that IDEs make a developer stupid. So I’ll have to keep close tabs on the situation and maybe bust out good old vim once in a while (vim is great, by the way, but for TDD I find an IDE to be easier)

On-going programming reading list

  • Learning Python by Mark Lutz: Still working through part IV: Functions and Generators. Hopefully, will finish it in the coming week

  • Bokeh tutorial (needed for my London Underground programming project)

  • graph_tool tutorial (needed for my London Underground programming project)

An Evening at AHL

London is probably one of the most exciting places to live if you are a software developer. True, the rent is absolutely ridiculous and the Tube gets rather crowded during rush hour, but the amount of activities, conferences and meetups geared toward technologists is absolutely staggering.

I am a member of Pyladies London and Women in Data, two meetup groups who do an excellent job organizing interesting events for women working in software engineering and data analysis. Yesterday, I attended a joint Pyladies and Women in Data meetup sponsored by Man Group’s AHL hedge fund.

At AHL, we heard three great talks, one by Carol Ward, AHL COO, one by Charlie Beeson, a quant developer and one by Giuliana Bordigoni, a quant researcher. Charlie Beeson explained the reasons behind the adoption of Python at AHL. While previously, the quant research and the quant platform part used different technologies (R and Matlab for the the quant research and Java and C++ for the quant platform ), now AHL uses Python to bridge the gap between the two.

In addition to developing in-house Python software, AHL developers also contribute to Numpy, Pandas and SciPy open source projects. The dev environment of choice at AHL is Eclipse IDE with the PyDev plugin. I was quite please to see that AHL had developed the F1 short cut for the PyDev plugin, which allows an RStudio like execution of Python code line by line.

Braindump - week 02/02/2015

The Weekly Programming Braindump

  • Sebastian Rashka has written an interesting post on parallel programming with Python

  • When you are designing a Python method (aka function) that involves file IO, is it “better” (in whatever way a programmer may choose to define better) to pass the file as a filename (string) to the method and let the method take care of opening the file or is better to pass it as a file object and do the whole opening conundrum in the client calling the method/function

  • I started learning about tuple spaces and tuple space programming

  • I learned something about blocking and non-blocking IO in Java, but I can’t say I could explain it to someone else. Must do more research.

  • I started using PyCharm and downloaded Spyder. The weekend should be called “Fun with IDEs”

  • I am planning to write a Spyder plugin for bioinformaticians. That is, as soon as I learn how to use Spyder

  • I started learning about limit order books in finance and started writing my own limit order book simulation based on the orderbook package in R

  • I just completed the data harvesting phase for my London Tube Simulation and I could not be more excited about it!

  • I was inspired by KDB+ and Q (I can’t say I have any experience of either really, I just happened to browse the Wikis for fun) to mangle together a pseudo-useful data structure for my next project.

  • I learned about ctags in Vim (but not, like, how you actually make them useful for your project)

  • I learned something about transcompilers and to be honest, they scare me a bit (Compilers scare me too, but in a good way!)

  • I started reading Ilmari Karonen’s introduction to Redcode again and this time it started making more sense.

  • I have been thinking about the Global Interpreter Lock in Python and how one may try to circumnavigate it and what would the consequences of that be (!!!!)

  • TDD has been starting to stick on me. Esp. after I started using Pycharm. It’s just so addicting to see that green “play” button, click on it and get the satisfying “Tests passed”. Ahhh.

  • I found a paper about computing confidence intervals of the AUC. Previously, Python completely zonked on me, when I tried to implement one of the formulas from the paper, but I think I may give it another try this weekend (armed with numpy and scipy).

  • I finally learned the difference between .bashrc and .bash_profile (yeah, I’m slow…but hey, better late than never!)