Hello and welcome to thoughtwisps! This is a personal collection of notes and thoughts on software engineering, machine
learning and the technology industry and community. For my professional website, please see
race-conditions.
Thank you for visiting!
18 Feb 2015
This is an on-going series of notes on writing functions in Python. Most of these posts are based on Mark Lutz’s excellent book Learning Python (5th Edition)
A function is a single group of statements that makes our Python code
more reusable and structured! So there is every reason to learn how
to write functions effectively and to know the various ways in which functions can be created in Python.
The def
statement
The most common way to create functions in Python is using def
.
The def
statement simply creates the function object and assigns it to
the name written after def
.
For example,
def average(a,b):
"""
Computes the average of a and b and returns it
"""
return (a+b)/2.0
print average(3,4)
>>> 3.5
The name average
points to the block of code that computes the average of the two numbers.
However, we can do all sorts of things to a function’s name.
For example, we can
python_functions=[]
python_functions.append(average)
#call the function from the list and print the result
print python_functions[0](6,5)
>>> 5.5
- assign it to another name and use the name to refer to it in the future
average_of_two_numbers=average
print average_of_two_numbers(9,8)
>>> 8.5
def is executable
… which means that any function you write will not exist until the Python interpreter
reads the def
statement that first creates the function. We can easily test this
by writing two functions and calling the second function from the first function.
def print_geometric_series():
"""
Prints a geometric series in a nice way
:return None:
"""
geometric_series=compute_geometric_series(2,2,10)
for integer in geometric_series:
print integer
print_geometric_series()
def compute_geometric_series(initial, quotient, n):
"""
Returns a geometric series of length n
:param initial:
:param quotient:
:param n:
:return:
"""
return [initial*quotient**x for x in range(n)]
>>>NameError: global name 'compute_geometric_series' is not defined
As we can see from the NameError that Python throws, the function compute_geometric_series
does
not exist at the time when print_geometric_series
calls it. If you are coming from a Java background,
this may seem bizarre, because of course in Java, the order in which the methods appear in the file
doesnot matter.
17 Feb 2015
Recently, I have been exploring Bokeh, a Python library
for creating beautiful data visualizations in the vein of d3.js. Not only does
the library provide a good API for creating anything from histograms to
chloropleths, but it also provides support for interactive graphs with real-time
data updates, which is especially fascinating for anyone working with streaming
data such as stock prices.
One of my current side-projects involves creating a dynamically updating map
of the London Underground and for that I need to create a map-like
visualization of the relative positions of the stations. My initial
experiments with using only circles to indicate the positions were less
than satisfying (see the pseudomap below).

To make the map more realistic, I need to connect the circles together using either lines
or curves. Enter Bokeh’s Bezier curve functionality. I am still in the process
of figuring out exactly how to create a beautiful tube map using Bezier curves and circles.
My primary obstacle lies in understanding the mathematics behind the construction of the Bezier and
how this translates to controlling the curvature. For anyone struggling with similar questions,
here is a very good introduction to Bezier curves.
For now, I have just been experimenting by creating random Bezier curves using Bokeh.
Below is an example of Bezier curves with random start and endpoints.

The entire Python snippet used to create the image is given below.
#import the requires libraries
from bokeh.plotting import *
import numpy.random as np
#set the output HTML file, which will contain the graph
output_file('test-tube-map.html')
#generate random start, end and control points
start_point_x=list(np.random_sample(10))
start_point_y=list(np.random_sample(10))
end_point_x=list(np.random_sample(10))
end_point_y=list(np.random_sample(10))
first_control_point_x=[(a+b)/2 for a in start_point_x for b in end_point_x]
first_control_point_y=[(a+b)/2 for a in start_point_y for b in end_point_y]
second_control_point_x=[2*(a+b)/3 for a in start_point_x for b in start_point_x]
second_control_point_y=[2*(a+b)/3 for a in start_point_y for b in end_point_y]
#create the Bokeh plot object
bokeh_plot=figure(title="Some funky Bezier curves")
#add the Bezier curves to the plot object
bokeh_plot.bezier(start_point_x, start_point_y, end_point_x, end_point_y, first_control_point_x, first_control_point_y,
second_control_point_x, second_control_point_y)
15 Feb 2015
The Braindump
-
I went to a great event organized by Women in Data and Pyladies London.
The evening was sponsored by Man group’s AHL fund and included
talks by AHL quant developers and quant analysts. All in all, it was a great
evening and I am now quite fascinated about the hedge fund world and in
particular, the software development that goes into making pricing models.
-
At work, I am working on improving a source-to-source compiler, which
transforms Java code into C# and JavaScript code. Needless to say, I am
learning a whole lot about compilers, Abstract Syntax Trees and
and about the JVM. There might be a blog post in the near future about
compilers - they are truly fascinating!
-
I have been reading Cal Newport’s blog. I first came across Newport’s
writings in college and was quite taken with his debunking of the
“follow your passion” -thesis for career success. Although I read his advice
on being successful at college, various issues in my personal life got
in the way of implementing these strategies. I’d like to these strategies
another try.
-
I started working on a data simulation project related to the London
underground. Hopefully, I will be able to give an update post soon.
-
I finally started using Pycharm and I can say that my Python programming
productivity has increased. On the other hand, people say that IDEs make
a developer stupid. So I’ll have to keep close tabs on the situation and maybe
bust out good old vim once in a while (vim is great, by the way, but for TDD
I find an IDE to be easier)
On-going programming reading list
-
Learning Python by Mark Lutz: Still working through part IV: Functions and Generators. Hopefully, will
finish it in the coming week
-
Bokeh tutorial (needed for my London Underground programming project)
-
graph_tool tutorial (needed for my London Underground programming project)
12 Feb 2015
London is probably one of the most exciting places to live if you are
a software developer. True, the rent is absolutely ridiculous and the Tube gets
rather crowded during rush hour, but the amount of activities, conferences
and meetups geared toward technologists is absolutely staggering.
I am a member of Pyladies London and Women in Data, two meetup groups who do
an excellent job organizing interesting events for women working in
software engineering and data analysis. Yesterday, I attended a joint Pyladies
and Women in Data meetup sponsored by Man Group’s AHL hedge fund.
At AHL, we heard three great talks, one by Carol Ward, AHL COO, one by Charlie
Beeson, a quant developer and one by Giuliana Bordigoni, a quant researcher.
Charlie Beeson explained the reasons behind the adoption of Python at AHL.
While previously, the quant research and the quant platform part used
different technologies (R and Matlab for the the quant research and Java and C++
for the quant platform ), now AHL uses Python to bridge the gap between the two.
In addition to developing in-house Python software, AHL developers also
contribute to Numpy, Pandas and SciPy open source projects. The
dev environment of choice at AHL is Eclipse IDE with the PyDev plugin.
I was quite please to see that AHL had developed the F1 short cut for
the PyDev plugin, which allows an RStudio like execution of Python
code line by line.
06 Feb 2015
The Weekly Programming Braindump
-
Sebastian Rashka has written an interesting post
on parallel programming with Python
-
When you are designing a Python method (aka function) that involves file IO,
is it “better” (in whatever way a programmer may choose to define better) to
pass the file as a filename (string) to the method and let the method take care
of opening the file or is better to pass it as a file object and do the whole
opening conundrum in the client calling the method/function
-
I started learning about tuple spaces and tuple space programming
-
I learned something about blocking and non-blocking IO in Java, but
I can’t say I could explain it to someone else. Must do more research.
-
I started using PyCharm and downloaded Spyder. The weekend should be
called “Fun with IDEs”
-
I am planning to write a Spyder plugin for bioinformaticians. That is,
as soon as I learn how to use Spyder
-
I started learning about limit order books in finance and started writing
my own limit order book simulation based on the orderbook
package in R
-
I just completed the data harvesting phase for my London Tube Simulation and
I could not be more excited about it!
-
I was inspired by KDB+ and Q (I can’t say I have any experience of either
really, I just happened to browse the Wikis for fun) to mangle together
a pseudo-useful data structure for my next project.
-
I learned about ctags in Vim (but not, like, how you actually
make them useful for your project)
-
I learned something about transcompilers and to be honest, they scare me a bit
(Compilers scare me too, but in a good way!)
-
I started reading Ilmari Karonen’s introduction to Redcode
again and this time it started making more sense.
-
I have been thinking about the Global Interpreter Lock in Python and
how one may try to circumnavigate it and what would the consequences
of that be (!!!!)
-
TDD has been starting to stick on me. Esp. after I started using Pycharm.
It’s just so addicting to see that green “play” button, click on it and get
the satisfying “Tests passed”. Ahhh.
-
I found a paper about computing confidence intervals of the AUC.
Previously, Python completely zonked on me, when I tried to implement one
of the formulas from the paper, but I think I may give it another try
this weekend (armed with numpy and scipy).
-
I finally learned the difference between .bashrc and .bash_profile (yeah,
I’m slow…but hey, better late than never!)