Hello and welcome to thoughtwisps! This is a personal collection of notes and thoughts on software engineering, machine
learning and the technology industry and community. For my professional website, please see
race-conditions.
Thank you for visiting!
25 Jan 2015
The word “engineer” appears in my job title, a title that I have held
since graduation in August, but I cannot, in good conscience, call myself an engineer.
If you’ve been following some of the posts on this blog you have probably
figured out that I work in software. That is what I tell people I meet.
“I work in software”. But I am not, nor will I ever be, a software engineer.
There are probably many ways to define
what makes a software engineer, but here is what I think
makes a good software engineer.
A software engineer is someone who can
- comfortably navigate complicated systems
- formulate a well-architected solution to a business problem
- come up with a good testing strategy for code
- come to grips with a fairly large and complicated code base quickly
Honestly evaluating my skills, I can do none of the things above.
The first Hello World came too late
I sometimes wonder if things would be easier for me had I picked up
computer programming earlier in life. I wrote one hello world at about 15
using C++, but I quickly gave up, because I didn’t understand how to work
with the compiler. This was back in the stone age before Udacity, Codecademy and co.
came along.
My first real “hello world” was in Java. By that time I was already a senior in
college. In software engineering terms, a mega old person to be starting off
in the field.
Most of my colleagues have a Commodore-64 somewhere in their childhood.
By their early teens, they were usually experts in C or another language.
In college and postgraduate, they took courses called “Operating systems”
and “Algorithms and data structures”. They can comfortably converse about UDP
and TCP/IP. Occasionally they turn to me and ask, “Do you know what XOR is?”
I know what XOR is, but I had to Google it. I know something about
algorithms and data structures, but I won’t be able to implement a Bloom filter.
Hell, I’ll probably struggle with implementing merge sort or the like.
I didn’t have a Commodore-64 growing up. I didn’t even have my own computer
until I went to high school.
Know the Code-scape or get lost
My colleagues are experts at hiking the coding landscape. They have years
of experience. Loops and libraries, functions and decorators, variables and pointers,
familiar stuff. But I am the n00b of n00bs. I have barely gotten my feet wet
in the code-scape. There are times when “being lost” is my default
existence on every project I am asked to join.
Don’t get me wrong. I am the good girl scout. I do try. I take a bug ticket
and I start looking at the code base. “I will figure out this bug”, I tell myself.
“I will make it. I will find out what makes this code tick.”
But the map I’m holding is upside-down. I read the code and I hit one
speedbump after another: the object references another object, which
references a method in a library that I cannot even find. Within hours,
I have drifted way off-course. I still don’t know what caused the bug.
Can you come up with a solution? I don’t think I can.
Ultimately, a software engineer is someone who sees a problem and
comes up with a working solution. I see a problem, but I do not know where to
start. I end up floundering for hours. I do some TDD and hit a roadblock.
I do some crash-and-burn coding and hit a roadblock. I can’t call myself an
engineer, because if someone gives me a problem and asks me to come up with
a solution, I am absolutely lost. At best, I can give you something that may work
or might be brute force, but it won’t be elegant.
I can build you a raft made of discarded wood, but a real engineer would build
you a yacht.
So when I say, “I just work with software” that is exactly what it means.
Regardless of what it says on my business card, I am not a software engineer.
24 Jan 2015
Tutorials and Advice on Low Level Programming
(proceed at your own risk)
Last updated on 24th January, 2015.
18 Jan 2015
I was recently interviewing for a position at a financial
software company. The interviewer and I were discussing
some features of built-in Python objects and we got onto the topic
of mutable and immutable objects.
I was asked to explain why Python tuples are immutable.
“Wow, I’ve never thought about that,” I said.
Afterwards, I felt like a huge dolt. Here I was waxing code-o-sophical about
mutable lists and immutable strings and I didn’t even know why
a core built-in object is immutable! Cue: embarrassed silence and a lot
of facepalming.
I probably won’t be called for a second interview (due to
the aforementioned “stellar” knowledge of Python’s built-in types), but I still want to know
why Python’s tuples are immutable. There is probably no better way to spend
a cold and wet London Sunday evening than cozying up
with Mark Lutz’s Learning Python and a bottle of cold Coke Zero.
So here goes: Python’s tuples!
Introducing the Tuple
A Python tuple is the lovechild of a list and a string. Like a list, a tuple
is a sequence of elements. However, once set, we cannot change a tuple. Like
a Python string, a tuple is immutable. Let’s see how this works in practice.
Tuple Syntax
- A tuple is created using parentheses
#tuple with strings and integers
random_tuple=('1', 'hello', 1, 4)
#tuple with only integers
another_random_tuple= (1,2,3,4,5)
- a tuple can contain objects of mixed types
#tuple with lists and dicts
dict_list_tuple=({'food':'icecream', 'country':'Finland'}, [1,2,3,4,5])
Implications of Tuple immutability
The tuple object supports many of the same operations as lists and
strings, such as indexing, slicing and concatenation. Because a tuple is
immutable, none of these operations changes it in place, so you have
to pay attention to how you handle the object that is returned.
For example, suppose we have a tuple which represents the food
currently stored in my fridge.
fridge_contents=('sushi', 'coca-cola', 'apples')
Suppose I buy some carrots and oranges and want to add them to the tuple
fridge_contents. I can try “tuple concatenation” in the interactive
shell to get something like the following.
>>> fridge_contents=fridge_contents+('carrots', 'oranges')
>>> fridge_contents
('sushi', 'coca-cola', 'apples', 'carrots', 'oranges')
Note that I have to reassign the variable fridge_contents
to the new concatenated tuple.
Now, suppose I want to substitute ‘sushi’ with
‘tanmen’ (a Wasabi dish I am particularly fond of). I could try a trick
that often works with lists
>>> fridge_contents[0]='tanmen'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
The Python shell doesn’t like it. Once you’ve created the tuple, you can’t change it.
You can only create a new tuple and reassign your variable to point to it.
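Since item assignment is off the table, one way to get the substitution is to build a new tuple from a slice of the old one and rebind the name. Here is a minimal sketch, written for Python 3; the extra name original is my own addition, used only to show that the old tuple stays untouched.

```python
# Start from the fridge after the carrots and oranges were added.
fridge_contents = ('sushi', 'coca-cola', 'apples', 'carrots', 'oranges')
original = fridge_contents  # a second reference to the old tuple

# Build a brand-new tuple: a one-element tuple plus a slice of the old one.
# Note the trailing comma: ('tanmen') without it is just a string.
fridge_contents = ('tanmen',) + fridge_contents[1:]

print(fridge_contents)  # ('tanmen', 'coca-cola', 'apples', 'carrots', 'oranges')
print(original[0])      # sushi (the old tuple itself was never modified)
```

The name fridge_contents now points to a different object; anything else that still refers to the old tuple keeps seeing the sushi.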
So why is a tuple immutable?
We already have lists to store collections of objects. Why do we need tuples?
According to M. Lutz, the answer lies in their immutability.
Suppose you write a method that depends on a particular sequence of objects
remaining the same throughout the lifetime of the code. If you store these objects
in a list, there is a chance that some other method using the list will accidentally
alter it and thus break your method. Thus, in a way, a tuple offers a guarantee
that a particular collection of objects will remain fixed.
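To make Lutz’s point concrete, here is a small sketch for Python 3. The function names (add_bonus_day, count_workdays) are hypothetical and exist only for this illustration.

```python
def add_bonus_day(days):
    """A careless helper that modifies its argument in place."""
    days.append('Sat')
    return days


def count_workdays(days):
    """Relies on the sequence of workdays staying the same."""
    return len(days)


# With a list, the careless helper silently changes the shared data.
workdays_list = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
add_bonus_day(workdays_list)
list_count = count_workdays(workdays_list)    # 6, not the expected 5

# With a tuple, the same mistake fails loudly instead.
workdays_tuple = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri')
tuple_error = None
try:
    add_bonus_day(workdays_tuple)
except AttributeError as error:
    tuple_error = str(error)                  # tuples have no append method
tuple_count = count_workdays(workdays_tuple)  # still 5

print(list_count, tuple_count)
print(tuple_error)
```

One caveat: the guarantee covers only the tuple itself. A list or dict stored inside a tuple, as in dict_list_tuple above, can still be changed in place.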
25 Dec 2014
Module Import Basics
Even if you only recently picked up Python programming, chances
are you have already encountered the import statement.
For example, if you want to generate some numbers from the uniform
distribution on a certain interval, you can fire up Python’s standard
random module.
import random
random.uniform(0,1)
"""
$ 0.6894232262733903
"""
The import statement is used to give your script access to methods
and attributes defined in other .py files. These files are often
referred to as modules. In fact, modules form one of the cornerstones
of Python’s program architecture philosophy. A large Python program usually has multiple
modules and one “main” script module, which controls the execution and
importing of other modules.
Let’s set up a small module example, which will help us understand how
module importing works in Python.
Module Importing Example
We need to create two .py files: essentialfunctions.py and main.py.
For now, be sure to place the files in the same directory.
Later on, we will see how to import modules which reside in different
directories from the main module script.
#essentialfunctions.py
#Python 2.X
def average(number1, number2):
    """
    Returns the average of two integers
    """
    return (number1+number2)/2.0
#the line below seems pointless,
#but will be used to illustrate a module import concept
print "Hello, you have just imported essentialfunctions.py module"
some_constant=1.234
#main.py
#Python 2.X
import essentialfunctions
print "Starting your main module"
For now, our main.py file only imports the essentialfunctions
module without using any of the methods or variables defined
in the essentialfunctions.py module. Let’s see what happens.
Open up your favorite terminal and run the main.py file.
You should see something like this
$ python main.py
Hello, you have just imported essentialfunctions.py module
Starting your main module
We printed out two strings to the terminal: the first string comes
from the essentialfunctions module, the second one from the main.py
script file. This is a feature of Python’s import statement: when
a module is imported, the code in the module file is executed.
We can now take full advantage of our import and use the methods
and variables defined in the essentialfunctions module.
For example, we could alter our main.py script
to compute the average of two integers using the average function
defined in the essentialfunctions module.
import essentialfunctions
print "Starting your main module"
print essentialfunctions.average(9,10)
The result should be something like the following
$ python main.py
Hello, you have just imported essentialfunctions.py module
Starting your main module
9.5
Importing Python Modules in Interactive Sessions
Let’s try to import our essentialfunctions module into an interactive
Python session.
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import essentialfunctions
Hello, you have just imported essentialfunctions.py module
Just as before, we see that when Python imports a module, all
of the code in the module file is executed and as a result
“Hello, you have just imported essentialfunctions.py module”
is printed to output.
Let’s try rerunning the import statement in the same interactive session.
>>> import essentialfunctions
>>>
We can see that “Hello, you have just imported essentialfunctions.py module”
is not printed out this time. This is a feature of Python’s import statement.
A module’s code is executed only once per session, even if the file
changes on disk in the meantime.
For example, suppose we edit essentialfunctions.py in another window
while our interactive Python session is running.
#essentialfunctions.py MODIFIED
#Python 2.X
def average(number1, number2):
    """
    Returns the average of two integers
    """
    return (number1+number2)/2.0
#the line below seems pointless,
#but will be used to illustrate a module import concept
print "Hello, you have just imported essentialfunctions.py module"
some_constant=1.234
print "The essentialfunctions.py module has been altered"
We can try importing essentialfunctions.py in our interactive session
again.
>>> import essentialfunctions
>>>
Nothing is printed.
Mark Lutz writes in Learning Python: “This is by design; imports are too
expensive an operation to repeat more than once per file, per
program run”. If we do want Python to read the modified essentialfunctions.py
module, we have to call the reload() function. The reload function is
built-in only in Python 2.X. For Python 3.X, you will first have to import
it from a standard library module: importlib in Python 3.4 and later, or
imp in older 3.X versions (imp is deprecated and was removed in Python 3.12).
#using reload in Python 3.4 and later
from importlib import reload
reload(essentialfunctions)
#or alternatively without using from
import importlib
importlib.reload(essentialfunctions)
#in Python 3.0-3.3, use "from imp import reload" instead
>>> reload(essentialfunctions)
Hello, you have just imported essentialfunctions.py module
The essentialfunctions.py module has been altered
The string "The essentialfunctions.py module has been altered" is now printed
to the terminal.
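The whole import-once-then-reload behaviour can also be reproduced in a single script. The sketch below is for Python 3 and uses importlib.reload (available in Python 3.4 and later). The module name import_demo_module and the helper run_and_capture are hypothetical, made up for this demonstration; the module file is written to a temporary directory so nothing in your project is touched.

```python
import importlib
import io
import os
import sys
import tempfile
from contextlib import redirect_stdout


def run_and_capture(action):
    """Runs action() and returns whatever it printed."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        action()
    return buffer.getvalue()


# Skip bytecode caching so a reload always re-reads the source file.
sys.dont_write_bytecode = True

# Write a throwaway module into a temporary directory.
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "import_demo_module.py")
with open(module_path, "w") as handle:
    handle.write('print("import_demo_module code is running")\n')
sys.path.insert(0, module_dir)

# The first import executes the module's code, the second one does not.
first = run_and_capture(lambda: importlib.import_module("import_demo_module"))
second = run_and_capture(lambda: importlib.import_module("import_demo_module"))

# Edit the file on disk while the session is still running.
with open(module_path, "a") as handle:
    handle.write('print("import_demo_module has been altered")\n')

# A plain import still does nothing; reload executes the new code.
third = run_and_capture(lambda: importlib.import_module("import_demo_module"))
reloaded = run_and_capture(
    lambda: importlib.reload(sys.modules["import_demo_module"]))

print(repr(first))
print(repr(second))
print(repr(third))
print(repr(reloaded))
```

Only the first import and the reload produce any output; the second and third imports return the module object that Python already holds in sys.modules.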
Summary
- Modules are a cornerstone of Python’s program architecture philosophy
- Modules are essentially .py files, which contain useful functions, objects
and variables (a module’s attributes)
- Other files can use a module’s attributes by first importing the
module using the import statement
- Python executes the code in the module file once when a module is imported
- Subsequent imports in the same Python session will not execute the code. To
force Python to execute the code use reload(). What do you have to do to
use reload() in Python 3.X?
24 Dec 2014
The full script is available here
A Simple Stock Exchange
Suppose we have a stock exchange that only trades equities
of three companies: Nokia, Apple and Google. The number
of shares of each company that the exchange can trade is listed
as follows:
equities={'Nokia':1000000,
'Google': 2000000,
'Apple': 2400000}
The exchange has allowed three traders: Laura, John and Mark
to trade on the exchange.
tally={'Laura': {'Nokia':0,
'Google':0,
'Apple':0},
'John':{'Nokia':0,
'Google':0,
'Apple':0},
'Mark': {'Nokia':0,
'Google':0,
'Apple':0}}
Every day, each trader’s action is decided by drawing a number from
the uniform distribution on the interval from 0 to 1. If the number is greater
than 0.5, the trader purchases 50 shares of each company.
Otherwise, the trader sells 25 shares of each company he or she currently holds.
import random
def purchase_stock(broker):
    """
    Trader buys 50 shares of each company
    """
    for company in equities.keys():
        equities[company]=equities[company]-50
    temp_dict=tally[broker]
    for company in temp_dict.keys():
        temp_dict[company]=temp_dict[company]+50
def sell_stock(broker):
    """
    Trader sells 25 shares of each company back to the exchange
    """
    temp_dict=tally[broker]
    for company in temp_dict.keys():
        if temp_dict[company]!=0:
            #Trader can sell stock back only if he/she bought it
            #in the first place
            temp_dict[company]=temp_dict[company]-25
            equities[company]=equities[company]+25
We will also add a method to reset the numbers of shares owned
by all traders and a method which simulates one
day of trading at the exchange.
def reset_exchange():
    """
    Sets the tally of each trader to 0
    """
    #without "global", the assignment below would only create a local variable
    global tally
    tally={'Laura': {'Nokia':0,
                     'Google':0,
                     'Apple':0},
           'John':{'Nokia':0,
                   'Google':0,
                   'Apple':0},
           'Mark': {'Nokia':0,
                    'Google':0,
                    'Apple':0}}
def trade():
    for broker in tally.keys():
        probability=random.uniform(0,1)
        if probability>0.5:
            purchase_stock(broker)
        else:
            sell_stock(broker)
After each day of trading at the exchange, we want to print
the number of shares owned by each broker in a readable form.
Therefore, we add a method called pretty_print_tally, which
prints out the tally broker by broker.
def pretty_print_tally(dictionary):
    """
    Pretty prints a tally of stocks
    """
    for broker in dictionary.keys():
        print broker + ": "
        print "Equities"
        for company in dictionary[broker].keys():
            print company + ": " + str(dictionary[broker][company])
        print "#########################################################"
After ten trading days at the exchange, the standing is as follows:
#########################################################
After day 10
Laura:
Equities
Nokia: 275
Apple: 275
Google: 275
#########################################################
John:
Equities
Nokia: 50
Apple: 50
Google: 50
#########################################################
Mark:
Equities
Nokia: 100
Apple: 100
Google: 100
#########################################################
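The listing above shows the standing after ten trading days, but not the loop that produced it. Here is a minimal, self-contained sketch of such a run, written for Python 3. The COMPANIES constant and the fixed random seed are my assumptions, added only to keep the sketch short and repeatable, so the numbers it prints will not match the output above.

```python
import random

COMPANIES = ('Nokia', 'Google', 'Apple')
equities = {'Nokia': 1000000, 'Google': 2000000, 'Apple': 2400000}
tally = {trader: {company: 0 for company in COMPANIES}
         for trader in ('Laura', 'John', 'Mark')}


def purchase_stock(broker):
    """Trader buys 50 shares of each company from the exchange."""
    for company in COMPANIES:
        equities[company] -= 50
        tally[broker][company] += 50


def sell_stock(broker):
    """Trader sells 25 shares of each company held back to the exchange."""
    for company in COMPANIES:
        if tally[broker][company] != 0:
            tally[broker][company] -= 25
            equities[company] += 25


def trade():
    """Simulates one trading day for every trader."""
    for broker in tally:
        if random.uniform(0, 1) > 0.5:
            purchase_stock(broker)
        else:
            sell_stock(broker)


random.seed(2014)  # assumption: a fixed seed, only so reruns are repeatable
for day in range(1, 11):
    trade()

print("After day 10")
for broker in sorted(tally):
    print(broker, tally[broker])
```

A quick sanity check for a simulation like this: shares only move between the exchange and the traders, so for every company the exchange’s remaining shares plus all traders’ holdings should still add up to the starting amount.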
Programming Postmortem
- No TDD! I launched straight into writing code without pausing
and thinking about it. Doing TDD requires discipline, and skipping
good TDD practices is not going to help develop my TDD-fu.
- No tests whatsoever! Thus refactoring will most likely be a nightmare.
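As a first step towards fixing that, here is a rough sketch of what a few tests could look like, using Python 3 and the standard unittest module. It assumes a hypothetical refactoring in which purchase_stock and sell_stock receive the exchange’s shares and one trader’s holdings as arguments instead of reading the global dictionaries; that change is my assumption, made because functions without global state are much easier to test.

```python
import unittest


def purchase_stock(equities, holdings, amount=50):
    """Moves `amount` shares of every company from the exchange to a trader."""
    for company in holdings:
        equities[company] -= amount
        holdings[company] += amount


def sell_stock(equities, holdings, amount=25):
    """Moves `amount` shares back to the exchange for every company held."""
    for company in holdings:
        if holdings[company] != 0:
            holdings[company] -= amount
            equities[company] += amount


class TradingTests(unittest.TestCase):
    def setUp(self):
        # Small made-up numbers keep the expected values easy to check.
        self.equities = {'Nokia': 1000, 'Google': 2000}
        self.holdings = {'Nokia': 0, 'Google': 0}

    def test_purchase_moves_fifty_shares(self):
        purchase_stock(self.equities, self.holdings)
        self.assertEqual(self.holdings, {'Nokia': 50, 'Google': 50})
        self.assertEqual(self.equities, {'Nokia': 950, 'Google': 1950})

    def test_sell_without_holdings_changes_nothing(self):
        sell_stock(self.equities, self.holdings)
        self.assertEqual(self.holdings, {'Nokia': 0, 'Google': 0})
        self.assertEqual(self.equities, {'Nokia': 1000, 'Google': 2000})

    def test_sell_after_purchase_returns_twenty_five_shares(self):
        purchase_stock(self.equities, self.holdings)
        sell_stock(self.equities, self.holdings)
        self.assertEqual(self.holdings, {'Nokia': 25, 'Google': 25})
        self.assertEqual(self.equities, {'Nokia': 975, 'Google': 1975})


# Run the tests explicitly so the script keeps going afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TradingTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Writing the tests first would also have forced an earlier decision about what sell_stock should do when a trader holds nothing, which the second test pins down.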