Hello and welcome to thoughtwisps! This is a personal collection of notes and thoughts on software engineering, machine learning and the technology industry and community. For my professional website, please see race-conditions. Thank you for visiting!

Why I cannot call myself an engineer

25 Jan 2015

The word “engineer” appears in my job title, a title that I have held since graduation in August, but I cannot, in good conscience, call myself an engineer. If you’ve been following some of the posts on this blog you have probably figured out that I work in software. That is what I tell people I meet. “I work in software”. But I am not, nor will I ever be, a software engineer.

There are probably many ways to define what makes a software engineer, but here is what I think makes a good software engineer.

A software engineer is someone who can

comfortably navigate complicated systems
formulate a well-architected solution to a business problem
come up with a good testing strategy for code
come to grips with a fairly large and complicated code base quickly

Honestly evaluating my skills, I can do none of the things above.

The first Hello World came too late

I sometimes wonder if things would be easier for me had I picked up computer programming earlier in life. I wrote one hello world at about 15 using C++, but I quickly gave up, because I didn’t understand how to work with the compiler. This was back in the stone age before Udacity, Codecademy et co came along.

My first real “hello world” was in Java. By that time I was already a senior in college. In software engineering terms, a mega old person to be starting off in the field.

Most of my colleagues have a Commodore-64 somewhere in their childhood. By their early teens, they were usually experts in C or another language. In college and graduate school, they took courses called “Operating systems” and “Algorithms and data structures”. They can comfortably converse about UDP and TCP/IP. Occasionally they turn to me and ask, “Do you know what XOR is?”

I know what XOR is, but I had to Google it. I know something about algorithms and data structures, but I won’t be able to implement a Bloom filter. Hell, I’ll probably struggle with implementing merge sort or the like. I didn’t have a Commodore-64 growing up. I didn’t even have my own computer until I went to high school.

Know the Code-scape or get lost

My colleagues are experts at hiking the coding landscape. They have years of experience. Loops and libraries, functions and decorators, variables and pointers, familiar stuff. But I am the n00b of n00bs. I have barely gotten my feet wet in the code-scape. There are times when “being lost” is my default existence on every project I am asked to join.

Don’t get me wrong. I am the good girl scout. I do try. I take a bug ticket and I start looking at the code base. “I will figure out this bug”, I tell myself. “I will make it. I will find out what makes this code tick.”

But the map I’m holding is upside-down. I read the code and I hit one speedbump after another: the object references another object, which references a method in a library that I cannot even find. Within hours, I have drifted way off-course. I still don’t know what caused the bug.

Can you come up with a solution? I don’t think I can.

Ultimately, a software engineer is someone who sees a problem and comes up with a working solution. I see a problem, but I do not know where to start. I end up floundering for hours. I do some TDD and hit a roadblock. I do some crash-and-burn coding and hit a roadblock. I can’t call myself an engineer, because if someone gives me a problem and asks me to come up with a solution, I am absolutely lost. At best, I can give you something that may work, perhaps by brute force, but it won’t be elegant. I can build you a raft made of discarded wood, but a real engineer would build you a yacht.

So when I say, “I just work with software”, that is exactly what I mean. Regardless of what it says on my business card, I am not a software engineer.

The Awesome (or not so Awesome) Low-level programming linkspam

24 Jan 2015

Tutorials and Advice on Low Level Programming

(proceed at your own risk)

How can I get more experience with lower level programming
Quick tips for getting into systems programming
r/lowlevel (http://www.reddit.com/r/lowlevel/)

Last updated on 24th January, 2015.

Why are Python's tuples immutable?

18 Jan 2015

I was recently interviewing for a position at a financial software company. The interviewer and I were discussing some features of built-in Python objects and we got onto the topic of mutable and immutable objects. I was asked to explain why Python tuples are immutable. “Wow, I’ve never thought about that,” I said.

Afterwards, I felt like a huge dolt. Here I was waxing code-o-sophical about mutable lists and immutable strings and I didn’t even know why a core built-in object is immutable! Cue: embarrassed silence and a lot of facepalming.

I probably won’t be called for a second interview (due to the aforementioned “stellar” knowledge of Python’s built-in types), but I still want to know why Python’s tuples are immutable. There is probably no better way to spend a cold and wet London Sunday evening than cozying up with Mark Lutz’s Learning Python and a bottle of cold Coke Zero. So here goes: Python’s tuples!

Introducing the Tuple

A Python tuple is the lovechild of a list and a string. Like a list, a tuple is a sequence of elements. However, once set, we cannot change a tuple. Like a Python string, a tuple is immutable. Let’s see how this works in practice.

Tuple Syntax

A tuple is created by separating values with commas, usually wrapped in parentheses

#tuple with strings and integers
random_tuple=('1', 'hello', 1, 4)

#tuple with only integers
another_random_tuple= (1,2,3,4,5)

A tuple can contain objects of mixed types, including other containers

#tuple with lists and dicts
dict_list_tuple=({'food':'icecream', 'country':'Finland'}, [1,2,3,4,5])

Implications of Tuple immutability

The tuple object supports many of the same operations that work on lists and strings, such as indexing, slicing and concatenation. Because a tuple is immutable, you have to pay attention to how you handle the object that an operation returns. For example, suppose we have a tuple which represents the food currently stored in my fridge.

fridge_contents=('sushi', 'coca-cola', 'apples')

Suppose I buy some carrots and oranges and want to add them to the tuple fridge_contents. I can try “tuple concatenation” in the interactive shell to get something like the following.

>>> fridge_contents=fridge_contents+('carrots', 'oranges')
>>> fridge_contents
('sushi', 'coca-cola', 'apples', 'carrots', 'oranges')

Note that I have to reassign the variable fridge_contents to the new concatenated tuple. Now, suppose I want to substitute ‘sushi’ with ‘tanmen’ (a Wasabi dish I am particularly fond of). I could try a trick that often works with lists

>>> fridge_contents[0]='tanmen'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

The Python shell doesn’t like it. Once you’ve created the tuple, you can’t change it. You can only create a new tuple and reassign your variable to point to it.
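The difference between changing a tuple and pointing a variable at a new tuple can be made visible with a second name for the same object. This is only a small sketch; the name snapshot is made up for the illustration.

```python
fridge_contents = ('sushi', 'coca-cola', 'apples')
snapshot = fridge_contents          # a second name for the same tuple object

# Concatenation builds a brand new tuple; the old one is not modified
fridge_contents = fridge_contents + ('carrots', 'oranges')

print(snapshot)                     # still the original three items
print(fridge_contents is snapshot)  # False: the name now points to a new tuple

# To "replace" an item, build a new tuple from pieces of the old one
fridge_contents = ('tanmen',) + fridge_contents[1:]
print(fridge_contents)
```

Note the trailing comma in ('tanmen',): without it, Python would treat the parentheses as plain grouping around a string.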

So why is a tuple immutable?

We already have lists to store collections of objects. Why do we need tuples? According to M. Lutz, the answer lies in their immutability. Suppose you write a method that depends on a particular sequence of objects remaining the same throughout the lifetime of the code. If you store these objects in a list, there is a chance that some other method using the list will accidentally alter it and thus break your method. Thus, in a way, a tuple offers a guarantee that a particular collection of objects will remain fixed.
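Here is a rough sketch of that argument. The function careless_helper and the sample data are hypothetical, invented only to show the effect. The sketch also shows two related facts: a tuple of hashable items can serve as a dictionary key (a list cannot), and the guarantee is shallow, so a mutable object stored inside a tuple can still change.

```python
def careless_helper(items):
    """Hypothetical function that modifies its argument in place."""
    items[0] = 'changed'

# With a list, the caller's data is silently altered
menu_list = ['sushi', 'tanmen']
careless_helper(menu_list)
print(menu_list)

# With a tuple, the same call fails loudly and the data stays intact
menu_tuple = ('sushi', 'tanmen')
try:
    careless_helper(menu_tuple)
except TypeError as error:
    print("refused:", error)
print(menu_tuple)

# A tuple of hashable items can be used as a dictionary key
prices = {('sushi', 'large'): 9.5}
print(prices[('sushi', 'large')])

# The guarantee is shallow: a list stored inside a tuple can still be mutated
mixed = ([1, 2], 'hello')
mixed[0].append(3)
print(mixed)
```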

The Module Conundrum (part 1)

25 Dec 2014

Module Import Basics

Even if you only recently picked up Python programming, chances are you have already encountered the import statement. For example, if you want to generate some numbers from the uniform distribution on a certain interval, you can fire up Python’s standard random module.

import random
random.uniform(0,1)
#returns a float such as 0.6894232262733903

The import statement is used to give your script access to methods and attributes defined in other .py files. These files are often referred to as modules. In fact, modules form one of the cornerstones of Python’s program architecture philosophy. A large Python program usually has multiple modules and one “main” script module, which controls the execution and importing of other modules.

Let’s set up a small module example, which will help us understand how module importing works in Python.

Module Importing Example

We need to create two .py files:

main.py will play the role of the main script module
essentialfunctions.py will be the script module we want to import

For now, be sure to place the files in the same directory. Later on, we will see how to import modules which reside in different directories from the main module script.

#essentialfunctions.py
#Python 2.X

def average(number1, number2):
  """
  Returns the average of two integers
  """
  return (number1+number2)/2.0

#the line below seems pointless,
#but will be used to illustrate a module import concept
print "Hello, you have just imported essentialfunctions.py module"

some_constant=1.234

#main.py
#Python 2.X
import essentialfunctions
print "Starting your main module"

For now, our main.py file only imports the essentialfunctions module without using any of the methods or variables defined in the essentialfunctions.py module. Let’s see what happens. Open up your favorite terminal and run the main.py file. You should see something like this

$ python main.py

Hello, you have just imported essentialfunctions.py module
Starting your main module

We printed out two strings to the terminal: the first string comes from the essentialfunctions module, the second one from the main.py script file. This is a feature of Python’s import statement: when a module is imported, the code in the module file is executed.

We can now take full advantage of our import and use the methods and variables defined in the essentialfunctions module. For example, we could alter our main.py script to compute the average of two integers using the average function defined in the essentialfunctions module.

import essentialfunctions
print "Starting your main module"
print essentialfunctions.average(9,10)

The result should be something like the following

$ python main.py

Hello, you have just imported essentialfunctions.py module
Starting your main module
9.5

Importing Python Modules in Interactive Sessions

Let’s try to import our essentialfunctions module into an interactive Python session.

Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import essentialfunctions
Hello, you have just imported essentialfunctions.py module

Just as before, we see that when Python imports a module, all of the code in the module file is executed and as a result “Hello, you have just imported essentialfunctions.py module” is printed to output.

Let’s try rerunning the import statement in the same interactive session.

>>> import essentialfunctions
>>>

We can see that “Hello, you have just imported essentialfunctions.py module” is not printed out this time. This is a feature of Python’s import statement. A module’s code is executed only once even if the code changes. For example, suppose we edit essentialfunctions.py in another window while our interactive Python session is running.

#essentialfunctions.py MODIFIED
#Python 2.X

def average(number1, number2):
  """
  Returns the average of two integers
  """
  return (number1+number2)/2.0

#the line below seems pointless,
#but will be used to illustrate a module import concept
print "Hello, you have just imported essentialfunctions.py module"

some_constant=1.234

print "The essentialfunctions.py module has been altered"

We can try importing essentialfunctions.py in our interactive session again.

>>> import essentialfunctions
>>>

Nothing is printed. Mark Lutz, in Learning Python, writes “This is by design; imports are too expensive an operation to repeat more than once per file, per program run”. If we do want Python to read the modified essentialfunctions.py module, we have to call the reload() function. The reload function is a built-in only in Python 2.X. In Python 3.X, you first have to import it from the standard library: from importlib in Python 3.4 and later, or from the older imp module, which has been deprecated since Python 3.4 and was removed in Python 3.12.

#using reload in Python 3.X (3.4 and later)
from importlib import reload
reload(essentialfunctions)

#or alternatively without using from
import importlib
importlib.reload(essentialfunctions)

>>> reload(essentialfunctions)
Hello, you have just imported essentialfunctions.py module
The essentialfunctions.py module has been altered

The string "The essentialfunctions.py module has been altered" is now printed to the terminal.
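The "execute once, then reuse" behaviour can also be checked in a single script. The sketch below is written for Python 3 and makes a few assumptions of its own: it writes a throwaway module called demo_module (a made-up name) into a temporary directory, and counts how often the module body runs by capturing what it prints. Python keeps already imported modules in the sys.modules dictionary, which is why the second import does no work.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

# Hypothetical throwaway module, written to a temporary directory
MODULE_SOURCE = 'print("executing demo_module")\nsome_constant = 1.234\n'

tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "demo_module.py"), "w") as handle:
    handle.write(MODULE_SOURCE)

sys.path.insert(0, tmp_dir)      # make the directory importable
importlib.invalidate_caches()    # let the import system notice the new file

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    import demo_module               # first import: the module body runs
    import demo_module               # second import: reused from sys.modules
    runs_after_imports = captured.getvalue().count("executing demo_module")

    importlib.reload(demo_module)    # reload: the module body runs again
    runs_after_reload = captured.getvalue().count("executing demo_module")

print("runs after two imports:", runs_after_imports)
print("runs after reload:", runs_after_reload)
print("cached in sys.modules:", "demo_module" in sys.modules)
```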

Summary

Modules are a cornerstone of Python’s program architecture philosophy
Modules are essentially .py files, which contain useful functions, objects and variables (a module’s attributes)
Other files can use a module’s attributes by first importing the module using the import statement
Python executes the code in the module file once when a module is imported
Subsequent imports in the same Python session will not execute the code. To force Python to execute the code, use reload(). What do you have to do to use reload() in Python 3.X?

A Simple Stock Exchange Simulation (part 1)

24 Dec 2014

The full script is available here

A Simple Stock Exchange

Suppose we have a stock exchange that only trades equities of three companies: Nokia, Apple and Google. The number of shares of each company that the exchange can trade is listed as follows:

equities={'Nokia':1000000,
          'Google': 2000000,
          'Apple': 2400000}

The exchange allows three traders to trade: Laura, John and Mark.

tally={'Laura': {'Nokia':0,
                 'Google':0,
                  'Apple':0},
       'John':{'Nokia':0,
                'Google':0,
                'Apple':0},
       'Mark': {'Nokia':0,
                 'Google':0,
                  'Apple':0}}

Every day, each trader draws a number from the uniform distribution on the interval from 0 to 1. If the number is greater than 0.5, the trader purchases 50 shares of each company. Otherwise, the trader sells 25 shares of each stock he or she currently holds back to the exchange.

import random

def purchase_stock(broker):
    """
    Trader buys 50 shares of each company
    """
    for company in equities.keys():
        equities[company]=equities[company]-50
    temp_dict=tally[broker]
    for company in temp_dict.keys():
        temp_dict[company]=temp_dict[company]+50

def sell_stock(broker):
    """
    Trader sells 25 shares of each company back to the exchange
    """
    temp_dict=tally[broker]
    for company in temp_dict.keys():
        if temp_dict[company]!=0:
            #Trader can sell stock back only if he/she bought it
            #in the first place
            temp_dict[company]=temp_dict[company]-25
            equities[company]=equities[company]+25

We will also add a method to reset the numbers of shares owned by all traders and a method which simulates one day of trading at the exchange.

def reset_exchange():
    """
    Sets the tally of each trader to 0
    """
    #update the existing module-level dict in place; assigning a new
    #dict to tally here would only create a local variable
    for broker in tally.keys():
        for company in tally[broker].keys():
            tally[broker][company]=0

def trade():
    """
    Simulates one day of trading at the exchange
    """
    for broker in tally.keys():
        probability=random.uniform(0,1)
        if probability>0.5:
            purchase_stock(broker)
        else:
            sell_stock(broker)

After each day of trading at the exchange, we want to print the number of shares owned by each broker in a readable format. Therefore, we add a method called pretty_print_tally, which prints out the tally.

def pretty_print_tally(dictionary):
    """
    Pretty prints a tally of stocks
    """
    for broker in dictionary.keys():
        print broker + ": "
        print "Equities"
        for company in dictionary[broker].keys():
            print company + ": " + str(dictionary[broker][company])
        print "#########################################################"

After ten trading days at the exchange, the standing is as follows:

#########################################################
After day 10
Laura:
Equities
Nokia: 275
Apple: 275
Google: 275
#########################################################
John:
Equities
Nokia: 50
Apple: 50
Google: 50
#########################################################
Mark:
Equities
Nokia: 100
Apple: 100
Google: 100
#########################################################
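The loop that runs the ten trading days is not shown above. A minimal, self-contained sketch of how it could look is given below, written for Python 3. It is not the original script: the helper names make_exchange and trade_one_day, the seed value and the choice to pass the dictionaries around as arguments instead of using module-level variables are all assumptions made for this illustration.

```python
import random

def make_exchange():
    """Build fresh exchange inventory and trader tallies (hypothetical helper)."""
    equities = {'Nokia': 1000000, 'Google': 2000000, 'Apple': 2400000}
    tally = {name: {company: 0 for company in equities}
             for name in ('Laura', 'John', 'Mark')}
    return equities, tally

def trade_one_day(equities, tally, rng):
    """Apply the buy 50 / sell 25 rule once for every trader."""
    for broker, holdings in tally.items():
        if rng.uniform(0, 1) > 0.5:
            # Buy 50 shares of each company from the exchange
            for company in holdings:
                holdings[company] += 50
                equities[company] -= 50
        else:
            # Sell 25 shares back, but only of stock the trader holds
            for company in holdings:
                if holdings[company] != 0:
                    holdings[company] -= 25
                    equities[company] += 25

rng = random.Random(2014)            # arbitrary seed, so runs are repeatable
equities, tally = make_exchange()
initial_total = sum(equities.values())

for day in range(10):
    trade_one_day(equities, tally, rng)

held_by_traders = sum(sum(holdings.values()) for holdings in tally.values())
for broker, holdings in tally.items():
    print(broker, holdings)
print("shares conserved:", held_by_traders + sum(equities.values()) == initial_total)
```

Passing a seeded random.Random instance makes the run repeatable, and the final line checks a simple invariant: shares only move between the exchange and the traders, so the total never changes.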

Programming Postmortem

No TDD! I launched straight into writing code without pausing to think about it. Doing TDD requires discipline, and skimping on good TDD practices is not going to help develop my TDD-fu.
No tests whatsoever! Thus refactoring will most likely be a nightmare.
