

Code Mesh 2016: Part 1

Thanks to the generous support of the Code Mesh scholarship program, I was able to attend the conference, meet interesting people working on non-mainstream technologies and, most importantly, add about 100 new entries to my “Learn More About This” list. What follows is a summary I’ve constructed to remind myself of the talks and to provide a convenient reference for future technical research projects.

Code Mesh 2016 Day 1

The first day of the conference kicked off with a keynote by Conor McBride on the topic of Space Monads. My brain - a newcomer to the world of functional programming - was catapulted into the unfamiliar terrain of functors, monads, a (syntactically) peculiar programming language called Agda and something that eventually compiled to show two moving squares in a console window. What I did (sort of) understand is that Agda allows the programmer to specify the type of a program and then automatically generate a program of that type. I am definitely adding space monads to my “Learn More About This” list, but I feel I can only gain a solid understanding of what was demonstrated in the keynote after gaining some ground in the functor-monoid-monad territory (Haskell to the rescue?).

After the keynote, the conference attendees separated into three tracks. I attended a talk on ‘Gossiping Unikernels’ by Andreas Garnaes (Zendesk). Andreas discussed rearchitecting a monolithic application (everything living in the same process) into microservices hosted on unikernels. While the traditional ‘server-side’ stack involves multiple layers of virtualisation (the application space, the container space, the virtual OS, the hypervisor), unikernels interface directly with the hypervisor. Because unikernels incorporate less code than traditional operating systems, they have a reduced attack surface and are thus (potentially) more secure. The ‘potentially’ in this paraphrasing of Andreas’ talk is my addition - mainly as a footnote, because I am not entirely clear on what the drawbacks of dropping traditional operating system code are. Do unikernels lose some of the traditional OS security protections by including less library code?

Furthermore, Andreas discussed a problem that emerged once the monolith had been broken into microservices: service discovery - in other words, how do microservices know who is providing what? Two solutions are available: introduce a centralized messaging component, or investigate peer-to-peer solutions. A naive peer-to-peer implementation would have every node regularly heartbeat every other member of the microservices cluster, but this approach takes too much bandwidth, since the total number of heartbeat messages grows quadratically with the number of microservice nodes. Thus there is a need for a distributed membership protocol that eventually detects a faulty node, is reasonably fast and does not make high demands on the network. SWIM (Scalable Weakly-consistent Infection-style process group Membership) is one such protocol; it achieves these aims through random probing and gossiping. For example, in a three-node system with Alice, Bob and Charlie microservices, Alice pings Charlie to check for failure. If Charlie responds, all is good. If Charlie does not respond within a particular timeframe, Alice pings Bob and asks Bob to ping Charlie. If Charlie responds to Bob, Bob relays this response to Alice. If not, Alice records Charlie as unavailable and propagates this information to Bob (gossiping), who is then aware of Charlie being unavailable and in turn propagates this information to all other nodes in the system. Thus all nodes eventually recognize that Charlie is no longer available.
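To make that failure-detection dance concrete for future me, here is a minimal Python sketch of a SWIM-style probe round. It is my own toy reconstruction, not code from the talk - the Node class, the probe/ping_req method names and the three-node setup are all hypothetical - and it leaves out real SWIM details such as suspicion timeouts and incarnation numbers.

```python
class Node:
    """A toy cluster member that tracks which peers it believes have failed."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.failed = set()  # names of peers this node considers unavailable

    def ping(self, target):
        # Direct probe: in this toy model it succeeds only if the target is up.
        return target.alive

    def ping_req(self, helper, target):
        # Indirect probe: ask a helper to ping the target and relay the answer.
        return helper.ping(target)

    def probe(self, target, members):
        """One SWIM-style protocol round against a chosen target."""
        if self.ping(target):
            return  # target answered directly, nothing to record
        helpers = [m for m in members if m not in (self, target)]
        if any(self.ping_req(h, target) for h in helpers):
            return  # a helper reached the target, so it is still alive
        # Neither we nor any helper could reach the target: mark it failed
        # and gossip that knowledge to everyone we can still talk to.
        self.failed.add(target.name)
        for peer in helpers:
            peer.failed.add(target.name)


alice, bob, charlie = Node("alice"), Node("bob"), Node("charlie")
charlie.alive = False
alice.probe(charlie, [alice, bob, charlie])
print(alice.failed, bob.failed)  # both now contain {'charlie'}
```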

After describing the theoretical aspect of SWIM, Andreas gave a demo of a pure implementation of the protocol (once again dipping toes into functional programming) and demonstrated how pure implementations can make the development process easier by allowing different types of I/O interfaces (sync, async, mock, MirageOS) to wrap the core SWIM implementation on demand.
Finally, the talk concluded with a useful tip on how to apply property-based testing to reduce the number of execution paths that have to be tested in a distributed system.
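Because the pure core takes plain values in and out, it can be exercised by a property-based testing tool directly. The sketch below is my own guess at what such a test might look like, using the Python Hypothesis library rather than whatever Andreas used; the merge_views function and the properties checked are assumptions made up for illustration.

```python
# pip install hypothesis pytest
from hypothesis import given
import hypothesis.strategies as st


def merge_views(local, gossiped):
    """Pure protocol step: fold gossiped failure reports into our own view."""
    return local | gossiped


@given(st.sets(st.text()), st.sets(st.text()))
def test_merge_is_order_insensitive(a, b):
    # Nodes should converge on the same view no matter the order gossip arrives in.
    assert merge_views(a, b) == merge_views(b, a)


@given(st.sets(st.text()), st.sets(st.text()))
def test_merge_never_forgets_a_failure(a, b):
    # Gossip can only add to what a node already knows, never erase it.
    assert a <= merge_views(a, b)
```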

After Andreas’ talk, I wandered into the main hall to hear Sophie Wilson speak about the ‘Future of Microprocessors’. As someone who spends a big part of her day coming up with instructions for a silicon brain to execute, I am ashamed to say that the inner workings of that silicon brain are a bit of a mystery. In spite of my inability to coherently define key terms such as microprocessor or transistor, I had previously heard of Moore’s Law (“the number of transistors we can fit on a silicon (chip?)” - a lapse in the note-taking process - “doubles every 2 years”). The law is more of an empirical observation, and the precise wording apparently drifts within the chip industry. Sophie showed the original microprocessor designs for the 6502 and the ARM1. All I could do was marvel at the intricate photos and make mental notes to learn more about microprocessors ASAP. In her discussion of FirePath, Sophie demonstrated an instruction set designed to be truly parallel and remarked that very few modern software development languages are truly parallel, making them a poor fit for a parallel hardware environment (a theme that was touched upon in Kevin Hammond’s ParaFormance talk later in the afternoon).

The number of transistors on a chip is still rising, but the ability of devices to utilize this increasing capacity is limited by the cost of etching smaller gates and by the immense heat generated by these power-dense microprocessors. Thus even if more and more cores are added to future devices, some of these cores will have to be kept dark to keep the device from overheating. The 10x performance increases witnessed in the early development of microprocessors will not be seen again.

In the afternoon, Mark Priestley spoke about the process behind the birth of new programming paradigms - with a specific slant toward the functional programming paradigm. Mark pointed out two ways to study the ‘history of a programming paradigm’: a backward-looking one, captured by Paul Hudak and David Turner, and a forward-looking one, constructed by looking at the prevailing programming problems and challenges of the early 1950s (before the development of the functional paradigm) and working forwards to see which steps and paths of investigation led to the modern notion of the functional programming paradigm. While Hudak and Turner look to Church’s lambda calculus as the first functional programming language, the forward-looking history traces its origin to Newell and Simon’s theorem-proving machine, which gave rise to the list data structure, which in turn gave rise to Lisp.

After learning a bit about the functional programming paradigm, I went to ‘A Brief History of Distributed Programming: RPC’ by Caitie McCaffrey (Twitter) and Christopher Meiklejohn (Université catholique de Louvain) - one of the talks on my ‘Most Anticipated’ list. RPC, as I learned during the talk, stands for Remote Procedure Call and in a nutshell (which I am hopefully presenting here correctly) allows one computer to execute some functionality on a remote computer as if it were a local call. Caitie and Christopher walked the audience through several iterations of RPC technology, as captured in the Request for Comments memos published by various industry members. The problems with RPC boil down to a choice between two positions: (1) treat everything as local, or (2) treat everything as remote. RPC has re-emerged in the form of new frameworks, such as Finagle for the JVM. None of these frameworks addresses the original question (‘everything local’ or ‘everything remote’); they only solve the problem of getting things on and off the wire. Interesting future academic research in this area includes Lasp, Bloom and Spores.
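For future reference, here is a minimal sketch of the basic RPC idea using Python’s standard-library XML-RPC modules. This is my own toy example rather than anything shown in the talk; the add procedure and the port number are arbitrary choices. The point is how much the remote call looks like a local one - which is exactly what the ‘everything local vs everything remote’ debate is about, since the network can fail in ways a local call never will.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The 'remote procedure' that the server exposes."""
    return a + b


# Server side: register the procedure and serve in a background thread.
server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the remote call is written exactly like a local function call.
proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))  # 5, computed by the 'remote' server over the wire
```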

Finally, I went to Neha Narula’s (MIT Media Lab) talk on ‘The End of Data Silos: Interoperability via Cryptocurrencies’. Her talk provided an excellent summary of the three ideas that come together to make blockchain technology possible: (1) distributed consensus, (2) public/private key cryptography and (3) common data formats and protocols. In distributed consensus, multiple computers come together to agree on a value. The system has to be tolerant of Byzantine failures - failures where certain nodes in the system start sending false or incorrect data. Existing Byzantine fault tolerance protocols require the participants to be known ahead of time, which is not possible in an open system such as a cryptocurrency network. This opens the system up to the Sybil attack, in which an attacker creates thousands of subverted nodes to influence the process of distributed consensus. What thwarts this effort in distributed cryptocurrencies is the computational price a new joiner to the blockchain has to pay. A Sybil attack is still possible, but it would require the attacker to expend an enormous amount of computational power.
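That ‘computational price’ is, as far as I understand it, the proof-of-work: every block must carry evidence of expended computation, so flooding the network with fake identities does not come cheap. The toy Python sketch below is my own illustration of that idea, not anything presented in the talk; the difficulty value and block string are made up, and real systems use vastly harder targets.

```python
import hashlib


def proof_of_work(block_data: str, difficulty: int = 4) -> int:
    """Find a nonce such that sha256(block_data + nonce) starts with
    `difficulty` hex zeros - a toy stand-in for a real difficulty target."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1


nonce = proof_of_work("block: alice pays bob 1 coin")
print(nonce)  # a Sybil attacker has to pay this cost over and over again
```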