Notes on To Engineer is Human

To Engineer Is Human (see it on Amazon or on Goodreads) is a 1992 book about how knowledge of past design failures is useful in current projects, with a strong bias toward civil engineering. Some examples are repetitive and not studied in depth, which makes the book a bit dull at times. Even so, I found some very interesting reflections and quotes that I wanted to comment on or simply organize for later review.

Galileo and wrong models

Galileo’s analysis of the cantilever beam illustrates an extremely important point for understanding how structural accidents can occur: he arrived at what is basically the right qualitative answer to the question he posed himself about the strength of the beam, but his answer was not absolutely correct in a quantitative way. He got the right qualitative answer for the wrong quantitative reason. Thus Galileo could correctly have advised any builders how to orient their beams for the best results, but should he have been asked to predict the absolute minimum-sized beam required to support a certain weight so many feet out from a wall, the answer he calculated from his formula might have been too weak by a factor of three.

— p. 51

I found a blog post about this subject, which you may find interesting.
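To make the factor of three concrete, here is a sketch of the usual textbook comparison. It assumes a rectangular beam of width b and depth h, a tip load W at distance L from the wall, and a material that fails at stress σ_f; these symbols are mine, not the book's. Galileo in effect took the tensile resistance to be uniform over the cross-section, with the beam pivoting about its bottom edge, while elastic beam theory places the neutral axis at mid-depth with a linear stress distribution:

```latex
\[
W_{\text{Galileo}}\, L = \sigma_f \,(b h)\, \frac{h}{2} = \frac{\sigma_f\, b\, h^2}{2},
\qquad
W_{\text{elastic}}\, L = \frac{\sigma_f\, b\, h^2}{6},
\qquad
\frac{W_{\text{Galileo}}}{W_{\text{elastic}}} = 3 .
\]
```

Both expressions scale with b h², which is why the qualitative advice (put the larger dimension vertical) comes out right even though the predicted load is three times too high.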

On learning from mistakes, not successes

[…] the process of successive revision is as common to both writing and engineering as it is to music composition and science, and it is a fair representation of the creative process in writing and in engineering to see the evolution of a book or a design as involving the successive elimination of faults and error. It is this aspect of the analogy that is most helpful in understanding how the celebrated writers and engineers alike learn more from the errors of their predecessors and contemporaries than they do from all the successes in the world.

— p. 79

Should a young engineer look for models in weak-linked structures while they are still functioning, he could indeed design weak links into his own structures. However, if the cause of a failure is understood, then any other similar structures should come under close scrutiny and the incontrovertible lesson of a single failed structure is what not to do in future designs. That is a very positive lesson, and thus the failure of an engineering structure, tragic as it may be, need never be for naught.

— p. 97

Yet no disaster need be repeated, for by talking and writing about the mistakes that escape us we learn from them, and by learning from them we can obviate their recurrence.

— p. 227

Treating every case of failure as an opportunity to test hypotheses, whether embodied in novel designs or in theories about the nature and process of engineering itself, makes even the most ancient of case studies immediately relevant for even the most forward-looking of technologies. In the final analysis, there are aspects of the engineering method, especially those involving conceptualization and the design process itself, that transcend all eras and cultures. Thus every case study—no matter how obsolete its technology or how fresh its trauma, whether in a book or in tomorrow’s news—is potentially a paradigm for understanding how human error and false reasoning can thwart the best laid plans.

— p. 232

The electronic brain

The electronic brain is sometimes promoted from computer or clerk at least to assistant engineer in the design office. Computer-aided design (known by its curiously uncomplimentary acronym CAD) is touted by many a computer manufacturer and many a computer scientist-engineer as the way of the future. But thus far the computer has been as much an agent of unsafe design as it has been a super brain that can tackle problems heretofore too complicated for the pencil-and-paper calculations of a human engineer. The illusion of its power over complexity has led to more and more of a dependence on the computer to solve problems eschewed by engineers with a more realistic sense of their own limitations than the computer can have of its own.

— p. 195

It is not only in the high technological business of building mass-transit buses that our accelerating socio-economic system breaks down. Computer models that predict the behavior of the economy have come increasingly to be relied upon to justify major economic decisions, and yet these models are not necessarily any more infallible than the ones that predict the fatigue life of a bus frame. Thus the same tools that apparently free us from the tedium of analyzing the wheel condemn us to reinvent it. We have come to be a society that is so quick to change that we have lost the benefits of one of mankind’s greatest tools—experience.

— p. 220

On being an engineer

It is a great profession. There is the fascination of watching a figment of the imagination emerge through the aid of science to a plan on paper. Then it moves to realization in stone or metal or energy. Then it brings jobs and homes to men. Then it elevates the standards of living and adds to the comforts of life. That is the engineer’s high privilege.

The great liability of the engineer compared to men of other professions is that his works are out in the open where all can see them. […] The engineer simply cannot deny that he did it. If his works do not work, he is damned. That is the phantasmagoria that haunts his nights and dogs his days. He comes from the job at the end of the day resolved to calculate it again. He wakes in the night in a cold sweat and puts something on paper that looks silly in the morning. All day he shivers at the thought of the bugs which will inevitably appear to jolt its smooth consummation.

— p. 215

Other recognized masters often express the thought that they abandon a work rather than complete it. What they mean is that they come to realize that for all their drafts and revisions, a manuscript will never be perfect, and they must simply decide when they have caught all its major flaws and when it is as close to perfect as they can make it without working beyond reasonable limits.

— p. 78

Playing Rising Lands on a modern computer

UPDATE: According to the comments of HellRazor, it is possible to play Rising Lands successfully with DOSBox. I added a new section explaining the method.

After reading the StarCraft II novels Heaven’s Devils and Flashpoint, I got back to playing StarCraft 2. I’m an awful player, but remembering the original StarCraft (and its expansion set, Brood War) made me think about games that impressed me when I was young. One of them was Rising Lands, an RTS set in a post-apocalyptic future in which Earth was devastated by a meteor collision and most technological knowledge was lost. (Yes, cheesy, but trust me on this: it is good.)

Rising Lands introduction

The game does have soldiers mounted on bears, balloons and vehicles with laser weapons! Image from http://www.giantbomb.com/rising-lands/3030-18967.

There are several interesting features (remember that the game was released in 1997). Diplomatic relationships persist between missions, so you can avoid conflicts if you want to rush the campaign, or literally eradicate your enemies. Units become hungry over time, so you can’t move soldiers, builders and farmers across the map mindlessly. You can get extra technology by sending an Archer to spy on others’ Science Labs, unlocking cool new things faster. There are four (initially three) areas of research that you can choose from at the start of each mission (Farming, Military, Engineering, and Magic), allowing you to build new kinds of vehicles and units for getting food or destroying enemies faster.

The game has a lot of memorable details, for example the unit voices (usually funny and well made) and the cutscenes that appear when your researchers create something new. It can be hard to find good reviews, but the site old-games.com makes some interesting points about it:

With a great premise and very intriguing blend of empire/micro- management (think The Settlers), real-time strategy fare, and historical context of Ultimate Domain, it is a pity that the interface in Rising Lands is not as user-friendly as Warcraft or other classics. You need to make a lot of left-clicks to order the units to do your bidding, and sometimes there is simply not enough time to do that when the enemy’s armies are approaching. However, unlike most game reviewers, I feel the game’s strong points more than outweigh the cumbersome interface. There are simply too many good ideas here, even if not all of them are implemented well. For instance, you have to make sure your units have access to enough food, or they will die. Hunger will also drive your units to abandon their post to find something to eat – a very nice nod to realism that adds a whole new layer to gameplay. You can also trade resources or people with other tribes by the use of markets and special “exchange” buildings, and you need to train messengers to initiate trades and pacts. Like all good 4X games, research is very important in Rising Lands: you get access to balloons and other cool units with enough research.

Source: http://www.old-games.com/download/5513/rising-lands

It’s been a long time since I looked through newer RTS games (apart from StarCraft), so I’m not sure if these ideas aren’t commonplace by now. I’d totally play a game that implemented them in a nicer way, though.

Where do I download it?

First of all: Rising Lands is abandonware. As far as I know, this means it can be downloaded and shared freely over the Internet, so the following links don’t count as piracy. I found torrents for the English and French versions of the game, and a download link for the Portuguese one as well (on Mega):

However, be aware that you will need to download the English version anyway, as it contains a patch to fix the game speed (more on this later).

How to run it?

First of all, you need to be able to load an IMG file. You can probably get by using something like Daemon Tools. This should mount the IMAGE.img as a CD, thus allowing you to access it.

You don’t need to install the game properly; just copy the INSTALLHD directory somewhere else (e.g. your Desktop) and play by running the RISING.EXE executable in that directory.

If you downloaded the Portuguese version, you need to load the RisingLands.cue and RisingLands.BIN files in Daemon Tools, which will give you access to the INSTALLHD directory.

Problems found so far (and possible fixes)

Rising Lands is a 1997 game, made to run on Windows 95. We’re practicing necromancy here, and it is not cheap: there are several bugs when running such old software on modern hardware. This is what I have collected from my research so far:

  • “The game is too fast for playing” – You need to download the torrent for the English version (see previous sections). There will be a directory named Patch with a patched RISING.EXE executable. Just copy that one to your INSTALLHD directory and the game should play at a reasonable speed.
  • “The game crashes when I try to save” – You need to provide an empty SAV file first. There’s a very simple trick to do this: disable the option to hide known file extensions, create an empty text file and rename it to Save1.SAV (or anything else with the extension “SAV”). If you don’t disable that option, you will probably end up with a file named Save1.SAV.txt, so follow the instructions in this link first. The process is similar on Windows 8 and 8.1: press the Windows key and Q to open the search menu, type “Control Panel” and you should be able to follow the rest of the tutorial.
  • “The unknown regions of the map are green and other graphical glitches” – You can fix this by opening the list of processes with Ctrl + Alt + Del and stopping explorer.exe, as can be seen in this video.
  • “The game is crashing randomly” – I couldn’t find a way to fix this one. Apparently, if you start a mission and it crashes at a specific moment, it will always crash at that moment, regardless of the game settings, your units, etc. The only way to “fix” this is to restart the mission and hope it won’t crash again. Judging from the error message, the game is attempting an invalid memory access. It is probably debuggable, but I don’t know enough about patching old games to be of any help here.
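If you happen to have Ruby installed, the empty save file can also be created from a script, which sidesteps the hidden-extension problem entirely. This is only a sketch: the helper name and the default file name are my own choices, and the directory is whatever holds your copy of INSTALLHD.

```ruby
require "fileutils"

# Create an empty save file with the exact name given, extension included.
# Hypothetical helper; point `dir` at your own INSTALLHD directory.
def create_empty_save(dir, name = "Save1.SAV")
  path = File.join(dir, name)
  FileUtils.touch(path) # creates a zero-byte file if it does not exist yet
  path
end

# Example call (the path is a placeholder):
# create_empty_save("C:/Users/me/Desktop/INSTALLHD")
```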

I’m playing on Windows 8.1 and I can’t finish mission 1 because of the random crashes. The guy who uploaded gameplay videos to YouTube (see below) says he’s using Windows 7, so I’ll open a Virtual Machine and try it someday. I’ll update this post accordingly.

Walkthrough

GeneralTobbe generously posted videos of each mission on YouTube. Look at the first one:

Some more information and next steps

Some folks reported being able to run correctly on a Windows 98 Virtual Machine, but I don’t have a copy of Win98 with me. If you have any other tips on running this game, please comment on this post!

I would really love to build an RTS game someday. If you are a game developer and want to chat about ideas regarding this genre, please contact me.

You can find more information about units on the Wiki. (yes, there is a Wiki!)

RuPy Campinas 2015

Last week I attended RuPy Campinas. I hadn’t been to a programming event in years, and it was fun to see some people again and meet others. :)

I’ll leave here links to the talks I attended.

Python e a Invasão dos Objetos Inteligentes (Python and the Invasion of the Intelligent Objects)

Author: João S. O. Bueno
Slides

JS worked with me for a few months and taught me a good deal of Python, so watching one of his talks was a lot of fun.

Of all the examples he showed, my favorite was Reactive Programming in Python, which you can see here. Essentially, it is the implementation of a spreadsheet engine in about 30 lines of code.
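To give an idea of how little code such an engine needs, here is a rough sketch in Ruby. It is not the code from the talk (which was in Python); the Sheet class and its methods are hypothetical. Each cell stores a formula, and reading a cell re-evaluates it, so dependent cells always see fresh values.

```ruby
# Hypothetical mini spreadsheet: every cell is a formula (a callable),
# evaluated lazily each time the cell is read.
class Sheet
  def initialize
    @formulas = {}
  end

  # Store either a constant or a block that computes the value from the sheet.
  def set(name, value = nil, &formula)
    @formulas[name] = formula || ->(_sheet) { value }
  end

  # Reading a cell evaluates its formula against the current sheet.
  def [](name)
    @formulas.fetch(name).call(self)
  end
end

sheet = Sheet.new
sheet.set(:a1, 2)
sheet.set(:a2, 3)
sheet.set(:a3) { |s| s[:a1] + s[:a2] }
puts sheet[:a3] # 5
sheet.set(:a1, 10)
puts sheet[:a3] # 13, because a3 is recomputed on read
```

Recomputing on every read is the simplest possible design; a real engine would cache values and track dependencies to avoid repeated work and detect cycles.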

Tunando seu código Ruby (Tuning your Ruby code)

Author: André Luis Anastácio
Slides

It was a very quick talk. The most interesting part was discovering the benchmark-ips gem. I missed some concrete examples, such as how to refactor code based on a few well-made benchmarks.

A evolução de uma arquitetura distribuída (The evolution of a distributed architecture)

Author: Guilherme Garnier
Slides

I heard again about some things I hadn’t heard about in a while, like the Circuit Breaker pattern. The story was similar to what I have seen at other companies: the initial monolith became hard to maintain and was broken into microservices. In the end, I realized I need to learn Docker as soon as possible. :-)
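For readers who haven't met the Circuit Breaker pattern: the idea is to stop calling a failing dependency for a while instead of hammering it. Below is a minimal sketch in Ruby, not taken from the talk; the class name, thresholds and the injectable clock are all my own assumptions.

```ruby
# Minimal circuit breaker sketch. Names and thresholds are illustrative.
class CircuitBreaker
  class OpenError < StandardError; end

  def initialize(max_failures: 3, reset_after: 30, clock: -> { Time.now.to_f })
    @max_failures = max_failures
    @reset_after = reset_after # seconds to wait before allowing a trial call
    @clock = clock             # injectable so the timing can be tested
    @failures = 0
    @opened_at = nil
  end

  # The circuit is open while the cool-down period has not elapsed.
  def open?
    !@opened_at.nil? && (@clock.call - @opened_at) < @reset_after
  end

  # Run the block unless the circuit is open; track failures and successes.
  def call
    raise OpenError, "circuit open, failing fast" if open?
    result = yield
    @failures = 0
    @opened_at = nil
    result
  rescue OpenError
    raise
  rescue StandardError
    @failures += 1
    @opened_at = @clock.call if @failures >= @max_failures
    raise
  end
end
```

After the cool-down, the next call is let through as a trial: if it succeeds the breaker closes again, and if it fails the breaker reopens for another cool-down period.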

Novas linguagens: o que vem depois do Ruby (New languages: what comes after Ruby)

Author: Fabio Akita
Slides

One of the two best talks of the day. The high point was the graph of programming languages that the author put together, showing which languages influenced others over time. You can see it in the repository:

github.com/akitaonrails/computer_languages_genealogy_graphs

He mentioned several curious languages, some of which I have had the pleasure of trying, like Ada and Prolog.

The most useful part of the presentation was when he talked about LLVM and how a bunch of languages are using it now, e.g. Swift. Despite the several positive points he mentioned, what stuck with me most was an immense desire to learn more programming languages… anyway, I think I’ll add Swift to the list of the ones I want to learn.

Girando Pratos: Concorrência com Futures em Python (Spinning Plates: Concurrency with Futures in Python)

Author: Luciano Ramalho
Slides

Another really good talk. It has been a while since I used anything in Python other than NumPy, SciPy or scikit-learn, but it made me want to play with concurrency (even though I need to finish Parallel and Concurrent Programming in Haskell first…).

The two libraries used in the presentation are threading and asyncio.

Hieroglyphics as types, whitespace as function names

I came across some curious Haskell tweets lately and decided to gather them in a single place.

These reminded me of a curious fact: did you know that there are other kinds of spaces in Unicode, like U+00A0, the no-break space? What about using it in Ruby? (please don’t)
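As a small illustration of the Ruby case (my own sketch, not one of the tweets), a method whose entire name is a no-break space can be defined with define_method and called with public_send. As far as I know, Ruby's parser also accepts non-ASCII characters in identifiers, so a literal def with that character should work too, but the version below avoids relying on that.

```ruby
# U+00A0 (no-break space) looks like whitespace but is a usable method name.
NBSP = "\u00A0"

class Greeter
  # define_method accepts any string, so the "invisible" name is easy to create.
  define_method(NBSP) { "hello from an invisible method" }
end

puts Greeter.new.public_send(NBSP)
puts Greeter.instance_methods(false).inspect # a single symbol that prints as a blank
```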

Whenever I see someone talking about non-ASCII characters in programming languages, I always get back to APL, an old language that used extremely concise notation similar to mathematics itself. Due to most keyboards being horrible, it never caught on. :-(

(Mental note: having some kind of LaTeX math symbols embedded into a language for scientific computing would be… interesting.)

Playing with Lua

I work for a mobile games company as a data scientist. I use Ruby for data wrangling and some sorts of analysis, Python for more specific things (essentially scikit-learn) and bash scripts for gluing everything together.

The developers use Corona for creating our games, which uses Lua. I decided to give that language a try.

Some facts:

  • Lua is tiny. As someone accustomed to Python and Ruby, I find it shocking to see such a small standard library. For example, this is the manual – there are only 158 Lua functions listed there.
  • The syntax is incredibly simple. Take a look at these diagrams; if you understand the Extended Backus-Naur Form, you can read Lua’s syntax quite easily. For comparison, Ruby’s syntax is complex enough that there are lots (and lots and lots) of small corner cases that I have probably never heard about, even after years of using it. Ah! And Ruby’s parse.y has 11.3k lines.
  • Lua was built with embedding in mind; it is used for interface customization in World of Warcraft, for example.
  • It is a Brazilian programming language! :-) Lua was created in 1993 in Rio de Janeiro, according to Wikipedia.

Random number generators

After finding so many interesting features about the language, I wrote some random number generators:

I decided to write RNGs after reading John D. Cook’s post about RNGs in Julia. :)
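The generators themselves aren't reproduced here. To show the flavor of the exercise, below is a sketch of the simplest classic one, a linear congruential generator. It is written in Ruby rather than Lua to stay consistent with the other snippets in these notes, and it is not the code from the original post; the constants are the widely quoted Numerical Recipes ones.

```ruby
# Linear congruential generator: x_{n+1} = (a * x_n + c) mod m.
# Hypothetical sketch; constants are the commonly cited Numerical Recipes values.
class LCG
  A = 1_664_525
  C = 1_013_904_223
  M = 2**32

  def initialize(seed)
    @state = seed % M
  end

  # Next raw integer in 0...M.
  def next_int
    @state = (A * @state + C) % M
  end

  # Next float in [0, 1).
  def next_float
    next_int.to_f / M
  end
end

rng = LCG.new(42)
puts Array.new(3) { rng.next_float }.inspect
```

LCGs are easy to write and fully reproducible from the seed, but their statistical quality is poor by modern standards, so this is a toy rather than something to use for simulations.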

 

The Data Package Format

At my last job, I worked with data from the Brazilian educational system in several situations. The details aren’t the important part; the format is: a giant denormalized CSV with an accompanying PDF detailing its fields. It is quite pleasant once you have worked with it for some time, but there are some things that could be better.

In that format, enumerations (fields with a fixed, finite set of values) are encoded as arbitrary integer ranges and boolean values as 0 or 1, along with other implementation details that are explained in the PDF. Thus far we have a cute CSV with documented fields. Nice, right?

Actually, yes.

I was quite happy with it for months, doing analyses and maintaining the internal libraries used to work with it. However, as soon as we started working with data from earlier years, things went awry. Not so obviously wrong, but the code started getting lots of little conditionals, including things like:
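The original snippet isn't shown here, so the following is only a hypothetical illustration of the kind of year-specific branching I mean; the column names and encodings are made up.

```ruby
# Hypothetical example of year-specific branching (column names are invented).
def gender(row, year)
  if year <= 2012
    row["TP_SEXO"] == "1" ? :female : :male
  else
    row["SEXO"] == "F" ? :female : :male
  end
end

puts gender({ "TP_SEXO" => "1" }, 2012)
puts gender({ "SEXO" => "M" }, 2014)
```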

And this is freaking ugly.

I thought I could improve the situation, for example by keeping a directory for each year with the dataset (the CSV file) and a JSON file describing the schema of the fields. The gains aren’t pronounced in this case; it is basically a translation of the PDF documentation into a machine-readable format.

We could also create a default schema such that each year’s data is mapped to it. This would move the complexity of the application to data pre-processing, which I prefer – that is one of the ugliest and most troublesome steps of data analysis anyway.
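A rough sketch of that idea in Ruby, assuming each year's schema is a plain data structure (it could just as well be loaded from a JSON file). All column names, years and encodings below are invented for illustration.

```ruby
# Hypothetical per-year schemas: each raw column maps to a field of the
# default schema plus a table that decodes its raw values.
SCHEMAS = {
  2012 => {
    "TP_SEXO"   => { field: :gender, values: { "0" => :male, "1" => :female } },
    "IN_PUBLIC" => { field: :public_school, values: { "0" => false, "1" => true } }
  },
  2014 => {
    "SEXO"    => { field: :gender, values: { "M" => :male, "F" => :female } },
    "PUBLICA" => { field: :public_school, values: { "0" => false, "1" => true } }
  }
}.freeze

# Translate one raw CSV row (a Hash of strings) into the default schema.
# fetch raises KeyError on unknown years, columns or values instead of guessing.
def normalize(row, year)
  SCHEMAS.fetch(year).each_with_object({}) do |(raw_name, spec), out|
    out[spec[:field]] = spec[:values].fetch(row.fetch(raw_name))
  end
end

puts normalize({ "TP_SEXO" => "1", "IN_PUBLIC" => "0" }, 2012).inspect
```

With this in place the analysis code only ever sees the default field names, and supporting a new year means adding one more schema entry rather than another conditional.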

The Data Package Format

Today I was organizing the output files of some internal tools I developed at my current job:

So a bunch of directories, each with various CSVs representing data for a country. For reasons that I can’t write here, I started thinking about how it would be awesome if I could write some sort of metadata file for those IDs.

This opens up some possibilities. The format of those CSV files changed a few times in the past 2 months, and some of my recent scripts can’t work with earlier versions of them. If I had a metadata file describing the precise schema of those files, I could abort any incompatible operation instead of receiving an error or, much worse, failing silently.

Thankfully, I found something that fills that niche: what is known as a Data Package. It is a bundle of data (which can be in any format: CSV, XLS, etc.) and a “descriptor” file called datapackage.json. Quite simple. The specification can be found here.

For the case I’m working with, i.e. lots of CSV files, there is an extension to the format called Tabular Data Package. Its specification can be found here.
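As a sketch of how such a descriptor could be used to abort incompatible operations early, the Ruby snippet below reads the field names declared under resources → schema → fields (the layout used by Tabular Data Package descriptors) and compares them with a CSV header. The package name, path and fields are invented, and the descriptor is inlined instead of being read from datapackage.json.

```ruby
require "json"

# A hypothetical descriptor, inlined here; normally it lives in datapackage.json.
DESCRIPTOR = JSON.parse(<<~EOS)
  {
    "name": "country-metrics",
    "resources": [
      {
        "path": "br/users.csv",
        "schema": {
          "fields": [
            { "name": "user_id", "type": "integer" },
            { "name": "signup_date", "type": "date" }
          ]
        }
      }
    ]
  }
EOS

# Field names the descriptor declares for a given resource path.
def expected_fields(descriptor, path)
  resource = descriptor.fetch("resources").find { |r| r["path"] == path }
  raise ArgumentError, "no resource for #{path}" unless resource
  resource.dig("schema", "fields").map { |f| f.fetch("name") }
end

# Abort early if the CSV header does not match the declared schema.
# Naive header parsing: assumes no quoted commas in the column names.
def check_header!(descriptor, path, csv_text)
  header = csv_text.lines.first.to_s.chomp.split(",")
  expected = expected_fields(descriptor, path)
  return true if header == expected
  raise ArgumentError, "#{path}: expected #{expected.inspect}, got #{header.inspect}"
end
```

A script would call check_header! before touching the data, so a file written in an older format fails loudly instead of silently producing wrong results.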

Another thing

These formats are defined using what is called JSON schema, which I hadn’t heard about before. The json-schema.org website shows some interesting examples.

Learning new programming languages

Programming languages are possibly one of the simplest parts of software engineering. You can know your language from the inside-out and still have problems in a project — knowing the tool doesn’t imply knowing the craft. But learning a new language is really a lot of fun.

Inspired by Avdi Grimm’s roadmap for learning new languages, I decided to give it a try and put my current interests in writing.

  • Julia – http://julialang.org/
    I have experience writing code in MATLAB, Octave, Python (with NumPy, SciPy and Pandas) and a bit of R, and still I’m excited about Julia. There are at least 3 features of Julia that are powerful and make me want to work with it: its Just-In-Time compiler, the parallel for and the awesome metaprogramming inherited from Lisp.

    The drawback is… is… well, I didn’t have time to really use it and get comfortable writing Julia programs. Yet.

  • Haskell – https://www.haskell.org/

    I already tried learning Haskell a couple of times. Maybe 3 or 4 or 5 times. I wrote programs based on mathematics and some simple scripts, most of the syntax isn’t strange anymore, even monads make sense now; however, I still feel a bit stiff when writing Haskell. I don’t know.

    Two books I recently bought might help with that – Real World Haskell and Parallel and Concurrent Programming in Haskell. I probably need to motivate myself to write something useful with it.

  • Rust – http://www.rust-lang.org/

    There is a quote on Rust’s website that sums up my expectations of it:

    Rust is a systems programming language that runs blazingly fast, prevents nearly all segfaults, and guarantees thread safety.

    I know how to read C/C++ and even write a bit of it, but it’s messy and takes more time than I usually have for side projects. Writing code that is safe & fast shouldn’t be so hard. ;)

All-in-all, this is a very brief list. However, I don’t think I should focus on more languages right now. To be honest, I think that my next learning targets are in applied mathematics. I need a stronger foundation in Partial Differential Equations and Probability Theory. There are several topics in optimization that I should take the time to study. Calculus of variations also seems quite cool.

(good thing that I have friends in pure math to help me find references!)

SciRuby projects for Google Summer of Code 2015

Another year with SciRuby accepted as a mentoring organization in Google Summer of Code (GSoC)! The Community Bonding Period ended yesterday; the coding period officially begins today.

I’m really happy with the projects chosen this year; they cover a variety of subjects and some would be really useful for me, e.g. Alexej’s LMM gem, Sameer’s Daru and Will’s changes to NMatrix.

That’s all. After the next GSoC meeting, I should write about how each of the projects is going.

Tools of the trade

Searching for your tools when you need to use them is bad organization.

Having a standard set of tools is a good thing. I have two toolboxes in my house, one for electronics and another for “hard” tools.

A voltmeter, a Raspberry Pi and an Arduino.

 

With that in mind, I decided to list the technologies I’m currently using at work. Some of them will be listed with a * to indicate that I’m still testing & learning about it.

  • Machine – 2013 MacBook Pro, OS X Yosemite.
  • Text editor – vim. I’ve been using it for a year and a half with no intention of switching over to another editor. My vimrc file is on GitHub.
  • Programming languages – Ruby for data cleaning and other pre-processing tasks & Python for building models and preparing results for presentations. I’m working towards using only Ruby, but IRuby, Nyaplot and Daru still need some work before that is possible.
  • Pry is much, much better than the default IRB console. Being able to look into an object’s context anywhere is underrated; you only notice how powerful it is after spending a few minutes finding a bug that would otherwise have taken 1 or 2 hours. Besides, pry-byebug gives you a decent debugger, with breakpoints, next and continue.
  • Libraries
    • SmarterCSV is quite good for handling CSV files. It has features for reading batches of rows, so bigger files are fine. Its interface is really simple, so I tend to investigate new datasets via irb -r smarter_csv. For simpler operations, like projections or joins, I prefer csvkit (as a matter of fact, implementing csvkit in Ruby with SmarterCSV should be a piece of cake).
    • Nyaplot [*] is a great plotting library when used with IRuby. It is very easy to generate interactive plots and there is even an extension for plotting on top of maps, called Mapnya.
    • Pandas for joining and grouping data in notebooks. There is a similar library in Ruby called Daru [*] that I still haven’t had the chance to try.
    • Scikit-learn for building classifiers and doing cross validation easily.
    • Matplotlib for plotting when in Python land. There are some niceties like allowing LaTeX in titles and labels and using subplots and axes.
    • Jupyter notebooks are amazing for presenting analysis and results. One of the SciRuby projects is the IRuby notebook, by Daniel Mendler, which brings the same facilities available in IPython to Ruby.
  • GNU Parallel – This is probably the single most useful tool in the list right now. I’m not dealing with large datasets; the largest are a few GBs in size. Instead of booting a Hadoop cluster on AWS, I write my “MapReduce” pipeline with a few scripts and calls to Parallel.
  • Julia Language [*] – I wrote a few number crunching scripts so far, but there’s a lot of potential in Julia. I hope to have something cool to show in some weeks.

And that’s it.

Deep copying objects in Ruby

Time and time again I forget that Object#clone in Ruby makes a shallow copy, get bitten by it, and spend 30 seconds staring at the screen asking myself what the hell happened. The only difference today is that I finally decided to post about it on my blog; let’s hope this time is the last.

Well, what is a deep copy?

In C++ there is the concept of a copy constructor, which is used when an object is initialized as a copy of an existing object. In many situations this can be deduced by a compiler and you don’t have to worry. If your object contains pointers to things that can’t be shared, however, you have to provide what is called a user-defined copy constructor:

A user-defined copy constructor is generally needed when an object owns pointers or non-shareable references, such as to a file […]

— Wikipedia on Copy Constructor

In Ruby, variables have references to objects. If you want a clone of that variable (e.g. an Array), you can simply do:
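The snippet that belongs here isn't reproduced, so this is a minimal sketch of what it presumably looked like: cloning a flat array of numbers and then changing the copy.

```ruby
a = [1, 2, 3]
b = a.clone   # a new Array object holding the same elements
b[0] = 99     # replaces a slot in b only

puts a.inspect # [1, 2, 3]
puts b.inspect # [99, 2, 3]
```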

This works because the elements are numbers, which are immutable: the copy shares them with the original, but there is nothing in them that can be changed in place (small Integers, and most Floats on 64-bit builds, are stored directly in the reference rather than as separate objects). But if the array holds other arrays or hashes instead of numbers, things start to break. The copy doesn’t hold new objects, only references to the same ones, thus when you do:
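Again the original snippet is missing; the following sketch shows the failure mode with a nested array, where the clone and the original end up sharing the inner objects.

```ruby
a = [[1, 2], [3, 4]]
b = a.clone   # copies the outer Array, not the inner ones
b[0] << 99    # mutates the inner Array that a[0] also points to

puts a.inspect           # [[1, 2, 99], [3, 4]] -- the original changed too
puts a[0].equal?(b[0])   # true: both slots reference the same object
```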

This kind of bug can be hard to understand when first encountered, so it’s definitely a good thing to have in mind.

The answer

I was in the middle of implementing what I just explained when I noticed I was reinventing the wheel. Turning to Stack Overflow, I found an answer similar to what I was doing and another, simpler, more interesting and applicable to my specific situation:
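The answer isn't quoted here, but it is presumably the well-known Marshal round trip. A minimal sketch (the deep_copy helper name is mine):

```ruby
# Deep copy by serializing to bytes and back; every nested object is rebuilt.
def deep_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

a = [[1, 2], { name: "x" }]
b = deep_copy(a)
b[0] << 99
b[1][:name] << "y"

puts a.inspect # [[1, 2], {name: "x"}] in content; the original is untouched
puts b.inspect
```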

Duh. The Marshal library is a tool for storing objects as bytes outside of the program for later use. I’ve never had to use it before, as the only apparent use case I have (storing trained statistical classifiers) can be achieved more robustly by saving parameters in a JSON.

But I digress. By storing the object’s data as a byte stream and reconstructing the same object afterwards, you create new copies of each of the constituent objects.

However, there are two problems with this approach:

  • Some objects can’t be marshalled. You’ll need to implement marshalling logic yourself, which kind of defeats the purpose of using this technique: why not implement deep copying instead?
  • It is slow. In some cases this doesn’t matter. I was building a small simulation in which I copied an Array with fewer than 100 Hashes at each iteration and there were fewer than 2000 time steps in total, which added maybe a few extra seconds. But for larger scripts this can be problematic.
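Both limitations can be seen in a few lines. The sketch below first shows Marshal refusing a Proc, then a hand-written alternative that only understands Arrays and Hashes and falls back to dup for everything else. The deep_dup helper is hypothetical (it is not a core Ruby method) and assumes a Ruby version where dup on numbers, symbols and nil simply returns the object.

```ruby
# Procs (and IO objects, among others) cannot be marshalled.
begin
  Marshal.dump(-> { 42 })
rescue TypeError => e
  puts "cannot marshal: #{e.message}"
end

# Hand-written alternative: recurse into Arrays and Hashes, dup the rest.
# It ignores things like Hash default procs and objects with nested state.
def deep_dup(obj)
  case obj
  when Array then obj.map { |v| deep_dup(v) }
  when Hash  then obj.each_with_object({}) { |(k, v), h| h[deep_dup(k)] = deep_dup(v) }
  else obj.dup
  end
end

original = [{ "k" => [1, 2] }, -> { 42 }]
copy = deep_dup(original)
copy[0]["k"] << 3
puts original[0]["k"].inspect # [1, 2]
```

A version like this avoids the serialization overhead and copes with Procs, at the cost of having to extend it by hand for every custom class whose instances hold nested mutable state.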

The second point could be solved by thinking the problem through, but I had 30 minutes to come up with an argument for a point I was about to make in a meeting. I sure hope I never have to do this again (famous last words…)

Books read in the first quarter of 2015

Covers of the books read this quarter

In 2014, I wrote a list of the books read at the time. This year, I’ll collect the books and papers each quarter.

A curious note: Nassim Nicholas Taleb (author of The Black Swan) wrote on Facebook about the paper on eusociality (think of insect colonies, like those of ants and bees). The authors showed that kin selection theory is unnecessary, given that traditional natural selection theory can account for eusociality, and Richard Dawkins attacked them without any mention of their mathematical models. I’m no biologist, but the text is very accessible (the math too, if you’re familiar with stochastic processes).

Papers

  • AI Planning: Systems and Techniques. James Hendler et al. PDF
  • O-Plan: a Common Lisp Planning Web Service. A. Tate and J. Dalton. PDF
  • EduRank: A Collaborative Filtering Approach to Personalization in E-learning. Avi Segal et al. PDF
  • Discovering Gender-Specific Knowledge from Finnish Basic Education using PISA Scale Indices. M. Saarela and T. Kärkkäinen. PDF
  • The Unified Logging Infrastructure for Data Analytics at Twitter. George Lee et al. PDF
  • The Evolution of Eusociality. Martin A. Nowak et al. Read online. Supplementary material.

Mangas

5 Centimeters per Second manga cover

It has been a long time since I read any manga, but 5 Centimeters per Second is a masterpiece. Read more about it on MyAnimeList.

The most socially useful communication technology

Text is the most socially useful communication technology. It works well in 1:1, 1:N, and M:N modes. It can be indexed and searched efficiently, even by hand. It can be translated. It can be produced and consumed at variable speeds. It is asynchronous. It can be compared, diffed, clustered, corrected, summarized and filtered algorithmically. It permits multiparty editing. It permits branching conversations, lurking, annotation, quoting, reviewing, summarizing, structured responses, exegesis, even fan fic.

I read the post “Always bet on text” today and, I must say, it is a beautiful way to look at the process of communicating by writing. :)

Excel trying to take over the world

Intuitively, it is not just the limited capability of ordinary software that makes it safe: it is also its lack of ambition. There is no subroutine in Excel that secretly wants to take over the world if only it were smart enough to find a way.
— Nick Bostrom, Superintelligence

I wouldn’t be so certain about it.

There are “scientists” (economists) who think it is OK to use Excel for making predictions that affect several people, as you can see from this article in The Guardian. Essentially, they didn’t add four years of data from New Zealand to a spreadsheet. Other methodological factors were in effect as well. And all of this contributed to lots of people losing their jobs in various countries when the recommended austerity measures were put in place. Imagine if Excel wanted to take over the world.

The paper that discusses Reinhart & Rogoff’s mistake in depth is “Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff”.

Completeness and incomputability

It is notable that completeness and incomputability are complementary properties: It is easy to prove that any complete prediction method must be incomputable. Moreover, any computable prediction method cannot be complete — there will always be a large space of regularities for which the predictions are catastrophically poor.

— Ray Solomonoff, “Algorithmic Probability — Its Discovery — Its Properties and Application to Strong AI”

This quote is a paragraph from the book Randomness Through Computation, an amazing work I was reading this morning.

The idea that any computable prediction method can’t be complete is profound for those of us that work with machine learning; it implies we always have to deal with trade-offs. Explicitly considering this makes for a better thought process when designing applications.

References

  1. Ray Solomonoff — Wikipedia.
  2. Solomonoff’s Lightsaber — Wikipedia, LessWrong

Books so far in 2014

I have a lot of books.

I’ve finally decided to organize my collection and keep track of what I read. In this post, I’ll list the books I have read since January, or at least an approximation based on the email confirmations of the ebooks I bought, my memory and the ones on my bookshelf. I also divided them into sections. Papers are included as well.
