Books read during May-August 2015.

Books read in the second quadrimester of 2015

Following my first post on the books I read in the first quadrimester (Jan-Apr), here’s the list for May-Aug. There’s also a post on the books I read during 2014.

I didn’t read as much as I wanted… but being able to measure how much I’m reading is nice, and I’m coming up with ways to improve on that, e.g. don’t read a book on Haskell and a treatise on Memory written by a historian in parallel.

(I’ve recently discovered that quadrimester is sometimes used as a synonym for quarter in English. Well. In 2016 I might try breaking it into Q1, Q2, Q3, and Q4.)

Beginning studies on deep learning

tl;dr: building features manually is inefficient and extracting them automatically is possible and, in a sense, better.

I’m not writing a lot here these days, so let me summarize: I left my job recently to start working on my Master’s at the Universidade de São Paulo (USP). I’m studying an area of Machine Learning called Deep Learning (or Representation Learning, the term I prefer) for my research. This post is an overview of what I’ve read so far about the subject.

(By the way, it is practically impossible to enter a PhD program in Brazil without an MSc.)

Notes on To Engineer is Human

To Engineer Is Human (see on Amazon or on Goodreads) is a 1992 book about how knowledge of past design failures is useful in current projects, strongly biased toward civil engineering. Some examples are repetitive and not studied in depth, which makes the book a bit dull at times. Even so, I found some very interesting reflections and quotes that I wanted to comment on or simply organize for later review.

Rising Lands introduction

Playing Rising Lands on a modern computer

UPDATE: According to the comments of HellRazor, it is possible to play Rising Lands successfully with DOSBox. I added a new section explaining the method.

After reading the StarCraft II novels Heaven’s Devils and Flashpoint, I got back to playing StarCraft II. I’m an awful player, but remembering the original StarCraft (and its expansion set, Brood War) made me think about games that impressed me when I was young. One of them was Rising Lands, an RTS set in a post-apocalyptic future in which Earth was devastated by a meteor collision and most technological knowledge was lost. (Yes, cheesy, but trust me on this – it is good.)

Hieroglyphics as types, whitespace as function names

I came across some curious Haskell tweets lately and decided to gather them in a single place.

These reminded me of a curious fact: did you know that there are other types of spaces in Unicode, like U+00A0, the no-break space? What about using it in Ruby? (Please don’t.)
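To make the point about Unicode spaces concrete, here is a minimal sketch in Python (not Ruby, and not the trick from the tweets). It uses only the standard unicodedata module; the sample list and the describe helper are my own choices for illustration, and the list is far from exhaustive.

```python
import unicodedata

# A small sample of Unicode space characters (not an exhaustive list).
SPACES = ["\u0020", "\u00a0", "\u2003", "\u3000"]


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch)


for ch in SPACES:
    print(*describe(ch))

# The no-break space counts as whitespace, so str.split() treats it
# as a separator even though it looks like an ordinary space.
words = "no\u00a0break".split()
print(words)
```

All four characters fall in the "Zs" (space separator) category, which is why they are so easy to confuse on screen.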

Whenever I see someone talking about non-ASCII characters in programming languages, I always come back to APL, an old language that used an extremely concise notation similar to mathematics itself. Because most keyboards are horrible, it never caught on. :-(

(Mental note: having some kind of LaTeX math symbols embedded into a language for scientific computing would be… interesting.)

Playing with Lua

I work for a mobile games company as a data scientist. I use Ruby for data wrangling and some sorts of analysis, Python for more specific things (essentially scikit-learn) and bash scripts for gluing everything together.

The developers use Corona for creating our games, which uses Lua. I decided to give that language a try.

Some facts:

  • Lua is tiny. As someone accustomed to Python and Ruby, it is shocking to see such a small standard library. For example, this is the manual – there are only 158 Lua functions listed there.
  • The syntax is incredibly simple. Take a look at these diagrams; if you understand the Extended Backus-Naur Form, you can read Lua’s syntax quite easily. For comparison, Ruby’s syntax is complex enough that there are lots (and lots and lots) of small corner cases that I have probably never heard about, even after years of using it. Ah! And Ruby’s parse.y has 11.3k lines.
  • Lua was built with embedding in mind; it is used for interface customization in World of Warcraft, for example.
  • It is a Brazilian programming language! :-) Lua was created in 1993 in Rio de Janeiro, according to Wikipedia.

The Data Package Format

At my last job, I worked with data from the Brazilian educational system in several situations. The details aren’t the important part; the format is: a giant denormalized CSV with an accompanying PDF detailing its fields. It is very nice after you work with it for some time, but there are some things that could be better.

In that format, enumerations (fields with a fixed, finite set of values) are encoded as some arbitrary integer range and boolean values as 0 or 1, with these and other implementation details explained in the PDF. Thus far we have a cute CSV with documented fields. Nice, right?

Actually, yes.
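As a rough illustration of what working with such a file involves, here is a minimal Python sketch that decodes integer-coded enumerations and 0/1 booleans using a codebook. Everything specific in it is hypothetical: the field names (school_type, has_library), the code-to-label mapping, and the sample rows are invented for the example and do not come from the real dataset or its PDF.

```python
import csv
import io

# Hypothetical codebook, standing in for what the PDF would document.
CODEBOOK = {
    "school_type": {"1": "federal", "2": "state", "3": "municipal", "4": "private"},
}
# Hypothetical fields stored as 0 or 1.
BOOLEAN_FIELDS = {"has_library"}


def decode_row(row):
    """Replace coded values in one CSV row with readable labels and booleans."""
    decoded = dict(row)
    for field, labels in CODEBOOK.items():
        decoded[field] = labels[row[field]]
    for field in BOOLEAN_FIELDS:
        decoded[field] = row[field] == "1"
    return decoded


# Invented sample data in the same shape as the format described above.
SAMPLE = "school_id,school_type,has_library\n101,2,1\n102,4,0\n"

rows = [decode_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for r in rows:
    print(r)
```

The catch is that the codebook has to be typed in by hand from the PDF, since nothing in the CSV itself describes its fields.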

Learning new programming languages

Programming languages are possibly one of the simplest parts of software engineering. You can know your language inside out and still have problems in a project: knowing the tool doesn’t imply knowing the craft. But learning a new language is really a lot of fun.

Inspired by Avdi Grimm’s roadmap for learning new languages, I decided to give it a try and put my current interests in writing.

  • Julia –

    I have experience writing code in MATLAB, Octave, Python (with NumPy, SciPy and pandas) and a bit of R, and still I’m excited about Julia. There are at least 3 features of Julia that are powerful and make me wish to work with it: its Just-In-Time compiler, parallel for and the awesome metaprogramming inherited from Lisp.

    The drawback is… is… well, I didn’t have time to really use it and get comfortable writing Julia programs. Yet.

  • Haskell –

    I have already tried learning Haskell a couple of times. Maybe 3 or 4 or 5 times. I wrote programs based on mathematics and some simple scripts. Most of the syntax isn’t strange anymore, and even monads make sense now; however, I still feel a bit stiff when writing Haskell. I don’t know.

    Two books I recently bought might help with that – Real World Haskell and Parallel and Concurrent Programming in Haskell. I probably need to motivate myself to write something useful with it.

  • Rust –

    There is a quote on Rust’s website that sums up my expectations of it:

    Rust is a systems programming language that runs blazingly fast, prevents nearly all segfaults, and guarantees thread safety.

    I know how to read C/C++ and even write a bit of it, but it’s messy and takes more time than I usually have for side projects. Writing code that is safe & fast shouldn’t be so hard. ;)

All in all, this is a very brief list. However, I don’t think I should focus on more languages right now. To be honest, I think that my next learning targets are in applied mathematics. I need a stronger foundation in Partial Differential Equations and Probability Theory. There are several topics in optimization that I should take the time to study. Calculus of variations also seems quite cool.

(good thing that I have friends in pure math to help me find references!)

SciRuby projects for Google Summer of Code 2015

Another year with SciRuby accepted as a mentoring organization in Google Summer of Code (GSoC)! The Community Bonding Period ended yesterday; the coding period officially begins today.

I’m really happy with the projects chosen this year; they cover a variety of subjects, and some would be really useful for me, namely Alexej’s LMM gem, Sameer’s Daru and Will’s changes to NMatrix.

That’s all. After the next GSoC meeting, I should write about how each of the projects is going.

The most socially useful communication technology

Text is the most socially useful communication technology. It works well in 1:1, 1:N, and M:N modes. It can be indexed and searched efficiently, even by hand. It can be translated. It can be produced and consumed at variable speeds. It is asynchronous. It can be compared, diffed, clustered, corrected, summarized and filtered algorithmically. It permits multiparty editing. It permits branching conversations, lurking, annotation, quoting, reviewing, summarizing, structured responses, exegesis, even fan fic.

I read the post “Always bet on text” today and, I must say, it is a beautiful way to look at the process of communicating by writing. :)

Excel trying to take over the world

Intuitively, it is not just the limited capability of ordinary software that makes it safe: it is also its lack of ambition. There is no subroutine in Excel that secretly wants to take over the world if only it were smart enough to find a way.
— Nick Bostrom, Superintelligence

I wouldn’t be so certain about it.

There are “scientists” (economists) who think it is OK to use Excel for making predictions that affect several people, as you can see from this article in The Guardian. Essentially, they didn’t add four years of data from New Zealand to a spreadsheet. Other methodological factors were in effect as well. And all of this contributed to lots of people losing their jobs in various countries when the recommended austerity measures were put in place. Imagine if Excel wanted to take over the world.

The paper that discusses Reinhart & Rogoff’s mistake in depth is “Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff”.