# Updates on NMatrix and SciRuby development

For the last couple of days, I’ve been thinking about what I wrote two weeks ago regarding SciRuby and the whole Ruby scientific computing scene. I still believe that the sciruby gem can be used as an integrated environment, but there are some problems that must be solved first:

1. We need a reasonably feature complete and easy to install version of NMatrix.
2. A good plotting tool. Right now, Naoki is working on this as part of GSoC 2014.
3. Statistics. Lots of things are already implemented in Statsample, but both Statsample::DataFrame and Statsample::Vector should use NMatrix under the hood. Supporting JRuby can be problematic here…
4. Given 1 and 2, it’s possible to implement a lot of other interesting and useful things. For example: linear regression methods, k-means clustering, neural networks, and using NMatrix as a matrix type for OpenCV images. There are lots of possibilities.
5. Minimization, integration and others.

With that in mind, my objective for the following weeks is to improve NMatrix. First, there are BLAS routines (mainly from level 2, but some from levels 1 and 3 as well) that aren’t implemented in NMatrix and/or that aren’t available for the rational and Ruby object dtypes. There’s also LAPACK.

Another benefit of having complete C/C++ implementations is that we’ll eventually be able to generalize these interfaces to allow other implementations (e.g. Mac OS X vecLib’s LAPACK, Intel’s MKL), thus making it much easier to install NMatrix. As Collin (and, I think, Pjotr) said on the sciruby-dev mailing list, it should be as easy as gem install nmatrix.

## BLAS and LAPACK general implementations

• HAVE_CBLAS_H is now derived from mkmf’s have_header
• Many more routines are implemented. Ideally, BLAS levels 1 and 2 should be complete by the end of May.

An important next step is to be able to link against arbitrary BLAS and LAPACK implementations, given that they obey the standard. Issue #188 started some ideas; issue #22 is the original (and very old) one.

## After that…

When NMatrix supports both BLAS and LAPACK without problems — i.e. has its own implementations and can also link against arbitrary ones (OS X’s vecLib, GSL, ATLAS, Intel’s MKL, AMD’s Core Math Library) — we’ll be able to build on top of it. Some routines in NMatrix already work with every dtype, but most don’t. When we know exactly which routines can’t work with which dtypes, we’ll be in a very good position to talk about what we support.

For example: we have determinants for rational matrices, but not some other operations, and so on. What else? Stypes! We also need good support for Yale matrices. (Aside: maybe add an “old Yale” format?)

There isn’t much to decide here: we have to support the whole BLAS/LAPACK standard — almost everything linear algebra-wise is in there. After that, it’s mostly improvements to the interface: better method naming, better documentation and examples, better IO, etc.

Another point that would be good to address is removing the dependency on g++ >= 4.6. We should strive to remove everything that depends on C++11 features, thus allowing normal Mac OS X users to install NMatrix without having to first install another compiler.

## Better documentation

We need to refactor our documentation. Oh, how we need to!

First, remove everything that shouldn’t be in the public-facing API — the classes and modules used internally by NMatrix::IO shouldn’t be available publicly anyway, only the outside-facing parts: how to save and load to/from each format. There are probably more cases like this.

Second, do a better job of being consistent with docs. Some methods lack a return type or similar information, and lots of methods in the C/C++ code aren’t documented at all. We can do better!

Finally, we need a really good documentation template. Fivefish is a good choice — it provides a very pretty, searchable and clean interface. (I’ll create NMatrix’s docs with it, host them on my own server and see what happens.)

# Solving linear systems in NMatrix

I’m writing some guides for NMatrix, so in the following weeks there should be some posts similar to this one, but more complex.

Linear systems are one of the most useful tools from “common algebra”. Various problems can be represented by them: systems of linear ODEs, operations research optimizations, linear electrical circuits and a lot of the word problems from basic algebra. We can represent these systems as

$Ax = b$

where $A$ is the matrix of coefficients, $x$ the vector of unknowns and $b$ a vector representing the other side of the equations.
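As a minimal illustration of solving such a system, here is Ruby’s standard-library Matrix (NMatrix’s API is different, but the idea is the same): factor $A$ and solve for $x$.

```ruby
require "matrix"

# Solve Ax = b for x via an LUP decomposition:
#   2x +  y = 3
#    x + 3y = 5
a = Matrix[[2, 1], [1, 3]]
b = [3, 5]

x = a.lup.solve(b)
# x == Vector[4/5, 7/5], i.e. x = 0.8, y = 1.4
```

With integer entries, the stdlib keeps the result in exact Rationals; a floating-point BLAS/LAPACK backend would return floats instead.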

# What are BLAS and LAPACK

At the beginning, the names BLAS, LAPACK and ATLAS confused me — imagine a young programmer, without formal training, trying to understand what a “de facto application programming interface standard” is, with lots of strangely named functions and references to the ancient FORTRAN language.

As of now, I think my understanding is sufficient to write about them.

BLAS (Basic Linear Algebra Subprograms) is a standard that provides 3 levels of functions for different kinds of linear algebra operations. Consider $\alpha$ and $\beta$ as scalars, $x$ and $y$ as vectors and $A$, $B$ and $T$ (triangular) as matrices. The levels are divided in the following way:

1. Scalar and vector operations of the form $y = \alpha * x + y$, dot product and vector norms.
2. Matrix-vector operations of the form $y = \alpha * A * x + \beta * y$ and solving $T * x = y$.
3. Matrix-matrix operations of the form $C = \alpha * A * B + \beta * C$ and solving $B = \alpha * T^{-1} * B$. GEMM (GEneral Matrix Multiply) is contained in this level.
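To make the three levels concrete, here are plain-Ruby reference versions of one representative routine per level — nothing like the optimized BLAS, just the math each level covers:

```ruby
# Level 1: AXPY — y = alpha*x + y (vectors as plain arrays)
def axpy(alpha, x, y)
  x.zip(y).map { |xi, yi| alpha * xi + yi }
end

# Level 2: GEMV — y = alpha*A*x + beta*y (A as an array of rows)
def gemv(alpha, a, x, beta, y)
  a.each_with_index.map do |row, i|
    alpha * row.zip(x).sum { |aij, xj| aij * xj } + beta * y[i]
  end
end

# Level 3: GEMM — C = alpha*A*B + beta*C
def gemm(alpha, a, b, beta, c)
  bt = b.transpose
  a.each_with_index.map do |row, i|
    bt.each_with_index.map do |col, j|
      alpha * row.zip(col).sum { |aik, bkj| aik * bkj } + beta * c[i][j]
    end
  end
end
```

The real implementations differ wildly in how they order and block these loops, but the signatures (alpha, beta, and the operands) map directly onto the level descriptions above.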

There are several functions in LAPACK (Linear Algebra PACKage), from solving linear systems to eigenvalues and factorizations. It’s much better to take a look at its documentation when you’re looking for something specific.

## A bit of history

BLAS was first published in 1979, as can be seen in this paper. An interesting part of it is the section named Reasons for Developing the Package:

1. It can serve as a conceptual aid in both the design and coding stages of a programming effort to regard an operation such as the dot product as a basic building block.

2. It improves the self-documenting quality of code to identify an operation such as the dot product by a unique mnemonic name.

3. Since a significant amount of the execution time in complicated linear algebraic programs may be spent in a few low level operations, a reduction of the execution time spent in these operations may be reflected in cost savings in the running of programs. Assembly language coded subprograms for these operations provide such savings on some computers.

4. The programming of some of these low level operations involves algorithmic and implementation subtleties that are likely to be ignored in the typical applications programming environment. For example, the subprograms provided for the modified Givens transformation incorporate control of the scaling terms, which otherwise can drift monotonically toward underflow.

So it seems we still use BLAS for the reasons it was created. The paper’s a pretty good read if you have the time. (And if you don’t know what a Givens transformation is, read this.)

LAPACK was first published in 1992, as can be seen in the release history. By reading the LAWNs (LAPACK Working Notes), we can get a pretty good picture of its beginning, e.g. papers that presented techniques which were later added to it and installation notes (with sayings of the sort “[…] by sending the authors a hard copy of the output files or by returning the distribution tape with the output files stored on it”).

## Implementations

There are various implementations of the BLAS API, e.g. by Intel, AMD, Apple and the GNU Scientific Library. The one supported by NMatrix is ATLAS (Automatically Tuned Linear Algebra Software), a very cool project that uses a lot of heuristics to determine optimal compilation parameters, maximizing the performance of its BLAS and LAPACK implementations.

As for LAPACK, its original goal was “to make the widely used EISPACK and LINPACK libraries run efficiently on shared-memory vector and parallel processors” (source). Simply put, it’s a library for speeding up various matrix-related routines by taking advantage of each architecture’s memory hierarchy. The trick is that it uses block algorithms for dealing with matrices instead of an element-by-element approach. This way, less time is spent moving data around. It’s written in Fortran 90.
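A toy sketch of the block idea: multiply matrices by iterating over small blocks instead of single elements, so each block of $A$ and $B$ is reused while it is likely still in cache (the real LAPACK/BLAS kernels are far more sophisticated, but the loop structure is the point):

```ruby
# Blocked n x n matrix multiply over bs x bs blocks
# (matrices as arrays of row arrays).
def blocked_matmul(a, b, bs = 2)
  n = a.size
  c = Array.new(n) { Array.new(n, 0) }
  (0...n).step(bs) do |ii|
    (0...n).step(bs) do |jj|
      (0...n).step(bs) do |kk|
        # multiply the (ii, kk) block of A by the (kk, jj) block of B
        (ii...[ii + bs, n].min).each do |i|
          (jj...[jj + bs, n].min).each do |j|
            (kk...[kk + bs, n].min).each do |k|
              c[i][j] += a[i][k] * b[k][j]
            end
          end
        end
      end
    end
  end
  c
end
```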

Another important point regarding LAPACK is that it requires a good BLAS implementation — it assumes one is already available for the system at hand — and it calls the level 3 operations as much as possible.

## Function naming conventions

One of the strangest things about BLAS and LAPACK is how their functions are named. In LAPACK, a subroutine name is of the form pmmaaa, where:

• p is the type of the numbers used, e.g. S for single-precision floating-point and Z for double-precision complex.
• mm is the kind of matrix used in the algorithm, e.g. GE for GEneral matrices, SY for SYmmetric and TB for Triangular Band.
• aaa is the algorithm implemented by the subroutine, e.g. QRF for QR factorization, TRS for solving a linear system using an existing factorization.
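To make this concrete, here is a toy decoder for that scheme (the lookup tables below cover only a handful of the many prefixes and suffixes LAPACK actually defines):

```ruby
# Toy decoder for the LAPACK pmmaaa naming scheme.
TYPES = {
  "S" => "single-precision real",
  "D" => "double-precision real",
  "C" => "single-precision complex",
  "Z" => "double-precision complex"
}
MATRICES = {
  "GE" => "general",
  "SY" => "symmetric",
  "TB" => "triangular band"
}
ALGORITHMS = {
  "QRF" => "QR factorization",
  "TRF" => "triangular (LU/Cholesky) factorization",
  "TRS" => "solve a linear system using an existing factorization"
}

def decode(name)
  p, mm, aaa = name[0], name[1, 2], name[3..-1]
  [TYPES[p], MATRICES[mm], ALGORITHMS[aaa]].join(" / ")
end

decode("DGETRF")
# => "double-precision real / general / triangular (LU/Cholesky) factorization"
```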

BLAS functions are named as <character><name><mod>, which, although similar to LAPACK’s scheme, differs depending on the level. In level 1, <name> is the operation type, while in levels 2 and 3 it’s the type of the matrix argument. For each level, there are specific values that <mod> (if present) can take, each adding information about the operation. <character> indicates the data type regardless of the level.

These arcane names stem from the fact that FORTRAN identifiers were limited to 6 characters in length. Fortran 90 lifted this limit to 31 characters, but the names used in BLAS and LAPACK remain to this day.

## Use in NMatrix

NMatrix has bindings to both BLAS and LAPACK.

If you want to take a look at the low-level bindings, grab some coffee and read the ext/nmatrix/math/ directory. Since 8f129f, it has been greatly simplified and can actually be understood.

## References

Below you can find a list of the main resources used in this post.

# Ruby in scientific computing

> I had a dream, which was not all a dream.
> The bright sun was extinguish’d, and the stars
> Did wander darkling in the eternal space,
> Rayless, and pathless, and the icy earth
> Swung blind and blackening in the moonless air;
>
> (*Darkness*, George Gordon Byron)

My dream wasn’t as scary as the one Lord Byron had in the 19th century; I simply imagined Ruby being used more in scientific computing.

SciRuby was accepted as a mentoring organization for Google Summer of Code 2013. This is an opportunity for Rubyists all over the world to see that there are big guys interested in this subject — remember the grant from the Ruby Association last year.

Our mailing list and IRC channel have been receiving lots of attention from people with project ideas and other suggestions, which is great! I’ll wait some more time to write about what I envision for the future, but for now I want to talk about what I’d like to work on for the next couple of months.

### Sciruby::Dataframe

Many Ruby gems have ad-hoc implementations of the data frame concept as it exists in the R language. Some examples are:

This situation is obviously inefficient. This was already discussed in SciRuby’s mailing list: we need to create a library to be used in data-heavy projects with NMatrix at its core. Pandas is a great Python example of what I want to build.

One of the GSoC projects is based on designing and implementing this, but, unfortunately, no one has demonstrated interest in it yet. As it’s very important (IMO), I’ll probably start it anyway.

### NMatrix

There are various points that need improvement in NMatrix. Documentation, rational operations, better algorithms for non-BLAS dtypes, some bugfixes, an easier installation procedure, &c.

I did a lot of work on documentation during my fellowship and some rational operations (determinants and matrix inversion) are partly working. There are some students already asking me about it, so I expect to see lots of progress on it during GSoC.

### General documentation and user guides

In my opinion, the nastiest problem in the Ruby community is the idea that “code is documentation”. This is pure bullshit. Thanks to the language’s elegance, some developers say that “you should read the code” or simply write a wiki page showing how to get started.

Of course, if you’re working on maintaining a library, it makes sense to say that code is documentation (to some extent). Not so if you’re a user pulling your hair out trying to understand why something obvious is failing.

Thus, I’ll continue to improve NMatrix documentation. Two of my goals are to create a good RDoc template for SciRuby in general, probably based on Rails’, and “SciRuby Guides”, inspired by RailsGuides.

Remember: code isn’t documentation.

By the way, if you’re interested, check SciRuby’s project ideas page or #sciruby on freenode. We need mentors for GSoC, but any help — documentation, filing tickets or writing user stories — is appreciated.

Let’s hope that my dream was in fact a premonition.

# Creating a new NMatrix

As I’ve been working on NMatrix documentation lately, I thought a good explanation of how to create a new NMatrix object, with all the available options, would be useful.

Let’s start with the simplest thing possible: creating an NMatrix from an array of values, without any options:

# How to use NMatrix’s shortcuts

John Woods merged my last pull request to NMatrix recently and I wanted to write about why we created these shortcuts and what can be done with them.

UPDATE: This post is outdated. I’m writing a guide on NMatrix and will probably reuse some parts of this post. I’ll update with a link eventually.