Friday, September 26, 2008

How to get rid of dot/slash when running executables on Linux

... and why you shouldn't be doing this unless it is absolutely necessary.

But anyway, sometimes there is this old/legacy program that tries to execute another program inside the same directory without the dot/slash, and we don't have access to its source code, or we don't want to touch its source code (for reasons apparent to any CS person).

Here: http://www.linfo.org/dot_slash.html
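
The short version, in case the link ever dies: put "." on the PATH for just that one invocation, so the programs the legacy binary spawns are found without the dot/slash. Something like this (legacy_app is a hypothetical name, substitute the real binary):

PATH=.:$PATH ./legacy_app

Prepending "." to the PATH is exactly what the dot/slash convention protects you against, which is why this belongs on one command line and not in your shell profile.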

Have fun.

Wednesday, January 16, 2008

Heap Memory in Multithreaded NUMA Programs

Ok, here is the deal: You want to parallelize your application (say, matrix multiplication) and you want asynchronous computation. If you have a shared-memory NUMA machine, threads seem to be the obvious choice. The caveat is memory management. Calls to "new" and "delete" are serialized because heap memory is always shared among threads. Also, and probably more costly, there is memory contention, false sharing, etc.
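
To make the problem concrete, here is a minimal sketch (an illustration, not my actual benchmark code) of the allocation pattern that hurts: many threads doing lots of small allocations against the shared heap.

#include <boost/thread.hpp>

// Each thread hammers the shared heap with small allocations.
// With the default allocator, every new/delete contends on shared
// heap state; an allocator like Hoard gives each thread its own heap.
void churn()
{
    for (int i = 0; i < 1000000; ++i) {
        int* p = new int[16];
        p[0] = i;           // touch the memory so it is actually used
        delete [] p;
    }
}

int main()
{
    boost::thread_group threads;
    for (int t = 0; t < 16; ++t)    // 16 threads, as on my 16-core machine
        threads.create_thread(&churn);
    threads.join_all();
    return 0;
}
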
I tried the Hoard allocator; here is what happened:

Without Hoard, using the plain GNU C++ library (which might be using ptmalloc?):


Loading Matrices
Loaded !
Loading took 14.019800 seconds
Multiplications started
Transposition took 0.423375 seconds
Multiplication took 4.816739 seconds
Retransposition took 0.486832 seconds
Multiplications finished
8.713002 seconds elapsed (including thread creation cost)


Using Hoard:

Loading Matrices
Loaded !
Loading took 2.489643 seconds
Multiplications started
Transposition took 0.463726 seconds
Multiplication took 5.337530 seconds
Retransposition took 0.493123 seconds
Multiplications finished
9.304911 seconds elapsed (including thread creation cost)


What's wrong here? Hoard makes loading the matrices really fast (2.5 seconds instead of 14), but it slows down the multiplication at the same time (5.3 seconds instead of 4.8).
Note that the code uses 16 threads on a 16-core machine.

Monday, January 14, 2008

Boost::bind

Boost.Bind is a library I usually use in my multithreaded code, even though it is not strictly required in any sense. Instead of writing my own function objects to pass to boost::thread, I delegate the job to boost::bind.
However, there is a caveat and I want to share it with you today.

"The arguments that bind takes are copied and held internally by the returned function object".
So that means the copy constructor is called for each of your arguments. If you intentionally used reference parameters to avoid copying data during function calls, using boost::bind blindly just nullifies your efforts :(

Two solutions exist as far as I know:

1) Plain and simple, pass pointers instead. Ok, this is too C-like.
2) Force bind to hold a reference instead of calling the copy constructors and keeping copies of the function arguments. boost::ref and boost::cref do that for you, as the sketch below shows.
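
Here is a minimal sketch of both the pitfall and the fix (fill is just a toy function standing in for your own code):

#include <boost/bind.hpp>
#include <boost/ref.hpp>
#include <iostream>
#include <vector>

void fill(std::vector<int>& v)
{
    v.push_back(42);
}

int main()
{
    std::vector<int> data;

    // bind copies 'data' into the returned function object, so
    // fill() mutates the internal copy and 'data' stays empty.
    boost::bind(fill, data)();

    // boost::ref wraps 'data' in a reference_wrapper; the function
    // object now holds a reference and fill() sees the original.
    boost::bind(fill, boost::ref(data))();

    std::cout << data.size() << std::endl;   // prints 1, not 2
    return 0;
}

The same trick applies when passing arguments to the boost::thread constructor, which copies its arguments in the same way.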

Sunday, January 13, 2008

Analytical Geometry vs. Programming Languages

I guess anyone with a little bit of programming experience knows how multidimensional arrays are stored. For simplicity, I will just deal with 2D arrays.
In C, C++, etc., storage is row-major, meaning that rows are stored consecutively in memory. A 3x4 grid would be stored as follows:

1 2 3 4
5 6 7 8
9 10 11 12

In Fortran or MATLAB, storage is column-major (columns are stored consecutively), so the memory order of the same grid is:

1 5 9
2 6 10
3 7 11
4 8 12

So now, my old knowledge from analytical geometry forces me to think about the following axis system:

y
|
|
|
|
|
---------------- x

That certainly doesn't fit the row-major storage scheme. I have seen so many people with a math background access the ith row and the jth column as A[j][i] in C++. Why? Because that is the geometric way to do it: moving along the x-axis changes the current column, and the x-axis naturally seems to be the first index to write in row-major storage.

A[x-axis value][y-axis value]


But that's plain wrong; in fact, all accesses are of the form A[y-axis value][x-axis value]. If you want to increase the current column index, you change the second dimension, as in A[i][j++].
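
To make it concrete, here is a minimal C++ sketch of the correct indexing for the 3x4 grid above:

#include <cstdio>

int main()
{
    int A[3][4];                        // 3 rows, 4 columns, row-major

    for (int i = 0; i < 3; ++i)         // i = row = the "y-axis" value
        for (int j = 0; j < 4; ++j)     // j = column = the "x-axis" value
            A[i][j] = i * 4 + j + 1;    // reproduces the grid above

    // Row i, column j lives at offset i*4 + j from &A[0][0].
    std::printf("%d\n", A[1][2]);       // row 1, column 2 -> prints 7
    return 0;
}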

Weird, right?