Wednesday, October 20, 2010

Unintended std::vector blues

std::vector sometimes has rather obscure semantics. For example, when you're writing generic code with MPI and you use vector to store some local data that needs to be sent/received by another processor, the usual trick of retrieving the beginning address of the internal array by &my_vector[0] will fail miserably when T=bool

This post is about another obscurity. Namely the reserve() method. Unless you use push_back()'s, this is rarely useful. Here is why:

vector < T > v, w;
v.reserve(n);
for(int i=0; i<n; ++i) v[i] = (T) i;
w.swap(v);


Now, guess what? You have two completely empty vectors. The reserve only changed capacity, and assignments with operator[] did not need to resize as well since capacity was enough (imho, it should update size whenever an unused location in the vector starts being used, but that's hard to ensure in practice). Therefore, v is zeros-sized even after the for-loop.

Tuesday, October 19, 2010

Intrusive Data Structures

This is about something that standard CS curriculums rarely teach. We always think of data structures as synonyms to containers. We "insert" something, we "delete", or we traverse that "bag" of objects. Then at some point, we figure out that for certain big enough elements, copying data to the container costs more than the actual operations on the container. Later, we go even more crazy and try the discouraged: inserting pointers to the container instead of the actual objects. This leaves us with terrible memory management and long hours of debugging.

An alternative is to think of "intrusive" data structures. As the simplest example, think about a linked list. Instead of inserting the object to a bigger ListNode object like this:

template <typename Obj>
struct ListNode
{
Obj object;
ListNode * next;
}


We can embed the next pointer inside each object inside the list.

struct Object
{
Object * next;
...
}



Read on... for more information: http://www.codefarms.com/publications/intrusiv/Intrusive.pdf