My Friendly Zombie

GraphBLAS keeps track of the pattern of your matrices and vectors, and this pattern can be queried by the user application.  For example, consider:

info = GrB_Matrix_extractElement (&aij, A, i, j) ;

which is similar to aij = A (i,j) in MATLAB if i and j are scalars. If there is no entry A(i,j), then the edge (i,j) does not exist in the graph, and GraphBLAS returns info as GrB_NO_VALUE. Zeros are not special like they are in MATLAB, but that's a topic for another blog post.

Other GraphBLAS operations can delete entries. This can be very costly in MATLAB, but very efficient in SuiteSparse:GraphBLAS. The reason is ...

ZOMBIES!!!

MATLAB doesn't have zombies. It keeps the matrix in compressed-sparse-column form (CSC), which is a sequence of lists, one for each column. The lists are all packed together with no space between them. Each column is a list of the row indices of nonzeros in the column, and their values. If a single entry A(i,j) is deleted in MATLAB, then all columns j to n must be shifted up by one, to close up the empty space. If e=nnz(A) is the number of entries in the matrix, MATLAB takes O(e) time to delete just one entry.  Ouch!

If only we had a way to leave the empty space in place, and do it later.

Enter the Zombie!

GraphBLAS has a non-blocking mode, where operations can be left pending. I can be lazy, and save up any work to do later. If a user application asks me to delete an entry (with GrB_assign for example), then I can tag the entry for deletion, but not remove it right away. Let's suppose I'm using CSC format for a matrix. To enable fast binary search of a sparse column, I still keep its row index i. I just "negate" it to denote that it's a zombie (well, I can't exactly do that since i=0 is a valid row index, but the basic idea is the same).

I use the term Zombie since the entry is "dead" but is still in the land of the living. I can also look at the zombie entry and see what it was when it it was alive (I can look at its index). Unlike most movie zombies, I can bring a zombie back to life, with another GraphBLAS operation that sets its value.

Many GraphBLAS operations in my implementation can't tolerate zombies in their input matrices, so they are deleted in this case. But there can be many zombies, and it's just as fast to delete a single zombie as it is to delete many of them.

Currently, the operations that can tolerate zombies (leaving them in place, and not deleting them) include: GrB_*_setElement, GrB_*_*assign, and GrB_reduce (to scalar). These functions can be used for fast incremental update/downdates of a matrix or vector, where many deletions and insertions can be queued up for later completion. I kill the zombies only when I have to.

Zombies are only an internal construct in the opaque data structure.  If you ask if an entry is present, and it's there but as a zombie, then I'll tell you it's not there.  So Zombies are invisible to Mere Mortals (the user).  Only I can see the Zombies.

In SuiteSparse:GraphBLAS, because of My Friendly Zombie, deleting an entry can take just constant time, in an amortized sense. In contrast, MATLAB always takes O(nnz(A)) time to delete just one entry, because it's not Zombie-Friendly.

So the next time your matrix encounters a Zombie, send it a thank-you note.  I can always kill it later.

1 thought on “My Friendly Zombie”

Leave a Reply

Your email address will not be published. Required fields are marked *