Thursday, 17 May 2012

A quick libnih tutorial

Introduction

The NIH Utility Library (libnih) is a small, efficient and, most importantly, safe library of general-purpose routines. It was written by Keybuk, so you can be assured that it is extremely elegant, well-designed, well-tested (it currently includes 2863 tests!) and well-written. NIH is used by Upstart, the event-based init daemon, which is used by:

That's a lot of deployments of Upstart and NIH around the world!! (And we're not even including mobile device operating systems in that list).

But why not just use glib, I hear you ask? Well, glib is a very large library, whereas NIH is small and designed for low-level daemons and systems which may be resource-constrained. Also, let's not forget that NIH, like Upstart, comes with a very comprehensive test suite, so bugs are rare.

Other reasons to use NIH:
  • It handles garbage collection for you

    That's right, you don't need to free memory manually.

  • It uses an Object-Oriented-like Design... in C!

    This is extremely powerful and elegant. It's also quite easy to use once you understand the way the API works.

Let's start with some basics...

Garbage Collection


/* WARNING! Contains bugs! */
int
main (int argc, char *argv[])
{
    nih_local char *string;

    if (argc > 1) {
        string = nih_strdup (NULL, "hello, world");
        nih_message ("string is set to '%s'", string);
    }
}

This code is nominally trying to display a message if the user runs the application with one or more command-line arguments. However, there are a couple of problems with it:

  • No check is performed on the memory allocated by nih_strdup().
  • If no command-line argument is specified, chances are this program will crash.

The first issue is easy to spot and easy to remedy, but what about this crash? Well, nih_local variables are garbage collected automatically when they go out of scope. The string variable will therefore be garbage collected when it goes out of scope, which is when main() returns. However, since string was never initialised, it will be pointing to a random location in memory, so when main() returns, the cleanup code will attempt to free that random memory address. That will probably result in a SIGSEGV caused by dereferencing an illegal pointer value. The fix is easy and you should chant this mantra whenever you use nih_local variables:

Always initialise nih_local variables to NULL.
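To see why the NULL matters, here is a rough sketch of scope-based cleanup in plain C. It assumes GCC or Clang, since it relies on the cleanup variable attribute (which, as far as I know, is what nih_local is built on). The names my_local, free_string() and copy_length() are made up for this illustration and are not part of libnih.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts how many buffers the handler released (illustration only). */
static int frees_seen = 0;

/* Cleanup handler: receives the ADDRESS of the variable leaving scope. */
static void
free_string (char **ptr)
{
    assert (ptr != NULL);

    /* A NULL variable means there is nothing to free. An uninitialised
     * variable would hand free() a random address instead. */
    if (*ptr) {
        free (*ptr);
        frees_seen++;
    }
}

/* Hypothetical stand-in for nih_local, restricted to char * variables. */
#define my_local __attribute__ ((cleanup (free_string)))

/* Returns the length of a temporary copy of text, or 0 if no copy was made. */
static size_t
copy_length (const char *text)
{
    my_local char *copy = NULL;

    if (! text)
        return 0;              /* handler runs, sees NULL, does nothing */

    copy = malloc (strlen (text) + 1);
    if (! copy)
        return 0;

    strcpy (copy, text);

    return strlen (copy);      /* handler frees copy after this is evaluated */
}
```

Calling copy_length ("hello") allocates, measures and frees the copy without an explicit free(), whereas copy_length (NULL) leaves the handler with nothing to do.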

Here's a corrected version:

/* Correct version */
int
main (int argc, char *argv[])
{
    /* XXX: *ALWAYS* set nih_local variables to NULL */ 
    nih_local char *string = NULL;

    if (argc > 1) {
        string = nih_strdup (NULL, "hello, world");
        if (string) {
            nih_message ("string is set to '%s'", string);
        } else {
            nih_error ("failed to allocate space for string");
            exit (EXIT_FAILURE);
        }
    }
}


However, there is an even better way to code that check to ensure nih_strdup() succeeded:


/* Improved version */ 
int
main (int argc, char *argv[])
{
    /* XXX: *ALWAYS* set nih_local variables to NULL */ 
    nih_local char *string = NULL;

    if (argc > 1) {
        string = NIH_MUST (nih_strdup (NULL, "hello, world"));
        nih_message ("string is set to '%s'", string);
    }
}

So now, if the user specifies a command-line argument, the program will print "string is set to 'hello, world'" and automatically free the variable string when main() returns. If the user does not specify one, there is nothing to free, since string is still NULL and was never associated with allocated memory.

Note that the code is simpler and easier to understand as a result. Note too that we're now using NIH_MUST(). This is a macro which will evaluate the expression you pass to it ('nih_strdup (NULL, "hello, world")' in this case) repeatedly until it succeeds. You should exercise caution when using NIH_MUST() though, since if the allocation is likely never to succeed, the code will spin forever at this point. There is a similar macro, NIH_SHOULD(), that will evaluate the expression passed to it repeatedly until either the result is TRUE, or an error other than ENOMEM is raised.
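The retry behaviour is easy to picture with a small stand-in. The sketch below is not libnih's definition: RETRY_UNTIL_TRUE(), flaky_alloc() and get_buffer() are hypothetical names, and the macro assumes GCC or Clang because it uses a statement expression and __typeof__.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for NIH_MUST(): re-evaluate expr until it is
 * non-zero, then yield that value. */
#define RETRY_UNTIL_TRUE(expr)          \
    ({                                  \
        __typeof__ (expr) _ret;         \
        while (! (_ret = (expr)))       \
            ;                           \
        _ret;                           \
    })

static int  attempts = 0;
static char buffer[] = "hello, world";

/* Pretend allocator that fails twice before succeeding. */
static char *
flaky_alloc (void)
{
    attempts++;

    return (attempts < 3) ? NULL : buffer;
}

/* Keeps calling flaky_alloc() until it returns something usable. */
static char *
get_buffer (void)
{
    char *result = RETRY_UNTIL_TRUE (flaky_alloc ());

    assert (result != NULL);   /* the loop cannot exit any other way */

    return result;
}
```

If flaky_alloc() never succeeded, get_buffer() would never return, which is exactly the spinning behaviour to be wary of.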

Parent-Pointer

Let's take a closer look at that call to nih_strdup. The system version of strdup takes a single argument (the string to copy), so why does nih_strdup take two arguments?


nih_strdup (NULL, "hello, world");

Well, that first NULL argument is the parent pointer. Most NIH functions that allocate memory take a parent pointer as their first argument. Let's see these pointers in action before explaining the detail...


#include <stdlib.h>

#include <nih/macros.h>
#include <nih/logging.h>
#include <nih/string.h>
#include <nih/alloc.h>

int
main(int argc, char *argv[])
{
    typedef struct foo {
        char *str1;
        char *str2;
    } Foo;

    nih_local Foo *foo = NIH_MUST (nih_new (NULL, Foo));

    foo->str1 = NIH_MUST (nih_strdup (foo, "first string"));
    foo->str2 = NIH_MUST (nih_strdup (foo, "second string"));

    nih_message ("foo->str1='%s'", foo->str1);
    nih_message ("foo->str2='%s'", foo->str2);

    /* return rather than call exit(): nih_local cleanup only
     * runs when the variable actually goes out of scope */
    return EXIT_SUCCESS;
}


Here we see our first complete NIH program. There are a couple of important points to note:


  • The call to nih_new() is like malloc() except it too takes a parent pointer. Since the foo object we're creating doesn't have a parent, we set the pointer to NULL.
  • Note that there is no call to free the memory allocated by nih_new(): since we're using nih_local, the object and all its children will be freed automatically when the block (in this example the main() function) ends. This is incredibly powerful: we've made 3 memory allocations in the example (one call to nih_new() and two calls to nih_strdup()), and all that memory will be automatically garbage collected for us because NIH knows to free the foo object when it goes out of scope, but it also knows that the str1 and str2 elements need to be freed too (since we told nih_strdup() their parent is the foo object we previously created).

So the parent pointer accepted by most NIH calls is used to enable intelligent garbage collection: by effectively tagging objects with a reference to their parent, you are assured of that object being automatically garbage collected when the parent is freed.
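To make the parent/child idea concrete, here is a deliberately tiny sketch of a hierarchical allocator in plain C. It is not how libnih is implemented: tree_alloc(), tree_strdup(), tree_free() and live_blocks are names invented for this example, and the sketch only supports freeing from the top-level object down.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header stored in front of every block handed out by tree_alloc(). */
typedef struct node {
    struct node *first_child;
    struct node *next_sibling;
} Node;

/* Number of blocks currently allocated (illustration only). */
static int live_blocks = 0;

/* Allocate size bytes and, if parent is non-NULL, record the new block
 * as a child of parent. */
static void *
tree_alloc (void *parent, size_t size)
{
    Node *node = malloc (sizeof (Node) + size);

    if (! node)
        return NULL;

    node->first_child = NULL;
    node->next_sibling = NULL;

    if (parent) {
        Node *pnode = (Node *) parent - 1;

        node->next_sibling = pnode->first_child;
        pnode->first_child = node;
    }

    live_blocks++;

    return node + 1;   /* caller sees the memory just past the header */
}

/* Copy str into a new block owned by parent. */
static char *
tree_strdup (void *parent, const char *str)
{
    char *copy = tree_alloc (parent, strlen (str) + 1);

    if (copy)
        strcpy (copy, str);

    return copy;
}

/* Free a top-level block and, recursively, everything allocated beneath it.
 * This sketch never unlinks a child from its parent, so only pass roots. */
static void
tree_free (void *ptr)
{
    assert (ptr != NULL);

    Node *node = (Node *) ptr - 1;
    Node *child = node->first_child;

    while (child) {
        Node *next = child->next_sibling;

        tree_free (child + 1);
        child = next;
    }

    free (node);
    live_blocks--;
}
```

With this in place, allocating an object with tree_alloc (NULL, ...), hanging two strings off it with tree_strdup() and then calling tree_free() once on the object releases all three blocks.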

Lists

The NIH list implementation is essentially the same as a "struct list_head" in the Linux kernel:

typedef struct nih_list {
    struct nih_list *prev, *next;  
} NihList;

Lists are designed to be contained within other objects, like this:

typedef struct bar {
    NihList   entry;
    char     *str;
} Bar;

So you don't create a "list of Bar objects", you create a list of list objects which provide access to their containing types.

Note that the list element is the first member of the Bar structure. This allows a pointer to the list entry to be cast to a pointer to its containing type trivially.
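The cast works because C guarantees that a structure and its first member share an address. A minimal sketch, using made-up Link and Item types rather than the real NihList and Bar:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for NihList and Bar. */
typedef struct link {
    struct link *prev, *next;
} Link;

typedef struct item {
    Link        entry;   /* must stay the first member */
    const char *str;
} Item;

/* Recover the containing Item from a pointer to its embedded Link. */
static Item *
item_from_link (Link *link)
{
    assert (offsetof (Item, entry) == 0);

    return (Item *) link;
}
```

If entry were moved below str, the plain cast would no longer be valid and you would need to subtract offsetof (Item, entry) from the pointer instead.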

Let's look at an example of list usage by implementing echo(1):

#include <nih/macros.h>
#include <nih/logging.h>
#include <nih/string.h>
#include <nih/alloc.h>
#include <nih/list.h>

typedef struct bar {
    NihList  entry;
    char    *str;
} Bar;

int
main(int argc, char *argv[])
{
    int i;
    NihList *args;

    args = NIH_MUST (nih_list_new (NULL));

    /* store all arguments in a list */
    for (i = 1; i < argc; ++i) {
        Bar *bar = NIH_MUST (nih_new (args, Bar));

        nih_list_init (&bar->entry);

        bar->str = NIH_MUST (nih_strdup (bar, argv[i]));

        nih_list_add (args, &bar->entry);
    }

    i = 1;

    /* display all arguments by iterating over list */
    NIH_LIST_FOREACH (args, iter) {
        Bar *bar = (Bar *)iter;

        nih_message ("argument %d='%s'", i, bar->str);

        ++i;
    }

    nih_free (args);

    return (0);
}

The new features introduced here are nih_list_new(), which allocates the list head; nih_list_init(), which initialises a list entry; and nih_list_add(), which adds the entry given as its second argument to the list given as its first. Additionally, we have that rather funky NIH_LIST_FOREACH() macro which allows for easy (and fast!) list traversal.

In this example we are not using nih_local, so what happens when nih_free() is called? Each Bar was allocated with args as its parent, and each str with its Bar as its parent. So all entries in the args list are freed, but before each is freed, the str string within it is freed. Then the list itself is freed. Neat, huh?
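If you're curious what a list like this looks like underneath, here is a small sketch of a circular doubly-linked list with a dedicated head entry, written in plain C (C99 for the loop variable). It mirrors my understanding of the approach rather than libnih's actual code, and link_init(), link_add_tail(), LINK_FOREACH() and sum_items() are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct link {
    struct link *prev, *next;
} Link;

typedef struct item {
    Link entry;   /* first member, so a Link * can be cast to an Item * */
    int  value;
} Item;

/* An empty list (or unattached entry) simply points at itself. */
static void
link_init (Link *link)
{
    assert (link != NULL);

    link->prev = link->next = link;
}

/* Insert entry just before head, which is the tail of a circular list. */
static void
link_add_tail (Link *head, Link *entry)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Visit every entry except the head itself. */
#define LINK_FOREACH(head, iter) \
    for (Link *iter = (head)->next; iter != (head); iter = iter->next)

/* Add up the value of every Item on the list. */
static int
sum_items (Link *head)
{
    int total = 0;

    LINK_FOREACH (head, iter) {
        Item *item = (Item *) iter;

        total += item->value;
    }

    return total;
}
```

Because the list is circular, there are no NULL checks anywhere: traversal just stops when it gets back to the head, and adding to the tail is a handful of pointer assignments.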

To build our version of echo:

gcc -std=gnu99 -Wall -pedantic echo.c -o echo $(pkg-config --cflags --libs libnih)

Now let's run it:

$ ./echo a b c "hello world" "foo bar" wibble "the end"
argument 1='a'
argument 2='b'
argument 3='c'
argument 4='hello world'
argument 5='foo bar'
argument 6='wibble'
argument 7='the end'
$ 


We've really only scratched the surface of NIH's abilities in this post. Here are some of the other facilities it provides:


  • hashes
  • binary trees
  • string arrays
  • file watches
  • I/O handling
  • signal handling
  • timers
  • reference handling
  • error/exception handling
  • main loop handling
  • command-line option and usage handling
  • child process handling
  • config file handling
  • logging facilities
  • test facilities

If you're interested in learning more, start hacking, or take a look at some of the projects already using NIH:
