Mark_malloc

I ask the reader to excuse my bad English; it is not my native language. (Thanks go to Mike Schiraldi for some corrections in this file.)

Introduction

#include <stdio.h>
#include <stdlib.h>

int average(FILE *f)
{
  int *value;
  int nb_numbers;
  int i;
  int avg = 0;

  fread(&nb_numbers, sizeof(int), 1, f);

  value = malloc(nb_numbers * sizeof(int));  /* never freed in this function */

  fread(value, sizeof(int), nb_numbers, f);

  for (i = 0; i < nb_numbers; i++)
    avg += value[i];

  return avg / nb_numbers;
}
Let's set aside endianness, the checks on the return values of fread and malloc, and the validation of nb_numbers.
What do we have here? Obviously, the buffer returned by malloc is never freed, so if the function average is called all over the program, we're in trouble. This is a memory leak: some allocated memory is never freed explicitly by the program.
Here it is obvious, and a simple free added before the function's return is the solution. But what about more complicated memory management?
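For illustration, here is one way the corrected function could look. It is only a sketch: the error checks and the convention of returning -1 on failure are assumptions added for this example (and -1 is ambiguous if the real average can be -1), not part of the original code. The point is that the buffer is freed on every path out of the function, including the error path.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads a count followed by that many ints from f and returns their
 * integer average, or -1 on error (a convention chosen for this sketch). */
int average(FILE *f)
{
  int *value;
  int nb_numbers;
  int i;
  long sum = 0;
  int result;

  if (fread(&nb_numbers, sizeof(int), 1, f) != 1 || nb_numbers <= 0)
    return -1;

  value = malloc((size_t)nb_numbers * sizeof(int));
  if (value == NULL)
    return -1;

  if (fread(value, sizeof(int), (size_t)nb_numbers, f) != (size_t)nb_numbers) {
    free(value);  /* error path: do not leak here either */
    return -1;
  }

  for (i = 0; i < nb_numbers; i++)
    sum += value[i];

  result = (int)(sum / nb_numbers);
  free(value);    /* the free that was missing */
  return result;
}
```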
Most of the time, our programs contain structures whose lifetime we don't always know. Sometimes a malloc'ed buffer changes its meaning. For example, for speed reasons we can have a very sophisticated allocation policy that avoids calls to malloc and free as much as possible, and so reuses buffers that are no longer needed for totally different purposes. Or we may keep a list of free buffers: instead of freeing them all the time, we put them in this list for later use.
In short, our coding practices lead us to errors in our malloc/free handling, even if we are experienced programmers.
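To make the free-list idea concrete, here is a minimal sketch. The names buf_get, buf_put and buf_release_all and the fixed BUF_SIZE are hypothetical, chosen only for this example. Buffers handed back with buf_put are never passed to free, so it becomes easy to lose track of which ones are still owned by somebody.

```c
#include <stdlib.h>

#define BUF_SIZE 4096  /* hypothetical fixed buffer size */

struct buf {
  struct buf *next;              /* link, meaningful only while in the free list */
  unsigned char data[BUF_SIZE];
};

static struct buf *free_list = NULL;

/* Get a buffer: reuse one from the free list if possible, else call malloc. */
struct buf *buf_get(void)
{
  struct buf *b = free_list;

  if (b != NULL) {
    free_list = b->next;
    return b;
  }
  return malloc(sizeof *b);
}

/* Give a buffer back: push it on the free list instead of calling free. */
void buf_put(struct buf *b)
{
  b->next = free_list;
  free_list = b;
}

/* Really release everything still in the list, for example before exit. */
void buf_release_all(void)
{
  while (free_list != NULL) {
    struct buf *b = free_list;
    free_list = b->next;
    free(b);
  }
}
```

A buffer obtained with buf_get and never given back with buf_put is a leak that buf_release_all cannot repair, because the list no longer knows about it.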

Memory leak detection

To avoid memory leaks, we have several options.

Mark_malloc

Mark_malloc is my contribution to the dynamic inspection of bad malloc/free usage.

It marks malloc'ed buffers and detects, at the end of the program, the ones that have not been freed. If it finds any, it displays the calling sequence of the functions responsible for the error.
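The general idea of marking buffers can be sketched as follows. This is not the mark_malloc implementation, only an illustration under simplifying assumptions: all names (marked_malloc, marked_free, marked_report, MARK_MALLOC) are hypothetical, and only the file and line of the allocation are recorded, not the full calling sequence.

```c
#include <stdio.h>
#include <stdlib.h>

/* One record per live allocation (all names here are hypothetical). */
struct mark {
  void *ptr;
  const char *file;
  int line;
  struct mark *next;
};

static struct mark *marks = NULL;

/* Allocate a buffer and remember where the request came from. */
void *marked_malloc(size_t size, const char *file, int line)
{
  void *p = malloc(size);
  struct mark *m;

  if (p == NULL)
    return NULL;
  m = malloc(sizeof *m);
  if (m == NULL) {
    free(p);
    return NULL;
  }
  m->ptr = p;
  m->file = file;
  m->line = line;
  m->next = marks;
  marks = m;
  return p;
}

/* Free a buffer and drop its mark; unknown pointers are reported, not freed. */
void marked_free(void *p)
{
  struct mark **pm;

  if (p == NULL)
    return;
  for (pm = &marks; *pm != NULL; pm = &(*pm)->next) {
    if ((*pm)->ptr == p) {
      struct mark *m = *pm;
      *pm = m->next;
      free(m);
      free(p);
      return;
    }
  }
  fprintf(stderr, "bad free: %p was not marked\n", p);
}

/* Print every buffer that is still marked; returns how many there are. */
int marked_report(FILE *out)
{
  int n = 0;
  struct mark *m;

  for (m = marks; m != NULL; m = m->next) {
    fprintf(out, "leak: %p allocated at %s:%d\n", m->ptr, m->file, m->line);
    n++;
  }
  return n;
}

/* Convenience macro that records the call site automatically. */
#define MARK_MALLOC(size) marked_malloc((size), __FILE__, __LINE__)
```

To get the report at the end of the program, a small void(void) wrapper that calls marked_report(stderr) could be registered with atexit.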

It is written in C and works under Unix (see here to get the list of supported systems).

It interfaces well with C programs, and maybe with other languages that I didn't try. (I would be pleased to know whether it can help C++ coders. Unfortunately, I don't speak C++, so I can't try it on a C++ program.)

You can read the detailed documentation to get more information about it.

It is released in the public domain, even though I ripped some code from the GNU binutils package.
This is a total lack of respect for the GPL, but I don't like licences, even copylefted ones. You can disagree with my views (I guess you do, in fact), but well, I won't change. Freedom wants no licence. By the way, I wrote this to explain my opinion. And yes, I know this is illegal too. (Anyway, this code is so short, so few people will check it and even fewer will use it, that it's not a big problem.)
And if there were good documentation about libbfd and how to use it, I wouldn't have ripped code from addr2line. So I can say that bad design from the binutils team led to this situation. I know it's hard to write good documentation, but it is vital.

Related work

Mark_malloc is far from perfect. I find it useful, which is why I give it away publicly.

Some other tools exist to help you in your coding.

I put on my site this file, which comes from there and contains lots of links.
In particular, see Electric Fence, which is very useful for detecting bad usage of buffers (out-of-bounds accesses).
mpr is very similar to mark_malloc too, but provides much more information. Well, check the page to see the rest.

To finish, a little word about lclint. Lclint is a static analyzer. It uses annotations to give some semantic information to the checker. It is nice, even if I prefer totally automatic methods. I don't know whether those are possible in practice: I know some undecidability results (if you want the paper, contact me; maybe this link works, maybe not), but when I see how the proof has been done, I won't say it's impossible for our common coding styles.
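As an illustration of what such annotations look like, here is a small sketch using two LCLint annotations, only and null, written as special comments so that an ordinary C compiler ignores them. The function names are hypothetical, and the example has not been run through the checker here; it only shows the style.

```c
#include <stdlib.h>

/* "only": the caller receives the sole reference and must release it.
 * "null": the result may be NULL and must be checked before use. */
/*@only@*/ /*@null@*/ int *make_values(size_t n)
{
  return malloc(n * sizeof(int));
}

/* "only" on a parameter: this function takes over the obligation to
 * release the storage, so the caller must not use it afterwards. */
void drop_values(/*@only@*/ /*@null@*/ int *v)
{
  free(v);
}

/* Fills a buffer with 0..n-1 and returns the sum, or -1 if allocation fails. */
int sum_values(size_t n)
{
  size_t i;
  int total = 0;
  int *v = make_values(n);

  if (v == NULL)
    return -1;
  for (i = 0; i < n; i++)
    v[i] = (int)i;
  for (i = 0; i < n; i++)
    total += v[i];
  drop_values(v);  /* without this call, the storage would never be released */
  return total;
}
```

With this information, a checker can in principle complain when sum_values returns without handing v to a function that accepts an only reference, which is exactly the kind of leak shown in the introduction.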

There is a huge research field in this area. Where could you start to get some information? Well, the ResearchIndex is a good start, with a keyword-based search engine. The ACM's digital library is very good too, but you may not be able to access everything if you are not inside a university (I do have full access to the digital library). And your favorite search engine can be used too, I guess. Try "memory leaks detection", "malloc errors" and the like.


Contact: sed@free.fr
Creation time: around 2002 probably. Last update: Wed, 09 Mar 2005 13:47:05 +0100

Powered by vi.
Best viewed with your eyes (or your fingers if you are blind).