Mark_malloc
Please excuse my bad English; it is not my native language.
(Thanks to Mike Schiraldi for some corrections to this page.)
Introduction
#include <stdio.h>
#include <stdlib.h>

/* Reads a count, then that many ints, and returns their average. */
int average(FILE *f)
{
    int *value;
    int nb_numbers;
    int i;
    int avg = 0;

    fread(&nb_numbers, sizeof(int), 1, f);
    value = malloc(nb_numbers * sizeof(int));
    fread(value, sizeof(int), nb_numbers, f);
    for (i = 0; i < nb_numbers; i++)
        avg += value[i];
    return avg / nb_numbers;
}
Let's ignore endianness, the missing checks on the return values of
fread and malloc, and the validity of nb_numbers.
What do we have here? Obviously, the buffer returned by malloc is never
freed, so if the function average is called all over the program, we're
in trouble. This is a memory leak: some allocated memory is never
freed explicitly by the program.
Here it is obvious, and a simple free added before the function's
return is the solution. But what about more complicated memory
management?
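For the simple case, the fix can be sketched as follows. This is only an illustration: the name average_freed is hypothetical, and unlike the function above it also adds the basic checks that the text sets aside, so that it can be run safely.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant of average(): returns 0 on any read or
 * allocation error, and frees the buffer before returning. */
int average_freed(FILE *f)
{
    int *value;
    int nb_numbers;
    int i;
    int avg = 0;

    if (fread(&nb_numbers, sizeof(int), 1, f) != 1 || nb_numbers <= 0)
        return 0;
    value = malloc(nb_numbers * sizeof(int));
    if (value == NULL)
        return 0;
    /* Sum the values only if all of them could be read. */
    if (fread(value, sizeof(int), nb_numbers, f) == (size_t)nb_numbers)
        for (i = 0; i < nb_numbers; i++)
            avg += value[i];
    free(value);    /* the call that was missing */
    return avg / nb_numbers;
}
```

The buffer is released on the only path where it was allocated, so calling the function many times no longer grows the memory used by the program.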
Most of the time, we have some structures in our programs whose
lifetime we don't always know. Sometimes, a malloc'ed buffer can
change its meaning. For example, we can have a very sophisticated
allocation policy, for speed reasons, where we want to avoid calls to
malloc and free as much as possible and so reuse unused buffers for
totally different purposes, or we may keep a list of free buffers so
that we don't free them all the time, but put them in this list
instead, for later use.
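The list of free buffers mentioned above could look like the following minimal sketch. All names (pool_get, pool_put, pool_release_all, POOL_BUF_SIZE) are hypothetical, and the sketch assumes that every buffer has the same fixed size.

```c
#include <stdlib.h>

#define POOL_BUF_SIZE 256   /* hypothetical fixed buffer size */

/* A released buffer is reused to store the link to the next free one. */
struct pool_node {
    struct pool_node *next;
};

static struct pool_node *free_list = NULL;

/* Take a buffer from the free list, or fall back to malloc. */
void *pool_get(void)
{
    struct pool_node *n = free_list;

    if (n != NULL) {
        free_list = n->next;
        return n;
    }
    return malloc(POOL_BUF_SIZE);
}

/* Put a buffer back on the list instead of calling free. */
void pool_put(void *buf)
{
    struct pool_node *n = buf;

    if (n == NULL)
        return;
    n->next = free_list;
    free_list = n;
}

/* Really free everything still on the list, for example at exit. */
void pool_release_all(void)
{
    while (free_list != NULL) {
        struct pool_node *n = free_list;

        free_list = n->next;
        free(n);
    }
}
```

From malloc's point of view, a buffer sitting on this list is still allocated. That is exactly why such policies make leaks harder to spot: a buffer that is never given back with pool_put looks the same as one that is in use.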
In short, our coding practices lead us to errors in our malloc/free
handling, even if we are experienced programmers.
Memory leak detection
To avoid memory leaks, we have several options.
- Code perfectly, with no error.
This is nonsense. We all make errors, and malloc/free errors
are some of the most elusive (distributed programming is even harder,
but that's off-topic).
- Do a static analysis of the program.
This means inspecting the code line by line, watching the mallocs
and frees that are done, and deciding, with all the code in mind,
whether those mallocs and frees are correct.
Because of the limitations of our brains, this solution can't be
applied directly this way. A human being can't remember several
thousand lines of code. Even a hundred lines of code are hard to
remember fully, if not impossible.
So what? The answer could be to reason about programs and abstract away
unnecessary constructs, resulting in a smaller and smaller program that
is still faithful enough to check the malloc/free handling. Well, this
is a possible direction. I am not aware of practical results (i.e. tools
that can be used to do this) in this area.
Another way is to add some semantic information about the pointers
used in our programs, thus helping a static analyzer in finding bugs
(see lclint for a good example).
This is a good idea, but it is not sufficient. We would like to have a
fully automatic tool. We would like to run:
check my_program.c
and have an easy-to-understand answer like:
line 10, this malloc is never freed
or something like that
(it should work with several input files).
At the time of writing (October 2001), no such tool exists. And we still
need something to help us in our coding! And we need it today, not
in ten or twenty years. So what now?
- Adopt some coding practices.
A solution could be to impose some coding practices on ourselves.
For example, each time a pointer becomes useless
somewhere in our code, free it systematically, even if we will do a malloc
of the same size in the next line (this is the extreme case, as you
can imagine, but you get the point). We could totally avoid the use of
free buffers. We could always copy the parameters we get from another
module, using private structures, which are much easier to trace. We could
do this, we could avoid that... This is endless.
The question that arises here is: what is a good
practice? If one can answer that, in a formal manner
preferably, it would help most of us in our daily troubles.
And another question is: what about performance? When we
take it into account, some hacks are necessary to speed up the code, and
they may very well break our "good practice".
- Do some runtime tests.
A runtime test traces, while the program runs, the values of some of
its variables, and checks that they are what we expect. We can check
the parameters of functions, calls to certain functions, and (this is
what mark_malloc does) the malloc/free handling.
This is a working solution. It is far from perfect of course, mainly
because it depends on the inputs we provide to the program. Most of
the time there is an infinite number of possible inputs, so we can't
catch every case, and we can't conclude anything fully reliable from
this method alone.
But it works. :-)
See the Related work section below for some pointers
to other tools of this kind.
This is the same approach that led to the creation of our beloved
debuggers, which would be useless if we coded well. But we don't,
and I don't think I take a big risk in saying we never will.
So, this is a not-so-bad solution after all.
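The runtime approach to malloc/free checking can be sketched in a few lines of C. This is only an illustration of the general technique and is not how mark_malloc itself is implemented: all names here (trk_malloc, trk_free, trk_report, TRK_MALLOC) are hypothetical, and the sketch records only the file and line of each allocation.

```c
#include <stdio.h>
#include <stdlib.h>

/* One record per buffer that is currently allocated. */
struct trk_record {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    struct trk_record *next;
};

static struct trk_record *trk_live = NULL;

/* Allocate a buffer and remember where the request came from. */
void *trk_malloc(size_t size, const char *file, int line)
{
    struct trk_record *r = malloc(sizeof *r);
    void *p;

    if (r == NULL)
        return NULL;
    p = malloc(size);
    if (p == NULL) {
        free(r);
        return NULL;
    }
    r->ptr = p;
    r->size = size;
    r->file = file;
    r->line = line;
    r->next = trk_live;
    trk_live = r;
    return p;
}

/* Free a buffer and forget its record; complain about unknown pointers. */
void trk_free(void *p)
{
    struct trk_record **pp;

    if (p == NULL)
        return;
    for (pp = &trk_live; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->ptr == p) {
            struct trk_record *r = *pp;

            *pp = r->next;
            free(r);
            free(p);
            return;
        }
    }
    fprintf(stderr, "trk_free: %p was not allocated by trk_malloc\n", p);
}

/* Print every buffer that is still allocated; return how many there are. */
int trk_report(FILE *out)
{
    int n = 0;
    struct trk_record *r;

    for (r = trk_live; r != NULL; r = r->next, n++)
        fprintf(out, "%s:%d: %lu bytes never freed\n",
                r->file, r->line, (unsigned long)r->size);
    return n;
}

/* Convenience macro that records the call site automatically. */
#define TRK_MALLOC(size) trk_malloc((size), __FILE__, __LINE__)
```

A program using this sketch would call TRK_MALLOC and trk_free instead of malloc and free, then call trk_report(stderr) just before exiting. As the text says, the report only covers the paths that the chosen inputs actually exercised.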
Mark_malloc
Mark_malloc is my proposal for the dynamic inspection of bad
malloc/free usage.
It marks malloc'ed buffers and detects, at the end of the program,
the ones that have not been freed. If it finds some, it
will display the sequence of function calls responsible
for this error.
It is written in C and works under unix (see
here to get the list of supported
systems).
It interfaces well with C programs, and maybe others that I didn't
try. (I would be pleased to know if it can help C++ coders in their
coding. Unfortunately, I don't speak C++, so I can't try with a C++ program.)
You can read the detailed documentation to get more
information about it.
It is released in the public domain, even though I ripped some code
from the GNU binutils package.
This is a total lack of respect for the GPL, but I don't like
licences, even copylefted ones. You can disagree with my views (I guess
you do, in fact), but well, I won't change. Freedom wants no licence.
By the way, I wrote this to explain my opinion. And yes, I know this is
illegal too. (Anyway, this code is so short, so few people will check
it, and even fewer will use it, that it's not a big problem.)
And if there were good documentation about libbfd and how to use it,
I wouldn't have ripped code from addr2line. So I can say it
is bad design from the binutils team that led to this situation. I know
it's hard to write good documentation, but it is vital.
- Download
New! There is a CVS repository now. Jump
here to get it.
Thanks to Orlando Bassotto for all his efforts on mark_malloc.
- mark-malloc-2.0.1.tar.gz
(2002, May 31st).
Orlando Bassotto did all this. You now have a script, markm, that makes
mark_malloc much nicer to use, and autoconf/automake for a friendlier
installation. Many ports exist too. Contact Orlando for more information
or about hosts wanted.
The documentation has not been updated, but it remains almost valid as
is. There is no time to update it; ask Orlando about any problem (mail
below).
- mark_malloc-1.0.2.tar.gz
(2002, March 14th).
Orlando Bassotto ported it to PowerPC and made some changes here and
there. He added the use of '_' under bash 2.x, so you don't need to
declare MARK_MALLOC_PROGRAM_FILE anymore under this shell. He added
MARK_MALLOC_HEXDUMP_CONTENT to dump the content of the allocated, unfreed
buffers (the output can be huge if your code is very unclean). He also
fixed some code here and there.
- mark_malloc-1.0.1.tar.gz
(2001, October 9th).
Mike Schiraldi corrected some of the grammar and spelling of this web
page and proposed some fixes to avoid warnings at compile time.
Loïc Lefort proposed the use of __attribute__((__noreturn__)) too.
- mark_malloc-1.0.0.tar.gz
(2001, October 3rd).
- Detailed documentation
Go to the documentation for an in-depth view
of mark_malloc, a small tutorial, and how to use it in your
development process.
Related work
Mark_malloc is far from perfect. I find it useful, that's the reason
why I give it away publicly.
Some other tools exist, to help you in your coding.
I put this file on my site; it comes from there and has lots of links.
In particular, see electric fence, which is very useful for detecting
bad usage of buffers (out-of-bounds accesses).
mpr is very similar to mark_malloc too, but provides much more
information. Well, check the page to see the rest.
To finish, a little word about lclint.
Lclint is a static analyzer. It uses annotations to give
some semantic information to the checker. It is nice, even
if I prefer totally automatic methods. I don't know whether those
are possible in practice: I know some
undecidability results (if you want the paper, contact me)
(maybe this link works, maybe not),
but when I see how the proof has been done, I won't say it's impossible
for our common coding styles.
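To give an idea of what such annotations look like, here is a minimal sketch. It assumes the only and null annotations as described in the lclint manual (check the manual for their exact meaning); the function names copy_string and discard_string are hypothetical. The annotations are ordinary C comments, so the code compiles normally with any compiler.

```c
#include <stdlib.h>
#include <string.h>

/* "null": the result may be NULL.
 * "only": the caller becomes the sole owner of the returned storage
 * and is therefore responsible for releasing it. */
/*@null@*/ /*@only@*/ char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Takes ownership of its argument and releases it. */
void discard_string(/*@null@*/ /*@only@*/ char *s)
{
    free(s);
}
```

With this information, a checker can in principle complain when a caller drops the result of copy_string without passing it to something that releases it, which is the kind of error that is otherwise only visible at runtime.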
There is a huge research field in this area. Where could you
start to get some information? Well, the
ResearchIndex is a good
start, with a keyword based search engine. The
ACM's digital library is very good too, but maybe you won't be able
to access everything if you are not inside a university (I do have full
access to the digital library). And your favorite search engine can be used
too, I guess. Try "memory leaks detection", "malloc errors" and the like.
Contact: sed@free.fr
Creation time: around 2002 probably.
Last update: Wed, 09 Mar 2005 13:47:05 +0100
Powered by vi.
Best viewed with your eyes
(or your fingers if you are blind).