.:: Wawan .Net - .NET News & Article ::..

Talking About All .NET Technology

Tuesday, February 21, 2006

What is garbage collection?


Garbage collection is a heap-management strategy where a run-time component takes
responsibility for managing the lifetime of the memory used by objects. This concept
is not new to .NET - Java and many other languages/runtimes have used garbage
collection for some time.

Is it true that objects don't always get destroyed immediately when
the last reference goes away?


Yes. The garbage collector offers no guarantees about the time when an object
will be destroyed and its memory reclaimed.


There was an interesting thread on the DOTNET list, started by Chris Sells,
about the implications of non-deterministic destruction of objects in C#. In
October 2000, Microsoft's Brian Harry posted a lengthy analysis of the problem,
and Chris Sells posted a response to it.


Why doesn't the .NET runtime offer deterministic destruction?

Because of the garbage collection algorithm. The .NET garbage collector works
by periodically running through a list of all the objects that are currently
being referenced by an application. All the objects that it doesn't find during
this search are ready to be destroyed and the memory reclaimed. The implication
of this algorithm is that the runtime doesn't get notified immediately when
the final reference on an object goes away - it only finds out during the next
'sweep' of the heap.


Furthermore, this type of algorithm works best by performing the garbage collection
sweep as rarely as possible. Normally heap exhaustion is the trigger for a collection
sweep.
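
The tracing idea described above can be sketched with a toy model. This is only an illustration of reachability, not how the CLR collector is implemented; the Node and ToyCollector names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: the real CLR collector is far more elaborate.
class Node
{
    public readonly string Name;
    public readonly List<Node> References = new List<Node>();
    public Node(string name) { Name = name; }
}

static class ToyCollector
{
    // Returns the names of nodes in 'heap' that cannot be reached from 'roots'.
    public static List<string> FindGarbage(IEnumerable<Node> heap, IEnumerable<Node> roots)
    {
        var reachable = new HashSet<Node>();
        var pending = new Stack<Node>(roots);

        // Mark phase: walk every reference reachable from the roots.
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!reachable.Add(current)) continue; // already visited
            foreach (Node next in current.References) pending.Push(next);
        }

        // Anything never visited is unreachable and could be reclaimed.
        var garbage = new List<string>();
        foreach (Node node in heap)
            if (!reachable.Contains(node)) garbage.Add(node.Name);
        return garbage;
    }
}
```

Note that nothing happens at the moment a node stops being referenced. It is only reported the next time FindGarbage runs, which mirrors why destruction is non-deterministic.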


Is the lack of deterministic destruction in .NET a problem?

It's certainly an issue that affects component design. If you have objects that
maintain expensive or scarce resources (e.g. database locks), you need to provide
some way to tell the object to release the resource when it is done. Microsoft
recommends that you provide a method called Dispose() for this purpose. However,
this causes problems for distributed objects: in a distributed system, who calls
the Dispose() method? Some form of reference-counting or ownership-management
mechanism is needed to handle distributed objects. Unfortunately, the runtime
offers no help with this.
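
For the simple, non-distributed case, a minimal sketch of the Dispose() approach might look like this. DatabaseLock is a hypothetical class that only tracks a flag; a real one would hold an actual lock.

```csharp
using System;

// Hypothetical resource holder; "DatabaseLock" is an illustrative name only.
class DatabaseLock : IDisposable
{
    public bool IsHeld { get; private set; }

    public DatabaseLock() { IsHeld = true; }   // acquire the scarce resource

    public void Dispose()
    {
        IsHeld = false;                        // release it deterministically
    }
}

static class LockDemo
{
    public static bool RunWithLock()
    {
        DatabaseLock dbLock = new DatabaseLock();
        using (dbLock)
        {
            // Work that needs the lock goes here.
        }
        // Dispose() runs when the using block exits, whether it exits
        // normally or through an exception.
        return dbLock.IsHeld;
    }
}
```

The using statement gives the caller a deterministic release point without waiting for the garbage collector.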


Should I implement Finalize on my class? Should I implement IDisposable?

This issue is a little more complex than it first appears. There are really
two categories of class that require deterministic destruction. Classes in the
first category manipulate unmanaged types directly, whereas classes in the second
category manipulate managed types that themselves require deterministic destruction.
An example of the first category is a class with an IntPtr member representing an
OS file handle. An example of the second category is a class with a
System.IO.FileStream member.


For the first category, it makes sense to implement IDisposable and override
Finalize. This allows the object user to 'do the right thing' by calling Dispose,
but also provides a fallback of freeing the unmanaged resource in the Finalizer,
should the calling code fail in its duty. However, this logic does not apply
to the second category of class, which holds only managed resources. In this case
implementing Finalize is pointless, as managed member objects cannot safely be
accessed in the Finalizer: there is no guarantee about the ordering of Finalizer
execution, so a member may already have been finalized. So only the Dispose method
should be implemented. (If you think about it, it doesn't really make sense to call
Dispose on member objects from a Finalizer anyway, as the member object's Finalizer
will do the required cleanup.)


For classes that need to implement IDisposable and override Finalize, see Microsoft's
documented pattern.
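
As a rough sketch of the two categories, assuming a block of unmanaged memory as a stand-in for an OS handle: UnmanagedBuffer and LogFile are hypothetical names, and the Dispose(bool) shape follows the commonly documented pattern rather than anything specific to this FAQ.

```csharp
using System;
using System.Runtime.InteropServices;

// First category: owns an unmanaged resource directly (here a block of
// unmanaged memory stands in for something like an OS file handle).
class UnmanagedBuffer : IDisposable
{
    private IntPtr buffer;

    public UnmanagedBuffer(int size) { buffer = Marshal.AllocHGlobal(size); }

    public bool IsDisposed { get { return buffer == IntPtr.Zero; } }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // cleanup is done, so skip finalization
    }

    // Fallback in case the caller never calls Dispose.
    ~UnmanagedBuffer() { Dispose(false); }

    protected virtual void Dispose(bool disposing)
    {
        // 'disposing' is false on the finalizer thread, where other managed
        // objects must not be touched. This class only has unmanaged state.
        if (buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(buffer);
            buffer = IntPtr.Zero;
        }
    }
}

// Second category: wraps a managed object that itself needs disposal.
// No finalizer: the FileStream takes care of its own last-resort cleanup.
class LogFile : IDisposable
{
    private readonly System.IO.FileStream stream;

    public LogFile(string path)
    {
        stream = new System.IO.FileStream(path, System.IO.FileMode.Create);
    }

    public void Dispose() { stream.Dispose(); }
}
```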


Note that some developers argue that implementing a Finalizer is always a bad
idea, as it hides a bug in your code (i.e. the lack of a Dispose call). A less
radical approach is to implement Finalize but include a Debug.Assert at the
start, thus signalling the problem in developer builds but allowing the cleanup
to occur in release builds.
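
The Debug.Assert variant could look something like the following. TrackedBuffer is a made-up class, and the unmanaged memory block is again just a placeholder resource.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

class TrackedBuffer : IDisposable
{
    private IntPtr buffer = Marshal.AllocHGlobal(16);

    public bool IsReleased { get { return buffer == IntPtr.Zero; } }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this); // a disposed object never hits the assert
    }

    ~TrackedBuffer()
    {
        // Reaching the finalizer means someone forgot to call Dispose.
        // Debug.Assert is compiled out when DEBUG is not defined, so in
        // release builds the cleanup below still runs quietly.
        Debug.Assert(false, "TrackedBuffer was not disposed");
        Release();
    }

    private void Release()
    {
        if (buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(buffer);
            buffer = IntPtr.Zero;
        }
    }
}
```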


Do I have any control over the garbage collection algorithm?

A little. For example the System.GC class exposes a Collect method, which forces
the garbage collector to collect all unreferenced objects immediately.
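
A small sketch of forcing a collection is shown below. The GcControl wrapper is hypothetical; GC.Collect, GC.WaitForPendingFinalizers and GC.CollectionCount are the System.GC members being used. The second Collect call is a common idiom for reclaiming objects whose finalizers have just run, and forcing collections like this is rarely a good idea in production code.

```csharp
using System;

static class GcControl
{
    // Forces a full collection and returns how many generation-0
    // collections took place during the call (at least one).
    public static int ForceCollection()
    {
        int before = GC.CollectionCount(0);

        GC.Collect();                  // collect all generations now
        GC.WaitForPendingFinalizers(); // let queued finalizers run
        GC.Collect();                  // reclaim objects those finalizers released

        return GC.CollectionCount(0) - before;
    }
}
```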


Also there is a gcConcurrent setting that can be specified via the application
configuration file. This specifies whether or not the garbage collector performs
some of its collection activities on a separate thread. The setting only applies
on multi-processor machines, and defaults to true.
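
For reference, turning the setting off in an application configuration file (for example a hypothetical MyApp.exe.config) looks roughly like this:

```xml
<configuration>
  <runtime>
    <gcConcurrent enabled="false" />
  </runtime>
</configuration>
```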


How can I find out what the garbage collector is doing?

Lots of interesting statistics are exported from the .NET runtime via the '.NET
CLR xxx' performance counters. Use Performance Monitor to view them.


What is the lapsed listener problem?

The lapsed listener problem is one of the primary causes of leaks in .NET applications.
It occurs when a subscriber (or 'listener') signs up for a publisher's event,
but fails to unsubscribe. The failure to unsubscribe means that the publisher
maintains a reference to the subscriber as long as the publisher is alive. For
some publishers, this may be the duration of the application.


This situation causes two problems. The obvious problem is the leakage of the
subscriber object. The other problem is the performance degradation due to the
publisher sending redundant notifications to 'zombie' subscribers.


There are at least a couple of solutions to the problem. The simplest is to
make sure the subscriber is unsubscribed from the publisher, typically by adding
an Unsubscribe() method to the subscriber. Another solution, documented by
Shawn Van Ness, is to change the publisher to use weak references in its
subscriber list.
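
Here is a minimal sketch of the Unsubscribe() approach. Publisher, Subscriber and the Changed event are illustrative names, and SubscriberCount exists only so the effect can be observed.

```csharp
using System;

class Publisher
{
    public event EventHandler Changed;

    // Number of handlers the publisher is currently keeping alive.
    public int SubscriberCount
    {
        get { return Changed == null ? 0 : Changed.GetInvocationList().Length; }
    }

    public void RaiseChanged()
    {
        EventHandler handler = Changed;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

class Subscriber
{
    private readonly Publisher publisher;
    public int Notifications { get; private set; }

    public Subscriber(Publisher publisher)
    {
        this.publisher = publisher;
        publisher.Changed += OnChanged; // publisher now references this object
    }

    public void Unsubscribe()
    {
        publisher.Changed -= OnChanged; // drop the publisher's reference
    }

    private void OnChanged(object sender, EventArgs e) { Notifications++; }
}
```

Until Unsubscribe() is called, the publisher's delegate list keeps the subscriber reachable, so the collector cannot reclaim it however long ago the rest of the application stopped using it.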


When do I need to use GC.KeepAlive?

It's very unintuitive, but the runtime can decide that an object is garbage
much sooner than you expect. More specifically, an object can become garbage
while a method is executing on the object, which is contrary to most developers'
expectations. Chris Brumme explains the issue on his blog. I've taken Chris's
code and expanded it into a full app that you can play with if you want to prove
to yourself that this is a real problem:


using System;
using System.Runtime.InteropServices;

// Thin P/Invoke wrappers around the Win32 event APIs.
class Win32
{
    [DllImport("kernel32.dll")]
    public static extern IntPtr CreateEvent( IntPtr lpEventAttributes,
        bool bManualReset, bool bInitialState, string lpName );

    [DllImport("kernel32.dll", SetLastError=true)]
    public static extern bool CloseHandle( IntPtr hObject );

    [DllImport("kernel32.dll")]
    public static extern bool SetEvent( IntPtr hEvent );
}

class EventUser
{
    public EventUser()
    {
        hEvent = Win32.CreateEvent( IntPtr.Zero, false, false, null );
    }

    // Finalizer: closes the event handle when the object is collected.
    ~EventUser()
    {
        Win32.CloseHandle( hEvent );
        Console.WriteLine("EventUser finalized");
    }

    public void UseEvent()
    {
        // Once hEvent has been copied for this call, nothing references 'this'.
        UseEventInStatic( this.hEvent );
    }

    static void UseEventInStatic( IntPtr hEvent )
    {
        //GC.Collect();
        bool bSuccess = Win32.SetEvent( hEvent );
        Console.WriteLine( "SetEvent " + (bSuccess ? "succeeded" : "FAILED!") );
    }

    IntPtr hEvent;
}

class App
{
    static void Main(string[] args)
    {
        EventUser eventUser = new EventUser();
        eventUser.UseEvent();
    }
}

If you run this code, it'll probably work fine, and you'll get the following
output:


SetEvent succeeded
EventUser finalized

However, if you uncomment the GC.Collect() call in the UseEventInStatic() method,
you'll get this output:


EventUser finalized
SetEvent FAILED!

(Note that you need to use a release build to reproduce this problem.)


So what's happening here? Well, at the point where UseEvent() calls UseEventInStatic(),
a copy is taken of the hEvent field, and there are no further references to
the EventUser object anywhere in the code. So as far as the runtime is concerned,
the EventUser object is garbage and can be collected. Normally of course the
collection won't happen immediately, so you'll get away with it, but sooner
or later a collection will occur at the wrong time, and your app will fail.


A solution to this problem is to add a call to GC.KeepAlive(this) to the end
of the UseEvent method, as Chris explains.
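
The shape of the fix can be sketched as follows. To keep the example self-contained and independent of kernel32, a hypothetical BufferUser class holding a block of unmanaged memory stands in for EventUser and its event handle; the structure of the method is the same.

```csharp
using System;
using System.Runtime.InteropServices;

class BufferUser
{
    private IntPtr buffer;

    public BufferUser()
    {
        buffer = Marshal.AllocHGlobal(sizeof(int));
        Marshal.WriteInt32(buffer, 42);
    }

    // Finalizer frees the unmanaged memory, like CloseHandle in EventUser.
    ~BufferUser() { Marshal.FreeHGlobal(buffer); }

    public int ReadValue()
    {
        int value = ReadValueInStatic(this.buffer);
        // Without this call, 'this' is not referenced after the field is
        // copied above, so the finalizer could free the buffer mid-read.
        GC.KeepAlive(this);
        return value;
    }

    static int ReadValueInStatic(IntPtr buffer)
    {
        return Marshal.ReadInt32(buffer);
    }
}
```

GC.KeepAlive does nothing at run time beyond counting as a use of the object, which is enough to keep it reachable until that point in the method.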

