Wednesday, July 20, 2011

Garbage Collection

To begin, it is first necessary to identify the root pointers. The root set must contain references somewhere on the stack, including both local storage and the operand stack or in the constant pool. The root set must also include references contained in static objects.

(1) Mark and Sweep starts with the root references and traces through all the reachable objects, marking each one as it is reached. Marking may coinsist of setting a flag bit in the object or in a separate bitmap. Garbage objects found can be combined into a linked list of free objects. Overall, this is a relatively fast way of identifying and collecting garbage. However, the free objects are of varying size and are scattered in the heap space. This leads to memory fragmentation which requires compaction. Inefficiencies can be reduced by using segregated free lists, or in other words, by dividing the heap into a set of fixed size chunks having arrange of sizes.

(2) Compacting Collectors essentially slides the live objects to the bottom or top of the heap so all live objects are adjacent. What is left is a contiguous region of free space. Although conceptually simple, the compacting collector is relatively slow as it makes multiple passes through the heap. One pass does the marking and then subsequent passes compute the new locations for the live objects, move the objects and update all reference poniters to the new location. To reduce the number of reference update, some system uses handle pool (one objects referenced by 2 other objects are pointed to through the handle pool. The 2 referencing object point to the handle pool which in turn points to the referenced object. This introduced another level of indirection.

(3) Copying collectors divids the heap into 2 halves. At any one time, only 1 half is used. The collectors copies the objecs from one half to the other half. The memory requirement for copying collector is high compares to other collector.

(4) Generational Collectors
Objects manifests a bimodel behaviour. Most objects tend to be short-lived due to good object oriented programming practice. Objects that are not short-lived tends to have very long life-time. The heap is divided into 2 subheaps. The nursery is for newly created objects. It is garbage-collected more frequently. Any objects survives a number of collection in the nursery is moved to the tenured heap which has less frequent collection.

(5) Incremental and concurrent collectors
All the above collectors stop program execution while they perform collection. Collection time may be spread out if done incrementally. Also, in a multiprocessor envrionment, it is advantageous to collect using one thread while other threads are used to run programs. As a partially collected heap is in a state of flux, some synchronization between the collector and program is needed. For example, once the collection marks a object to be alive, it may be dereferenced by the program. One of the common solutions is to provide write barriers for references to objects that are marked.

Overall, mark and sweep provide good collection time.

No comments: