% Copyright 2005-2017 Cisco Systems, Inc.
%
% Licensed under the Apache License, Version 2.0 (the "License");
% you may not use this file except in compliance with the License.
% You may obtain a copy of the License at
%
% http://www.apache.org/licenses/LICENSE-2.0
%
% Unless required by applicable law or agreed to in writing, software
% distributed under the License is distributed on an "AS IS" BASIS,
% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
% See the License for the specific language governing permissions and
% limitations under the License.
\chapter{Storage Management\label{CHPTSMGMT}}
This chapter describes aspects of the storage management system and
procedures that may be used to control its operation.
\section{Garbage Collection\label{SECTSMGMTGC}}
Scheme objects such as pairs, strings, procedures, and user-defined
records are never explicitly deallocated by a Scheme program.
Instead, the \index{storage management}storage management system
automatically reclaims the
storage associated with an object once it proves the object is no longer
accessible.
To reclaim this storage, {\ChezScheme} employs a
\index{garbage collector}garbage
collector, which runs periodically as a program runs.
Starting from a set of known \emph{roots}, e.g., the machine registers,
the garbage collector locates all accessible objects,
copies them (in most cases) in order to eliminate fragmentation
between accessible objects, and reclaims storage occupied by
inaccessible objects.
Collections are triggered automatically by the default collect-request
handler, which is invoked via a collect-request interrupt that occurs
after approximately $n$ bytes of storage have been allocated, where $n$ is
the value of the parameter
\index{\scheme{collect-trip-bytes}}\scheme{collect-trip-bytes}.
The default collect-request handler causes a collection by calling the
procedure \index{\scheme{collect}}\scheme{collect} without arguments.
The collect-request handler can be redefined by changing the value of the
parameter
\index{\scheme{collect-request-handler}}\scheme{collect-request-handler}.
A program can also cause a collection to occur between collect-request
interrupts by calling \scheme{collect} directly, with or without
arguments.
{\ChezScheme}'s collector is a \emph{generation-based} collector.
It segregates objects based on their age (roughly speaking, the
number of collections survived) and collects older objects less
frequently than younger objects.
Since younger objects tend to become inaccessible more quickly than
older objects, the result is that most collections take little
time.
The system also maintains a
\index{static generation}\emph{static} generation from
which storage is never reclaimed.
Objects are placed into the static generation only
when a heap is compacted (see
\index{\scheme{Scompact_heap}}\scheme{Scompact_heap} in
Section~\ref{SECTFOREIGNCLIB}) or when an explicitly specified
target-generation is the symbol \scheme{static}.
This is primarily useful after an application's permanent code and data
structures have been loaded and initialized, to reduce the overhead of
subsequent collections.
Nonstatic generations are numbered starting at zero for the youngest
generation up through the current value of
\index{\scheme{collect-maximum-generation}}\scheme{collect-maximum-generation}.
The storage manager places newly allocated objects into generation 0.
When \scheme{collect} is invoked without arguments, generation 0
objects that survive collection move to generation 1, generation 1
objects that survive move to generation 2, and so on, except that
objects are never moved past the maximum nonstatic generation.
Objects in the maximum nonstatic generation are collected back into
the maximum nonstatic generation.
While generation 0 is collected during each collection, older
generations are collected less frequently.
An internal counter, gc-trip, is maintained to control when each
generation is collected.
Each time \scheme{collect} is called without arguments (as from the default
collect-request handler), gc-trip is incremented by one, and the set of
generations to be collected is determined from the current value of
gc-trip and the value of
\index{\scheme{collect-generation-radix}}\scheme{collect-generation-radix}:
with a collect-generation radix of $r$, the maximum collected generation
is the highest numbered generation $g$ for which gc-trip is a
multiple of $r^g$.
If \scheme{collect-generation-radix} is set to 4, the system thus
collects generation 0 every time, generation 1 every 4 times,
generation 2 every 16 times, and so on.
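As a rough illustration of this rule, the sketch below models how the
maximum collected generation could be derived from gc-trip.
The procedure \scheme{max-collected-generation} and its arguments are
hypothetical names introduced here for illustration; they are not part
of {\ChezScheme}, and capping the result at the maximum nonstatic
generation is an assumption of the model.
\schemedisplay
; hypothetical model, not a Chez Scheme primitive: returns the highest
; generation g, no greater than max-gen, such that gc-trip is a
; multiple of r raised to the power g
(define (max-collected-generation gc-trip r max-gen)
  (let loop ([g 0])
    (if (and (< g max-gen)
             (zero? (remainder gc-trip (expt r (+ g 1)))))
        (loop (+ g 1))
        g)))

(max-collected-generation 1 4 4) ;=> 0
(max-collected-generation 4 4 4) ;=> 1
(max-collected-generation 16 4 4) ;=> 2
(max-collected-generation 32 4 4) ;=> 2
(max-collected-generation 64 4 4) ;=> 3
\endschemedisplay
\noindent
With a radix of 4, the model selects generation 1 on every fourth
collection and generation 2 on every sixteenth, in line with the
description above.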
When \scheme{collect} is invoked with arguments, the generations to be
collected and their target generations are determined by the arguments.
In addition, the first argument \var{cg} affects the value of gc-trip;
that is, gc-trip is advanced to the next $r^{cg}$ boundary, but
not past the next $r^{cg+1}$ boundary, where $r$ is the
value of \scheme{collect-generation-radix}.
It is possible to make substantial adjustments in the collector's behavior
by setting the parameters described in this section.
It is even possible to completely override the collector's default strategy for
determining when each generation is collected by redefining the
collect-request handler to call \scheme{collect} with arguments.
For example, the programmer can redefine the handler to treat the
maximum nonstatic generation as a static generation over a long
period of time by calling \scheme{collect} with arguments that
prevent the maximum nonstatic generation from being collected during
that period of time.
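A minimal sketch of such a handler follows.
The variables \scheme{oldest-frozen?} and \scheme{request-count} and
the every-fourth-request policy are assumptions made purely for
illustration; a real application would choose its own policy.
\schemedisplay
; hypothetical flag and counter, not part of Chez Scheme
(define oldest-frozen? #t)
(define request-count 0)

(collect-request-handler
  (lambda ()
    (cond
      ; normal behavior: let the system choose the generations
      [(not oldest-frozen?) (collect)]
      [else
       (set! request-count (+ request-count 1))
       (if (zero? (remainder request-count 4))
           ; collect every generation except the oldest; survivors
           ; move into the oldest generation, which is not collected
           (collect (- (collect-maximum-generation) 1))
           ; otherwise collect only generation 0
           (collect 0))])))
\endschemedisplay
\noindent
While \scheme{oldest-frozen?} is true, no call made by this handler
names the maximum nonstatic generation as a collected generation.
Setting the flag to \scheme{#f} restores the default strategy.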
Additional information on {\ChezScheme}'s collector can be found in the
report ``Don't stop the {BiBOP}: Flexible and efficient
storage management for dynamically typed languages''~\cite{Dybvig:sm}.
%----------------------------------------------------------------------------
\entryheader
\formdef{collect}{\categoryprocedure}{(collect)}
\formdef{collect}{\categoryprocedure}{(collect \var{cg})}
\formdef{collect}{\categoryprocedure}{(collect \var{cg} \var{max-tg})}
\formdef{collect}{\categoryprocedure}{(collect \var{cg} \var{min-tg} \var{max-tg})}
\returns unspecified
\listlibraries
\endentryheader
\noindent
This procedure causes the storage manager to perform a garbage
collection.
\scheme{collect} is invoked periodically without arguments by the
default collect-request handler, but it may also be called explicitly,
e.g., from a custom collect-request handler, between phases of a
computation when collection is most likely to be successful, or
before timing a computation.
In the threaded versions of {\ChezScheme}, the thread that invokes
\scheme{collect} must be the only active thread.
When called without arguments, the system determines automatically
which generations to collect and the target generation for each
collected generation as described in the lead-in to this section.
When called with arguments, the system collects all and only objects
in generations less than or equal to \var{cg} (the maximum collected
generation) into the target generation or generations determined
by \var{min-tg} (the minimum target generation) and \var{max-tg}
(the maximum target generation).
Specifically, the target generation for any object in a collected
generation \var{g} is
$\mbox{min}(\mbox{max}(\mbox{\emph{g}}+1,\mbox{\emph{min-tg}}),\mbox{\emph{max-tg}})$, where
\scheme{static} is taken to have the value one greater
than the maximum nonstatic generation.
If present, \var{cg} must be a nonnegative fixnum no greater than
the maximum nonstatic generation, i.e., the current value of the
parameter \scheme{collect-maximum-generation}.
If present, \var{max-tg} must be a nonnegative fixnum or the symbol
\scheme{static} and either equal to \var{cg} or one greater than
\var{cg}, again treating \scheme{static} as having the value one
greater than the maximum nonstatic generation.
If \var{max-tg} is not present (but \var{cg} is), it defaults to
\var{cg} if \var{cg} is equal to the maximum nonstatic generation and
to one more than \var{cg} otherwise.
If present, \var{min-tg} must be a nonnegative fixnum or the symbol
\scheme{static} and no greater than \var{max-tg}, again treating
\scheme{static} as having the value one greater than the maximum
nonstatic generation.
Unless \var{max-tg} is the same as \var{cg}, \var{min-tg} must also
be greater than \var{cg}.
If \var{min-tg} is not present (but \var{cg} is), it defaults to
the same value as \var{max-tg}.
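The target-generation rule given above can be modeled with a small
procedure.
The name \scheme{target-generation} and its fourth argument, which
stands for the maximum nonstatic generation, are hypothetical and
serve only to make the arithmetic concrete; the model returns a number
even when the target is \scheme{static}.
\schemedisplay
; hypothetical model, not a Chez Scheme primitive
(define (target-generation g min-tg max-tg max-nonstatic)
  ; treat static as one greater than the maximum nonstatic generation
  (define (gen-value x)
    (if (eq? x 'static) (+ max-nonstatic 1) x))
  (min (max (+ g 1) (gen-value min-tg)) (gen-value max-tg)))

; modeling (collect 2 0 2): each generation ages by one, capped at 2
(target-generation 0 0 2 4) ;=> 1
(target-generation 1 0 2 4) ;=> 2
(target-generation 2 0 2 4) ;=> 2

; modeling (collect 2 3), where min-tg defaults to max-tg
(target-generation 0 3 3 4) ;=> 3

; modeling (collect 4 'static) with a maximum nonstatic generation of 4
(target-generation 0 'static 'static 4) ;=> 5
\endschemedisplay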
%----------------------------------------------------------------------------
\entryheader
\formdef{collect-rendezvous}{\categoryprocedure}{(collect-rendezvous)}
\returns unspecified
\listlibraries
\endentryheader
\noindent
Requests a garbage collection in the same way as when the system
determines that a collection should occur. All running threads are
coordinated so that one of them calls the collect-request handler, while
the other threads pause until the handler returns.
Note that if the collect-request handler (see
\scheme{collect-request-handler}) does not call \scheme{collect}, then
\scheme{collect-rendezvous} does not actually perform a garbage
collection.
%----------------------------------------------------------------------------
\entryheader
\formdef{collect-notify}{\categoryglobalparameter}{collect-notify}
\listlibraries
\endentryheader
\noindent
If \scheme{collect-notify} is set to a true value, the collector prints
a message whenever a collection is run.
\scheme{collect-notify} is set to \scheme{#f} by default.
%----------------------------------------------------------------------------
\entryheader
\formdef{collect-trip-bytes}{\categoryglobalparameter}{collect-trip-bytes}
\listlibraries
\endentryheader
\noindent
This parameter determines the approximate amount of storage that is
allowed to be allocated between garbage collections.
Its value must be a positive fixnum.
{\ChezScheme} allocates memory internally in large chunks and
subdivides these chunks via inline operations for efficiency.
The storage manager determines whether to request a collection only
once per large chunk allocated.
Furthermore, some time may elapse between when a collection is
requested by the storage manager and when the collect request is
honored, especially if interrupts are temporarily disabled via
\index{\scheme{with-interrupts-disabled}}\scheme{with-interrupts-disabled}
or \index{\scheme{disable-interrupts}}\scheme{disable-interrupts}.
Thus, \scheme{collect-trip-bytes} is an approximate measure only.
%----------------------------------------------------------------------------
\entryheader
\formdef{collect-generation-radix}{\categoryglobalparameter}{collect-generation-radix}
\listlibraries
\endentryheader
\noindent
This parameter determines how often each generation is collected
when \scheme{collect} is invoked without arguments, as by the default
collect-request handler.
Its value must be a positive fixnum.
Generations are collected once every $r^g$ times a collection occurs,
where $r$ is the
value of \scheme{collect-generation-radix} and $g$ is the generation
number.
Setting \scheme{collect-generation-radix} to one forces all generations
to be collected each time a collection occurs.
Setting \scheme{collect-generation-radix} to a very large number
effectively delays collection of older generations indefinitely.
%----------------------------------------------------------------------------
\entryheader
\formdef{collect-maximum-generation}{\categoryglobalparameter}{collect-maximum-generation}
\listlibraries
\endentryheader
This parameter determines the maximum nonstatic generation, hence the
total number of generations, currently in use.
Its value is an exact integer in the range 1 through 254.
When set to 1, only two nonstatic generations are used; when set to 2,
three nonstatic generations are used, and so on.
When set to 254, 255 nonstatic generations are used, plus the single
static generation for a total of 256 generations.
Increasing the number of generations effectively decreases how often old
objects are collected, potentially decreasing collection overhead but
potentially increasing the number of inaccessible objects retained in the
system and thus the total amount of memory required.
%----------------------------------------------------------------------------
\entryheader
\formdef{collect-request-handler}{\categoryglobalparameter}{collect-request-handler}
\listlibraries
\endentryheader
\noindent
The value of \scheme{collect-request-handler} must be a procedure.
The procedure is invoked without arguments whenever the
system determines that a collection should occur, i.e., some time after
an amount of storage determined by the parameter
\scheme{collect-trip-bytes} has been allocated since the last
collection.
By default, \scheme{collect-request-handler} simply invokes
\scheme{collect} without arguments.
Automatic collection may be disabled by setting
\scheme{collect-request-handler} to a procedure that does nothing,
e.g.:
\schemedisplay
(collect-request-handler void)
\endschemedisplay
Collection can also be temporarily disabled using
\scheme{critical-section}, which prevents any interrupts from
being handled.
In the threaded versions of {\ChezScheme}, the collect-request
handler is invoked by a single thread with all other threads
temporarily suspended.
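Because \scheme{collect-request-handler} is a parameter, one possible
way to suspend automatic collection for the duration of a computation
is to rebind it with \scheme{parameterize}.
The helper \scheme{call-without-collection} below is a hypothetical
name, and since the parameter is global, the sketch assumes that no
other thread depends on the handler in the meantime.
\schemedisplay
; hypothetical helper, not part of Chez Scheme
(define (call-without-collection thunk)
  ; collect requests are ignored while thunk runs; the previous
  ; handler is reinstated when parameterize exits
  (parameterize ([collect-request-handler void])
    (thunk)))

(call-without-collection (lambda () (+ 1 2))) ;=> 3
\endschemedisplay
\noindent
Storage allocated while the handler is disabled is not reclaimed until
a later collection, so this approach is best reserved for short
computations.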
%----------------------------------------------------------------------------
\entryheader
\formdef{release-minimum-generation}{\categoryglobalparameter}{release-minimum-generation}
\listlibraries
\endentryheader
This parameter's value must be between 0 and the value of
\scheme{collect-maximum-generation}, inclusive, and defaults to the
value of \scheme{collect-maximum-generation}.
As new data is allocated and collections occur, the storage-management
system automatically requests additional virtual memory address space
from the operating system.
Correspondingly, in the event the heap shrinks significantly, the system
attempts to return some of the virtual memory previously obtained from
the operating system.
By default, the system attempts to do so only after a collection that
targets the maximum nonstatic generation.
The system can be asked to do so after collections
targeting younger generations as well by altering the value of
\scheme{release-minimum-generation} to something less than the value
of \scheme{collect-maximum-generation}.
When the generation to which the parameter is set, or any older
generation, is the target generation of a collection, the storage
management system attempts to return unneeded virtual memory to the
operating system following the collection.
When \scheme{collect-maximum-generation} is set to a new value \var{g},
\scheme{release-minimum-generation} is implicitly set to \var{g} as well
if (a) the two parameters have the same value before the change, or (b)
\scheme{release-minimum-generation} has a value greater than \var{g}.
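The implicit-update rule can be stated as a small model.
The procedure \scheme{release-min-after-change} is a hypothetical name
used only to illustrate conditions (a) and (b); it does not read or
set the actual parameters.
\schemedisplay
; hypothetical model: the value release-minimum-generation would have
; after collect-maximum-generation changes from old-max to new-max
(define (release-min-after-change old-max old-release new-max)
  (if (or (= old-release old-max)   ; condition (a)
          (> old-release new-max))  ; condition (b)
      new-max
      old-release))

(release-min-after-change 4 4 6) ;=> 6
(release-min-after-change 4 2 6) ;=> 2
(release-min-after-change 4 3 2) ;=> 2
\endschemedisplay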
%----------------------------------------------------------------------------
\entryheader
\formdef{heap-reserve-ratio}{\categoryglobalparameter}{heap-reserve-ratio}
\listlibraries
\endentryheader
This parameter determines the approximate amount of memory reserved (not
returned to the O/S as described in the entry for \scheme{release-minimum-generation})
in proportion to the amount currently occupied, excluding areas
of memory that have been made static.
Its value must be an inexact nonnegative flonum value; if set to an exact
real value, the exact value is converted to an inexact value.
The default value, 1.0, reserves one page of memory for each currently
occupied nonstatic page.
Setting it to a smaller value may result in a smaller average virtual
memory footprint, while setting it to a larger value may result in fewer
calls into the operating system to request and free memory space.
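The conversion of exact values can be observed with a short
experiment.
The sketch uses \scheme{parameterize} so that the previous setting is
restored afterward; the value 2 is arbitrary.
\schemedisplay
; an exact value is converted to a flonum when the parameter is set
(parameterize ([heap-reserve-ratio 2])
  (heap-reserve-ratio)) ;=> 2.0
\endschemedisplay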
\section{Weak Pairs, Ephemeron Pairs, and Guardians\label{SECTGUARDWEAKPAIRS}}
\index{weak pairs}\index{weak pointers}\emph{Weak pairs} allow programs
to maintain \emph{weak pointers} to objects.
A weak pointer to an object does not prevent the object from being
reclaimed by the storage management system, but it does remain valid as
long as the object is otherwise accessible in the system.
\index{ephemeron pairs}\emph{Ephemeron pairs} are like weak pairs, but
ephemeron pairs combine two pointers where the second is retained only
as long as the first is retained.
\index{guardians}\emph{Guardians}
allow programs to protect objects from deallocation
by the garbage collector and to determine when the objects would
otherwise have been deallocated.
Weak pairs, ephemeron pairs, and guardians allow programs to retain
information about objects in separate data structures (such as hash
tables) without concern that maintaining this information will cause
the objects to remain indefinitely in the system. Ephemeron pairs
allow such data structures to retain key--value combinations
where a value may refer to its key, but the combination
can be reclaimed if neither must be saved otherwise.
In addition, guardians allow objects to be saved from deallocation
indefinitely so that they can be reused or so that clean-up or other
actions can be performed using the data stored within the objects.
The implementation of guardians and weak pairs used by {\ChezScheme}
is described in~\cite{Dybvig:guardians}. Ephemerons are described
in~\cite{Hayes:ephemerons}, but the implementation in {\ChezScheme}
avoids quadratic-time worst-case behavior.
%----------------------------------------------------------------------------
\entryheader\label{desc:weak-cons}
\formdef{weak-cons}{\categoryprocedure}{(weak-cons \var{obj_1} \var{obj_2})}
\returns a new weak pair
\listlibraries
\endentryheader
\noindent
\var{obj_1} becomes the car and \var{obj_2} becomes the cdr of the
new pair.
Weak pairs are indistinguishable from ordinary pairs in all but two ways:
\begin{itemize}
\item weak pairs can be distinguished from pairs using the
\scheme{weak-pair?} predicate, and
\item weak pairs maintain a weak pointer to the object in the
car of the pair.
\end{itemize}
\noindent
The weak pointer in the car of a weak pair is just like a normal
pointer as long as the object to which it points is accessible through
a normal (nonweak) pointer somewhere in the system.
If at some point the garbage collector recognizes that there are no
nonweak pointers to the object, however, it replaces each weak pointer
to the object with the ``broken weak-pointer'' object, \scheme{#!bwp},
and discards the object.
The cdr field of a weak pair is \emph{not} a weak pointer, so
weak pairs may be used to form lists of weakly held objects.
These lists may be manipulated using ordinary list-processing
operations such as \scheme{length}, \scheme{map}, and \scheme{assv}.
(Procedures like \scheme{map} that produce list structure always
produce lists formed from nonweak pairs, however, even when their input
lists are formed from weak pairs.)
Weak pairs may be altered using \scheme{set-car!} and \scheme{set-cdr!}; after
a \scheme{set-car!} the car field contains a weak pointer to the new
object in place of the old object.
Weak pairs are especially useful for building association pairs
in association lists or hash tables.
Weak pairs are printed in the same manner as ordinary pairs; there
is no reader syntax for weak pairs.
As a result, weak pairs become normal pairs when they are written
and then read.
\schemedisplay
(define x (cons 'a 'b))
(define p (weak-cons x '()))
(car p) ;=> (a . b)

(define x (cons 'a 'b))
(define p (weak-cons x '()))
(set! x '*)
(collect)
(car p) ;=> #!bwp
\endschemedisplay
\noindent
The latter example above may in fact return \scheme{(a . b)} if a
garbage collection promoting the pair into an older generation occurs
prior to the assignment of \scheme{x} to \scheme{*}.
It may be necessary to force an older generation collection to allow
the object to be reclaimed.
The storage management system guarantees only that the object
will be reclaimed eventually once all nonweak pointers to it are
dropped, but makes no guarantees about when this will occur.
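The list-processing behavior described earlier in this entry can be
sketched as follows.
The variables \scheme{a}, \scheme{b}, and \scheme{ls} are introduced
here for illustration, and both elements are kept accessible through
ordinary bindings so that the results do not depend on when a
collection occurs.
\schemedisplay
(define a (list 'first))
(define b (list 'second))
; a list whose pairs hold their elements weakly
(define ls (weak-cons a (weak-cons b '())))

(length ls) ;=> 2
(weak-pair? (cdr ls)) ;=> #t
(map car ls) ;=> (first second)
; map builds its result from ordinary pairs
(weak-pair? (map values ls)) ;=> #f
\endschemedisplay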
%----------------------------------------------------------------------------
\entryheader
\formdef{weak-pair?}{\categoryprocedure}{(weak-pair? \var{obj})}
\returns \scheme{#t} if \var{obj} is a weak pair, \scheme{#f} otherwise
\listlibraries
\endentryheader
\schemedisplay
(weak-pair? (weak-cons 'a 'b)) ;=> #t
(weak-pair? (cons 'a 'b)) ;=> #f
(weak-pair? "oops") ;=> #f
\endschemedisplay
%----------------------------------------------------------------------------
\entryheader\label{desc:ephemeron-cons}
\formdef{ephemeron-cons}{\categoryprocedure}{(ephemeron-cons \var{obj_1} \var{obj_2})}
\returns a new ephemeron pair
\listlibraries
\endentryheader
\noindent
\var{obj_1} becomes the car and \var{obj_2} becomes the cdr of the
new pair.
Ephemeron pairs are indistinguishable from ordinary pairs in all but two ways:
\begin{itemize}
\item ephemeron pairs can be distinguished from pairs using the
\scheme{ephemeron-pair?} predicate, and
\item ephemeron pairs maintain a weak pointer to the object in the
car of the pair, and the cdr of the pair is preserved only as long
as the car of the pair is preserved.
\end{itemize}
\noindent
An ephemeron pair behaves like a weak pair, but the cdr is treated
specially in addition to the car: the cdr of an ephemeron is set to
\scheme{#!bwp} at the same time that the car is set to \scheme{#!bwp}.
Since the car and cdr fields are set to \scheme{#!bwp} at the same
time, the fact that the car object may be referenced through the
cdr object does not by itself imply that the car must be preserved (unlike
a weak pair); instead, the car must be saved for some reason
independent of the cdr object.
Like weak pairs and other pairs, ephemeron pairs may be altered using
\scheme{set-car!} and \scheme{set-cdr!}, and ephemeron pairs are
printed in the same manner as ordinary pairs; there is no reader
syntax for ephemeron pairs.
\schemedisplay
(define x (cons 'a 'b))
(define p (ephemeron-cons x x))
(car p) ;=> (a . b)
(cdr p) ;=> (a . b)

(define x (cons 'a 'b))
(define p (ephemeron-cons x x))
(set! x '*)
(collect)
(car p) ;=> #!bwp
(cdr p) ;=> #!bwp

(define x (cons 'a 'b))
(define p (weak-cons x x)) ; \var{not an ephemeron pair}
(set! x '*)
(collect)
(car p) ;=> (a . b)
(cdr p) ;=> (a . b)
\endschemedisplay
\noindent
As with weak pairs, the last two expressions of the middle example
above may in fact return \scheme{(a . b)} if a garbage collection
promoting the pair into an older generation occurs prior to the
assignment of \scheme{x} to \scheme{*}. In the last example above,
however, the results of the last two expressions will always be
\scheme{(a . b)}, because the cdr of a weak pair holds a non-weak
reference, and that non-weak reference prevents the car field from becoming
\scheme{#!bwp}.
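One way ephemeron pairs might be used for key--value associations is
sketched below.
The table representation (a list of ephemeron pairs) and the names
\scheme{eph-table}, \scheme{table-put!}, \scheme{table-ref}, and
\scheme{table-prune!} are assumptions made for illustration; they are
not part of {\ChezScheme}.
\schemedisplay
; hypothetical association table: a list of ephemeron pairs
(define eph-table '())

(define (table-put! key val)
  (set! eph-table (cons (ephemeron-cons key val) eph-table)))

(define (table-ref key default)
  (let loop ([ls eph-table])
    (cond
      [(null? ls) default]
      [(eq? (car (car ls)) key) (cdr (car ls))]
      [else (loop (cdr ls))])))

; drop entries whose key has been reclaimed by the collector
(define (table-prune!)
  (set! eph-table
    (let loop ([ls eph-table])
      (cond
        [(null? ls) '()]
        [(bwp-object? (car (car ls))) (loop (cdr ls))]
        [else (cons (car ls) (loop (cdr ls)))]))))

(define k (list 'key))
; the value refers to its own key
(table-put! k (cons 'value-for k))
(table-ref k #f) ;=> (value-for key)
\endschemedisplay
\noindent
Although the value refers to its key, the entry does not by itself keep
the key alive.
Once \scheme{k} is otherwise inaccessible, a later collection may set
both fields of the entry to \scheme{#!bwp}, after which
\scheme{table-prune!} removes it.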
%----------------------------------------------------------------------------
\entryheader
\formdef{ephemeron-pair?}{\categoryprocedure}{(ephemeron-pair? \var{obj})}
\returns \scheme{#t} if \var{obj} is an ephemeron pair, \scheme{#f} otherwise
\listlibraries
\endentryheader
\schemedisplay
(ephemeron-pair? (ephemeron-cons 'a 'b)) ;=> #t
(ephemeron-pair? (cons 'a 'b)) ;=> #f
(ephemeron-pair? (weak-cons 'a 'b)) ;=> #f
(ephemeron-pair? "oops") ;=> #f
\endschemedisplay
%----------------------------------------------------------------------------
\entryheader
\formdef{bwp-object?}{\categoryprocedure}{(bwp-object? \var{obj})}
\returns \scheme{#t} if \var{obj} is the broken weak-pointer object, \scheme{#f} otherwise
\listlibraries
\endentryheader
\schemedisplay
(bwp-object? #!bwp) ;=> #t
(bwp-object? 'bwp) ;=> #f

(define x (cons 'a 'b))
(define p (weak-cons x '()))
(set! x '*)
(collect (collect-maximum-generation))
(car p) ;=> #!bwp
(bwp-object? (car p)) ;=> #t
\endschemedisplay
%----------------------------------------------------------------------------
\entryheader
\formdef{make-guardian}{\categoryprocedure}{(make-guardian)}
\returns a new guardian
\listlibraries
\endentryheader
\noindent
Guardians are represented by procedures that encapsulate groups of
objects registered for preservation.
When a guardian is created, the group of registered objects is empty.
An object is registered with a guardian by passing the object as an
argument to the guardian:
\schemedisplay
(define G (make-guardian))
(define x (cons 'aaa 'bbb))
x ;=> (aaa . bbb)
(G x)
\endschemedisplay
It is also possible to specify a ``representative'' object when
registering an object.
Continuing the above example:
\schemedisplay
(define y (cons 'ccc 'ddd))
y ;=> (ccc . ddd)
(G y 'rep)
\endschemedisplay
The group of registered objects associated with a guardian is logically
subdivided into two disjoint subgroups: a subgroup referred to
as ``accessible'' objects, and one referred to as ``inaccessible'' objects.
Inaccessible objects are objects that have been proven to be
inaccessible (except through the guardian mechanism itself or through
the car field of a weak or ephemeron pair), and
accessible objects are objects that have not been proven so.
The word ``proven'' is important here: it may be that some objects in
the accessible group are indeed inaccessible but
that this has not yet been proven.
This proof may not be made in some cases until long after the object
actually becomes inaccessible (in the current implementation, until a
garbage collection of the generation containing the object occurs).
Objects registered with a guardian are initially placed in the accessible
group and are moved into the inaccessible group at some point after they
become inaccessible.
Objects in the inaccessible group are retrieved by invoking the guardian
without arguments.
If there are no objects in the inaccessible group, the guardian returns
\scheme{#f}.
Continuing the above example:
\schemedisplay
(G) ;=> #f
(set! x #f)
(set! y #f)
(collect)
(G) ;=> (aaa . bbb) ; \var{this might come out second}
(G) ;=> rep ; \var{and this first}
(G) ;=> #f
\endschemedisplay
\noindent
The initial call to \scheme{G} returns \scheme{#f}, since the pairs bound
to \scheme{x} and \scheme{y} are the
only objects registered with \scheme{G}, and the pairs are still accessible
through those bindings.
When \scheme{collect} is called, the objects shift into the inaccessible group.
The two calls to \scheme{G} therefore return the pair previously bound to
\scheme{x} and the representative of the pair previously bound to \scheme{y},
though perhaps in the other order from the one shown.
(As noted above for weak pairs, the call to \scheme{collect} may not actually be
sufficient to prove the object inaccessible, if the object has
migrated into an older generation.)
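Since a guardian returns \scheme{#f} once its inaccessible group is
empty, retrieval is commonly written as a loop.
The helper \scheme{guardian-drain} below is a hypothetical name, and
the sketch assumes that \scheme{#f} itself is never registered or used
as a representative, since it could not be told apart from the
``empty'' result.
\schemedisplay
; hypothetical helper: retrieve every object currently in the
; guardian's inaccessible group, in the order returned
(define (guardian-drain g)
  (let loop ([acc '()])
    (let ([x (g)])
      (if x
          (loop (cons x acc))
          (reverse acc)))))

(define H (make-guardian))
(guardian-drain H) ;=> ()
\endschemedisplay
\noindent
Objects that are still accessible, or that have not yet been proven
inaccessible, are not returned by such a loop and remain registered.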
Although an object registered without a representative and returned from
a guardian has been proven otherwise
inaccessible (except possibly via the car field of a weak or ephemeron pair), it has
not yet been reclaimed by the storage management system and will not be
reclaimed until after the last nonweak pointer to it within or outside
of the guardian system has been dropped.
In fact, objects that have been retrieved from a guardian have no
special status in this or in any other regard.
This feature circumvents the problems that might otherwise arise with
shared or cyclic structure.
A shared or cyclic structure consisting of inaccessible objects is
preserved in its entirety, and each piece registered for preservation
with any guardian is placed in the inaccessible set for that guardian.
The programmer then has complete control over the order in which pieces
of the structure are processed.
An object may be registered with a guardian more than once, in which
case it will be retrievable more than once:
\schemedisplay
(define G (make-guardian))
(define x (cons 'aaa 'bbb))
(G x)
(G x)
(set! x #f)
(collect)
(G) ;=> (aaa . bbb)
(G) ;=> (aaa . bbb)
\endschemedisplay
\noindent
It may also be registered with more than one guardian, and guardians
themselves can be registered with other guardians.
An object that has been registered with a guardian without a
representative and placed in
the car field of a weak or ephemeron pair remains in the car field of the
weak or ephemeron pair until after it has been returned from the guardian and
dropped by the program or until the guardian itself is dropped.
\schemedisplay
(define G (make-guardian))
(define x (cons 'aaa 'bbb))
(define p (weak-cons x '()))
(G x)
(set! x #f)
(collect)
(define y (G))
y ;=> (aaa . bbb)
(car p) ;=> (aaa . bbb)
(set! y #f)
(collect 1)
(car p) ;=> #!bwp
\endschemedisplay
\noindent
(The first collector call above would
promote the object at least into generation~1, requiring the second
collector call to be a generation~1 collection.
This can also be forced by invoking \scheme{collect} several times.)
On the other hand, if a representative (other than the object itself)
is specified, the guarded object is dropped from the car field of the
weak or ephemeron pair at the same time as the representative becomes available
from the guardian.
\schemedisplay
(define G (make-guardian))
(define x (cons 'aaa 'bbb))
(define p (weak-cons x '()))
(G x 'rep)
(set! x #f)
(collect)
(G) ;=> rep
(car p) ;=> #!bwp
\endschemedisplay
The following example illustrates that the object is deallocated and
the car field of the weak pair set to \scheme{#!bwp} when the guardian
itself is dropped:
\schemedisplay
(define G (make-guardian))
(define x (cons 'aaa 'bbb))
(define p (weak-cons x '()))
(G x)
(set! x #f)
(set! G #f)
(collect)
(car p) ;=> #!bwp
\endschemedisplay
The example below demonstrates how guardians might be used to
deallocate external storage, such as storage managed by the C library
``malloc'' and ``free'' operations.
\schemedisplay
(define malloc
  (let ([malloc-guardian (make-guardian)])
    (lambda (size)
      ; first free any storage that has been dropped. to avoid long
      ; delays, it might be better to deallocate no more than, say,
      ; ten objects for each one allocated
      (let f ()
        (let ([x (malloc-guardian)])
          (when x
            (do-free x)
            (f))))
      ; then allocate and register the new storage
      (let ([x (do-malloc size)])
        (malloc-guardian x)
        x))))
\endschemedisplay
\noindent
\scheme{do-malloc} must return a Scheme object ``header'' encapsulating a pointer to the
external storage (perhaps as an unsigned integer), and all access to the
external storage must be made through this header.
In particular, care must be taken that no pointers to the external storage
exist outside of Scheme after the corresponding header has been
dropped.
\scheme{do-free} must deallocate the external storage using the encapsulated
pointer.
Both primitives can be defined in terms of \scheme{foreign-alloc}
and \scheme{foreign-free} or the C-library ``malloc'' and ``free''
operators, imported as foreign procedures. (See
Chapter~\ref{CHPTFOREIGN}.)
If it is undesirable to wait until \scheme{malloc} is called to free dropped
storage previously allocated by \scheme{malloc}, a collect-request handler
can be used instead to check for and free dropped storage, as shown below.
\schemedisplay
(define malloc)
(let ([malloc-guardian (make-guardian)])
  (set! malloc
    (lambda (size)
      ; allocate and register the new storage
      (let ([x (do-malloc size)])
        (malloc-guardian x)
        x)))
  (collect-request-handler
    (lambda ()
      ; first, invoke the collector
      (collect)
      ; then free any storage that has been dropped
      (let f ()
        (let ([x (malloc-guardian)])
          (when x
            (do-free x)
            (f)))))))
\endschemedisplay
%% for testing:
% (define do-malloc (lambda (x) (list x)))
% (define do-free (lambda (x) (printf "freeing ~s~%" (car x))))
% (define a (malloc 1))
% (malloc 10)
% (let f () (cons f f) (f))
With a bit of refactoring, it would be possible to register
the encapsulated foreign address as a representative with
each header, in which case \scheme{do-free} would take just the
foreign address as an argument.
This would allow the header to be dropped from the Scheme
heap as soon as it becomes inaccessible.
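
One possible shape for this refactoring is sketched below.
It assumes a hypothetical record type \scheme{ext-block} for the header
and calls \scheme{foreign-alloc} and \scheme{foreign-free} directly in
place of \scheme{do-malloc} and \scheme{do-free}; these choices are
illustrative only.
\schemedisplay
; hypothetical header type: wraps the address of the external storage
(define-record-type ext-block
  (fields addr))

(define malloc)
(let ([malloc-guardian (make-guardian)])
  (set! malloc
    (lambda (size)
      (let* ([addr (foreign-alloc size)]
             [x (make-ext-block addr)])
        ; register the header with the address as its representative
        (malloc-guardian x addr)
        x)))
  (collect-request-handler
    (lambda ()
      (collect)
      ; the guardian now returns addresses rather than headers
      (let f ()
        (let ([addr (malloc-guardian)])
          (when addr
            (foreign-free addr)
            (f)))))))
\endschemedisplay
\noindent
Because the guardian hands back only the address, nothing in this
sketch needs the header itself once it has been dropped.
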
Guardians can also be created via
\index{\scheme{ftype-guardian}}\scheme{ftype-guardian}, which
supports reference counting of foreign objects.
%----------------------------------------------------------------------------
\entryheader
\formdef{guardian?}{\categoryprocedure}{(guardian? \var{obj})}
\returns \scheme{#t} if \var{obj} is a guardian, \scheme{#f} otherwise
\listlibraries
\endentryheader
\schemedisplay
(guardian? (make-guardian)) ;=> #t
(guardian? (ftype-guardian iptr)) ;=> #t
(guardian? (lambda x x)) ;=> #f
(guardian? "oops") ;=> #f
\endschemedisplay
%----------------------------------------------------------------------------
\entryheader
\formdef{unregister-guardian}{\categoryprocedure}{(unregister-guardian \var{guardian})}
\returns see below
\listlibraries
\endentryheader
\noindent
\scheme{unregister-guardian} unregisters the
as-yet unresurrected objects currently registered with the guardian,
with one caveat.
The caveat, which applies only to threaded versions of {\ChezScheme},
is that objects registered with the guardian by other threads since
the last garbage collection might not be unregistered.
To ensure that all objects are unregistered in a multithreaded
application, a single thread can be used both to register and
unregister objects.
Alternatively, an application can arrange to define a
\index{\scheme{collect-request-handler}}collect-request
handler that calls \scheme{unregister-guardian} after it calls
\scheme{collect}.
In any case, \scheme{unregister-guardian} returns a list containing each object
(or its representative, if specified) that it unregisters, with
duplicates as appropriate if the same object is registered more
than once with the guardian.
Objects already resurrected but not yet retrieved from the guardian
are not included in the list but remain retrievable from the
guardian.
In the current implementation, \scheme{unregister-guardian} takes time proportional
to the number of unresurrected objects currently registered with
all guardians, not just the number registered with
the guardian passed as its argument.
The example below assumes no collections occur except for those resulting from
explicit calls to \scheme{collect}.
\schemedisplay
(define g (make-guardian))
(define x (cons 'a 'b))
(define y (cons 'c 'd))
(g x)
(g x)
(g y)
(g y)
(set! y #f)
(collect 0 0)
(unregister-guardian g) ;=> ((a . b) (a . b))
(g) ;=> (c . d)
(g) ;=> (c . d)
(g) ;=> #f
\endschemedisplay
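
When an object is registered with a representative, the returned list
holds the representative rather than the object, as described above.
The short sketch below illustrates this under the same assumption that
no collections occur other than the explicit one; the symbol
\scheme{x-rep} is an arbitrary representative chosen for illustration.
\schemedisplay
(define g (make-guardian))
(define x (cons 'a 'b))
(g x 'x-rep)  ; register x with x-rep as its representative
(unregister-guardian g) ;=> (x-rep)
(set! x #f)
(collect)
(g) ;=> #f    ; nothing remains registered with g
\endschemedisplay
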
\scheme{unregister-guardian} can also be used to unregister ftype
pointers registered with guardians created by
\index{\scheme{ftype-guardian}}\scheme{ftype-guardian}
(Section~\ref{SECTTHREADFTYPEGUARDIANS}).
\section{Locking Objects\label{SECTSMGMTLOCKING}}
All pointers from C variables or data structures to Scheme objects
should generally be discarded before entry (or reentry) into Scheme.
When this guideline cannot be followed, the object may be
\emph{locked} via \scheme{lock-object} or via the equivalent
C library procedure \index{\scheme{Slock_object}}\scheme{Slock_object}
(Section~\ref{SECTFOREIGNCLIB}).
%----------------------------------------------------------------------------
\entryheader
\formdef{lock-object}{\categoryprocedure}{(lock-object \var{obj})}
\returns unspecified
\listlibraries
\endentryheader
\noindent
Locking an object prevents the storage manager from reclaiming or
relocating the object.
Locking should be used sparingly, as it introduces memory fragmentation
and increases storage management overhead.
Locking can also lead to accidental retention of storage if objects
are not unlocked.
Objects may be unlocked via \scheme{unlock-object} or the equivalent
C library procedure
\index{\scheme{Sunlock_object}}\scheme{Sunlock_object}.
Locking immediate values, such as fixnums, booleans, and characters,
or objects that have been made static is unnecessary but harmless.
%----------------------------------------------------------------------------
\entryheader
\formdef{unlock-object}{\categoryprocedure}{(unlock-object \var{obj})}
\returns unspecified
\listlibraries
\endentryheader
\noindent
An object may be locked more than once by successive calls to
\scheme{lock-object}, \scheme{Slock_object}, or both, in which case it must
be unlocked by an equal number of calls to
\scheme{unlock-object} or \scheme{Sunlock_object} before it is
truly unlocked.
An object contained within a locked object, such as an object in the
car of a locked pair, need not also be locked unless a separate C
pointer to the object exists.
That is, if the inner object is accessed only via an indirection of the
outer object, it should be left unlocked so that the collector is free
to relocate it during collection.
Unlocking immediate values, such as fixnums, booleans, and characters,
or objects that have been made static is unnecessary and ineffective but harmless.
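
The counting behavior described above can be observed with
\scheme{locked-object?}.
The following is a small sketch in which the pair \scheme{p} stands in
for an object that C code might retain a pointer to.
\schemedisplay
(define p (cons 'a 'b))
(lock-object p)
(lock-object p)         ; locked twice
(unlock-object p)
(locked-object? p) ;=> #t   ; one lock remains
(unlock-object p)
(locked-object? p) ;=> #f   ; now truly unlocked
\endschemedisplay
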
%----------------------------------------------------------------------------
\entryheader
\formdef{locked-object?}{\categoryprocedure}{(locked-object? \var{obj})}
\returns \scheme{#t} if \var{obj} is locked, immediate, or static, \scheme{#f} otherwise
\listlibraries
\endentryheader
\noindent
This predicate returns true if \var{obj} cannot be relocated or reclaimed
by the collector.
This is the case for locked objects, for immediate values, such as fixnums,
booleans, and characters, and for objects that have been made static.
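
The brief sketch below illustrates both cases, using a freshly
allocated vector \scheme{v} as an ordinary object that is neither
immediate nor static.
\schemedisplay
(locked-object? 17) ;=> #t   ; fixnums are immediate
(locked-object? #t) ;=> #t   ; booleans are immediate

(define v (vector 1 2 3))
(locked-object? v) ;=> #f
(lock-object v)
(locked-object? v) ;=> #t
(unlock-object v)
(locked-object? v) ;=> #f
\endschemedisplay
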