\documentclass{releasenotes}

\thisversion{Version 9.5.9}
\thatversion{Version 8.4}
\pubmonth{April}
\pubyear{2022}

\begin{document}

\maketitle

% \tableofcontents

\section{Overview}

This document outlines the changes made to {\ChezScheme} for
{\thisversion} since {\thatversion}.

{\thisversion} is supported for the following platforms.
The Chez Scheme machine type (returned by the \scheme{machine-type}
procedure) is given in parentheses.

\begin{itemize}
\item Linux x86, nonthreaded (i3le) and threaded (ti3le)
\item Linux x86\_64, nonthreaded (a6le) and threaded (ta6le)
\item MacOS X x86, nonthreaded (i3osx) and threaded (ti3osx)
\item MacOS X x86\_64, nonthreaded (a6osx) and threaded (ta6osx)
\item Linux ARMv6 (32-bit), nonthreaded (arm32le)
\item Linux PowerPC (32-bit), nonthreaded (ppc32le) and threaded (tppc32le)
\item Windows x86, nonthreaded (i3nt) and threaded (ti3nt)
\item Windows x86\_64, nonthreaded (a6nt) and threaded (ta6nt) [experimental]
%\item OpenBSD x86, nonthreaded (i3ob) and threaded (ti3ob)
%\item OpenBSD x86\_64, nonthreaded (a6ob) and threaded (ta6ob)
%\item FreeBSD x86, nonthreaded (i3fb) and threaded (ti3fb)
%\item FreeBSD x86\_64, nonthreaded (a6fb) and threaded (ta6fb)
%\item NetBSD x86, nonthreaded (i3nb) and threaded (ti3nb)
%\item NetBSD x86\_64, nonthreaded (a6nb) and threaded (ta6nb)
%\item OpenSolaris x86, nonthreaded (i3s2) and threaded (ti3s2)
%\item OpenSolaris x86\_64, nonthreaded (a6s2) and threaded (ta6s2)
\end{itemize}

This document contains three sections describing significant
(1) \href[static]{section:functionality}{functionality changes},
(2) \href[static]{section:bugfixes}{bugs fixed}, and
(3) \href[static]{section:performance}{performance enhancements}.
A version number listed in parentheses in the header for a change
indicates the first minor release or internal prerelease to support
the change.

More information on {\ChezScheme} and {\PetiteChezScheme} can
be found at \hyperlink{http://www.scheme.com/}{http://www.scheme.com},
and extensive documentation is available in
\TSPL{4}{th} (available directly from MIT Press or from online and local retailers)
and the \CSUG{9}.
Online versions of both books can be found at
\hyperlink{http://www.scheme.com/}{http://www.scheme.com}.

%-----------------------------------------------------------------------------
\section{Functionality Changes}\label{section:functionality}

\subsection{Unicode 14.0 Support (9.6.0)}

The character sets, character classes, and word-breaking algorithms for character, string,
and Unicode-related bytevector operations have now been updated to Unicode 14.0.

\subsection{Basic ftypes can be referenced, even if shadowed by syntactic binding (9.5.8)}

Previously, it was possible to interfere with the definition of ftypes by
creating a syntactic binding for one of the built-in types, such as
\scheme{integer-32}, \scheme{float}, etc.
As of 9.5.8, syntactic bindings that do not bind an ftype descriptor are no
longer considered when defining ftypes.

This change also allows a base ftype to be bound using \scheme{define-ftype}, though
doing so fixes the endianness of the type. For instance:

\schemedisplay
(define-ftype integer-32 integer-32)
\endschemedisplay

This binds the ftype \scheme{integer-32} to the native-endian \scheme{integer-32}.
It is possible to bind both endiannesses by using explicit names:

\schemedisplay
(define-ftype integer-32-be (endian big integer-32))
(define-ftype integer-32-le (endian little integer-32))
(define-ftype integer-32 integer-32) ;; fixed to native endianness
\endschemedisplay

\subsection{Improved error messages (9.5.6)}

When the reader reports an invalid bytevector element, the error
message now includes the token value only if the token type is atomic.

When the expander reports that an ellipsis is missing in a syntax form,
it now includes the name of an identifier that is missing an ellipsis
within that form.

\subsection{Additional reader syntax for booleans (9.5.6)}

The reader now case-insensitively accepts \scheme{#true} and
\scheme{#false} as alternative spellings of the booleans \scheme{#t}
and \scheme{#f}, respectively.
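
As a small illustration of the spellings described above, each of the
following expressions should evaluate to \scheme{#t} in a version with
this change. The mixed-case spellings are arbitrary choices made for
the example.

\schemedisplay
(eq? #true #t)                      ; #true reads as #t
(eq? #FALSE #f)                     ; case is ignored
(equal? '(#True #fAlSe) '(#t #f))
\endschemedisplay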

\subsection{Self-evaluating vector literals (9.5.6)}

The new parameter \scheme{self-evaluating-vectors} can be used to treat unquoted vector
literals as self-evaluating instead of syntax errors. This parameter is turned off by
default.
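
A minimal sketch of the effect at the REPL, assuming the parameter is
set before the vector literals are expanded:

\schemedisplay
(self-evaluating-vectors #t)
#(1 2 3)                   ; accepted without a quote
(vector-ref #(a b c) 1)    ; the literal is used directly as an argument
\endschemedisplay

With the default setting of \scheme{#f}, both unquoted literals above
would instead be reported as syntax errors.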

\subsection{Incremental promotion of collected objects (9.5.4)}

In previous versions of {\ChezScheme}, the collector always promoted
surviving objects from every collected generation into a single
target generation.
For example, when the target generation was 3, it promoted not only
surviving objects from generation 2 to generation 3 but also surviving
objects from generations 0 and 1 directly to generation 3.
This caused some prematurely promoted objects to be subjected to
collection less frequently than their ages justified, potentially
resulting in substantial inappropriate storage retention.
This is particularly problematic when side effects result in pointers
from the inappropriately retained objects to younger objects, as
can happen with nonfunctional queues and lazy streams.

Unless directed to do otherwise, the collector now promotes objects
up only one generation at a time.
That is, generation 0 objects that survive collection are promoted
to generation 1, generation 1 objects are promoted to generation
2, and so on.
(Objects that survive a collection of the maximum nonstatic generation
are promoted back into that generation.)
Most applications should exhibit lower peak memory usage and possibly
lower run times with this change.
Applications that are adversely affected, if any, might benefit
from a custom collect-request handler or custom values for the
collection parameters that affect the behavior of the default
handler.

\subsection{Unicode Basic Multilingual Plane console I/O in Windows (9.5.4)}

Console I/O now supports characters from the Unicode Basic
Multilingual Plane in Windows. Windows consoles do not yet support the
supplementary planes.

\subsection{Incompatible fasl-format and compiled-file compression changes (9.5.4)}

The fasl (fast-load) format now supports per-object compression.
Whether the fasl writer actually performs compression is determined
by the new \scheme{fasl-compressed} parameter, whose value defaults
to \scheme{#t}.
The compression format and level are determined by the
\scheme{compress-format} and \scheme{compress-level}
parameters.

The \scheme{compile-compressed} parameter has been eliminated.
Since compiled files are written in fasl format, the
\scheme{fasl-compressed} parameter also now controls whether compiled
files are compressed.
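
The interaction of these parameters might be sketched as follows.
The file names \scheme{example.ss} and \scheme{example-plain.so} are
hypothetical, and the particular format and level are chosen only for
illustration.

\schemedisplay
;; compile with per-object gzip compression at a high level
(parameterize ([fasl-compressed #t]
               [compress-format 'gzip]
               [compress-level 'high])
  (compile-file "example.ss"))

;; compile the same source with no compression at all
(parameterize ([fasl-compressed #f])
  (compile-file "example.ss" "example-plain.so"))
\endschemedisplay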

Because individual portions of a fasl file are already compressed
by default, attempting to compress a fasl file as a whole is often
ineffective as well as inefficient both when writing and reading
fasl objects.
Thus, in particular, the \var{output-port} and \var{wpo-port}
supplied to \scheme{compile-port} and \scheme{compile-to-port}
should not be opened for compression.
Similarly, external tools should not expect compiled files to be
compressed as a whole, nor should they compress compiled files.

Because compression of fasl files was previously encouraged and is
now discouraged, the first attempt to write fasl data to or read
fasl data from a compressed port will cause a warning to be issued,
i.e., an exception with condition type \scheme{&warning} to be
raised.

The rationale for this change is to allow the fasl reader to seek
past, without reading, portions of an object file that contain
compile-time code at run time and run-time code at compile time.

\subsection{Bytevector compression and compression level (9.5.4)}

The procedure \scheme{bytevector-compress} now selects the level of
compression based on the \scheme{compress-level} parameter.
Prior to this it always used a default setting for compression.

The \scheme{compress-level} parameter can now take on the new value
\scheme{minimum} in addition to \scheme{low}, \scheme{medium},
\scheme{high}, and \scheme{maximum}.
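
The effect of the level can be observed with a sketch like the
following, which compresses the same bytevector at each level and
reports the resulting sizes. The helper \scheme{compressed-size} and
the test data are assumptions made for the example, and the sizes
obtained depend on the data and the compression format in effect.

\schemedisplay
(define data (make-bytevector 100000 0))

;; size in bytes of data after compression at the given level
(define (compressed-size level)
  (parameterize ([compress-level level])
    (bytevector-length (bytevector-compress data))))

(map (lambda (level) (cons level (compressed-size level)))
     '(minimum low medium high maximum))
\endschemedisplay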

\subsection{Combining object files (9.5.4)}

In previous versions of Chez Scheme, multiple object files could
be combined by concatenating them into a single file. To support faster
object file loading and loadability verification (described later in this
document), recompile information and information about libraries and
top-level programs within an object file is now placed at the top of the
file. The new \scheme{concatenate-object-files} procedure can be used to
combine multiple object files while moving this information to the
top of the combined file.

\subsection{Explicitly invoking libraries (9.5.4)}

The new procedure \scheme{invoke-library} can be used to force
the evaluation of a library's body expressions (variable definition
right-hand sides and initialization expressions) before they might
otherwise be needed.
It is generally useful only for libraries whose body expressions
have side effects.

\subsection{Verifying loadability of libraries and programs (9.5.4)}

The new procedure \scheme{verify-loadability} can be used to
determine, without actually loading any object code or defining any
libraries, whether a set of object files and the object files
satisfying their library dependencies, direct or indirect, are
present, readable, and mutually compatible.
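
A hedged sketch of a pre-deployment check follows. It assumes the
calling convention given in the user's guide, in which the first
argument is one of the situations \scheme{visit}, \scheme{revisit},
or \scheme{load} and the remaining arguments name object files, and it
assumes that a failed check is signaled by raising an exception. The
file name \scheme{app.so} and the helper \scheme{loadable?} are
hypothetical.

\schemedisplay
;; returns #t if path and its library dependencies appear loadable,
;; otherwise prints the reported problem and returns #f
(define (loadable? path)
  (guard (c [#t (display-condition c) (newline) #f])
    (verify-loadability 'load path)
    #t))

(loadable? "app.so")
\endschemedisplay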

To support loadability verification, information about libraries
and top-level programs within an object file is now placed at the
top of the file, just after recompile information. This change can
be detected by unusual setups, e.g., a source file that interleaves
library definitions and top-level forms that call \scheme{library-list}, but
is backward compatible for standard use cases in which each file
contains one or more libraries possibly followed by a top-level
program.

\subsection{Unregistering objects from guardians (9.5.4)}

The set of as-yet unresurrected objects registered with a guardian
can be unregistered and retrieved by means of the new primitive
\scheme{unregister-guardian}.
Consult the user's guide for usage and caveats.
Guardians can now be distinguished from other procedures (and other
objects) via the new primitive \scheme{guardian?}.
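
For instance, the two primitives might be used as in the following
sketch, where the registered object is an arbitrary list created for
the example:

\schemedisplay
(define g (make-guardian))
(define x (list 'resource 1))
(g x)                      ; register x with g

(guardian? g)              ; #t
(guardian? car)            ; #f, an ordinary procedure

;; retrieves the objects still registered with g, here just x,
;; and removes them from the guardian
(unregister-guardian g)
\endschemedisplay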

\subsection{Coverage support and source tables (9.5.4)}

When the new parameter \scheme{generate-covin-files} is set to \scheme{#t}
rather than the default \scheme{#f}, file compilation routines such as
\scheme{compile-file} and \scheme{compile-library} produce coverage
information (\scheme{.covin}) files that can be used in conjunction with
profile information to measure coverage of a source-code base.
Coverage information is also written out when the optional \var{covop}
argument is supplied to \scheme{compile-port} and \scheme{compile-to-port}.

A covin file contains a printed representation of a \emph{source
table} mapping each profiled source object in the code base to a
count of zero.
Source tables generally associate source objects with arbitrary values
and are allocated and manipulated with hashtable-like operations specific
to source tables.

Profile information can be tracked even through releasing and clearing
of profile counters via the new procedure \scheme{with-profile-tracker},
which produces a source table.

Coverage of a source-code base can thus be measured by comparing
the set of source objects in the covin-file source tables for one
or more source files with the set of source objects in the source
tables produced by one or more runs of tests run with profile
information tracked by \scheme{with-profile-tracker}.

\subsection{Importing a library from an object file now visits the file (9.5.4)}

As described in Section~\ref{sec:faster-object-file-loading},
importing a library from an object file now causes the object file
to be visited rather than fully loaded.
If the run-time information is needed, i.e., if the library is
invoked, the file will be revisited.
This is typically transparent to the program, but problems can arise
if the program changes its current directory (via
\scheme{current-directory}) prior to invoking a library, and the
object file cannot be found.

\subsection{Recompile information (9.5.4)}

As described in Section~\ref{sec:faster-object-file-loading}, all
recompile information is now placed at the front of each object
file where it can be read without the need to scan through the
remainder of the file.
Because the library manager expects to find recompile information
at the front of an object file, it will not find all recompile
information if object files are concatenated together via some
mechanism other than the new \scheme{concatenate-object-files}
procedure.

Also, the compiler has to hold in memory the object code for all
expressions in a file so that it can emit the unified recompile
information, rather than writing to the object file incrementally,
which can significantly increase the memory required to compile a
large file full of individual top-level forms.
This does not affect top-level programs, which were already handled
as a whole, or a typical library file that contains just a single
library form.

\subsection{Optional new \protect\scheme{fasl-read} situation argument (9.5.4)}

It is now possible to direct \scheme{fasl-read} to read only visit
(compile-time) or revisit (run-time) objects via the optional new
situation argument.
Situation \scheme{visit} causes the fasl reader to skip over
revisit (run-time-only) objects, while
\scheme{revisit} causes the fasl reader to skip over
visit (compile-time-only) objects.
Situation \scheme{load} doesn't skip over any objects.
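
The following sketch collects the objects read from a fasl file under
a given situation. The helper \scheme{read-fasl-objects} and the file
name \scheme{example.so} are hypothetical, and the sketch assumes that
\scheme{fasl-read} returns an eof object once the file is exhausted.

\schemedisplay
(define (read-fasl-objects path situation)
  (let ([ip (open-file-input-port path)])
    (let loop ([acc '()])
      (let ([x (fasl-read ip situation)])
        (if (eof-object? x)
            (begin (close-port ip) (reverse acc))
            (loop (cons x acc)))))))

;; count the objects seen when skipping run-time-only entries
(length (read-fasl-objects "example.so" 'visit))
\endschemedisplay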

\subsection{Optional \protect\scheme{read-token} \protect\var{sfd} and \protect\var{bfp} arguments (9.5.4)}

In addition to the optional input-port argument, \scheme{read-token} now takes
optional \var{sfd} (source-file-descriptor) and \var{bfp} (beginning-file-position)
arguments.
If either is provided, both must be provided.
Specifying \var{sfd} and \var{bfp} improves the quality of error messages,
guarantees the \scheme{read-token} \var{start} and \var{end} return values can be determined,
and eliminates the overhead of asking for a file position on each call
to \scheme{read-token}.
\var{bfp} is normally 0 for the first call
to \scheme{read-token} at the start of a file,
and the \var{end} return value of the preceding
call for each subsequent call.

\subsection{Compression format and level (9.5.4)}

Support for LZ4 compression has been added.
LZ4 is now the default format when compressing files (including
object files produced by the compiler) and bytevectors, while {\tt
gzip} is still supported and can be enabled by setting
the new \scheme{compress-format} parameter to the symbol \scheme{gzip} instead of the
default \scheme{lz4}. Reading in compressed mode
infers the format, so reading {\tt gzip}-compressed files will still
work without changing \scheme{compress-format}. Reading LZ4-format
files tends to be much faster than reading {\tt gzip}-format files,
while {\tt gzip}-compressed files tend to be smaller.
In particular, object files created by the compiler now tend to be
larger but load more quickly.

The new \scheme{compress-level} parameter can be used to control
the amount of time spent on file and bytevector compression.
It can be set to one of the symbols \scheme{minimum}, \scheme{low},
\scheme{medium}, \scheme{high}, and \scheme{maximum}, which are
listed in order from shortest to longest compression time and least
to greatest effectiveness.
The default value is \scheme{medium}.

\subsection{Mutexes and condition variables can have names (9.5.4)}

The procedures \scheme{make-mutex} and \scheme{make-condition} now
accept an optional argument \scheme{name}, which must be a symbol
that identifies the object or \scheme{#f} for no name. The name is
printed every time the mutex or condition object is printed, which
is useful for debugging.

\subsection{Improved packaging support (9.5.1)}

The Chez Scheme \scheme{Makefile} has been enhanced with new targets for
creating binary packages for Unix-like operating systems.
The \scheme{create-tarball} target generates a binary tarball package for
distribution, the \scheme{create-rpm} target generates a Linux RPM package, and
the \scheme{create-pkg} target generates a macOS package file.

\subsection{Library search handler (9.5.1)}

The new \scheme{library-search-handler} parameter controls how library source
or object code is located when \scheme{import}, \scheme{compile-whole-program},
or \scheme{compile-whole-library} is used to load a library.
The value of the \scheme{library-search-handler} parameter must be a procedure
expecting four arguments: the \var{who} argument is a symbol that provides
context in \scheme{import-notify} messages, the \var{library} argument is the
name of the desired library, the \var{directories} argument is a list of source and
object directory pairs in the form returned by \scheme{library-directories},
and the \var{extensions} argument is a list of source and object extension
pairs in the form returned by \scheme{library-extensions}.
The default value of the \scheme{library-search-handler} is the newly exposed
\scheme{default-library-search-handler} procedure.
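
One plausible use is to wrap the current handler so that each search
is logged before being delegated. This is only a sketch: the message
format is an arbitrary choice, and the wrapper simply passes through
whatever the wrapped handler returns.

\schemedisplay
(library-search-handler
  (let ([default (library-search-handler)])
    (lambda (who library directories extensions)
      (printf "~s: searching for ~s~%" who library)
      ;; tail call, so the default handler's results are returned as is
      (default who library directories extensions))))
\endschemedisplay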

\subsection{Ftype guardians (9.5.1)}

Applications that manage memory outside the Scheme heap can leverage
new support for ftype guardians to help perform reference counting.
An ftype guardian is like an ordinary guardian except that it does
not necessarily save from collection each ftype pointer registered
with it but instead decrements (atomically) a reference count at
the head of the object to which the ftype pointer points.
If the reference count becomes zero as a result of the decrement,
it preserves the object so that it can be retrieved from the guardian
and freed; otherwise it allows it to be collected.

\subsection{Recompile information and whole-program optimization (9.5.1)}

\scheme{compile-whole-program} and \scheme{compile-whole-library}
now propagate recompile information from the named \scheme{wpo}
file to the object file to support \scheme{maybe-compile-program}
and \scheme{maybe-compile-library} in the case where the new object
file overwrites the original object file.

\subsection{Directly accessing the value of compile-time values (9.5.1)}

The value of a compile-time value created by \scheme{make-compile-time-value}
can be retrieved via the new procedure \scheme{compile-time-value-value}.
The new predicate \scheme{compile-time-value?} can be used to determine if
an object is a compile-time value.
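
A brief illustration, using an arbitrary list as the wrapped value:

\schemedisplay
(define ctv (make-compile-time-value '(version 1)))

(compile-time-value? ctv)            ; #t
(compile-time-value? '(version 1))   ; #f, a plain list
(compile-time-value-value ctv)       ; (version 1)
\endschemedisplay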

\subsection{Extracting a subset of hashtable entries (9.5.1)}

The new \scheme{hashtable-cells} function is similar to
\scheme{hashtable-entries}, but it returns a vector of cells instead
of two vectors. An optional argument to \scheme{hashtable-keys},
\scheme{hashtable-values}, \scheme{hashtable-entries}, or \scheme{hashtable-cells}
limits the size of the result vector.
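
The size limit might be used as in the following sketch, which fills a
table with 100 entries and then asks for only a few of them. Which
entries are returned is not specified here, so the example inspects
only the shape of the results.

\schemedisplay
(define ht (make-eqv-hashtable))
(do ([i 0 (+ i 1)]) ((= i 100))
  (hashtable-set! ht i (* i i)))

(vector-length (hashtable-keys ht 10))   ; at most 10

;; each cell is a pair whose car is a key and whose cdr is its value
(define cells (hashtable-cells ht 3))
(vector-map car cells)
\endschemedisplay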

\subsection{Profile data retained for reclaimed code (9.5.1)}

Profile data is now retained indefinitely even for code objects
that have been reclaimed by the garbage collector.
Previously, the counters holding the data were reclaimed by the
collector along with the code objects.
This makes profile output more complete and accurate, but it does
represent a potential space leak in programs that create or load
and release code dynamically.
Such programs can avoid the potential space leak by releasing the
counters explicitly via the new procedure
\scheme{profile-release-counters}.

\subsection{Procedure source location without inspector information (9.5.1)}

When \scheme{generate-inspector-information} is set to \scheme{#f} and
\scheme{generate-procedure-source-information} is set to \scheme{#t},
source location information is preserved for a procedure, even though
other inspector information is not preserved.

\subsection{Atomic compare-and-set (9.5.1)}

The new procedures \scheme{box-cas!} and \scheme{vector-cas!}
atomically update a box or vector with a given new value when the
current content is \scheme{eq?} to a given old value. Atomicity is
guaranteed even if multiple threads attempt to update the same box or
vector.
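
A common use of compare-and-set is a retry loop, sketched below as a
counter increment. The helper \scheme{box-increment!} is hypothetical.
The sketch assumes that \scheme{box-cas!} returns a true value only
when the update was performed and that the counter stays within the
fixnum range, so that \scheme{eq?} comparison of the old value is
reliable.

\schemedisplay
(define counter (box 0))

(define (box-increment! b)
  (let retry ()
    (let ([old (unbox b)])
      ;; try again if another thread changed the box in the meantime
      (unless (box-cas! b old (+ old 1))
        (retry)))))

(box-increment! counter)
(unbox counter)            ; 1
\endschemedisplay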

\subsection{Foreign-procedure thread activation (9.5.1)}

A new \scheme{__collect_safe} foreign-procedure convention, which can
be combined with other conventions, causes a foreign-procedure call to
deactivate the current thread during the call so that other threads can
perform a garbage collection. Similarly, the \scheme{__collect_safe}
convention modifier for callables causes the current thread to be
activated on entry to the callable, and the activation state is
reverted on exit from the callable; this activation makes callables
work from threads that are otherwise unknown to the Scheme system.

\subsection{Garbage collection and threads (9.5.1)}

A new \scheme{collect-rendezvous} function performs a garbage
collection in the same way as when the system determines that a
collection should occur. For many purposes,
\scheme{collect-rendezvous} is a variant of \scheme{collect} that
works when multiple threads are active. More precisely, the
\scheme{collect-rendezvous} function invokes the collect-request
handler (in an unspecified thread) after synchronizing all active
threads and temporarily deactivating all but the one used to call the
collect-request handler.

\subsection{Foreign-procedure struct arguments and results (9.5.1)}

A new \scheme{(& \var{ftype})} form allows a struct or union to be
passed between Scheme and a foreign procedure. The Scheme-side
representation of a \scheme{(& \var{ftype})} argument is the
same as a \scheme{(* \var{ftype})} argument, but where
\scheme{(* \var{ftype})} passes an address between the Scheme and C
worlds, \scheme{(& \var{ftype})} passes a copy of the data at the
address. When \scheme{(& \var{ftype})} is used as a result type,
an extra \scheme{(* \var{ftype})} argument must be provided to receive
the copied result, and the directly returned result is unspecified.

\subsection{Record equality and hashing (9.5, 9.5.1)}

Several new procedures and parameters allow a program to control what
\scheme{equal?} and \scheme{equal-hash} do when applied
to structures containing record instances.
The procedures \scheme{record-type-equal-procedure} and
\scheme{record-type-hash-procedure} can be used to customize the
handling of records of specific types by \scheme{equal?} and \scheme{equal-hash}, and
the procedures \scheme{record-equal-procedure} and
\scheme{record-hash-procedure} can be used to look up the
applicable (possibly inherited) equality and hashing procedures
for specific record instances.
The parameters \scheme{default-record-equal-procedure} and
\scheme{default-record-hash-procedure} can be used to control
the default behavior when comparing or hashing records without
type-specific equality and hashing procedures.
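
The following sketch gives a record type structural equality. It
assumes the conventions described in the user's guide: the equality
procedure receives the two records plus a procedure for comparing
their components, and the hashing procedure receives the record plus a
procedure for hashing its components. The \scheme{point} type and the
way the hashes are combined are illustrative choices only.

\schemedisplay
(define-record-type point (fields x y))

(record-type-equal-procedure (record-type-descriptor point)
  (lambda (p1 p2 eql?)
    (and (eql? (point-x p1) (point-x p2))
         (eql? (point-y p1) (point-y p2)))))

(record-type-hash-procedure (record-type-descriptor point)
  (lambda (p hash)
    (+ (hash (point-x p)) (* 31 (hash (point-y p))))))

;; distinct instances with equal fields now compare and hash alike
(equal? (make-point 1 2) (make-point 1 2))
(= (equal-hash (make-point 1 2)) (equal-hash (make-point 1 2)))
\endschemedisplay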

\subsection{Immutable vectors, fxvectors, bytevectors, strings, and boxes (9.5)}

Support for immutable vectors, fxvectors, bytevectors, strings, and boxes
has been added.
Immutable vectors are created via \scheme{vector->immutable-vector},
and immutable fxvectors, bytevectors, and strings are created by similarly named
procedures.
Immutable boxes are created via \scheme{box-immutable}.
Any attempt to modify an immutable object causes an exception to be raised.
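
A short sketch of the behavior described above. It assumes the
predicates \scheme{immutable-vector?} and \scheme{immutable-box?} that
accompany these procedures and traps the exception with
\scheme{guard} rather than inspecting its condition type.

\schemedisplay
(define v (vector->immutable-vector (vector 1 2 3)))
(define b (box-immutable 'fixed))

(immutable-vector? v)      ; #t
(immutable-box? b)         ; #t
(vector-ref v 0)           ; 1, reading is unaffected

;; the attempted mutation raises an exception, caught here
(guard (c [#t 'rejected])
  (vector-set! v 0 'changed))
\endschemedisplay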

\subsection{Ephemeron pairs and hashtables (9.5)}

Support for ephemeron pairs has been added, along with eq and eqv
hashtables that use ephemeron pairs to combine keys and values. An
ephemeron pair avoids the ``key in value'' problem of weak pairs,
where a weakly held key is paired to a value that refers back to the
key, in which case the key remains reachable as long as the pair is
reachable. In an ephemeron pair, the cdr of the pair is not considered
reachable by the garbage collector until both the pair and the car of
the pair have been found reachable. An ephemeron hashtable implements
a weak mapping where referencing a key in a value does not prevent the
mapping from being removed from the table.

\subsection{Optional timeout for \protect\scheme{condition-wait} (9.5)}

The \scheme{condition-wait} procedure now takes an optional
\var{timeout} argument and returns a boolean indicating whether the
thread was awakened by the condition before the timeout. The
\var{timeout} can be a time record of type \scheme{time-duration} or
\scheme{time-utc}, or it can be \scheme{#f} for no timeout (the
default).

\subsection{\protect\scheme{date-dst?} and \protect\scheme{date-zone-name} (9.5)}

The new primitive procedures \scheme{date-dst?} and
\scheme{date-zone-name} access time-zone information for a
\scheme{date} record that is created without an explicit
zone offset. The zone-offset argument to \scheme{make-date}
is now optional.

\subsection{\protect\scheme{procedure-arity-mask} (9.5)}

The new primitive procedure \scheme{procedure-arity-mask} takes a
procedure \var{p} and returns a two's complement bitmask representing
the argument counts accepted by \var{p}.
For example, the arity mask for a two-argument procedure such as
\scheme{cons} is $4$ (only bit two set),
while the arity mask for a procedure that accepts one or more arguments,
such as \scheme{list*}, is $-2$ (all but bit 0 set).
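
The mask can be tested bit by bit, as in the following sketch. The
helper \scheme{accepts-arity?} is hypothetical.

\schemedisplay
(procedure-arity-mask cons)                      ; 4
(procedure-arity-mask (lambda (x . rest) x))     ; -2

;; bits 0 and 2 are set for a procedure taking zero or two arguments
(procedure-arity-mask (case-lambda [() 0] [(a b) 2]))   ; 5

;; does p accept exactly n arguments?
(define (accepts-arity? p n)
  (logbit? n (procedure-arity-mask p)))

(accepts-arity? cons 2)    ; #t
(accepts-arity? cons 3)    ; #f
\endschemedisplay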

\subsection{Bytevector compression (9.5)}

The new primitive procedures \scheme{bytevector-compress} and
\scheme{bytevector-uncompress} expose for bytevectors the kind of
compression functionality that is used for files with the
\scheme{compressed} option.

\subsection{Line caching and source objects (9.5)}

The \scheme{locate-source} function accepts an optional argument that
enables the use of a cache for line information, so that a source file
does not have to be consulted each time to compute line information.
To further avoid file and caching issues, a source object has optional
beginning-line and beginning-column components. Source objects with line
and column components take more space, but they allow reporting of line and column
information even if a source file is later modified or becomes unavailable.
The value of the \scheme{current-make-source-object} parameter is used by the
reader to construct source objects for programs, and the parameter can be
modified to collect line and column information eagerly. The value of the
\scheme{current-locate-source-object-source} parameter is used for
error reporting, instead of calling \scheme{locate-source} or
\scheme{locate-source-object-source} directly, so that just-in-time
source-location lookup can be adjusted, too.

\subsection{High-precision clock time in Windows 8 and up (9.5)}

When running on Windows 8 and up, Chez Scheme uses the high-precision
clock time function for the current date and time.

\subsection{Printing of non-standard (extended) identifiers (9.5)}

Chez Scheme extends the syntax of identifiers as described in the
introduction to the Chez Scheme User's Guide, except within forms prefixed
by \scheme{#!r6rs}, which is implied in a library or top-level program.
Prior to Version~9.5, the printer always printed such identifiers using
hex scalar value escapes as necessary to render them with valid R6RS identifier syntax.
When the new parameter \scheme{print-extended-identifiers} is set
to \scheme{#t}, these identifiers are printed without escapes, e.g.,
\scheme{1+} prints as \scheme{1+} rather than as \scheme{\x31;+}.
The default value of this parameter is \scheme{#f}.

\subsection{Expression-editor Unicode support (9.5)}

The expression editor now supports Unicode characters under Linux and MacOS~X,
except that combining characters are not treated correctly for
line-wrapping.
|
||
|
|
||
|
\subsection{Extensions to whole-program, whole-library optimization (9.3.1, 9.3.4)}
|
||
|
|
||
|
\scheme{compile-whole-program} now supports incomplete
|
||
|
whole-program optimization, i.e., whole program optimization that
|
||
|
incorporates only libraries for which wpo files are available while
|
||
|
leaving separate libraries for which only object files are available.
|
||
|
In addition, imported libraries can be left visible for run-time
|
||
|
use by the \scheme{environment} procedure or for dynamically loaded
|
||
|
object files that might require them.
|
||
|
The new procedure \scheme{compile-whole-library} supports the combination
|
||
|
of groups of libraries separate from programs and unconditionally
|
||
|
leaves all imported libraries visible.
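
A minimal sketch of both entry points follows.
The file names are hypothetical, and the wpo files are assumed to have been produced earlier with \scheme{generate-wpo-files} set to \scheme{#t}.
The third argument to \scheme{compile-whole-program} is assumed here to be the optional flag that leaves imported libraries visible.

\schemedisplay
; combine app.wpo with whatever library wpo files are available,
; leaving the imported libraries visible at run time
(compile-whole-program "app.wpo" "app-all.so" #t)

; combine a library with the libraries upon which it depends
(compile-whole-library "mylib.wpo" "mylib-all.so")
\endschemedisplay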
|
||
|
|
||
|
\subsection{24-, 40-, 48-, and 56-bit bit-field containers (9.3.3)}
|
||
|
|
||
|
The total size of the fields within an ftype \scheme{bits} can now be
|
||
|
24, 40, 48, or 56 (as well as 8, 16, 32, and 64).
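
For illustration, a 24-bit container might be declared and used as sketched below.
The ftype name \scheme{rgb} and its field names are hypothetical.

\schemedisplay
; three 8-bit fields for a total of 24 bits
(define-ftype rgb
  (bits
    [red unsigned 8]
    [green unsigned 8]
    [blue unsigned 8]))

(define p
  (make-ftype-pointer rgb (foreign-alloc (ftype-sizeof rgb))))

(ftype-set! rgb (red) p 255)
(ftype-set! rgb (green) p 128)
(ftype-set! rgb (blue) p 0)
(ftype-ref rgb (green) p)   ; returns the value just stored, 128

(foreign-free (ftype-pointer-address p))
\endschemedisplay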
|
||
|
|
||
|
\subsection{Object-counting for static-generation collections (9.3.3)}
|
||
|
|
||
|
Object counting (see \scheme{object-counts} below) is now enabled for
|
||
|
all collections targeting the static generation.
|
||
|
|
||
|
\subsection{Support for off-line profile-dump processing (9.3.2)}
|
||
|
|
||
|
Previously, the output of \scheme{profile-dump} was not specified.
|
||
|
It is now specified to be a list of source-object, profile-count pairs.
|
||
|
In addition, \scheme{profile-dump-html}, \scheme{profile-dump-list},
|
||
|
and \scheme{profile-dump-data} all now take an optional \var{dump}
|
||
|
argument, which is a list of source-object, profile-count pairs in
|
||
|
the form returned by \scheme{profile-dump} and defaults to the current
|
||
|
value of \scheme{(profile-dump)}.
|
||
|
|
||
|
With these changes, it is now possible to obtain a dump from
|
||
|
\scheme{profile-dump} in one process, and write it to a fasl file
|
||
|
(using \scheme{fasl-write}) for subsequent off-line processing in
|
||
|
another process, where it can be read from the fasl file (using
|
||
|
\scheme{fasl-read}) and processed using \scheme{profile-dump-html},
|
||
|
\scheme{profile-dump-list}, \scheme{profile-dump-data} or some
|
||
|
custom mechanism.
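
The workflow just described might look roughly as follows.
This is a sketch under stated assumptions: the file names \scheme{app.ss} and \scheme{app.dump} and the prefix \scheme{"profile-"} are hypothetical, and the optional \var{dump} argument of \scheme{profile-dump-html} is assumed to follow the path-prefix argument.

\schemedisplay
; process 1: run instrumented code and save the dump
(parameterize ([compile-profile #t])
  (load "app.ss"))
(let ([op (open-file-output-port "app.dump" (file-options no-fail))])
  (fasl-write (profile-dump) op)
  (close-port op))

; process 2: read the dump back and generate the HTML report
(let* ([ip (open-file-input-port "app.dump")]
       [dump (fasl-read ip)])
  (close-port ip)
  (profile-dump-html "profile-" dump))
\endschemedisplay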
|
||
|
|
||
|
\subsection{More support for controlling return of memory to the O/S (9.3.2)}
|
||
|
|
||
|
A new parameter, \scheme{release-minimum-generation}, determines when
|
||
|
the collector attempts to return unneeded virtual memory to the O/S.
|
||
|
It defaults to the value of \scheme{collect-maximum-generation}, so the
|
||
|
collector attempts to return memory to the O/S only when performing a
|
||
|
maximum-generation collection.
|
||
|
It can be set to a lower generation number to cause the collector to
|
||
|
do so for younger generations as well.
|
||
|
|
||
|
\subsection{sstats changes (9.3.1)}
|
||
|
|
||
|
The vector-based sstats structure has been replaced with a record type.
|
||
|
The time fields are all time objects, and the bytes and count fields
|
||
|
are now exact integers.
|
||
|
\scheme{time-difference} no longer coerces negative results to zero.
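
As a rough illustration of the record-based interface, the sketch below measures a small allocation using \scheme{statistics} and \scheme{sstats-difference}.
The variable names are hypothetical.

\schemedisplay
(define before (statistics))
(define data (make-vector 100000 0))   ; some allocation to measure
(define delta (sstats-difference (statistics) before))

; the cpu field is a time object, so the time accessors apply
(define cpu (sstats-cpu delta))

(printf "cpu ~s s ~s ns, ~s bytes, ~s collections"
  (time-second cpu)
  (time-nanosecond cpu)
  (sstats-bytes delta)        ; exact integer
  (sstats-gc-count delta))    ; exact integer
(newline)
\endschemedisplay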
|
||
|
|
||
|
\subsection{\protect\scheme{library-group} eliminated (9.3.1)}
|
||
|
|
||
|
With the extensions to \scheme{compile-whole-program} and the
|
||
|
addition of \scheme{compile-whole-library}, as described above,
|
||
|
support for whole-program and whole-library optimization now subsumes
|
||
|
the functionality of the experimental \scheme{library-group} form,
|
||
|
and the form has been eliminated.
|
||
|
This is an \emph{incompatible change}.
|
||
|
|
||
|
\subsection{Support for Version~7 interaction-environment semantics eliminated (9.3.1)}
|
||
|
|
||
|
Prior to Version~8, the semantics of the interaction environment
|
||
|
used by the read-eval-print loop (REPL), aka waiter, and by
|
||
|
\scheme{load}, \scheme{compile}, and \scheme{interpret} without
|
||
|
explicit environment arguments treated all variables in the environment
|
||
|
as mutable, including those bound to primitives.
|
||
|
This meant that top-level references to primitive names could not
|
||
|
be optimized by the compiler because their values might change at
|
||
|
run time, except that, at optimize-level 2 and above, the compiler
|
||
|
did treat primitive names as always having their original values.
|
||
|
|
||
|
In Version 8 and subsequent versions, primitive bindings in the
|
||
|
interaction environment are immutable, as if imported directly from
|
||
|
the immutable Scheme environment.
|
||
|
That is, they cannot be assigned, although they can be replaced
|
||
|
with new bindings with a top-level definition.
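
The difference between assignment and redefinition can be seen in the following sketch, which uses \scheme{eval} with its default (interaction) environment.
The choice of \scheme{sub1} is arbitrary.

\schemedisplay
; assigning a primitive binding is rejected; the condition is
; caught here and displayed
(guard (c [#t (display-condition c) (newline)])
  (eval '(set! sub1 (lambda (n) n))))

; a top-level definition creates a new binding instead
(eval '(define sub1 (lambda (n) (- n 1))))
(eval '(sub1 10))   ; 9
\endschemedisplay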
|
||
|
|
||
|
To provide temporary backward compatibility, the
|
||
|
\scheme{--revert-interaction-semantics} command-line option and
|
||
|
\scheme{revert-interaction-semantics} parameter allowed programmers
|
||
|
to revert the interaction environment to Version~7 semantics.
|
||
|
This functionality has now been eliminated and along with it the
|
||
|
special treatment of primitive bindings at optimize level 2 and
|
||
|
above.
|
||
|
|
||
|
This is an \emph{incompatible change}.
|
||
|
|
||
|
\subsection{Explicit specification of profile source locations (9.3.1)}
|
||
|
|
||
|
Version 9.3.1 augments existing support for explicit source-code
|
||
|
annotations with additional features targeted at source profiling
|
||
|
for externally generated programs, including programs generated by
|
||
|
language front ends that target Scheme and use Chez Scheme as the
|
||
|
back end.
|
||
|
Included is a \scheme{profile} expression that explicitly associates
|
||
|
a specified source object with a profile count (of times the
|
||
|
expression is evaluated), a \scheme{generate-profile-forms} parameter
|
||
|
that controls whether the compiler (also) associates profile counts
|
||
|
with source locations implicitly identified by annotated expressions
|
||
|
in the input, and a finer-grained method for marking whether an
|
||
|
individual annotation should be used for debugging, profiling, or
|
||
|
both.
|
||
|
|
||
|
\subsection{``Maybe'' file (re)compilation (9.3.1)}
|
||
|
|
||
|
When \scheme{compile-imported-libraries} is set to \scheme{#t},
|
||
|
libraries required indirectly by one of the
|
||
|
file-compilation procedures, e.g., \scheme{compile-library},
|
||
|
\scheme{compile-program}, and \scheme{compile-file}, are automatically
|
||
|
compiled if and only if the object file is not present, is older than
|
||
|
the source (main and include) files, or some library upon which
|
||
|
they depend has been or needs to be recompiled.
|
||
|
|
||
|
Version 9.3.1 adds three new procedures: \scheme{maybe-compile-library},
|
||
|
\scheme{maybe-compile-program}, and \scheme{maybe-compile-file},
|
||
|
that perform a similar analysis and compile the library, program,
|
||
|
or file only under similar circumstances.
|
||
|
|
||
|
\subsection{New primitives for querying memory utilization (9.3.1)}
|
||
|
|
||
|
Three new primitives have been added to allow a Scheme process to
|
||
|
track usage of virtual memory for its heap.
|
||
|
|
||
|
\scheme{current-memory-bytes} returns the total number of bytes of
|
||
|
virtual memory used or reserved to represent the Scheme heap.
|
||
|
This differs from \scheme{bytes-allocated}, which returns the number
|
||
|
of bytes currently occupied by Scheme objects.
|
||
|
\scheme{current-memory-bytes} additionally includes memory used for
|
||
|
heap management as well as memory held in reserve to satisfy future
|
||
|
allocation requests.
|
||
|
|
||
|
\scheme{maximum-memory-bytes} returns the maximum number of bytes
|
||
|
of virtual memory occupied or reserved for the Scheme heap by the
|
||
|
calling process since the last call to \scheme{reset-maximum-memory-bytes!}
|
||
|
or, if \scheme{reset-maximum-memory-bytes!} has never been called,
|
||
|
since system start-up.
|
||
|
|
||
|
\scheme{reset-maximum-memory-bytes!} resets the maximum memory bytes
|
||
|
to the current memory bytes.
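
A short sketch of how the three primitives might be combined with \scheme{bytes-allocated} to observe the effect of an allocation (the variable name is hypothetical):

\schemedisplay
; start tracking the high-water mark from the current usage
(reset-maximum-memory-bytes!)

(define big (make-vector 1000000 0))

(printf "objects: ~s bytes" (bytes-allocated))
(newline)
(printf "heap:    ~s bytes" (current-memory-bytes))
(newline)
(printf "maximum: ~s bytes" (maximum-memory-bytes))
(newline)
\endschemedisplay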
|
||
|
|
||
|
\subsection{Unicode 7.0 support (9.3.1)}
|
||
|
|
||
|
The character sets, character classes, and word-breaking algorithms
|
||
|
for character, string, and Unicode-related bytevector operations
|
||
|
have now been updated to Unicode 7.0.
|
||
|
|
||
|
\subsection{Linux PowerPC (32-bit) support (9.3)}
|
||
|
|
||
|
Support for running {\ChezScheme} on 32-bit PowerPC processors
|
||
|
running Linux has been added, with machine types ppc32le (nonthreaded)
|
||
|
and tppc32le (threaded).
|
||
|
C~code intended to be linked with these versions of the system
|
||
|
should be compiled using the GNU C~compiler's \scheme{-m32} option.
|
||
|
|
||
|
\subsection{Printed representation of procedures (9.2.1)}
|
||
|
|
||
|
The printed representation of a procedure now includes the source
|
||
|
file and beginning file position when available.
|
||
|
|
||
|
\subsection{I/O errors writing to the console error port (9.2.1)}
|
||
|
|
||
|
The default exception handler now catches I/O exceptions that occur
|
||
|
when it attempts to display a condition and, if an I/O exception
|
||
|
does occur, resets as if by calling the \scheme{reset} procedure.
|
||
|
The intent is to avoid an infinite regress (ultimately ending
|
||
|
in exhaustion of memory) in which the process repeatedly recurs
|
||
|
back to the default exception handler trying to write to a console-error
|
||
|
port (typically stderr) that is no longer writable, e.g., due to
|
||
|
the other end of a pipe or socket having been closed.
|
||
|
|
||
|
\subsection{C locking macros (9.2.1)}
|
||
|
|
||
|
The header file scheme.h distributed with Chez Scheme now includes
|
||
|
several new lock-related macros:
|
||
|
\scheme{INITLOCK} (corresponding to \scheme{ftype-init-lock!}),
|
||
|
\scheme{SPINLOCK} (\scheme{ftype-spin-lock!}),
|
||
|
\scheme{UNLOCK} (\scheme{ftype-unlock!}),
|
||
|
\scheme{LOCKED_INCR} (\scheme{ftype-locked-incr!}), and
|
||
|
\scheme{LOCKED_DECR} (\scheme{ftype-locked-decr!}).
|
||
|
All take a pointer to an iptr or uptr.
|
||
|
\scheme{LOCKED_INCR} and \scheme{LOCKED_DECR} also take an
|
||
|
\scheme{lvalue} argument that is set to true (nonzero) if the result
|
||
|
of the increment or decrement is zero, otherwise false (zero).
|
||
|
|
||
|
\subsection{New \protect\scheme{compile-to-file} procedure (9.2.1)}
|
||
|
|
||
|
The new procedure \scheme{compile-to-file} is similar to
|
||
|
\scheme{compile-to-port} with the output port replaced with an
|
||
|
output pathname.
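
For example, a list of forms can be compiled directly to an object file and then loaded.
The output file name below is hypothetical.

\schemedisplay
(compile-to-file
  '((define (square x) (* x x))
    (display (square 7))
    (newline))
  "square.so")

(load "square.so")   ; displays 49
\endschemedisplay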
|
||
|
|
||
|
\subsection{Whole-program optimization (9.2)}
|
||
|
|
||
|
Version 9.2 includes support for whole-program optimization of a top-level
|
||
|
program and the libraries upon which it depends at run time based on ``wpo''
|
||
|
(whole-program-optimization) files produced as a byproduct of compiling
|
||
|
the program and libraries when the parameter \scheme{generate-wpo-files}
|
||
|
is set to \scheme{#t}.
|
||
|
The new procedure \scheme{compile-whole-program} takes as input
|
||
|
a wpo file for a top-level program, combines it with the wpo files for
|
||
|
any libraries the program requires at run time, and produces a single
|
||
|
object file containing a self-contained program.
|
||
|
In so doing, it discards unused code and optimizes across program and
|
||
|
library boundaries, potentially reducing program load time, run time,
|
||
|
and memory requirements.
|
||
|
|
||
|
\scheme{compile-file}, \scheme{compile-program}, \scheme{compile-library},
|
||
|
and \scheme{compile-script} produce wpo files as well as ordinary
|
||
|
object files when the new \scheme{generate-wpo-files} parameter is set
|
||
|
to \scheme{#t} (the default is \scheme{#f}).
|
||
|
\scheme{compile-port} and \scheme{compile-to-port} do so when passed
|
||
|
an optional \var{wpo output port}.
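
A minimal sketch of the overall workflow follows.
The file names are hypothetical, and the library is assumed to be located where the program's \scheme{import} form can find it.

\schemedisplay
; produce wpo files alongside the ordinary object files
(parameterize ([generate-wpo-files #t])
  (compile-library "mylib.ss")    ; writes mylib.so and mylib.wpo
  (compile-program "app.ss"))     ; writes app.so and app.wpo

; combine the program with the libraries it requires
(compile-whole-program "app.wpo" "app-all.so")
\endschemedisplay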
|
||
|
|
||
|
\subsection{Type-specific symbol-hashtable operators (9.2)\label{sec:symbol-hashtables}}
|
||
|
|
||
|
A new set of primitives that operate on symbol
|
||
|
hashtables has been added:
|
||
|
|
||
|
\schemedisplay
|
||
|
symbol-hashtable?
|
||
|
symbol-hashtable-ref
|
||
|
symbol-hashtable-set!
|
||
|
symbol-hashtable-contains?
|
||
|
symbol-hashtable-cell
|
||
|
symbol-hashtable-update!
|
||
|
symbol-hashtable-delete!
|
||
|
\endschemedisplay
|
||
|
|
||
|
These are like their generic counterparts but operate only on symbol
|
||
|
hashtables, i.e., hashtables created with \scheme{symbol-hash} as
|
||
|
the hash function and \scheme{eq?}, \scheme{eqv?}, \scheme{equal?},
|
||
|
or \scheme{symbol=?} as the equivalence function.
|
||
|
|
||
|
These primitives are more efficient at optimize-level 3 than their
|
||
|
generic counterparts when both are applied to symbol hashtables.
|
||
|
The performance of symbol hashtables has been improved even when the new
|
||
|
operators are not used (Section~\ref{sec:symbol-hashtable-performance}).
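
The following sketch exercises several of the new operators on a table created with \scheme{symbol-hash} and \scheme{eq?}.
The table name and keys are hypothetical.

\schemedisplay
(define counts (make-hashtable symbol-hash eq?))

(symbol-hashtable? counts)                    ; #t
(symbol-hashtable-set! counts 'apple 1)
; add one to the existing value, or start from the default 0
(symbol-hashtable-update! counts 'apple (lambda (n) (+ n 1)) 0)
(symbol-hashtable-ref counts 'apple #f)       ; 2
(symbol-hashtable-contains? counts 'pear)     ; #f
(symbol-hashtable-delete! counts 'apple)
(symbol-hashtable-ref counts 'apple #f)       ; #f
\endschemedisplay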
|
||
|
|
||
|
\subsection{\protect\scheme{strip-fasl-file} is now machine-independent (9.2)}
|
||
|
|
||
|
\scheme{strip-fasl-file} can now strip fasl files created for a machine
|
||
|
type other than the machine type of the calling process as long as the
|
||
|
Chez Scheme version is the same.
|
||
|
|
||
|
\subsection{\protect\scheme{source-file-descriptor} and \protect\scheme{locate-source} (9.2)}
|
||
|
|
||
|
The new procedure \scheme{source-file-descriptor} can be used to construct
|
||
|
a custom source-file descriptor or reconstruct a source-file descriptor
|
||
|
from values previously extracted from another source-file descriptor.
|
||
|
It takes two arguments: a string \var{path} and exact nonnegative integer
|
||
|
\var{checksum} and returns a new source-file descriptor.
|
||
|
|
||
|
The new procedure \scheme{locate-source} can be used to determine a full
|
||
|
path, line number, and character position from a source-file descriptor
|
||
|
and file position.
|
||
|
It accepts two arguments: a source-file descriptor \var{sfd} and an
|
||
|
exact nonnegative integer file position \var{fp}.
|
||
|
It returns zero values if the unmodified file is not found in the source
|
||
|
directories and three values (string \var{path}, exact nonnegative
|
||
|
integer \var{line}, and exact nonnegative integer \var{char}) if the
|
||
|
file is found.
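
Because \scheme{locate-source} returns either zero or three values, a caller can dispatch on the number of values received, as in the sketch below.
The path and the placeholder checksum are hypothetical; with a checksum that does not match the file, the zero-value case is the one expected to be taken.

\schemedisplay
(define sfd (source-file-descriptor "example.ss" 0))

(call-with-values
  (lambda () (locate-source sfd 42))
  (case-lambda
    [() (display "source file not found")
        (newline)]
    [(path line char)
     (printf "~a: line ~s, char ~s" path line char)
     (newline)]))
\endschemedisplay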
|
||
|
|
||
|
\subsection{Compressed compiled scripts and partially compressed files (9.2)}
|
||
|
|
||
|
Support for creating and handling files that begin with uncompressed
|
||
|
data and end with compressed data has been added in the form of the
|
||
|
new procedure \scheme{port-file-compressed!} that takes a port and,
|
||
|
if it is not already set up to read or write compressed data, sets it up
|
||
|
to do so.
|
||
|
The port must be a file port pointing to a regular file, i.e., a
|
||
|
file on disk rather than a socket or pipe, and the port must not be
|
||
|
an input/output port.
|
||
|
The port can be a binary or textual port.
|
||
|
If the port is an output port, subsequent output sent to the port
|
||
|
will be compressed.
|
||
|
If the port is an input port, subsequent input will be decompressed
|
||
|
if and only if the port is currently pointing at compressed data.
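
A sketch of writing and then reading a partially compressed file follows.
The file name and the six-byte header are hypothetical, and the sketch assumes binary file ports.

\schemedisplay
(define path "mixed.bin")

; write an uncompressed header followed by compressed data
(let ([op (open-file-output-port path (file-options no-fail))])
  (put-bytevector op (string->utf8 "HEADER"))
  (port-file-compressed! op)
  (put-bytevector op (make-bytevector 10000 7))
  (close-port op))

; read the header as is, then switch to decompression
(let ([ip (open-file-input-port path)])
  (let ([header (get-bytevector-n ip 6)])
    (port-file-compressed! ip)
    (let ([payload (get-bytevector-all ip)])
      (close-port ip)
      (list (utf8->string header) (bytevector-length payload)))))
\endschemedisplay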
|
||
|
|
||
|
When the parameter \scheme{compile-compressed} is set to \scheme{#t},
|
||
|
the \scheme{compile-script} and \scheme{compile-program} procedures
|
||
|
take advantage of this functionality to copy the \scheme{#!} prefix,
|
||
|
if present in the source file, uncompressed in the object file while
|
||
|
compressing the object code emitted for the program, thus reducing
|
||
|
the size of the resulting file without preventing the \scheme{#!}
|
||
|
line from being read and interpreted properly by the operating
|
||
|
system.
|
||
|
|
||
|
\subsection{Change in library import handling (9.2)}
|
||
|
|
||
|
In previous releases, when an object file was found before the
|
||
|
corresponding source file in the library directories, the object file was
|
||
|
older, and the parameter \scheme{compile-imported-libraries} was not set,
|
||
|
the object file was loaded rather than the source file.
|
||
|
The (newer) source file is now loaded instead, just as it would be if
|
||
|
the source file is found before the corresponding, older object file.
|
||
|
This is an \emph{incompatible change}.
|
||
|
|
||
|
\subsection{Change in fasl-strip options (9.1)}
|
||
|
|
||
|
\scheme{strip-fasl-file} now supports stripping of all compile-time
|
||
|
information and no longer supports stripping of just library visit code.
|
||
|
Stripping all compile-time information nearly always results in smaller
|
||
|
object files than stripping just library visit code, with a corresponding
|
||
|
reduction in the memory required when the resulting
|
||
|
file is loaded.
|
||
|
|
||
|
To reflect this, the old fasl-strip option \scheme{library-visit-code}
|
||
|
has been eliminated, and the new fasl-strip option
|
||
|
\scheme{compile-time-information} has been added.
|
||
|
This is an \emph{incompatible change} in that code that previously
|
||
|
used the fasl-strip option \scheme{library-visit-code} will
|
||
|
have to be modified to omit the option or to replace it with
|
||
|
\scheme{compile-time-information}.
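
For instance, code that formerly stripped library visit code might be updated along the following lines.
The file names are hypothetical.

\schemedisplay
; strip inspector source information and all compile-time information
(strip-fasl-file "app.so" "app-stripped.so"
  (fasl-strip-options inspector-source compile-time-information))
\endschemedisplay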
|
||
|
|
||
|
\subsection{Library loading (9.1)}
|
||
|
|
||
|
Visiting (via \scheme{visit}) a library no longer loads the library's
|
||
|
run-time information (invoke dependencies and invoke code), and revisiting
|
||
|
(via \scheme{revisit}) a library no longer loads the library's
|
||
|
compile-time information (import and visit dependencies and import and
|
||
|
visit code).
|
||
|
|
||
|
When a library is invoked due to a run-time dependency of another
|
||
|
library or a top-level program on the library, the library is now
|
||
|
``revisited'' (as if via \scheme{revisit}) rather than ``loaded''
|
||
|
(as if via \scheme{load}).
|
||
|
As a result, the compile-time information is not loaded, which can result
|
||
|
in substantial reductions in both library invocation time and memory
|
||
|
footprint.
|
||
|
|
||
|
If a library is revisited, either explicitly or as the result of run-time
|
||
|
dependency, a subsequent import of the library causes it to be
|
||
|
``visited'' (as if via \scheme{visit}) if the same object file can be
|
||
|
found at the same path and the visit code has not been stripped.
|
||
|
The compile-time code can alternatively be loaded explicitly from the same or a
|
||
|
different file via a direct call to \scheme{visit}.
|
||
|
|
||
|
While this change is mostly transparent (ignoring the reduced invocation
|
||
|
time and memory footprint), it is an \emph{incompatible change} in the
|
||
|
sense that the system potentially reads the file twice and can run
|
||
|
code that is marked using \scheme{eval-when} as both visit
|
||
|
and revisit code.
|
||
|
|
||
|
\subsection{Finding objects in the heap (9.1)}
|
||
|
|
||
|
Version 9.1 includes support for a new heap inspection tool that
|
||
|
allows a programmer to look for objects in the heap according to
|
||
|
arbitrary predicates.
|
||
|
The new procedure \scheme{make-object-finder} takes a predicate \var{pred} and two optional
|
||
|
arguments: a starting point \var{x} and a maximum generation \var{g}.
|
||
|
The starting point defaults to the value of the procedure \scheme{oblist},
|
||
|
and the maximum generation defaults to the value of the parameter
|
||
|
\scheme{collect-maximum-generation}.
|
||
|
\scheme{make-object-finder} returns an object finder \var{p} that can be used to
|
||
|
search for objects satisfying \var{pred} within the starting-point object \var{x}.
|
||
|
Immediate objects and objects in generations older than \var{g} are treated
|
||
|
as leaves.
|
||
|
\var{p} is a procedure accepting no arguments.
|
||
|
If an object \var{y} satisfying \var{pred} can be found starting with \var{x},
|
||
|
\var{p} returns a list whose first element is \var{y} and whose remaining
|
||
|
elements represent the path of objects from \var{x} to \var{y}, listed
|
||
|
in reverse order.
|
||
|
\var{p} can be invoked multiple times to find additional objects satisfying
|
||
|
the predicate, if any.
|
||
|
\var{p} returns \scheme{#f} if no more objects matching the predicate
|
||
|
can be found.
|
||
|
|
||
|
\var{p} maintains internal state recording where it has been so that it
|
||
|
can restart at the point of the last found object and not return
|
||
|
the same object twice.
|
||
|
The state can be several times the size of the starting-point object
|
||
|
\var{x} and all that is reachable from \var{x}.
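
The following sketch shows an object finder applied to a small structure.
The names \scheme{target}, \scheme{haystack}, and \scheme{next-match} are hypothetical.

\schemedisplay
(define target (list 'needle))
(define haystack (vector 1 (list 2 target) "three"))

; search within haystack for the one object that is eq? to target
(define next-match
  (make-object-finder (lambda (x) (eq? x target)) haystack))

(let ([path (next-match)])
  (when path
    ; the first element is the found object; the remaining elements
    ; lead back to haystack
    (printf "found: ~s" (eq? (car path) target))
    (newline)))

; later calls return further matches, or #f once none remain
(next-match)
\endschemedisplay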
|
||
|
|
||
|
The interactive inspector provides a convenient interface to the object
|
||
|
finder in the form of \scheme{find} and \scheme{find-next} commands.
|
||
|
The \scheme{find} command evaluates its first argument, which should
|
||
|
evaluate to the desired predicate, and treats its second argument, if
|
||
|
present, as the maximum generation, overriding the default.
|
||
|
The starting point \var{x} is the object upon which the
|
||
|
inspector is currently focused.
|
||
|
If an object is found, the inspector's new focus is the found object,
|
||
|
the parent focus (obtainable via the \scheme{up} command) is the first
|
||
|
element in the (reversed) path, the parent's parent is the next element,
|
||
|
and so on up to \var{x}.
|
||
|
The \scheme{find-next} command repeats the last find, as if by an explicit
|
||
|
invocation of the same object finder.
|
||
|
|
||
|
Relocation tables for static code objects are discarded by default, which
|
||
|
prevents object finders from providing accurate results when static code
|
||
|
objects are involved.
|
||
|
That is, they will not find any objects pointed to directly from a code
|
||
|
object that has been promoted to the static generation.
|
||
|
If this is a problem, the command-line argument
|
||
|
\scheme{--retain-static-relocation} can be used to prevent the relocation
|
||
|
tables from being discarded.
|
||
|
|
||
|
\subsection{Object counts (9.1)}
|
||
|
|
||
|
The new procedure \scheme{object-counts} can be used to determine,
|
||
|
for each type of object, the number and size in bytes of objects of
|
||
|
that type in each generation.
|
||
|
Its return value has the following structure:
|
||
|
|
||
|
\schemedisplay
|
||
|
((\var{type} (\var{generation} \var{count} . \var{bytes}) \dots) \dots)
|
||
|
\endschemedisplay
|
||
|
|
||
|
\var{type} is either the name of a primitive type, represented as a
|
||
|
symbol, e.g., \scheme{pair}, or a record-type descriptor (rtd).
|
||
|
\var{generation} is a nonnegative fixnum between 0 and the value
|
||
|
of \scheme{(collect-maximum-generation)}, inclusive, or the symbol
|
||
|
\scheme{static} representing the static generation.
|
||
|
\var{count} and \var{bytes} are nonnegative fixnums.
|
||
|
|
||
|
Object counts are accurate for a generation $n$ immediately after
|
||
|
a collection of generation $n$ or higher if enabled during that
|
||
|
collection.
|
||
|
Object counts are enabled by setting the parameter
|
||
|
\scheme{enable-object-counts} to \scheme{#t}.
|
||
|
The command-line option \scheme{--enable-object-counts} can be used to
|
||
|
set this parameter to \scheme{#t} on startup.
|
||
|
Object counting is not enabled by default since it adds overhead to
|
||
|
garbage collection.
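
As an illustration, the sketch below enables counting for one explicit collection and then prints the per-generation entries recorded for pairs.

\schemedisplay
; count objects during a collection of all nonstatic generations
(parameterize ([enable-object-counts #t])
  (collect (collect-maximum-generation)))

; each entry has the form (generation count . bytes)
(let ([entry (assq 'pair (object-counts))])
  (when entry
    (for-each
      (lambda (g)
        (printf "generation ~s: ~s pairs, ~s bytes"
          (car g) (cadr g) (cddr g))
        (newline))
      (cdr entry))))
\endschemedisplay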
|
||
|
|
||
|
To make the information more useful in the presence of ftype pointers,
|
||
|
the ftype descriptors produced by \scheme{define-ftype} for each
|
||
|
defined ftype now carry the name of the ftype rather than a generic
|
||
|
name like \scheme{ftd-struct}.
|
||
|
(Ftype descriptors are subtypes of record-type descriptors and can appear
|
||
|
as types in the \scheme{object-counts} return value.)
|
||
|
|
||
|
\subsection{Native-eol style is now none (9.1)}
|
||
|
|
||
|
To simplify interaction with tools that naively expose multiple-character
|
||
|
end-of-line sequences such as CRLF as separate characters to the user, the
|
||
|
native end-of-line style (\scheme{native-eol-style}) is now \scheme{none}
|
||
|
on all machine types.
|
||
|
This is an \emph{incompatible change}.
|
||
|
|
||
|
\subsection{Library-requirements options (9.1)}
|
||
|
|
||
|
In previous releases, the \scheme{library-requirements} procedure
|
||
|
returns a list of all libraries required by the specified library,
|
||
|
whether they are needed when the specified library is imported,
|
||
|
visited, or invoked.
|
||
|
While this remains the default behavior, \scheme{library-requirements}
|
||
|
now takes an optional ``options'' argument.
|
||
|
This must be a library-requirements-options enumeration set, i.e., the
|
||
|
value of a \scheme{library-requirements-options} form with some subset of
|
||
|
the options \scheme{import}, \scheme{visit@visit}, \scheme{invoke@visit},
|
||
|
and \scheme{invoke}. \scheme{import} includes the libraries
|
||
|
that must be imported when the specified library is imported;
|
||
|
\scheme{visit@visit} includes the libraries that must be visited when
|
||
|
the specified library is visited; \scheme{invoke@visit} includes the libraries
|
||
|
that must be invoked when the specified library is visited; and
|
||
|
\scheme{invoke} includes the libraries that must be invoked when
|
||
|
the specified library is invoked.
|
||
|
The default behavior is obtained by supplying an enumeration set containing all
|
||
|
of these options.
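
A brief sketch, using a small hypothetical library \scheme{(geometry)} defined at the REPL:

\schemedisplay
(library (geometry)
  (export area)
  (import (rnrs))
  (define (area w h) (* w h)))

; default: all requirements
(library-requirements '(geometry))

; only the libraries that must be invoked when (geometry) is invoked
(library-requirements '(geometry)
  (library-requirements-options invoke))

; equivalent to the default
(library-requirements '(geometry)
  (library-requirements-options
    import visit@visit invoke@visit invoke))
\endschemedisplay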
|
||
|
|
||
|
\subsection{Nested object size and composition (9.1)}
|
||
|
|
||
|
Two new procedures, \scheme{compute-size} and
|
||
|
\scheme{compute-composition}, can be used to determine the
|
||
|
size and make-up of nested objects within the heap.
|
||
|
|
||
|
Both take an object and an optional generation.
|
||
|
The generation must be a fixnum between 0 and the value of
|
||
|
\scheme{(collect-maximum-generation)}, inclusive, or the symbol static.
|
||
|
It defaults to the value of \scheme{(collect-maximum-generation)}.
|
||
|
|
||
|
\scheme{compute-size} returns the number of bytes occupied by the object
|
||
|
and everything to which it points, ignoring objects in generations older
|
||
|
than the specified generation.
|
||
|
|
||
|
\scheme{compute-composition} returns an association list giving the
|
||
|
count and number of bytes of each type of object that the specified
|
||
|
object is constructed from, ignoring objects in generations older than
|
||
|
the specified generation. The association list maps type names (e.g.,
|
||
|
pair and flonum) or record-type descriptors to a pair of fixnums
|
||
|
giving the count and bytes.
|
||
|
Types with zero counts are not included in the list.
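
For example, both procedures can be applied to a structure with sharing, as in the sketch below.
The variable names are hypothetical, and exact sizes depend on the machine type.

\schemedisplay
(define shared (cons 1 2))
(define table (make-vector 1000 shared))

; bytes occupied by the vector and everything it points to
(compute-size table)

; association list mapping each type to (count . bytes)
(compute-composition table)

; restrict the computation to the youngest generation
(compute-size table 0)
\endschemedisplay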
|
||
|
|
||
|
A surprising number of objects effectively point indirectly to a large
|
||
|
percentage of all objects in the heap due to the attachment of top-level
|
||
|
environment bindings to symbols, but the generation argument can be used
|
||
|
in combination with explicit calls to collect (with automatic collections
|
||
|
disabled) to measure precisely how much space is allocated to freshly
|
||
|
allocated structures.
|
||
|
|
||
|
When used directly from the REPL with no other threads running,
|
||
|
\scheme{(compute-size (oblist) 'static)} effectively gives the size of
|
||
|
the entire heap, and \scheme{(compute-composition (oblist) 'static)}
|
||
|
effectively gives the composition of the entire heap.
|
||
|
|
||
|
The inspector makes the aggregate size of an object similarly available
|
||
|
through the \scheme{size} inspector-object message and the corresponding
|
||
|
\scheme{size} interactive-inspector command, with the twist that it
|
||
|
does not include objects whose sizes were previously requested in the
|
||
|
same session, making it possible to see the effectively smaller sizes
|
||
|
of what the programmer perceives to be substructures in shared and
|
||
|
cyclic structures.
|
||
|
|
||
|
These procedures potentially allocate a large amount of memory and
|
||
|
so should be used only when the information returned by the
|
||
|
procedure \scheme{object-counts} (see preceding entry) does not suffice.
|
||
|
|
||
|
Relocation tables for static code objects are discarded by default,
|
||
|
which prevents these procedures from providing accurate results when
|
||
|
static code objects are involved.
|
||
|
That is, they will not find any objects pointed to directly from a code
|
||
|
object that has been promoted to the static generation.
|
||
|
If accurate sizes and compositions for static code objects are
|
||
|
required, the command-line argument \scheme{--retain-static-relocation}
|
||
|
can be used to prevent the relocation tables from being discarded.
|
||
|
|
||
|
\subsection{Showing expander and optimizer output (9.1)}
|
||
|
|
||
|
When the parameter \scheme{expand-output} is set to a textual output
|
||
|
port, the output of the expander is printed to the port as a side effect
|
||
|
of running \scheme{compile}, \scheme{interpret}, or any of the file
|
||
|
compiling primitives, e.g., \scheme{compile-file} or
|
||
|
\scheme{compile-library}.
|
||
|
Similarly, when the parameter \scheme{expand/optimize-output} is set to a
|
||
|
textual output port, the output of the source optimizer is printed.
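
A minimal sketch that directs both forms of output to the current output port while compiling a small expression:

\schemedisplay
(parameterize ([expand-output (current-output-port)]
               [expand/optimize-output (current-output-port)])
  (compile '(let ([x 3]) (+ x 4))))
\endschemedisplay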
|
||
|
|
||
|
\subsection{Undefined-variable warnings (9.1)}
|
||
|
|
||
|
When \scheme{undefined-variable-warnings} is set to \scheme{#t}, the
|
||
|
compiler issues a warning message whenever it cannot determine that
|
||
|
a variable bound by \scheme{letrec}, \scheme{letrec*}, or an internal
|
||
|
definition will not be referenced before it is defined.
|
||
|
The default value is \scheme{#f}.
|
||
|
|
||
|
Regardless of the setting of this parameter, the compiler inserts code
|
||
|
to check for the error, except at optimize level 3.
|
||
|
The check is fairly inexpensive and does not typically inhibit inlining
|
||
|
or other optimizations.
|
||
|
In code that must be carefully tuned, however, it is sometimes useful
|
||
|
to reorder bindings or make other changes to eliminate the checks.
|
||
|
Enabling this warning can facilitate this process.
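
The sketch below shows the kind of code for which a warning is expected, though the compiler's analysis may differ in detail: \scheme{f} is called while \scheme{x} is still undefined.

\schemedisplay
(parameterize ([undefined-variable-warnings #t])
  (compile
    '(lambda ()
       (letrec* ([f (lambda () x)]
                 [y (f)]      ; calls f before x has been defined
                 [x 1])
         y))))
\endschemedisplay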
|
||
|
|
||
|
The checks are also visible in the output of \scheme{expand/optimize}.
|
||
|
|
||
|
\subsection{Detecting accidental use of generative record types (9.1)}
|
||
|
|
||
|
When the new boolean parameter \scheme{require-nongenerative-clause}
|
||
|
is set to \scheme{#t}, a \scheme{define-record-type} without a
|
||
|
\scheme{nongenerative} clause is treated as a syntax error.
|
||
|
This allows the programmer to detect accidental use of generative
|
||
|
record types.
|
||
|
Generative record types are rarely useful and are less efficient
|
||
|
than nongenerative types, since generative record types require the
|
||
|
construction of a record-type-descriptor each time a
|
||
|
\scheme{define-record-type} form is evaluated rather than once,
|
||
|
at compile time.
|
||
|
To support the rare need for a generative record type while still
|
||
|
allowing accidental generativity to be detected,
|
||
|
\scheme{define-record-type} has been extended to allow a generative
|
||
|
record type to be explicitly declared with a \scheme{nongenerative}
|
||
|
clause with \scheme{#f} for the uid, i.e., \scheme{(nongenerative #f)}.
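
The sketch below illustrates the three cases; the record-type names are hypothetical.

\schemedisplay
(parameterize ([require-nongenerative-clause #t])
  ; accepted: nongenerative, with a generated uid
  (eval '(define-record-type point
           (fields x y)
           (nongenerative)))
  ; accepted: explicitly declared generative
  (eval '(define-record-type scratch
           (fields value)
           (nongenerative #f)))
  ; rejected as a syntax error: no nongenerative clause
  (guard (c [#t (display-condition c) (newline)])
    (eval '(define-record-type oops
             (fields value)))))
\endschemedisplay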
|
||
|
|
||
|
\subsection{Improved support for cross compilation (9.1)}
|
||
|
|
||
|
Cross-compilation support has been improved in two ways: (1) it is
|
||
|
now possible to cross-compile a library and import it later in a
|
||
|
separate process for cross-compilation of dependent libraries, and
|
||
|
(2) the code produced for the target machine when cross compiling is no
|
||
|
longer less efficient than code produced natively on the target
|
||
|
machine.
|
||
|
|
||
|
\subsection{Linux ARMv6 (32-bit) support (9.1)}
|
||
|
|
||
|
Support for running {\ChezScheme} on ARMv6 processors running Linux
|
||
|
has been added, with machine type arm32le (32-bit nonthreaded).
|
||
|
C~code intended to be linked with this version of the system
|
||
|
should be compiled using the GNU C~compiler's \scheme{-m32} option.
|
||
|
|
||
|
\subsection{Source information in ftype ref/set! error messages (9.0)}
|
||
|
|
||
|
When available at compile time, source information is now included
|
||
|
in run-time error messages produced when \scheme{ftype-&ref},
|
||
|
\scheme{ftype-ref}, \scheme{ftype-set!}, and the locked ftype
|
||
|
operations are handed invalid inputs, e.g., ftype pointers of some
|
||
|
unexpected type, RHS values of some unexpected type, or improper
|
||
|
indices.
|
||
|
|
||
|
\subsection{\protect\scheme{compile-to-port} top-level-program dependencies (9.0)}
|
||
|
|
||
|
When passed a single \scheme{top-level-program} form,
|
||
|
\scheme{compile-to-port} now returns a list of the libraries the
|
||
|
top-level program requires at run time, as with \scheme{compile-program}.
|
||
|
Otherwise, the return value is unspecified.
|
||
|
|
||
|
\subsection{Better feedback for record-type mismatches (9.0)}
|
||
|
|
||
|
When \scheme{make-record-type} or \scheme{make-record-type-descriptor}
|
||
|
detects an incompatibility between two record types with the same
|
||
|
UID, the resulting error messages provide more information to
|
||
|
describe the mismatch, i.e., whether the parent, fields, flags, or
|
||
|
mutability differ.
|
||
|
|
||
|
\subsection{\protect\scheme{enable-cross-library-optimization} parameter (9.0)}
|
||
|
|
||
|
When a library is compiled, information is stored with the object
|
||
|
code to enable propagation of constants and inlining of procedures
|
||
|
defined in the library into dependent libraries.
|
||
|
The new parameter \scheme{enable-cross-library-optimization}, whose
|
||
|
value defaults to \scheme{#t}, can be set to \scheme{#f} to prevent
|
||
|
this information from being stored and disable the corresponding
|
||
|
optimizations.
|
||
|
This might be done to reduce the size of the object files or to
|
||
|
reduce the potential for exposure of near-source information via
|
||
|
the object file.
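
A build script might disable the parameter around a single compilation,
as in the following sketch, which assumes a hypothetical library source
file named \scheme{"mylib.ss"}:

\schemedisplay
;; "mylib.ss" is a hypothetical library source file
(parameterize ([enable-cross-library-optimization #f])
  (compile-library "mylib.ss"))
\endschemedisplay

Because the parameter is rebound only for the duration of the call, other
libraries compiled by the same process are unaffected.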
|
||
|
|
||
|
\subsection{Stripping object files (9.0)}
|
||
|
|
||
|
The new procedure \scheme{strip-fasl-file} allows the removal of
|
||
|
source information of various sorts from a compiled object (fasl) file
|
||
|
produced by \scheme{compile-file} or one of the other file compiling
|
||
|
procedures.
|
||
|
It also allows removal of library visit code, i.e., the code
|
||
|
required to compile (but not run) dependent libraries.
|
||
|
|
||
|
\scheme{strip-fasl-file} accepts three arguments: an input pathname,
|
||
|
an output pathname, and a fasl-strip-options enumeration set,
|
||
|
created by \scheme{fasl-strip-options} with zero or more of the
|
||
|
following options.
|
||
|
|
||
|
\begin{description}
|
||
|
\item[\scheme{inspector-source}:]
|
||
|
Strip inspector source information.
|
||
|
|
||
|
\item[\scheme{source-annotations}:]
|
||
|
Strip source annotations.
|
||
|
|
||
|
\item[\scheme{profile-source}:]
|
||
|
Strip source file and character position information from profiled
|
||
|
code objects.
|
||
|
|
||
|
\item[\scheme{library-visit-code}:]
|
||
|
Strip library visit code from compiled libraries.
|
||
|
\end{description}
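
A call might look like the following sketch, in which both file names are
hypothetical:

\schemedisplay
;; hypothetical file names; strips inspector information and
;; source annotations but keeps profile source and library visit code
(strip-fasl-file "app.so" "app-stripped.so"
  (fasl-strip-options inspector-source source-annotations))
\endschemedisplay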
|
||
|
|
||
|
\subsection{Ftype array bound of zero (9.0)}
|
||
|
|
||
|
The bound of an ftype array can now be zero and, when zero, is
|
||
|
treated as unbounded in the sense that no run-time upper-bound
|
||
|
checks are performed for accesses to the array.
|
||
|
This simplifies the creation of ftype arrays whose actual bounds
|
||
|
are determined dynamically.
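
The following sketch illustrates the idea with a hypothetical ftype
\scheme{int-buffer} whose length is chosen only when the storage is
allocated.
Since no upper-bound checks are performed, the program itself must keep
track of the actual length.

\schemedisplay
;; hypothetical ftype: an array of ints with no static bound
(define-ftype int-buffer (array 0 int))

;; the caller chooses the real length (10 here) when allocating
(define buf
  (make-ftype-pointer int-buffer
    (foreign-alloc (* 10 (ftype-sizeof int)))))

(ftype-set! int-buffer (3) buf 42)
(ftype-ref int-buffer (3) buf) ;=> 42

(foreign-free (ftype-pointer-address buf))
\endschemedisplay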
|
||
|
|
||
|
\subsection{\protect\scheme{compile-profile} no longer implies \protect\scheme{generate-inspector-information} (9.0)}
|
||
|
|
||
|
In previous releases, profile and inspector source information was
|
||
|
gathered and stored together so that compiling with profiling enabled
|
||
|
required that inspector information also be stored with each code object.
|
||
|
This is no longer the case.
|
||
|
|
||
|
\subsection{\protect\scheme{case} now uses \protect\scheme{member} (9.0)}
|
||
|
|
||
|
\scheme{case} now uses \scheme{member} rather than \scheme{memv} for key
|
||
|
comparisons, a generalization that allows \scheme{case} to be used for
|
||
|
strings, lists, vectors, etc., rather than just atomic values.
|
||
|
This adds no overhead when keys are comparable with \scheme{memv},
|
||
|
since the compiler converts calls to \scheme{member} into calls to
|
||
|
\scheme{memv} (or \scheme{memq}, or even individual inline pointer
|
||
|
comparisons) when it can determine the more expensive test is not
|
||
|
required.
|
||
|
|
||
|
The \scheme{case} syntax exported by the \scheme{(rnrs)} and
|
||
|
\scheme{(rnrs base)} libraries still uses \scheme{memv} for
|
||
|
compatibility with the R6RS standard.
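
As a small illustration, the hypothetical procedure below dispatches on
string keys, which works with the {\ChezScheme} version of \scheme{case}
because the keys are compared as if by \scheme{member}:

\schemedisplay
;; hypothetical helper; the keys are strings, which memv
;; would not match reliably
(define (parse-answer s)
  (case s
    [("yes" "y") 'affirmative]
    [("no" "n") 'negative]
    [else 'unknown]))

(parse-answer (string-append "y" "es")) ;=> affirmative
(parse-answer "maybe") ;=> unknown
\endschemedisplay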
|
||
|
|
||
|
\subsection{\protect\scheme{write} and \protect\scheme{display} and foreign addresses (9.0)}
|
||
|
|
||
|
The \scheme{write} and \scheme{display} procedures now recognize
|
||
|
foreign addresses that happen to look like Scheme objects and print
|
||
|
them as \scheme{#<foreign>}; previously, \scheme{write} and
|
||
|
\scheme{display} would attempt to treat the addresses as Scheme
|
||
|
objects, typically leading to invalid memory references.
|
||
|
Some foreign addresses are indistinguishable from fixnums and
|
||
|
still print as fixnums.
|
||
|
|
||
|
\subsection{Profile-directed optimization (9.0)}
|
||
|
|
||
|
Compiled code can be instrumented to gather two kinds of
|
||
|
execution counts, source-level and block-level, via different settings
|
||
|
of the \scheme{compile-profile} parameter.
|
||
|
When \scheme{compile-profile} is set to the symbol \scheme{source}
|
||
|
at compile time, source execution counts are gathered by the generated
|
||
|
code, and when \scheme{compile-profile} is set to \scheme{block},
|
||
|
block execution counts are gathered.
|
||
|
Setting it to \scheme{#f} (the default) disables instrumentation.
|
||
|
|
||
|
Source counts are identical to the source counts gathered by generated
|
||
|
code in previous releases when compiled with
|
||
|
\scheme{compile-profile} set to \scheme{#t}, and \scheme{#t}
|
||
|
can still be used in place of \scheme{source} for backward
|
||
|
compatibility.
|
||
|
Source counts can be viewed by the programmer at the end of the run
|
||
|
of the generated code via \scheme{profile-dump-list} and
|
||
|
\scheme{profile-dump-html}.
|
||
|
|
||
|
Block counts are per \emph{basic block}.
|
||
|
Basic blocks are individual sequences of straight-line code and are
|
||
|
the building blocks of the machine code generated by the compiler.
|
||
|
Counting the number of times a block is executed is thus equivalent
|
||
|
to counting the number of times the instructions within it are
|
||
|
executed.
|
||
|
|
||
|
There is no mechanism for the programmer to view block counts, but
|
||
|
both block counts and source counts can now be saved after a sample
|
||
|
run of the generated code for use in guiding various optimizations
|
||
|
during a subsequent compilation of the same code.
|
||
|
|
||
|
The source counts can be used by ``profile-aware macros,'' i.e.,
|
||
|
macros whose expansion is guided by profiling information.
|
||
|
A profile-aware macro can use profile information to optimize
|
||
|
the code it produces.
|
||
|
For example, a macro defining an abstract datatype might choose
|
||
|
representations and algorithms based on the frequencies
|
||
|
of its operations.
|
||
|
Similarly, a macro, like \scheme{case}, that performs a set of
|
||
|
disjoint tests might choose to order those tests based on which are
|
||
|
most likely to succeed.
|
||
|
Indeed, the built-in \scheme{case} now does just that.
|
||
|
A new syntactic form, \scheme{exclusive-cond}, abstracts a common
|
||
|
use case for profile-aware macros.
|
||
|
|
||
|
The block counts are used to guide certain low-level optimizations,
|
||
|
such as block ordering and register allocation.
|
||
|
|
||
|
The procedure \scheme{profile-dump-data} writes to a specified file
|
||
|
the profile data collected during the run of a program compiled
|
||
|
with \scheme{compile-profile} set to either \scheme{source} or
|
||
|
\scheme{block}.
|
||
|
It is similar to \scheme{profile-dump-list} or \scheme{profile-dump-html}
|
||
|
but stores the profile data in a machine-readable form.
|
||
|
|
||
|
The procedure \scheme{profile-load-data} loads one or more files
|
||
|
previously created by \scheme{profile-dump-data} into an internal
|
||
|
database.
|
||
|
|
||
|
The database associates \emph{weights} with source locations or
|
||
|
blocks, where a weight is a flonum representing the ratio of the
|
||
|
location's count to the maximum count.
|
||
|
When multiple profile data sets are loaded, the weights for each
|
||
|
location are averaged across the data sets.
|
||
|
|
||
|
The procedure \scheme{profile-query-weight} accepts a source object
|
||
|
and returns the weight associated with the location identified by
|
||
|
the source object, or \scheme{#f} if no weight is associated with
|
||
|
the location.
|
||
|
This procedure is intended to be used by a profile-aware macro on
|
||
|
pieces of its input to optimize code based on profile data previously
|
||
|
stored by \scheme{profile-dump-data} and loaded by
|
||
|
\scheme{profile-load-data}.
|
||
|
|
||
|
The procedure \scheme{profile-clear-data} clears the database.
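
Taken together, these procedures suggest a two-step workflow along the
lines of the following sketch.
The file names \scheme{"app.ss"} and \scheme{"app.profile"} are
hypothetical, and the sketch assumes that loading the compiled file runs
a representative workload.

\schemedisplay
;; step 1: compile and run an instrumented build
;; ("app.ss" and "app.profile" are hypothetical file names)
(parameterize ([compile-profile 'source])
  (compile-file "app.ss"))
(load "app.so")                    ; exercise a representative workload
(profile-dump-data "app.profile")

;; step 2, typically in a fresh session: recompile using the data
(profile-load-data "app.profile")
(compile-file "app.ss")
(profile-clear-data)
\endschemedisplay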
|
||
|
|
||
|
The new \scheme{exclusive-cond} syntax is similar to \scheme{cond}
|
||
|
except it assumes the tests performed by the clauses are disjoint
|
||
|
and reorders them based on available profiling data.
|
||
|
Because the tests might be reordered, the order in which side effects
|
||
|
of the test expressions occur is undefined.
|
||
|
The built-in \scheme{case} form is implemented in terms of
|
||
|
\scheme{exclusive-cond}.
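
The following sketch shows a hypothetical use of \scheme{exclusive-cond}.
The tests are mutually exclusive and free of side effects, so reordering
them cannot change the result.

\schemedisplay
;; hypothetical classifier; the three tests are mutually exclusive,
;; so any evaluation order yields the same result
(define (classify x)
  (exclusive-cond
    [(fixnum? x) 'fixnum]
    [(pair? x) 'pair]
    [(string? x) 'string]
    [else 'other]))

(classify '(1 2)) ;=> pair
\endschemedisplay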
|
||
|
|
||
|
\subsection{New \protect\scheme{ssize_t} foreign type (9.0)}
|
||
|
|
||
|
A new foreign type, \scheme{ssize_t}, is now supported.
|
||
|
It is the signed analogue of \scheme{size_t}.
|
||
|
|
||
|
\subsection{Guardian representatives (9.0)}
|
||
|
|
||
|
When a guardian is passed a second, \emph{representative},
|
||
|
argument, the representative is returned from the guardian in place
|
||
|
of the guarded object when the guarded object is no longer accessible.
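
In the following sketch, the representative is supplied as the second
argument in the call that registers the object with the guardian.
The object and the symbol used as its representative are hypothetical,
and no result is shown for the final call because it depends on whether
the collector has yet found the pair to be inaccessible.

\schemedisplay
(define g (make-guardian))

;; register a pair, with the symbol as its representative
(let ([resource (cons 'buffer 4096)])
  (g resource 'buffer-handle))

(collect)
;; once the pair has been found inaccessible, (g) returns
;; buffer-handle rather than the pair itself; until then it
;; returns #f
(g)
\endschemedisplay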
|
||
|
|
||
|
\subsection{Library reloading on dependency change (9.0)}
|
||
|
|
||
|
A library initially imported from an object file is now reimported from
|
||
|
source when a dependency (another library or include file) has changed
|
||
|
since the library was compiled.
|
||
|
|
||
|
\subsection{Expression-editor filename completion (8.9.5)}
|
||
|
|
||
|
The expression editor now performs filename- rather than
|
||
|
command-completion within string constants.
|
||
|
It looks only at the current line to determine whether the cursor is
|
||
|
within a string constant; this can lead to the wrong kind of
|
||
|
completion for strings that cross line boundaries.
|
||
|
|
||
|
\subsection{New lock mechanisms and elimination of old lock mechanism (8.9.5)}
|
||
|
|
||
|
The built-in ftype \scheme{ftype-lock} has been eliminated along
|
||
|
with the corresponding procedures, \scheme{acquire-lock},
|
||
|
\scheme{release-lock}, and \scheme{initialize-lock}.
|
||
|
This is an incompatible change, although defining
|
||
|
\scheme{ftype-lock} and the associated procedures is straightforward
|
||
|
using the forms described below.
|
||
|
|
||
|
The functionality has been replaced and generalized by four new syntactic
|
||
|
forms that operate on lock fields wherever they appear within a foreign
|
||
|
type:
|
||
|
|
||
|
\schemedisplay
|
||
|
(ftype-init-lock! \var{T} (\var{a} ...) \var{e})
|
||
|
(ftype-lock! \var{T} (\var{a} ...) \var{e})
|
||
|
(ftype-spin-lock! \var{T} (\var{a} ...) \var{e})
|
||
|
(ftype-unlock! \var{T} (\var{a} ...) \var{e})
|
||
|
\endschemedisplay
|
||
|
|
||
|
The access chain \scheme{\var{a} \dots} must specify a word-size
|
||
|
integer represented using the native endianness, i.e., a \scheme{uptr}
|
||
|
or \scheme{iptr}.
|
||
|
It is a syntax violation when this is not the case.
|
||
|
|
||
|
For each of the forms, the expression \var{e} is evaluated first
|
||
|
and must evaluate to an ftype pointer \var{p} of type \var{T}.
|
||
|
|
||
|
\scheme{ftype-init-lock!} initializes the specified field of the foreign
|
||
|
object to which \var{p} points, puts the field into the unlocked state,
|
||
|
and returns an unspecified value.
|
||
|
|
||
|
If the field is in the unlocked state, \scheme{ftype-lock!} puts it
|
||
|
into the locked state and returns \scheme{#t}.
|
||
|
If the field is already in the locked state, \scheme{ftype-lock!}
|
||
|
returns \scheme{#f}.
|
||
|
|
||
|
\scheme{ftype-spin-lock!} loops until the lock is in the unlocked
|
||
|
state, then puts it into the locked state and returns an unspecified
|
||
|
value.
|
||
|
\emph{This operation will never return if no other thread or process
|
||
|
unlocks the field, causing interrupts and requests for collection to
|
||
|
be ignored.}
|
||
|
|
||
|
Finally, \scheme{ftype-unlock!} puts the field into the unlocked state
|
||
|
(regardless of the current state) and returns an unspecified value.
|
||
|
|
||
|
An additional pair of syntactic forms can be used when just an
|
||
|
atomic increment or decrement is required:
|
||
|
|
||
|
\schemedisplay
|
||
|
(ftype-locked-incr! \var{T} (\var{a} ...) \var{e})
|
||
|
(ftype-locked-decr! \var{T} (\var{a} ...) \var{e})
|
||
|
\endschemedisplay
|
||
|
|
||
|
As for the first set of forms, the access chain \scheme{\var{a} \dots}
|
||
|
must specify a word-size integer represented using the native endianness.
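
The following single-threaded sketch exercises the lock and increment
forms on a hypothetical ftype, \scheme{shared-counter}, with one word-size
field used as a lock and another used as a counter:

\schemedisplay
;; hypothetical ftype with a lock word and a counter word
(define-ftype shared-counter
  (struct
    [lock uptr]
    [hits uptr]))

(define sc
  (make-ftype-pointer shared-counter
    (foreign-alloc (ftype-sizeof shared-counter))))

(ftype-init-lock! shared-counter (lock) sc)
(ftype-lock! shared-counter (lock) sc) ;=> #t
(ftype-lock! shared-counter (lock) sc) ;=> #f
(ftype-unlock! shared-counter (lock) sc)

(ftype-set! shared-counter (hits) sc 0)
(ftype-locked-incr! shared-counter (hits) sc)
(ftype-ref shared-counter (hits) sc) ;=> 1
\endschemedisplay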
|
||
|
|
||
|
\subsection{\protect\scheme{ftype-pointer-null?}, \protect\scheme{ftype-pointer=?} (8.9.5)}
|
||
|
|
||
|
The new procedure \scheme{ftype-pointer-null?} can be used to compare the
|
||
|
address of its single argument, which must be an ftype pointer, against 0.
|
||
|
It returns \scheme{#t} if the address is 0 and \scheme{#f} otherwise.
|
||
|
Similarly, \scheme{ftype-pointer=?} can be used to compare the
|
||
|
addresses of two ftype-pointer arguments.
|
||
|
It returns \scheme{#t} if the addresses are the same and \scheme{#f}
|
||
|
otherwise.
|
||
|
|
||
|
These are potentially more efficient than extracting ftype-pointer
|
||
|
addresses first, which might result in bignum allocation for addresses
|
||
|
outside the fixnum range,
|
||
|
although the compiler also now
|
||
|
tries to avoid allocation when the result of a call to
|
||
|
\scheme{ftype-pointer-address} is directly compared with 0 or with the
|
||
|
result of another call to \scheme{ftype-pointer-address}, as described
|
||
|
in Section~\ref{ftpaopt}.
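
A brief sketch of both predicates, using a hypothetical ftype
\scheme{node}:

\schemedisplay
(define-ftype node (struct [value int]))

;; a null pointer and a freshly allocated one
(define null-node (make-ftype-pointer node 0))
(define n
  (make-ftype-pointer node (foreign-alloc (ftype-sizeof node))))

(ftype-pointer-null? null-node) ;=> #t
(ftype-pointer-null? n) ;=> #f
(ftype-pointer=? n n) ;=> #t
(ftype-pointer=? n null-node) ;=> #f
\endschemedisplay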
|
||
|
|
||
|
\subsection{\protect\scheme{gensym}'s new optional unique-name argument (8.9.5)}
|
||
|
|
||
|
\scheme{gensym} now accepts a second optional argument, the unique
|
||
|
name to use.
|
||
|
It must be a string and should not be used by any other gensym intended
|
||
|
to be distinct from the new gensym.
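
For example, in the following sketch both strings are hypothetical, with
the first serving as the pretty name and the second as the unique name:

\schemedisplay
;; both strings are hypothetical; the second is the unique name
(define g (gensym "tmp" "my-app-tmp-0001"))
(gensym->unique-string g) ;=> "my-app-tmp-0001"
\endschemedisplay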
|
||
|
|
||
|
\subsection{GC times now maintained with finer granularity (8.9.5)}
|
||
|
|
||
|
In previous releases, collection times as reported by \scheme{statistics}
|
||
|
or printed by \scheme{display-statistics} were gathered internally
|
||
|
with millisecond granularity at each collection, possibly leading to
|
||
|
significant inaccuracies over the course of many collections.
|
||
|
They are now maintained using high-resolution timers with generally
|
||
|
much better accuracy.
|
||
|
|
||
|
\subsection{New time types for tracking collection times (8.9.5)}
|
||
|
|
||
|
New time types \scheme{time-collector-cpu} and \scheme{time-collector-real}
|
||
|
have been added.
|
||
|
When \scheme{current-time} is passed one of these types, a time
|
||
|
object of the specified type is returned and represents the time
|
||
|
(cpu or real) spent during collection.
|
||
|
|
||
|
Previously, this information was available only via the
|
||
|
\scheme{statistics} or \scheme{display-statistics} procedures, and then
|
||
|
with lower precision.
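
The following sketch shows how the new time types might be queried; the
variable names are hypothetical.

\schemedisplay
(define gc-cpu (current-time 'time-collector-cpu))
(define gc-real (current-time 'time-collector-real))

(time-second gc-cpu)       ; whole seconds of cpu time spent collecting
(time-nanosecond gc-cpu)   ; remaining nanoseconds
(time-type gc-real) ;=> time-collector-real
\endschemedisplay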
|
||
|
|
||
|
\subsection{New storage-management introspection procedures (8.9.5)}
|
||
|
|
||
|
Three new storage-management introspection procedures have been
|
||
|
added:
|
||
|
|
||
|
\schemedisplay
|
||
|
(collections)
|
||
|
(initial-bytes-allocated)
|
||
|
(bytes-deallocated)
|
||
|
\endschemedisplay
|
||
|
|
||
|
\scheme{collections} returns the number of collections performed so
|
||
|
far by the current Scheme process.
|
||
|
|
||
|
\scheme{initial-bytes-allocated} returns the number of bytes
|
||
|
allocated after loading the boot files and before running any
|
||
|
non-boot user code.
|
||
|
|
||
|
\scheme{bytes-deallocated} returns the total number of bytes
|
||
|
deallocated by the collector.
|
||
|
|
||
|
Previously, this information was available only via the
|
||
|
\scheme{statistics} or \scheme{display-statistics}
|
||
|
procedures.
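
Each of the three procedures is called without arguments, so a simple
summary of collector activity can be gathered as in this sketch, in which
the name \scheme{gc-summary} is hypothetical:

\schemedisplay
;; hypothetical helper returning the three values as a list
(define (gc-summary)
  (list (collections)
        (initial-bytes-allocated)
        (bytes-deallocated)))

(gc-summary)
\endschemedisplay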
|
||
|
|
||
|
\subsection{New time-object manipulation procedures (8.9.5)}
|
||
|
|
||
|
Three new procedures for performing arithmetic on time objects have
|
||
|
been added, per SRFI~19:
|
||
|
|
||
|
\schemedisplay
|
||
|
(time-difference \var{t1} \var{t2}) ;=> \var{t3}
|
||
|
(add-duration \var{t1} \var{t2}) ;=> \var{t3}
|
||
|
(subtract-duration \var{t1} \var{t2}) ;=> \var{t3}
|
||
|
\endschemedisplay
|
||
|
|
||
|
\scheme{time-difference} takes two time objects \var{t1} and \var{t2},
|
||
|
which must have the same time type, and returns the result of subtracting
|
||
|
\var{t2} from \var{t1}, represented as a new time object with type
|
||
|
\scheme{time-duration}.
|
||
|
\scheme{add-duration} adds time object \var{t2}, which must be of type
|
||
|
\scheme{time-duration}, to time object \var{t1}, producing a new time object
|
||
|
\var{t3} with the same type as \var{t1}.
|
||
|
\scheme{subtract-duration} subtracts time object \var{t2}, which must be
|
||
|
of type \scheme{time-duration}, from time object \var{t1}, producing a new
|
||
|
time object \var{t3} with the same type as \var{t1}.
|
||
|
|
||
|
SRFI~19 also names destructive versions of these operators:
|
||
|
|
||
|
\schemedisplay
|
||
|
(time-difference! \var{t1} \var{t2}) ;=> \var{t3}
|
||
|
(add-duration! \var{t1} \var{t2}) ;=> \var{t3}
|
||
|
(subtract-duration! \var{t1} \var{t2}) ;=> \var{t3}
|
||
|
\endschemedisplay
|
||
|
|
||
|
These are available as well in {\ChezScheme} but are actually
|
||
|
nondestructive, i.e., entirely equivalent to the nondestructive
|
||
|
versions.
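
As an illustration, the following sketch times a hypothetical busy loop
and computes a deadline five seconds after the starting time.
It assumes the {\ChezScheme} argument order for \scheme{make-time}, i.e.,
type, nanoseconds, and seconds.

\schemedisplay
;; time a computation with the monotonic clock
(define start (current-time 'time-monotonic))
(let loop ([i 0])                ; hypothetical work
  (when (< i 1000000) (loop (+ i 1))))
(define elapsed
  (time-difference (current-time 'time-monotonic) start))
(time-type elapsed) ;=> time-duration

;; a deadline five seconds after start
(define deadline
  (add-duration start (make-time 'time-duration 0 5)))
(time-type deadline) ;=> time-monotonic
\endschemedisplay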
|
||
|
|
||
|
\subsection{Better reporting of profile counts (8.9.4, 8.9.5)}
|
||
|
|
||
|
The compiler now collects and reports profile counts for every
|
||
|
source expression that is not determined to be dead either at
|
||
|
compile time or by the time the profile information is obtained via
|
||
|
\scheme{profile-dump-list} or \scheme{profile-dump-html}.
|
||
|
Previously, the compiler suppressed profile counts for constants and
|
||
|
variable references in contexts where the information was likely (though
|
||
|
not guaranteed) to be redundant, and it dropped profile counts for some
|
||
|
forms that were optimized away, such as inlined calls, folded calls,
|
||
|
or useless code.
|
||
|
Furthermore, profile counts now uniformly represent the number of times
|
||
|
a source expression's evaluation was started, which was not always the
|
||
|
case before.
|
||
|
|
||
|
A small related enhancement has been made in the HTML output produced
|
||
|
by \scheme{profile-dump-html}.
|
||
|
Hovering over a source expression now shows, in addition to the count,
|
||
|
the starting position (line number and character) of the source expression
|
||
|
to which the count belongs.
|
||
|
This is useful for identifying when a source expression does not have its
|
||
|
own count but instead inherits the count (and color) from an enclosing
|
||
|
expression.
|
||
|
|
||
|
\subsection{Virtual registers (8.9.4)}
|
||
|
|
||
|
A limited set of \emph{virtual registers} is now supported by the compiler
|
||
|
for use by programs that require high-speed, global, and mutable storage
|
||
|
locations.
|
||
|
Referencing or assigning a virtual register is potentially faster and
|
||
|
never slower than accessing an assignable local or global variable,
|
||
|
and the code sequences for doing so are generally smaller.
|
||
|
Assignment is potentially significantly faster because there is no need
|
||
|
to track pointers from the virtual registers to young objects, as there
|
||
|
is for variable locations that might reside in older generations.
|
||
|
On threaded versions of the system, virtual registers are ``per thread''
|
||
|
and thus serve as thread-local storage in a manner that is less expensive
|
||
|
than thread parameters.
|
||
|
|
||
|
The interface consists of three procedures:
|
||
|
|
||
|
\scheme{(virtual-register-count)} returns the number of virtual registers.
|
||
|
As of this writing, the count is set at 16. This number is fixed, i.e.,
|
||
|
cannot be changed except by recompiling {\ChezScheme} from source.
|
||
|
|
||
|
\scheme{(set-virtual-register! \var{k} \var{x})} stores \var{x} in virtual
|
||
|
register \var{k}.
|
||
|
\var{k} must be a fixnum between 0 (inclusive) and the value of
|
||
|
\scheme{(virtual-register-count)} (exclusive).
|
||
|
|
||
|
\scheme{(virtual-register \var{k})} returns the value most recently
|
||
|
stored in virtual register \var{k} (on the current thread, in threaded
|
||
|
versions of the system).
|
||
|
|
||
|
To get the fastest possible speed out of the latter two procedures,
|
||
|
\var{k} should be a constant embedded right in the call
|
||
|
(or propagatable via optimization to the call).
|
||
|
To avoid putting these constants in the source code, programmers should
|
||
|
consider using identifier macros to give names to virtual registers, e.g.:
|
||
|
|
||
|
\schemedisplay
|
||
|
(define-syntax foo
|
||
|
(identifier-syntax
|
||
|
[id (virtual-register 0)]
|
||
|
[(set! id e) (set-virtual-register! 0 e)]))
|
||
|
(set! foo 'hello)
|
||
|
foo ;=> hello
|
||
|
\endschemedisplay
|
||
|
|
||
|
Virtual registers must be treated as an application-level resource, i.e.,
|
||
|
libraries intended to be used by multiple applications should generally
|
||
|
not use virtual registers, to avoid conflicts with an application's use of
|
||
|
the registers.
|
||
|
|
||
|
\subsection{24-, 40-, 48-, and 56-bit integer values (8.9.3)}
|
||
|
|
||
|
Support for storing and extracting 24-, 40-, 48-, and 56-bit integers
|
||
|
to and from records, bytevectors, and foreign types (ftypes) has been
|
||
|
added.
|
||
|
For records and ftypes, this is accomplished by declaring a field
|
||
|
to be of type
|
||
|
\scheme{integer-24}, \scheme{unsigned-24},
|
||
|
\scheme{integer-40}, \scheme{unsigned-40},
|
||
|
\scheme{integer-48}, \scheme{unsigned-48},
|
||
|
\scheme{integer-56}, or \scheme{unsigned-56}.
|
||
|
For bytevectors, this is accomplished via the following new
|
||
|
primitives:
|
||
|
|
||
|
\schemedisplay
|
||
|
bytevector-24-ref
|
||
|
bytevector-24-set!
|
||
|
bytevector-40-ref
|
||
|
bytevector-40-set!
|
||
|
bytevector-48-ref
|
||
|
bytevector-48-set!
|
||
|
bytevector-56-ref
|
||
|
bytevector-56-set!
|
||
|
\endschemedisplay
|
||
|
|
||
|
Similarly, support has been added for sending and receiving
|
||
|
24-, 40-, 48-, and 56-bit integers to and from foreign code via
|
||
|
\scheme{foreign-procedure} and \scheme{foreign-callable}.
|
||
|
Arguments and return values of type \scheme{integer-24} and
|
||
|
\scheme{unsigned-24} are passed as 32-bit quantities, while
|
||
|
those of type \scheme{integer-40}, \scheme{unsigned-40},
|
||
|
\scheme{integer-48}, \scheme{unsigned-48}, \scheme{integer-56},
|
||
|
and \scheme{unsigned-56} are passed as 64-bit quantities.
|
||
|
|
||
|
For unpacked ftypes, a 48-bit (6-byte) quantity is aligned
|
||
|
on an even two-byte boundary, while a
|
||
|
24-bit (3-byte), 40-bit (5-byte), or 56-bit (7-byte) quantity
|
||
|
is aligned on an arbitrary byte boundary.
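
The following sketch declares a hypothetical ftype, \scheme{pixel}, with
a 24-bit field and stores a value into it:

\schemedisplay
;; hypothetical ftype mixing a 24-bit and an 8-bit field
(define-ftype pixel
  (struct
    [rgb unsigned-24]
    [alpha unsigned-8]))

(define px
  (make-ftype-pointer pixel (foreign-alloc (ftype-sizeof pixel))))

(ftype-set! pixel (rgb) px #xFFAA00)
(ftype-set! pixel (alpha) px 255)
(ftype-ref pixel (rgb) px) ;=> 16755200
\endschemedisplay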
|
||
|
|
||
|
\subsection{New \protect\scheme{pariah} expression (8.9.3)}
|
||
|
|
||
|
A \scheme{pariah} expression:
|
||
|
|
||
|
\schemedisplay
|
||
|
(pariah \var{expr} \var{expr} \dots)
|
||
|
\endschemedisplay
|
||
|
|
||
|
is syntactically similar and semantically equivalent to a \scheme{begin}
|
||
|
expression but tells the compiler that the expressions within are
|
||
|
relatively unlikely to be executed.
|
||
|
This information is currently used by the compiler for prioritizing
|
||
|
allocation of registers to variables and for putting pariah code
|
||
|
out-of-line in an attempt to reduce instruction cache misses for the
|
||
|
remaining code.
|
||
|
|
||
|
A \scheme{pariah} form is generally most usefully wrapped around the
|
||
|
consequent or alternative of an \scheme{if} expression to identify which
|
||
|
is the less likely path.
|
||
|
|
||
|
The compiler implicitly treats as pariah code any code that leads
|
||
|
up to an unconditional call to \scheme{raise}, \scheme{error},
|
||
|
\scheme{errorf}, \scheme{assertion-violation}, etc., so it is not
|
||
|
necessary to wrap a \scheme{pariah} around such a call.
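
In the following sketch, a hypothetical lookup procedure marks its miss
path as unlikely.
The names \scheme{lookup} and \scheme{misses} are illustrative only.

\schemedisplay
;; hypothetical miss counter and lookup procedure
(define misses 0)

(define (lookup key table)
  (let ([entry (assq key table)])
    (if entry
        (cdr entry)
        ;; misses are expected to be rare
        (pariah
          (set! misses (+ misses 1))
          #f))))

(lookup 'b '((a . 1) (b . 2))) ;=> 2
(lookup 'z '((a . 1) (b . 2))) ;=> #f
\endschemedisplay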
|
||
|
|
||
|
At some point, there will likely be an option for gathering similar
|
||
|
information automatically via profiling.
|
||
|
In the meantime, we are interested in feedback about whether the
|
||
|
mechanism is beneficial and whether the benefit of using the
|
||
|
\scheme{pariah} form outweighs the programming overhead.
|
||
|
|
||
|
\subsection{Improved automatic library recompilation (8.9.2)}
|
||
|
|
||
|
Local imports within a library now trigger automatic recompilation
|
||
|
of the library when the imported library has been recompiled or needs
|
||
|
to be recompiled, in the same manner as imports listed directly in the
|
||
|
importing library's \scheme{library} form.
|
||
|
Changes in include files also trigger automatic recompilation.
|
||
|
|
||
|
(Automatic recompilation of a library is enabled when an import of
|
||
|
the library, e.g., in another library or in a top-level program, is
|
||
|
compiled and the parameter \scheme{compile-imported-libraries} is set
|
||
|
to a true value.)
|
||
|
|
||
|
\subsection{Redundant profile information (8.9.2)}
|
||
|
|
||
|
Profiling information is no longer produced for constants and variable
|
||
|
references where the information is likely to be redundant.
|
||
|
It is still produced in contexts where the counts are likely to differ
|
||
|
from those of the enclosing form, e.g., where a constant or variable
|
||
|
reference occurs in the consequent or alternative of an \scheme{if}
|
||
|
expression.
|
||
|
This change brings the profiling information largely in sync with
|
||
|
Version~8.4.1 and earlier, though Version~8.9.2 retains source information
|
||
|
in a few cases where it is inappropriately discarded by Version~8.4.1's
|
||
|
compiler, and Version~8.9.2 discards source information in a few cases
|
||
|
where the code has been optimized away.
|
||
|
|
||
|
\subsection{New \protect\scheme{compile-to-port} procedure (8.9.2)}
|
||
|
|
||
|
The procedure \scheme{compile-to-port} is like \scheme{compile-port}
|
||
|
but, instead of taking an input port from which it reads expressions
|
||
|
to be compiled, takes a list of expressions to be compiled.
|
||
|
As with \scheme{compile-port}, the second argument must be a binary
|
||
|
output port.
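
A minimal sketch of its use follows; the output file name
\scheme{"hello.so"} and the expressions compiled are hypothetical.

\schemedisplay
;; "hello.so" is a hypothetical output file
(let ([op (open-file-output-port "hello.so" (file-options no-fail))])
  (compile-to-port
    '((define (hello) (display "hello") (newline))
      (hello))
    op)
  (close-port op))

(load "hello.so") ; prints hello
\endschemedisplay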
|
||
|
|
||
|
\subsection{Debug levels (8.9.1)}
|
||
|
|
||
|
Newly introduced debug levels control the amount of debugging support
|
||
|
embedded in the code generated by the compiler.
|
||
|
The current debug level is controlled by the parameter
|
||
|
\scheme{debug-level} and must be set when the compiler is run to have
|
||
|
any effect on the generated code.
|
||
|
Valid debug levels are~0, 1, 2, and~3, and the default is~1.
|
||
|
At present, the only difference between debug levels is whether calls to
|
||
|
certain error-producing routines, like \scheme{error}, whether explicit
|
||
|
or as the result of an implicit run-time check (such as the pair check
|
||
|
in \scheme{car}), are treated as tail calls even when not in tail position.
|
||
|
At debug levels 0 and 1, they are treated as tail calls, and at debug
|
||
|
levels 2 and 3, they are treated as nontail calls.
|
||
|
Treating them as tail calls is more efficient, but treating them as
|
||
|
nontail calls leaves more information on the stack, which affects what
|
||
|
can be shown by the inspector.
|
||
|
|
||
|
For example, assume \scheme{f} is defined as follows:
|
||
|
|
||
|
\schemedisplay
|
||
|
(define f
|
||
|
(lambda (x)
|
||
|
(unless (pair? x) (error #f "oops"))
|
||
|
(car x)))
|
||
|
\endschemedisplay
|
||
|
|
||
|
and is called with a non-pair argument, e.g.:
|
||
|
|
||
|
\schemedisplay
|
||
|
(f 3)
|
||
|
\endschemedisplay
|
||
|
|
||
|
If the debug level is 2 or more at the time the definition is compiled,
|
||
|
the call to \scheme{f} will still be on the stack when the exception
|
||
|
is raised by \scheme{error} and will thus be visible to the inspector:
|
||
|
|
||
|
\schemedisplay
|
||
|
> (f 3)
|
||
|
Exception: oops
|
||
|
Type (debug) to enter the debugger.
|
||
|
> (debug)
|
||
|
debug> i
|
||
|
#<continuation in f> : sf
|
||
|
0: #<continuation in f>
|
||
|
1: #<system continuation in new-cafe>
|
||
|
#<continuation in f> : s
|
||
|
continuation: #<system continuation in new-cafe>
|
||
|
procedure code: (lambda (x) (if (...) ...) (car x))
|
||
|
call code: (error #f "oops")
|
||
|
frame and free variables:
|
||
|
0. x: 3
|
||
|
\endschemedisplay
|
||
|
|
||
|
On the other hand, if the debug level is 1 (the default) or 0 at the
|
||
|
time the definition of \scheme{f} is compiled, the call to \scheme{f}
|
||
|
will no longer be on the stack:
|
||
|
|
||
|
\schemedisplay
|
||
|
> (f 3)
|
||
|
Exception: oops
|
||
|
Type (debug) to enter the debugger.
|
||
|
> (debug)
|
||
|
debug> i
|
||
|
#<system continuation in new-cafe> : sf
|
||
|
0: #<system continuation in new-cafe>
|
||
|
\endschemedisplay
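
Because the debug level matters only when the compiler runs, it can be
raised around the compilation of selected code, as in the following
sketch, which assumes that the definition of \scheme{f} resides in a
hypothetical file \scheme{"f.ss"}:

\schemedisplay
;; "f.ss" is a hypothetical file containing the definition of f
(parameterize ([debug-level 2])
  (compile-file "f.ss"))
(load "f.so")
\endschemedisplay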
|
||
|
|
||
|
\subsection{Cost centers (8.9.1)}
|
||
|
|
||
|
Cost centers are used to track the bytes allocated, instructions executed,
|
||
|
and/or cpu time elapsed while evaluating selected sections of code.
|
||
|
Cost centers are created via the procedure \scheme{make-cost-center}, and
|
||
|
costs are tracked via the procedure \scheme{with-cost-center}.
|
||
|
|
||
|
Allocation and instruction counts are tracked only for code instrumented
|
||
|
for that purpose.
|
||
|
This instrumentation is controlled by the \scheme{generate-allocation-counts}
|
||
|
and \scheme{generate-instruction-counts} parameters.
|
||
|
Instrumentation is disabled by default.
|
||
|
Built-in procedures are not instrumented, nor is interpreted code or
|
||
|
non-Scheme code.
|
||
|
Elapsed time is tracked only when the optional \scheme{timed?} argument to
|
||
|
\scheme{with-cost-center} is provided and is not false.
|
||
|
|
||
|
The \scheme{with-cost-center} procedure accurately tracks costs, subject
|
||
|
to the caveats above, even when reentered with the same cost center, used
|
||
|
simultaneously in multiple threads, and exited or reentered one or more
|
||
|
times via continuation invocation.
|
||
|
|
||
|
\textbf{thread parameter:} \scheme{generate-allocation-counts}
|
||
|
|
||
|
When this parameter has a true value, the compiler inserts a short sequence of
|
||
|
instructions at each allocation point in generated code to track the amount of
|
||
|
allocation that occurs.
|
||
|
This parameter is initially false.
|
||
|
|
||
|
\textbf{thread parameter:} \scheme{generate-instruction-counts}
|
||
|
|
||
|
When this parameter has a true value, the compiler inserts a short
|
||
|
sequence of instructions in each block of generated code to track the
|
||
|
number of instructions executed by that block.
|
||
|
This parameter is initially false.
|
||
|
|
||
|
\textbf{procedure:} \scheme{(make-cost-center)}
|
||
|
|
||
|
Creates a new \scheme{cost-center} object with all of its recorded costs
|
||
|
set to zero.
|
||
|
|
||
|
\textbf{procedure:} \scheme{(cost-center? \var{obj})}
|
||
|
|
||
|
Returns \scheme{#t} if \var{obj} is a \scheme{cost-center} object, otherwise
|
||
|
returns \scheme{#f}.
|
||
|
|
||
|
\textbf{procedure:} \scheme{(with-cost-center \var{cost-center} \var{thunk})}\\
|
||
|
\textbf{procedure:} \scheme{(with-cost-center \var{timed?} \var{cost-center} \var{thunk})}
|
||
|
|
||
|
This procedure invokes \var{thunk} without arguments and returns its
|
||
|
values.
|
||
|
It also tracks, dynamically, the bytes allocated, instructions executed,
|
||
|
and cpu time elapsed while evaluating the invocation of \var{thunk} and
|
||
|
adds the tracked costs to the cost center's running record of these costs.
|
||
|
|
||
|
Allocation counts are tracked only for code compiled with the parameter
|
||
|
\scheme{generate-allocation-counts} set to true, and
|
||
|
instruction counts are tracked only for code compiled with
|
||
|
\scheme{generate-instruction-counts} set to true.
|
||
|
Cpu time is tracked only if \var{timed?} is provided and not false and
|
||
|
includes cpu time spent in instrumented, uninstrumented, and non-Scheme
|
||
|
code.
|
||
|
|
||
|
\textbf{procedure:} \scheme{(cost-center-instruction-count \var{cost-center})}
|
||
|
|
||
|
This procedure returns the number of instructions executed, as recorded by
|
||
|
\var{cost-center}.
|
||
|
|
||
|
\textbf{procedure:} \scheme{(cost-center-allocation-count \var{cost-center})}
|
||
|
|
||
|
This procedure returns the number of bytes allocated, as recorded by \var{cost-center}.
|
||
|
|
||
|
\textbf{procedure:} \scheme{(cost-center-time \var{cost-center})}
|
||
|
|
||
|
This procedure returns the cpu time recorded by \var{cost-center}.
|
||
|
|
||
|
\textbf{procedure:} \scheme{(reset-cost-center! \var{cost-center})}
|
||
|
|
||
|
This procedure resets the costs recorded by \var{cost-center} to zero.
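
The cost-center procedures described above might be combined as in the
following minimal sketch.
The names \scheme{count-up} and \scheme{cc} are hypothetical and chosen only
for illustration, and the sketch assumes the forms are compiled (not
interpreted) after the two parameters have been set, e.g., by entering them
at the REPL or loading them from a file.
No particular counts are shown, since they depend on the machine type and
compiler version.

\schemedisplay
;; Enable instrumentation before the code to be measured is compiled.
(generate-allocation-counts #t)
(generate-instruction-counts #t)

;; Hypothetical workload: allocates n pairs in instrumented code.
(define (count-up n)
  (let loop ([i 0] [acc '()])
    (if (fx= i n)
        acc
        (loop (fx+ i 1) (cons i acc)))))

(define cc (make-cost-center))

;; Passing a true timed? argument requests cpu-time tracking as well.
(with-cost-center #t cc (lambda () (count-up 1000)))

(cost-center-allocation-count cc)  ; bytes allocated by instrumented code
(cost-center-instruction-count cc) ; instructions executed by instrumented code
(cost-center-time cc)              ; cpu time recorded

;; All recorded costs return to zero.
(reset-cost-center! cc)
\endschemedisplay

Because the same cost center can be passed to several
\scheme{with-cost-center} calls, costs from separate regions of a program
accumulate in one record until \scheme{reset-cost-center!} is called.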
|
||
|
|
||
|
\subsection{Experimental access to hardware performance counters (8.9.1)}
|
||
|
|
||
|
Two system primitives, \scheme{#%$read-time-stamp-counter} and
|
||
|
\scheme{#%$read-performance-monitoring-counter}, provide access to the
|
||
|
x86 and x86\_64 hardware time-stamp counter register and to the
|
||
|
model-specific performance monitoring registers.
|
||
|
|
||
|
These primitives rely on instructions that might be restricted to run only in
|
||
|
kernel mode, depending on kernel configuration.
|
||
|
The performance monitoring counters must also be configured to enable
|
||
|
monitoring and to specify which event to monitor.
|
||
|
This can be configured only by instructions executed in kernel mode.
|
||
|
|
||
|
\textbf{procedure:} \scheme{(#%$read-time-stamp-counter)}
|
||
|
|
||
|
This procedure returns the current value of the time-stamp counter for
|
||
|
the processor core executing this code.
|
||
|
A general protection fault, which manifests as an invalid memory
|
||
|
reference exception, results if this operation is not permitted by
|
||
|
the operating system.
|
||
|
|
||
|
Since multiple processes might run on the same core between reads of
|
||
|
the time-stamp counter, the counter does not necessarily reflect time
|
||
|
spent only in the current process.
|
||
|
Also, on machines with multiple cores, the executing process might be
|
||
|
swapped to a different core with a different time-stamp counter.
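
As a rough sketch of how the time-stamp counter might be used, the
hypothetical helper \scheme{tsc-delta} below reads the counter before and
after calling a thunk and returns the difference.
It assumes an x86 or x86\_64 machine type on which the operating system
permits the read, and, for the reasons given above, the result should be
treated as an approximation that may include the activity of other
processes or a switch between cores.

\schemedisplay
;; Hypothetical helper: counter ticks observed around a call to thunk.
(define (tsc-delta thunk)
  (let ([start (#%$read-time-stamp-counter)])
    (thunk)
    (- (#%$read-time-stamp-counter) start)))

;; Example use with an arbitrary workload.
(tsc-delta
  (lambda ()
    (do ([i 0 (fx+ i 1)]) ((fx= i 100000)) (void))))
\endschemedisplay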
|
||
|
|
||
|
\textbf{procedure:} \scheme{(#%$read-performance-monitoring-counter \var{counter})}
|
||
|
|
||
|
This procedure returns the current value of the model-specific
|
||
|
performance monitoring register specified by \var{counter}.
|
||
|
\var{counter} must be a fixnum and should specify a valid performance
|
||
|
monitoring register.
|
||
|
Allowable values depend on the processor model.
|
||
|
A general protection fault, which manifests as an invalid memory
|
||
|
reference exception, results if this operation is not permitted by
|
||
|
the operating system or if the specified counter does not exist.
|
||
|
|
||
|
In order to get meaningful results, the performance monitoring registers
|
||
|
must be enabled, and the event to be monitored must be configured by
|
||
|
the performance monitoring control register.
|
||
|
This configuration can be done only by code run in kernel mode.
|
||
|
|
||
|
Since multiple processes might run on the same core between reads of
|
||
|
a performance monitoring register, the register does not necessarily reflect
|
||
|
only the activities of the current process.
|
||
|
Also, on machines with multiple cores, the executing process might be
|
||
|
swapped to a different core with its own set of performance monitoring
|
||
|
registers and possibly a different configuration for those registers.
|
||
|
|
||
|
\subsection{New inspector functionality (8.9.1)}
|
||
|
|
||
|
Within the interactive inspector, closure and frame variables can now
|
||
|
be set by name, and the forward (f) and back (b) commands can now be
|
||
|
used to move among the frames that comprise a continuation.
|
||
|
|
||
|
A new show-local (sl) command can be used to look at just the local
|
||
|
variables of a stack frame.
|
||
|
This contrasts with the show (s) command, which shows the free variables
|
||
|
of the frame's closure as well.
|
||
|
|
||
|
Errors occurring during inspection, such as attempts to assign immutable
|
||
|
variables, are handled more smoothly than in previous versions.
|
||
|
|
||
|
\subsection{Fasl support for records with non-ptr fields (8.4.1)}
|
||
|
|
||
|
The fasl writer and reader now support records with non-ptr fields,
|
||
|
e.g., integer-32, wchar, etc., allowing constant record instances with
|
||
|
such fields to appear in source code (or be introduced as constants
|
||
|
by macros) into code to be compiled via \scheme{compile-file},
|
||
|
\scheme{compile-library}, \scheme{compile-program},
|
||
|
\scheme{compile-script}, or \scheme{compile-port}.
|
||
|
Ftype-pointer fields are not supported, since storing addresses
|
||
|
in fasl files does not generally make sense.
|
||
|
|
||
|
%-----------------------------------------------------------------------------
|
||
|
\section{Bug Fixes}\label{section:bugfixes}
|
||
|
|
||
|
\subsection{Foreign-callable floating-point argument allocation for x86 (9.6.0)}
|
||
|
|
||
|
When a foreign callable receives a \scheme{double} or \scheme{float} argument, the
|
||
|
allocation of space to box the number would try to save the
|
||
|
floating-point return register in the (rare) case that allocation
|
||
|
requires a new page of memory. Saving the return register is harmless on
|
||
|
most platforms, but on x86, a save and restore involves popping then
|
||
|
pushing the x87 register stack, which is invalid if nothing was there
|
||
|
at the start.
|
||
|
|
||
|
\subsection{Code generation for a specific branch displacement on ppc32 (9.6.0)}
|
||
|
|
||
|
Branch generation would go wrong if the displacement was exactly 32,764 bytes.
|
||
|
|
||
|
\subsection{\scheme{char-} returns negative results (9.6.0)}
|
||
|
|
||
|
The character difference operator returned a large positive integer in
|
||
|
situations where the first argument is represented by a lower number than the
|
||
|
second argument. For example, \scheme{(char- #\a #\b)} on 64-bit macOS returned
|
||
|
\scheme{720575940379279351}. The fix corrects this, so that it instead
|
||
|
returns \scheme{-1}.
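
With the fix in place, \scheme{char-} behaves as a signed difference of the
characters' integer representations, as the following short illustration
suggests.

\schemedisplay
(char- #\b #\a) ;=> 1
(char- #\a #\b) ;=> -1
(char- #\a #\a) ;=> 0
\endschemedisplay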
|
||
|
|
||
|
\subsection{Certain mixed exact/inexact arithmetic comparisons (9.5.8)}
|
||
|
|
||
|
The arithmetic comparison functions (\scheme{<}, \scheme{<=}, \scheme{=},
|
||
|
\scheme{>=}, and \scheme{>}) are required to be transitive by the R6RS
|
||
|
specification, but this property was not maintained for \scheme{<=},
|
||
|
\scheme{=}, and \scheme{>=} comparisons between exact and inexact numbers
|
||
|
in the range where fixnum precision is greater than flonum precision. For
|
||
|
example, the flonum representation of \scheme{9007199254740992.0} and
|
||
|
\scheme{9007199254740993.0} is identical, but obviously
|
||
|
\scheme{9007199254740992} and \scheme{9007199254740993} (which are fixnums
|
||
|
on 64-bit systems) are not. The arithmetic comparators no longer
|
||
|
convert comparisons of a fixnum and flonum to comparisons of two flonums
|
||
|
when the fixnum cannot be converted without loss of precision.
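
The following illustration, which assumes a 64-bit system with the fix in
place, shows why transitivity was at stake.
If both exact integers compared equal to the same flonum, transitivity
would require the two integers to be equal to each other, which they are
not.

\schemedisplay
;; Both flonum literals denote the same flonum.
(= 9007199254740992.0 9007199254740993.0) ;=> #t

;; Only the exactly representable integer is equal to that flonum.
(= 9007199254740992 9007199254740992.0)   ;=> #t
(= 9007199254740993 9007199254740992.0)   ;=> #f

;; The ordering comparisons agree with the exact values.
(<= 9007199254740993 9007199254740992.0)  ;=> #f
(>= 9007199254740992.0 9007199254740993)  ;=> #f
\endschemedisplay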
|
||
|
|
||
|
\subsection{\protect\scheme{rational-valued?} and exceptional flonums (9.5.8)}
|
||
|
|
||
|
The \scheme{rational-valued?} function returned incorrect results when called on
|
||
|
a value with an inexact zero imaginary part and real part that is an exceptional
|
||
|
floating point value (i.e., an infinity or NaN). For example,
|
||
|
\scheme{(rational-valued? +inf.0+0.0i)} incorrectly returned \scheme{#t}, but now
|
||
|
returns \scheme{#f}.
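
A few representative cases, shown here as an illustration of the corrected
behavior, contrast an ordinary real part with the exceptional ones.

\schemedisplay
(rational-valued? 1.5+0.0i)    ;=> #t
(rational-valued? +inf.0+0.0i) ;=> #f
(rational-valued? +nan.0+0.0i) ;=> #f
\endschemedisplay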
|
||
|
|
||
|
\subsection{Calls to foreign-callable procedures may cause the process to terminate with
|
||
|
error 0xC0000409 STATUS\_STACK\_BUFFER\_OVERRUN on 64-bit Windows (9.5.8)}
|
||
|
|
||
|
An interaction bug between Microsoft's longjmp on 64-bit Windows and foreign-callable stack
|
||
|
frames has been fixed.
|
||
|
|
||
|
\subsection{Calls to \protect\scheme{printf} may cause an invalid memory reference at
|
||
|
compile time (9.5.8)}
|
||
|
|
||
|
A bug in the compiler that caused an invalid memory reference with particular
|
||
|
\scheme{printf} control strings and argument counts has been fixed. One example is
|
||
|
\scheme{(printf "~a~:*")}.
|
||
|
|
||
|
\subsection{Certain foreign calls with signed 8- and 16-bit integers on x86\_64 (9.5.6)}
|
||
|
|
||
|
The x86\_64 code generator now properly sign-extends foreign-call
|
||
|
arguments passed in registers via \scheme{(& integer-8)} and \scheme{(& integer-16)}.
|
||
|
|
||
|
\subsection{Bitwise right shift of negative bignum (9.5.6)}
|
||
|
|
||
|
A carry was dropped when all of the following held: a negative bignum
is shifted right by a multiple of the big-digit bit size (32), a
shifted-off bit is nonzero, and the result, before rounding to deal
with the dropped bits, would be a sequence of big digits with all one
bits.
The carry should have been delivered to a new high digit; because it
was dropped, the result was 0 instead of a negative number.
|
||
|
|
||
|
For example, \scheme{(ash (- 1 (ash 1 64)) -32)} no longer returns 0.
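
The expected result can be checked against the arithmetic definition of a
right shift as a floor division by a power of two, as in this short worked
example.

\schemedisplay
;; 1 - 2^64 shifted right by 32 bits
(ash (- 1 (ash 1 64)) -32)                ;=> -4294967296

;; the same value computed as a floor division by 2^32
(floor (/ (- 1 (expt 2 64)) (expt 2 32))) ;=> -4294967296
\endschemedisplay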
|
||
|
|
||
|
\subsection{\protect\scheme{sleep} with negative duration (9.5.6)}
|
||
|
|
||
|
Prior to this release, \scheme{sleep} of a negative duration would
|
||
|
result in an infinite pause in Windows. Now \scheme{sleep} returns
|
||
|
immediately on all platforms when given a negative duration.
|
||
|
|
||
|
\subsection{Flonum \protect\scheme{remainder} and \protect\scheme{modulo} (9.5.6)}
|
||
|
|
||
|
The \scheme{remainder} and \scheme{modulo} functions could produce
|
||
|
imprecise or wrong answers for large integer flonums. Most of the
|
||
|
repair was to use the C library's \texttt{fmod}.
|
||
|
|
||
|
\subsection{Buffering signals (9.5.4)}
|
||
|
|
||
|
Prior to this release, only one unhandled signal was buffered for
|
||
|
any signal for which a handler has been registered via
|
||
|
\scheme{register-signal-handler}, so two signals delivered in
|
||
|
quick succession could be seen as only one.
|
||
|
The system now buffers a much larger number (63 in this release) of
|
||
|
signals, and the fact that signals can be dropped has now been
|
||
|
documented.
|
||
|
|
||
|
\subsection{Clear-output bug (9.5.4)}
|
||
|
|
||
|
A bug has been fixed in which a call to \scheme{clear-output-port}
|
||
|
on a port could lead to unexpected behavior involving the port,
|
||
|
including loss of buffering or suppression of future output to the
|
||
|
port.
|
||
|
|
||
|
\subsection{Various argument type-error issues (9.5.4)}
|
||
|
|
||
|
A variety of primitive argument type-checking issues have been
|
||
|
fixed, including missing checks, misleading error messages,
|
||
|
and checks made later than appropriate, i.e., after the primitive
|
||
|
has already had side effects.
|
||
|
|
||
|
\subsection{\protect\scheme{__collect_safe}, x86\_64, and floating-point arguments or results (9.5.4)}
|
||
|
|
||
|
The \scheme{__collect_safe} mode for a foreign call or callable now
|
||
|
correctly preserves floating-point registers used for arguments or
|
||
|
results while activating or deactivating a thread on x86\_64.
|
||
|
|
||
|
\subsection{\protect\scheme{putenv} memory leak (9.5.4)}
|
||
|
|
||
|
\scheme{putenv} now calls the host system's \scheme{setenv} instead of
|
||
|
\scheme{putenv} on non-Windows hosts and avoids allocating memory that
|
||
|
is never freed, although \scheme{setenv} might do so.
|
||
|
|
||
|
\subsection{String ports from immutable strings (9.5.4)}
|
||
|
|
||
|
A bug that miscalculated the buffer size for
|
||
|
\scheme{open-string-input-port} given an immutable string has been
|
||
|
fixed.
|
||
|
|
||
|
\subsection{Multiplying $-2^{30}$ with itself on 64-bit platforms (9.5.4)}
|
||
|
|
||
|
A bug that produced the wrong sign when multiplying $-2^{30}$ with
|
||
|
itself on 64-bit platforms has been fixed.
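
Since $(-2^{30}) \times (-2^{30}) = 2^{60}$, the product must be positive.
The check below is a small illustration; the name \scheme{n} is used only
for this example.

\schemedisplay
(define n (- (expt 2 30)))  ; -1073741824

(* n n)                     ;=> 1152921504606846976
(= (* n n) (expt 2 60))     ;=> #t
\endschemedisplay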
|
||
|
|
||
|
\subsection{Compiler dropping effects from record-accessor calls (9.5.4)}
|
||
|
|
||
|
A bug that could cause the source optimizer to drop effects within
|
||
|
the argument of a record-accessor call has been fixed.
|
||
|
|
||
|
\subsection{Welcome text in macOS package file (9.5.2)}
|
||
|
|
||
|
The welcome text and copyright year in the macOS package file were
|
||
|
corrected.
|
||
|
|
||
|
\subsection{Fasl representation change for recursive ftypes (9.5.2)}
|
||
|
|
||
|
A bug in the reading of mutually recursive ftype definitions from
|
||
|
compiled files has been fixed.
|
||
|
The bug was triggered by recursive ftype definitions in which one
|
||
|
of the mutually recursive ftypes is a subtype of another, as in:
|
||
|
|
||
|
\schemedisplay
|
||
|
(define-ftype
|
||
|
[A (* B)]
|
||
|
[B (struct [h A])])
|
||
|
\endschemedisplay
|
||
|
|
||
|
It manifested in the fasl reader raising bogus ``incompatible record
type'' exceptions when two or more references to one of the ftypes
occur in separate compiled files or in separate top-level forms
|
||
|
of a file compiled via \scheme{compile-file}.
|
||
|
The bug could also have affected other record-type descriptors with
|
||
|
cycles involving parent rtds and ``extra'' fields as well as fasl
|
||
|
output created via \scheme{fasl-write}.
|
||
|
|
||
|
\subsection{Unbound object resulting from libraries combined with \protect\scheme{compile-whole-library} (9.5.1)}
|
||
|
|
||
|
A bug in \scheme{compile-whole-library} that allowed the invoke code for a
|
||
|
library included in the combined library body to be executed without first
|
||
|
invoking its binary library dependencies has been fixed.
|
||
|
This bug could arise when a member of a combined library was invoked without
|
||
|
invoking the requirements of the other libraries it was combined with. For
|
||
|
instance, consider the case where libraries \scheme{(A)} and \scheme{(B)} are
|
||
|
combined and \scheme{(B)} has dependencies on library \scheme{(A)} and binary
|
||
|
library \scheme{(C)}.
|
||
|
One possible sort order of this graph is \scheme{(C)}, \scheme{(A)},
|
||
|
\scheme{(B)}, where the invoke code for \scheme{(A)} and \scheme{(B)} is
|
||
|
combined into a single block of invoke code. If library \scheme{(A)} is
|
||
|
invoked first, it will implicitly cause the invoke code for \scheme{(B)} to be
|
||
|
invoked without invoking the code for \scheme{(C)}.
|
||
|
We address this by adding explicit dependencies between \scheme{(A)} and all
the binary libraries that precede it, and between each of the other libraries
clustered with \scheme{(A)} and \scheme{(A)} itself, so that no matter which
library clustered with \scheme{(A)} is invoked first, \scheme{(A)} will be
invoked, causing all binary libraries that precede \scheme{(A)} to be invoked.
|
||
|
It is also possible for a similar problem to exist between clusters, where
|
||
|
invoking a later cluster may invoke an earlier cluster without invoking the
|
||
|
binary dependencies for the earlier cluster.
|
||
|
We address this issue by adding an invoke requirement between each cluster and
|
||
|
the first library in the cluster that precedes it.
|
||
|
These extended invoke requirements are also added to the import requirements
|
||
|
for each library, and the dependency graph is enhanced with import requirement
|
||
|
links to ensure these are taken into account during the topological sort.
|
||
|
|
||
|
|
||
|
\subsection{Automatic recompilation and missing include files (9.5.1)}
|
||
|
|
||
|
A bug in automatic recompilation involving missing include files
|
||
|
has been fixed.
|
||
|
The bug caused automatic recompilation to fail, often with an
|
||
|
exception in \scheme{file-modification-time}, when a file specified
|
||
|
by an absolute pathname or pathname starting with ``./'' or ``../'' was
|
||
|
included via \scheme{include} during a previous compilation run and
|
||
|
is no longer present.
|
||
|
|
||
|
\subsection{Invalid memory reference instantiating \protect\scheme{foreign-callable} code object (9.5.1)}
|
||
|
|
||
|
A bug that caused evaluation of a \scheme{foreign-callable} expression in
|
||
|
code that has been collected into the static generation (e.g., when the
|
||
|
\scheme{foreign-callable} form appears in code compiled to a boot file)
|
||
|
to result in an invalid memory reference has been fixed.
|
||
|
|
||
|
\subsection{Invalid constant-folding of some calls to \protect\scheme{apply} (9.5.1)}
|
||
|
|
||
|
A bug in the source optimizer (cp0) allowed constant-folding of some calls to
|
||
|
\scheme{apply} where the last argument is not known to be a list. For example,
|
||
|
cp0 incorrectly reduced
|
||
|
\scheme{(apply zero? 0)} to \scheme{#t}
|
||
|
and reduced
|
||
|
\scheme{(lambda (x) (apply box? x) x)} to \scheme{(lambda (x) x)},
|
||
|
but now preserves these calls to \scheme{apply} so that they may raise an
|
||
|
exception.
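
The preserved behavior can be observed with a small sketch such as the one
below.
The helper \scheme{raises?} is hypothetical and simply reports whether
calling a thunk raises an exception; the sketch assumes the default (safe)
optimize level.

\schemedisplay
;; Hypothetical helper: #t if calling thunk raises, #f otherwise.
(define (raises? thunk)
  (guard (c [#t #t])
    (thunk)
    #f))

;; The last argument to apply is not a list, so an exception is raised.
(raises? (lambda () (apply zero? 0)))                         ;=> #t
(raises? (lambda () ((lambda (x) (apply box? x) x) 7)))       ;=> #t

;; A well-formed call is unaffected.
(raises? (lambda () (apply zero? '(0))))                      ;=> #f
\endschemedisplay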
|
||
|
|
||
|
\subsection{Disk-relative filenames in Windows (9.5.1)}
|
||
|
|
||
|
In Windows, filenames that start with a disk designator but no
|
||
|
directory separator are now treated as relative paths. For example,
|
||
|
\scheme{(path-absolute? "C:")} now returns \scheme{#f}, and
|
||
|
\scheme{(directory-list "C:")} now lists the files in the current
|
||
|
directory on disk C instead of the files in the root directory of disk
|
||
|
C.
|
||
|
|
||
|
In addition, \scheme{file-access-time}, \scheme{file-change-time},
|
||
|
\scheme{file-directory?}, \scheme{file-exists?},
|
||
|
\scheme{file-modification-time}, and \scheme{get-mode} no longer
|
||
|
remove trailing directory separators on Windows.
|
||
|
|
||
|
\subsection{Globally unique names on non-Windows systems no longer contain the IP address (9.5.1)}
|
||
|
|
||
|
The globally unique names of gensyms no longer contain the IP address
|
||
|
on non-Windows systems. Windows systems already used a universally
|
||
|
unique identifier.
|
||
|
|
||
|
\subsection{Invalid memory reference from \protect\scheme{fxvector} calls (9.5)}
|
||
|
|
||
|
A compiler bug that could result in an invalid memory reference or
|
||
|
some other unpleasant behavior for calls to \scheme{fxvector} in
|
||
|
which the nested subexpression to compute the new value to be stored
|
||
|
is nontrivial has been fixed.
|
||
|
This bug could also affect calls to \scheme{vector-set-fixnum!} and possibly
|
||
|
other primitive operations.
|
||
|
|
||
|
\subsection{Incorrect return code when \protect\scheme{exit} is called with multiple arguments (9.5)}
|
||
|
|
||
|
A bug in the implementation of the default exit handler with multiple
|
||
|
values has been fixed.
|
||
|
|
||
|
\subsection{Boot files containing compiled library code fail to load (9.5)}
|
||
|
|
||
|
Compiled library code may now appear within fasl objects loaded during
|
||
|
the boot process, provided that they are appended to the end of the base boot
|
||
|
file or appear within a later boot file.
|
||
|
|
||
|
\subsection{Misleading cyclic dependency error (9.5)}
|
||
|
|
||
|
The library system no longer reports a cyclic dependency error
|
||
|
during the second and subsequent attempts to visit or invoke a
|
||
|
library after the first attempt fails for some reason other than
|
||
|
an actual cyclic dependency.
|
||
|
The fix also allows a library to be visited or invoked successfully
|
||
|
on the second or subsequent attempt if the visit or invoke failed
|
||
|
for a transient reason, such as a missing or incorrect version in
|
||
|
an imported library.
|
||
|
|
||
|
\subsection{Incomplete handling of import specs within standalone export forms (9.5)}
|
||
|
|
||
|
A bug that limited the \scheme{(import \var{import-spec} \dots)} form within a
|
||
|
standalone \scheme{export} form to \scheme{(import \var{import-spec})} has been
|
||
|
fixed.
|
||
|
|
||
|
\subsection{Permission denied after deleting files or directories in Windows (9.5)}
|
||
|
|
||
|
In Windows, deleting a file or directory briefly leaves the file or
|
||
|
directory in a state where a subsequent create operation fails with
|
||
|
permission denied. This race condition is now mitigated.
|
||
|
[This bug applies to all versions up to 9.5 on Windows 7 and later.]
|
||
|
|
||
|
\subsection{Incorrect handling of offset in
|
||
|
\protect\scheme{date->time-utc} on Windows (9.5)}
|
||
|
|
||
|
A bug when \scheme{date->time-utc} is called on Windows with a
|
||
|
date-zone-offset smaller than the system's time-zone offset has been
|
||
|
fixed.
|
||
|
[This bug dated back to Version 9.5.]
|
||
|
|
||
|
\subsection{Compiler mishandling of fx /carry operations (9.5)}
|
||
|
|
||
|
A bug in the source optimizer that caused an internal compiler error when
|
||
|
folding certain calls to \scheme{fx+/carry}, \scheme{fx-/carry}, and
|
||
|
\scheme{fx*/carry} has been fixed.
|
||
|
[This bug dated back to Version 9.1.]
|
||
|
|
||
|
\subsection{Compiler mishandling of nested \protect\scheme{call-with-values} calls (9.5)}
|
||
|
|
||
|
A bug that caused an internal compiler error when optimizing certain
|
||
|
nested calls to \scheme{call-with-values} has been fixed.
|
||
|
[This bug dated back to Version 8.9.1.]
|
||
|
|
||
|
\subsection{Incorrect expansion of \protect\scheme{define-values} of no values (9.5)}
|
||
|
|
||
|
A bug in the expansion of \scheme{define-values} that caused it to produce
|
||
|
a non-definition form when used to define no values has been fixed.
|
||
|
[This bug dated back to at least Version 8.4.]
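
As an illustration, a \scheme{define-values} form that defines no values can
now be followed by further definitions in the same body, which is possible
only if it expands into a definition.
The variable names \scheme{q} and \scheme{r} are chosen only for this
example.

\schemedisplay
(let ()
  (define-values () (values))              ; defines nothing
  (define-values (q r) (div-and-mod 17 5)) ; a later definition is still allowed
  (list q r))                              ;=> (3 2)
\endschemedisplay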
|
||
|
|
||
|
\subsection{Optimizer dropping \protect\scheme{pariah} forms (9.5)}
|
||
|
|
||
|
A bug in the source optimizer that caused pariah forms to be ignored
|
||
|
has been fixed.
|
||
|
[This bug dated back to at least Version 9.3.1.]
|
||
|
|
||
|
\subsection{Invalid memory references involving complex numbers (9.5)}
|
||
|
|
||
|
A bug on 64-bit platforms that occasionally caused invalid memory
|
||
|
references when operating on inexact complex numbers or the imaginary parts
|
||
|
of inexact complex numbers has been fixed.
|
||
|
[This bug dated back to Version 8.9.1.]
|
||
|
|
||
|
\subsection{Overflow detection for left-shift operations on fixnums (9.5)}
|
||
|
|
||
|
A bug that caused \scheme{fxsll}, \scheme{fxarithmetic-shift-left},
|
||
|
and \scheme{fxarithmetic-shift} to fail to detect overflow in certain
|
||
|
cases has been fixed.
|
||
|
[This bug dated back to Version 4.0.]
|
||
|
|
||
|
\subsection{Missing \protect\scheme{enum-set-indexer} argument check (9.5)}
|
||
|
|
||
|
A missing argument check that resulted in the procedure returned by \scheme{enum-set-indexer}
|
||
|
causing an invalid memory reference when passed a non-symbol argument has been fixed.
|
||
|
[This bug dated back to Version 7.5.]
|
||
|
|
||
|
\subsection{Storage for inaccessible mutexes and conditions is reclaimed (9.5)}
|
||
|
|
||
|
The C heap storage for inaccessible mutexes and conditions is now reclaimed.
|
||
|
[This bug dated back to Version 6.5.]
|
||
|
|
||
|
\subsection{Missing guardian entries when a thread exits (9.5)}
|
||
|
|
||
|
A bug that caused guardian entries for a thread to be lost when a
|
||
|
thread exits has been fixed.
|
||
|
[This bug dated back to Version 6.5.]
|
||
|
|
||
|
\subsection{Incorrect code for certain nested \protect\scheme{if} patterns (9.5)}
|
||
|
|
||
|
A bug in the source optimizer that produced incorrect code for certain
|
||
|
nested \scheme{if} patterns has been fixed.
|
||
|
For example, the code generated for the following expression:
|
||
|
|
||
|
\schemedisplay
|
||
|
(if (if (if (if (zero? (a)) #f #t) (begin (b) #t) #f)
|
||
|
(c)
|
||
|
#f)
|
||
|
(x)
|
||
|
(y))
|
||
|
\endschemedisplay
|
||
|
|
||
|
inappropriately evaluated the subexpression \scheme{(b)} when the
|
||
|
subexpression \scheme{(a)} evaluates to 0 and not when \scheme{(a)}
|
||
|
evaluates to 1.
|
||
|
[This bug dated back to Version 9.0.]
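
The intended evaluation order can be made visible by recording each call,
as in the following sketch.
The procedure \scheme{trace-nested-if} and its internal helper
\scheme{note!} are hypothetical; \scheme{a} returns the supplied value, and
\scheme{b} and \scheme{c} are assumed to return true.

\schemedisplay
;; Returns the names of the procedures called, in order, for a given
;; return value of (a).
(define (trace-nested-if a-value)
  (let ([log '()])
    (define (note! name) (set! log (cons name log)))
    (define (a) (note! 'a) a-value)
    (define (b) (note! 'b) #t)
    (define (c) (note! 'c) #t)
    (define (x) (note! 'x) 'x)
    (define (y) (note! 'y) 'y)
    (if (if (if (if (zero? (a)) #f #t) (begin (b) #t) #f)
            (c)
            #f)
        (x)
        (y))
    (reverse log)))

(trace-nested-if 0) ;=> (a y)
(trace-nested-if 1) ;=> (a b c x)
\endschemedisplay

With correct code, \scheme{(b)} is evaluated only when \scheme{(a)} is
nonzero.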
|
||
|
|
||
|
\subsection{Leaked or unexpected \protect\scheme{cpvalid-defer} form (9.5)}
|
||
|
|
||
|
A bug in the pass of the compiler that inserts validity checks for
|
||
|
\scheme{letrec} and \scheme{letrec*} bindings has been fixed.
|
||
|
The bug resulted in an internal compiler exception with a condition
|
||
|
message regarding a leaked or unexpected \scheme{cpvalid-defer} form.
|
||
|
[This bug dated back to Version 6.9c.]
|
||
|
|
||
|
\subsection{\protect\scheme{string->number} and reader numeric syntax issues (9.4)}
|
||
|
|
||
|
\scheme{string->number} and the reader previously treated all complex
|
||
|
numbers written in polar notation that Chez Scheme cannot represent
|
||
|
exactly as inexact, even with an explicit \scheme{#e} prefix.
|
||
|
For such numbers with the \scheme{#e} prefix, \scheme{string->number}
|
||
|
now returns \scheme{#f} and the reader now raises an exception with
|
||
|
condition type \scheme{&implementation-restriction}.
|
||
|
Both still return an inexact representation for such numbers written without
|
||
|
the \scheme{#e} prefix, even if R6RS requires an exact result, i.e.,
|
||
|
even if they have no decimal point, exponent, or mantissa width.
|
||
|
|
||
|
Ratios with an exponent, like \scheme{1/2e10}, are non-standard and
|
||
|
now cause the procedure \scheme{string->number} imported from
|
||
|
\scheme{(rnrs)} to return \scheme{#f}.
|
||
|
When the reader encounters a ratio followed by an exponent while in R6RS
|
||
|
mode (i.e., when reading a library or top-level program and not following
|
||
|
an \scheme{#!chezscheme}, or when following an explicit \scheme{#!r6rs}),
|
||
|
it raises an exception.
|
||
|
|
||
|
Positive or negative zero followed by a large exponent now properly
|
||
|
produces zero rather than an infinity, e.g., \scheme{0e3000} now produces
|
||
|
\scheme{0} rather than \scheme{+inf.0}.
|
||
|
|
||
|
A rounding bug converting some small ratios into floating point numbers,
|
||
|
when those numbers fall into the range of denormalized floats, has
|
||
|
been fixed.
|
||
|
This bug also affected the reading of and conversion of strings into
|
||
|
denormalized floating-point numbers.
|
||
|
[Some of these bugs dated back to Version 3.0.]
|
||
|
|
||
|
\subsection{\protect\scheme{date->time-utc} ignoring zone-offset field (9.4)}
|
||
|
|
||
|
\scheme{date->time-utc} has been fixed to properly take into account the
|
||
|
zone-offset field.
|
||
|
[This bug dated back to Version 8.0.]
|
||
|
|
||
|
\subsection{\protect\scheme{wchar} and \protect\scheme{wchar_t} record field types fail to inline in Windows (9.4)}
|
||
|
|
||
|
On Windows, the source optimizer has been fixed to handle \scheme{wchar} and
|
||
|
\scheme{wchar_t} record field types.
|
||
|
|
||
|
\subsection{Path-related procedures cause invalid memory reference with non-string arguments in Windows (9.4)}
|
||
|
|
||
|
On Windows, the path-related procedures now raise an appropriate exception when the path argument is not a string.
|
||
|
|
||
|
\subsection{Mutex acquisition bug (9.4)}
|
||
|
|
||
|
A bug in the handling of mutexes has been fixed.
|
||
|
The bug typically presented as a spurious ``recursively locked'' exception.
|
||
|
|
||
|
\subsection{\protect\scheme{dynamic-wind} mistakenly enabling interrupts (9.3.3)}
|
||
|
|
||
|
A bug causing \scheme{dynamic-wind} to unconditionally enable
|
||
|
interrupts upon a nonlocal exit from the body thunk has been fixed.
|
||
|
Interrupts are now properly enabled only when the optional
|
||
|
\var{critical?} argument is supplied and is not false.
|
||
|
[This bug dated back to Version 6.9c.]
|
||
|
|
||
|
\subsection{Incorrect optimization of various primitives (9.3.1)}
|
||
|
|
||
|
Mistakes in our primitive database that caused the source optimizer
|
||
|
to treat \scheme{append}, \scheme{append!}, \scheme{list*},
|
||
|
\scheme{cons*}, and \scheme{record-type-parent} as always returning
|
||
|
true values have been fixed, along with mistakes that caused the
|
||
|
source optimizer to treat \scheme{null-environment},
|
||
|
\scheme{source-object-bfp}, \scheme{source-object-efp}, and
|
||
|
\scheme{source-object-sfd} as not requiring argument checks.
|
||
|
[This bug dated back to Version 6.0.]
|
||
|
|
||
|
\subsection{Increased allocation ceiling under 32-bit Windows (9.3.1)}
|
||
|
|
||
|
We have worked around a limitation in the number of distinct allocation
|
||
|
areas the Windows VirtualAlloc function permits to be allocated by
|
||
|
allocating fewer, larger chunks of memory, effectively increasing the
|
||
|
maximum size of the heap to the full amount permitted by the operating
|
||
|
system.
|
||
|
|
||
|
\subsection{Syntax errors for \protect\scheme{let} and \protect\scheme{let*} (9.2.1)}
|
||
|
|
||
|
The expander now handles \scheme{let} and \scheme{let*} in such a
|
||
|
way that certain syntax errors previously reported as syntax errors
|
||
|
in \scheme{lambda} are now reported properly as syntax errors in
|
||
|
\scheme{let} or \scheme{let*}. This includes duplicate identifier
|
||
|
errors for \scheme{let} and errors involving internal definitions
|
||
|
for both \scheme{let} and \scheme{let*}.
|
||
|
|
||
|
\subsection{Dropped \protect\scheme{profile-dump-html} calls (9.0)}
|
||
|
|
||
|
A bug that caused effect-context calls to \scheme{profile-dump-html}
|
||
|
to be dropped at optimize-level 3 has been fixed.
|
||
|
[This bug dated back to Version 7.5.]
|
||
|
|
||
|
\subsection{Proper treatment of imported meta bindings (8.9.3)}
|
||
|
|
||
|
A deficiency in the handling of library dependencies that prevented meta
|
||
|
definitions exported in one library from being used reliably by a macro
|
||
|
defined in another library has been fixed.
|
||
|
Handling imported meta bindings involves tracking
|
||
|
visit-visit-requirements, which for a library \scheme{(A)} is the set of
|
||
|
libraries that must be visited (rather than invoked) when \scheme{(A)}
|
||
|
is visited.
|
||
|
An attempt to assign a meta variable imported from a library now results
|
||
|
in a syntax error.
|
||
|
[This bug dated back to Version 7.9.1.]
|
||
|
|
||
|
\subsection{Reexport of identifiers with properties (8.9.3)}
|
||
|
|
||
|
A bug that prevented an identifier given a property via
|
||
|
\scheme{define-property} from being exported from a library \scheme{(A)},
|
||
|
imported into and reexported from a second library \scheme{(B)}, and
|
||
|
imported from both \scheme{(A)} and \scheme{(B)} into and reexported
|
||
|
from a third library \scheme{(C)} has been fixed.
|
||
|
[This bug dated back to Version 8.1.]
|
||
|
|
||
|
\subsection{Cyclic record-type descriptors (8.4.1)}
|
||
|
|
||
|
The fasl (fast load) format used for compiled files now supports cyclic
|
||
|
record-type descriptors (RTDs), which are produced for recursive ftype
|
||
|
definitions.
|
||
|
Previously, compiling a file containing a recursive ftype definition
|
||
|
and subsequently loading the file resulted in corruption of the ftype
|
||
|
descriptor used to typecheck ftype pointers, potentially leading to
|
||
|
incorrect behavior or invalid memory references.
|
||
|
[This bug dated back to Version 8.2.]
|
||
|
|
||
|
\subsection{Invalid folding of record accesses (8.4.1)}
|
||
|
|
||
|
A bug that caused the optimizer to fold calls to record accessors applied
|
||
|
to a constant value of the wrong type, sometimes resulting in compile-time
|
||
|
invalid memory references or other compile-time errors, has been fixed.
|
||
|
[This bug dated back to Version 8.4.]
|
||
|
|
||
|
\subsection{4GB+ allocation for Windows x86\_64 (8.4.1)}
|
||
|
|
||
|
A bug that prevented objects larger than 4GB from being created under Windows
|
||
|
x86\_64 has been fixed.
|
||
|
[This bug dated back to Version 8.4.]
|
||
|
|
||
|
%-----------------------------------------------------------------------------
|
||
|
\section{Performance Enhancements}\label{section:performance}
|
||
|
|
||
|
\subsection{Special-cased basic arithmetic operations (9.5.4)}
|
||
|
|
||
|
The basic arithmetic operations (addition, subtraction, multiplication,
|
||
|
division) are now much faster when presented with certain special
|
||
|
cases, e.g., multiplication of a large integer by 1 or -1 or addition
|
||
|
of a large integer and 0.
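
The following sketch shows calls that fall into the special cases
mentioned above; the variable \scheme{n} is chosen only for
illustration.

\schemedisplay
(define n (expt 2 100000))   ; a large integer (bignum)
(* n 1)                      ; multiplication by 1
(* n -1)                     ; multiplication by -1
(+ n 0)                      ; addition of 0
\endschemedisplay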
|
||
|
|
||
|
\subsection{Faster right-shift of large integers (9.5.4)}
|
||
|
|
||
|
Right shifting a large integer is now much faster in most cases
|
||
|
where the shift count is a significant fraction of the number of
|
||
|
bits in the large integer.
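
For example, in the sketch below the shift count is nearly as large
as the bit length of the shifted integer; the specific numbers are
chosen only for illustration.

\schemedisplay
(define n (expt 2 100000))                 ; a 100001-bit integer
(bitwise-arithmetic-shift-right n 99990)   ; evaluates to 1024
\endschemedisplay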
|
||
|
|
||
|
\subsection{Faster object-file loading (9.5.4)}\label{sec:faster-object-file-loading}
|
||
|
|
||
|
Visiting an object file (to obtain only compile-time information and
|
||
|
code) and revisiting an object file (to obtain only run-time information
|
||
|
and code) are now faster, because revisions to the fasl format, fasl
|
||
|
writer, and fasl reader allow run-time code to be seeked past when
|
||
|
visiting and compile-time code to be seeked past when revisiting.
|
||
|
For compressed object files (the default), seeking still requires
|
||
|
reading all of the data, but the cost of parsing the fasl format and
|
||
|
building objects in the skipped portions is avoided, as are certain
|
||
|
side effects, such as associating record type descriptors with their
|
||
|
uids.
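
The two operations can be requested explicitly with \scheme{visit}
and \scheme{revisit}, as in the following sketch, which assumes a
hypothetical source file \scheme{app.ss} compiled to the default
output path.

\schemedisplay
(compile-file "app.ss")   ; produces app.so
(visit "app.so")          ; compile-time information and code only
(revisit "app.so")        ; run-time information and code only
\endschemedisplay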
|
||
|
|
||
|
Similarly, recompile information is now placed at the front of each
|
||
|
object file where it can be loaded separately from
|
||
|
the remainder of an object file without even seeking past the other
|
||
|
portions of the file.
|
||
|
Recompile information is used by \scheme{import} (when
|
||
|
\scheme{compile-imported-libraries} is \scheme{#t}) and by maybe-compile
|
||
|
routines such as \scheme{maybe-compile-program} to help determine
|
||
|
whether recompilation is necessary.
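
A minimal sketch of a build step that relies on recompile information
follows; the file name \scheme{main.ss} is hypothetical.

\schemedisplay
;; recompile main.ss, and any imported libraries, only if out of date
(parameterize ([compile-imported-libraries #t])
  (maybe-compile-program "main.ss"))
\endschemedisplay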
|
||
|
|
||
|
Importing a library from an object file now causes the object file
|
||
|
to be visited rather than fully loaded. (Libraries were already
|
||
|
just revisited when required for their run-time code, e.g., when
|
||
|
used from a top-level program.)
|
||
|
|
||
|
Together these changes can significantly reduce compile-time and
|
||
|
run-time overhead, particularly in applications that make use of
|
||
|
a large number of libraries.
|
||
|
|
||
|
\subsection{Faster \protect\scheme{profile-release-counters} (9.5.4)}
|
||
|
|
||
|
\scheme{profile-release-counters} is now generation-friendly, meaning
|
||
|
it does not incur any overhead for code objects in generations that
|
||
|
have not been collected since the last call to \scheme{profile-release-counters}.
|
||
|
Also, it no longer allocates memory when counters are released.
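
The sketch below shows one way these procedures might be combined in
a profiling session.
The file name \scheme{app.ss} is hypothetical, and the sequence is an
illustration rather than a recommended workflow.

\schemedisplay
(parameterize ([compile-profile 'source])
  (load "app.ss"))                ; compile and run with profiling enabled
(define counts (profile-dump))    ; obtain the accumulated counts
(profile-clear)                   ; reset the counts
(profile-release-counters)        ; release counters no longer needed
\endschemedisplay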
|
||
|
|
||
|
\subsection{Reduced cost for obtaining profile counts (9.5.4)}
|
||
|
|
||
|
The cost of obtaining profile counts via \scheme{profile-dump} and
|
||
|
other mechanisms has been reduced significantly.
|
||
|
|
||
|
\subsection{Better code for \protect\scheme{bytevector} (9.5.1)}
|
||
|
|
||
|
The compiler now generates better inline code for the \scheme{bytevector}
|
||
|
procedure.
|
||
|
Instead of one byte memory write for each argument, it writes up
|
||
|
to four (32-bit machines) or eight (64-bit machines) bytes at a
|
||
|
time, which almost always results in fewer instructions and fewer
|
||
|
writes.
|
||
|
|
||
|
\subsection{\protect\scheme{vector-for-each} and \protect\scheme{string-for-each} improvement (9.5.1)}
|
||
|
|
||
|
The last call to the procedure passed to \scheme{vector-for-each}
|
||
|
or \scheme{string-for-each} is now reliably implemented as a tail
|
||
|
call, as was already the case for \scheme{for-each}.
|
||
|
|
||
|
\subsection{Lambda commonization (9.5.1)}
|
||
|
|
||
|
After running the main source optimization pass (cp0), the
|
||
|
compiler optionally runs a \emph{commonization} pass, which
|
||
|
commonizes code for similar lambda expressions.
|
||
|
The parameter \scheme{commonization-level} controls whether the
|
||
|
commonization pass is run and, if so, how aggressive it is.
|
||
|
The parameter's value must be a nonnegative exact integer ranging
|
||
|
from 0 through 9. When the parameter is set to 0, the default,
|
||
|
commonization is not run. Otherwise, higher values result in more
|
||
|
commonization.
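
For instance, commonization might be enabled for a single compilation
as sketched below; the level 4 and the file name \scheme{app.ss} are
arbitrary choices made for illustration.

\schemedisplay
(parameterize ([commonization-level 4])
  (compile-file "app.ss"))
\endschemedisplay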
|
||
|
|
||
|
\subsection{Improved compile times (9.5.1)}
|
||
|
|
||
|
Compile times are now lower, sometimes by an order of magnitude or
|
||
|
more, for procedures with thousands of parameters, local variables,
|
||
|
and compiler-introduced temporaries.
|
||
|
For such procedures, the register/frame allocator proactively spills
|
||
|
variables with large live ranges, cutting down on the size and cost
|
||
|
of building the conflict graph used to represent pairs of variables
|
||
|
that are live at the same time and therefore cannot share a location.
|
||
|
|
||
|
\subsection{Improved oblist management (9.3.3)}
|
||
|
|
||
|
As a result of improvements in the handling of the oblist (symbol table),
|
||
|
the storage for a symbol is often reclaimed more quickly after it
|
||
|
becomes inaccessible, less space is set aside for the oblist at
|
||
|
start-up, oblist lookups are faster when the oblist contains a large
|
||
|
number of symbols, and the minimum cost of a maximum-generation
|
||
|
collection has been cut significantly, down from tens of microseconds
|
||
|
to just a handful on contemporary hardware.
|
||
|
|
||
|
\subsection{Reduced maximum-generation collection overhead (9.3.3)}
|
||
|
|
||
|
Various changes in the storage manager have reduced the amount of
|
||
|
extra memory required for managing heap storage and increased the
|
||
|
likelihood that memory can be returned to the O/S as the heap
|
||
|
shrinks.
|
||
|
Returning memory to the O/S is now faster, so the minimum time for
|
||
|
a maximum-generation collection, or any other collection where
|
||
|
release of memory to the O/S is enabled, has been cut.
|
||
|
|
||
|
\subsection{Faster library load times (9.3.1)}
|
||
|
|
||
|
Libraries now load faster at both compile and run time, with more
|
||
|
pronounced improvements when dozens of libraries or more are being
|
||
|
loaded.
|
||
|
|
||
|
\subsection{Partially static record instances (9.3.1)}
|
||
|
|
||
|
The source optimizer now maintains information about partially static
|
||
|
record instances to eliminate field accesses and type checks when a
|
||
|
binding site for a record instance is visible to the access or checking
|
||
|
code.
|
||
|
For example,
|
||
|
|
||
|
\schemedisplay
|
||
|
(let ()
|
||
|
(import scheme)
|
||
|
(define-record foo ([immutable ptr a] [immutable ptr b]))
|
||
|
(define (inc r) (make-foo (foo-a r) (+ (foo-b r) 1)))
|
||
|
(lambda (x)
|
||
|
(let* ([r (make-foo 37 x)]
|
||
|
[r (inc r)]
|
||
|
[r (inc r)])
|
||
|
r)))
|
||
|
\endschemedisplay
|
||
|
|
||
|
is reduced by the source optimizer down to:
|
||
|
|
||
|
\schemedisplay
|
||
|
(lambda (x) ($record '#<record type foo> 37 (+ (+ x 1) 1)))
|
||
|
\endschemedisplay
|
||
|
|
||
|
where \scheme{$record} is a low-level primitive for creating record
|
||
|
instances.
|
||
|
That is, the source optimizer eliminates the intermediate record
|
||
|
structures, record references, and type checks, in addition to
|
||
|
creating the record-type descriptor at compile time, eliminating
|
||
|
the record-constructor descriptor, record constructor, and record
|
||
|
accessors produced by expansion of the record definition.
|
||
|
|
||
|
\subsection{More source-optimizer improvements (9.3.1)}
|
||
|
|
||
|
The source optimizer now handles \scheme{apply} with a known-list
|
||
|
final argument, e.g., a constant list or list constructed directly
|
||
|
within the apply operation via \scheme{cons}, \scheme{list}, or
|
||
|
\scheme{list*} (\scheme{cons*}) as if it were an ordinary call,
|
||
|
i.e., without the \scheme{apply} and without the constant list
|
||
|
wrapper or list constructor.
|
||
|
For example:
|
||
|
|
||
|
\schemedisplay
|
||
|
(apply apply apply + (list 1 (cons 2 (list x (cons* 4 '(5 6))))))
|
||
|
\endschemedisplay
|
||
|
|
||
|
folds down to \scheme{(+ 18 x)}.
|
||
|
While not common at the source level, patterns like this can
|
||
|
materialize as the result of other source optimizations,
|
||
|
particularly inlining.
|
||
|
|
||
|
The source optimizer now also reduces applications of \scheme{car} and
|
||
|
\scheme{cdr} to the list-building operators \scheme{cons} and
|
||
|
\scheme{list}, e.g.:
|
||
|
|
||
|
\schemedisplay
|
||
|
(car (cons \var{e_1} \var{e_2})) ;-> (begin \var{e_2} \var{e_1})
|
||
|
(car (list \var{e_1} \var{e_2} \var{e_3})) ;-> (begin \var{e_2} \var{e_3} \var{e_1})
|
||
|
(cdr (list \var{e_1} \var{e_2} \var{e_3})) ;-> (begin \var{e_1} (list \var{e_2} \var{e_3}))
|
||
|
\endschemedisplay
|
||
|
|
||
|
discarding side-effect-free expressions in the \scheme{begin} forms
|
||
|
where appropriate.
|
||
|
It similarly treats calls of \scheme{vector-ref} on \scheme{vector};
|
||
|
\scheme{list-ref} on \scheme{list}, \scheme{list*}, and \scheme{cons*};
|
||
|
\scheme{string-ref} on \scheme{string}; and \scheme{fxvector-ref}
|
||
|
on \scheme{fxvector}, taking care with \scheme{string-ref} and
|
||
|
\scheme{fxvector-ref} not to optimize when doing so might mask an
|
||
|
invalid type of argument to a safe constructor.
|
||
|
|
||
|
Finally, the source optimizer now removes certain unnecessary
|
||
|
\scheme{let} bindings within the constraints of evaluation-order
|
||
|
preservation.
|
||
|
For example,
|
||
|
|
||
|
\schemedisplay
|
||
|
(let ([x \var{e_1}] [y \var{e_2}]) (list (cons x y) 7))
|
||
|
\endschemedisplay
|
||
|
|
||
|
reduces to:
|
||
|
|
||
|
\schemedisplay
|
||
|
(list (cons \var{e_1} \var{e_2}) 7)
|
||
|
\endschemedisplay
|
||
|
|
||
|
Such bindings commonly arise from inlining. Eliminating them tends
|
||
|
to make the output of \scheme{expand/optimize} more readable.
|
||
|
|
||
|
The impact on performance is minimal, but it can result in smaller
|
||
|
expressions and thus enable more inlining within the same size limits.
|
||
|
|
||
|
\subsection{Improved foreign-pointer address handling (9.3.1)}
|
||
|
|
||
|
Various composed operations on ftypes now avoid allocating
|
||
|
and dereferencing intermediate ftype pointers, i.e., \scheme{ftype-ref},
|
||
|
\scheme{ftype-set!}, \scheme{ftype-init-lock!}, \scheme{ftype-lock!},
|
||
|
\scheme{ftype-unlock!}, \scheme{ftype-spin-lock!},
|
||
|
\scheme{ftype-locked-incr!}, or \scheme{ftype-locked-decr!} applied
|
||
|
directly to the result of \scheme{ftype-ref}, \scheme{ftype-&ref}, or
|
||
|
\scheme{make-ftype-pointer}.
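
The following sketch shows one such composed operation.
The ftypes \scheme{A} and \scheme{B} and the procedure
\scheme{b-a-x} are hypothetical.

\schemedisplay
(define-ftype A (struct [x int] [y int]))
(define-ftype B (struct [a (* A)] [n int]))

;; b is assumed to be an ftype pointer to a B whose a field is initialized
(define (b-a-x b)
  ;; the inner ftype-ref yields a pointer to an A, which the outer
  ;; ftype-ref consumes directly
  (ftype-ref A (x) (ftype-ref B (a) b)))
\endschemedisplay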
|
||
|
|
||
|
\subsection{New source optimizations (9.2.1)}
|
||
|
|
||
|
The source optimizer does a few new optimizations: it folds
|
||
|
calls to \scheme{symbol->string}, \scheme{string->symbol}, and
|
||
|
\scheme{gensym->unique-string} if the argument is known at compile
|
||
|
time and has the right type; it folds zero-argument calls to
|
||
|
\scheme{vector}, \scheme{string}, \scheme{bytevector}, and
|
||
|
\scheme{fxvector}; and it discards subsumed case-lambda clauses,
|
||
|
e.g., the second clause in
|
||
|
\scheme{(case-lambda [(x . y) \var{e_1}] [(x y) \var{e_2}])}.
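
One way to observe these optimizations, assuming the source optimizer
is enabled as it is by default, is to pass sample expressions to
\scheme{expand/optimize}.
The expressions below are illustrative only, and the printed results
may vary with compiler settings.

\schemedisplay
(expand/optimize '(symbol->string 'abc))
(expand/optimize '(vector))
(expand/optimize
  '(case-lambda
     [(x . y) (list x y)]
     [(x y) (cons x y)]))   ; subsumed by the first clause
\endschemedisplay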
|
||
|
|
||
|
\subsection{Reduced stack requirements after large apply (9.2)}
|
||
|
|
||
|
A call to \scheme{apply} with a very long argument list can cause a
|
||
|
large chunk of memory to be allocated for the topmost portion of
|
||
|
the stack.
|
||
|
This space is now reclaimed during the next collection.
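
A call of the following form is one way to produce such a long
argument list; the length is an arbitrary choice made for
illustration.

\schemedisplay
(apply + (iota 1000000))   ; one million arguments
(collect)                  ; the extra stack space can now be reclaimed
\endschemedisplay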
|
||
|
|
||
|
\subsection{Improved symbol-hashtable performance (9.2)\label{sec:symbol-hashtable-performance}}
|
||
|
|
||
|
The performance of operations on symbol hashtables has been improved
|
||
|
generally over previous releases by eliminating call overhead for the
|
||
|
hash and equality functions.
|
||
|
Further improvements are possible with the use of the new type-specific
|
||
|
symbol-hashtable operators (Section~\ref{sec:symbol-hashtables}).
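
A brief sketch of the type-specific operators follows; the table
\scheme{ht} and its contents are hypothetical.

\schemedisplay
(define ht (make-hashtable symbol-hash eq?))   ; a symbol hashtable
(symbol-hashtable-set! ht 'apple 3)
(symbol-hashtable-update! ht 'apple (lambda (n) (+ n 1)) 0)
(symbol-hashtable-ref ht 'apple 0)             ; evaluates to 4
\endschemedisplay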
|
||
|
|
||
|
\subsection{Reduced library-invocation time, memory consumption (9.1)}
|
||
|
|
||
|
The amount of time required to invoke a library and the amount of memory
|
||
|
occupied by the library when the library is invoked as the result of a
|
||
|
run-time dependency of another library or a top-level program have both
|
||
|
been reduced by ``revisiting'' rather than ``invoking'' the library,
|
||
|
effectively leaving the compile-time information on disk unless and
until it is needed.
|
||
|
|
||
|
\subsection{Discarding relocation tables for static code objects (9.1)}
|
||
|
|
||
|
Unless the command-line parameter \scheme{--retain-static-relocation}
|
||
|
is supplied, the collector now discards relocation tables for code
|
||
|
objects when the code objects are promoted to the static generation,
|
||
|
either at boot time via heap compaction or via a call to \scheme{collect}
|
||
|
with the symbol \scheme{static} as the target generation.
|
||
|
This results in a significant reduction in the memory occupied by the
|
||
|
code object (around 20\% in our tests).
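
A sketch of an explicit promotion to the static generation follows.
It collects all generations and is shown only to illustrate the form
of the call.

\schemedisplay
;; collect every generation, placing surviving objects in the
;; static generation
(collect (collect-maximum-generation) 'static)
\endschemedisplay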
|
||
|
|
||
|
\subsection{Guardian registration (9.1)}
|
||
|
|
||
|
The code to register an object with a guardian is now open-coded, at
|
||
|
the cost of some additional work during the next collection.
|
||
|
The result is a modest net improvement in registration overhead (around
|
||
|
15\% in our tests).
|
||
|
Of potentially greater importance when threaded, each registration no
|
||
|
longer requires synchronization.
|
||
|
|
||
|
\subsection{Generated code improvements (9.1)}
|
||
|
|
||
|
The compiler generates better code in several small ways, resulting
|
||
|
in small decreases in code size and corresponding small
|
||
|
performance improvements in the range of 1--5\% in our tests.
|
||
|
|
||
|
\subsection{Reduced collector overhead for large heaps (9.0)}
|
||
|
|
||
|
In previous releases, a factor in collector performance was the
|
||
|
overall size of the heap (measured both in number of pages and the
|
||
|
amount of virtual memory spanned by the heap).
|
||
|
Through various changes to the data structures used to support the
|
||
|
storage manager, this factor has been eliminated, which can
|
||
|
significantly reduce the cost of collecting a younger generation
|
||
|
with a small number of accessible objects relative to overall heap
|
||
|
size.
|
||
|
In our experiments, the minimum cost of collection on contemporary
|
||
|
hardware exceeded 100 microseconds for heaps of 64MB or more and 5
|
||
|
milliseconds for heaps of 1GB or more.
|
||
|
The minimum cost grew in proportion to the heap size from there.
|
||
|
This is now fixed for all heap sizes at just a few microseconds.
|
||
|
|
||
|
\subsection{Reduced mutation overhead (9.0)}
|
||
|
|
||
|
Improvements in the compiler and storage manager have been made to
|
||
|
reduce the cost of tracking possible pointers from older to younger
|
||
|
generations when objects are mutated.
|
||
|
|
||
|
\subsection{Improved foreign-pointer address handling (8.9.5)\label{ftpaopt}}
|
||
|
|
||
|
Ftype pointers with constant addresses are now created at compile
|
||
|
time, with ftype-pointer address checks optimized away as well.
|
||
|
|
||
|
Bignum allocation overhead is avoided for addresses outside the
|
||
|
fixnum range when the results of two \scheme{ftype-pointer-address}
|
||
|
calls are directly compared or the result of one
|
||
|
\scheme{ftype-pointer-address} call is directly compared with 0.
|
||
|
That is, comparisons like:
|
||
|
|
||
|
\schemedisplay
|
||
|
(= (ftype-pointer-address x) 0)
|
||
|
(= (ftype-pointer-address x) (ftype-pointer-address y))
|
||
|
\endschemedisplay
|
||
|
|
||
|
are effectively optimized to:
|
||
|
|
||
|
\schemedisplay
|
||
|
(ftype-pointer-null? x)
|
||
|
(ftype-pointer=? x y)
|
||
|
\endschemedisplay
|
||
|
|
||
|
This optimization is performed when the comparison procedure is
|
||
|
\scheme{=}, \scheme{eqv?}, or \scheme{equal?} and the arguments
|
||
|
are given in either order.
|
||
|
The optimization is also performed when \scheme{zero?} is applied directly
|
||
|
to the result of \scheme{ftype-pointer-address}.
|
||
|
|
||
|
Bignum allocation overhead is also avoided at optimize-level~3
|
||
|
when \scheme{ftype-pointer-address} is used in combination with
|
||
|
\scheme{make-ftype-pointer} to effect a type cast, as in:
|
||
|
|
||
|
\schemedisplay
|
||
|
(make-ftype-pointer T (ftype-pointer-address x))
|
||
|
\endschemedisplay
|
||
|
|
||
|
Both bignum and ftype-pointer allocation are avoided when the result
|
||
|
of such a cast is used directly as the base pointer in an
|
||
|
\scheme{ftype-ref}, \scheme{ftype-&ref}, \scheme{ftype-set!},
|
||
|
\scheme{ftype-locked-incr!}, \scheme{ftype-locked-decr!},
|
||
|
\scheme{ftype-init-lock!}, \scheme{ftype-lock!}, \scheme{ftype-spin-lock!},
|
||
|
or \scheme{ftype-unlock!} form, as in:
|
||
|
|
||
|
\schemedisplay
|
||
|
(ftype-ref T (fld) (make-ftype-pointer T (ftype-pointer-address x)))
|
||
|
\endschemedisplay
|
||
|
|
||
|
These optimizations do not occur when the calls to
|
||
|
\scheme{ftype-pointer-address} are not nested directly within the outer
|
||
|
form, as when a \scheme{let} binding is used to name the result of the
|
||
|
\scheme{ftype-pointer-address} call, e.g.:
|
||
|
|
||
|
\schemedisplay
|
||
|
(let ([addr (ftype-pointer-address x)]) (= addr 0))
|
||
|
\endschemedisplay
|
||
|
|
||
|
In other places where \scheme{ftype-pointer-address} is used, the compiler
|
||
|
now open-codes the extraction and (if necessary) bignum allocation,
|
||
|
reducing overhead by the cost of a procedure call.
|
||
|
|
||
|
\subsection{Improved performance when profiling (8.9.5)}
|
||
|
|
||
|
In addition to improvements in the tracking of profile counts, the
|
||
|
run-time overhead for gathering profile information has gone down by
|
||
|
5--10\% in our tests and is now typically around 10\% of the total
|
||
|
unprofiled run time.
|
||
|
(Unprofiled code is also slightly faster, but by less than 2\% in
|
||
|
our tests.)
|
||
|
|
||
|
\subsection{New compiler back-end (8.9.1, 8.9.2, 8.9.5)}
|
||
|
|
||
|
Versions starting with 8.9.1 employ a new compiler back end that is
|
||
|
structured as a series of nanopasses and replaces the old linear-time
|
||
|
register allocator with a graph-coloring register allocator.
|
||
|
Compilation with the new back end is substantially slower (up to a factor
|
||
|
of two) than with the old back end, while code generated with the new
|
||
|
back end is faster (14--40\% depending on architecture and optimization
|
||
|
level) in our tests.
|
||
|
These improvements are independent of improvements
|
||
|
resulting from cross-library constant folding and inlining
|
||
|
(Section~\ref{subsection:clcfai}).
|
||
|
The code generated for a specific program might be faster or slower.
|
||
|
|
||
|
\subsection{Open-coding of \protect\scheme{make-guardian} (8.9.4)}
|
||
|
|
||
|
Calls to \scheme{make-guardian} are now open-coded by the compiler to
|
||
|
expose the implicit resulting \scheme{case-lambda} expression so that
|
||
|
calls to the guardian can themselves be inlined, thus reducing the overhead
|
||
|
for registering objects with a guardian and querying the guardian for
|
||
|
resurrected objects.
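
For reference, the registration and query operations that benefit
are sketched below; the names \scheme{g} and \scheme{x} are chosen
only for illustration.

\schemedisplay
(define g (make-guardian))
(define x (cons 'a 'b))
(g x)          ; register the pair with the guardian
(set! x #f)    ; drop the only other reference to the pair
(collect)      ; the pair may now be found inaccessible
(g)            ; the pair if it has been resurrected, otherwise #f
\endschemedisplay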
|
||
|
|
||
|
\subsection{Improved open-coding of \protect\scheme{make-parameter} and \protect\scheme{make-thread-parameter} (8.9.4)}
|
||
|
|
||
|
\scheme{make-parameter} and \scheme{make-thread-parameter}
|
||
|
are now open-coded in all cases to expose the implicit resulting
|
||
|
\scheme{case-lambda} expression.
|
||
|
(They were already open-coded when the second, \emph{filter},
|
||
|
argument was a \scheme{lambda} expression or primitive name.)
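
The sketch below shows a call of the kind that was not previously
open-coded, since the filter is a reference to a user-defined
procedure.
The names \scheme{check-depth} and \scheme{max-depth} are
hypothetical.

\schemedisplay
(define (check-depth x)
  (unless (and (fixnum? x) (fx>= x 0))
    (errorf 'max-depth "invalid depth ~s" x))
  x)

;; the filter argument is neither a lambda expression nor a primitive name
(define max-depth (make-parameter 10 check-depth))

(max-depth)      ; evaluates to 10
(max-depth 20)   ; sets the value after passing it through check-depth
\endschemedisplay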
|
||
|
|
||
|
\subsection{Cross-library constant folding and inlining (8.9.2)\label{subsection:clcfai}}
|
||
|
|
||
|
The compiler now propagates constants and inlines simple procedures
|
||
|
across library boundaries.
|
||
|
A simple procedure is one that, after optimization of the exporting
|
||
|
library, is smaller than a given threshold, contains no free references
|
||
|
to other bindings in the exporting library, and contains no constants
|
||
|
that cannot be copied without breaking pointer identity.
|
||
|
The size threshold is determined, as for inlining within a library or
|
||
|
other compilation unit, by the parameter \scheme{cp0-score-limit}.
|
||
|
In this case, the size threshold is determined based on the size
|
||
|
\emph{before} inlining rather than the size \emph{after} inlining,
|
||
|
which is often more conservative.
|
||
|
Omitting larger procedures that might generate less code when inlined in
|
||
|
a particular context reduces the amount of information that must be stored
|
||
|
in the exporting library's object code to support cross-library inlining.
|
||
|
|
||
|
One particularly useful benefit of this optimization is that record
|
||
|
predicates, accessors, mutators, and (depending on protocols)
|
||
|
constructors created by a record definition in one library and exported
|
||
|
by another are inlined in the importing library, just as if the record
|
||
|
type were defined in the importing library.
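
As an illustration, consider the following pair of libraries.
The library names and the procedure \scheme{norm-squared} are
hypothetical, and whether a given call is actually inlined depends on
the criteria described above.

\schemedisplay
(library (point)
  (export make-point point? point-x point-y)
  (import (chezscheme))
  (define-record-type point (fields x y)))

(library (geometry)
  (export norm-squared)
  (import (chezscheme) (point))
  ;; point-x and point-y are candidates for cross-library inlining here
  (define (norm-squared p)
    (+ (* (point-x p) (point-x p))
       (* (point-y p) (point-y p)))))
\endschemedisplay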
|
||
|
|
||
|
\end{document}
|