\input texinfo @c -*- texinfo -*-
@comment ========================================================
@comment %**start of header
@setfilename m4.info
@include version.texi
@settitle GNU M4 @value{VERSION} macro processor
@setchapternewpage odd
@ifnothtml
@setcontentsaftertitlepage
@end ifnothtml
@finalout

@c @tabchar{}
@c ----------
@c The testsuite expects literal tab output in some examples, but
@c literal tabs in texinfo lead to formatting issues.
@macro tabchar
@ @c
@end macro

@c @ovar{ARG}
@c -------------------
@c The ARG is an optional argument. To be used for macro arguments in
@c their documentation (@defmac).
@macro ovar{varname}
@r{[}@var{\varname\}@r{]}@c
@end macro

@c @dvar{ARG, DEFAULT}
@c -------------------
@c The ARG is an optional argument, defaulting to DEFAULT. To be used
@c for macro arguments in their documentation (@defmac).
@macro dvar{varname, default}
@r{[}@var{\varname\} = @samp{\default\}@r{]}@c
@end macro

@comment %**end of header
@comment ========================================================

@copying

This manual (@value{UPDATED}) is for @acronym{GNU} M4 (version
@value{VERSION}), a package containing an implementation of the m4 macro
language.

Copyright @copyright{} 1989, 1990, 1991, 1992, 1993, 1994, 2004, 2005,
2006, 2007, 2008, 2009 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the @acronym{GNU} Free Documentation License,
Version 1.2 or any later version published by the Free Software
Foundation; with no Invariant Sections, no Front-Cover Texts, and no
Back-Cover Texts. A copy of the license is included in the section
entitled ``@acronym{GNU} Free Documentation License.''
@end quotation
@end copying

@dircategory Text creation and manipulation
@direntry
* M4: (m4). A powerful macro processor.
@end direntry

@titlepage
@title GNU M4, version @value{VERSION}
@subtitle A powerful macro processor
@subtitle Edition @value{EDITION}, @value{UPDATED}
@author by Ren@'e Seindal, Fran@,{c}ois Pinard,
@author Gary V. Vaughan, and Eric Blake
@author (@email{bug-m4@@gnu.org})

@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@contents

@ifnottex
@node Top
@top GNU M4
@insertcopying
@end ifnottex

@acronym{GNU} @code{m4} is an implementation of the traditional UNIX macro
processor. It is mostly SVR4 compatible, although it has some
extensions (for example, handling more than 9 positional parameters
to macros). @code{m4} also has builtin functions for including
files, running shell commands, doing arithmetic, etc. Autoconf needs
@acronym{GNU} @code{m4} for generating @file{configure} scripts, but not for
running them.

@acronym{GNU} @code{m4} was originally written by Ren@'e Seindal, with
subsequent changes by Fran@,{c}ois Pinard and other volunteers
on the Internet. All names and email addresses can be found in the
files @file{m4-@value{VERSION}/@/AUTHORS} and
@file{m4-@value{VERSION}/@/THANKS} from the @acronym{GNU} M4
distribution.

This is release @value{VERSION}. It is now considered stable: future
releases in the 1.4.x series are only meant to fix bugs, increase speed,
or improve documentation. However@dots{}

An experimental feature, which would improve the usefulness of
@code{m4}, allows changing the syntax of what is a @dfn{word} in
@code{m4}. You should use:
@comment ignore
@example
./configure --enable-changeword
@end example
@noindent
if you want this feature compiled in. The current implementation
slows down @code{m4} considerably and is hardly acceptable. In the
future, @code{m4} 2.0 will come with a different set of new features
that provide similar capabilities, but without the inefficiencies, so
changeword will go away and @emph{you should not count on it}.

@menu
* Preliminaries:: Introduction and preliminaries
* Invoking m4:: Invoking @code{m4}
* Syntax:: Lexical and syntactic conventions

* Macros:: How to invoke macros
* Definitions:: How to define new macros
* Conditionals:: Conditionals, loops, and recursion

* Debugging:: How to debug macros and input

* Input Control:: Input control
* File Inclusion:: File inclusion
* Diversions:: Diverting and undiverting output

* Text handling:: Macros for text handling
* Arithmetic:: Macros for doing arithmetic
* Shell commands:: Macros for running shell commands
* Miscellaneous:: Miscellaneous builtin macros
* Frozen files:: Fast loading of frozen state

* Compatibility:: Compatibility with other versions of @code{m4}
* Answers:: Correct version of some examples

* Copying This Package:: How to make copies of the overall M4 package
* Copying This Manual:: How to make copies of this manual
* Indices:: Indices of concepts and macros

@detailmenu
--- The Detailed Node Listing ---

Introduction and preliminaries

* Intro:: Introduction to @code{m4}
* History:: Historical references
* Bugs:: Problems and bugs
* Manual:: Using this manual

Invoking @code{m4}

* Operation modes:: Command line options for operation modes
* Preprocessor features:: Command line options for preprocessor features
* Limits control:: Command line options for limits control
* Frozen state:: Command line options for frozen state
* Debugging options:: Command line options for debugging
* Command line files:: Specifying input files on the command line

Lexical and syntactic conventions

* Names:: Macro names
* Quoted strings:: Quoting input to @code{m4}
* Comments:: Comments in @code{m4} input
* Other tokens:: Other kinds of input tokens
* Input processing:: How @code{m4} copies input to output

How to invoke macros

* Invocation:: Macro invocation
* Inhibiting Invocation:: Preventing macro invocation
* Macro Arguments:: Macro arguments
* Quoting Arguments:: On Quoting Arguments to macros
* Macro expansion:: Expanding macros

How to define new macros

* Define:: Defining a new macro
* Arguments:: Arguments to macros
* Pseudo Arguments:: Special arguments to macros
* Undefine:: Deleting a macro
* Defn:: Renaming macros
* Pushdef:: Temporarily redefining macros

* Indir:: Indirect call of macros
* Builtin:: Indirect call of builtins

Conditionals, loops, and recursion

* Ifdef:: Testing if a macro is defined
* Ifelse:: If-else construct, or multibranch
* Shift:: Recursion in @code{m4}
* Forloop:: Iteration by counting
* Foreach:: Iteration by list contents
* Stacks:: Working with definition stacks
* Composition:: Building macros with macros

How to debug macros and input

* Dumpdef:: Displaying macro definitions
* Trace:: Tracing macro calls
* Debug Levels:: Controlling debugging output
* Debug Output:: Saving debugging output

Input control

* Dnl:: Deleting whitespace in input
* Changequote:: Changing the quote characters
* Changecom:: Changing the comment delimiters
* Changeword:: Changing the lexical structure of words
* M4wrap:: Saving text until end of input

File inclusion

* Include:: Including named files
* Search Path:: Searching for include files

Diverting and undiverting output

* Divert:: Diverting output
* Undivert:: Undiverting output
* Divnum:: Diversion numbers
* Cleardivert:: Discarding diverted text

Macros for text handling

* Len:: Calculating length of strings
* Index macro:: Searching for substrings
* Regexp:: Searching for regular expressions
* Substr:: Extracting substrings
* Translit:: Translating characters
* Patsubst:: Substituting text by regular expression
* Format:: Formatting strings (printf-like)

Macros for doing arithmetic

* Incr:: Decrement and increment operators
* Eval:: Evaluating integer expressions

Macros for running shell commands

* Platform macros:: Determining the platform
* Syscmd:: Executing simple commands
* Esyscmd:: Reading the output of commands
* Sysval:: Exit status
* Mkstemp:: Making temporary files

Miscellaneous builtin macros

* Errprint:: Printing error messages
* Location:: Printing current location
* M4exit:: Exiting from @code{m4}

Fast loading of frozen state

* Using frozen files:: Using frozen files
* Frozen file format:: Frozen file format

Compatibility with other versions of @code{m4}

* Extensions:: Extensions in @acronym{GNU} M4
* Incompatibilities:: Facilities in System V m4 not in GNU M4
* Other Incompatibilities:: Other incompatibilities

Correct version of some examples

* Improved exch:: Solution for @code{exch}
* Improved forloop:: Solution for @code{forloop}
* Improved foreach:: Solution for @code{foreach}
* Improved copy:: Solution for @code{copy}
* Improved m4wrap:: Solution for @code{m4wrap}
* Improved cleardivert:: Solution for @code{cleardivert}
* Improved capitalize:: Solution for @code{capitalize}
* Improved fatal_error:: Solution for @code{fatal_error}

How to make copies of the overall M4 package

* GNU General Public License:: License for copying the M4 package

How to make copies of this manual

* GNU Free Documentation License:: License for copying this manual

Indices of concepts and macros

* Macro index:: Index for all @code{m4} macros
* Concept index:: Index for many concepts

@end detailmenu
@end menu

@node Preliminaries
@chapter Introduction and preliminaries

This first chapter explains what @acronym{GNU} @code{m4} is, where @code{m4}
comes from, how to read and use this documentation, how to call the
@code{m4} program, and how to report bugs about it. It concludes by
giving tips for reading the remainder of the manual.

The following chapters then detail all the features of the @code{m4}
language.

@menu
* Intro:: Introduction to @code{m4}
* History:: Historical references
* Bugs:: Problems and bugs
* Manual:: Using this manual
@end menu

@node Intro
@section Introduction to @code{m4}

@cindex overview of @code{m4}
@code{m4} is a macro processor, in the sense that it copies its
input to the output, expanding macros as it goes. Macros are either
builtin or user-defined, and can take any number of arguments.
Besides just doing macro expansion, @code{m4} has builtin functions
for including named files, running shell commands, doing integer
arithmetic, manipulating text in various ways, performing recursion,
etc.@dots{} @code{m4} can be used either as a front-end to a compiler,
or as a macro processor in its own right.
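
As a minimal sketch of this copy-and-expand behavior (the macro name
@code{greet} is hypothetical, chosen only for illustration), plain text
passes through unchanged while a macro call is replaced by its
expansion. Lines starting with @samp{@result{}} show output; the empty
output line after the definition is the newline that followed the
@code{define} call in the input:

@comment ignore
@example
define(`greet', `Hello, $1!')
@result{}
greet(`world') and plain text
@result{}Hello, world! and plain text
@end example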

The @code{m4} macro processor is widely available on all UNIXes, and has
been standardized by @acronym{POSIX}.
Usually, only a small percentage of users are aware of its existence.
However, those who find it often become committed users. The
popularity of @acronym{GNU} Autoconf, which requires @acronym{GNU}
@code{m4} for @emph{generating} @file{configure} scripts, is an incentive
for many to install it, even though these people will not themselves
program in @code{m4}. @acronym{GNU} @code{m4} is mostly compatible with the
System V, Release 3 version, except for some minor differences.
@xref{Compatibility}, for more details.

Some people find @code{m4} to be fairly addictive. They first use
@code{m4} for simple problems, then take on bigger and bigger challenges,
learning how to write complex sets of @code{m4} macros along the way.
Once really addicted, users write sophisticated @code{m4}
applications even to solve simple problems, devoting more time to
debugging their @code{m4} scripts than to doing real work. Beware that
@code{m4} may be dangerous for the health of compulsive programmers.

@node History
@section Historical references

@cindex history of @code{m4}
@cindex @acronym{GNU} M4, history of
@code{GPM} was an important ancestor of @code{m4}. See
C. Strachey: ``A General Purpose Macro generator'', Computer Journal
8,3 (1965), pp.@: 225 ff. @code{GPM} is also succinctly described in
David Gries's classic ``Compiler Construction for Digital Computers''.

The classic B. Kernighan and P.J. Plauger: ``Software Tools'',
Addison-Wesley, Inc.@: (1976) describes and implements a Unix
macro-processor language, which inspired Dennis Ritchie to write
@code{m3}, a macro processor for the AP-3 minicomputer.

Kernighan and Ritchie then joined forces to develop the original
@code{m4}, as described in ``The M4 Macro Processor'', Bell
Laboratories (1977). It had only 21 builtin macros.

While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
the true intricacies of real life: macros can be recognized without
being pre-announced, skipping whitespace or end-of-lines is easier,
more constructs are builtin instead of derived, etc.

Originally, the Kernighan and Plauger macro-processor, and then
@code{m3}, formed the engine for the Rational FORTRAN preprocessor,
that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4}
was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.

Ren@'e Seindal released his implementation of @code{m4}, @acronym{GNU}
@code{m4},
in 1990, with the aim of removing the artificial limitations in many
of the traditional @code{m4} implementations, such as maximum line
length, macro size, or number of macros.

The late Professor A. Dain Samples described and implemented a further
evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
Language: 2nd edition'', Electronic Announcement on comp.compilers
newsgroup (1992).

Fran@,{c}ois Pinard took over maintenance of @acronym{GNU} @code{m4} in
1992, until 1994 when he released @acronym{GNU} @code{m4} 1.4, which was
the stable release for 10 years. It was at this time that @acronym{GNU}
Autoconf decided to require @acronym{GNU} @code{m4} as its underlying
engine, since all other implementations of @code{m4} had too many
limitations.

More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2 which
addressed some long standing bugs in the venerable 1.4 release. Then in
2005, Gary V. Vaughan collected together the many patches to
@acronym{GNU} @code{m4} 1.4 that were floating around the net and
released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and
prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8.
More bug fixes were incorporated in 2007, with releases 1.4.9 and
1.4.10. Eric continued with some portability fixes for 1.4.11 and
1.4.12 in 2008, and 1.4.13 in 2009.

Meanwhile, development has continued on new features for @code{m4}, such
as dynamic module loading and additional builtins. When complete,
@acronym{GNU} @code{m4} 2.0 will start a new series of releases.

@node Bugs
@section Problems and bugs

@cindex reporting bugs
@cindex bug reports
@cindex suggestions, reporting
If you have problems with @acronym{GNU} M4 or think you've found a bug,
please report it. Before reporting a bug, make sure you've actually
found a real bug. Carefully reread the documentation and see if it
really says you can do what you're trying to do. If it's not clear
whether you should be able to do something or not, report that too; it's
a bug in the documentation!

Before reporting a bug or trying to fix it yourself, try to isolate it
to the smallest possible input file that reproduces the problem. Then
send us the input file and the exact results @code{m4} gave you. Also
say what you expected to occur; this will help us decide whether the
problem was really in the documentation.

Once you've got a precise problem, send e-mail to
@email{bug-m4@@gnu.org}. Please include the version number of @code{m4}
you are using. You can get this information with the command
@kbd{m4 --version}. Also provide details about the platform you are
executing on.

Non-bug suggestions are always welcome as well. If you have questions
about things that are unclear in the documentation or are just obscure
features, please report them too.

@node Manual
@section Using this manual

@cindex examples, understanding
This manual contains a number of examples of @code{m4} input and output,
and a simple notation is used to distinguish input, output and error
messages from @code{m4}. Examples are set out from the normal text, and
shown in a fixed width font, like this:

@comment ignore
@example
This is an example of an example!
@end example

To distinguish input from output, all output from @code{m4} is prefixed
by the string @samp{@result{}}, and all error messages by the string
@samp{@error{}}. When showing how command line options affect matters,
the command line is shown with a prompt @samp{$ @kbd{like this}};
otherwise, you can assume that a simple @kbd{m4} invocation will work.
Thus:

@comment ignore
@example
$ @kbd{command line to invoke m4}
Example of input line
@result{}Output line from m4
@error{}and an error message
@end example

The sequence @samp{^D} in an example indicates the end of the input
file. The sequence @samp{@key{NL}} refers to the newline character.
The majority of these examples are self-contained, and you can run them
with similar results by invoking @kbd{m4 -d}. In fact, the testsuite
that is bundled in the @acronym{GNU} M4 package consists of the examples
in this document! Some of the examples assume that your current
directory is located where you unpacked the installation, so if you plan
on following along, you may find it helpful to do this now:

@comment ignore
@example
$ @kbd{cd m4-@value{VERSION}}
@end example

As each of the predefined macros in @code{m4} is described, a prototype
call of the macro will be shown, giving descriptive names to the
arguments, e.g.,

@deffn Composite example (@var{string}, @dvar{count, 1}, @
  @ovar{argument}@dots{})
This is a sample prototype. There is not really a macro named
@code{example}, but this documents that if there were, it would be a
Composite macro, rather than a Builtin. It requires at least one
argument, @var{string}. Remember that in @code{m4}, there must not be a
space between the macro name and the opening parenthesis, unless it was
intended to call the macro without any arguments. The brackets around
@var{count} and @var{argument} show that these arguments are optional.
If @var{count} is omitted, the macro behaves as if @var{count} were
@samp{1}, whereas if @var{argument} is omitted, the macro behaves as if
it were the empty string. A blank argument is not the same as an omitted
argument. For example, @samp{example(`a')}, @samp{example(`a',`1')},
and @samp{example(`a',`1',)} would behave identically with @var{count}
set to @samp{1}; while @samp{example(`a',)} and @samp{example(`a',`')}
would explicitly pass the empty string for @var{count}. The ellipsis
(@samp{@dots{}}) shows that the macro processes additional arguments
after @var{argument}, rather than ignoring them.
@end deffn
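
The difference between an omitted argument and a blank one can be
observed with a small helper. This is only a sketch: @code{nargs} is a
hypothetical user-defined macro, not a builtin, and it relies on
@samp{$#} expanding to the number of arguments actually passed:

@comment ignore
@example
define(`nargs', `$#')
@result{}
nargs
@result{}0
nargs()
@result{}1
nargs(`a')
@result{}1
nargs(`a',)
@result{}2
nargs (`a')
@result{}0 (a)
@end example

@noindent
In the last line, the space before the parenthesis means that
@code{nargs} is called without arguments, and the parenthesized text is
then copied to the output with its quotes removed.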

@cindex numbers
All macro arguments in @code{m4} are strings, but some are given
special interpretation, e.g., as numbers, file names, regular
expressions, etc. The documentation for each macro will state how the
parameters are interpreted, and what happens if the argument cannot be
parsed according to the desired interpretation. Unless specified
otherwise, a parameter specified to be a number is parsed as a decimal,
even if the argument has leading zeros; and parsing the empty string as
a number results in 0 rather than an error, although a warning will be
issued.
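
As a hedged illustration of these parsing rules, assuming the
@code{incr} and @code{eval} builtins behave as documented in their own
sections: @code{incr} treats a leading zero as decimal, whereas
@code{eval} is one of the macros that specifies otherwise and reads a
leading zero as octal. The exact wording and line number of the warning
may differ between releases:

@comment ignore
@example
incr(`010')
@result{}11
eval(`010')
@result{}8
incr()
@error{}m4:stdin:3: empty string treated as 0 in builtin `incr'
@result{}1
@end example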

This document consistently writes and uses @dfn{builtin}, without a
hyphen, as if it were an English word. This is how the @code{builtin}
primitive is spelled within @code{m4}.

@node Invoking m4
@chapter Invoking @code{m4}

@cindex command line
@cindex invoking @code{m4}
The format of the @code{m4} command is:

@comment ignore
@example
@code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
@end example

@cindex command line, options
@cindex options, command line
@cindex @env{POSIXLY_CORRECT}
All options begin with @samp{-}, or if long option names are used, with
@samp{--}. A long option name need not be written completely; any
unambiguous prefix is sufficient. @acronym{POSIX} requires @code{m4} to
recognize options intermixed with files, even when
@env{POSIXLY_CORRECT} is set in the environment. Most options take
effect at startup regardless of their position, but some are documented
below as taking effect after any files that occurred earlier in the
command line. The argument @option{--} is a marker to denote the end of
options.

With short options, options that do not take arguments may be combined
into a single command line argument with subsequent options, options
with mandatory arguments may be provided either as a single command line
argument or as two arguments, and options with optional arguments must
be provided as a single argument. In other words,
@kbd{m4 -QPDfoo -d a -df} is equivalent to
@kbd{m4 -Q -P -D foo -d -df -- ./a}, although the latter form is
considered canonical.

With long options, options with mandatory arguments may be provided with
an equal sign (@samp{=}) in a single argument, or as two arguments, and
options with optional arguments must be provided as a single argument.
In other words, @kbd{m4 --def foo --debug a} is equivalent to
@kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
considered canonical (not to mention more robust, in case a future
version of @code{m4} introduces an option named @option{--default}).

@code{m4} understands the following options, grouped by functionality.

@menu
* Operation modes:: Command line options for operation modes
* Preprocessor features:: Command line options for preprocessor features
* Limits control:: Command line options for limits control
* Frozen state:: Command line options for frozen state
* Debugging options:: Command line options for debugging
* Command line files:: Specifying input files on the command line
@end menu

@node Operation modes
@section Command line options for operation modes

Several options control the overall operation of @code{m4}:

@table @code
@item --help
Print a help summary on standard output, then immediately exit
@code{m4} without reading any input files or performing any other
actions.

@item --version
Print the version number of the program on standard output, then
immediately exit @code{m4} without reading any input files or
performing any other actions.

@item -E
@itemx --fatal-warnings
@cindex errors, fatal
@cindex fatal errors
Controls the effect of warnings. If unspecified, then execution
continues and exit status is unaffected when a warning is printed. If
specified exactly once, warnings become fatal; when one is issued,
execution continues, but the exit status will be non-zero. If specified
multiple times, then execution halts with non-zero status the first time
a warning is issued. The introduction of behavior levels is new to M4
1.4.9; for behavior consistent with earlier versions, you should specify
@option{-E} twice.

@item -i
@itemx --interactive
@itemx -e
Makes this invocation of @code{m4} interactive. This means that all
output will be unbuffered, and interrupts will be ignored. The
spelling @option{-e} exists for compatibility with other @code{m4}
implementations, and issues a warning because it may be withdrawn in a
future version of @acronym{GNU} M4.

@item -P
@itemx --prefix-builtins
Internally modify @emph{all} builtin macro names so they all start with
the prefix @samp{m4_}. For example, using this option, one should write
@samp{m4_define} instead of @samp{define}, and @samp{m4___file__}
instead of @samp{__file__}. This option has no effect if @option{-R}
is also specified.

@item -Q
@itemx --quiet
@itemx --silent
Suppress warnings, such as missing or superfluous arguments in macro
calls, or treating the empty string as zero.

@item --warn-macro-sequence@r{[}=@var{regexp}@r{]}
Issue a warning if the regular expression @var{regexp} has a non-empty
match in any macro definition (either by @code{define} or
@code{pushdef}). Empty matches are ignored; therefore, supplying the
empty string as @var{regexp} disables any warning. If the optional
@var{regexp} is not supplied, then the default regular expression is
@samp{\$\(@{[^@}]*@}\|[0-9][0-9]+\)} (a literal @samp{$} followed by
multiple digits or by an open brace), since these sequences will
change semantics in the default operation of @acronym{GNU} M4 2.0 (due
to a change in how more than 9 arguments in a macro definition will be
handled, @pxref{Arguments}). Providing an alternate regular
expression can provide a useful reverse lookup feature of finding
where a macro is defined to have a given definition.

@item -W @var{regexp}
@itemx --word-regexp=@var{regexp}
Use @var{regexp} as an alternative syntax for macro names. This
experimental option will not be present in all @acronym{GNU} @code{m4}
implementations (@pxref{Changeword}).
@end table
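
The effect of @option{-P} can be sketched with a short session. The
macro name @code{x} is arbitrary; the point is that the unprefixed
spelling of a builtin is no longer recognized and is copied through as
ordinary text, with only the quotes removed:

@comment ignore
@example
$ @kbd{m4 -P}
define(`x', `1')
@result{}define(x, 1)
m4_define(`x', `1')
@result{}
x
@result{}1
@end example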

@node Preprocessor features
@section Command line options for preprocessor features

@cindex macro definitions, on the command line
@cindex command line, macro definitions on the
@cindex preprocessor features
Several options allow @code{m4} to behave more like a preprocessor.
Macro definitions and deletions can be made on the command line, the
search path can be altered, and the output file can track where the
input came from. These features occur with the following options:

@table @code
@item -D @var{name}@r{[}=@var{value}@r{]}
@itemx --define=@var{name}@r{[}=@var{value}@r{]}
This enters @var{name} into the symbol table. If @samp{=@var{value}} is
missing, the value is taken to be the empty string. The @var{value} can
be any string, and the macro can be defined to take arguments, just as
if it was defined from within the input. This option may be given more
than once; order with respect to file names is significant, and
redefining the same @var{name} loses the previous value.

@item -I @var{directory}
@itemx --include=@var{directory}
Make @code{m4} search @var{directory} for included files that are not
found in the current working directory. @xref{Search Path}, for more
details. This option may be given more than once.

@item -s
@itemx --synclines
@cindex synchronization lines
@cindex location, input
@cindex input location
Generate synchronization lines, for use by the C preprocessor or other
similar tools. Order is significant with respect to file names. This
option is useful, for example, when @code{m4} is used as a
front end to a compiler. Source file name and line number information
is conveyed by directives of the form @samp{#line @var{linenum}
"@var{file}"}, which are inserted as needed into the middle of the
output. Such directives mean that the following line originated or was
expanded from the contents of input file @var{file} at line
@var{linenum}. The @samp{"@var{file}"} part is often omitted when
the file name did not change from the previous directive.

Synchronization directives are always given on complete lines by
themselves. When a synchronization discrepancy occurs in the middle of
an output line, the associated synchronization directive is delayed
until the next newline that does not occur in the middle of a quoted
string or comment.

@comment options: -s
@example
define(`twoline', `1
2')
@result{}#line 2 "stdin"
@result{}
changecom(`/*', `*/')
@result{}
define(`comment', `/*1
2*/')
@result{}#line 5
@result{}
dnl no line
hello
@result{}#line 7
@result{}hello
twoline
@result{}1
@result{}#line 8
@result{}2
comment
@result{}/*1
@result{}2*/
one comment `two
three'
@result{}#line 10
@result{}one /*1
@result{}2*/ two
@result{}three
goodbye
@result{}#line 12
@result{}goodbye
@end example

@item -U @var{name}
@itemx --undefine=@var{name}
This deletes any predefined meaning @var{name} might have. Obviously,
only predefined macros can be deleted in this way. This option may be
given more than once; undefining a @var{name} that does not have a
definition is silently ignored. Order is significant with respect to
file names.
@end table
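
A short session may make the @option{-D} and @option{-U} options more
concrete. The macro names @code{greeting}, @code{flag} and @code{twice}
are hypothetical; the single quotes protect @samp{$1} from the shell,
which is assumed here to be a @acronym{POSIX}-style shell:

@comment ignore
@example
$ @kbd{m4 -D greeting=hello -D flag -D 'twice=$1$1' -U len}
greeting, world
@result{}hello, world
ifdef(`flag', `yes', `no')
@result{}yes
twice(`ab')
@result{}abab
len(`abc')
@result{}len(abc)
@end example

@noindent
Because @code{flag} was given without @samp{=@var{value}}, it is defined
as the empty string, which is still enough for @code{ifdef} to succeed.
With @code{len} undefined, the call is no longer recognized, so the text
is copied through with only the quotes removed.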

@node Limits control
@section Command line options for limits control

There are some limits within @code{m4} that can be tuned. For
compatibility, @code{m4} also accepts some options that control limits
in other implementations, but which are automatically unbounded (limited
only by your hardware and operating system constraints) in @acronym{GNU}
@code{m4}.

@table @code
@item -g
@itemx --gnu
Enable all the extensions in this implementation. In this release of
M4, this option is always on by default; it is currently only useful
when overriding a prior use of @option{--traditional}. However, having
@acronym{GNU} behavior as default makes it impossible to write a
strictly @acronym{POSIX}-compliant client that avoids all incompatible
@acronym{GNU} M4 extensions, since such a client would have to use the
non-@acronym{POSIX} command-line option to force full @acronym{POSIX}
behavior. Thus, a future version of M4 will be changed to implicitly
use the option @option{--traditional} if the environment variable
@env{POSIXLY_CORRECT} is set. Projects that intentionally use
@acronym{GNU} extensions should consider using @option{--gnu} to state
their intentions, so that the project will not mysteriously break if the
user upgrades to a newer M4 and has @env{POSIXLY_CORRECT} set in their
environment.

@item -G
|
||
@itemx --traditional
|
||
Suppress all the extensions made in this implementation, compared to the
|
||
System V version. @xref{Compatibility}, for a list of these.
|
||
|
||
@item -H @var{num}
|
||
@itemx --hashsize=@var{num}
|
||
Set the size of the internal hash table for symbol lookup to @var{num} entries.
|
||
For better performance, the number should be prime, but this is not
|
||
checked. The default is 509 entries. It should not be necessary to
|
||
increase this value, unless you define an excessive number of macros.
|
||
|
||
@item -L @var{num}
|
||
@itemx --nesting-limit=@var{num}
|
||
@cindex nesting limit
|
||
@cindex limit, nesting
|
||
Artificially limit the nesting of macro calls to @var{num} levels,
|
||
stopping program execution if this limit is ever exceeded. When not
|
||
specified, nesting defaults to unlimited on platforms that can detect
|
||
stack overflow, and to 1024 levels otherwise. A value of zero means
|
||
unlimited; but then heavily nested code could potentially cause a stack
|
||
overflow.
|
||
|
||
The precise effect of this option is more correctly associated
|
||
with textual nesting than dynamic recursion. It has been useful
|
||
when some complex @code{m4} input was generated by mechanical means, and
|
||
also in diagnosing recursive algorithms that do not scale well.
|
||
Most users never need to change this option from its default.
|
||
|
||
@cindex rescanning
|
||
This option does @emph{not} have the ability to break endless
|
||
rescanning loops, since these do not necessarily consume much memory
|
||
or stack space. Through clever usage of rescanning loops, one can
|
||
request complex, time-consuming computations from @code{m4} with useful
|
||
results. Imposing limits in this area would undermine the power of @code{m4}.
|
||
There are many pathological cases: @w{@samp{define(`a', `a')a}} is
|
||
only the simplest example (but @pxref{Compatibility}). Expecting @acronym{GNU}
|
||
@code{m4} to detect these would be a little like expecting a compiler
|
||
system to detect and diagnose endless loops: it is quite a @emph{hard}
|
||
problem in general, if not undecidable!
|
||
|
||
@item -B @var{num}
|
||
@itemx -S @var{num}
|
||
@itemx -T @var{num}
|
||
These options are present for compatibility with System V @code{m4}, but
|
||
do nothing in this implementation. They may disappear in future
|
||
releases, and issue a warning to that effect.
|
||
|
||
@item -N @var{num}
|
||
@itemx --diversions=@var{num}
|
||
These options are present only for compatibility with previous
|
||
versions of @acronym{GNU} @code{m4}, where they controlled the number of
|
||
possible diversions which could be used at the same time. They do nothing,
|
||
because there is no fixed limit anymore. They may disappear in future
|
||
releases, and issue a warning to that effect.
|
||
@end table
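
As an illustration only, the following invocation combines two of the
tunable limits described above. The file name @file{input.m4} and the
particular values are assumptions chosen for the example; 2039 is simply
a prime larger than the default of 509.

@comment ignore
@example
$ @kbd{m4 --hashsize=2039 --nesting-limit=200 input.m4}
@end example

@noindent
A larger hash table is only likely to help when @file{input.m4} defines
a very large number of macros, and a finite nesting limit mainly serves
to catch runaway nesting early.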
|
||
|
||
@node Frozen state
|
||
@section Command line options for frozen state
|
||
|
||
@acronym{GNU} @code{m4} supports freezing its internal state
|
||
(@pxref{Frozen files}). This can be used to speed up @code{m4}
|
||
execution when reusing a common initialization script.
|
||
|
||
@table @code
|
||
@item -F @var{file}
|
||
@itemx --freeze-state=@var{file}
|
||
Once execution is finished, write out the frozen state to the specified
|
||
@var{file}. It is conventional, but not required, for @var{file} to end
|
||
in @samp{.m4f}.
|
||
|
||
@item -R @var{file}
|
||
@itemx --reload-state=@var{file}
|
||
Before execution starts, recover the internal state from the specified
|
||
frozen @var{file}. The options @option{-D}, @option{-U}, and
|
||
@option{-t} take effect after state is reloaded, but before the input
|
||
files are read.
|
||
@end table
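
A typical workflow might look like the following sketch, in which the
file names @file{base.m4}, @file{base.m4f}, and @file{input.m4} are
hypothetical. The first command processes the common definitions once
and saves the resulting state; the second reloads that state instead of
reparsing @file{base.m4}.

@comment ignore
@example
$ @kbd{m4 -F base.m4f base.m4}
$ @kbd{m4 -R base.m4f input.m4}
@end example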
|
||
|
||
@node Debugging options
|
||
@section Command line options for debugging
|
||
|
||
Finally, there are several options for aiding in debugging @code{m4}
|
||
scripts.
|
||
|
||
@table @code
|
||
@item -d@r{[}@var{flags}@r{]}
|
||
@itemx --debug@r{[}=@var{flags}@r{]}
|
||
Set the debug-level according to the flags @var{flags}. The debug-level
|
||
controls the format and amount of information presented by the debugging
|
||
functions. @xref{Debug Levels}, for more details on the format and
|
||
meaning of @var{flags}. If omitted, @var{flags} defaults to @samp{aeq}.
|
||
|
||
@item --debugfile@r{[}=@var{file}@r{]}
|
||
@itemx -o @var{file}
|
||
@itemx --error-output=@var{file}
|
||
Redirect @code{dumpdef} output, debug messages, and trace output to the
|
||
named @var{file}. Warnings, error messages, and @code{errprint} output
|
||
are still printed to standard error. If these options are not used, or
|
||
if @var{file} is unspecified (only possible for @option{--debugfile}),
|
||
debug output goes to standard error; if @var{file} is the empty string,
|
||
debug output is discarded. @xref{Debug Output}, for more details. The
|
||
option @option{--debugfile} may be given more than once, and order is
|
||
significant with respect to file names. The spellings @option{-o} and
|
||
@option{--error-output} are misleading and inconsistent with other
|
||
@acronym{GNU} tools; for now they are silently accepted as synonyms of
|
||
@option{--debugfile} and only recognized once, but in a future version
|
||
of M4, using them will cause a warning to be issued.
|
||
|
||
@ignore
|
||
@comment not worth including in the manual, but provides a good test
|
||
|
||
@comment examples
|
||
@comment options: -Dbar=hello -tbar --debugfile= foo --debugfile -
|
||
@example
|
||
$ @kbd{m4 -d -Iexamples -Dbar=hello -tbar --debugfile= foo --debugfile -}
|
||
@result{}hello
|
||
errprint(`hi
|
||
')dnl
|
||
@error{}hi
|
||
bar
|
||
@error{}m4trace: -1- bar -> `hello'
|
||
@result{}hello
|
||
@end example
|
||
@end ignore
|
||
|
||
@item -l @var{num}
|
||
@itemx --arglength=@var{num}
|
||
Restrict the size of the output generated by macro tracing to @var{num}
|
||
characters per trace line. If unspecified or zero, output is
|
||
unlimited. @xref{Debug Levels}, for more details.
|
||
|
||
@item -t @var{name}
|
||
@itemx --trace=@var{name}
|
||
This enables tracing for the macro @var{name}, at any point where it is
|
||
defined. @var{name} need not be defined when this option is given.
|
||
This option may be given more than once, and order is significant with
|
||
respect to file names. @xref{Trace}, for more details.
|
||
@end table
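
For instance, the following session is a small sketch of tracing a
single macro; the macro name @samp{foo} is made up for the illustration.
Because @option{-t} is given before @samp{foo} is defined, the sketch
also shows tracing taking effect once the definition appears.

@comment ignore
@example
$ @kbd{m4 -d -t foo}
define(`foo', `bar')
@result{}
foo
@error{}m4trace: -1- foo -> `bar'
@result{}bar
@end example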
|
||
|
||
@node Command line files
|
||
@section Specifying input files on the command line
|
||
|
||
@cindex command line, file names on the
|
||
@cindex file names, on the command line
|
||
The remaining arguments on the command line are taken to be input file
|
||
names. If no names are present, standard input is read. A file
|
||
name of @file{-} is taken to mean standard input. It is
|
||
conventional, but not required, for input files to end in @samp{.m4}.
|
||
|
||
The input files are read in the sequence given. Standard input can be
|
||
read more than once, so the file name @file{-} may appear multiple times
|
||
on the command line; this makes a difference when input is from a
|
||
terminal or other special file type. It is an error if an input file
|
||
ends in the middle of argument collection, a comment, or a quoted
|
||
string.
|
||
|
||
The options @option{--define} (@option{-D}), @option{--undefine}
|
||
(@option{-U}), @option{--synclines} (@option{-s}), and @option{--trace}
|
||
(@option{-t}) only take effect after processing input from any file
|
||
names that occur earlier on the command line. For example, assume the
|
||
file @file{foo} contains:
|
||
|
||
@comment ignore
|
||
@example
|
||
$ @kbd{cat foo}
|
||
bar
|
||
@end example
|
||
|
||
The text @samp{bar} can then be redefined over multiple uses of
|
||
@file{foo}:
|
||
|
||
@comment options: -Dbar=hello foo -Dbar=world foo
|
||
@example
|
||
$ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
|
||
@result{}hello
|
||
@result{}world
|
||
@end example
|
||
|
||
If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
|
||
exit status of @code{m4} will be 0 for success, 1 for general failure
|
||
(such as problems with reading an input file), and 63 for version
|
||
mismatch (@pxref{Using frozen files}).
|
||
|
||
If you need to read a file whose name starts with a @file{-}, you can
|
||
specify it as @samp{./-file}, or use @option{--} to mark the end of
|
||
options.
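
Both spellings are sketched below for a hypothetical input file named
@file{-defs.m4}:

@comment ignore
@example
$ @kbd{m4 ./-defs.m4}
$ @kbd{m4 -- -defs.m4}
@end example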
|
||
|
||
@ignore
|
||
@comment Test that 'm4 file/' detects that file is not a directory; we
|
||
@comment can assume that the current directory contains a Makefile.
|
||
@comment mingw fails with EINVAL rather than ENOTDIR.
|
||
|
||
@comment status: 1
|
||
@comment xerr: ignore
|
||
@comment options: Makefile/
|
||
@example
|
||
@error{}m4: cannot open `Makefile/': Not a directory
|
||
@end example
|
||
@end ignore
|
||
|
||
@node Syntax
|
||
@chapter Lexical and syntactic conventions
|
||
|
||
@cindex input tokens
|
||
@cindex tokens
|
||
As @code{m4} reads its input, it separates it into @dfn{tokens}. A
|
||
token is either a name, a quoted string, or any single character that
|
||
is not a part of either a name or a string. Input to @code{m4} can also
|
||
contain comments. @acronym{GNU} @code{m4} does not yet understand
|
||
multibyte locales; all operations are byte-oriented rather than
|
||
character-oriented (although if your locale uses a single byte
|
||
encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
|
||
However, @code{m4} is eight-bit clean, so you can
|
||
use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
|
||
comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
|
||
exception of the @sc{nul} character (the zero byte @samp{'\0'}).
|
||
|
||
@menu
|
||
* Names:: Macro names
|
||
* Quoted strings:: Quoting input to @code{m4}
|
||
* Comments:: Comments in @code{m4} input
|
||
* Other tokens:: Other kinds of input tokens
|
||
* Input processing:: How @code{m4} copies input to output
|
||
@end menu
|
||
|
||
@node Names
|
||
@section Macro names
|
||
|
||
@cindex names
|
||
@cindex words
|
||
A name is any sequence of letters, digits, and the character @samp{_}
|
||
(underscore), where the first character is not a digit. @code{m4} will
|
||
use the longest such sequence found in the input. If a name has a
|
||
macro definition, it will be subject to macro expansion
|
||
(@pxref{Macros}). Names are case-sensitive.
|
||
|
||
Examples of valid names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
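
The following sketch, which assumes a throwaway macro named @samp{foo},
shows the longest-match and case-sensitivity rules at work. Only the
exact name @samp{foo} is expanded; @samp{Foo} and @samp{foo_1} are
different names with no definition, while in @samp{1foo} the digit
cannot start a name, so @samp{foo} is recognized right after it.

@comment ignore
@example
define(`foo', `bar')
@result{}
foo Foo foo_1 1foo
@result{}bar Foo foo_1 1bar
@end example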
|
||
|
||
@node Quoted strings
|
||
@section Quoting input to @code{m4}
|
||
|
||
@cindex quoted string
|
||
@cindex string, quoted
|
||
A quoted string is a sequence of characters surrounded by quote
|
||
strings, defaulting to
|
||
@samp{`} and @samp{'}, where the nested begin and end quotes within the
|
||
string are balanced. The value of a string token is the text, with one
|
||
level of quotes stripped off. Thus
|
||
|
||
@comment ignore
|
||
@example
|
||
`'
|
||
@result{}
|
||
@end example
|
||
|
||
@noindent
|
||
is the empty string, and double-quoting turns into single-quoting.
|
||
|
||
@comment ignore
|
||
@example
|
||
``quoted''
|
||
@result{}`quoted'
|
||
@end example
|
||
|
||
The quote characters can be changed at any time, using the builtin macro
|
||
@code{changequote}. @xref{Changequote}, for more information.
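
As a brief sketch of both points, a nested quoted string loses only its
outer quotes, and the same rule applies after the delimiters are
changed. The choice of @samp{[} and @samp{]} below is only an example.

@comment ignore
@example
`outer `inner' text'
@result{}outer `inner' text
changequote(`[', `]')
@result{}
[outer [inner] text]
@result{}outer [inner] text
@end example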
|
||
|
||
@node Comments
|
||
@section Comments in @code{m4} input
|
||
|
||
@cindex comments
|
||
Comments in @code{m4} are normally delimited by the characters @samp{#}
|
||
and newline. Text between the comment delimiters is not examined for macros or quotes,
|
||
but the entire comment (including the delimiters) is passed through to
|
||
the output---comments are @emph{not} discarded by @code{m4}.
|
||
|
||
Comments cannot be nested, so the first newline after a @samp{#} ends
|
||
the comment. The commenting effect of the begin-comment string
|
||
can be inhibited by quoting it.
|
||
|
||
@example
|
||
$ @kbd{m4}
|
||
`quoted text' # `commented text'
|
||
@result{}quoted text # `commented text'
|
||
`quoting inhibits' `#' `comments'
|
||
@result{}quoting inhibits # comments
|
||
@end example
|
||
|
||
The comment delimiters can be changed to any string at any time, using
|
||
the builtin macro @code{changecom}. @xref{Changecom}, for more
|
||
information.
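
The effect can be sketched as follows, where @samp{foo} is a macro
defined only for this illustration. Note that once the delimiters are
changed, @samp{#} no longer starts a comment.

@comment ignore
@example
define(`foo', `bar')
@result{}
foo # foo
@result{}bar # foo
changecom(`/*', `*/')
@result{}
foo /* foo */ foo # foo
@result{}bar /* foo */ bar # bar
@end example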
|
||
|
||
@ignore
|
||
@comment Detect regression in 1.4.10b in regards to reparsing comments.
|
||
@comment Not worth including in the manual.
|
||
@example
|
||
define(`e', `$@@')define(`q', ``$@@'')define(`foo', `bar')
|
||
@result{}
|
||
q(e(`one
|
||
',#two ' foo
|
||
))
|
||
@result{}`one
|
||
@result{}',`#two bar
|
||
@result{}''
|
||
changecom(`<', `>')define(`n', `$#')
|
||
@result{}
|
||
n(e(<`>, <'>))
|
||
@result{}1
|
||
len(e(<`>, ,<'>))
|
||
@result{}12
|
||
@end example
|
||
@end ignore
|
||
|
||
@node Other tokens
|
||
@section Other kinds of input tokens
|
||
|
||
@cindex tokens, special
|
||
Any character that is neither part of a name, nor of a quoted string,
|
||
nor a comment, is a token by itself. When not in the context of macro
|
||
expansion, all of these tokens are just copied to output. However,
|
||
during macro expansion, whitespace characters (space, tab, newline,
|
||
formfeed, carriage return, vertical tab), parentheses (@samp{(} and
|
||
@samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
|
||
roles, explained later.
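
A short sketch, assuming a macro @samp{foo} defined just for the
example: outside of an argument list, punctuation is copied through
unchanged, even when it sits next to a macro name.

@comment ignore
@example
define(`foo', `bar')
@result{}
foo+foo, (foo) $foo;
@result{}bar+bar, (bar) $bar;
@end example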
|
||
|
||
@node Input processing
|
||
@section How @code{m4} copies input to output
|
||
|
||
As @code{m4} reads the input token by token, it will copy each token
|
||
directly to the output.
|
||
|
||
The exception is when it finds a word with a macro definition. In that
|
||
case @code{m4} will calculate the macro's expansion, possibly reading
|
||
more input to get the arguments. It then inserts the expansion in front
|
||
of the remaining input. In other words, the resulting text from a macro
|
||
call will be read and parsed into tokens again.
|
||
|
||
@code{m4} expands a macro as soon as possible. If it finds a macro call
|
||
when collecting the arguments to another, it will expand the second call
|
||
first. This process continues until there are no more macro calls to
|
||
expand and all the input has been consumed.
|
||
|
||
For a running example, examine how @code{m4} handles this input:
|
||
|
||
@comment ignore
|
||
@example
|
||
format(`Result is %d', eval(`2**15'))
|
||
@end example
|
||
|
||
@noindent
|
||
First, @code{m4} sees that the token @samp{format} is a macro name, so
|
||
it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
|
||
and @samp{@w{ }}, before encountering another potential macro. Sure
|
||
enough, @samp{eval} is a macro name, so the nested argument collection
|
||
picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
|
||
with the lone argument of @samp{2**15}. The expansion of
|
||
@samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
|
||
tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
|
||
combined with the next @samp{)}, the format macro now has all its
|
||
arguments, as if the user had typed:
|
||
|
||
@comment ignore
|
||
@example
|
||
format(`Result is %d', 32768)
|
||
@end example
|
||
|
||
@noindent
|
||
The format macro expands to @samp{Result is 32768}, and we have another
|
||
round of scanning for the tokens @samp{Result}, @samp{@w{ }},
|
||
@samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
|
||
@samp{8}. None of these are macros, so the final output is
|
||
|
||
@comment ignore
|
||
@example
|
||
@result{}Result is 32768
|
||
@end example
|
||
|
||
As a more complicated example, we will contrast an actual code
|
||
example from the Gnulib project@footnote{Derived from a patch in
|
||
@uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
|
||
and a followup patch in
|
||
@uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
|
||
showing both a buggy approach and the desired results. The user desires
|
||
to output a shell assignment statement that takes its argument and turns
|
||
it into a shell variable by converting it to uppercase and prepending a
|
||
prefix. The original attempt looks like this:
|
||
|
||
@example
|
||
changequote([,])dnl
define([gl_STRING_MODULE_INDICATOR],
  [
    dnl comment
    GNULIB_]translit([$1],[a-z],[A-Z])[=1
  ])dnl
  gl_STRING_MODULE_INDICATOR([strcase])
@result{} @w{ }
@result{}        GNULIB_strcase=1
@result{} @w{ }
|
||
@end example
|
||
|
||
Oops -- the argument did not get capitalized. And although the manual
|
||
is not able to easily show it, both lines that appear empty actually
|
||
contain two trailing spaces. By stepping through the parse, it is easy
|
||
to see what happened. First, @code{m4} sees the token
|
||
@samp{changequote}, which it recognizes as a macro, followed by
|
||
@samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
|
||
argument list. The macro expands to the empty string, but changes the
|
||
quoting characters to something more useful for generating shell code
|
||
(unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
|
||
but unbalanced @samp{[]} tend to be rare). Also in the first line,
|
||
@code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
|
||
macro that consumes the rest of the line, resulting in no output for
|
||
that line.
|
||
|
||
The second line starts a macro definition. @code{m4} sees the token
|
||
@samp{define}, which it recognizes as a macro, followed by a @samp{(},
|
||
@samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}. Because an unquoted
|
||
comma was encountered, the first argument is known to be the expansion
|
||
of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
|
||
Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
|
||
whitespace is discarded as part of argument collection. Then comes a
|
||
rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
|
||
comment@key{NL}@ @ @ @ GNULIB_]}. This is followed by the token
|
||
@samp{translit}, which @code{m4} recognizes as a macro name, so a nested
|
||
macro expansion has started.
|
||
|
||
The arguments to @code{translit} are formed by the tokens @samp{(},
|
||
@samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
|
||
@samp{)}. All three string arguments are expanded (or in other words,
|
||
the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
|
||
capitalization, the result of the macro is @samp{$1}. This expansion is
|
||
rescanned, resulting in the two literal characters @samp{$} and
|
||
@samp{1}.
|
||
|
||
Scanning of the outer macro resumes, and picks up with
|
||
@samp{[=1@key{NL}@ @ ]}, and finally @samp{)}. The collected pieces of
|
||
expanded text are concatenated, with the end result that the macro
|
||
@samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
|
||
@samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
|
||
Once again, @samp{dnl} is recognized and avoids a newline in the output.
|
||
|
||
The final line is then parsed, beginning with @samp{ } and @samp{ }
|
||
that are output literally. Then @samp{gl_STRING_MODULE_INDICATOR} is
|
||
recognized as a macro name, with an argument list of @samp{(},
|
||
@samp{[strcase]}, and @samp{)}. Since the definition of the macro
|
||
contains the sequence @samp{$1}, that sequence is replaced with the
|
||
argument @samp{strcase} prior to starting the rescan. The rescan sees
|
||
@samp{@key{NL}} and four spaces, which are output literally, then
|
||
@samp{dnl}, which discards the text @samp{ comment@key{NL}}. Next
|
||
comes four more spaces, also output literally, and the token
|
||
@samp{GNULIB_strcase}, which resulted from the earlier parameter
|
||
substitution. Since that is not a macro name, it is output literally,
|
||
followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
|
||
two more spaces. Finally, the original @samp{@key{NL}} seen after the
|
||
macro invocation is scanned and output literally.
|
||
|
||
Now for a corrected approach. This rearranges the use of newlines and
|
||
whitespace so that less whitespace is output (which, although harmless
|
||
to shell scripts, can be visually unappealing), and fixes the quoting
|
||
issues so that the capitalization occurs when the macro
|
||
@samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is
|
||
defined.
|
||
|
||
@example
|
||
changequote([,])dnl
define([gl_STRING_MODULE_INDICATOR],
  [dnl comment
  GNULIB_[]translit([$1], [a-z], [A-Z])=1dnl
])dnl
  gl_STRING_MODULE_INDICATOR([strcase])
@result{}    GNULIB_STRCASE=1
|
||
@end example
|
||
|
||
The parsing of the first line is unchanged. The second line sees the
|
||
name of the macro to define, then sees the discarded @samp{@key{NL}}
|
||
and two spaces, as before. But this time, the next token is
|
||
@samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([$1], [a-z],
|
||
[A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
|
||
@samp{)} to end the macro definition and @samp{dnl} to skip the
|
||
newline. No early expansion of @code{translit} occurs, so the entire
|
||
string becomes the definition of the macro.
|
||
|
||
The final line is then parsed, beginning with two spaces that are
|
||
output literally, and an invocation of
|
||
@code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
|
||
Again, the @samp{$1} in the macro definition is substituted prior to
|
||
rescanning. Rescanning first encounters @samp{dnl}, and discards
|
||
@samp{ comment@key{NL}}. Then two spaces are output literally. Next
|
||
comes the token @samp{GNULIB_}, but that is not a macro, so it is
|
||
output literally. The token @samp{[]} is an empty string, so it does
|
||
not affect output. Then the token @samp{translit} is encountered.
|
||
|
||
This time, the arguments to @code{translit} are parsed as @samp{(},
|
||
@samp{[strcase]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
|
||
@samp{[A-Z]}, and @samp{)}. The two spaces are discarded, and the
|
||
@code{translit} call produces the desired result @samp{STRCASE}. This is
|
||
rescanned, but since it is not a macro name, it is output literally.
|
||
Then the scanner sees @samp{=} and @samp{1}, which are output
|
||
literally, followed by @samp{dnl} which discards the rest of the
|
||
definition of @code{gl_STRING_MODULE_INDICATOR}. The newline at the
|
||
end of output is the literal @samp{@key{NL}} that appeared after the
|
||
invocation of the macro.
|
||
|
||
The order in which @code{m4} expands the macros can be further explored
|
||
using the trace facilities of @acronym{GNU} @code{m4} (@pxref{Trace}).
|
||
|
||
@node Macros
|
||
@chapter How to invoke macros
|
||
|
||
This chapter covers macro invocation, macro arguments and how macro
|
||
expansion is treated.
|
||
|
||
@menu
|
||
* Invocation:: Macro invocation
|
||
* Inhibiting Invocation:: Preventing macro invocation
|
||
* Macro Arguments:: Macro arguments
|
||
* Quoting Arguments:: On Quoting Arguments to macros
|
||
* Macro expansion:: Expanding macros
|
||
@end menu
|
||
|
||
@node Invocation
|
||
@section Macro invocation
|
||
|
||
@cindex macro invocation
|
||
@cindex invoking macros
|
||
A macro invocation has one of the forms
|
||
|
||
@comment ignore
|
||
@example
|
||
name
|
||
@end example
|
||
|
||
@noindent
|
||
which is a macro invocation without any arguments, or
|
||
|
||
@comment ignore
|
||
@example
|
||
name(arg1, arg2, @dots{}, arg@var{n})
|
||
@end example
|
||
|
||
@noindent
|
||
which is a macro invocation with @var{n} arguments. Macros can have any
|
||
number of arguments. All arguments are strings, but different macros
|
||
might interpret the arguments in different ways.
|
||
|
||
The opening parenthesis @emph{must} follow the @var{name} directly, with
|
||
no spaces in between. If it does not, the macro is called with no
|
||
arguments at all.
|
||
|
||
For a macro call to have no arguments, the parentheses @emph{must} be
|
||
left out. The macro call
|
||
|
||
@comment ignore
|
||
@example
|
||
name()
|
||
@end example
|
||
|
||
@noindent
|
||
is a macro call with one argument, which is the empty string, not a call
|
||
with no arguments.
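
These rules can be checked with a small helper macro. The name
@samp{nargs} is an assumption made for this sketch; it simply expands to
@samp{$#}, the number of arguments it was given. In the last line, the
space before the parenthesis means that @samp{nargs} is called with no
arguments and the parenthesized text is copied as ordinary input.

@comment ignore
@example
define(`nargs', `$#')
@result{}
nargs
@result{}0
nargs()
@result{}1
nargs(`a', `b')
@result{}2
nargs (`a', `b')
@result{}0 (a, b)
@end example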
|
||
|
||
@node Inhibiting Invocation
|
||
@section Preventing macro invocation
|
||
|
||
An innovation of the @code{m4} language, compared to some of its
|
||
predecessors (like Strachey's @code{GPM}, for example), is the ability
|
||
to recognize macro calls without resorting to any special, prefixed
|
||
invocation character. While generally useful, this feature might
|
||
sometimes be the source of spurious, unwanted macro calls. So, @acronym{GNU}
|
||
@code{m4} offers several mechanisms or techniques for inhibiting the
|
||
recognition of names as macro calls.
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
@cindex blind macro
|
||
@cindex macro, blind
|
||
First of all, many builtin macros cannot meaningfully be called without
|
||
arguments. As a @acronym{GNU} extension, for any of these macros,
|
||
whenever an opening parenthesis does not immediately follow their name,
|
||
the builtin macro call is not triggered. This solves the most usual
|
||
cases, such as @samp{include} or @samp{eval}. Later in this document,
|
||
the sentence ``This macro is recognized only with parameters'' refers to
|
||
this specific provision of @acronym{GNU} M4, also known as a blind
|
||
builtin macro. For the builtins defined by @acronym{POSIX} that bear
|
||
this disclaimer, @acronym{POSIX} specifically states that invoking those
|
||
builtins without arguments is unspecified, because many other
|
||
implementations simply invoke the builtin as though it were given one
|
||
empty argument instead.
|
||
|
||
@example
|
||
$ @kbd{m4}
|
||
eval
|
||
@result{}eval
|
||
eval(`1')
|
||
@result{}1
|
||
@end example
|
||
|
||
There is also a command line option (@option{--prefix-builtins}, or
|
||
@option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
|
||
builtin macros with a prefix of @samp{m4_} at startup. The option has
|
||
no effect whatsoever on user defined macros. For example, with this option,
|
||
one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has
|
||
no effect on whether a macro requires parameters.
|
||
|
||
@comment options: -P
|
||
@example
|
||
$ @kbd{m4 -P}
|
||
eval
|
||
@result{}eval
|
||
eval(`1')
|
||
@result{}eval(1)
|
||
m4_eval
|
||
@result{}m4_eval
|
||
m4_eval(`1')
|
||
@result{}1
|
||
@end example
|
||
|
||
Another alternative is to redefine problematic macros to a name less
|
||
likely to cause conflicts (@pxref{Definitions}).
|
||
|
||
If your version of @acronym{GNU} @code{m4} has the @code{changeword} feature
|
||
compiled in, it offers far more flexibility in specifying the
|
||
syntax of macro names, both builtin and user-defined. @xref{Changeword},
|
||
for more information on this experimental feature.
|
||
|
||
Of course, the simplest way to prevent a name from being interpreted
|
||
as a call to an existing macro is to quote it. The remainder of
|
||
this section studies a little more deeply how quoting affects macro
|
||
invocation, and how quoting can be used to inhibit macro invocation.
|
||
|
||
Although quoting is usually done over the whole macro name, it can also
|
||
be done over only a few characters of this name (provided, of course,
|
||
that the unquoted portions are not also a macro). It is also possible
|
||
to quote the empty string, but this works only @emph{inside} the name.
|
||
For example:
|
||
|
||
@example
|
||
`divert'
|
||
@result{}divert
|
||
`d'ivert
|
||
@result{}divert
|
||
di`ver't
|
||
@result{}divert
|
||
div`'ert
|
||
@result{}divert
|
||
@end example
|
||
|
||
@noindent
|
||
all yield the string @samp{divert}. While in both:
|
||
|
||
@example
|
||
`'divert
|
||
@result{}
|
||
divert`'
|
||
@result{}
|
||
@end example
|
||
|
||
@noindent
|
||
the @code{divert} builtin macro will be called, which expands to the
|
||
empty string.
|
||
|
||
@cindex rescanning
|
||
The output of macro evaluations is always rescanned. In the following
|
||
example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
|
||
if @code{m4}
|
||
had been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
|
||
|
||
@example
|
||
define(`cde', `CDE')
|
||
@result{}
|
||
define(`x', `substr(ab')
|
||
@result{}
|
||
define(`y', `cde, `1', `3')')
|
||
@result{}
|
||
x`'y
|
||
@result{}bCD
|
||
@end example
|
||
|
||
@ignore
|
||
@comment Similar, but with argument references, to ensure good test
|
||
@comment coverage.
|
||
@example
|
||
define(`x1', `len(`$1'')
|
||
@result{}
|
||
define(`y1', ``$1')')
|
||
@result{}
|
||
x1(`01234567890123456789')y1(`98765432109876543210')
|
||
@result{}40
|
||
@end example
|
||
@end ignore
|
||
|
||
Unquoted strings on either side of a quoted string are subject to
|
||
being recognized as macro names. In the following example, quoting the
|
||
empty string allows for the second @code{macro} to be recognized as such:
|
||
|
||
@example
|
||
define(`macro', `m')
|
||
@result{}
|
||
macro(`m')macro
|
||
@result{}mmacro
|
||
macro(`m')`'macro
|
||
@result{}mm
|
||
@end example
|
||
|
||
Quoting may prevent recognizing as a macro name the concatenation of a
|
||
macro expansion with the surrounding characters. In this example:
|
||
|
||
@example
|
||
define(`macro', `di$1')
|
||
@result{}
|
||
macro(`v')`ert'
|
||
@result{}divert
|
||
macro(`v')ert
|
||
@result{}
|
||
@end example
|
||
|
||
@noindent
|
||
the first input produces the string @samp{divert}. When the quotes are
removed, the @code{divert} builtin is called instead.
|
||
|
||
@node Macro Arguments
|
||
@section Macro arguments
|
||
|
||
@cindex macros, arguments to
|
||
@cindex arguments to macros
|
||
When a name is seen, and it has a macro definition, it will be expanded
|
||
as a macro.
|
||
|
||
If the name is followed by an opening parenthesis, the arguments will be
|
||
collected before the macro is called. If too few arguments are
|
||
supplied, the missing arguments are taken to be the empty string.
|
||
However, some builtins are documented to behave differently for a
|
||
missing optional argument than for an explicit empty string. If there
|
||
are too many arguments, the excess arguments are ignored. Unquoted
|
||
leading whitespace is stripped off all arguments, but whitespace
|
||
generated by a macro expansion or occurring after a macro that expanded
|
||
to an empty string remains intact. Whitespace includes space, tab,
|
||
newline, carriage return, vertical tab, and formfeed.
|
||
|
||
@example
|
||
define(`macro', `$1')
|
||
@result{}
|
||
macro( unquoted leading space lost)
|
||
@result{}unquoted leading space lost
|
||
macro(` quoted leading space kept')
|
||
@result{} quoted leading space kept
|
||
macro(
|
||
divert `unquoted space kept after expansion')
|
||
@result{} unquoted space kept after expansion
|
||
macro(macro(`
|
||
')`whitespace from expansion kept')
|
||
@result{}
|
||
@result{}whitespace from expansion kept
|
||
macro(`unquoted trailing whitespace kept'
|
||
)
|
||
@result{}unquoted trailing whitespace kept
|
||
@result{}
|
||
@end example
|
||
|
||
@cindex warnings, suppressing
|
||
@cindex suppressing warnings
|
||
Normally @code{m4} will issue warnings if a builtin macro is called
|
||
with an inappropriate number of arguments, but these can be suppressed with
|
||
the @option{--quiet} command line option (or @option{--silent}, or
|
||
@option{-Q}, @pxref{Operation modes, , Invoking m4}). For user-defined
macros, there is no check of the number of arguments given.
|
||
|
||
@example
|
||
$ @kbd{m4}
|
||
index(`abc')
|
||
@error{}m4:stdin:1: Warning: too few arguments to builtin `index'
|
||
@result{}0
|
||
index(`abc',)
|
||
@result{}0
|
||
index(`abc', `b', `ignored')
|
||
@error{}m4:stdin:3: Warning: excess arguments to builtin `index' ignored
|
||
@result{}1
|
||
@end example
|
||
|
||
@comment options: -Q
|
||
@example
|
||
$ @kbd{m4 -Q}
|
||
index(`abc')
|
||
@result{}0
|
||
index(`abc',)
|
||
@result{}0
|
||
index(`abc', `b', `ignored')
|
||
@result{}1
|
||
@end example
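
User-defined macros, by contrast, are never checked. As an
illustration, the sketch below uses a hypothetical macro @code{pair}
(it is not part of @code{m4}). Calling it with too few or too many
arguments should produce no diagnostic: the missing argument expands to
the empty string, and the excess argument is dropped.

@example
define(`pair', `[$1|$2]')
@result{}
pair(`a')
@result{}[a|]
pair(`a', `b', `c')
@result{}[a|b]
@end example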
|
||
|
||
Macros are expanded normally during argument collection, and whatever
|
||
commas, quotes and parentheses that might show up in the resulting
|
||
expanded text will serve to define the arguments as well. Thus, if
|
||
@var{foo} expands to @samp{, b, c}, the macro call
|
||
|
||
@comment ignore
|
||
@example
|
||
bar(a foo, d)
|
||
@end example
|
||
|
||
@noindent
|
||
is a macro call with four arguments, which are @samp{a }, @samp{b},
|
||
@samp{c} and @samp{d}. To understand why the first argument contains
|
||
whitespace, remember that unquoted leading whitespace is never part
|
||
of an argument, but trailing whitespace always is.
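
The argument split can be observed directly. This sketch supplies
assumed definitions for @var{foo} and @var{bar}, which the text above
leaves unspecified; here @code{bar} simply reports the argument count
and brackets each of its first four arguments.

@example
define(`foo', `, b, c')
@result{}
define(`bar', `$#: [$1] [$2] [$3] [$4]')
@result{}
bar(a foo, d)
@result{}4: [a ] [b] [c] [d]
@end example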
|
||
|
||
It is possible for a macro's definition to change during argument
|
||
collection, in which case the expansion uses the definition that was in
|
||
effect at the time the opening @samp{(} was seen.
|
||
|
||
@example
|
||
define(`f', `1')
|
||
@result{}
|
||
f(define(`f', `2'))
|
||
@result{}1
|
||
f
|
||
@result{}2
|
||
@end example
|
||
|
||
It is an error if the end of file occurs while collecting arguments.
|
||
|
||
@comment status: 1
|
||
@example
|
||
hello world
|
||
@result{}hello world
|
||
define(
|
||
^D
|
||
@error{}m4:stdin:2: ERROR: end of file in argument list
|
||
@end example
|
||
|
||
@node Quoting Arguments
|
||
@section On Quoting Arguments to macros
|
||
|
||
@cindex quoted macro arguments
|
||
@cindex macros, quoted arguments to
|
||
@cindex arguments, quoted macro
|
||
Each argument has unquoted leading whitespace removed. Within each
|
||
argument, all unquoted parentheses must match. For example, if
|
||
@var{foo} is a macro,
|
||
|
||
@comment ignore
|
||
@example
|
||
foo(() (`(') `(')
|
||
@end example
|
||
|
||
@noindent
|
||
is a macro call, with one argument, whose value is @samp{() (() (}.
|
||
Commas separate arguments, except when they occur inside quotes,
|
||
comments, or unquoted parentheses. @xref{Pseudo Arguments}, for
|
||
examples.
|
||
|
||
It is common practice to quote all arguments to macros, unless you are
|
||
sure you want the arguments expanded. Thus, in the above
|
||
example with the parentheses, the `right' way to do it is like this:
|
||
|
||
@comment ignore
|
||
@example
|
||
foo(`() (() (')
|
||
@end example
|
||
|
||
@cindex quoting rule of thumb
|
||
@cindex rule of thumb, quoting
|
||
It is, however, in certain cases necessary (because nested expansion
|
||
must occur to create the arguments for the outer macro) or convenient
|
||
(because it uses fewer characters) to leave out quotes for some
|
||
arguments, and there is nothing wrong in doing it. It just makes life a
|
||
bit harder, if you are not careful to follow a consistent quoting style.
|
||
For consistency, this manual follows the rule of thumb that each layer
|
||
of parentheses introduces another layer of single quoting, except when
|
||
showing the consequences of quoting rules. This is done even when the
|
||
quoted string cannot be a macro, such as with integers when you have not
|
||
changed the syntax via @code{changeword} (@pxref{Changeword}).
|
||
|
||
The quoting rule of thumb of one level of quoting per layer of parentheses has a
|
||
nice property: when a macro name appears inside parentheses, you can
|
||
determine when it will be expanded. If it is not quoted, it will be
|
||
expanded prior to the outer macro, so that its expansion becomes the
|
||
argument. If it is single-quoted, it will be expanded after the outer
|
||
macro. And if it is double-quoted, it will be used as literal text
|
||
instead of a macro name.
|
||
|
||
@example
|
||
define(`active', `ACT, IVE')
|
||
@result{}
|
||
define(`show', `$1 $1')
|
||
@result{}
|
||
show(active)
|
||
@result{}ACT ACT
|
||
show(`active')
|
||
@result{}ACT, IVE ACT, IVE
|
||
show(``active'')
|
||
@result{}active active
|
||
@end example
|
||
|
||
@node Macro expansion
|
||
@section Macro expansion
|
||
|
||
@cindex macros, expansion of
|
||
@cindex expansion of macros
|
||
When the arguments, if any, to a macro call have been collected, the
|
||
macro is expanded, and the expansion text is pushed back onto the input
|
||
(unquoted), and reread. The expansion text from one macro call might
|
||
therefore result in more macros being called, if the calls are included,
|
||
completely or partially, in the first macro call's expansion.
|
||
|
||
Taking a very simple example, if @var{foo} expands to @samp{bar}, and
|
||
@var{bar} expands to @samp{Hello}, the input
|
||
|
||
@comment options: -Dbar=Hello -Dfoo=bar
|
||
@example
|
||
$ @kbd{m4 -Dbar=Hello -Dfoo=bar}
|
||
foo
|
||
@result{}Hello
|
||
@end example
|
||
|
||
@noindent
|
||
will expand first to @samp{bar}, and when this is reread and
|
||
expanded, into @samp{Hello}.
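
The same rescanning is what allows a call to be included only
partially in an expansion. In this sketch, which uses the hypothetical
macros @code{hello} and @code{greet}, the expansion of @code{greet}
supplies the macro name and the opening parenthesis, while the argument
and the closing parenthesis are taken from the input that follows.

@example
define(`hello', `Hello, $1!')
@result{}
define(`greet', `hello(')
@result{}
greet`world')
@result{}Hello, world!
@end example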
|
||
|
||
@ignore
|
||
@comment not worth documenting, but test that the command line can
|
||
@comment define macros that take parameters
|
||
|
||
@comment options: -Dfoo -Decho=$@
|
||
@example
|
||
$ @kbd{m4 -Dfoo -Decho='$@'}
|
||
foo
|
||
@result{}
|
||
foo(`silently ignored')
|
||
@result{}
|
||
echo(`1', `2')
|
||
@result{}1,2
|
||
@end example
|
||
@end ignore
|
||
|
||
@node Definitions
|
||
@chapter How to define new macros
|
||
|
||
@cindex macros, how to define new
|
||
@cindex defining new macros
|
||
Macros can be defined, redefined and deleted in several different ways.
|
||
Also, it is possible to redefine a macro without losing a previous
|
||
value, and bring back the original value at a later time.
|
||
|
||
@menu
|
||
* Define:: Defining a new macro
|
||
* Arguments:: Arguments to macros
|
||
* Pseudo Arguments:: Special arguments to macros
|
||
* Undefine:: Deleting a macro
|
||
* Defn:: Renaming macros
|
||
* Pushdef:: Temporarily redefining macros
|
||
|
||
* Indir:: Indirect call of macros
|
||
* Builtin:: Indirect call of builtins
|
||
@end menu
|
||
|
||
@node Define
|
||
@section Defining a macro
|
||
|
||
The normal way to define or redefine macros is to use the builtin
|
||
@code{define}:
|
||
|
||
@deffn Builtin define (@var{name}, @ovar{expansion})
|
||
Defines @var{name} to expand to @var{expansion}. If
|
||
@var{expansion} is not given, it is taken to be empty.
|
||
|
||
The expansion of @code{define} is void.
|
||
The macro @code{define} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
The following example defines the macro @var{foo} to expand to the text
|
||
@samp{Hello world.}.
|
||
|
||
@example
|
||
define(`foo', `Hello world.')
|
||
@result{}
|
||
foo
|
||
@result{}Hello world.
|
||
@end example
|
||
|
||
The empty line in the output is there because the newline is not
|
||
a part of the macro definition, and it is consequently copied to
|
||
the output. This can be avoided by use of the macro @code{dnl}.
|
||
@xref{Dnl}, for details.
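
For instance, appending @code{dnl} to the definition line discards the
rest of that line, newline included, so only the expansion of
@code{foo} should reach the output:

@example
define(`foo', `Hello world.')dnl
foo
@result{}Hello world.
@end example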
|
||
|
||
The first argument to @code{define} should be quoted; otherwise, if the
|
||
macro is already defined, you will be defining a different macro. This
|
||
example shows the problems with underquoting, since we did not want to
|
||
redefine @code{one}:
|
||
|
||
@example
|
||
define(foo, one)
|
||
@result{}
|
||
define(foo, two)
|
||
@result{}
|
||
one
|
||
@result{}two
|
||
@end example
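
For comparison, here is a sketch of the same sequence with the
arguments quoted. Each call now redefines @code{foo} itself, and
@code{one} is left alone:

@example
define(`foo', `one')
@result{}
define(`foo', `two')
@result{}
foo
@result{}two
one
@result{}one
@end example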
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
@acronym{GNU} @code{m4} normally replaces only the @emph{topmost}
|
||
definition of a macro if it has several definitions from @code{pushdef}
|
||
(@pxref{Pushdef}). Some other implementations of @code{m4} replace all
|
||
definitions of a macro with @code{define}. @xref{Incompatibilities},
|
||
for more details.
|
||
|
||
As a @acronym{GNU} extension, the first argument to @code{define} does
|
||
not have to be a simple word.
|
||
It can be any text string, even the empty string. A macro with a
|
||
non-standard name cannot be invoked in the normal way, as the name is
|
||
not recognized. It can only be referenced by the builtins @code{indir}
|
||
(@pxref{Indir}) and @code{defn} (@pxref{Defn}).
|
||
|
||
@cindex arrays
|
||
Arrays and associative arrays can be simulated by using non-standard
|
||
macro names.
|
||
|
||
@deffn Composite array (@var{index})
|
||
@deffnx Composite array_set (@var{index}, @ovar{value})
|
||
Provide access to entries within an array. @code{array} reads the entry
|
||
at location @var{index}, and @code{array_set} assigns @var{value} to
|
||
location @var{index}.
|
||
@end deffn
|
||
|
||
@example
|
||
define(`array', `defn(format(``array[%d]'', `$1'))')
|
||
@result{}
|
||
define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
|
||
@result{}
|
||
array_set(`4', `array element no. 4')
|
||
@result{}
|
||
array_set(`17', `array element no. 17')
|
||
@result{}
|
||
array(`4')
|
||
@result{}array element no. 4
|
||
array(eval(`10 + 7'))
|
||
@result{}array element no. 17
|
||
@end example
|
||
|
||
Change the @samp{%d} to @samp{%s} and it is an associative array.
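
A sketch of that variant follows. The names @code{assoc} and
@code{assoc_set} are chosen here for illustration only, and the keys
are assumed not to contain quote characters.

@example
define(`assoc', `defn(format(``assoc[%s]'', `$1'))')
@result{}
define(`assoc_set', `define(format(``assoc[%s]'', `$1'), `$2')')
@result{}
assoc_set(`apple', `red')
@result{}
assoc_set(`banana', `yellow')
@result{}
assoc(`banana')
@result{}yellow
assoc(`apple')
@result{}red
@end example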
|
||
|
||
@node Arguments
|
||
@section Arguments to macros
|
||
|
||
@cindex macros, arguments to
|
||
@cindex arguments to macros
|
||
Macros can have arguments. The @var{n}th argument is denoted by
|
||
@code{$n} in the expansion text, and is replaced by the @var{n}th actual
|
||
argument, when the macro is expanded. Replacement of arguments happens
|
||
before rescanning, regardless of how many nesting levels of quoting
|
||
appear in the expansion. Here is an example of a macro with
|
||
two arguments.
|
||
|
||
@deffn Composite exch (@var{arg1}, @var{arg2})
|
||
Expands to @var{arg2} followed by @var{arg1}, effectively exchanging
|
||
their order.
|
||
@end deffn
|
||
|
||
@example
|
||
define(`exch', `$2, $1')
|
||
@result{}
|
||
exch(`arg1', `arg2')
|
||
@result{}arg2, arg1
|
||
@end example
|
||
|
||
This can be used, for example, if you like the arguments to
|
||
@code{define} to be reversed.
|
||
|
||
@example
|
||
define(`exch', `$2, $1')
|
||
@result{}
|
||
define(exch(``expansion text'', ``macro''))
|
||
@result{}
|
||
macro
|
||
@result{}expansion text
|
||
@end example
|
||
|
||
@xref{Quoting Arguments}, for an explanation of the double quotes.
|
||
(You should try to improve this example so that clients of @code{exch}
|
||
do not have to double quote; or @pxref{Improved exch, , Answers}).
|
||
|
||
As a special case, the zeroth argument, @code{$0}, is always the name
|
||
of the macro being expanded.
|
||
|
||
@example
|
||
define(`test', ``Macro name: $0'')
|
||
@result{}
|
||
test
|
||
@result{}Macro name: test
|
||
@end example
|
||
|
||
If you want quoted text to appear as part of the expansion text,
|
||
remember that quotes can be nested in quoted strings. Thus, in
|
||
|
||
@example
|
||
define(`foo', `This is macro `foo'.')
|
||
@result{}
|
||
foo
|
||
@result{}This is macro foo.
|
||
@end example
|
||
|
||
@noindent
|
||
The @samp{foo} in the expansion text is @emph{not} expanded, since it is
|
||
a quoted string, and not a name.
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
@cindex nine arguments, more than
|
||
@cindex more than nine arguments
|
||
@cindex arguments, more than nine
|
||
@cindex positional parameters, more than nine
|
||
@acronym{GNU} @code{m4} allows the number following the @samp{$} to
|
||
consist of one or more digits, allowing macros to have any number of
|
||
arguments. The extension of accepting multiple digits is incompatible
|
||
with @acronym{POSIX}, and differs from traditional implementations
|
||
of @code{m4}, which only recognize one digit. Therefore, future
|
||
versions of @acronym{GNU} M4 will phase out this feature. To portably
|
||
access beyond the ninth argument, you can use the @code{argn} macro
|
||
documented later (@pxref{Shift}).
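
The underlying idea can be sketched with @code{shift} alone: dropping
the first argument turns the tenth argument into the ninth, which a
single digit can reach. The helper names @code{ninth} and @code{tenth}
are hypothetical and are not the @code{argn} macro mentioned above.

@example
define(`ninth', `$9')
@result{}
define(`tenth', `ninth(shift($@@))')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example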
|
||
|
||
@acronym{POSIX} also states that @samp{$} followed immediately by
|
||
@samp{@{} in a macro definition is implementation-defined. This version
|
||
of M4 passes the literal characters @samp{$@{} through unchanged, but M4
|
||
2.0 will implement an optional feature similar to @command{sh}, where
|
||
@samp{$@{11@}} expands to the eleventh argument, to replace the current
|
||
recognition of @samp{$11}. Meanwhile, if you want to guarantee that you
|
||
will get a literal @samp{$@{} in output when expanding a macro, even
|
||
when you upgrade to M4 2.0, you can use nested quoting to your
|
||
advantage:
|
||
|
||
@example
|
||
define(`foo', `single quoted $`'@{1@} output')
|
||
@result{}
|
||
define(`bar', ``double quoted $'`@{2@} output'')
|
||
@result{}
|
||
foo(`a', `b')
|
||
@result{}single quoted $@{1@} output
|
||
bar(`a', `b')
|
||
@result{}double quoted $@{2@} output
|
||
@end example
|
||
|
||
To help you detect places in your M4 input files that might behave
differently under M4 2.0, you can use the
|
||
@option{--warn-macro-sequence} command-line option (@pxref{Operation
|
||
modes, , Invoking m4}) with the default regular expression. This will
|
||
add a warning any time a macro definition includes @samp{$} followed by
|
||
multiple digits, or by @samp{@{}. The warning is not enabled by
|
||
default, because it triggers a number of warnings in Autoconf 2.61 (and
|
||
Autoconf uses @option{-E} to treat warnings as errors), and because it
|
||
will still be possible to restore older behavior in M4 2.0.
|
||
|
||
@comment options: --warn-macro-sequence
|
||
@example
|
||
$ @kbd{m4 --warn-macro-sequence}
|
||
define(`foo', `$001 $@{1@} $1')
|
||
@error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$001'
|
||
@error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$@{1@}'
|
||
@result{}
|
||
foo(`bar')
|
||
@result{}bar $@{1@} bar
|
||
@end example
|
||
|
||
@node Pseudo Arguments
|
||
@section Special arguments to macros
|
||
|
||
@cindex special arguments to macros
|
||
@cindex macros, special arguments to
|
||
@cindex arguments to macros, special
|
||
There is a special notation for the number of actual arguments supplied,
|
||
and for all the actual arguments.
|
||
|
||
The number of actual arguments in a macro call is denoted by @code{$#}
|
||
in the expansion text.
|
||
|
||
@deffn Composite nargs (@dots{})
|
||
Expands to a count of the number of arguments supplied.
|
||
@end deffn
|
||
|
||
@example
|
||
define(`nargs', `$#')
|
||
@result{}
|
||
nargs
|
||
@result{}0
|
||
nargs()
|
||
@result{}1
|
||
nargs(`arg1', `arg2', `arg3')
|
||
@result{}3
|
||
nargs(`commas can be quoted, like this')
|
||
@result{}1
|
||
nargs(arg1#inside comments, commas do not separate arguments
|
||
still arg1)
|
||
@result{}1
|
||
nargs((unquoted parentheses, like this, group arguments))
|
||
@result{}1
|
||
@end example
|
||
|
||
Remember that @samp{#} defaults to the comment character; if you forget
|
||
quotes to inhibit the comment behavior, your macro definition may not
|
||
end where you expected.
|
||
|
||
@example
|
||
dnl Attempt to define a macro to just `$#'
|
||
define(underquoted, $#)
|
||
oops)
|
||
@result{}
|
||
underquoted
|
||
@result{}0)
|
||
@result{}oops
|
||
@end example
|
||
|
||
The notation @code{$*} can be used in the expansion text to denote all
|
||
the actual arguments, unquoted, with commas in between. For example:
|
||
|
||
@example
|
||
define(`echo', `$*')
|
||
@result{}
|
||
echo(arg1, arg2, arg3 , arg4)
|
||
@result{}arg1,arg2,arg3 ,arg4
|
||
@end example
|
||
|
||
Often each argument should be quoted, and the notation @code{$@@} handles
|
||
that. It is just like @code{$*}, except that it quotes each argument.
|
||
A simple example of that is:
|
||
|
||
@example
|
||
define(`echo', `$@@')
|
||
@result{}
|
||
echo(arg1, arg2, arg3 , arg4)
|
||
@result{}arg1,arg2,arg3 ,arg4
|
||
@end example
|
||
|
||
Where did the quotes go? Of course, they were eaten, when the expanded
|
||
text was reread by @code{m4}. To show the difference, try
|
||
|
||
@example
|
||
define(`echo1', `$*')
|
||
@result{}
|
||
define(`echo2', `$@@')
|
||
@result{}
|
||
define(`foo', `This is macro `foo'.')
|
||
@result{}
|
||
echo1(foo)
|
||
@result{}This is macro This is macro foo..
|
||
echo1(`foo')
|
||
@result{}This is macro foo.
|
||
echo2(foo)
|
||
@result{}This is macro foo.
|
||
echo2(`foo')
|
||
@result{}foo
|
||
@end example
|
||
|
||
@noindent
|
||
@xref{Trace}, if you do not understand this. As another example of the
|
||
difference, remember that comments encountered in arguments are passed
|
||
untouched to the macro, and that quoting disables comments.
|
||
|
||
@example
|
||
define(`echo1', `$*')
|
||
@result{}
|
||
define(`echo2', `$@@')
|
||
@result{}
|
||
define(`foo', `bar')
|
||
@result{}
|
||
echo1(#foo'foo
|
||
foo)
|
||
@result{}#foo'foo
|
||
@result{}bar
|
||
echo2(#foo'foo
|
||
foo)
|
||
@result{}#foobar
|
||
@result{}bar'
|
||
@end example
|
||
|
||
@ignore
|
||
@comment Not worth putting in the manual, but this example is needed for
|
||
@comment good test coverage of copying large strings across recursion
|
||
@comment levels.
|
||
|
||
@example
|
||
define(`echo', `$@@')dnl
|
||
echo(echo(`01234567890123456789', `01234567890123456789')
|
||
echo(`98765432109876543210', `98765432109876543210'))
|
||
@result{}01234567890123456789,01234567890123456789
|
||
@result{}98765432109876543210,98765432109876543210
|
||
len((echo(`01234567890123456789',
|
||
`01234567890123456789')echo(`98765432109876543210',
|
||
`98765432109876543210')))
|
||
@result{}84
|
||
indir(`echo', indir(`echo', `01234567890123456789',
|
||
`01234567890123456789')
|
||
indir(`echo', `98765432109876543210', `98765432109876543210'))
|
||
@result{}01234567890123456789,01234567890123456789
|
||
@result{}98765432109876543210,98765432109876543210
|
||
define(`argn', `$#')dnl
|
||
define(`echo1', `-$@@-')define(`echo2', `,$@@,')dnl
|
||
echo1(`1', `2', `3') argn(echo1(`1', `2', `3'))
|
||
@result{}-1,2,3- 3
|
||
echo2(`1', `2', `3') argn(echo2(`1', `2', `3'))
|
||
@result{},1,2,3, 5
|
||
@end example
|
||
@end ignore
|
||
|
||
A @samp{$} sign in the expansion text that is not followed by anything
@code{m4} understands is simply copied to the macro expansion, like any
other text.
|
||
|
||
@example
|
||
define(`foo', `$$$ hello $$$')
|
||
@result{}
|
||
foo
|
||
@result{}$$$ hello $$$
|
||
@end example
|
||
|
||
@cindex rescanning
|
||
@cindex literal output
|
||
@cindex output, literal
|
||
If you want a macro to expand to something like @samp{$12}, the
|
||
judicious use of nested quoting can put a safe character between the
|
||
@code{$} and the next character, relying on the rescanning to remove the
|
||
nested quote. This will prevent @code{m4} from interpreting the
|
||
@code{$} sign as a reference to an argument.
|
||
|
||
@example
|
||
define(`foo', `no nested quote: $1')
|
||
@result{}
|
||
foo(`arg')
|
||
@result{}no nested quote: arg
|
||
define(`foo', `nested quote around $: `$'1')
|
||
@result{}
|
||
foo(`arg')
|
||
@result{}nested quote around $: $1
|
||
define(`foo', `nested empty quote after $: $`'1')
|
||
@result{}
|
||
foo(`arg')
|
||
@result{}nested empty quote after $: $1
|
||
define(`foo', `nested quote around next character: $`1'')
|
||
@result{}
|
||
foo(`arg')
|
||
@result{}nested quote around next character: $1
|
||
define(`foo', `nested quote around both: `$1'')
|
||
@result{}
|
||
foo(`arg')
|
||
@result{}nested quote around both: arg
|
||
@end example
|
||
|
||
@node Undefine
|
||
@section Deleting a macro
|
||
|
||
@cindex macros, how to delete
|
||
@cindex deleting macros
|
||
@cindex undefining macros
|
||
A macro definition can be removed with @code{undefine}:
|
||
|
||
@deffn Builtin undefine (@var{name}@dots{})
|
||
For each argument, remove the macro @var{name}. The macro names must
|
||
necessarily be quoted, since they will be expanded otherwise.
|
||
|
||
The expansion of @code{undefine} is void.
|
||
The macro @code{undefine} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
foo bar blah
|
||
@result{}foo bar blah
|
||
define(`foo', `some')define(`bar', `other')define(`blah', `text')
|
||
@result{}
|
||
foo bar blah
|
||
@result{}some other text
|
||
undefine(`foo')
|
||
@result{}
|
||
foo bar blah
|
||
@result{}foo other text
|
||
undefine(`bar', `blah')
|
||
@result{}
|
||
foo bar blah
|
||
@result{}foo bar blah
|
||
@end example
|
||
|
||
Undefining a macro inside that macro's expansion is safe; the macro
|
||
still expands to the definition that was in effect at the @samp{(}.
|
||
|
||
@example
|
||
define(`f', ``$0':$1')
|
||
@result{}
|
||
f(f(f(undefine(`f')`hello world')))
|
||
@result{}f:f:f:hello world
|
||
f(`bye')
|
||
@result{}f(bye)
|
||
@end example
|
||
|
||
It is not an error for @var{name} to have no macro definition. In that
|
||
case, @code{undefine} does nothing.
|
||
|
||
@node Defn
|
||
@section Renaming macros
|
||
|
||
@cindex macros, how to rename
|
||
@cindex renaming macros
|
||
@cindex macros, displaying definitions
|
||
@cindex definitions, displaying macro
|
||
It is possible to rename an already defined macro. To do this, you need
|
||
the builtin @code{defn}:
|
||
|
||
@deffn Builtin defn (@var{name}@dots{})
|
||
Expands to the @emph{quoted definition} of each @var{name}. If an
|
||
argument is not a defined macro, the expansion for that argument is
|
||
empty.
|
||
|
||
If @var{name} is a user-defined macro, the quoted definition is simply
|
||
the quoted expansion text. If, instead, there is only one @var{name}
|
||
and it is a builtin, the
|
||
expansion is a special token, which points to the builtin's internal
|
||
definition. This token is only meaningful as the second argument to
|
||
@code{define} (and @code{pushdef}), and is silently converted to an
|
||
empty string in most other contexts. Combining a builtin with anything
|
||
else is not supported; a warning is issued and the builtin is omitted
|
||
from the final expansion.
|
||
|
||
The macro @code{defn} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
Its normal use is best understood through an example, which shows how to
|
||
rename @code{undefine} to @code{zap}:
|
||
|
||
@example
|
||
define(`zap', defn(`undefine'))
|
||
@result{}
|
||
zap(`undefine')
|
||
@result{}
|
||
undefine(`zap')
|
||
@result{}undefine(zap)
|
||
@end example
|
||
|
||
In this way, @code{defn} can be used to copy macro definitions, and also
|
||
definitions of builtin macros. Even if the original macro is removed,
|
||
the other name can still be used to access the definition.
|
||
|
||
The fact that macro definitions can be transferred also explains why you
|
||
should use @code{$0}, rather than retyping a macro's name in its
|
||
definition:
|
||
|
||
@example
|
||
define(`foo', `This is `$0'')
|
||
@result{}
|
||
define(`bar', defn(`foo'))
|
||
@result{}
|
||
bar
|
||
@result{}This is bar
|
||
@end example
|
||
|
||
Macros used as string variables should be referred to through @code{defn},
|
||
to avoid unwanted expansion of the text:
|
||
|
||
@example
|
||
define(`string', `The macro dnl is very useful
|
||
')
|
||
@result{}
|
||
string
|
||
@result{}The macro@w{ }
|
||
defn(`string')
|
||
@result{}The macro dnl is very useful
|
||
@result{}
|
||
@end example
|
||
|
||
@cindex rescanning
|
||
However, it is important to remember that @code{m4} rescanning is purely
|
||
textual. If an unbalanced end-quote string occurs in a macro
|
||
definition, the rescan will see that embedded quote as the termination
|
||
of the quoted string, and the remainder of the macro's definition will
|
||
be rescanned unquoted. Thus it is a good idea to avoid unbalanced
|
||
end-quotes in macro definitions or arguments to macros.
|
||
|
||
@example
|
||
define(`foo', a'a)
|
||
@result{}
|
||
define(`a', `A')
|
||
@result{}
|
||
define(`echo', `$@@')
|
||
@result{}
|
||
foo
|
||
@result{}A'A
|
||
defn(`foo')
|
||
@result{}aA'
|
||
echo(foo)
|
||
@result{}AA'
|
||
@end example
|
||
|
||
On the other hand, it is possible to exploit the fact that @code{defn}
|
||
can concatenate multiple macros prior to the rescanning phase, in order
|
||
to join the definitions of macros that, in isolation, have unbalanced
|
||
quotes. This is particularly useful when one has used several macros to
|
||
accumulate text that M4 should rescan as a whole. In the example below,
|
||
note how the use of @code{defn} on @code{l} in isolation opens a string,
|
||
which is not closed until the next line; but used on @code{l} and
|
||
@code{r} together results in nested quoting.
|
||
|
||
@example
|
||
define(`l', `<[>')define(`r', `<]>')
|
||
@result{}
|
||
changequote(`[', `]')
|
||
@result{}
|
||
defn([l])defn([r])
|
||
])
|
||
@result{}<[>]defn([r])
|
||
@result{})
|
||
defn([l], [r])
|
||
@result{}<[>][<]>
|
||
@end example
|
||
|
||
@cindex builtins, special tokens
|
||
@cindex tokens, builtin macro
|
||
Using @code{defn} to generate special tokens for builtin macros outside
|
||
of expected contexts can sometimes trigger warnings. But most of the
|
||
time, such tokens are silently converted to the empty string.
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
defn(`defn')
|
||
@result{}
|
||
define(defn(`divnum'), `cannot redefine a builtin token')
|
||
@error{}m4:stdin:2: Warning: define: invalid macro name ignored
|
||
@result{}
|
||
divnum
|
||
@result{}0
|
||
len(defn(`divnum'))
|
||
@result{}0
|
||
@end example
|
||
|
||
Also note that @code{defn} with multiple arguments can only join text
|
||
macros, not builtins, although a future version of @acronym{GNU} M4 may
|
||
lift this restriction.
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
define(`a', `A')define(`AA', `b')
|
||
@result{}
|
||
traceon(`defn', `define')
|
||
@result{}
|
||
defn(`a', `divnum', `a')
|
||
@error{}m4:stdin:3: Warning: cannot concatenate builtin `divnum'
|
||
@error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'`A''
|
||
@result{}AA
|
||
define(`mydivnum', defn(`divnum', `divnum'))mydivnum
|
||
@error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
|
||
@error{}m4:stdin:4: Warning: cannot concatenate builtin `divnum'
|
||
@error{}m4trace: -2- defn(`divnum', `divnum')
|
||
@error{}m4trace: -1- define(`mydivnum', `')
|
||
@result{}
|
||
traceoff(`defn', `define')
|
||
@result{}
|
||
@end example
|
||
|
||
@node Pushdef
|
||
@section Temporarily redefining macros
|
||
|
||
@cindex macros, temporary redefinition of
|
||
@cindex temporary redefinition of macros
|
||
@cindex redefinition of macros, temporary
|
||
@cindex definition stack
|
||
@cindex pushdef stack
|
||
@cindex stack, macro definition
|
||
It is possible to redefine a macro temporarily, reverting to the
|
||
previous definition at a later time. This is done with the builtins
|
||
@code{pushdef} and @code{popdef}:
|
||
|
||
@deffn Builtin pushdef (@var{name}, @ovar{expansion})
|
||
@deffnx Builtin popdef (@var{name}@dots{})
|
||
Analogous to @code{define} and @code{undefine}.
|
||
|
||
These macros work in a stack-like fashion. A macro is temporarily
|
||
redefined with @code{pushdef}, which replaces an existing definition of
|
||
@var{name}, while saving the previous definition, before the new one is
|
||
installed. If there is no previous definition, @code{pushdef} behaves
|
||
exactly like @code{define}.
|
||
|
||
If a macro has several definitions (of which only one is accessible),
|
||
the topmost definition can be removed with @code{popdef}. If there is
|
||
no previous definition, @code{popdef} behaves like @code{undefine}.
|
||
|
||
The expansion of both @code{pushdef} and @code{popdef} is void.
|
||
The macros @code{pushdef} and @code{popdef} are recognized only with
|
||
parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
define(`foo', `Expansion one.')
|
||
@result{}
|
||
foo
|
||
@result{}Expansion one.
|
||
pushdef(`foo', `Expansion two.')
|
||
@result{}
|
||
foo
|
||
@result{}Expansion two.
|
||
pushdef(`foo', `Expansion three.')
|
||
@result{}
|
||
pushdef(`foo', `Expansion four.')
|
||
@result{}
|
||
popdef(`foo')
|
||
@result{}
|
||
foo
|
||
@result{}Expansion three.
|
||
popdef(`foo', `foo')
|
||
@result{}
|
||
foo
|
||
@result{}Expansion one.
|
||
popdef(`foo')
|
||
@result{}
|
||
foo
|
||
@result{}foo
|
||
@end example
|
||
|
||
If a macro with several definitions is redefined with @code{define}, the
|
||
topmost definition is @emph{replaced} with the new definition. If it is
|
||
removed with @code{undefine}, @emph{all} the definitions are removed,
|
||
and not only the topmost one. However, @acronym{POSIX} allows other
|
||
implementations that treat @code{define} as replacing an entire stack
|
||
of definitions with a single new definition, so to be portable to other
|
||
implementations, it may be worth explicitly using @code{popdef} and
|
||
@code{pushdef} rather than relying on the @acronym{GNU} behavior of
|
||
@code{define}.
|
||
|
||
@example
|
||
define(`foo', `Expansion one.')
|
||
@result{}
|
||
foo
|
||
@result{}Expansion one.
|
||
pushdef(`foo', `Expansion two.')
|
||
@result{}
|
||
foo
|
||
@result{}Expansion two.
|
||
define(`foo', `Second expansion two.')
|
||
@result{}
|
||
foo
|
||
@result{}Second expansion two.
|
||
undefine(`foo')
|
||
@result{}
|
||
foo
|
||
@result{}foo
|
||
@end example
|
||
|
||
@cindex local variables
|
||
@cindex variables, local
|
||
Local variables within macros are made with @code{pushdef} and
|
||
@code{popdef}. At the start of the macro a new definition is pushed,
|
||
within the macro it is manipulated and at the end it is popped,
|
||
revealing the former definition.
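
A minimal sketch of this idiom, using the hypothetical names @code{x}
and @code{with_local}: the macro pushes its own definition of
@code{x}, uses it, and pops it again, so the caller's definition of
@code{x} is untouched afterwards.

@example
define(`x', `outer')
@result{}
define(`with_local', `pushdef(`x', `$1')x`'popdef(`x')')
@result{}
with_local(`inner')
@result{}inner
x
@result{}outer
@end example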
|
||
|
||
It is possible to temporarily redefine a builtin with @code{pushdef}
|
||
and @code{defn}.
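
As a sketch, the following saves the builtin @code{len} under the
hypothetical name @code{real_len}, temporarily covers @code{len} with a
text macro, and then restores the builtin with @code{popdef}.

@example
define(`real_len', defn(`len'))
@result{}
pushdef(`len', `real_len(`$1') characters')
@result{}
len(`abc')
@result{}3 characters
popdef(`len')
@result{}
len(`abc')
@result{}3
@end example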
|
||
|
||
@node Indir
|
||
@section Indirect call of macros
|
||
|
||
@cindex indirect call of macros
|
||
@cindex call of macros, indirect
|
||
@cindex macros, indirect call of
|
||
@cindex @acronym{GNU} extensions
|
||
Any macro can be called indirectly with @code{indir}:
|
||
|
||
@deffn Builtin indir (@var{name}, @ovar{args@dots{}})
|
||
Results in a call to the macro @var{name}, which is passed the
|
||
rest of the arguments @var{args}. If @var{name} is not defined, an
|
||
error message is printed, and the expansion is void.
|
||
|
||
The macro @code{indir} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
This can be used to call macros with computed or ``invalid''
|
||
names (@code{define} allows such names to be defined):
|
||
|
||
@example
|
||
define(`$$internal$macro', `Internal macro (name `$0')')
|
||
@result{}
|
||
$$internal$macro
|
||
@result{}$$internal$macro
|
||
indir(`$$internal$macro')
|
||
@result{}Internal macro (name $$internal$macro)
|
||
@end example
|
||
|
||
The point here is that larger macro packages can define private macros
that will not be called by accident. They can @emph{only} be called
through the builtin @code{indir}.
|
||
|
||
One other point to observe is that argument collection occurs before
|
||
@code{indir} invokes @var{name}, so if argument collection changes the
|
||
value of @var{name}, that will be reflected in the final expansion.
|
||
This differs from the behavior when invoking macros directly,
|
||
where the definition that was in effect before argument collection is
|
||
used.
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
define(`f', `1')
|
||
@result{}
|
||
f(define(`f', `2'))
|
||
@result{}1
|
||
indir(`f', define(`f', `3'))
|
||
@result{}3
|
||
indir(`f', undefine(`f'))
|
||
@error{}m4:stdin:4: undefined macro `f'
|
||
@result{}
|
||
@end example
|
||
|
||
When handed the result of @code{defn} (@pxref{Defn}) as one of its
|
||
arguments, @code{indir} defers to the invoked @var{name} for whether a
|
||
token representing a builtin is recognized or flattened to the empty
|
||
string.
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
indir(defn(`defn'), `divnum')
|
||
@error{}m4:stdin:1: Warning: indir: invalid macro name ignored
|
||
@result{}
|
||
indir(`define', defn(`defn'), `divnum')
|
||
@error{}m4:stdin:2: Warning: define: invalid macro name ignored
|
||
@result{}
|
||
indir(`define', `foo', defn(`divnum'))
|
||
@result{}
|
||
foo
|
||
@result{}0
|
||
indir(`divert', defn(`foo'))
|
||
@error{}m4:stdin:5: empty string treated as 0 in builtin `divert'
|
||
@result{}
|
||
@end example
|
||
|
||
@node Builtin
|
||
@section Indirect call of builtins
|
||
|
||
@cindex indirect call of builtins
|
||
@cindex call of builtins, indirect
|
||
@cindex builtins, indirect call of
|
||
@cindex @acronym{GNU} extensions
|
||
Builtin macros can be called indirectly with @code{builtin}:
|
||
|
||
@deffn Builtin builtin (@var{name}, @ovar{args@dots{}})
|
||
Results in a call to the builtin @var{name}, which is passed the
|
||
rest of the arguments @var{args}. If @var{name} does not name a
|
||
builtin, an error message is printed, and the expansion is void.
|
||
|
||
The macro @code{builtin} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
This can be used even if @var{name} has been given another definition
|
||
that has covered the original, or been undefined so that no macro
|
||
maps to the builtin.
|
||
|
||
@example
|
||
pushdef(`define', `hidden')
|
||
@result{}
|
||
undefine(`undefine')
|
||
@result{}
|
||
define(`foo', `bar')
|
||
@result{}hidden
|
||
foo
|
||
@result{}foo
|
||
builtin(`define', `foo', defn(`divnum'))
|
||
@result{}
|
||
foo
|
||
@result{}0
|
||
builtin(`define', `foo', `BAR')
|
||
@result{}
|
||
foo
|
||
@result{}BAR
|
||
undefine(`foo')
|
||
@result{}undefine(foo)
|
||
foo
|
||
@result{}BAR
|
||
builtin(`undefine', `foo')
|
||
@result{}
|
||
foo
|
||
@result{}foo
|
||
@end example
|
||
|
||
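One possible use, sketched here on the assumption that temporarily
shadowing a builtin is acceptable, is to wrap a builtin in a macro of
the same name while still reaching the original through
@code{builtin}. The wrapper text is purely illustrative:

@example
pushdef(`len', `[builtin(`len', `$1') characters]')
@result{}
len(`hello')
@result{}[5 characters]
popdef(`len')
@result{}
len(`hello')
@result{}5
@end example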
The @var{name} argument only matches the original name of the builtin,
|
||
even when the @option{--prefix-builtins} option (or @option{-P},
|
||
@pxref{Operation modes, , Invoking m4}) is in effect. This is different
|
||
from @code{indir}, which only tracks current macro names.
|
||
|
||
@comment options: -P
|
||
@example
|
||
$ @kbd{m4 -P}
|
||
m4_builtin(`divnum')
|
||
@result{}0
|
||
m4_builtin(`m4_divnum')
|
||
@error{}m4:stdin:2: undefined builtin `m4_divnum'
|
||
@result{}
|
||
m4_indir(`divnum')
|
||
@error{}m4:stdin:3: undefined macro `divnum'
|
||
@result{}
|
||
m4_indir(`m4_divnum')
|
||
@result{}0
|
||
@end example
|
||
|
||
Note that @code{indir} and @code{builtin} can be used to invoke builtins
without arguments, even when they normally require parameters to be
recognized; doing so provokes a warning and results in a void expansion.
|
||
|
||
@example
|
||
builtin
|
||
@result{}builtin
|
||
builtin()
|
||
@error{}m4:stdin:2: undefined builtin `'
|
||
@result{}
|
||
builtin(`builtin')
|
||
@error{}m4:stdin:3: Warning: too few arguments to builtin `builtin'
|
||
@result{}
|
||
builtin(`builtin',)
|
||
@error{}m4:stdin:4: undefined builtin `'
|
||
@result{}
|
||
builtin(`builtin', ``'
|
||
')
|
||
@error{}m4:stdin:5: undefined builtin ``'
|
||
@error{}'
|
||
@result{}
|
||
indir(`index')
|
||
@error{}m4:stdin:7: Warning: too few arguments to builtin `index'
|
||
@result{}
|
||
@end example
|
||
|
||
@ignore
|
||
@comment This example is not worth putting in the manual, but it is
|
||
@comment needed for full coverage. Autoconf's m4_include relies heavily
|
||
@comment on this feature.
|
||
|
||
@example
|
||
builtin(`include', `foo')dnl
|
||
@result{}bar
|
||
@end example
|
||
|
||
@comment And this example triggers a regression present in 1.4.10b.
|
||
|
||
@example
|
||
define(`s', `builtin(`shift', $@@)')dnl
|
||
define(`loop', `ifelse(`$2', `', `-', `$1$2: $0(`$1', s(s($@@)))')')dnl
|
||
loop(`1')
|
||
@result{}-
|
||
loop(`1', `2')
|
||
@result{}12: -
|
||
loop(`1', `2', `3')
|
||
@result{}12: 13: -
|
||
loop(`1', `2', `3', `4')
|
||
@result{}12: 13: 14: -
|
||
loop(`1', `2', `3', `4', `5')
|
||
@result{}12: 13: 14: 15: -
|
||
@end example
|
||
@end ignore
|
||
|
||
@node Conditionals
|
||
@chapter Conditionals, loops, and recursion
|
||
|
||
Macros, expanding to plain text, perhaps with arguments, are not quite
|
||
enough. We would like to have macros expand to different things, based
|
||
on decisions taken at run-time. For that, we need some kind of conditionals.
|
||
Also, we would like to have some kind of loop construct, so we could do
|
||
something a number of times, or while some condition is true.
|
||
|
||
@menu
|
||
* Ifdef:: Testing if a macro is defined
|
||
* Ifelse:: If-else construct, or multibranch
|
||
* Shift:: Recursion in @code{m4}
|
||
* Forloop:: Iteration by counting
|
||
* Foreach:: Iteration by list contents
|
||
* Stacks:: Working with definition stacks
|
||
* Composition:: Building macros with macros
|
||
@end menu
|
||
|
||
@node Ifdef
|
||
@section Testing if a macro is defined
|
||
|
||
@cindex conditionals
|
||
There are two different builtin conditionals in @code{m4}. The first is
|
||
@code{ifdef}:
|
||
|
||
@deffn Builtin ifdef (@var{name}, @var{string-1}, @ovar{string-2})
|
||
If @var{name} is defined as a macro, @code{ifdef} expands to
|
||
@var{string-1}, otherwise to @var{string-2}. If @var{string-2} is
|
||
omitted, it is taken to be the empty string (according to the normal
|
||
rules).
|
||
|
||
The macro @code{ifdef} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
ifdef(`foo', ``foo' is defined', ``foo' is not defined')
|
||
@result{}foo is not defined
|
||
define(`foo', `')
|
||
@result{}
|
||
ifdef(`foo', ``foo' is defined', ``foo' is not defined')
|
||
@result{}foo is defined
|
||
ifdef(`no_such_macro', `yes', `no', `extra argument')
|
||
@error{}m4:stdin:4: Warning: excess arguments to builtin `ifdef' ignored
|
||
@result{}no
|
||
@end example
|
||
|
||
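A frequent application of @code{ifdef} is supplying a default
definition that does not override an existing one. The following is
only a sketch; @code{set_default} and @code{prefix} are hypothetical
names used for illustration:

@example
define(`set_default', `ifdef(`$1', `', `define(`$1', `$2')')')
@result{}
set_default(`prefix', `/usr/local')
@result{}
prefix
@result{}/usr/local
set_default(`prefix', `/opt')
@result{}
prefix
@result{}/usr/local
@end example

The second call has no effect, because @code{prefix} is already defined
by then.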
@node Ifelse
|
||
@section If-else construct, or multibranch
|
||
|
||
@cindex comparing strings
|
||
@cindex discarding input
|
||
@cindex input, discarding
|
||
The other conditional, @code{ifelse}, is much more powerful. It can be
|
||
used as a way to introduce a long comment, as an if-else construct, or
|
||
as a multibranch, depending on the number of arguments supplied:
|
||
|
||
@deffn Builtin ifelse (@var{comment})
|
||
@deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
|
||
@ovar{not-equal})
|
||
@deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
|
||
@var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
|
||
Used with only one argument, the @code{ifelse} simply discards it and
|
||
produces no output.
|
||
|
||
If called with three or four arguments, @code{ifelse} expands into
|
||
@var{equal}, if @var{string-1} and @var{string-2} are equal (character
|
||
for character), otherwise it expands to @var{not-equal}. A final fifth
|
||
argument is ignored, after triggering a warning.
|
||
|
||
If called with six or more arguments, and @var{string-1} and
|
||
@var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
|
||
otherwise the first three arguments are discarded and the processing
|
||
starts again.
|
||
|
||
The macro @code{ifelse} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
Using only one argument is a common @code{m4} idiom for introducing a
|
||
block comment, as an alternative to repeatedly using @code{dnl}. This
|
||
special usage is recognized by @acronym{GNU} @code{m4}, so that in this
|
||
case, the warning about missing arguments is never triggered.
|
||
|
||
@example
|
||
ifelse(`some comments')
|
||
@result{}
|
||
ifelse(`foo', `bar')
|
||
@error{}m4:stdin:2: Warning: too few arguments to builtin `ifelse'
|
||
@result{}
|
||
@end example
|
||
|
||
Using three or four arguments provides decision points.
|
||
|
||
@example
|
||
ifelse(`foo', `bar', `true')
|
||
@result{}
|
||
ifelse(`foo', `foo', `true')
|
||
@result{}true
|
||
define(`foo', `bar')
|
||
@result{}
|
||
ifelse(foo, `bar', `true', `false')
|
||
@result{}true
|
||
ifelse(foo, `foo', `true', `false')
|
||
@result{}false
|
||
@end example
|
||
|
||
@cindex macro, blind
|
||
@cindex blind macro
|
||
Notice how the first argument was used unquoted; it is common to compare
|
||
the expansion of a macro with a string. With this macro, you can now
|
||
reproduce the behavior of blind builtins, where the macro is recognized
|
||
only with arguments.
|
||
|
||
@example
|
||
define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
|
||
@result{}
|
||
foo
|
||
@result{}foo
|
||
foo()
|
||
@result{}arguments:1
|
||
foo(`a', `b', `c')
|
||
@result{}arguments:3
|
||
@end example
|
||
|
||
For an example of a way to make defining blind macros easier, see
|
||
@ref{Composition}.
|
||
|
||
@cindex multibranches
|
||
@cindex switch statement
|
||
@cindex case statement
|
||
The macro @code{ifelse} can take more than four arguments, in which case
it works like a @code{case} or @code{switch} statement in traditional
programming languages. If @var{string-1} and @var{string-2} are equal,
@code{ifelse} expands into @var{equal-1}; otherwise the procedure is
repeated with the first three arguments discarded. This calls for an
example:
|
||
|
||
@example
|
||
ifelse(`foo', `bar', `third', `gnu', `gnats')
|
||
@error{}m4:stdin:1: Warning: excess arguments to builtin `ifelse' ignored
|
||
@result{}gnu
|
||
ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
|
||
@result{}
|
||
ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
|
||
@result{}seventh
|
||
ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
|
||
@error{}m4:stdin:4: Warning: excess arguments to builtin `ifelse' ignored
|
||
@result{}7
|
||
@end example
|
||
|
||
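In practice, a multibranch usually compares one macro argument against
several fixed strings, with the last argument as the default. A small
sketch follows, in which the macro @code{sound} and its strings are
hypothetical:

@example
define(`sound', `ifelse(`$1', `cat', `meow',
  `$1', `dog', `woof',
  `unknown')')
@result{}
sound(`cat')
@result{}meow
sound(`dog')
@result{}woof
sound(`cow')
@result{}unknown
@end example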
@ignore
|
||
@comment Stress tests, not worth documenting.
|
||
|
||
@comment Ensure that references compared to strings work regardless of
|
||
@comment similar prefixes.
|
||
@example
|
||
define(`e', `$@@')define(`long', `01234567890123456789')
|
||
@result{}
|
||
ifelse(long, `01234567890123456789', `yes', `no')
|
||
@result{}yes
|
||
ifelse(`01234567890123456789', long, `yes', `no')
|
||
@result{}yes
|
||
ifelse(long, `01234567890123456789-', `yes', `no')
|
||
@result{}no
|
||
ifelse(`01234567890123456789-', long, `yes', `no')
|
||
@result{}no
|
||
ifelse(e(long), `01234567890123456789', `yes', `no')
|
||
@result{}yes
|
||
ifelse(`01234567890123456789', e(long), `yes', `no')
|
||
@result{}yes
|
||
ifelse(e(long), `01234567890123456789-', `yes', `no')
|
||
@result{}no
|
||
ifelse(`01234567890123456789-', e(long), `yes', `no')
|
||
@result{}no
|
||
ifelse(-e(long), `-01234567890123456789', `yes', `no')
|
||
@result{}yes
|
||
ifelse(-`01234567890123456789', -e(long), `yes', `no')
|
||
@result{}yes
|
||
ifelse(-e(long), `-01234567890123456789-', `yes', `no')
|
||
@result{}no
|
||
ifelse(`-01234567890123456789-', -e(long), `yes', `no')
|
||
@result{}no
|
||
ifelse(-e(long)-, `-01234567890123456789-', `yes', `no')
|
||
@result{}yes
|
||
ifelse(-`01234567890123456789-', -e(long)-, `yes', `no')
|
||
@result{}yes
|
||
ifelse(-e(long)-, `-01234567890123456789', `yes', `no')
|
||
@result{}no
|
||
ifelse(`-01234567890123456789', -e(long)-, `yes', `no')
|
||
@result{}no
|
||
ifelse(`-'e(long), `-01234567890123456789', `yes', `no')
|
||
@result{}yes
|
||
ifelse(-`01234567890123456789', `-'e(long), `yes', `no')
|
||
@result{}yes
|
||
ifelse(`-'e(long), `-01234567890123456789-', `yes', `no')
|
||
@result{}no
|
||
ifelse(`-01234567890123456789-', `-'e(long), `yes', `no')
|
||
@result{}no
|
||
ifelse(`-'e(long)`-', `-01234567890123456789-', `yes', `no')
|
||
@result{}yes
|
||
ifelse(-`01234567890123456789-', `-'e(long)`-', `yes', `no')
|
||
@result{}yes
|
||
ifelse(`-'e(long)`-', `-01234567890123456789', `yes', `no')
|
||
@result{}no
|
||
ifelse(`-01234567890123456789', `-'e(long)`-', `yes', `no')
|
||
@result{}no
|
||
@end example
|
||
@end ignore
|
||
|
||
Naturally, the normal case will be slightly more advanced than these
|
||
examples. A common use of @code{ifelse} is in macros implementing loops
|
||
of various kinds.
|
||
|
||
@node Shift
|
||
@section Recursion in @code{m4}
|
||
|
||
@cindex recursive macros
|
||
@cindex macros, recursive
|
||
There is no direct support for loops in @code{m4}, but macros can be
|
||
recursive. There is no limit on the number of recursion levels, other
|
||
than those enforced by your hardware and operating system.
|
||
|
||
@cindex loops
|
||
Loops can be programmed using recursion and the conditionals described
|
||
previously.
|
||
|
||
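As a minimal sketch of such a loop, the hypothetical macro
@code{countdown} below calls itself with a decremented argument until
@code{ifelse} detects that zero has been reached. It assumes a
non-negative integer argument; anything else would recurse without end.

@example
define(`countdown', `$1`'ifelse(`$1', `0', `', ` $0(decr(`$1'))')')
@result{}
countdown(`3')
@result{}3 2 1 0
@end example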
There is a builtin macro, @code{shift}, which can, among other things,
|
||
be used for iterating through the actual arguments to a macro:
|
||
|
||
@deffn Builtin shift (@var{arg1}, @dots{})
|
||
Takes any number of arguments, and expands to all its arguments except
|
||
@var{arg1}, separated by commas, with each argument quoted.
|
||
|
||
The macro @code{shift} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
shift
|
||
@result{}shift
|
||
shift(`bar')
|
||
@result{}
|
||
shift(`foo', `bar', `baz')
|
||
@result{}bar,baz
|
||
@end example
|
||
|
||
An example of the use of @code{shift} is this macro:
|
||
|
||
@cindex reversing arguments
|
||
@cindex arguments, reversing
|
||
@deffn Composite reverse (@dots{})
|
||
Takes any number of arguments, and reverses their order.
|
||
@end deffn
|
||
|
||
It is implemented as:
|
||
|
||
@example
|
||
define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
|
||
`reverse(shift($@@)), `$1'')')
|
||
@result{}
|
||
reverse
|
||
@result{}
|
||
reverse(`foo')
|
||
@result{}foo
|
||
reverse(`foo', `bar', `gnats', `and gnus')
|
||
@result{}and gnus, gnats, bar, foo
|
||
@end example
|
||
|
||
While not a very interesting macro, it does show how simple loops can be
|
||
made with @code{shift}, @code{ifelse} and recursion. It also shows
|
||
that @code{shift} is usually used with @samp{$@@}. Another example of
|
||
this is an implementation of a short-circuiting conditional operator.
|
||
|
||
@cindex short-circuiting conditional
|
||
@cindex conditional, short-circuiting
|
||
@deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
|
||
@ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
|
||
Similar to @code{ifelse}, where an equal comparison between the first
|
||
two strings results in the third, otherwise the first three arguments
|
||
are discarded and the process repeats. The difference is that each
|
||
@var{test-<n>} is expanded only when it is encountered. This means that
|
||
every third argument to @code{cond} is normally given one more level of
|
||
quoting than the corresponding argument to @code{ifelse}.
|
||
@end deffn
|
||
|
||
Here is the implementation of @code{cond}, along with a demonstration of
|
||
how it can short-circuit the side effects in @code{side}. Notice how
|
||
all the unquoted side effects happen regardless of how many comparisons
|
||
are made with @code{ifelse}, compared with only the relevant effects
|
||
with @code{cond}.
|
||
|
||
@example
|
||
define(`cond',
|
||
`ifelse(`$#', `1', `$1',
|
||
`ifelse($1, `$2', `$3',
|
||
`$0(shift(shift(shift($@@))))')')')dnl
|
||
define(`side', `define(`counter', incr(counter))$1')dnl
|
||
define(`example1',
|
||
`define(`counter', `0')dnl
|
||
ifelse(side(`$1'), `yes', `one comparison: ',
|
||
side(`$1'), `no', `two comparisons: ',
|
||
side(`$1'), `maybe', `three comparisons: ',
|
||
`side(`default answer: ')')counter')dnl
|
||
define(`example2',
|
||
`define(`counter', `0')dnl
|
||
cond(`side(`$1')', `yes', `one comparison: ',
|
||
`side(`$1')', `no', `two comparisons: ',
|
||
`side(`$1')', `maybe', `three comparisons: ',
|
||
`side(`default answer: ')')counter')dnl
|
||
example1(`yes')
|
||
@result{}one comparison: 3
|
||
example1(`no')
|
||
@result{}two comparisons: 3
|
||
example1(`maybe')
|
||
@result{}three comparisons: 3
|
||
example1(`feeling rather indecisive today')
|
||
@result{}default answer: 4
|
||
example2(`yes')
|
||
@result{}one comparison: 1
|
||
example2(`no')
|
||
@result{}two comparisons: 2
|
||
example2(`maybe')
|
||
@result{}three comparisons: 3
|
||
example2(`feeling rather indecisive today')
|
||
@result{}default answer: 4
|
||
@end example
|
||
|
||
@cindex joining arguments
|
||
@cindex arguments, joining
|
||
@cindex concatenating arguments
|
||
Another common task that requires iteration is joining a list of
|
||
arguments into a single string.
|
||
|
||
@deffn Composite join (@ovar{separator}, @ovar{args@dots{}})
|
||
@deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}})
|
||
Generate a single-quoted string, consisting of each @var{arg} separated
|
||
by @var{separator}. While @code{joinall} always outputs a
|
||
@var{separator} between arguments, @code{join} avoids the
|
||
@var{separator} for an empty @var{arg}.
|
||
@end deffn
|
||
|
||
Here are some examples of their usage, based on the implementation
|
||
@file{m4-@value{VERSION}/@/examples/@/join.m4} distributed in this
|
||
package:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`join.m4')
|
||
@result{}
|
||
join,join(`-'),join(`-', `'),join(`-', `', `')
|
||
@result{},,,
|
||
joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `')
|
||
@result{},,,-
|
||
join(`-', `1')
|
||
@result{}1
|
||
join(`-', `1', `2', `3')
|
||
@result{}1-2-3
|
||
join(`', `1', `2', `3')
|
||
@result{}123
|
||
join(`-', `', `1', `', `', `2', `')
|
||
@result{}1-2
|
||
joinall(`-', `', `1', `', `', `2', `')
|
||
@result{}-1---2-
|
||
join(`,', `1', `2', `3')
|
||
@result{}1,2,3
|
||
define(`nargs', `$#')dnl
|
||
nargs(join(`,', `1', `2', `3'))
|
||
@result{}1
|
||
@end example
|
||
|
||
Examining the implementation shows some interesting points about several
@code{m4} programming idioms.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`join.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# join(sep, args) - join each non-empty ARG into a single
|
||
@result{}# string, with each element separated by SEP
|
||
@result{}define(`join',
|
||
@result{}`ifelse(`$#', `2', ``$2'',
|
||
@result{} `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')')
|
||
@result{}define(`_join',
|
||
@result{}`ifelse(`$#$2', `2', `',
|
||
@result{} `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')')
|
||
@result{}# joinall(sep, args) - join each ARG, including empty ones,
|
||
@result{}# into a single string, with each element separated by SEP
|
||
@result{}define(`joinall', ``$2'_$0(`$1', shift($@@))')
|
||
@result{}define(`_joinall',
|
||
@result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')')
|
||
@result{}divert`'dnl
|
||
@end example
|
||
|
||
First, notice that this implementation creates helper macros
|
||
@code{_join} and @code{_joinall}. This division of labor makes it
|
||
easier to output the correct number of @var{separator} instances:
|
||
@code{join} and @code{joinall} are responsible for the first argument,
|
||
without a separator, while @code{_join} and @code{_joinall} are
|
||
responsible for all remaining arguments, always outputting a separator
|
||
when outputting an argument.
|
||
|
||
Next, observe how @code{join} decides to iterate to itself, because the
|
||
first @var{arg} was empty, or to output the argument and swap over to
|
||
@code{_join}. If the argument is non-empty, then the nested
|
||
@code{ifelse} results in an unquoted @samp{_}, which is concatenated
|
||
with the @samp{$0} to form the next macro name to invoke. The
|
||
@code{joinall} implementation is simpler since it does not have to
|
||
suppress empty @var{arg}; it always executes once then defers to
|
||
@code{_joinall}.
|
||
|
||
Another important idiom is the idea that @var{separator} is reused for
|
||
each iteration. Each iteration has one less argument, but rather than
|
||
discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro
|
||
discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}.
|
||
|
||
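The same idiom can be sketched in isolation. The macro @code{tagall}
below is hypothetical and is not part of @file{join.m4}; it keeps its
first argument across iterations and consumes one of the remaining
arguments each time:

@example
define(`tagall', `ifelse(`$#', `1', `', `$#', `2', `[$1:$2]',
  `[$1:$2]$0(`$1', shift(shift($@@)))')')
@result{}
tagall(`x', `a', `b', `c')
@result{}[x:a][x:b][x:c]
@end example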
Next, notice that it is possible to compare more than one condition in a
|
||
single @code{ifelse} test. The test of @samp{$#$2} against @samp{2}
|
||
allows @code{_join} to iterate for two separate reasons---either there
|
||
are still more than two arguments, or there are exactly two arguments
|
||
but the last argument is not empty.
|
||
|
||
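To see the combined test on its own, consider this sketch. The macro
@code{check} is a hypothetical name, and the concatenation
@samp{$#$2} equals @samp{2} only when there are exactly two arguments
and the second one is empty:

@example
define(`check', `ifelse(`$#$2', `2', `two arguments, second empty',
  `something else')')
@result{}
check(`a', `')
@result{}two arguments, second empty
check(`a', `b')
@result{}something else
check(`a', `', `c')
@result{}something else
@end example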
Finally, notice that these macros require exactly two arguments to
|
||
terminate recursion, but that they still correctly result in empty
|
||
output when given no @var{args} (i.e., zero or one macro argument). On
|
||
the first pass when there are too few arguments, the @code{shift}
|
||
results in no output, but leaves an empty string to serve as the
|
||
required second argument for the second pass. Put another way,
|
||
@samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the
|
||
former guarantees at least two arguments.
|
||
|
||
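The difference can be observed by counting arguments. In this sketch,
@code{same} and @code{padded} are hypothetical helper names:

@example
define(`nargs', `$#')dnl
define(`same', `nargs($@@)')dnl
define(`padded', `nargs(`$1', shift($@@))')dnl
same(`a'), padded(`a')
@result{}1, 2
same(`a', `b', `c'), padded(`a', `b', `c')
@result{}3, 3
@end example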
@cindex quote manipulation
|
||
@cindex manipulating quotes
|
||
Sometimes, a recursive algorithm requires adding quotes to each element,
|
||
or treating multiple arguments as a single element:
|
||
|
||
@deffn Composite quote (@dots{})
|
||
@deffnx Composite dquote (@dots{})
|
||
@deffnx Composite dquote_elt (@dots{})
|
||
Takes any number of arguments, and adds quoting. With @code{quote},
|
||
only one level of quoting is added, effectively removing whitespace
|
||
after commas and turning multiple arguments into a single string. With
|
||
@code{dquote}, two levels of quoting are added, one around each element,
|
||
and one around the list. And with @code{dquote_elt}, two levels of
|
||
quoting are added around each element.
|
||
@end deffn
|
||
|
||
An actual implementation of these three macros is distributed as
|
||
@file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package. First,
|
||
let's examine their usage:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`quote.m4')
|
||
@result{}
|
||
-quote-dquote-dquote_elt-
|
||
@result{}----
|
||
-quote()-dquote()-dquote_elt()-
|
||
@result{}--`'-`'-
|
||
-quote(`1')-dquote(`1')-dquote_elt(`1')-
|
||
@result{}-1-`1'-`1'-
|
||
-quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
|
||
@result{}-1,2-`1',`2'-`1',`2'-
|
||
define(`n', `$#')dnl
|
||
-n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
|
||
@result{}-1-1-2-
|
||
dquote(dquote_elt(`1', `2'))
|
||
@result{}``1'',``2''
|
||
dquote_elt(dquote(`1', `2'))
|
||
@result{}``1',`2''
|
||
@end example
|
||
|
||
The last two lines show that when given two arguments, @code{dquote}
|
||
results in one string, while @code{dquote_elt} results in two. Now,
|
||
examine the implementation. Note that @code{quote} and
|
||
@code{dquote_elt} make decisions based on their number of arguments, so
|
||
that when called without arguments, they result in nothing instead of a
|
||
quoted empty string; this is so that it is possible to distinguish
|
||
between no arguments and an empty first argument. @code{dquote}, on the
|
||
other hand, results in a string no matter what, since it is still
|
||
possible to tell whether it was invoked without arguments based on the
|
||
resulting string.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`quote.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# quote(args) - convert args to single-quoted string
|
||
@result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
|
||
@result{}# dquote(args) - convert args to quoted list of quoted strings
|
||
@result{}define(`dquote', ``$@@'')
|
||
@result{}# dquote_elt(args) - convert args to list of double-quoted strings
|
||
@result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
|
||
@result{} ```$1'',$0(shift($@@))')')
|
||
@result{}divert`'dnl
|
||
@end example
|
||
|
||
It is worth pointing out that @samp{quote(@var{args})} is more efficient
|
||
than @samp{joinall(`,', @var{args})} for producing the same output.
|
||
|
||
@cindex nine arguments, more than
|
||
@cindex more than nine arguments
|
||
@cindex arguments, more than nine
|
||
One more useful macro based on @code{shift} allows portably selecting
|
||
an arbitrary argument (usually greater than the ninth argument), without
|
||
relying on the @acronym{GNU} extension of multi-digit arguments
|
||
(@pxref{Arguments}).
|
||
|
||
@deffn Composite argn (@var{n}, @dots{})
|
||
Expands to argument @var{n} out of the remaining arguments. @var{n}
|
||
must be a positive number. Usually invoked as
|
||
@samp{argn(`@var{n}',$@@)}.
|
||
@end deffn
|
||
|
||
It is implemented as:
|
||
|
||
@example
|
||
define(`argn', `ifelse(`$1', 1, ``$2'',
|
||
`argn(decr(`$1'), shift(shift($@@)))')')
|
||
@result{}
|
||
argn(`1', `a')
|
||
@result{}a
|
||
define(`foo', `argn(`11', $@@)')
|
||
@result{}
|
||
foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
|
||
@result{}k
|
||
@end example
|
||
|
||
@node Forloop
|
||
@section Iteration by counting
|
||
|
||
@cindex for loops
|
||
@cindex loops, counting
|
||
@cindex counting loops
|
||
Here is an example of a loop macro that implements a simple for loop.
|
||
|
||
@deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
|
||
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each integer value from @var{start} to @var{end},
inclusive. For each assignment to @var{iterator}, appends @var{text} to
the expansion of the @code{forloop}. @var{text} may refer to
@var{iterator}. Any definition of @var{iterator} prior to this
invocation is restored.
|
||
@end deffn
|
||
|
||
It can, for example, be used for simple counting:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`forloop.m4')
|
||
@result{}
|
||
forloop(`i', `1', `8', `i ')
|
||
@result{}1 2 3 4 5 6 7 8@w{ }
|
||
@end example
|
||
|
||
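Since @var{text} is expanded once per iteration, it can also compute
with the iterator. As a small sketch, assuming the same
@file{forloop.m4} from the @file{examples} directory, this prints the
first five squares:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
forloop(`i', `1', `5', `eval(i * i) ')
@result{}1 4 9 16 25@w{ }
@end example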
Calls to @code{forloop} can be nested:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`forloop.m4')
|
||
@result{}
|
||
forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
|
||
')
|
||
@result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
|
||
@result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
|
||
@result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
|
||
@result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
|
||
@result{}
|
||
@end example
|
||
|
||
The implementation of the @code{forloop} macro is fairly
|
||
straightforward. The @code{forloop} macro itself is simply a wrapper,
|
||
which saves the previous definition of the first argument, calls the
|
||
internal macro @code{@w{_forloop}}, and re-establishes the saved
|
||
definition of the first argument.
|
||
|
||
The macro @code{@w{_forloop}} expands the fourth argument once, and
|
||
tests to see if the iterator has reached the final value. If it has
|
||
not finished, it increments the iterator (using the predefined macro
|
||
@code{incr}, @pxref{Incr}), and recurses.
|
||
|
||
Here is an actual implementation of @code{forloop}, distributed as
|
||
@file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`forloop.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# forloop(var, from, to, stmt) - simple version
|
||
@result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
|
||
@result{}define(`_forloop',
|
||
@result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
|
||
@result{}divert`'dnl
|
||
@end example
|
||
|
||
Notice the careful use of quotes. Certain macro arguments are left
|
||
unquoted, each for its own reason. Try to find out @emph{why} these
|
||
arguments are left unquoted, and see what happens if they are quoted.
|
||
(As presented, these two macros are useful but not very robust for
|
||
general use. They lack even basic error handling for cases like
|
||
@var{start} greater than @var{end}, @var{end} not numeric, or
|
||
@var{iterator} not being a macro name. See if you can improve these
|
||
macros; or @pxref{Improved forloop, , Answers}).
|
||
|
||
@node Foreach
|
||
@section Iteration by list contents
|
||
|
||
@cindex for each loops
|
||
@cindex loops, list iteration
|
||
@cindex iterating over lists
|
||
Here is an example of a loop macro that implements list iteration.
|
||
|
||
@deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
|
||
@deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
|
||
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each value from @var{paren-list} or
@var{quote-list}. In @code{foreach}, @var{paren-list} is a
comma-separated list of elements contained in parentheses. In
@code{foreachq}, @var{quote-list} is a comma-separated list of elements
contained in a quoted string. For each assignment to @var{iterator},
appends @var{text} to the overall expansion. @var{text} may refer to
@var{iterator}. Any definition of @var{iterator} prior to this
invocation is restored.
|
||
@end deffn
|
||
|
||
As an example, this displays each word in a list inside of a sentence,
|
||
using an implementation of @code{foreach} distributed as
|
||
@file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
|
||
in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`foreach.m4')
|
||
@result{}
|
||
foreach(`x', (foo, bar, foobar), `Word was: x
|
||
')dnl
|
||
@result{}Word was: foo
|
||
@result{}Word was: bar
|
||
@result{}Word was: foobar
|
||
include(`foreachq.m4')
|
||
@result{}
|
||
foreachq(`x', `foo, bar, foobar', `Word was: x
|
||
')dnl
|
||
@result{}Word was: foo
|
||
@result{}Word was: bar
|
||
@result{}Word was: foobar
|
||
@end example
|
||
|
||
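The @var{text} need not produce output; it can also be used for its
side effects. The following sketch, which assumes the
@file{foreachq.m4} from the @file{examples} directory and uses the
hypothetical macro name @code{total}, accumulates the sum of a list:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`foreachq.m4')
@result{}
define(`total', `0')dnl
foreachq(`n', `1, 2, 3, 4', `define(`total', eval(total + n))')total
@result{}10
@end example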
It is possible to be more complex; each element of the @var{paren-list}
|
||
or @var{quote-list} can itself be a list, to pass as further arguments
|
||
to a helper macro. This example generates a shell case statement:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`foreach.m4')
|
||
@result{}
|
||
define(`_case', ` $1)
|
||
$2=" $1";;
|
||
')dnl
|
||
define(`_cat', `$1$2')dnl
|
||
case $`'1 in
|
||
@result{}case $1 in
|
||
foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
|
||
`_cat(`_case', x)')dnl
|
||
@result{} a)
|
||
@result{} vara=" a";;
|
||
@result{} b)
|
||
@result{} varb=" b";;
|
||
@result{} c)
|
||
@result{} varc=" c";;
|
||
esac
|
||
@result{}esac
|
||
@end example
|
||
|
||
The implementation of the @code{foreach} macro is a bit more involved;
|
||
it is a wrapper around two helper macros. First, @code{@w{_arg1}} is
|
||
needed to grab the first element of a list. Second,
|
||
@code{@w{_foreach}} implements the recursion, successively walking
|
||
through the original list. Here is a simple implementation of
|
||
@code{foreach}:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`foreach.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
|
||
@result{}# parenthesized list, simple version
|
||
@result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
|
||
@result{}define(`_arg1', `$1')
|
||
@result{}define(`_foreach', `ifelse(`$2', `()', `',
|
||
@result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
|
||
@result{}divert`'dnl
|
||
@end example
|
||
|
||
Unfortunately, that implementation is not robust when list elements are
macro names. Each iteration of @code{@w{_foreach}} strips another
layer of quotes, leading to erratic results if list elements are not
|
||
already fully expanded. The first cut at implementing @code{foreachq}
|
||
takes this into account. Also, when using quoted elements in a
|
||
@var{paren-list}, the overall list must be quoted. A @var{quote-list}
|
||
has the nice property of requiring fewer characters to create a list
|
||
containing the same quoted elements. To see the difference between the
|
||
two macros, we attempt to pass double-quoted macro names in a list,
|
||
expecting the macro name on output after one layer of quotes is removed
|
||
during list iteration and the final layer removed during the final
|
||
rescan:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
define(`a', `1')define(`b', `2')define(`c', `3')
|
||
@result{}
|
||
include(`foreach.m4')
|
||
@result{}
|
||
include(`foreachq.m4')
|
||
@result{}
|
||
foreach(`x', `(``a'', ``(b'', ``c)'')', `x
|
||
')
|
||
@result{}1
|
||
@result{}(2)1
|
||
@result{}
|
||
@result{}, x
|
||
@result{})
|
||
foreachq(`x', ```a'', ``(b'', ``c)''', `x
|
||
')dnl
|
||
@result{}a
|
||
@result{}(b
|
||
@result{}c)
|
||
@end example
|
||
|
||
Obviously, @code{foreachq} did a better job; here is its implementation:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`foreachq.m4')dnl
|
||
@result{}include(`quote.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
|
||
@result{}# quoted list, simple version
|
||
@result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
|
||
@result{}define(`_arg1', `$1')
|
||
@result{}define(`_foreachq', `ifelse(quote($2), `', `',
|
||
@result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
|
||
@result{}divert`'dnl
|
||
@end example
|
||
|
||
Notice that @code{@w{_foreachq}} had to use the helper macro
|
||
@code{quote} defined earlier (@pxref{Shift}), to ensure that the
|
||
embedded @code{ifelse} call does not go haywire if a list element
|
||
contains a comma. Unfortunately, this implementation of @code{foreachq}
|
||
has its own severe flaw. Whereas the @code{foreach} implementation was
|
||
linear, this macro is quadratic in the number of list elements, and is
|
||
much more likely to trip up the limit set by the command line option
|
||
@option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
|
||
Invoking m4}). Additionally, this implementation does not expand
|
||
@samp{defn(`@var{iterator}')} very well, when compared with
|
||
@code{foreach}.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`foreach.m4')include(`foreachq.m4')
|
||
@result{}
|
||
foreach(`name', `(`a', `b')', ` defn(`name')')
|
||
@result{} a b
|
||
foreachq(`name', ``a', `b'', ` defn(`name')')
|
||
@result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
|
||
@end example
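
The role that @code{quote} plays in protecting commas can be checked
with a short experiment.  This sketch assumes the simple
@file{foreachq.m4} shown above, and passes a list whose first element
itself contains a comma:

@comment ignore
@example
$ @kbd{m4 -I examples}
include(`foreachq.m4')
@result{}
foreachq(`x', ``a, b', `c'', `<x>')
@result{}<a, b><c>
@end example

The element @samp{a, b} is visited as a single item, because
@code{quote} collapses the remaining list into one string before
@code{ifelse} compares it against the empty string.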
|
||
|
||
It is possible to have robust iteration with linear behavior and sane
|
||
@var{iterator} contents for either list style. See if you can learn
|
||
from the best elements of both of these implementations to create robust
|
||
macros (or @pxref{Improved foreach, , Answers}).
|
||
|
||
@node Stacks
|
||
@section Working with definition stacks
|
||
|
||
@cindex definition stack
|
||
@cindex pushdef stack
|
||
@cindex stack, macro definition
|
||
Thanks to @code{pushdef}, manipulation of a stack is an intrinsic
|
||
operation in @code{m4}. Normally, only the topmost definition in a
|
||
stack is important, but sometimes, it is desirable to manipulate the
|
||
entire definition stack.
|
||
|
||
@deffn Composite stack_foreach (@var{macro}, @var{action})
|
||
@deffnx Composite stack_foreach_lifo (@var{macro}, @var{action})
|
||
For each of the @code{pushdef} definitions associated with @var{macro},
|
||
invoke the macro @var{action} with a single argument of that definition.
|
||
@code{stack_foreach} visits the oldest definition first, while
|
||
@code{stack_foreach_lifo} visits the current definition first.
|
||
@var{action} should not modify or dereference @var{macro}. There are a
|
||
few special macros, such as @code{defn}, which cannot be used as the
|
||
@var{macro} parameter.
|
||
@end deffn
|
||
|
||
A sample implementation of these macros is distributed in the file
|
||
@file{m4-@value{VERSION}/@/examples/@/stack.m4}.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`stack.m4')
|
||
@result{}
|
||
pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
|
||
@result{}
|
||
define(`show', ``$1'
|
||
')
|
||
@result{}
|
||
stack_foreach(`a', `show')dnl
|
||
@result{}1
|
||
@result{}2
|
||
@result{}3
|
||
stack_foreach_lifo(`a', `show')dnl
|
||
@result{}3
|
||
@result{}2
|
||
@result{}1
|
||
@end example
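
The @var{action} is free to ignore the definition it is handed.  As a
small sketch, the depth of a definition stack can be counted this way;
the names @code{count} and @code{bump} are hypothetical helpers chosen
for this example, not part of @file{stack.m4}.

@comment ignore
@example
$ @kbd{m4 -I examples}
include(`stack.m4')
@result{}
pushdef(`s', `x')pushdef(`s', `y')define(`count', `0')
@result{}
define(`bump', `define(`count', incr(count))')
@result{}
stack_foreach(`s', `bump')
@result{}
count
@result{}2
@end example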
|
||
|
||
Now for the implementation. Note the definition of a helper macro,
|
||
@code{_stack_reverse}, which destructively swaps the contents of one
|
||
stack of definitions into the reverse order in the temporary macro
|
||
@samp{tmp-$1}. By calling the helper twice, the original order is
|
||
restored back into the macro @samp{$1}; since the operation is
|
||
destructive, this explains why @samp{$1} must not be modified or
|
||
dereferenced during the traversal. The caller can then inject
|
||
additional code to pass the definition currently being visited to
|
||
@samp{$2}. The choice of helper names is intentional; since @samp{-} is
|
||
not valid as part of a macro name, there is no risk of conflict with a
|
||
valid macro name, and the code is guaranteed to use @code{defn} where
|
||
necessary. Finally, note that any macro used in the traversal of a
|
||
@code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be
|
||
handled by @code{stack_foreach}, since the macro would temporarily be
|
||
undefined during the algorithm.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`stack.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# stack_foreach(macro, action)
|
||
@result{}# Invoke ACTION with a single argument of each definition
|
||
@result{}# from the definition stack of MACRO, starting with the oldest.
|
||
@result{}define(`stack_foreach',
|
||
@result{}`_stack_reverse(`$1', `tmp-$1')'dnl
|
||
@result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')')
|
||
@result{}# stack_foreach_lifo(macro, action)
|
||
@result{}# Invoke ACTION with a single argument of each definition
|
||
@result{}# from the definition stack of MACRO, starting with the newest.
|
||
@result{}define(`stack_foreach_lifo',
|
||
@result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl
|
||
@result{}`_stack_reverse(`tmp-$1', `$1')')
|
||
@result{}define(`_stack_reverse',
|
||
@result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')')
|
||
@result{}divert`'dnl
|
||
@end example
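
To see the destructive nature of the helper, it can be invoked directly.
In this sketch, the ordinary macro name @code{b} is used as the
destination instead of @samp{tmp-a}, purely so that the result can be
inspected by name afterwards.

@comment ignore
@example
$ @kbd{m4 -I examples}
include(`stack.m4')
@result{}
pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
@result{}
_stack_reverse(`a', `b')
@result{}
ifdef(`a', `still defined', `emptied')
@result{}emptied
b
@result{}1
popdef(`b')b
@result{}2
@end example

After a single pass, @code{a} is undefined and @code{b} holds the same
definitions in the opposite order, which is why a second pass is needed
to restore the original stack.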
|
||
|
||
@node Composition
|
||
@section Building macros with macros
|
||
|
||
@cindex macro composition
|
||
@cindex composing macros
|
||
Since m4 is a macro language, it is possible to write macros that
|
||
can build other macros. First on the list is a way to automate the
|
||
creation of blind macros.
|
||
|
||
@cindex macro, blind
|
||
@cindex blind macro
|
||
@deffn Composite define_blind (@var{name}, @ovar{value})
|
||
Defines @var{name} as a blind macro, such that @var{name} will expand to
|
||
@var{value} only when given explicit arguments. @var{value} should not
|
||
be the result of @code{defn} (@pxref{Defn}). This macro is only
|
||
recognized with parameters, and results in an empty string.
|
||
@end deffn
|
||
|
||
Defining a macro to define another macro can be a bit tricky. We want
|
||
to use a literal @samp{$#} in the argument to the nested @code{define}.
|
||
However, if @samp{$} and @samp{#} are adjacent in the definition of
|
||
@code{define_blind}, then it would be expanded as the number of
|
||
arguments to @code{define_blind} rather than the intended number of
|
||
arguments to @var{name}. The solution is to pass the difficult
|
||
characters through extra arguments to a helper macro
|
||
@code{_define_blind}. When composing macros, it is a common idiom to
|
||
need a helper macro to concatenate text that forms parameters in the
|
||
composed macro, rather than interpreting the text as a parameter of the
|
||
composing macro.
|
||
|
||
As for the limitation against using @code{defn}, there are two reasons.
|
||
If a macro was previously defined with @code{define_blind}, then it can
|
||
safely be renamed to a new blind macro using plain @code{define}; using
|
||
@code{define_blind} to rename it just adds another layer of
|
||
@code{ifelse}, occupying memory and slowing down execution. And if a
|
||
macro is a builtin, then it would result in an attempt to define a macro
|
||
consisting of both text and a builtin token; this is not supported, and
|
||
the builtin token is flattened to an empty string.
|
||
|
||
With that explanation, here's the definition, and some sample usage.
|
||
Notice that @code{define_blind} is itself a blind macro.
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
define(`define_blind', `ifelse(`$#', `0', ``$0'',
|
||
`_$0(`$1', `$2', `$'`#', `$'`0')')')
|
||
@result{}
|
||
define(`_define_blind', `define(`$1',
|
||
`ifelse(`$3', `0', ``$4'', `$2')')')
|
||
@result{}
|
||
define_blind
|
||
@result{}define_blind
|
||
define_blind(`foo', `arguments were $*')
|
||
@result{}
|
||
foo
|
||
@result{}foo
|
||
foo(`bar')
|
||
@result{}arguments were bar
|
||
define(`blah', defn(`foo'))
|
||
@result{}
|
||
blah
|
||
@result{}blah
|
||
blah(`a', `b')
|
||
@result{}arguments were a,b
|
||
defn(`blah')
|
||
@result{}ifelse(`$#', `0', ``$0'', `arguments were $*')
|
||
@end example
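
For comparison, here is a sketch of what goes wrong without the helper
macro.  The name @code{naive_blind} is hypothetical; it simply places
@samp{$#} and @samp{$0} directly in the nested definition.

@comment ignore
@example
$ @kbd{m4}
define(`naive_blind', `define(`$1', `ifelse(`$#', `0', ``$0'', `$2')')')
@result{}
naive_blind(`foo', `bar')
@result{}
defn(`foo')
@result{}ifelse(`2', `0', ``naive_blind'', `bar')
foo
@result{}bar
@end example

Both @samp{$#} and @samp{$0} were substituted while expanding
@code{naive_blind}, so the resulting @code{foo} is not blind at all.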
|
||
|
||
@cindex currying arguments
|
||
@cindex argument currying
|
||
Another interesting composition tactic is argument @dfn{currying}, or
|
||
factoring a macro that takes multiple arguments for use in a context
|
||
that provides exactly one argument.
|
||
|
||
@deffn Composite curry (@var{macro}, @dots{})
|
||
Expand to a macro call that takes exactly one argument, then appends
|
||
that argument to the original arguments and invokes @var{macro} with the
|
||
resulting list of arguments.
|
||
@end deffn
|
||
|
||
A demonstration of currying makes the intent of this macro a little more
|
||
obvious. The macro @code{stack_foreach} mentioned earlier is an example
|
||
of a context that provides exactly one argument to a macro name. But
|
||
coupled with currying, we can invoke @code{reverse} with two arguments
|
||
for each definition of a macro stack. This example uses the file
|
||
@file{m4-@value{VERSION}/@/examples/@/curry.m4} included in the
|
||
distribution.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`curry.m4')include(`stack.m4')
|
||
@result{}
|
||
define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
|
||
`reverse(shift($@@)), `$1'')')
|
||
@result{}
|
||
pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
|
||
@result{}
|
||
stack_foreach(`a', `:curry(`reverse', `4')')
|
||
@result{}:1, 4:2, 4:3, 4
|
||
curry(`curry', `reverse', `1')(`2')(`3')
|
||
@result{}3, 2, 1
|
||
@end example
|
||
|
||
Now for the implementation. Notice how @code{curry} leaves off with a
|
||
macro name but no open parenthesis, while still in the middle of
|
||
collecting arguments for @samp{$1}. The macro @code{_curry} is the
|
||
helper macro that takes one argument, then adds it to the list and
|
||
finally supplies the closing parenthesis. The use of a comma inside the
|
||
@code{shift} call allows currying to also work for a macro that takes
|
||
one argument, although it often makes more sense to invoke that macro
|
||
directly rather than going through @code{curry}.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`curry.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# curry(macro, args)
|
||
@result{}# Expand to a macro call that takes one argument, then invoke
|
||
@result{}# macro(args, extra).
|
||
@result{}define(`curry', `$1(shift($@@,)_$0')
|
||
@result{}define(`_curry', ``$1')')
|
||
@result{}divert`'dnl
|
||
@end example
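
As a further sketch of the mechanism, consider currying a two-argument
arithmetic macro.  The macro @code{add} is a hypothetical example, not
part of @file{curry.m4}.

@comment ignore
@example
$ @kbd{m4 -I examples}
include(`curry.m4')
@result{}
define(`add', `eval(`$1 + $2')')
@result{}
curry(`add', `40')(`2')
@result{}42
@end example

The call to @code{curry} expands to @samp{add(`40',`'_curry}, leaving
the argument list of @code{add} open; @code{_curry} then consumes
@samp{(`2')} and supplies both the final argument and the closing
parenthesis.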
|
||
|
||
Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin
|
||
tokens, which are silently flattened to the empty string when passed
|
||
through another text macro. This limitation will be lifted in a future
|
||
release of M4.
|
||
|
||
@cindex renaming macros
|
||
@cindex copying macros
|
||
@cindex macros, copying
|
||
Putting the last few concepts together, it is possible to copy or rename
|
||
an entire stack of macro definitions.
|
||
|
||
@deffn Composite copy (@var{source}, @var{dest})
|
||
@deffnx Composite rename (@var{source}, @var{dest})
|
||
Ensure that @var{dest} is undefined, then define it to the same stack of
|
||
definitions currently in @var{source}. @code{copy} leaves @var{source}
|
||
unchanged, while @code{rename} undefines @var{source}. There are only a
|
||
few macros, such as @code{copy} or @code{defn}, which cannot be copied
|
||
via this macro.
|
||
@end deffn
|
||
|
||
The implementation is relatively straightforward (although since it uses
|
||
@code{curry}, it is unable to copy builtin macros, such as the second
|
||
definition of @code{a} as a synonym for @code{divnum}. See if you can
|
||
design a version that works around this limitation, or @pxref{Improved
|
||
copy, , Answers}).
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`curry.m4')include(`stack.m4')
|
||
@result{}
|
||
define(`rename', `copy($@@)undefine(`$1')')dnl
|
||
define(`copy', `ifdef(`$2', `errprint(`$2 already defined
|
||
')m4exit(`1')',
|
||
`stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl
|
||
pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2')
|
||
@result{}
|
||
copy(`a', `b')
|
||
@result{}
|
||
rename(`b', `c')
|
||
@result{}
|
||
a b c
|
||
@result{}2 b 2
|
||
popdef(`a', `c')c a
|
||
@result{} 0
|
||
popdef(`a', `c')a c
|
||
@result{}1 1
|
||
@end example
|
||
|
||
@node Debugging
|
||
@chapter How to debug macros and input
|
||
|
||
@cindex debugging macros
|
||
@cindex macros, debugging
|
||
Macros written for @code{m4} often do not work as intended on
|
||
the first try (as is the case with most programming languages).
|
||
Fortunately, there is support for macro debugging in @code{m4}.
|
||
|
||
@menu
|
||
* Dumpdef:: Displaying macro definitions
|
||
* Trace:: Tracing macro calls
|
||
* Debug Levels:: Controlling debugging output
|
||
* Debug Output:: Saving debugging output
|
||
@end menu
|
||
|
||
@node Dumpdef
|
||
@section Displaying macro definitions
|
||
|
||
@cindex displaying macro definitions
|
||
@cindex macros, displaying definitions
|
||
@cindex definitions, displaying macro
|
||
@cindex standard error, output to
|
||
If you want to see what a name expands into, you can use the builtin
|
||
@code{dumpdef}:
|
||
|
||
@deffn Builtin dumpdef (@ovar{names@dots{}})
|
||
Accepts any number of arguments. If called without any arguments,
|
||
it displays the definitions of all known names, otherwise it displays
|
||
the definitions of the @var{names} given. The output is printed to the
|
||
current debug file (usually standard error), and is sorted by name. If
|
||
an unknown name is encountered, an error is printed.
|
||
|
||
The expansion of @code{dumpdef} is void.
|
||
@end deffn
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
define(`foo', `Hello world.')
|
||
@result{}
|
||
dumpdef(`foo')
|
||
@error{}foo:@tabchar{}`Hello world.'
|
||
@result{}
|
||
dumpdef(`define')
|
||
@error{}define:@tabchar{}<define>
|
||
@result{}
|
||
@end example
|
||
|
||
The last example shows how builtin macro definitions are displayed.
|
||
The definition that is dumped corresponds to what would occur if the
|
||
macro were to be called at that point, even if other definitions are
|
||
still live due to redefining a macro during argument collection.
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
|
||
@result{}
|
||
f(popdef(`f')dumpdef(`f'))
|
||
@error{}f:@tabchar{}``$0'1'
|
||
@result{}f2
|
||
f(popdef(`f')dumpdef(`f'))
|
||
@error{}m4:stdin:3: undefined macro `f'
|
||
@result{}f1
|
||
@end example
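
Since @code{dumpdef} accepts several names and sorts its output, the
order of the arguments does not matter.  A minimal sketch, using two
hypothetical macros @code{zed} and @code{alpha}:

@comment ignore
@example
$ @kbd{m4 -d}
define(`zed', `last')define(`alpha', `first')
@result{}
dumpdef(`zed', `alpha')
@error{}alpha:@tabchar{}`first'
@error{}zed:@tabchar{}`last'
@result{}
@end example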
|
||
|
||
@xref{Debug Levels}, for information on controlling the details of the
|
||
display.
|
||
|
||
@node Trace
|
||
@section Tracing macro calls
|
||
|
||
@cindex tracing macro expansion
|
||
@cindex macro expansion, tracing
|
||
@cindex expansion, tracing macro
|
||
@cindex standard error, output to
|
||
It is possible to trace macro calls and expansions through the builtins
|
||
@code{traceon} and @code{traceoff}:
|
||
|
||
@deffn Builtin traceon (@ovar{names@dots{}})
|
||
@deffnx Builtin traceoff (@ovar{names@dots{}})
|
||
When called without any arguments, @code{traceon} and @code{traceoff}
|
||
will turn tracing on and off, respectively, for all currently defined
|
||
macros.
|
||
|
||
When called with arguments, only the macros listed in @var{names} are
|
||
affected, whether or not they are currently defined.
|
||
|
||
The expansion of @code{traceon} and @code{traceoff} is void.
|
||
@end deffn
|
||
|
||
Whenever a traced macro is called and the arguments have been collected,
|
||
the call is displayed. If the expansion of the macro call is not void,
|
||
the expansion can be displayed after the call. The output is printed
|
||
to the current debug file (defaulting to standard error, @pxref{Debug
|
||
Output}).
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
define(`foo', `Hello World.')
|
||
@result{}
|
||
define(`echo', `$@@')
|
||
@result{}
|
||
traceon(`foo', `echo')
|
||
@result{}
|
||
foo
|
||
@error{}m4trace: -1- foo -> `Hello World.'
|
||
@result{}Hello World.
|
||
echo(`gnus', `and gnats')
|
||
@error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
|
||
@result{}gnus,and gnats
|
||
@end example
|
||
|
||
The number between dashes is the depth of the expansion. It is one most
|
||
of the time, signifying an expansion at the outermost level, but it
|
||
increases when macro arguments contain unquoted macro calls. The
|
||
maximum number that will appear between dashes is controlled by the
|
||
option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
|
||
, Invoking m4}). Additionally, the option @option{--trace} (or
|
||
@option{-t}) can be used to invoke @code{traceon(@var{name})} before
|
||
parsing input.
|
||
|
||
@comment The explicit -dp neutralizes the testsuite default of -d.
|
||
@comment options: -dp -L3 -tifelse
|
||
@comment status: 1
|
||
@example
|
||
$ @kbd{m4 -L 3 -t ifelse}
|
||
ifelse(`one level')
|
||
@error{}m4trace: -1- ifelse
|
||
@result{}
|
||
ifelse(ifelse(ifelse(`three levels')))
|
||
@error{}m4trace: -3- ifelse
|
||
@error{}m4trace: -2- ifelse
|
||
@error{}m4trace: -1- ifelse
|
||
@result{}
|
||
ifelse(ifelse(ifelse(ifelse(`four levels'))))
|
||
@error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
|
||
@end example
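
The same increase in depth can be observed with user-defined macros.
In this sketch, the hypothetical macros @code{inner} and @code{outer}
are both traced; @code{inner} is expanded while the arguments of
@code{outer} are still being collected, so its trace line appears first
and at depth two.

@comment ignore
@example
$ @kbd{m4 -d}
define(`inner', `in')define(`outer', `<$1>')
@result{}
traceon(`inner', `outer')
@result{}
outer(inner)
@error{}m4trace: -2- inner -> `in'
@error{}m4trace: -1- outer(`in') -> `<in>'
@result{}<in>
@end example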
|
||
|
||
Tracing by name is an attribute that is preserved whether the macro is
|
||
defined or not. This allows the selection of macros to trace before
|
||
those macros are defined.
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
traceoff(`foo')
|
||
@result{}
|
||
traceon(`foo')
|
||
@result{}
|
||
foo
|
||
@result{}foo
|
||
defn(`foo')
|
||
@result{}
|
||
define(`foo', `bar')
|
||
@result{}
|
||
foo
|
||
@error{}m4trace: -1- foo -> `bar'
|
||
@result{}bar
|
||
undefine(`foo')
|
||
@result{}
|
||
ifdef(`foo', `yes', `no')
|
||
@result{}no
|
||
indir(`foo')
|
||
@error{}m4:stdin:9: undefined macro `foo'
|
||
@result{}
|
||
define(`foo', `blah')
|
||
@result{}
|
||
foo
|
||
@error{}m4trace: -1- foo -> `blah'
|
||
@result{}blah
|
||
traceoff
|
||
@result{}
|
||
foo
|
||
@result{}blah
|
||
@end example
|
||
|
||
Tracing even works on builtins. However, @code{defn} (@pxref{Defn})
|
||
does not transfer tracing status.
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
traceon(`traceon')
|
||
@result{}
|
||
traceon(`traceoff')
|
||
@error{}m4trace: -1- traceon(`traceoff')
|
||
@result{}
|
||
traceoff(`traceoff')
|
||
@error{}m4trace: -1- traceoff(`traceoff')
|
||
@result{}
|
||
traceoff(`traceon')
|
||
@result{}
|
||
traceon(`eval', `m4_divnum')
|
||
@result{}
|
||
define(`m4_eval', defn(`eval'))
|
||
@result{}
|
||
define(`m4_divnum', defn(`divnum'))
|
||
@result{}
|
||
eval(divnum)
|
||
@error{}m4trace: -1- eval(`0') -> `0'
|
||
@result{}0
|
||
m4_eval(m4_divnum)
|
||
@error{}m4trace: -2- m4_divnum -> `0'
|
||
@result{}0
|
||
@end example
|
||
|
||
@xref{Debug Levels}, for information on controlling the details of the
|
||
display. The format of the trace output is not specified by
|
||
@acronym{POSIX}, and varies between implementations of @code{m4}.
|
||
|
||
@ignore
|
||
@comment not worth including in the manual, but this tests a trace code
|
||
@comment path that was temporarily broken
|
||
@comment options: -de --trace ifelse
|
||
@example
|
||
$ @kbd{m4 -de --trace ifelse}
|
||
define(`e', `ifelse(`$1', `$2', `ifelse(`$1', `$2', `e(shift($@@))')')')
|
||
@result{}
|
||
e(`1', `1')
|
||
@error{}m4trace: -1- ifelse -> ifelse(`1', `1', `e(shift(`1',`1'))')
|
||
@error{}m4trace: -1- ifelse -> e(shift(`1',`1'))
|
||
@error{}m4trace: -1- ifelse
|
||
@result{}
|
||
@end example
|
||
@end ignore
|
||
|
||
@node Debug Levels
|
||
@section Controlling debugging output
|
||
|
||
@cindex controlling debugging output
|
||
@cindex debugging output, controlling
|
||
The @option{-d} option to @code{m4} (or @option{--debug},
|
||
@pxref{Debugging options, , Invoking m4}) controls the amount of detail
|
||
presented in three
|
||
categories of output. Trace output is requested by @code{traceon}
|
||
(@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
|
||
relation to a macro invocation. Debug output tracks useful events not
|
||
associated with a macro invocation, and each line is prefixed by
|
||
@samp{m4debug:}. Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
|
||
affected, with no prefix added to the output lines.
|
||
|
||
The @var{flags} following the option can be one or more of the
|
||
following:
|
||
|
||
@table @code
|
||
@item a
|
||
In trace output, show the actual arguments that were collected before
|
||
invoking the macro. This applies to all macro calls if the @samp{t}
|
||
flag is used, otherwise only the macros covered by calls of
|
||
@code{traceon}. Arguments are subject to length truncation specified by
|
||
the command line option @option{--arglength} (or @option{-l}).
|
||
|
||
@item c
|
||
In trace output, show several trace lines for each macro call. A line
|
||
is shown when the macro is seen, but before the arguments are collected;
|
||
a second line when the arguments have been collected and a third line
|
||
after the call has completed.
|
||
|
||
@item e
|
||
In trace output, show the expansion of each macro call, if it is not
|
||
void. This applies to all macro calls if the @samp{t} flag is used,
|
||
otherwise only the macros covered by calls of @code{traceon}. The
|
||
expansion is subject to length truncation specified by the command line
|
||
option @option{--arglength} (or @option{-l}).
|
||
|
||
@item f
|
||
In debug and trace output, include the name of the current input file in
|
||
the output line.
|
||
|
||
@item i
|
||
In debug output, print a message each time the current input file is
|
||
changed.
|
||
|
||
@item l
|
||
In debug and trace output, include the current input line number in the
|
||
output line.
|
||
|
||
@item p
|
||
In debug output, print a message when a named file is found through the
|
||
path search mechanism (@pxref{Search Path}), giving the actual file name
|
||
used.
|
||
|
||
@item q
|
||
In trace and dumpdef output, quote actual arguments and macro expansions
|
||
in the display with the current quotes. This is useful in connection
|
||
with the @samp{a} and @samp{e} flags above.
|
||
|
||
@item t
|
||
In trace output, trace all macro calls made in this invocation of
|
||
@code{m4}, regardless of the settings of @code{traceon}.
|
||
|
||
@item x
|
||
In trace output, add a unique `macro call id' to each line of the trace
|
||
output. This is useful in connection with the @samp{c} flag above.
|
||
|
||
@item V
|
||
A shorthand for all of the above flags.
|
||
@end table
|
||
|
||
If no flags are specified with the @option{-d} option, the default is
|
||
@samp{aeq}. The examples throughout this manual assume the default
|
||
flags.
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
There is a builtin macro @code{debugmode}, which allows on-the-fly control of
|
||
the debugging output format:
|
||
|
||
@deffn Builtin debugmode (@ovar{flags})
|
||
The argument @var{flags} should be a subset of the letters listed above.
|
||
As special cases, if the argument starts with a @samp{+}, the flags are
|
||
added to the current debug flags, and if it starts with a @samp{-}, they
|
||
are removed. If no argument is present, all debugging flags are cleared
|
||
(as if no @option{-d} was given), and with an empty argument the flags
|
||
are reset to the default of @samp{aeq}.
|
||
|
||
The expansion of @code{debugmode} is void.
|
||
@end deffn
|
||
|
||
@comment The explicit -dp neutralizes the testsuite default of -d.
|
||
@comment options: -dp
|
||
@example
|
||
$ @kbd{m4}
|
||
define(`foo', `FOO')
|
||
@result{}
|
||
traceon(`foo')
|
||
@result{}
|
||
debugmode()
|
||
@result{}
|
||
foo
|
||
@error{}m4trace: -1- foo -> `FOO'
|
||
@result{}FOO
|
||
debugmode
|
||
@result{}
|
||
foo
|
||
@error{}m4trace: -1- foo
|
||
@result{}FOO
|
||
debugmode(`+l')
|
||
@result{}
|
||
foo
|
||
@error{}m4trace:8: -1- foo
|
||
@result{}FOO
|
||
@end example
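
Flags can also be removed one at a time.  The following sketch starts
from the default flags of @samp{aeq} and drops @samp{e}, so that traced
calls are still shown with their arguments but without their expansion.

@comment ignore
@example
$ @kbd{m4 -d}
define(`echo', `$@@')traceon(`echo')
@result{}
echo(`a', `b')
@error{}m4trace: -1- echo(`a', `b') -> ``a',`b''
@result{}a,b
debugmode(`-e')
@result{}
echo(`a', `b')
@error{}m4trace: -1- echo(`a', `b')
@result{}a,b
@end example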
|
||
|
||
The following example demonstrates the behavior of length truncation,
|
||
when specified on the command line. Note that each argument and the
|
||
final result are individually truncated. Also, the special tokens for
|
||
builtin functions are not truncated.
|
||
|
||
@comment options: -l6
|
||
@example
|
||
$ @kbd{m4 -d -l 6}
|
||
define(`echo', `$@@')debugmode(`+t')
|
||
@result{}
|
||
echo(`1', `long string')
|
||
@error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
|
||
@result{}1,long string
|
||
indir(`echo', defn(`changequote'))
|
||
@error{}m4trace: -2- defn(`change...')
|
||
@error{}m4trace: -1- indir(`echo', <changequote>) -> ``''
|
||
@result{}
|
||
@end example
|
||
|
||
This example shows the effects of the debug flags that are not related
|
||
to macro tracing.
|
||
|
||
@comment examples
|
||
@comment options: -dip
|
||
@example
|
||
$ @kbd{m4 -dip -I examples}
|
||
@error{}m4debug: input read from stdin
|
||
include(`foo')dnl
|
||
@error{}m4debug: path search for `foo' found `examples/foo'
|
||
@error{}m4debug: input read from examples/foo
|
||
@result{}bar
|
||
@error{}m4debug: input reverted to stdin, line 1
|
||
^D
|
||
@error{}m4debug: input exhausted
|
||
@end example
|
||
|
||
@node Debug Output
|
||
@section Saving debugging output
|
||
|
||
@cindex saving debugging output
|
||
@cindex debugging output, saving
|
||
@cindex output, saving debugging
|
||
@cindex @acronym{GNU} extensions
|
||
Debug and tracing output can be redirected to files using either the
|
||
@option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
|
||
Invoking m4}), or with the builtin macro @code{debugfile}:
|
||
|
||
@deffn Builtin debugfile (@ovar{file})
|
||
Sends all further debug and trace output to @var{file}, opened in append
|
||
mode. If @var{file} is the empty string, debug and trace output are
|
||
discarded. If @code{debugfile} is called without any arguments, debug
|
||
and trace output are sent to standard error. This does not affect
|
||
warnings, error messages, or @code{errprint} output, which are
|
||
always sent to standard error. If @var{file} cannot be opened, the
|
||
current debug file is unchanged, and an error is issued.
|
||
|
||
The expansion of @code{debugfile} is void.
|
||
@end deffn
|
||
|
||
@example
|
||
$ @kbd{m4 -d}
|
||
traceon(`divnum')
|
||
@result{}
|
||
divnum(`extra')
|
||
@error{}m4:stdin:2: Warning: excess arguments to builtin `divnum' ignored
|
||
@error{}m4trace: -1- divnum(`extra') -> `0'
|
||
@result{}0
|
||
debugfile()
|
||
@result{}
|
||
divnum(`extra')
|
||
@error{}m4:stdin:4: Warning: excess arguments to builtin `divnum' ignored
|
||
@result{}0
|
||
debugfile
|
||
@result{}
|
||
divnum
|
||
@error{}m4trace: -1- divnum -> `0'
|
||
@result{}0
|
||
@end example
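
Redirecting to a named file works the same way.  In this sketch, the
file name @file{trace.log} is an arbitrary choice for illustration; it
is created in the current directory if it does not yet exist.

@comment ignore
@example
$ @kbd{m4 -d}
traceon(`divnum')
@result{}
debugfile(`trace.log')
@result{}
divnum
@result{}0
debugfile
@result{}
divnum
@error{}m4trace: -1- divnum -> `0'
@result{}0
@end example

The trace line for the first call to @code{divnum} is appended to
@file{trace.log} rather than being shown on standard error.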
|
||
|
||
@node Input Control
|
||
@chapter Input control
|
||
|
||
This chapter describes various builtin macros for controlling the input
|
||
to @code{m4}.
|
||
|
||
@menu
|
||
* Dnl:: Deleting whitespace in input
|
||
* Changequote:: Changing the quote characters
|
||
* Changecom:: Changing the comment delimiters
|
||
* Changeword:: Changing the lexical structure of words
|
||
* M4wrap:: Saving text until end of input
|
||
@end menu
|
||
|
||
@node Dnl
|
||
@section Deleting whitespace in input
|
||
|
||
@cindex deleting whitespace in input
|
||
@cindex discarding input
|
||
@cindex input, discarding
|
||
The builtin @code{dnl} stands for ``Discard to Next Line'':
|
||
|
||
@deffn Builtin dnl
|
||
All characters, up to and including the next newline, are discarded
|
||
without performing any macro expansion. A warning is issued if the end
|
||
of the file is encountered without a newline.
|
||
|
||
The expansion of @code{dnl} is void.
|
||
@end deffn
|
||
|
||
It is often used in connection with @code{define}, to remove the
|
||
newline that follows the call to @code{define}. Thus
|
||
|
||
@example
|
||
define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
|
||
foo
|
||
@result{}Macro foo.
|
||
@end example
|
||
|
||
The input up to and including the next newline is discarded, as opposed
|
||
to the way comments are treated (@pxref{Comments}).
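
The difference can be seen in a short sketch: a comment is copied to
the output together with its newline, while the text after @code{dnl}
disappears along with the newline, so that the following line of input
is joined to the current output line.

@comment ignore
@example
define(`foo', `bar')dnl
foo # foo
@result{}bar # foo
foo dnl foo
foo
@result{}bar bar
@end example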
|
||
|
||
Usually, @code{dnl} is immediately followed by an end of line or some
|
||
other whitespace. @acronym{GNU} @code{m4} will produce a warning diagnostic if
|
||
@code{dnl} is followed by an open parenthesis. In this case, @code{dnl}
|
||
will collect and process all arguments, looking for a matching close
|
||
parenthesis. All predictable side effects resulting from this
|
||
collection will take place. @code{dnl} will return no output. The
|
||
input following the matching close parenthesis up to and including the
|
||
next newline, on whichever line contains it, will still be discarded.
|
||
|
||
@example
|
||
dnl(`args are ignored, but side effects occur',
|
||
define(`foo', `like this')) while this text is ignored: undefine(`foo')
|
||
@error{}m4:stdin:1: Warning: excess arguments to builtin `dnl' ignored
|
||
See how `foo' was defined, foo?
|
||
@result{}See how foo was defined, like this?
|
||
@end example
|
||
|
||
If the end of file is encountered without a newline character, a
|
||
warning is issued and @code{dnl} stops consuming input.
|
||
|
||
@example
|
||
m4wrap(`m4wrap(`2 hi
|
||
')0 hi dnl 1 hi')
|
||
@result{}
|
||
define(`hi', `HI')
|
||
@result{}
|
||
^D
|
||
@error{}m4:stdin:1: Warning: end of file treated as newline
|
||
@result{}0 HI 2 HI
|
||
@end example
|
||
|
||
@node Changequote
|
||
@section Changing the quote characters
|
||
|
||
@cindex changing quote delimiters
|
||
@cindex quote delimiters, changing
|
||
@cindex delimiters, changing
|
||
The default quote delimiters can be changed with the builtin
|
||
@code{changequote}:
|
||
|
||
@deffn Builtin changequote (@dvar{start, `}, @dvar{end, '})
|
||
This sets @var{start} as the new begin-quote delimiter and @var{end} as
|
||
the new end-quote delimiter. If both arguments are missing, the default
|
||
quotes (@code{`} and @code{'}) are used. If @var{start} is void, then
|
||
quoting is disabled. Otherwise, if @var{end} is missing or void, the
|
||
default end-quote delimiter (@code{'}) is used. The quote delimiters
|
||
can be of any length.
|
||
|
||
The expansion of @code{changequote} is void.
|
||
@end deffn
|
||
|
||
@example
|
||
changequote(`[', `]')
|
||
@result{}
|
||
define([foo], [Macro [foo].])
|
||
@result{}
|
||
foo
|
||
@result{}Macro foo.
|
||
@end example
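
As the description above states, leaving out @var{end} keeps
@samp{'} as the end-quote delimiter.  A minimal sketch of that case,
followed by a bare @code{changequote} to restore the defaults:

@comment ignore
@example
changequote(`[')
@result{}
define([foo', [bar')
@result{}
foo
@result{}bar
changequote
@result{}
`foo'
@result{}foo
@end example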
|
||
|
||
The quotation strings can safely contain eight-bit characters.
|
||
@ignore
|
||
@comment Yuck. I know of no clean way to render an 8-bit character in
|
||
@comment both info and dvi. This example uses the `open-guillemot' and
|
||
@comment `close-guillemot' characters of the Latin-1 character set.
|
||
|
||
@example
|
||
define(`a', `b')
|
||
@result{}
|
||
«a»
@result{}«b»
changequote(`«', `»')
@result{}
«a»
|
||
@result{}a
|
||
@end example
|
||
@end ignore
|
||
If no single character is appropriate, @var{start} and @var{end} can be
|
||
of any length. Other implementations cap the delimiter length to five
|
||
characters, but @acronym{GNU} has no inherent limit.
|
||
|
||
@example
|
||
changequote(`[[[', `]]]')
|
||
@result{}
|
||
define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
|
||
@result{}
|
||
foo
|
||
@result{}Macro [[foo]].
|
||
@end example
|
||
|
||
Calling @code{changequote} with @var{start} as the empty string will
|
||
effectively disable the quoting mechanism, leaving no way to quote text.
|
||
However, using an empty string is not portable, as some other
|
||
implementations of @code{m4} revert to the default quoting, while others
|
||
preserve the prior non-empty delimiter. If @var{start} is not empty,
|
||
then an empty @var{end} will use the default end-quote delimiter of
|
||
@samp{'}, as otherwise, it would be impossible to end a quoted string.
|
||
Again, this is not portable, as some other @code{m4} implementations
|
||
reuse @var{start} as the end-quote delimiter, while others preserve the
|
||
previous non-empty value. Omitting both arguments restores the default
|
||
begin-quote and end-quote delimiters; fortunately this behavior is
|
||
portable to all implementations of @code{m4}.
|
||
|
||
@example
|
||
define(`foo', `Macro `FOO'.')
|
||
@result{}
|
||
changequote(`', `')
|
||
@result{}
|
||
foo
|
||
@result{}Macro `FOO'.
|
||
`foo'
|
||
@result{}`Macro `FOO'.'
|
||
changequote(`,)
|
||
@result{}
|
||
foo
|
||
@result{}Macro FOO.
|
||
@end example
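
As a small illustration of restoring the defaults, the following sketch
temporarily switches to square brackets so that a definition can contain
an apostrophe, then calls @code{changequote} with no arguments. The
macro name @code{greet} is made up for this example.

@example
changequote(`[', `]')
@result{}
define([greet], [Don't panic])
@result{}
changequote
@result{}
greet
@result{}Don't panic
@end example

@noindent
Once the default quotes are back in effect, the lone @samp{'} in the
expansion is harmless, because an end-quote delimiter has no special
meaning outside of a quoted string.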
|
||
|
||
There is no way in @code{m4} to quote a string containing an unmatched
|
||
begin-quote, except using @code{changequote} to change the current
|
||
quotes.
|
||
|
||
If the quotes should be changed from, say, @samp{[} to @samp{[[},
|
||
temporary quote characters have to be defined. To achieve this, two
|
||
calls of @code{changequote} must be made, one for the temporary quotes
|
||
and one for the new quotes.
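
For instance, assuming the current quotes are @samp{[} and @samp{]},
the following sketch passes through the temporary pair @samp{<<} and
@samp{>>} (an arbitrary choice made for this illustration) on the way to
@samp{[[} and @samp{]]}. A direct call cannot work, because the new
delimiter @samp{[[} would itself be parsed as two nested begin-quotes
under the current quoting rules.

@example
changequote(`[', `]')
@result{}
changequote([<<], [>>])
@result{}
changequote(<<[[>>, <<]]>>)
@result{}
define([[foo]], [[Macro [FOO].]])
@result{}
foo
@result{}Macro [FOO].
@end example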
|
||
|
||
Macros are recognized in preference to the begin-quote string, so if a
|
||
prefix of @var{start} can be recognized as part of a potential macro
|
||
name, the quoting mechanism is effectively disabled. Unless you use
|
||
@code{changeword} (@pxref{Changeword}), this means that @var{start}
|
||
should not begin with a letter, digit, or @samp{_} (underscore).
|
||
However, even though quoted strings are not recognized, the quote
|
||
characters can still be discerned in macro expansion and in trace
|
||
output.
|
||
|
||
@example
|
||
define(`echo', `$@@')
|
||
@result{}
|
||
define(`hi', `HI')
|
||
@result{}
|
||
changequote(`q', `Q')
|
||
@result{}
|
||
q hi Q hi
|
||
@result{}q HI Q HI
|
||
echo(hi)
|
||
@result{}qHIQ
|
||
changequote
|
||
@result{}
|
||
changequote(`-', `EOF')
|
||
@result{}
|
||
- hi EOF hi
|
||
@result{} hi HI
|
||
changequote
|
||
@result{}
|
||
changequote(`1', `2')
|
||
@result{}
|
||
hi1hi2
|
||
@result{}hi1hi2
|
||
hi 1hi2
|
||
@result{}HI hi
|
||
@end example
|
||
|
||
Quotes are recognized in preference to argument collection. In
|
||
particular, if @var{start} is a single @samp{(}, then argument
|
||
collection is effectively disabled. For portability with other
|
||
implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
|
||
@samp{)} as the first character in @var{start}.
|
||
|
||
@example
|
||
define(`echo', `$#:$@@:')
|
||
@result{}
|
||
define(`hi', `HI')
|
||
@result{}
|
||
changequote(`(',`)')
|
||
@result{}
|
||
echo(hi)
|
||
@result{}0::hi
|
||
changequote
|
||
@result{}
|
||
changequote(`((', `))')
|
||
@result{}
|
||
echo(hi)
|
||
@result{}1:HI:
|
||
echo((hi))
|
||
@result{}0::hi
|
||
changequote
|
||
@result{}
|
||
changequote(`,', `)')
|
||
@result{}
|
||
echo(hi,hi)bye)
|
||
@result{}1:HIhibye:
|
||
@end example
|
||
|
||
However, if you are not worried about portability, using @samp{(} and
@samp{)} as quoting characters has an interesting property: you can use
them to compute a quoted string containing the expansion of any quoted
text, as long as the expansion results in both balanced quotes and
balanced parentheses. The trick is realizing that @code{expand} uses
@samp{$1} unquoted, to trigger its expansion using the normal quoting
characters, but uses extra parentheses to group unquoted commas that
occur in the expansion without consuming whitespace following those
commas. Then @code{_expand} uses @code{changequote} to convert the
extra parentheses back into quoting characters. Note that it takes two
more @code{changequote} invocations to restore the original quotes.
Contrast the behavior on whitespace when using @samp{$*}, via
@code{quote}, to attempt the same task.
|
||
|
||
@example
|
||
changequote(`[', `]')dnl
|
||
define([a], [1, (b)])dnl
|
||
define([b], [2])dnl
|
||
define([quote], [[$*]])dnl
|
||
define([expand], [_$0(($1))])dnl
|
||
define([_expand],
|
||
[changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
|
||
expand([a, a, [a, a], [[a, a]]])
|
||
@result{}1, (2), 1, (2), a, a, [a, a]
|
||
quote(a, a, [a, a], [[a, a]])
|
||
@result{}1,(2),1,(2),a, a,[a, a]
|
||
@end example
|
||
|
||
If @var{end} is a prefix of @var{start}, the end-quote will be
|
||
recognized in preference to a nested begin-quote. In particular,
|
||
changing the quotes to have the same string for @var{start} and
|
||
@var{end} disables nesting of quotes. When quote nesting is disabled,
|
||
it is impossible to double-quote strings across macro expansions, so
|
||
using the same string is not done very often.
|
||
|
||
@example
|
||
define(`hi', `HI')
|
||
@result{}
|
||
changequote(`""', `"')
|
||
@result{}
|
||
""hi"""hi"
|
||
@result{}hihi
|
||
""hi" ""hi"
|
||
@result{}hi hi
|
||
""hi"" "hi"
|
||
@result{}hi" "HI"
|
||
changequote
|
||
@result{}
|
||
`hi`hi'hi'
|
||
@result{}hi`hi'hi
|
||
changequote(`"', `"')
|
||
@result{}
|
||
"hi"hi"hi"
|
||
@result{}hiHIhi
|
||
@end example
|
||
|
||
@ignore
|
||
@comment And another stress test, not worth documenting in the manual.
|
||
@example
|
||
define(`aaaaaaaaaaaaaaaaaaaa', `A')define(`q', `"$@@"')
|
||
@result{}
|
||
changequote(`"', `"')
|
||
@result{}
|
||
q(q("aaaaaaaaaaaaaaaaaaaa", "a"))
|
||
@result{}A,a
|
||
@end example
|
||
@end ignore
|
||
|
||
It is an error if the end of file occurs within a quoted string.
|
||
|
||
@comment status: 1
|
||
@example
|
||
`hello world'
|
||
@result{}hello world
|
||
`dangling quote
|
||
^D
|
||
@error{}m4:stdin:2: ERROR: end of file in string
|
||
@end example
|
||
|
||
@comment status: 1
|
||
@example
|
||
ifelse(`dangling quote
|
||
^D
|
||
@error{}m4:stdin:1: ERROR: end of file in string
|
||
@end example
|
||
|
||
@node Changecom
|
||
@section Changing the comment delimiters
|
||
|
||
@cindex changing comment delimiters
|
||
@cindex comment delimiters, changing
|
||
@cindex delimiters, changing
|
||
The default comment delimiters can be changed with the builtin
|
||
macro @code{changecom}:
|
||
|
||
@deffn Builtin changecom (@ovar{start}, @dvar{end, @key{NL}})
|
||
This sets @var{start} as the new begin-comment delimiter and @var{end}
|
||
as the new end-comment delimiter. If both arguments are missing, or
|
||
@var{start} is void, then comments are disabled. Otherwise, if
|
||
@var{end} is missing or void, the default end-comment delimiter of
|
||
newline is used. The comment delimiters can be of any length.
|
||
|
||
The expansion of @code{changecom} is void.
|
||
@end deffn
|
||
|
||
@example
|
||
define(`comment', `COMMENT')
|
||
@result{}
|
||
# A normal comment
|
||
@result{}# A normal comment
|
||
changecom(`/*', `*/')
|
||
@result{}
|
||
# Not a comment anymore
|
||
@result{}# Not a COMMENT anymore
|
||
But: /* this is a comment now */ while this is not a comment
|
||
@result{}But: /* this is a comment now */ while this is not a COMMENT
|
||
@end example
|
||
|
||
@cindex comments, copied to output
|
||
Note how comments are copied to the output, much as if they were quoted
|
||
strings. If you want the text inside a comment expanded, quote the
|
||
begin-comment delimiter.
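
The following short sketch contrasts the two cases, using the default
comment delimiters. The macro @code{comment} is defined only for this
illustration.

@example
define(`comment', `COMMENT')
@result{}
# a comment is copied verbatim
@result{}# a comment is copied verbatim
`#' but this comment is expanded
@result{}# but this COMMENT is expanded
@end example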
|
||
|
||
Calling @code{changecom} without any arguments, or with @var{start} as
|
||
the empty string, will effectively disable the commenting mechanism. To
|
||
restore the original comment start of @samp{#}, you must explicitly ask
|
||
for it. If @var{start} is not empty, then an empty @var{end} will use
|
||
the default end-comment delimiter of newline, as otherwise, it would be
|
||
impossible to end a comment. However, this is not portable, as some
|
||
other @code{m4} implementations preserve the previous non-empty
|
||
delimiters instead.
|
||
|
||
@example
|
||
define(`comment', `COMMENT')
|
||
@result{}
|
||
changecom
|
||
@result{}
|
||
# Not a comment anymore
|
||
@result{}# Not a COMMENT anymore
|
||
changecom(`#', `')
|
||
@result{}
|
||
# comment again
|
||
@result{}# comment again
|
||
@end example
|
||
|
||
The comment strings can safely contain eight-bit characters.
|
||
@ignore
|
||
@comment Yuck. I know of no clean way to render an 8-bit character in
|
||
@comment both info and dvi. This example uses the `open-guillemot' and
|
||
@comment `close-guillemot' characters of the Latin-1 character set.
|
||
|
||
@example
|
||
define(`a', `b')
|
||
@result{}
|
||
«a»
@result{}«b»
changecom(`«', `»')
@result{}
«a»
@result{}«a»
|
||
@end example
|
||
@end ignore
|
||
If no single character is appropriate, @var{start} and @var{end} can be
|
||
of any length. Other implementations cap the delimiter length to five
|
||
characters, but @acronym{GNU} has no inherent limit.
|
||
|
||
Comments are recognized in preference to macros. However, this is not
compatible with other implementations, where macros and even quoting
take precedence over comments, so it may change in a future release.
|
||
For portability, this means that @var{start} should not begin with a
|
||
letter, digit, or @samp{_} (underscore), and that neither the
|
||
start-quote nor the start-comment string should be a prefix of the
|
||
other.
|
||
|
||
@example
|
||
define(`hi', `HI')
|
||
@result{}
|
||
define(`hi1hi2', `hello')
|
||
@result{}
|
||
changecom(`q', `Q')
|
||
@result{}
|
||
q hi Q hi
|
||
@result{}q hi Q HI
|
||
changecom(`1', `2')
|
||
@result{}
|
||
hi1hi2
|
||
@result{}hello
|
||
hi 1hi2
|
||
@result{}HI 1hi2
|
||
@end example
|
||
|
||
Comments are recognized in preference to argument collection. In
|
||
particular, if @var{start} is a single @samp{(}, then argument
|
||
collection is effectively disabled. For portability with other
|
||
implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
|
||
@samp{)} as the first character in @var{start}.
|
||
|
||
@example
|
||
define(`echo', `$#:$*:$@@:')
|
||
@result{}
|
||
define(`hi', `HI')
|
||
@result{}
|
||
changecom(`(',`)')
|
||
@result{}
|
||
echo(hi)
|
||
@result{}0:::(hi)
|
||
changecom
|
||
@result{}
|
||
changecom(`((', `))')
|
||
@result{}
|
||
echo(hi)
|
||
@result{}1:HI:HI:
|
||
echo((hi))
|
||
@result{}0:::((hi))
|
||
changecom(`,', `)')
|
||
@result{}
|
||
echo(hi,hi)bye)
|
||
@result{}1:HI,hi)bye:HI,hi)bye:
|
||
changecom
|
||
@result{}
|
||
echo(hi,`,`'hi',hi)
|
||
@result{}3:HI,,HI,HI:HI,,`'hi,HI:
|
||
echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
|
||
@result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
|
||
@end example
|
||
|
||
It is an error if the end of file occurs within a comment.
|
||
|
||
@comment status: 1
|
||
@example
|
||
changecom(`/*', `*/')
|
||
@result{}
|
||
/*dangling comment
|
||
^D
|
||
@error{}m4:stdin:2: ERROR: end of file in comment
|
||
@end example
|
||
|
||
@node Changeword
|
||
@section Changing the lexical structure of words
|
||
|
||
@cindex lexical structure of words
|
||
@cindex words, lexical structure of
|
||
@cindex syntax, changing
|
||
@cindex changing syntax
|
||
@cindex regular expressions
|
||
@quotation
|
||
The macro @code{changeword} and all associated functionality are
experimental. It is only available if the @option{--enable-changeword}
option was given to @command{configure} at @acronym{GNU} @code{m4}
installation time. The functionality will go away in the future, to be
replaced by other new features that are more efficient at providing the
same capabilities. @emph{Do not rely on it}. Please direct your
comments about it the same way you would report bugs.
|
||
@end quotation
|
||
|
||
A file being processed by @code{m4} is split into quoted strings, words
|
||
(potential macro names) and simple tokens (any other single character).
|
||
Initially a word is defined by the following regular expression:
|
||
|
||
@comment ignore
|
||
@example
|
||
[_a-zA-Z][_a-zA-Z0-9]*
|
||
@end example
|
||
|
||
Using @code{changeword}, you can change this regular expression:
|
||
|
||
@deffn {Optional builtin} changeword (@var{regex})
|
||
Changes the regular expression for recognizing macro names to be
|
||
@var{regex}. If @var{regex} is empty, use
|
||
@samp{[_a-zA-Z][_a-zA-Z0-9]*}. @var{regex} must obey the constraint
|
||
that every prefix of the desired final pattern is also accepted by the
|
||
regular expression. If @var{regex} contains grouping parentheses, the
|
||
macro invoked is the portion that matched the first group, rather than
|
||
the entire matching string.
|
||
|
||
The expansion of @code{changeword} is void.
|
||
The macro @code{changeword} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
Relaxing the lexical rules of @code{m4} might be useful (for example) if
|
||
you wanted to apply translations to a file of numbers:
|
||
|
||
@example
|
||
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
||
')m4exit(`77')')dnl
|
||
changeword(`[_a-zA-Z0-9]+')
|
||
@result{}
|
||
define(`1', `0')1
|
||
@result{}0
|
||
@end example
|
||
|
||
Tightening the lexical rules is less useful, because it will generally
|
||
make some of the builtins unavailable. You could use it to prevent
|
||
accidental call of builtins, for example:
|
||
|
||
@example
|
||
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
||
')m4exit(`77')')dnl
|
||
define(`_indir', defn(`indir'))
|
||
@result{}
|
||
changeword(`_[_a-zA-Z0-9]*')
|
||
@result{}
|
||
esyscmd(`foo')
|
||
@result{}esyscmd(foo)
|
||
_indir(`esyscmd', `echo hi')
|
||
@result{}hi
|
||
@result{}
|
||
@end example
|
||
|
||
Because @code{m4} constructs its words a character at a time, there
is a restriction on the regular expressions that may be passed to
@code{changeword}: if your regular expression accepts @samp{foo}, it
must also accept @samp{f} and @samp{fo}.
|
||
|
||
@example
|
||
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
||
')m4exit(`77')')dnl
|
||
define(`foo
|
||
', `bar
|
||
')
|
||
@result{}
|
||
dnl This example wants to recognize changeword, dnl, and `foo\n'.
|
||
dnl First, we check that our regexp will match.
|
||
regexp(`changeword', `[cd][a-z]*\|foo[
|
||
]')
|
||
@result{}0
|
||
regexp(`foo
|
||
', `[cd][a-z]*\|foo[
|
||
]')
|
||
@result{}0
|
||
regexp(`f', `[cd][a-z]*\|foo[
|
||
]')
|
||
@result{}-1
|
||
foo
|
||
@result{}foo
|
||
changeword(`[cd][a-z]*\|foo[
|
||
]')
|
||
@result{}
|
||
dnl Even though `foo\n' matches, we forgot to allow `f'.
|
||
foo
|
||
@result{}foo
|
||
changeword(`[cd][a-z]*\|fo*[
|
||
]?')
|
||
@result{}
|
||
dnl Now we can call `foo\n'.
|
||
foo
|
||
@result{}bar
|
||
@end example
|
||
|
||
@ignore
|
||
@comment One more test of including newline in a macro name; but this
|
||
@comment does not need to be displayed in the manual. This ensures
|
||
@comment that line numbering is correct when dnl cuts across include
|
||
@comment file boundaries, and when __file__ or __line__ is the last
|
||
@comment token in an include file.
|
||
|
||
@example
|
||
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
||
')m4exit(`77')')dnl
|
||
define(`bar
|
||
', defn(`dnl'))dnl
|
||
define(`baz', `dnl
|
||
include(`foo') ignored
|
||
dnl')dnl
|
||
changeword(`\([_a-zA-Z][_a-zA-Z0-9]*\|bar
|
||
\)')
|
||
@result{}
|
||
__file__:__line__
|
||
@result{}stdin:10
|
||
include(`foo') ignored
|
||
__file__:__line__
|
||
@result{}stdin:12
|
||
baz ignored
|
||
__file__:__line__
|
||
@result{}stdin:14
|
||
define(`bar
|
||
', defn(`__file__'))
|
||
@result{}
|
||
include(`foo')
|
||
@result{}examples/foo
|
||
define(`bar
|
||
', defn(`__line__'))
|
||
@result{}
|
||
include(`foo')
|
||
@result{}1
|
||
__file__:__line__
|
||
@result{}stdin:21
|
||
@end example
|
||
@end ignore
|
||
|
||
@code{changeword} has another function. If the regular expression
|
||
supplied contains any grouped subexpressions, then text outside
|
||
the first of these is discarded before symbol lookup. So:
|
||
|
||
@example
|
||
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
||
')m4exit(`77')')dnl
|
||
ifdef(`__unix__', ,
|
||
`errprint(` skipping: syscmd does not have unix semantics
|
||
')m4exit(`77')')dnl
|
||
changecom(`/*', `*/')dnl
|
||
define(`foo', `bar')dnl
|
||
changeword(`#\([_a-zA-Z0-9]*\)')
|
||
@result{}
|
||
#esyscmd(`echo foo \#foo')
|
||
@result{}foo bar
|
||
@result{}
|
||
@end example
|
||
|
||
@code{m4} now requires a @samp{#} mark at the beginning of every
|
||
macro invocation, so one can use @code{m4} to preprocess plain
|
||
text without losing various words like @samp{divert}.
|
||
|
||
In @code{m4}, macro substitution is based on text, while in @TeX{}, it
|
||
is based on tokens. @code{changeword} can throw this difference into
|
||
relief. For example, here is the same idea represented in @TeX{} and
|
||
@code{m4}. First, the @TeX{} version:
|
||
|
||
@comment ignore
|
||
@example
|
||
\def\a@{\message@{Hello@}@}
|
||
\catcode`\@@=0
|
||
\catcode`\\=12
|
||
@@a
|
||
@@bye
|
||
@result{}Hello
|
||
@end example
|
||
|
||
@noindent
|
||
Then, the @code{m4} version:
|
||
|
||
@example
|
||
ifdef(`changeword', `', `errprint(` skipping: no changeword support
|
||
')m4exit(`77')')dnl
|
||
define(`a', `errprint(`Hello')')dnl
|
||
changeword(`@@\([_a-zA-Z0-9]*\)')
|
||
@result{}
|
||
@@a
|
||
@result{}errprint(Hello)
|
||
@end example
|
||
|
||
In the @TeX{} example, the first line defines a macro @code{a} to
|
||
print the message @samp{Hello}. The second line defines @key{@@} to
|
||
be usable instead of @key{\} as an escape character. The third line
|
||
defines @key{\} to be a normal printing character, not an escape.
|
||
The fourth line invokes the macro @code{a}. So, when @TeX{} is run
|
||
on this file, it displays the message @samp{Hello}.
|
||
|
||
When the @code{m4} example is passed through @code{m4}, it outputs
@samp{errprint(Hello)}. The reason for this is that @TeX{} does
lexical analysis of a macro definition when the macro is @emph{defined},
whereas @code{m4} just stores the text, postponing the lexical analysis
until the macro is @emph{used}.
|
||
|
||
You should note that using @code{changeword} will slow @code{m4} down
|
||
by a factor of about seven, once it is changed to something other
|
||
than the default regular expression. You can invoke @code{changeword}
|
||
with the empty string to restore the default word definition, and regain
|
||
the parsing speed.
|
||
|
||
@node M4wrap
|
||
@section Saving text until end of input
|
||
|
||
@cindex saving input
|
||
@cindex input, saving
|
||
@cindex deferring expansion
|
||
@cindex expansion, deferring
|
||
It is possible to `save' some text until the end of the normal input has
been seen. The saved text is read again by @code{m4} once the normal
input has been exhausted. This feature is normally used to initiate
cleanup actions before normal exit, e.g., deleting temporary files.
|
||
|
||
To save input text, use the builtin @code{m4wrap}:
|
||
|
||
@deffn Builtin m4wrap (@var{string}, @dots{})
|
||
Stores @var{string} in a safe place, to be reread when end of input is
|
||
reached. As a @acronym{GNU} extension, additional arguments are
|
||
concatenated with a space to the @var{string}.
|
||
|
||
The expansion of @code{m4wrap} is void.
|
||
The macro @code{m4wrap} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
define(`cleanup', `This is the `cleanup' action.
|
||
')
|
||
@result{}
|
||
m4wrap(`cleanup')
|
||
@result{}
|
||
This is the first and last normal input line.
|
||
@result{}This is the first and last normal input line.
|
||
^D
|
||
@result{}This is the cleanup action.
|
||
@end example
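
The multiple-argument form can be sketched as follows. This relies on
the @acronym{GNU} extension described above, so it is not portable to
other implementations; the saved words are arbitrary.

@example
m4wrap(`Saved', `text', `here
')
@result{}
^D
@result{}Saved text here
@end example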
|
||
|
||
The saved input is only reread when the end of normal input is seen, and
|
||
not if @code{m4exit} is used to exit @code{m4}.
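
For example, in the following sketch the saved text never appears,
because @code{m4exit} ends processing before the end of normal input is
reached.

@example
m4wrap(`This text is never printed.
')
@result{}
m4exit(`0')
@end example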
|
||
|
||
@comment FIXME: this contradicts POSIX, which requires that "If the
|
||
@comment m4wrap macro is used multiple times, the arguments specified
|
||
@comment shall be processed in the order in which the m4wrap macros were
|
||
@comment processed."
|
||
It is safe to call @code{m4wrap} from saved text, but then the order in
which the saved text is reread is undefined. If @code{m4wrap} is not used
recursively, the saved pieces of text are reread in the reverse of the
order in which they were saved (LIFO: last in, first out). However, this
behavior is likely to change in a future release, to match
@acronym{POSIX}, so you should not depend on this order.
|
||
|
||
It is possible to emulate @acronym{POSIX} behavior even
|
||
with older versions of @acronym{GNU} M4 by including the file
|
||
@file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4} from the
|
||
distribution:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`wrapfifo.m4')dnl
|
||
@result{}dnl Redefine m4wrap to have FIFO semantics.
|
||
@result{}define(`_m4wrap_level', `0')dnl
|
||
@result{}define(`m4wrap',
|
||
@result{}`ifdef(`m4wrap'_m4wrap_level,
|
||
@result{} `define(`m4wrap'_m4wrap_level,
|
||
@result{} defn(`m4wrap'_m4wrap_level)`$1')',
|
||
@result{} `builtin(`m4wrap', `define(`_m4wrap_level',
|
||
@result{} incr(_m4wrap_level))dnl
|
||
@result{}m4wrap'_m4wrap_level)dnl
|
||
@result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
|
||
include(`wrapfifo.m4')
|
||
@result{}
|
||
m4wrap(`a`'m4wrap(`c
|
||
', `d')')m4wrap(`b')
|
||
@result{}
|
||
^D
|
||
@result{}abc
|
||
@end example
|
||
|
||
It is likewise possible to emulate LIFO behavior without resorting to
|
||
the @acronym{GNU} M4 extension of @code{builtin}, by including the file
|
||
@file{m4-@value{VERSION}/@/examples/@/wraplifo.m4} from the
|
||
distribution. (Unfortunately, both examples shown here share some
|
||
subtle bugs. See if you can find and correct them; or @pxref{Improved
|
||
m4wrap, , Answers}).
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`wraplifo.m4')dnl
|
||
@result{}dnl Redefine m4wrap to have LIFO semantics.
|
||
@result{}define(`_m4wrap_level', `0')dnl
|
||
@result{}define(`_m4wrap', defn(`m4wrap'))dnl
|
||
@result{}define(`m4wrap',
|
||
@result{}`ifdef(`m4wrap'_m4wrap_level,
|
||
@result{} `define(`m4wrap'_m4wrap_level,
|
||
@result{} `$1'defn(`m4wrap'_m4wrap_level))',
|
||
@result{} `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl
|
||
@result{}m4wrap'_m4wrap_level)dnl
|
||
@result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
|
||
include(`wraplifo.m4')
|
||
@result{}
|
||
m4wrap(`a`'m4wrap(`c
|
||
', `d')')m4wrap(`b')
|
||
@result{}
|
||
^D
|
||
@result{}bac
|
||
@end example
|
||
|
||
Here is an example of implementing a factorial function using
|
||
@code{m4wrap}:
|
||
|
||
@example
|
||
define(`f', `ifelse(`$1', `0', `Answer: 0!=1
|
||
', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
|
||
', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
|
||
@result{}
|
||
f(`10')
|
||
@result{}
|
||
^D
|
||
@result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
|
||
@end example
|
||
|
||
Invocations of @code{m4wrap} at the same recursion level are
|
||
concatenated and rescanned as usual:
|
||
|
||
@example
|
||
define(`aa', `AA
|
||
')
|
||
@result{}
|
||
m4wrap(`a')m4wrap(`a')
|
||
@result{}
|
||
^D
|
||
@result{}AA
|
||
@end example
|
||
|
||
@noindent
|
||
however, the transition between recursion levels behaves like an end of
|
||
file condition between two input files.
|
||
|
||
@comment status: 1
|
||
@example
|
||
m4wrap(`m4wrap(`)')len(abc')
|
||
@result{}
|
||
^D
|
||
@error{}m4:stdin:1: ERROR: end of file in argument list
|
||
@end example
|
||
|
||
@node File Inclusion
|
||
@chapter File inclusion
|
||
|
||
@cindex file inclusion
|
||
@cindex inclusion, of files
|
||
@code{m4} allows you to include named files at any point in the input.
|
||
|
||
@menu
|
||
* Include:: Including named files
|
||
* Search Path:: Searching for include files
|
||
@end menu
|
||
|
||
@node Include
|
||
@section Including named files
|
||
|
||
There are two builtin macros in @code{m4} for including files:
|
||
|
||
@deffn Builtin include (@var{file})
|
||
@deffnx Builtin sinclude (@var{file})
|
||
Both macros cause the file named @var{file} to be read by
|
||
@code{m4}. When the end of the file is reached, input is resumed from
|
||
the previous input file.
|
||
|
||
The expansion of @code{include} and @code{sinclude} is therefore the
|
||
contents of @var{file}.
|
||
|
||
If @var{file} does not exist, is a directory, or cannot otherwise be
|
||
read, the expansion is void,
|
||
and @code{include} will fail with an error while @code{sinclude} is
|
||
silent. The empty string counts as a file that does not exist.
|
||
|
||
The macros @code{include} and @code{sinclude} are recognized only with
|
||
parameters.
|
||
@end deffn
|
||
|
||
@comment status: 1
|
||
@example
|
||
include(`none')
|
||
@error{}m4:stdin:1: cannot open `none': No such file or directory
|
||
@result{}
|
||
include()
|
||
@error{}m4:stdin:2: cannot open `': No such file or directory
|
||
@result{}
|
||
sinclude(`none')
|
||
@result{}
|
||
sinclude()
|
||
@result{}
|
||
@end example
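
One use of @code{sinclude} is to read an optional file of local
overrides. The sketch below assumes that no file named
@file{site-config.m4} (a hypothetical name chosen for this illustration)
can be found, so the default definition of @code{prefix} is left in
place and no diagnostic is issued.

@example
define(`prefix', `/usr/local')
@result{}
sinclude(`site-config.m4')
@result{}
prefix
@result{}/usr/local
@end example

@noindent
If such a file did exist and redefined @code{prefix}, the final line
would expand to the overriding value instead.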
|
||
|
||
The rest of this section assumes that @code{m4} is invoked with the
|
||
@option{-I} option (@pxref{Preprocessor features, , Invoking m4})
|
||
pointing to the @file{m4-@value{VERSION}/@/examples}
|
||
directory shipped as part of the @acronym{GNU} @code{m4} package. The
|
||
file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
|
||
contains the lines:
|
||
|
||
@comment ignore
|
||
@example
|
||
$ @kbd{cat examples/incl.m4}
|
||
@result{}Include file start
|
||
@result{}foo
|
||
@result{}Include file end
|
||
@end example
|
||
|
||
Normally file inclusion is used to insert the contents of a file
|
||
into the input stream. The contents of the file will be read by
|
||
@code{m4} and macro calls in the file will be expanded:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
define(`foo', `FOO')
|
||
@result{}
|
||
include(`incl.m4')
|
||
@result{}Include file start
|
||
@result{}FOO
|
||
@result{}Include file end
|
||
@result{}
|
||
@end example
|
||
|
||
The fact that @code{include} and @code{sinclude} expand to the contents
|
||
of the file can be used to define macros that operate on entire files.
|
||
Here is an example, which defines @samp{bar} to expand to the contents
|
||
of @file{incl.m4}:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
define(`bar', include(`incl.m4'))
|
||
@result{}
|
||
This is `bar': >>bar<<
|
||
@result{}This is bar: >>Include file start
|
||
@result{}foo
|
||
@result{}Include file end
|
||
@result{}<<
|
||
@end example
|
||
|
||
This use of @code{include} is not trivial, though, as files can contain
quotes, commas, and parentheses, which can interfere with the way the
@code{m4} parser works. @acronym{GNU} @code{m4} seamlessly concatenates
the file contents with the next character, even if the included file
ended in the middle of a comment, string, or macro call. Such an
unterminated construct is only treated as an end of file error when it
occurs in an input file named on the command line.
|
||
|
||
In @acronym{GNU} @code{m4}, an alternative method of reading files is
|
||
using @code{undivert} (@pxref{Undivert}) on a named file.
|
||
|
||
@ignore
|
||
@comment Test that include(`file/') detects that file is not a
|
||
@comment directory; we can assume that the current directory contains a
|
||
@comment Makefile. mingw fails with EINVAL rather than ENOTDIR.
|
||
|
||
@comment status: 1
|
||
@comment xerr: ignore
|
||
@example
|
||
include(`Makefile/')
|
||
@error{}m4:stdin:1: cannot open `Makefile/': Not a directory
|
||
@result{}
|
||
@end example
|
||
|
||
@comment POSIX allows, but doesn't require, failure on reading
|
||
@comment directories. But since they aren't text files, it never makes
|
||
@comment sense, so we globally forbid it even if fopen doesn't. mingw
|
||
@comment fails with EACCES rather than EISDIR.
|
||
|
||
@comment status: 1
|
||
@comment xerr: ignore
|
||
@example
|
||
include(`.')
|
||
@error{}m4:stdin:1: cannot open `.': Is a directory
|
||
@result{}
|
||
@end example
|
||
|
||
@comment Meanwhile, ignore errors with sinclude.
|
||
|
||
@example
|
||
sinclude(`Makefile/')
|
||
@result{}
|
||
sinclude(`.')
|
||
@result{}
|
||
@end example
|
||
@end ignore
|
||
|
||
@node Search Path
|
||
@section Searching for include files
|
||
|
||
@cindex search path for included files
|
||
@cindex included files, search path for
|
||
@cindex @acronym{GNU} extensions
|
||
@acronym{GNU} @code{m4} allows included files to be found in directories
other than the current working directory.
|
||
|
||
@cindex @env{M4PATH}
|
||
If the @option{--prepend-include} or @option{-B} command-line option was
provided (@pxref{Preprocessor features, , Invoking m4}), those
directories are searched first, in the reverse of the order in which
those options were listed on the command line. Then @code{m4} looks in
the current working directory. Next come the directories specified with
the @option{--include} or @option{-I} option, in the order found on the
command line. Finally, if the @env{M4PATH} environment variable is set,
it is expected to contain a colon-separated list of directories, which
will be searched in order.
|
||
|
||
If the automatic search for include-files causes trouble, the @samp{p}
|
||
debug flag (@pxref{Debug Levels}) can help isolate the problem.
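
To make the search order concrete, consider the following invocation.
All directory and file names here are hypothetical, and the sketch
assumes a shell that accepts an environment assignment before the
command.

@comment ignore
@example
$ @kbd{M4PATH=/opt/m4:/usr/share/m4 m4 -dp -B b1 -B b2 -I i1 -I i2 input.m4}
@end example

@noindent
Following the rules above, an included file would be looked for in
@file{b2}, then @file{b1}, then the current working directory, then
@file{i1}, then @file{i2}, then @file{/opt/m4}, and finally
@file{/usr/share/m4}. The @option{-dp} option turns on the @samp{p}
debug flag for the run.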
|
||
|
||
@node Diversions
|
||
@chapter Diverting and undiverting output
|
||
|
||
@cindex deferring output
|
||
Diversions are a way of temporarily saving output. The output of
|
||
@code{m4} can at any time be diverted to a temporary file, and be
|
||
reinserted into the output stream, @dfn{undiverted}, again at a later
|
||
time.
|
||
|
||
@cindex @env{TMPDIR}
|
||
Numbered diversions are counted from 0 upwards, diversion number 0
|
||
being the normal output stream. The number of simultaneous diversions
|
||
is limited mainly by the memory used to describe them, because @acronym{GNU}
|
||
@code{m4} tries to keep diversions in memory. However, there is a
|
||
limit to the overall memory usable by all diversions taken altogether
|
||
(512K, currently). When this maximum is about to be exceeded,
|
||
a temporary file is opened to receive the contents of the biggest
|
||
diversion still in memory, freeing this memory for other diversions.
|
||
When creating the temporary file, @code{m4} honors the value of the
|
||
environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
|
||
So, it is theoretically possible that the number and aggregate size of
diversions are limited only by available disk space.
|
||
|
||
@ignore
|
||
@comment We need to test spilled diversions, but don't need to expose
|
||
@comment this highly repetitive test in the manual.
|
||
|
||
@example
|
||
divert(`-1')define(`f', `.')
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
divert`'dnl
|
||
len(f)
|
||
@result{}1048576
|
||
divert(`1')
|
||
f
|
||
divert(`2')
|
||
f
|
||
divert(`-1')undivert
|
||
divert(`1')bye
|
||
^D
|
||
@result{}bye
|
||
@end example
|
||
|
||
@comment Another test of spilled diversions.
|
||
|
||
@example
|
||
divert(`-1')define(`f', `.')
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
define(`f', defn(`f')defn(`f'))
|
||
divert`'dnl
|
||
len(f)
|
||
@result{}1048576
|
||
divert(`1')
|
||
f
|
||
m4exit
|
||
@end example
|
||
|
||
@comment Catch regression in 1.4.10 with spilled diversions.
|
||
|
||
@example
|
||
ifdef(`__unix__', ,
|
||
`errprint(` skipping: syscmd does not have unix semantics
|
||
')m4exit(`77')')dnl
|
||
changequote(`[', `]')dnl
|
||
syscmd([echo 'divert(1)hi
|
||
format(%1000000d, 1)' | ']__program__[' | sed -n 1p])dnl
|
||
@result{}hi
|
||
sysval
|
||
@result{}0
|
||
@end example
|
||
|
||
@comment Avoid quadratic copying time when transferring diversions;
|
||
@comment test both in-memory and spilled to file.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`forloop2.m4')dnl
|
||
divert(`1')format(`%10000s', `')dnl
|
||
forloop(`i', `1', `10000',
|
||
`divert(incr(i))undivert(i)')dnl
|
||
divert(`9001')format(`%1000000s', `')dnl
|
||
forloop(`i', `9001', `10000',
|
||
`divert(incr(i))undivert(i)')dnl
|
||
divert(`-1')undivert
|
||
@end example
|
||
@end ignore
|
||
|
||
Diversions make it possible to generate output in a different order than
|
||
the input was read. It is possible to implement topological sorting
|
||
of dependencies. For example, @acronym{GNU} Autoconf makes use of
|
||
diversions under the hood to ensure that the expansion of a prerequisite
|
||
macro appears in the output prior to the expansion of a dependent macro,
|
||
regardless of which order the two macros were invoked in the user's
|
||
input file.
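
As a small added illustration of this reordering (a sketch with
arbitrary text, not taken from Autoconf), text written to a
higher-numbered diversion first still appears after text written later
to a lower-numbered one:

@example
divert(`2')dnl
body line
divert(`1')dnl
header line
divert`'dnl
^D
@result{}header line
@result{}body line
@end example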
|
||
|
||
@menu
|
||
* Divert:: Diverting output
|
||
* Undivert:: Undiverting output
|
||
* Divnum:: Diversion numbers
|
||
* Cleardivert:: Discarding diverted text
|
||
@end menu
|
||
|
||
@node Divert
|
||
@section Diverting output
|
||
|
||
@cindex diverting output to files
|
||
@cindex output, diverting to files
|
||
@cindex files, diverting output to
|
||
Output is diverted using @code{divert}:
|
||
|
||
@deffn Builtin divert (@dvar{number, 0})
|
||
The current diversion is changed to @var{number}. If @var{number} is left
|
||
out or empty, it is assumed to be zero. If @var{number} cannot be
|
||
parsed, the diversion is unchanged.
|
||
|
||
The expansion of @code{divert} is void.
|
||
@end deffn
|
||
|
||
When all the @code{m4} input has been processed, all existing
|
||
diversions are automatically undiverted, in numerical order.
|
||
|
||
@example
|
||
divert(`1')
|
||
This text is diverted.
|
||
divert
|
||
@result{}
|
||
This text is not diverted.
|
||
@result{}This text is not diverted.
|
||
^D
|
||
@result{}
|
||
@result{}This text is diverted.
|
||
@end example
|
||
|
||
Several calls of @code{divert} with the same argument do not overwrite
|
||
the previous diverted text, but append to it. Diversions are printed
|
||
after any wrapped text is expanded.
|
||
|
||
@example
|
||
define(`text', `TEXT')
|
||
@result{}
|
||
divert(`1')`diverted text.'
|
||
divert
|
||
@result{}
|
||
m4wrap(`Wrapped text precedes ')
|
||
@result{}
|
||
^D
|
||
@result{}Wrapped TEXT precedes diverted text.
|
||
@end example
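
The appending behavior can also be seen in isolation. The following is
an added sketch with arbitrary text; the second use of diversion 1
adds to what is already stored there rather than replacing it:

@example
divert(`1')one
divert(`2')two
divert(`1')more one
^D
@result{}one
@result{}more one
@result{}two
@end example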
|
||
|
||
@cindex discarding input
|
||
@cindex input, discarding
|
||
If output is diverted to a negative diversion, it is simply discarded.
|
||
This can be used to suppress unwanted output. A common example of
|
||
unwanted output is the trailing newlines after macro definitions. Here
|
||
is a common programming idiom in @code{m4} for avoiding them.
|
||
|
||
@example
|
||
divert(`-1')
|
||
define(`foo', `Macro `foo'.')
|
||
define(`bar', `Macro `bar'.')
|
||
divert
|
||
@result{}
|
||
@end example
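
Only the output is discarded while a negative diversion is in effect;
the macros are still expanded, so definitions made there remain
available afterwards. The following added sketch uses a hypothetical
macro named @code{greet} to show this:

@example
divert(`-1')
define(`greet', `Hello, $1!')
This commentary is discarded.
divert`'dnl
greet(`world')
@result{}Hello, world!
@end example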
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
Traditional implementations only supported ten diversions. But as a
|
||
@acronym{GNU} extension, diversion numbers can be as large as positive
|
||
integers will allow, rather than treating a multi-digit diversion number
|
||
as a request to discard text.
|
||
|
||
@example
|
||
divert(eval(`1<<28'))world
|
||
divert(`2')hello
|
||
^D
|
||
@result{}hello
|
||
@result{}world
|
||
@end example
|
||
|
||
Note that @code{divert} is an ordinary English word, but it is also a macro
that is recognized even without arguments. When processing plain text, the word might appear in
|
||
normal text and be unintentionally swallowed as a macro invocation. One
|
||
way to avoid this is to use the @option{-P} option to rename all
|
||
builtins (@pxref{Operation modes, , Invoking m4}). Another is to write
|
||
a wrapper that requires a parameter to be recognized.
|
||
|
||
@example
|
||
We decided to divert the stream for irrigation.
|
||
@result{}We decided to the stream for irrigation.
|
||
define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
|
||
@result{}
|
||
divert(`-1')
|
||
Ignored text.
|
||
divert(`0')
|
||
@result{}
|
||
We decided to divert the stream for irrigation.
|
||
@result{}We decided to divert the stream for irrigation.
|
||
@end example
|
||
|
||
@node Undivert
|
||
@section Undiverting output
|
||
|
||
Diverted text can be undiverted explicitly using the builtin
|
||
@code{undivert}:
|
||
|
||
@deffn Builtin undivert (@ovar{diversions@dots{}})
|
||
Undiverts the numeric @var{diversions} given by the arguments, in the
|
||
order given. If no arguments are supplied, all diversions are
|
||
undiverted, in numerical order.
|
||
|
||
@cindex file inclusion
|
||
@cindex inclusion, of files
|
||
@cindex @acronym{GNU} extensions
|
||
As a @acronym{GNU} extension, @var{diversions} may contain non-numeric
|
||
strings, which are treated as the names of files to copy into the output
|
||
without expansion. A warning is issued if a file could not be opened.
|
||
|
||
The expansion of @code{undivert} is void.
|
||
@end deffn
|
||
|
||
@example
|
||
divert(`1')
|
||
This text is diverted.
|
||
divert
|
||
@result{}
|
||
This text is not diverted.
|
||
@result{}This text is not diverted.
|
||
undivert(`1')
|
||
@result{}
|
||
@result{}This text is diverted.
|
||
@result{}
|
||
@end example
|
||
|
||
Notice the last two blank lines. One of them comes from the newline
|
||
following @code{undivert}, the other from the newline that followed the
|
||
@code{divert}! A diversion often starts with a blank line like this.
|
||
|
||
When diverted text is undiverted, it is @emph{not} reread by @code{m4},
|
||
but rather copied directly to the current output, and it is therefore
|
||
not an error to undivert into a diversion. Undiverting the empty string
|
||
is the same as specifying diversion 0; in either case nothing happens
|
||
since the output has already been flushed.
|
||
|
||
@example
|
||
divert(`1')diverted text
|
||
divert
|
||
@result{}
|
||
undivert()
|
||
@result{}
|
||
undivert(`0')
|
||
@result{}
|
||
undivert
|
||
@result{}diverted text
|
||
@result{}
|
||
divert(`1')more
|
||
divert(`2')undivert(`1')diverted text`'divert
|
||
@result{}
|
||
undivert(`1')
|
||
@result{}
|
||
undivert(`2')
|
||
@result{}more
|
||
@result{}diverted text
|
||
@end example
|
||
|
||
When a diversion has been undiverted, the diverted text is discarded,
|
||
and it is not possible to bring back diverted text more than once.
|
||
|
||
@example
|
||
divert(`1')
|
||
This text is diverted first.
|
||
divert(`0')undivert(`1')dnl
|
||
@result{}
|
||
@result{}This text is diverted first.
|
||
undivert(`1')
|
||
@result{}
|
||
divert(`1')
|
||
This text is also diverted but not appended.
|
||
divert(`0')undivert(`1')dnl
|
||
@result{}
|
||
@result{}This text is also diverted but not appended.
|
||
@end example
|
||
|
||
Attempts to undivert the current diversion are silently ignored. Thus,
|
||
when the current diversion is not 0, the current diversion does not get
|
||
rearranged among the other diversions.
|
||
|
||
@example
|
||
divert(`1')one
|
||
divert(`2')two
|
||
divert(`3')three
|
||
divert(`2')undivert`'dnl
|
||
divert`'undivert`'dnl
|
||
@result{}two
|
||
@result{}one
|
||
@result{}three
|
||
@end example
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
@cindex file inclusion
|
||
@cindex inclusion, of files
|
||
@acronym{GNU} @code{m4} allows named files to be undiverted. Given a
|
||
non-numeric argument, the contents of the file named will be copied,
|
||
uninterpreted, to the current output. This complements the builtin
|
||
@code{include} (@pxref{Include}). To illustrate the difference, assume
|
||
the file @file{foo} contains:
|
||
|
||
@comment ignore
|
||
@example
|
||
$ @kbd{cat foo}
|
||
bar
|
||
@end example
|
||
|
||
@noindent
|
||
then
|
||
|
||
@example
|
||
define(`bar', `BAR')
|
||
@result{}
|
||
undivert(`foo')
|
||
@result{}bar
|
||
@result{}
|
||
include(`foo')
|
||
@result{}BAR
|
||
@result{}
|
||
@end example
|
||
|
||
If the file is not found (or cannot be read), an error message is
|
||
issued, and the expansion is void. It is possible to intermix files
|
||
and diversion numbers.
|
||
|
||
@example
|
||
divert(`1')diversion one
|
||
divert(`2')undivert(`foo')dnl
|
||
divert(`3')diversion three
|
||
divert`'dnl
|
||
undivert(`1', `2', `foo', `3')dnl
|
||
@result{}diversion one
|
||
@result{}bar
|
||
@result{}bar
|
||
@result{}diversion three
|
||
@end example
|
||
|
||
@node Divnum
|
||
@section Diversion numbers
|
||
|
||
@cindex diversion numbers
|
||
The current diversion is tracked by the builtin @code{divnum}:
|
||
|
||
@deffn Builtin divnum
|
||
Expands to the number of the current diversion.
|
||
@end deffn
|
||
|
||
@example
|
||
Initial divnum
|
||
@result{}Initial 0
|
||
divert(`1')
|
||
Diversion one: divnum
|
||
divert(`2')
|
||
Diversion two: divnum
|
||
^D
|
||
@result{}
|
||
@result{}Diversion one: 1
|
||
@result{}
|
||
@result{}Diversion two: 2
|
||
@end example
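
A typical use of @code{divnum} is to remember the current diversion so
that a macro can write elsewhere and then switch back. The following
is an added sketch under that assumption; the names @code{footnote} and
@code{_old} are hypothetical and not part of @code{m4}:

@example
define(`footnote', `pushdef(`_old', divnum)divert(`1')Note: $1
divert(_old)popdef(`_old')')dnl
Body text.footnote(`see appendix')
@result{}Body text.
More body text.
@result{}More body text.
^D
@result{}Note: see appendix
@end example

@noindent
Because @code{divnum} is left unquoted in the call to @code{pushdef},
it is expanded at the time @code{footnote} is invoked, which records
the diversion that was current at that point.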
|
||
|
||
@node Cleardivert
|
||
@section Discarding diverted text
|
||
|
||
@cindex discarding diverted text
|
||
@cindex diverted text, discarding
|
||
Often it is not known, when output is diverted, whether the diverted
|
||
text is actually needed. Since all non-empty diversions are brought back
|
||
on the main output stream when the end of input is seen, a method of
|
||
discarding a diversion is needed. If all diversions should be
|
||
discarded, the easiest way is to end the input to @code{m4} with
|
||
@samp{divert(`-1')} followed by an explicit @samp{undivert}:
|
||
|
||
@example
|
||
divert(`1')
|
||
Diversion one: divnum
|
||
divert(`2')
|
||
Diversion two: divnum
|
||
divert(`-1')
|
||
undivert
|
||
^D
|
||
@end example
|
||
|
||
@noindent
|
||
No output is produced at all.
|
||
|
||
Clearing selected diversions can be done with the following macro:
|
||
|
||
@deffn Composite cleardivert (@ovar{diversions@dots{}})
|
||
Discard the contents of each of the listed numeric @var{diversions}.
|
||
@end deffn
|
||
|
||
@example
|
||
define(`cleardivert',
|
||
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
|
||
@result{}
|
||
@end example
|
||
|
||
It is called just like @code{undivert}, but the effect is to clear the
|
||
diversions given by the arguments. (This macro has a nasty bug! You
|
||
should try to see if you can find it and correct it; or @pxref{Improved
|
||
cleardivert, , Answers}).
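
When called with an explicit diversion number, the macro as defined
above discards just that diversion. The following added sketch repeats
the definition so that it is self-contained; the diverted text is
arbitrary:

@example
define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')dnl
divert(`1')kept
divert(`2')dropped
divert`'dnl
cleardivert(`2')dnl
^D
@result{}kept
@end example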
|
||
|
||
@node Text handling
|
||
@chapter Macros for text handling
|
||
|
||
There are a number of builtins in @code{m4} for manipulating text in
|
||
various ways, extracting substrings, searching, substituting, and so on.
|
||
|
||
@menu
|
||
* Len:: Calculating length of strings
|
||
* Index macro:: Searching for substrings
|
||
* Regexp:: Searching for regular expressions
|
||
* Substr:: Extracting substrings
|
||
* Translit:: Translating characters
|
||
* Patsubst:: Substituting text by regular expression
|
||
* Format:: Formatting strings (printf-like)
|
||
@end menu
|
||
|
||
@node Len
|
||
@section Calculating length of strings
|
||
|
||
@cindex length of strings
|
||
@cindex strings, length of
|
||
The length of a string can be calculated by @code{len}:
|
||
|
||
@deffn Builtin len (@var{string})
|
||
Expands to the length of @var{string}, as a decimal number.
|
||
|
||
The macro @code{len} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
len()
|
||
@result{}0
|
||
len(`abcdef')
|
||
@result{}6
|
||
@end example
|
||
|
||
@node Index macro
|
||
@section Searching for substrings
|
||
|
||
@cindex substrings, locating
|
||
Searching for substrings is done with @code{index}:
|
||
|
||
@deffn Builtin index (@var{string}, @var{substring})
|
||
Expands to the index of the first occurrence of @var{substring} in
|
||
@var{string}. The first character in @var{string} has index 0. If
|
||
@var{substring} does not occur in @var{string}, @code{index} expands to
|
||
@samp{-1}.
|
||
|
||
The macro @code{index} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
index(`gnus, gnats, and armadillos', `nat')
|
||
@result{}7
|
||
index(`gnus, gnats, and armadillos', `dag')
|
||
@result{}-1
|
||
@end example
|
||
|
||
Omitting @var{substring} evokes a warning, but still produces output;
|
||
contrast this with an empty @var{substring}.
|
||
|
||
@example
|
||
index(`abc')
|
||
@error{}m4:stdin:1: Warning: too few arguments to builtin `index'
|
||
@result{}0
|
||
index(`abc', `')
|
||
@result{}0
|
||
index(`abc', `b')
|
||
@result{}1
|
||
@end example
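
Since @code{index} expands to @samp{-1} when there is no match, it
combines naturally with @code{ifelse} to test for containment. The
following is an added sketch; the macro name @code{contains} is
hypothetical and not a builtin:

@example
define(`contains', `ifelse(index(`$1', `$2'), `-1', `no', `yes')')dnl
contains(`gnus, gnats, and armadillos', `gnat')
@result{}yes
contains(`gnus, gnats, and armadillos', `dag')
@result{}no
@end example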
|
||
|
||
@node Regexp
|
||
@section Searching for regular expressions
|
||
|
||
@cindex basic regular expressions
|
||
@cindex regular expressions
|
||
@cindex expressions, regular
|
||
@cindex @acronym{GNU} extensions
|
||
Searching for regular expressions is done with the builtin
|
||
@code{regexp}:
|
||
|
||
@deffn Builtin regexp (@var{string}, @var{regexp}, @ovar{replacement})
|
||
Searches for @var{regexp} in @var{string}. The syntax for regular
|
||
expressions is the same as in @acronym{GNU} Emacs, which is similar to
|
||
@acronym{BRE, Basic Regular Expressions} in @acronym{POSIX}.
|
||
@ifnothtml
|
||
@xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs
|
||
Manual}.
|
||
@end ifnothtml
|
||
@ifhtml
|
||
See
|
||
@uref{http://www.gnu.org/@/software/@/emacs/@/manual/@/emacs.html#Regexps,
|
||
Syntax of Regular Expressions} in the @acronym{GNU} Emacs Manual.
|
||
@end ifhtml
|
||
Support for @acronym{ERE, Extended Regular Expressions} is not
|
||
available, but will be added in @acronym{GNU} M4 2.0.
|
||
|
||
If @var{replacement} is omitted, @code{regexp} expands to the index of
|
||
the first match of @var{regexp} in @var{string}. If @var{regexp} does
|
||
not match anywhere in @var{string}, it expands to -1.
|
||
|
||
If @var{replacement} is supplied, and there was a match, @code{regexp}
|
||
changes the expansion to this argument, with @samp{\@var{n}} substituted
|
||
by the text matched by the @var{n}th parenthesized sub-expression of
|
||
@var{regexp}, up to nine sub-expressions. The escape @samp{\&} is
|
||
replaced by the text of the entire regular expression matched. For
|
||
all other characters, @samp{\} treats the next character literally. A
|
||
warning is issued if there were fewer sub-expressions than the
|
||
@samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there
|
||
was no match, @code{regexp} expands to the empty string.
|
||
|
||
The macro @code{regexp} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
regexp(`GNUs not Unix', `\<[a-z]\w+')
|
||
@result{}5
|
||
regexp(`GNUs not Unix', `\<Q\w*')
|
||
@result{}-1
|
||
regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
|
||
@result{}*** Unix *** nix ***
|
||
regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
|
||
@result{}
|
||
@end example
|
||
|
||
Here are some more examples on the handling of backslash:
|
||
|
||
@example
|
||
regexp(`abc', `\(b\)', `\\\10\a')
|
||
@result{}\b0a
|
||
regexp(`abc', `b', `\1\')
|
||
@error{}m4:stdin:2: Warning: sub-expression 1 not present
|
||
@error{}m4:stdin:2: Warning: trailing \ ignored in replacement
|
||
@result{}
|
||
regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
|
||
@error{}m4:stdin:3: Warning: sub-expression 4 not present
|
||
@error{}m4:stdin:3: Warning: sub-expression 5 not present
|
||
@error{}m4:stdin:3: Warning: sub-expression 6 not present
|
||
@result{}c
|
||
@end example
|
||
|
||
Omitting @var{regexp} evokes a warning, but still produces output;
|
||
contrast this with an empty @var{regexp} argument.
|
||
|
||
@example
|
||
regexp(`abc')
|
||
@error{}m4:stdin:1: Warning: too few arguments to builtin `regexp'
|
||
@result{}0
|
||
regexp(`abc', `')
|
||
@result{}0
|
||
regexp(`abc', `', `\\def')
|
||
@result{}\def
|
||
@end example
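
As a further added sketch, with an arbitrary input string, the two
forms of @code{regexp} can be contrasted on the same pattern: without
@var{replacement} the expansion is the index of the match, and with
@var{replacement} it is text assembled from the matched
sub-expressions.

@example
regexp(`version 1.4.13', `[0-9]+')
@result{}8
regexp(`version 1.4.13', `\([0-9]+\)\.\([0-9]+\)', `major=\1 minor=\2')
@result{}major=1 minor=4
@end example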
|
||
|
||
@node Substr
|
||
@section Extracting substrings
|
||
|
||
@cindex extracting substrings
|
||
@cindex substrings, extracting
|
||
Substrings are extracted with @code{substr}:
|
||
|
||
@deffn Builtin substr (@var{string}, @var{from}, @ovar{length})
|
||
Expands to the substring of @var{string}, which starts at index
|
||
@var{from}, and extends for @var{length} characters, or to the end of
|
||
@var{string}, if @var{length} is omitted. The starting index of a string
|
||
is always 0. The expansion is empty if there is an error parsing
|
||
@var{from} or @var{length}, if @var{from} is beyond the end of
|
||
@var{string}, or if @var{length} is negative.
|
||
|
||
The macro @code{substr} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
substr(`gnus, gnats, and armadillos', `6')
|
||
@result{}gnats, and armadillos
|
||
substr(`gnus, gnats, and armadillos', `6', `5')
|
||
@result{}gnats
|
||
@end example
|
||
|
||
Omitting @var{from} evokes a warning, but still produces output.
|
||
|
||
@example
|
||
substr(`abc')
|
||
@error{}m4:stdin:1: Warning: too few arguments to builtin `substr'
|
||
@result{}abc
|
||
substr(`abc',)
|
||
@error{}m4:stdin:2: empty string treated as 0 in builtin `substr'
|
||
@result{}abc
|
||
@end example
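
Combined with @code{index} (@pxref{Index macro}), @code{substr} can
split a string at a delimiter. The following is an added sketch; the
macro names @code{before} and @code{after} are hypothetical, and the
sketch assumes that the delimiter is a single character that actually
occurs in the string:

@example
define(`before', `substr(`$1', `0', index(`$1', `$2'))')dnl
define(`after', `substr(`$1', incr(index(`$1', `$2')))')dnl
before(`key=value', `=')
@result{}key
after(`key=value', `=')
@result{}value
@end example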
|
||
|
||
@node Translit
|
||
@section Translating characters
|
||
|
||
@cindex translating characters
|
||
@cindex characters, translating
|
||
Character translation is done with @code{translit}:
|
||
|
||
@deffn Builtin translit (@var{string}, @var{chars}, @ovar{replacement})
|
||
Expands to @var{string}, with each character that occurs in
|
||
@var{chars} translated into the character from @var{replacement} with
|
||
the same index.
|
||
|
||
If @var{replacement} is shorter than @var{chars}, the excess characters
|
||
of @var{chars} are deleted from the expansion; if @var{chars} is
|
||
shorter, the excess characters in @var{replacement} are silently
|
||
ignored. If @var{replacement} is omitted, all characters in
|
||
@var{string} that are present in @var{chars} are deleted from the
|
||
expansion. If a character appears more than once in @var{chars}, only
|
||
the first instance is used in making the translation. Only a single
|
||
translation pass is made, even if characters in @var{replacement} also
|
||
appear in @var{chars}.
|
||
|
||
As a @acronym{GNU} extension, both @var{chars} and @var{replacement} can
|
||
contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
|
||
letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-}
|
||
in @var{chars} or @var{replacement}, place it first or last in the
|
||
entire string, or as the last character of a range. Back-to-back ranges
|
||
can share a common endpoint. It is not an error for the last character
|
||
in the range to be `smaller' than the first. In that case, the range
|
||
runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
|
||
The expansion of a range is dependent on the underlying encoding of
|
||
characters, so using ranges is not always portable between machines.
|
||
|
||
The macro @code{translit} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
translit(`GNUs not Unix', `A-Z')
|
||
@result{}s not nix
|
||
translit(`GNUs not Unix', `a-z', `A-Z')
|
||
@result{}GNUS NOT UNIX
|
||
translit(`GNUs not Unix', `A-Z', `z-a')
|
||
@result{}tmfs not fnix
|
||
translit(`+,-12345', `+--1-5', `<;>a-c-a')
|
||
@result{}<;>abcba
|
||
translit(`abcdef', `aabdef', `bcged')
|
||
@result{}bgced
|
||
@end example
|
||
|
||
In the @sc{ascii} encoding, the first example deletes all uppercase
|
||
letters, the second converts lowercase to uppercase, and the third
|
||
`mirrors' all uppercase letters, while converting them to lowercase.
|
||
The first two cases are by far the most common, even though they are not
|
||
portable to @sc{ebcdic} or other encodings. The fourth example shows a
|
||
range ending in @samp{-}, as well as back-to-back ranges. The final
|
||
example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
|
||
resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
|
||
@samp{e} are swapped, and the @samp{f} is discarded.
|
||
|
||
@ignore
|
||
@comment No need to fight 8-bit characters, as it is difficult to get
|
||
@comment rendering right in both info and dvi.
|
||
|
||
@example
|
||
translit(`<60>abc~', `~-<2D>')
|
||
@result{}abc
|
||
@end example
|
||
|
||
@comment Stress test short arguments, since they use a different code
|
||
@comment path.
|
||
@example
|
||
translit(`abcdeabcde', `a')
|
||
@result{}bcdebcde
|
||
translit(`abcdeabcde', `ab')
|
||
@result{}cdecde
|
||
translit(`abcdeabcde', `a', `f')
|
||
@result{}fbcdefbcde
|
||
translit(`abcdeabcde', `a', `f')
|
||
@result{}fbcdefbcde
|
||
translit(`abcdeabcde', `a', `fg')
|
||
@result{}fbcdefbcde
|
||
translit(`abcdeabcde', `ab', `f')
|
||
@result{}fcdefcde
|
||
translit(`abcdeabcde', `ab', `fg')
|
||
@result{}fgcdefgcde
|
||
translit(`abcdeabcde', `ab', `ba')
|
||
@result{}bacdebacde
|
||
translit(`abcdeabcde', `e', `f')
|
||
@result{}abcdfabcdf
|
||
translit(`abc', `', `cde')
|
||
@result{}abc
|
||
translit(`', `a', `bc')
|
||
@result{}
|
||
@end example
|
||
@end ignore
|
||
|
||
Omitting @var{chars} evokes a warning, but still produces output.
|
||
|
||
@example
|
||
translit(`abc')
|
||
@error{}m4:stdin:1: Warning: too few arguments to builtin `translit'
|
||
@result{}abc
|
||
@end example
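
Two more added sketches, with arbitrary input and assuming the
@sc{ascii} encoding: the first replaces a lone dash, which is taken
literally because it is the entire @var{chars} string, and the second
uses back-to-back ranges in @var{replacement} to rotate each letter by
thirteen places.

@example
translit(`2009-05-17', `-', `/')
@result{}2009/05/17
translit(`Hello, world!', `a-zA-Z', `n-za-mN-ZA-M')
@result{}Uryyb, jbeyq!
@end example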
|
||
|
||
@node Patsubst
|
||
@section Substituting text by regular expression
|
||
|
||
@cindex basic regular expressions
|
||
@cindex regular expressions
|
||
@cindex expressions, regular
|
||
@cindex pattern substitution
|
||
@cindex substitution by regular expression
|
||
@cindex @acronym{GNU} extensions
|
||
Global substitution in a string is done by @code{patsubst}:
|
||
|
||
@deffn Builtin patsubst (@var{string}, @var{regexp}, @ovar{replacement})
|
||
Searches @var{string} for matches of @var{regexp}, and substitutes
|
||
@var{replacement} for each match. The syntax for regular expressions
|
||
is the same as in @acronym{GNU} Emacs (@pxref{Regexp}).
|
||
|
||
The parts of @var{string} that are not covered by any match of
|
||
@var{regexp} are copied to the expansion. Whenever a match is found, the
|
||
search proceeds from the end of the match, so a character from
|
||
@var{string} will never be substituted twice. If @var{regexp} matches a
|
||
string of zero length, the start position for the search is incremented,
|
||
to avoid infinite loops.
|
||
|
||
When a replacement is to be made, @var{replacement} is inserted into
|
||
the expansion, with @samp{\@var{n}} substituted by the text matched by
|
||
the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
|
||
nine sub-expressions. The escape @samp{\&} is replaced by the text of
|
||
the entire regular expression matched. For all other characters,
|
||
@samp{\} treats the next character literally. A warning is issued if
|
||
there were fewer sub-expressions than the @samp{\@var{n}} requested, or
|
||
if there is a trailing @samp{\}.
|
||
|
||
The @var{replacement} argument can be omitted, in which case the text
|
||
matched by @var{regexp} is deleted.
|
||
|
||
The macro @code{patsubst} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
patsubst(`GNUs not Unix', `^', `OBS: ')
|
||
@result{}OBS: GNUs not Unix
|
||
patsubst(`GNUs not Unix', `\<', `OBS: ')
|
||
@result{}OBS: GNUs OBS: not OBS: Unix
|
||
patsubst(`GNUs not Unix', `\w*', `(\&)')
|
||
@result{}(GNUs)() (not)() (Unix)()
|
||
patsubst(`GNUs not Unix', `\w+', `(\&)')
|
||
@result{}(GNUs) (not) (Unix)
|
||
patsubst(`GNUs not Unix', `[A-Z][a-z]+')
|
||
@result{}GN not@w{ }
|
||
patsubst(`GNUs not Unix', `not', `NOT\')
|
||
@error{}m4:stdin:6: Warning: trailing \ ignored in replacement
|
||
@result{}GNUs NOT Unix
|
||
@end example
|
||
|
||
Here is a slightly more realistic example, which capitalizes individual
|
||
words or whole sentences, by substituting calls of the macros
|
||
@code{upcase} and @code{downcase} into the strings.
|
||
|
||
@deffn Composite upcase (@var{text})
|
||
@deffnx Composite downcase (@var{text})
|
||
@deffnx Composite capitalize (@var{text})
|
||
Expand to @var{text}, but with capitalization changed: @code{upcase}
|
||
changes all letters to upper case, @code{downcase} changes all letters
|
||
to lower case, and @code{capitalize} changes the first character of each
|
||
word to upper case and the remaining characters to lower case.
|
||
@end deffn
|
||
|
||
First, an example of their usage, using implementations distributed in
|
||
@file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`capitalize.m4')
|
||
@result{}
|
||
upcase(`GNUs not Unix')
|
||
@result{}GNUS NOT UNIX
|
||
downcase(`GNUs not Unix')
|
||
@result{}gnus not unix
|
||
capitalize(`GNUs not Unix')
|
||
@result{}Gnus Not Unix
|
||
@end example
|
||
|
||
Now for the implementation. There is a helper macro @code{_capitalize}
|
||
which puts only its first word in mixed case. Then @code{capitalize}
|
||
merely parses out the words, and replaces them with an invocation of
|
||
@code{_capitalize}. (As presented here, the @code{capitalize} macro has
|
||
some subtle flaws. You should try to see if you can find and correct
|
||
them; or @pxref{Improved capitalize, , Answers}).
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
undivert(`capitalize.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# upcase(text)
|
||
@result{}# downcase(text)
|
||
@result{}# capitalize(text)
|
||
@result{}# change case of text, simple version
|
||
@result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
|
||
@result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
|
||
@result{}define(`_capitalize',
|
||
@result{} `regexp(`$1', `^\(\w\)\(\w*\)',
|
||
@result{} `upcase(`\1')`'downcase(`\2')')')
|
||
@result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
|
||
@result{}divert`'dnl
|
||
@end example
|
||
|
||
While @code{regexp} replaces the whole input with the replacement as
|
||
soon as there is a match, @code{patsubst} replaces each
|
||
@emph{occurrence} of a match and preserves non-matching pieces:
|
||
|
||
@example
|
||
define(`patreg',
|
||
`patsubst($@@)
|
||
regexp($@@)')dnl
|
||
patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
|
||
@result{}bar FOO baz FOO
|
||
@result{}FOO
|
||
patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
|
||
@result{}bab abb 212
|
||
@result{}bab
|
||
@end example
|
||
|
||
Omitting @var{regexp} evokes a warning, but still produces output;
|
||
contrast this with an empty @var{regexp} argument.
|
||
|
||
@example
|
||
patsubst(`abc')
|
||
@error{}m4:stdin:1: Warning: too few arguments to builtin `patsubst'
|
||
@result{}abc
|
||
patsubst(`abc', `')
|
||
@result{}abc
|
||
patsubst(`abc', `', `\\-')
|
||
@result{}\-a\-b\-c\-
|
||
@end example
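
Two further added sketches, with arbitrary input, show common uses of
@code{patsubst}: collapsing runs of spaces, and swapping two fields by
referring to sub-expressions in @var{replacement}.

@example
patsubst(`too   many    spaces', ` +', ` ')
@result{}too many spaces
patsubst(`key=value', `\(\w+\)=\(\w+\)', `\2=\1')
@result{}value=key
@end example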
|
||
|
||
@node Format
|
||
@section Formatting strings (printf-like)
|
||
|
||
@cindex formatted output
|
||
@cindex output, formatted
|
||
@cindex @acronym{GNU} extensions
|
||
Formatted output can be made with @code{format}:
|
||
|
||
@deffn Builtin format (@var{format-string}, @dots{})
|
||
Works much like the C function @code{printf}. The first argument
|
||
@var{format-string} can contain @samp{%} specifications which are
|
||
satisfied by additional arguments, and the expansion of @code{format} is
|
||
the formatted string.
|
||
|
||
The macro @code{format} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
Its use is best described by a few examples:
|
||
|
||
@comment This test is a bit fragile, if someone tries to port to a
|
||
@comment platform without infinity.
|
||
@example
|
||
define(`foo', `The brown fox jumped over the lazy dog')
|
||
@result{}
|
||
format(`The string "%s" uses %d characters', foo, len(foo))
|
||
@result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
|
||
format(`%*.*d', `-1', `-1', `1')
|
||
@result{}1
|
||
format(`%.0f', `56789.9876')
|
||
@result{}56790
|
||
len(format(`%-*X', `5000', `1'))
|
||
@result{}5000
|
||
ifelse(format(`%010F', `infinity'), `       INF', `success',
       format(`%010F', `infinity'), `  INFINITY', `success',
       format(`%010F', `infinity'))
|
||
@result{}success
|
||
ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
|
||
format(`%.1A', `1.999'), `0X2.0P+0', `success',
|
||
format(`%.1A', `1.999'))
|
||
@result{}success
|
||
format(`%g', `0xa.P+1')
|
||
@result{}20
|
||
@end example
|
||
|
||
Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
|
||
example shows how @code{format} can be used to produce tabular output.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`forloop.m4')
|
||
@result{}
|
||
forloop(`i', `1', `10', `format(`%6d squared is %10d
|
||
', i, eval(i**2))')
|
||
@result{}     1 squared is          1
@result{}     2 squared is          4
@result{}     3 squared is          9
@result{}     4 squared is         16
@result{}     5 squared is         25
@result{}     6 squared is         36
@result{}     7 squared is         49
@result{}     8 squared is         64
@result{}     9 squared is         81
@result{}    10 squared is        100
|
||
@result{}
|
||
@end example
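
A few smaller added sketches, with arbitrary values, show the effect of
individual flags, widths, and precisions:

@example
format(`%05d', `42')
@result{}00042
format(`%x', `255')
@result{}ff
format(`%.2f', `3.14159')
@result{}3.14
format(`[%-6s]', `ab')
@result{}[ab    ]
@end example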
|
||
|
||
The builtin @code{format} is modeled after the ANSI C @samp{printf}
|
||
function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
|
||
@samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
|
||
@samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
|
||
@samp{%}; it supports field widths and precisions, and the flags
|
||
@samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}. For
integer specifiers, the length modifiers @samp{hh}, @samp{h}, and
@samp{l} are recognized, and for floating point specifiers, the length
modifier @samp{l} is recognized. Items not yet supported include
positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L}, and @samp{ll}
modifiers, and any platform extensions available in the native
@code{printf}. For more details on the functioning of @code{printf},
see the C Library Manual or the @acronym{POSIX} specification. Note
that @samp{%a} is supported even on platforms that haven't yet
implemented C99 hexadecimal floating point output natively.
|
||
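
The following sketch exercises a few of the flags and specifiers listed
above. The expected results follow the usual @code{printf} rules; the
@samp{%c} line additionally assumes an ASCII-based character set.

@example
format(`%+d', `5')
@result{}+5
format(`%05d', `42')
@result{}00042
format(`%x %X %o', `255', `255', `8')
@result{}ff FF 10
format(`%#x', `255')
@result{}0xff
format(`%.3s', `abcdef')
@result{}abc
format(`%c', `65')
@result{}A
format(`100%%')
@result{}100%
@end example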
|
||
Unrecognized specifiers result in a warning. It is anticipated that a
|
||
future release of @acronym{GNU} @code{m4} will support more specifiers,
|
||
and give better warnings when various problems such as overflow are
|
||
encountered. Likewise, escape sequences are not yet recognized.
|
||
|
||
@example
|
||
format(`%p', `0')
|
||
@error{}m4:stdin:1: Warning: unrecognized specifier in `%p'
|
||
@result{}
|
||
@end example
|
||
|
||
@node Arithmetic
|
||
@chapter Macros for doing arithmetic
|
||
|
||
@cindex arithmetic
|
||
@cindex integer arithmetic
|
||
Integer arithmetic is included in @code{m4}, with a C-like syntax. As
|
||
convenient shorthands, there are builtins for simple increment and
|
||
decrement operations.
|
||
|
||
@menu
|
||
* Incr:: Decrement and increment operators
|
||
* Eval:: Evaluating integer expressions
|
||
@end menu
|
||
|
||
@node Incr
|
||
@section Decrement and increment operators
|
||
|
||
@cindex decrement operator
|
||
@cindex increment operator
|
||
Increment and decrement of integers are supported using the builtins
|
||
@code{incr} and @code{decr}:
|
||
|
||
@deffn Builtin incr (@var{number})
|
||
@deffnx Builtin decr (@var{number})
|
||
Expand to the numerical value of @var{number}, incremented
or decremented, respectively, by one. The expansion is empty if
@var{number} could not be parsed, except that the empty string is
treated as 0 and a diagnostic is issued.
|
||
|
||
The macros @code{incr} and @code{decr} are recognized only with
|
||
parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
incr(`4')
|
||
@result{}5
|
||
decr(`7')
|
||
@result{}6
|
||
incr()
|
||
@error{}m4:stdin:3: empty string treated as 0 in builtin `incr'
|
||
@result{}1
|
||
decr()
|
||
@error{}m4:stdin:4: empty string treated as 0 in builtin `decr'
|
||
@result{}-1
|
||
@end example
|
||
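
A common use of @code{incr} is to maintain a counter by redefining a
macro in terms of its own current value. The macro names @code{counter}
and @code{next} below are purely illustrative and are not builtins.

@example
define(`counter', `0')
@result{}
define(`next', `define(`counter', incr(counter))counter')
@result{}
next
@result{}1
next
@result{}2
decr(decr(counter))
@result{}0
@end example

The second argument of the inner @code{define} is deliberately left
unquoted, so that @code{counter} is expanded to its current value before
@code{incr} sees it.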
|
||
@node Eval
|
||
@section Evaluating integer expressions
|
||
|
||
@cindex integer expression evaluation
|
||
@cindex evaluation, of integer expressions
|
||
@cindex expressions, evaluation of integer
|
||
Integer expressions are evaluated with @code{eval}:
|
||
|
||
@deffn Builtin eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
|
||
Expands to the value of @var{expression}. The expansion is empty
|
||
if a problem is encountered while parsing the arguments. If specified,
|
||
@var{radix} and @var{width} control the format of the output.
|
||
|
||
Calculations are done with 32-bit signed numbers. Overflow silently
|
||
results in wraparound. A warning is issued if division by zero is
|
||
attempted, or if @var{expression} could not be parsed.
|
||
|
||
Expressions can contain the following operators, listed in order of
|
||
decreasing precedence.
|
||
|
||
@table @samp
|
||
@item ()
|
||
Parentheses
|
||
@item + - ~ !
|
||
Unary plus and minus, and bitwise and logical negation
|
||
@item **
|
||
Exponentiation
|
||
@item * / %
|
||
Multiplication, division, and modulo
|
||
@item + -
|
||
Addition and subtraction
|
||
@item << >>
|
||
Shift left or right
|
||
@item > >= < <=
|
||
Relational operators
|
||
@item == !=
|
||
Equality operators
|
||
@item &
|
||
Bitwise and
|
||
@item ^
|
||
Bitwise exclusive-or
|
||
@item |
|
||
Bitwise or
|
||
@item &&
|
||
Logical and
|
||
@item ||
|
||
Logical or
|
||
@end table
|
||
|
||
The macro @code{eval} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
All binary operators, except exponentiation, are left associative. C
|
||
operators that perform variable assignment, such as @samp{+=} or
|
||
@samp{--}, are not implemented, since @code{eval} only operates on
|
||
constants, not variables. Attempting to use them results in an error.
|
||
However, since traditional implementations treated @samp{=} as an
|
||
undocumented alias for @samp{==} as opposed to an assignment operator,
|
||
this usage is supported as a special case. Be aware that a future
|
||
version of @acronym{GNU} M4 may support assignment semantics as an
|
||
extension when @acronym{POSIX} mode is not requested, and that using
|
||
@samp{=} to check equality is not portable.
|
||
|
||
@comment status: 1
|
||
@example
|
||
eval(`2 = 2')
|
||
@error{}m4:stdin:1: Warning: recommend ==, not =, for equality operator
|
||
@result{}1
|
||
eval(`++0')
|
||
@error{}m4:stdin:2: invalid operator in eval: ++0
|
||
@result{}
|
||
eval(`0 |= 1')
|
||
@error{}m4:stdin:3: invalid operator in eval: 0 |= 1
|
||
@result{}
|
||
@end example
|
||
|
||
Note that some older @code{m4} implementations use @samp{^} as an
alternate operator for exponentiation, although @acronym{POSIX}
requires the C behavior of bitwise exclusive-or. The precedence of the
negation operators, @samp{~} and @samp{!}, was traditionally lower than
equality. The unary operators could not be used reliably more than once
on the same term without intervening parentheses. Traditionally, the
equality operators @samp{==} and @samp{!=} had the same precedence as
the relational operators such as @samp{<}, rather than a lower one;
this persisted through @acronym{GNU} M4 1.4.8. Starting with version
|
||
1.4.9, @acronym{GNU} M4 correctly follows @acronym{POSIX} precedence
|
||
rules. M4 scripts designed to be portable between releases must be
|
||
aware that parentheses may be required to enforce C precedence rules.
|
||
Likewise, division by zero, even in the unused branch of a
|
||
short-circuiting operator, is not always well-defined in other
|
||
implementations.
|
||
|
||
Following are some examples where the current version of M4 follows C
|
||
precedence rules, but where older versions and some other
|
||
implementations of @code{m4} require explicit parentheses to get the
|
||
correct result:
|
||
|
||
@example
|
||
eval(`1 == 2 > 0')
|
||
@result{}1
|
||
eval(`(1 == 2) > 0')
|
||
@result{}0
|
||
eval(`! 0 * 2')
|
||
@result{}2
|
||
eval(`! (0 * 2)')
|
||
@result{}1
|
||
eval(`1 | 1 ^ 1')
|
||
@result{}1
|
||
eval(`(1 | 1) ^ 1')
|
||
@result{}0
|
||
eval(`+ + - ~ ! ~ 0')
|
||
@result{}1
|
||
eval(`2 || 1 / 0')
|
||
@result{}1
|
||
eval(`0 || 1 / 0')
|
||
@error{}m4:stdin:9: divide by zero in eval: 0 || 1 / 0
|
||
@result{}
|
||
eval(`0 && 1 % 0')
|
||
@result{}0
|
||
eval(`2 && 1 % 0')
|
||
@error{}m4:stdin:11: modulo by zero in eval: 2 && 1 % 0
|
||
@result{}
|
||
@end example
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
As a @acronym{GNU} extension, the operator @samp{**} performs integral
|
||
exponentiation. The operator is right-associative, and if evaluated,
|
||
the exponent must be non-negative, and at least one of the arguments
|
||
must be non-zero, or a warning is issued.
|
||
|
||
@example
|
||
eval(`2 ** 3 ** 2')
|
||
@result{}512
|
||
eval(`(2 ** 3) ** 2')
|
||
@result{}64
|
||
eval(`0 ** 1')
|
||
@result{}0
|
||
eval(`2 ** 0')
|
||
@result{}1
|
||
eval(`0 ** 0')
|
||
@error{}m4:stdin:5: divide by zero in eval: 0 ** 0
@result{}
|
||
eval(`4 ** -2')
|
||
@error{}m4:stdin:6: negative exponent in eval: 4 ** -2
|
||
@result{}
|
||
@end example
|
||
|
||
Within @var{expression} (but not @var{radix} or @var{width}), numbers
|
||
without a special prefix are decimal. A simple @samp{0} prefix
|
||
introduces an octal number. @samp{0x} introduces a hexadecimal number.
|
||
As @acronym{GNU} extensions, @samp{0b} introduces a binary number.
|
||
@samp{0r} introduces a number expressed in any radix between 1 and 36:
|
||
the prefix should be immediately followed by the decimal expression of
|
||
the radix, a colon, then the digits making the number. For radix 1,
|
||
leading zeros are ignored, and all remaining digits must be @samp{1};
|
||
for all other radices, the digits are @samp{0}, @samp{1}, @samp{2},
|
||
@dots{}. Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up
|
||
to @samp{z}. Lower and upper case letters can be used interchangeably
in number prefixes and as number digits.
|
||
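
The prefixes described above can be tried one at a time. This is only a
small sketch; each result is the decimal value of the literal on the
preceding line.

@example
eval(`010')
@result{}8
eval(`0x1f')
@result{}31
eval(`0b101')
@result{}5
eval(`0r36:z')
@result{}35
eval(`0r1:111')
@result{}3
eval(`0R2:101 + 0XfF')
@result{}260
@end example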
|
||
Parentheses may be used to group subexpressions whenever needed. For the
|
||
relational operators, a true relation returns @code{1}, and a false
|
||
relation returns @code{0}.
|
||
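
Because a relation yields @code{1} or @code{0}, the result of
@code{eval} can be compared with @code{ifelse}, or used as an ordinary
number inside a larger expression, as this brief sketch shows:

@example
eval(`3 < 5')
@result{}1
eval(`3 > 5')
@result{}0
ifelse(eval(`3 < 5 && 5 < 7'), `1', `in order', `out of order')
@result{}in order
eval(`(3 < 5) + (5 < 7)')
@result{}2
@end example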
|
||
Here are a few examples of use of @code{eval}.
|
||
|
||
@example
|
||
eval(`-3 * 5')
|
||
@result{}-15
|
||
eval(`-99 / 10')
|
||
@result{}-9
|
||
eval(`-99 % 10')
|
||
@result{}-9
|
||
eval(`99 % -10')
|
||
@result{}9
|
||
eval(index(`Hello world', `llo') >= 0)
|
||
@result{}1
|
||
eval(`0r1:0111 + 0b100 + 0r3:12')
|
||
@result{}12
|
||
define(`square', `eval(`($1) ** 2')')
|
||
@result{}
|
||
square(`9')
|
||
@result{}81
|
||
square(square(`5')` + 1')
|
||
@result{}676
|
||
define(`foo', `666')
|
||
@result{}
|
||
eval(`foo / 6')
|
||
@error{}m4:stdin:11: bad expression in eval: foo / 6
|
||
@result{}
|
||
eval(foo / 6)
|
||
@result{}111
|
||
@end example
|
||
|
||
As the last two lines show, @code{eval} does not handle macro
|
||
names, even if they expand to a valid expression (or part of a valid
|
||
expression). Therefore all macros must be expanded before they are
|
||
passed to @code{eval}.
|
||
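
One way to arrange for that expansion is to leave the macro names
outside the quotes while quoting only the operators. The names
@code{width}, @code{height} and @code{area} in this sketch are
hypothetical and exist only for illustration.

@example
define(`width', `8')
@result{}
define(`height', `3')
@result{}
eval(width` * 'height)
@result{}24
define(`area', `eval(($1) * ($2))')
@result{}
area(width, height` + 1')
@result{}32
@end example

The parentheses around @samp{$1} and @samp{$2} keep an argument such as
@samp{3 + 1} from being split apart by the higher precedence of
@samp{*}.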
|
||
Some calculations are not portable to other implementations, since they
|
||
have undefined semantics in C, but @acronym{GNU} @code{m4} has
|
||
well-defined behavior on overflow. When shifting, an out-of-range shift
amount is implicitly brought into the range 0 through 31 by a bit-wise
and with @samp{0x1f}.
|
||
|
||
@example
|
||
define(`max_int', eval(`0x7fffffff'))
|
||
@result{}
|
||
define(`min_int', incr(max_int))
|
||
@result{}
|
||
eval(min_int` < 0')
|
||
@result{}1
|
||
eval(max_int` > 0')
|
||
@result{}1
|
||
ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
|
||
@result{}overflow occurred
|
||
min_int
|
||
@result{}-2147483648
|
||
eval(`0x80000000 % -1')
|
||
@result{}0
|
||
eval(`-4 >> 1')
|
||
@result{}-2
|
||
eval(`-4 >> 33')
|
||
@result{}-2
|
||
@end example
|
||
|
||
If @var{radix} is specified, it specifies the radix to be used in the
|
||
expansion. The default radix is 10; this is also the case if
|
||
@var{radix} is the empty string. A warning results if the radix is
|
||
outside the range of 1 through 36, inclusive. The result of @code{eval}
|
||
is always taken to be signed. No radix prefix is output, and for
|
||
radices greater than 10, the digits are lower case. The @var{width}
|
||
argument specifies the minimum output width, excluding any negative
|
||
sign. The result is zero-padded to extend the expansion to the
|
||
requested width. A warning results if the width is negative. If
|
||
@var{radix} or @var{width} is out of bounds, the expansion of
|
||
@code{eval} is empty.
|
||
|
||
@example
|
||
eval(`666', `10')
|
||
@result{}666
|
||
eval(`666', `11')
|
||
@result{}556
|
||
eval(`666', `6')
|
||
@result{}3030
|
||
eval(`666', `6', `10')
|
||
@result{}0000003030
|
||
eval(`-666', `6', `10')
|
||
@result{}-0000003030
|
||
eval(`10', `', `0')
|
||
@result{}10
|
||
`0r1:'eval(`10', `1', `11')
|
||
@result{}0r1:01111111111
|
||
eval(`10', `16')
|
||
@result{}a
|
||
eval(`1', `37')
|
||
@error{}m4:stdin:9: radix 37 in builtin `eval' out of range
|
||
@result{}
|
||
eval(`1', , `-1')
|
||
@error{}m4:stdin:10: negative width to builtin `eval'
|
||
@result{}
|
||
eval()
|
||
@error{}m4:stdin:11: empty string treated as 0 in builtin `eval'
|
||
@result{}0
|
||
@end example
|
||
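
Since no radix prefix is output, a small wrapper can add one. The
following sketch defines a hypothetical @code{hex} macro (not a builtin)
that takes a value and a width; the empty quotes after @samp{0x} keep
the prefix from merging with the macro name @code{eval} into a single
word.

@example
define(`hex', `0x`'eval(`$1', `16', `$2')')
@result{}
hex(`255', `4')
@result{}0x00ff
eval(`0b1010', `2', `8')
@result{}00001010
eval(`0xff', `8')
@result{}377
eval(`35', `36')
@result{}z
@end example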
|
||
@node Shell commands
|
||
@chapter Macros for running shell commands
|
||
|
||
@cindex UNIX commands, running
|
||
@cindex executing shell commands
|
||
@cindex running shell commands
|
||
@cindex shell commands, running
|
||
@cindex commands, running shell
|
||
There are a few builtin macros in @code{m4} that allow you to run shell
|
||
commands from within @code{m4}.
|
||
|
||
Note that the definition of a valid shell command is system dependent.
|
||
On UNIX systems, this is typically @command{/bin/sh}. On other
systems, such as native Windows, the shell understands a different
command syntax. Some examples in this chapter assume
|
||
@command{/bin/sh}, and also demonstrate how to quit early with a known
|
||
exit value if this is not the case.
|
||
|
||
@menu
|
||
* Platform macros:: Determining the platform
|
||
* Syscmd:: Executing simple commands
|
||
* Esyscmd:: Reading the output of commands
|
||
* Sysval:: Exit status
|
||
* Mkstemp:: Making temporary files
|
||
@end menu
|
||
|
||
@node Platform macros
|
||
@section Determining the platform
|
||
|
||
@cindex platform macros
|
||
Sometimes it is desirable for an input file to know which platform
|
||
@code{m4} is running on. @acronym{GNU} @code{m4} provides several
|
||
macros that are predefined to expand to the empty string; checking for
|
||
their existence will confirm platform details.
|
||
|
||
@deffn {Optional builtin} __gnu__
|
||
@deffnx {Optional builtin} __os2__
|
||
@deffnx {Optional builtin} os2
|
||
@deffnx {Optional builtin} __unix__
|
||
@deffnx {Optional builtin} unix
|
||
@deffnx {Optional builtin} __windows__
|
||
@deffnx {Optional builtin} windows
|
||
Each of these macros is conditionally defined as needed to describe the
|
||
environment of @code{m4}. If defined, each macro expands to the empty
|
||
string. For now, these macros silently ignore all arguments, but in a
|
||
future release of M4, they might warn if arguments are present.
|
||
@end deffn
|
||
|
||
When @acronym{GNU} extensions are in effect (that is, when you did not
|
||
use the @option{-G} option, @pxref{Limits control, , Invoking m4}),
|
||
@acronym{GNU} @code{m4} will define the macro @code{@w{__gnu__}} to
|
||
expand to the empty string.
|
||
|
||
@example
|
||
$ @kbd{m4}
|
||
__gnu__
|
||
@result{}
|
||
__gnu__(`ignored')
|
||
@result{}
|
||
Extensions are ifdef(`__gnu__', `active', `inactive')
|
||
@result{}Extensions are active
|
||
@end example
|
||
|
||
@comment options: -G
|
||
@example
|
||
$ @kbd{m4 -G}
|
||
__gnu__
|
||
@result{}__gnu__
|
||
__gnu__(`ignored')
|
||
@result{}__gnu__(ignored)
|
||
Extensions are ifdef(`__gnu__', `active', `inactive')
|
||
@result{}Extensions are inactive
|
||
@end example
|
||
|
||
On UNIX systems, @acronym{GNU} @code{m4} will define @code{@w{__unix__}}
|
||
by default, or @code{unix} when the @option{-G} option is specified.
|
||
|
||
On native Windows systems, @acronym{GNU} @code{m4} will define
|
||
@code{@w{__windows__}} by default, or @code{windows} when the
|
||
@option{-G} option is specified.
|
||
|
||
On OS/2 systems, @acronym{GNU} @code{m4} will define @code{@w{__os2__}}
|
||
by default, or @code{os2} when the @option{-G} option is specified.
|
||
|
||
If @acronym{GNU} @code{m4} does not provide a platform macro for your system,
|
||
please report that as a bug.
|
||
|
||
@example
|
||
define(`provided', `0')
|
||
@result{}
|
||
ifdef(`__unix__', `define(`provided', incr(provided))')
|
||
@result{}
|
||
ifdef(`__windows__', `define(`provided', incr(provided))')
|
||
@result{}
|
||
ifdef(`__os2__', `define(`provided', incr(provided))')
|
||
@result{}
|
||
provided
|
||
@result{}1
|
||
@end example
|
||
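
The same macros can be nested inside @code{ifdef} to select
platform-specific text. The output shown in this sketch is what would be
expected on a UNIX system with @acronym{GNU} extensions enabled; other
platforms select a different branch, and under @option{-G} none of the
double-underscore names are defined, so the last branch is taken.

@comment ignore
@example
ifdef(`__unix__', `UNIX', `ifdef(`__windows__', `Windows',
`ifdef(`__os2__', `OS/2', `unknown')')') platform
@result{}UNIX platform
@end example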
|
||
@node Syscmd
|
||
@section Executing simple commands
|
||
|
||
Any shell command can be executed, using @code{syscmd}:
|
||
|
||
@deffn Builtin syscmd (@var{shell-command})
|
||
Executes @var{shell-command} as a shell command.
|
||
|
||
The expansion of @code{syscmd} is void, @emph{not} the output from
|
||
@var{shell-command}! Output or error messages from @var{shell-command}
|
||
are not read by @code{m4}. @xref{Esyscmd}, if you need to process the
|
||
command output.
|
||
|
||
Prior to executing the command, @code{m4} flushes its buffers.
|
||
The default standard input, output and error of @var{shell-command} are
|
||
the same as those of @code{m4}.
|
||
|
||
By default, the @var{shell-command} will be used as the argument to the
|
||
@option{-c} option of the @command{/bin/sh} shell (or the version of
|
||
@command{sh} specified by @samp{command -p getconf PATH}, if your system
|
||
supports that). If you prefer a different shell, the
|
||
@command{configure} script can be given the option
|
||
@option{--with-syscmd-shell=@var{location}} to set the location of an
|
||
alternative shell at @acronym{GNU} @code{m4} installation; the
|
||
alternative shell must still support @option{-c}.
|
||
|
||
The macro @code{syscmd} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
define(`foo', `FOO')
|
||
@result{}
|
||
syscmd(`echo foo')
|
||
@result{}foo
|
||
@result{}
|
||
@end example
|
||
|
||
Note how the output keeps the trailing newline of the command, followed
by the newline that appeared after the macro; the expansion of
@code{syscmd} itself is void.
|
||
|
||
The following is an example of @var{shell-command} using the same
|
||
standard input as @code{m4}:
|
||
|
||
@comment ignore
|
||
@example
|
||
$ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
|
||
@result{}
|
||
@end example
|
||
|
||
@ignore
|
||
@comment If the user types the example below with stdin being an
|
||
@comment interactive terminal, then cat will hang waiting for additional
|
||
@comment input after m4 has exited. But the testsuite is using a pipe
|
||
@comment for stdin. Hence, we have two versions - the one we feed the
|
||
@comment testsuite below, and the one we display to the user above that
|
||
@comment more accurately shows what the testsuite is really doing but
|
||
@comment which the testsuite cannot parse.
|
||
|
||
@example
|
||
m4wrap(`syscmd(`cat')')
|
||
@result{}
|
||
^D
|
||
@end example
|
||
@end ignore
|
||
|
||
This example tells @code{m4} to read all of its input before executing the wrapped
|
||
text, then hand a valid (albeit emptied) pipe as standard input for the
|
||
@code{cat} subcommand. Therefore, you should be careful when using
|
||
standard input (either by specifying no files, or by passing @samp{-} as
|
||
a file name on the command line, @pxref{Command line files, , Invoking
|
||
m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
|
||
that consume data from standard input. When standard input is a
|
||
seekable file, the subprocess will pick up with the next character not
|
||
yet processed by @code{m4}; when it is a pipe or other non-seekable
|
||
file, there is no guarantee how much data will already be buffered by
|
||
@code{m4} and thus unavailable to the child.
|
||
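
One way to sidestep the question, sketched here under the assumption of
a @command{/bin/sh}-style shell and the existence of @file{/dev/null},
is to give the child its own standard input, so that it cannot compete
with @code{m4} for data:

@example
syscmd(`cat < /dev/null')
@result{}
sysval
@result{}0
@end example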
|
||
@node Esyscmd
|
||
@section Reading the output of commands
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
If you want @code{m4} to read the output of a shell command, use
|
||
@code{esyscmd}:
|
||
|
||
@deffn Builtin esyscmd (@var{shell-command})
|
||
Expands to the standard output of the shell command
|
||
@var{shell-command}.
|
||
|
||
Prior to executing the command, @code{m4} flushes its buffers.
|
||
The default standard input and standard error of @var{shell-command} are
|
||
the same as those of @code{m4}. The error output of @var{shell-command}
|
||
is not a part of the expansion: it will appear along with the error
|
||
output of @code{m4}.
|
||
|
||
By default, the @var{shell-command} will be used as the argument to the
|
||
@option{-c} option of the @command{/bin/sh} shell (or the version of
|
||
@command{sh} specified by @samp{command -p getconf PATH}, if your system
|
||
supports that). If you prefer a different shell, the
|
||
@command{configure} script can be given the option
|
||
@option{--with-syscmd-shell=@var{location}} to set the location of an
|
||
alternative shell at @acronym{GNU} @code{m4} installation; the
|
||
alternative shell must still support @option{-c}.
|
||
|
||
The macro @code{esyscmd} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
define(`foo', `FOO')
|
||
@result{}
|
||
esyscmd(`echo foo')
|
||
@result{}FOO
|
||
@result{}
|
||
@end example
|
||
|
||
Note how the expansion of @code{esyscmd} keeps the trailing newline of
|
||
the command, as well as using the newline that appeared after the macro.
|
||
|
||
Just as with @code{syscmd}, care must be exercised when sharing standard
|
||
input between @code{m4} and the child process of @code{esyscmd}.
|
||
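
The trailing newline is part of the expansion, so it is counted by
@code{len} and can be stripped with @code{translit}. This sketch assumes
a @command{/bin/sh}-style shell; note that the command output is
rescanned, so @samp{foo} is expanded in both cases.

@example
define(`foo', `FOO')
@result{}
len(esyscmd(`echo foo'))
@result{}4
translit(esyscmd(`echo foo'), `
')
@result{}FOO
@end example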
|
||
@node Sysval
|
||
@section Exit status
|
||
|
||
@cindex UNIX commands, exit status from
|
||
@cindex exit status from shell commands
|
||
@cindex shell commands, exit status from
|
||
@cindex commands, exit status from shell
|
||
@cindex status of shell commands
|
||
To see whether a shell command succeeded, use @code{sysval}:
|
||
|
||
@deffn Builtin sysval
|
||
Expands to the exit status of the last shell command run with
|
||
@code{syscmd} or @code{esyscmd}. Expands to 0 if no command has been
|
||
run yet.
|
||
@end deffn
|
||
|
||
@example
|
||
sysval
|
||
@result{}0
|
||
syscmd(`false')
|
||
@result{}
|
||
ifelse(sysval, `0', `zero', `non-zero')
|
||
@result{}non-zero
|
||
syscmd(`exit 2')
|
||
@result{}
|
||
sysval
|
||
@result{}2
|
||
syscmd(`true')
|
||
@result{}
|
||
sysval
|
||
@result{}0
|
||
esyscmd(`false')
|
||
@result{}
|
||
ifelse(sysval, `0', `zero', `non-zero')
|
||
@result{}non-zero
|
||
esyscmd(`exit 2')
|
||
@result{}
|
||
sysval
|
||
@result{}2
|
||
esyscmd(`true')
|
||
@result{}
|
||
sysval
|
||
@result{}0
|
||
@end example
|
||
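
Since @code{sysval} expands to a plain number, it combines naturally
with @code{ifelse}. The macro @code{check} below is a hypothetical
helper, not a builtin, and the sketch assumes that the shell provides
@command{true} and @command{false}.

@example
define(`check', `syscmd(`$1')ifelse(sysval, `0', `$1: ok', `$1: failed')')
@result{}
check(`true')
@result{}true: ok
check(`false')
@result{}false: failed
@end example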
|
||
@code{sysval} results in 127 if there was a problem executing the
|
||
command, for example, if the system-imposed argument length is exceeded,
|
||
or if there were not enough resources to fork. It is not possible to
|
||
distinguish between failed execution and successful execution that had
|
||
an exit status of 127.
|
||
|
||
On UNIX platforms, where it is possible to detect when command execution
|
||
is terminated by a signal, rather than a normal exit, the result is the
|
||
signal number shifted left by eight bits.
|
||
|
||
@comment This test has difficulties being portable, even on platforms
|
||
@comment where syscmd invokes /bin/sh. Kill is not portable with signal
|
||
@comment names. According to autoconf, the only portable signal numbers
|
||
@comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM). But
|
||
@comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
|
||
@comment exits normally rather than letting the signal terminate it).
|
||
@comment Also, TERM is flaky, as it can also kill the running m4 on
|
||
@comment systems where /bin/sh does not create its own process group.
|
||
@comment And PIPE is unreliable, since people tend to run with it
|
||
@comment ignored, with m4 inheriting that choice. That leaves KILL as
|
||
@comment the only signal we can reliably test.
|
||
@example
|
||
dnl This test assumes kill is a shell builtin, and that signals are
|
||
dnl recognizable.
|
||
ifdef(`__unix__', ,
|
||
`errprint(` skipping: syscmd does not have unix semantics
|
||
')m4exit(`77')')dnl
|
||
syscmd(`kill -9 $$')
|
||
@result{}
|
||
sysval
|
||
@result{}2304
|
||
syscmd()
|
||
@result{}
|
||
sysval
|
||
@result{}0
|
||
esyscmd(`kill -9 $$')
|
||
@result{}
|
||
sysval
|
||
@result{}2304
|
||
@end example
|
||
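
Given that encoding, a signal number can be recovered from a
@code{sysval} result with @code{eval}. The helper names
@code{exit_signal} and @code{exit_code} are assumptions made for this
sketch only; a literal value is passed in so that the example does not
depend on actually killing a process.

@example
define(`exit_signal', `eval(`($1) >> 8')')
@result{}
define(`exit_code', `eval(`($1) & 0xff')')
@result{}
exit_signal(`2304')
@result{}9
exit_code(`2304')
@result{}0
exit_signal(`2') exit_code(`2')
@result{}0 2
@end example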
|
||
@node Mkstemp
|
||
@section Making temporary files
|
||
|
||
@cindex temporary file names
|
||
@cindex files, names of temporary
|
||
Commands specified to @code{syscmd} or @code{esyscmd} might need a
|
||
temporary file, for output or for some other purpose. There is a
|
||
builtin macro, @code{mkstemp}, for making a temporary file:
|
||
|
||
@deffn Builtin mkstemp (@var{template})
|
||
@deffnx Builtin maketemp (@var{template})
|
||
Expands to the quoted name of a new, empty file, made from the string
|
||
@var{template}, which should end with the string @samp{XXXXXX}. The six
|
||
@samp{X} characters are then replaced with random characters matching
|
||
the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
|
||
name unique. If fewer than six @samp{X} characters are found at the end
|
||
of @var{template}, the result will be longer than the template. The
|
||
created file will have access permissions as if by @kbd{chmod =rw,go=},
|
||
meaning that the current umask of the @code{m4} process is taken into
|
||
account, and at most only the current user can read and write the file.
|
||
|
||
The traditional behavior, standardized by @acronym{POSIX}, is that
|
||
@code{maketemp} merely replaces the trailing @samp{X} with the process
|
||
id, without creating a file or quoting the expansion, and without
|
||
ensuring that the resulting
|
||
string is a unique file name. In part, this means that using the same
|
||
@var{template} twice in the same input file will result in the same
|
||
expansion. This behavior is a security hole, as it is very easy for
|
||
another process to guess the name that will be generated, and thus
|
||
interfere with a subsequent use of @code{syscmd} trying to manipulate
|
||
that file name. Hence, @acronym{POSIX} has recommended that all new
|
||
implementations of @code{m4} provide the secure @code{mkstemp} builtin,
|
||
and that users of @code{m4} check for its existence.
|
||
|
||
The expansion is void and an error issued if a temporary file could
|
||
not be created.
|
||
|
||
The macros @code{mkstemp} and @code{maketemp} are recognized only with
|
||
parameters.
|
||
@end deffn
|
||
|
||
If you try this next example, you will most likely get different output
|
||
for the two file names, since the replacement characters are randomly
|
||
chosen:
|
||
|
||
@comment ignore
|
||
@example
|
||
$ @kbd{m4}
|
||
define(`tmp', `oops')
|
||
@result{}
|
||
maketemp(`/tmp/fooXXXXXX')
|
||
@result{}/tmp/fooa07346
|
||
ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
|
||
`define(`mkstemp', defn(`maketemp'))dnl
|
||
errprint(`warning: potentially insecure maketemp implementation
|
||
')')
|
||
@result{}
|
||
mkstemp(`doc')
|
||
@result{}docQv83Uw
|
||
@end example
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
Unless you use the @option{--traditional} command line option (or
|
||
@option{-G}, @pxref{Limits control, , Invoking m4}), the @acronym{GNU}
|
||
version of @code{maketemp} is secure. This means that passing the same
template to multiple calls will generate multiple files. However, we
|
||
recommend that you use the new @code{mkstemp} macro, introduced in
|
||
@acronym{GNU} M4 1.4.8, which is secure even in traditional mode. Also,
|
||
as of M4 1.4.11, the secure implementation quotes the resulting file
|
||
name, so that you are guaranteed to know what file was created even if
|
||
the random file name happens to match an existing macro. Notice that
|
||
this example is careful to use @code{defn} to avoid unintended expansion
|
||
of @samp{foo}.
|
||
|
||
@example
|
||
$ @kbd{m4}
|
||
define(`foo', `errprint(`oops')')
|
||
@result{}
|
||
syscmd(`rm -f foo-??????')sysval
|
||
@result{}0
|
||
define(`file1', maketemp(`foo-XXXXXX'))dnl
|
||
ifelse(esyscmd(`echo \` foo-?????? \''), ` foo-?????? ',
|
||
`no file', `created')
|
||
@result{}created
|
||
define(`file2', maketemp(`foo-XX'))dnl
|
||
define(`file3', mkstemp(`foo-XXXXXX'))dnl
|
||
ifelse(len(defn(`file1')), len(defn(`file2')),
|
||
`same length', `different')
|
||
@result{}same length
|
||
ifelse(defn(`file1'), defn(`file2'), `same', `different file')
|
||
@result{}different file
|
||
ifelse(defn(`file2'), defn(`file3'), `same', `different file')
|
||
@result{}different file
|
||
ifelse(defn(`file1'), defn(`file3'), `same', `different file')
|
||
@result{}different file
|
||
syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
|
||
@result{}
|
||
sysval
|
||
@result{}0
|
||
@end example
|
||
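
A typical round trip is to create the file, let one command write to it,
read it back with another, and then remove it. This is a minimal sketch
that assumes a @command{/bin/sh}-style shell and a writable @file{/tmp}
directory; the macro name @code{tmpfile} is hypothetical.

@example
define(`tmpfile', mkstemp(`/tmp/m4-XXXXXX'))dnl
syscmd(`echo scratch > 'defn(`tmpfile'))dnl
esyscmd(`cat 'defn(`tmpfile'))dnl
syscmd(`rm -f 'defn(`tmpfile'))sysval
@result{}scratch
@result{}0
@end example

As before, @code{defn} is used each time the name is needed, so that a
random file name that happens to match a macro is not expanded.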
|
||
@ignore
|
||
@c Not worth documenting, but make sure we don't leave trailing NUL in
|
||
@c the expansion.
|
||
|
||
@example
|
||
syscmd(`rm -rf foodir')sysval
|
||
@result{}0
|
||
syscmd(`mkdir foodir')sysval
|
||
@result{}0
|
||
len(mkstemp(`foodir/fooXXXXX'))
|
||
@result{}16
|
||
syscmd(`rm -r foodir')sysval
|
||
@result{}0
|
||
@end example
|
||
|
||
@c Likewise, and ensure that traditional mode leaves the result unquoted
|
||
@c without creating a file.
|
||
|
||
@comment options: -G
|
||
@example
|
||
syscmd(`rm -f foo-*')sysval
|
||
@result{}0
|
||
len(maketemp(`foo-XXXXX'))
|
||
@error{}m4:stdin:2: recommend using mkstemp instead
|
||
@result{}9
|
||
define(`abc', `def')
|
||
@result{}
|
||
maketemp(`foo-abc')
|
||
@result{}foo-def
|
||
@error{}m4:stdin:4: recommend using mkstemp instead
|
||
syscmd(`test -f foo-*')ifelse(sysval, `0', `0', `1')
|
||
@result{}1
|
||
@end example
|
||
@end ignore
|
||
|
||
@node Miscellaneous
|
||
@chapter Miscellaneous builtin macros
|
||
|
||
This chapter describes various builtins that do not really belong in
|
||
any of the previous chapters.
|
||
|
||
@menu
|
||
* Errprint:: Printing error messages
|
||
* Location:: Printing current location
|
||
* M4exit:: Exiting from @code{m4}
|
||
@end menu
|
||
|
||
@node Errprint
|
||
@section Printing error messages
|
||
|
||
@cindex printing error messages
|
||
@cindex error messages, printing
|
||
@cindex messages, printing error
|
||
@cindex standard error, output to
|
||
You can print error messages using @code{errprint}:
|
||
|
||
@deffn Builtin errprint (@var{message}, @dots{})
|
||
Prints @var{message} and the rest of the arguments to standard error,
|
||
separated by spaces. Standard error is used, regardless of the
|
||
@option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
|
||
|
||
The expansion of @code{errprint} is void.
|
||
The macro @code{errprint} is recognized only with parameters.
|
||
@end deffn
|
||
|
||
@example
|
||
errprint(`Invalid arguments to forloop
|
||
')
|
||
@error{}Invalid arguments to forloop
|
||
@result{}
|
||
errprint(`1')errprint(`2',`3
|
||
')
|
||
@error{}12 3
|
||
@result{}
|
||
@end example
|
||
|
||
A trailing newline is @emph{not} printed automatically, so it should be
|
||
supplied as part of the argument, as in the example. Unfortunately, the
|
||
exact output of @code{errprint} is not very portable to other @code{m4}
|
||
implementations: @acronym{POSIX} requires that all arguments be printed,
|
||
but some implementations of @code{m4} only print the first.
|
||
Furthermore, some @acronym{BSD} implementations always append a newline
|
||
for each @code{errprint} call, regardless of whether the last argument
|
||
already had one, and @acronym{POSIX} is silent on whether this is
|
||
acceptable.
|
||
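
One way to reduce the first of these differences is to pass the whole
message as a single quoted argument, so that it no longer matters
whether an implementation prints the remaining arguments. The following
sketch shows both forms as @acronym{GNU} @code{m4} handles them:

@example
errprint(`a', `b', `c
')
@error{}a b c
@result{}
errprint(`one, two, three
')
@error{}one, two, three
@result{}
@end example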
|
||
@node Location
|
||
@section Printing current location
|
||
|
||
@cindex location, input
|
||
@cindex input location
|
||
To make it possible to specify the location of an error, three
|
||
utility builtins exist:
|
||
|
||
@deffn Builtin __file__
|
||
@deffnx Builtin __line__
|
||
@deffnx Builtin __program__
|
||
Expand to the quoted name of the current input file, the
|
||
current input line number in that file, and the quoted name of the
|
||
current invocation of @code{m4}.
|
||
@end deffn
|
||
|
||
@example
|
||
errprint(__program__:__file__:__line__: `input error
|
||
')
|
||
@error{}m4:stdin:1: input error
|
||
@result{}
|
||
@end example
|
||
|
||
Line numbers start at 1 for each file. If the file was found due to the
|
||
@option{-I} option or @env{M4PATH} environment variable, that is
|
||
reflected in the file name. The syncline option (@option{-s},
|
||
@pxref{Preprocessor features, , Invoking m4}), and the
|
||
@samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debug Levels}),
|
||
also use this notion of current file and line. Redefining the three
|
||
location macros has no effect on syncline, debug, warning, or error
|
||
message output.
|
||
|
||
This example reuses the file @file{incl.m4} mentioned earlier
|
||
(@pxref{Include}):
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
define(`foo', ``$0' called at __file__:__line__')
|
||
@result{}
|
||
foo
|
||
@result{}foo called at stdin:2
|
||
include(`incl.m4')
|
||
@result{}Include file start
|
||
@result{}foo called at examples/incl.m4:2
|
||
@result{}Include file end
|
||
@result{}
|
||
@end example
|
||
|
||
The location of macros invoked during the rescanning of macro expansion
|
||
text corresponds to the location in the file where the expansion was
|
||
triggered, regardless of how many newline characters the expansion text
|
||
contains. As of @acronym{GNU} M4 1.4.8, the location of text wrapped
|
||
with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
|
||
@code{m4wrap} was invoked. Previous versions, however, behaved as
|
||
though wrapped text came from line 0 of the file ``''.
|
||
|
||
@example
|
||
define(`echo', `$@@')
|
||
@result{}
|
||
define(`foo', `echo(__line__
|
||
__line__)')
|
||
@result{}
|
||
echo(__line__
|
||
__line__)
|
||
@result{}4
|
||
@result{}5
|
||
m4wrap(`foo
|
||
')
|
||
@result{}
|
||
foo(errprint(__line__
|
||
__line__
|
||
))
|
||
@error{}8
|
||
@error{}9
|
||
@result{}8
|
||
@result{}8
|
||
__line__
|
||
@result{}11
|
||
m4wrap(`__line__
|
||
')
|
||
@result{}
|
||
^D
|
||
@result{}12
|
||
@result{}6
|
||
@result{}6
|
||
@end example
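
The difference between quoting a location macro and leaving it unquoted
can be sketched as follows; the macro names @code{here} and
@code{there} are hypothetical. A quoted @code{@w{__line__}} is expanded
each time the macro is used, whereas an unquoted one is expanded once,
while the arguments of @code{define} are being collected.

@example
define(`here', `__file__:__line__')
@result{}
define(`there', __file__:__line__)
@result{}
here
@result{}stdin:3
there
@result{}stdin:2
@end example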
|
||
|
||
The @code{@w{__program__}} macro behaves like @samp{$0} in shell
|
||
terminology. If you invoke @code{m4} through an absolute path or a link
|
||
with a different spelling, rather than by relying on a @env{PATH} search
|
||
for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
|
||
The intent is that you can use it to produce error messages with the
|
||
same formatting that @code{m4} produces internally. It can also be used
|
||
within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
|
||
@code{m4} that is currently running, rather than whatever version of
|
||
@code{m4} happens to be first in @env{PATH}. It was first introduced in
|
||
@acronym{GNU} M4 1.4.6.
|
||
|
||
@node M4exit
|
||
@section Exiting from @code{m4}
|
||
|
||
@cindex exiting from @code{m4}
|
||
@cindex status, setting @code{m4} exit
|
||
If you need to exit from @code{m4} before the entire input has been
|
||
read, you can use @code{m4exit}:
|
||
|
||
@deffn Builtin m4exit (@dvar{code, 0})
|
||
Causes @code{m4} to exit, with exit status @var{code}. If @var{code} is
|
||
left out, the exit status is zero. If @var{code} cannot be parsed, or
|
||
is outside the range of 0 to 255, the exit status is one. No further
|
||
input is read, and all wrapped and diverted text is discarded.
|
||
@end deffn
|
||
|
||
@example
|
||
m4wrap(`This text is lost due to `m4exit'.')
|
||
@result{}
|
||
divert(`1') So is this.
|
||
divert
|
||
@result{}
|
||
m4exit And this is never read.
|
||
@end example
|
||
|
||
A common use of this is to abort processing:
|
||
|
||
@deffn Composite fatal_error (@var{message})
|
||
Abort processing with an error message and non-zero status. Prefix
|
||
@var{message} with details about where the error occurred, and print the
|
||
resulting string to standard error.
|
||
@end deffn
|
||
|
||
@comment status: 1
|
||
@example
|
||
define(`fatal_error',
|
||
`errprint(__program__:__file__:__line__`: fatal error: $*
|
||
')m4exit(`1')')
|
||
@result{}
|
||
fatal_error(`this is a BAD one, buster')
|
||
@error{}m4:stdin:4: fatal error: this is a BAD one, buster
|
||
@end example
|
||
|
||
After this macro call, @code{m4} will exit with exit status 1. This macro
|
||
is only intended for error exits, since the normal exit procedures are
|
||
not followed, i.e., diverted text is not undiverted, and saved text
|
||
(@pxref{M4wrap}) is not reread. (This macro could be made more robust
|
||
to earlier versions of @code{m4}. You should try to see if you can find
|
||
weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
|
||
|
||
Note that it is still possible for the exit status to be different from
what was requested by @code{m4exit}. If @code{m4} detects some other
error, such as a write error on standard output, the exit status will be
non-zero even if @code{m4exit} requested zero.
|
||
|
||
If standard input is seekable, then the file will be positioned at the
|
||
next unread character. If it is a pipe or other non-seekable file,
|
||
then there are no guarantees how much data @code{m4} might have read
|
||
into buffers, and thus discarded.
|
||
|
||
@node Frozen files
|
||
@chapter Fast loading of frozen state
|
||
|
||
Some bigger @code{m4} applications may be built over a common base
containing hundreds of definitions and other costly initializations.
Usually, the common base is kept in one or more declarative files,
which are either listed on each @code{m4} invocation before the
user's input file, or pulled in by each input file with @code{include}.
|
||
|
||
Reading the common base of a big application, over and over again, may
|
||
be time consuming. @acronym{GNU} @code{m4} offers some machinery to
|
||
speed up the start of an application using lengthy common bases.
|
||
|
||
@menu
|
||
* Using frozen files:: Using frozen files
|
||
* Frozen file format:: Frozen file format
|
||
@end menu
|
||
|
||
@node Using frozen files
|
||
@section Using frozen files
|
||
|
||
@cindex fast loading of frozen files
|
||
@cindex frozen files for fast loading
|
||
@cindex initialization, frozen state
|
||
@cindex dumping into frozen file
|
||
@cindex reloading a frozen file
|
||
@cindex @acronym{GNU} extensions
|
||
Suppose a user has a library of @code{m4} initializations in
|
||
@file{base.m4}, which is then used with multiple input files:
|
||
|
||
@comment ignore
|
||
@example
$ @kbd{m4 base.m4 input1.m4}
$ @kbd{m4 base.m4 input2.m4}
$ @kbd{m4 base.m4 input3.m4}
@end example
|
||
|
||
Rather than spending time parsing the fixed contents of @file{base.m4}
every time, the user can instead execute:
|
||
|
||
@comment ignore
|
||
@example
$ @kbd{m4 -F base.m4f base.m4}
@end example
|
||
|
||
@noindent
|
||
once, and further execute, as often as needed:
|
||
|
||
@comment ignore
|
||
@example
$ @kbd{m4 -R base.m4f input1.m4}
$ @kbd{m4 -R base.m4f input2.m4}
$ @kbd{m4 -R base.m4f input3.m4}
@end example
|
||
|
||
@noindent
|
||
with the varying input. The first call, containing the @option{-F}
|
||
option, only reads and executes file @file{base.m4}, defining
|
||
various application macros and computing other initializations.
|
||
Once the input file @file{base.m4} has been completely processed, @acronym{GNU}
|
||
@code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
|
||
file which contains a kind of snapshot of the @code{m4} internal state.
|
||
|
||
Later calls, containing the @option{-R} option, reload the internal
state of @code{m4} from @file{base.m4f} @emph{prior} to reading any
other input files. This means that, instead of starting with a fresh
copy of @code{m4}, input is read after the effect of a prior run has
been restored. In our example, the effect is the same as if file
@file{base.m4} had been read anew. However, this effect is achieved
much faster.
|
||
|
||
Only one frozen file may be created or read in any one @code{m4}
|
||
invocation. It is not possible to recover two frozen files at once.
|
||
However, frozen files may be updated incrementally, by using the
@option{-R} and @option{-F} options together. For example, if
|
||
some care is taken, the command:
|
||
|
||
@comment ignore
|
||
@example
|
||
$ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
|
||
@end example
|
||
|
||
@noindent
|
||
could be broken down in the following sequence, accumulating the same
|
||
output:
|
||
|
||
@comment ignore
|
||
@example
|
||
$ @kbd{m4 -F file1.m4f file1.m4}
|
||
$ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
|
||
$ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
|
||
$ @kbd{m4 -R file3.m4f file4.m4}
|
||
@end example
|
||
|
||
Some care is necessary because not every effort has been made for
this to work in all cases. In particular, the trace attribute of
macros is not handled, nor the current setting of @code{changeword}.
Currently, @code{m4wrap} and @code{sysval} also have problems.
Also, interactions for some options of @code{m4}, being used in one call
and not in the next, have not been fully analyzed yet. On the other
hand, you may be confident that stacks of @code{pushdef} definitions
are handled correctly, as well as undefined or renamed builtins, and
changed strings for quotes or comments. Future releases of
@acronym{GNU} M4 will improve on the utility of frozen files.
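
The handling of @code{pushdef} stacks can be sketched with a small
session. The file names @file{base.m4} and @file{input.m4} and the
macro @code{greeting} are hypothetical, and the output shown is what the
description above implies rather than a transcript from a particular
release.

@comment ignore
@example
$ @kbd{cat base.m4}
@result{}define(`greeting', `hello')dnl
@result{}pushdef(`greeting', `hi')dnl
$ @kbd{m4 -F base.m4f base.m4}
$ @kbd{cat input.m4}
@result{}greeting popdef(`greeting')greeting
$ @kbd{m4 -R base.m4f input.m4}
@result{}hi hello
@end example

@noindent
Both definitions survive the freeze: the reloaded state first yields the
pushed definition, and the original one reappears after @code{popdef}.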
|
||
|
||
@ignore
|
||
@c This example is not worth putting in the manual, but caused core
|
||
@c dumps in all versions prior to 1.4.11.
|
||
|
||
@comment options: -F /dev/null
|
||
@example
|
||
traceon(`undefined')dnl
|
||
@end example
|
||
|
||
@c Make sure freezing is successful.
|
||
|
||
@example
|
||
ifdef(`__unix__', ,
|
||
`errprint(` skipping: syscmd does not have unix semantics
|
||
')m4exit(`77')')dnl
|
||
changequote(`[', `]')dnl
|
||
syscmd([echo 'changequote([,])pushdef([divnum],[hi])dnl' \
|
||
| ']__program__[' -F in.m4f \
|
||
&& echo 'divnum popdef([divnum])divnum' \
|
||
| ']__program__[' -R in.m4f \
|
||
&& rm in.m4f])status sysval
|
||
@result{}hi 0
|
||
@result{}status 0
|
||
@end example
|
||
|
||
@c Detect inability to freeze.
|
||
@c Some systems harden /, and fail with EACCES rather than ENOENT.
|
||
|
||
@comment options: -F /none/such
|
||
@comment xerr: ignore
|
||
@comment status: 1
|
||
@example
|
||
$ @kbd{m4 -F /none/such}
|
||
^D
|
||
@error{}m4: cannot open `/none/such': No such file or directory
|
||
@end example
|
||
@end ignore
|
||
|
||
When an @code{m4} run is to be frozen, the automatic undiversion
|
||
which takes place at end of execution is inhibited. Instead, all
|
||
positively numbered diversions are saved into the frozen file.
|
||
The active diversion number is also transmitted.
|
||
|
||
A frozen file to be reloaded need not reside in the current directory.
|
||
It is looked up the same way as an @code{include} file (@pxref{Search
|
||
Path}).
|
||
|
||
If the frozen file was generated with a newer version of @code{m4}, and
|
||
contains directives that an older @code{m4} cannot parse, attempting to
|
||
load the frozen file with option @option{-R} will cause @code{m4} to
|
||
exit with status 63 to indicate version mismatch.
|
||
|
||
@node Frozen file format
|
||
@section Frozen file format
|
||
|
||
@cindex frozen file format
|
||
@cindex file format, frozen file
|
||
Frozen files are sharable across architectures. It is safe to write
|
||
a frozen file on one machine and read it on another, given that the
|
||
second machine uses the same or newer version of @acronym{GNU} @code{m4}.
|
||
It is conventional, but not required, to give a frozen file the suffix
|
||
of @code{.m4f}.
|
||
|
||
These are simple (editable) text files, made up of directives,
|
||
each starting with a capital letter and ending with a newline
|
||
(@key{NL}). Wherever a directive is expected, the character
|
||
@samp{#} introduces a comment line; empty lines are also ignored if they
|
||
are not part of an embedded string.
|
||
In the following descriptions, each @var{len} refers to the length of
the corresponding string @var{str} in the next line of input. Numbers
|
||
are always expressed in decimal. There are no escape characters. The
|
||
directives are:
|
||
|
||
@table @code
|
||
@item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
|
||
Uses @var{str1} and @var{str2} as the begin-comment and
|
||
end-comment strings. If omitted, then @samp{#} and @key{NL} are the
|
||
comment delimiters.
|
||
|
||
@item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
|
||
Selects diversion @var{number}, making it current, then copies
@var{str} into the current diversion. @var{number} may be a negative
|
||
number for a non-existing diversion. To merely specify an active
|
||
selection, use this command with an empty @var{str}. With 0 as the
|
||
diversion @var{number}, @var{str} will be issued on standard output
|
||
at reload time. @acronym{GNU} @code{m4} will not produce the @samp{D}
|
||
directive with non-zero length for diversion 0, but this can be done
|
||
with manual edits. This directive may
|
||
appear more than once for the same diversion, in which case the
|
||
diversion is the concatenation of the various uses. If omitted, then
|
||
diversion 0 is current.
|
||
|
||
@item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
|
||
Defines, through @code{pushdef}, a definition for @var{str1}
|
||
expanding to the function whose builtin name is @var{str2}. If the
|
||
builtin does not exist (for example, if the frozen file was produced by
|
||
a copy of @code{m4} compiled with changeword support, but the version
|
||
of @code{m4} reloading was compiled without it), the reload is silent,
|
||
but any subsequent use of the definition of @var{str1} will result in
|
||
a warning. This directive may appear more than once for the same name,
|
||
and its order, along with @samp{T}, is important. If omitted, you will
|
||
have no access to any builtins.
|
||
|
||
@item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
|
||
Uses @var{str1} and @var{str2} as the begin-quote and end-quote
|
||
strings. If omitted, then @samp{`} and @samp{'} are the quote
|
||
delimiters.
|
||
|
||
@item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
|
||
Defines, through @code{pushdef}, a definition for @var{str1}
|
||
expanding to the text given by @var{str2}. This directive may appear
|
||
more than once for the same name, and its order, along with @samp{F}, is
|
||
important.
|
||
|
||
@item V @var{number} @key{NL}
|
||
Confirms the format of the file. @code{m4} @value{VERSION} only creates
|
||
and understands frozen files where @var{number} is 1. This directive
|
||
must be the first non-comment in the file, and may not appear more than
|
||
once.
|
||
@end table
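
As an illustration of these directives, here is a hand-written sketch of
a small frozen file. It was not produced by @code{m4}, and the macro
name @code{foo} and its text are hypothetical. It selects @samp{[} and
@samp{]} as the quote delimiters, defines @code{foo} as the text
@samp{hello}, makes the builtin @code{define} available under its usual
name, and leaves diversion 0 current. The empty line after the
@samp{D} directive is the newline that follows its zero-length string.

@comment ignore
@example
# Hand-written sketch of a frozen file
V1
Q1,1
[]
T3,5
foohello
F6,6
definedefine
D0,0

@end example

@noindent
Because only one @samp{F} directive appears, a run that reloads this
file would have access to no builtin other than @code{define}.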
|
||
|
||
@node Compatibility
|
||
@chapter Compatibility with other versions of @code{m4}
|
||
|
||
@cindex compatibility
|
||
This chapter describes many of the differences between this
implementation of @code{m4} and other implementations found under
UNIX, such as System V Release 3, Solaris, and @acronym{BSD} flavors.
In particular, it lists the known differences and extensions to
@acronym{POSIX}. However, the list is not necessarily comprehensive.
|
||
|
||
At the time of this writing, @acronym{POSIX} 2001 (also known as IEEE
|
||
Std 1003.1-2001) is the latest standard, although a new version of
|
||
@acronym{POSIX} is under development and includes several proposals for
|
||
modifying what @code{m4} is required to do. The requirements for
|
||
@code{m4} are shared between @acronym{SUSv3} and @acronym{POSIX}, and
|
||
can be viewed at
|
||
@uref{http://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
|
||
|
||
@menu
|
||
* Extensions:: Extensions in @acronym{GNU} M4
|
||
* Incompatibilities:: Facilities in System V m4 not in GNU M4
|
||
* Other Incompatibilities:: Other incompatibilities
|
||
@end menu
|
||
|
||
@node Extensions
|
||
@section Extensions in @acronym{GNU} M4
|
||
|
||
@cindex @acronym{GNU} extensions
|
||
@cindex @acronym{POSIX}
|
||
This version of @code{m4} contains a few facilities that do not exist
|
||
in System V @code{m4}. These extra facilities are all suppressed by
|
||
using the @option{-G} command line option (@pxref{Limits control, ,
|
||
Invoking m4}), unless overridden by other command line options.
|
||
|
||
@itemize @bullet
|
||
@item
|
||
In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
|
||
several digits, while the System V @code{m4} only accepts one digit.
|
||
This allows macros in @acronym{GNU} @code{m4} to take any number of
|
||
arguments, and not only nine (@pxref{Arguments}).
|
||
|
||
This means that @code{define(`foo', `$11')} is ambiguous between
|
||
implementations. To portably choose between grabbing the first
|
||
parameter and appending 1 to the expansion, or grabbing the eleventh
|
||
parameter, you can do the following:
|
||
|
||
@example
|
||
define(`a1', `A1')
|
||
@result{}
|
||
dnl First argument, concatenated with 1
|
||
define(`_1', `$1')define(`first1', `_1($@@)1')
|
||
@result{}
|
||
dnl Eleventh argument, portable
|
||
define(`_9', `$9')define(`eleventh', `_9(shift(shift($@@)))')
|
||
@result{}
|
||
dnl Eleventh argument, GNU style
|
||
define(`Eleventh', `$11')
|
||
@result{}
|
||
first1(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
|
||
@result{}A1
|
||
eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
|
||
@result{}k
|
||
Eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
|
||
@result{}k
|
||
@end example
|
||
|
||
@noindent
|
||
Also see the @code{argn} macro (@pxref{Shift}).
|
||
|
||
@item
|
||
The @code{divert} (@pxref{Divert}) macro can manage more than 9
|
||
diversions. @acronym{GNU} @code{m4} treats all positive numbers as valid
|
||
diversions, rather than discarding diversions greater than 9.
|
||
|
||
@item
|
||
Files included with @code{include} and @code{sinclude} are sought in a
|
||
user specified search path, if they are not found in the working
|
||
directory. The search path is specified by the @option{-I} option and the
|
||
@env{M4PATH} environment variable (@pxref{Search Path}).
|
||
|
||
@item
|
||
Arguments to @code{undivert} can be non-numeric, in which case the named
|
||
file will be included uninterpreted in the output (@pxref{Undivert}).
|
||
|
||
@item
|
||
Formatted output is supported through the @code{format} builtin, which
|
||
is modeled after the C library function @code{printf} (@pxref{Format}).
|
||
|
||
@item
|
||
Searches and text substitution through basic regular expressions are
|
||
supported by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
|
||
(@pxref{Patsubst}) builtins. Some @acronym{BSD} implementations use
|
||
extended regular expressions instead.
|
||
|
||
@item
|
||
The output of shell commands can be read into @code{m4} with
|
||
@code{esyscmd} (@pxref{Esyscmd}).
|
||
|
||
@item
|
||
There is indirect access to any builtin macro with @code{builtin}
|
||
(@pxref{Builtin}).
|
||
|
||
@item
|
||
Macros can be called indirectly through @code{indir} (@pxref{Indir}).
|
||
|
||
@item
|
||
The name of the program, the current input file, and the current input
|
||
line number are accessible through the builtins @code{@w{__program__}},
|
||
@code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
|
||
|
||
@item
|
||
The format of the output from @code{dumpdef} and macro tracing can be
|
||
controlled with @code{debugmode} (@pxref{Debug Levels}).
|
||
|
||
@item
|
||
The destination of trace and debug output can be controlled with
|
||
@code{debugfile} (@pxref{Debug Output}).
|
||
|
||
@item
|
||
The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
|
||
creating a new file with a unique name on every invocation, rather than
|
||
following the insecure behavior of replacing the trailing @samp{X}
|
||
characters with the @code{m4} process id.
|
||
|
||
@item
|
||
@acronym{POSIX} only requires support for the command line options
|
||
@option{-s}, @option{-D}, and @option{-U}, so all other options accepted
|
||
by @acronym{GNU} M4 are extensions. @xref{Invoking m4}, for a
|
||
description of these options.
|
||
|
||
@item
The debugging and tracing facilities in @acronym{GNU} @code{m4} are much
more extensive than in most other versions of @code{m4}.
|
||
@end itemize
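
As a brief sketch of one of the extensions listed above, @code{indir}
can call a macro whose name could not otherwise be written as a macro
call. The macro name @code{my-macro} is hypothetical.

@example
define(`my-macro', `called with $1')
@result{}
indir(`my-macro', `arg')
@result{}called with arg
@end example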
|
||
|
||
@node Incompatibilities
|
||
@section Facilities in System V @code{m4} not in @acronym{GNU} @code{m4}
|
||
|
||
The version of @code{m4} from System V contains a few facilities that
|
||
have not been implemented in @acronym{GNU} @code{m4} yet. Additionally,
|
||
@acronym{POSIX} requires some behaviors that @acronym{GNU} @code{m4} has not
|
||
implemented yet. Relying on these behaviors is non-portable, as a
|
||
future release of @acronym{GNU} @code{m4} may change.
|
||
|
||
@itemize @bullet
|
||
@item
|
||
@acronym{POSIX} requires support for multiple arguments to @code{defn},
|
||
without any clarification on how @code{defn} behaves when one of the
|
||
multiple arguments names a builtin. System V @code{m4} and some other
|
||
implementations allow mixing builtins and text macros into a single
|
||
macro. @acronym{GNU} @code{m4} only supports joining multiple text
|
||
arguments, although a future implementation may lift this restriction to
|
||
behave more like System V@. The only portable way to join text macros
|
||
with builtins is via helper macros and implicit concatenation of macro
|
||
results.
|
||
|
||
@item
|
||
@acronym{POSIX} requires an application to exit with non-zero status if
|
||
it wrote an error message to stderr. This has not yet been consistently
|
||
implemented for the various builtins that are required to issue an error
|
||
(such as @code{eval} (@pxref{Eval}) when an argument cannot be parsed).
|
||
|
||
@item
|
||
Some traditional implementations only allow reading standard input
|
||
once, but @acronym{GNU} @code{m4} correctly handles multiple instances
|
||
of @samp{-} on the command line.
|
||
|
||
@item
|
||
@acronym{POSIX} requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
|
||
(first-in, first-out) order, but @acronym{GNU} @code{m4} currently uses
|
||
LIFO order. Furthermore, @acronym{POSIX} states that only the first
|
||
argument to @code{m4wrap} is saved for later evaluation, but
|
||
@acronym{GNU} @code{m4} saves and processes all arguments, with output
|
||
separated by spaces.
|
||
|
||
@item
|
||
@acronym{POSIX} states that builtins that require arguments, but are
|
||
called without arguments, have undefined behavior. Traditional
|
||
implementations simply behave as though empty strings had been passed.
|
||
For example, @code{a`'define`'b} would expand to @code{ab}. But
|
||
@acronym{GNU} @code{m4} ignores certain builtins if they have missing
|
||
arguments, giving @code{adefineb} for the above example.
|
||
|
||
@item
|
||
Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
by undefining the entire stack of previous definitions, as if doing
@code{undefine(`f')} first. @acronym{GNU} @code{m4} replaces just the top
definition on the stack, as if doing @code{popdef(`f')} followed by
@code{pushdef(`f',`1')}. @acronym{POSIX} allows either behavior.
|
||
|
||
@item
|
||
@acronym{POSIX} 2001 requires @code{syscmd} (@pxref{Syscmd}) to evaluate
|
||
command output for macro expansion, but this was a mistake that is
|
||
anticipated to be corrected in the next version of @acronym{POSIX}.
|
||
@acronym{GNU} @code{m4} follows traditional behavior in @code{syscmd}
|
||
where output is not rescanned, and provides the extension @code{esyscmd}
|
||
that does scan the output.
|
||
|
||
@item
|
||
At one point, @acronym{POSIX} required @code{changequote(@var{arg})}
|
||
(@pxref{Changequote}) to use newline as the close quote, but this was a
|
||
bug, and the next version of @acronym{POSIX} is anticipated to state
|
||
that using empty strings or just one argument is unspecified.
|
||
Meanwhile, the @acronym{GNU} @code{m4} behavior of treating an empty
end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
repeating the start-quote delimiter, and @acronym{BSD} treats it as
leaving the previous end-quote delimiter unchanged. For predictable
results, never call @code{changequote} with just one argument, or with
empty strings for arguments.
|
||
|
||
@item
|
||
At one point, @acronym{POSIX} required @code{changecom(@var{arg},)}
(@pxref{Changecom}) to make it impossible to end a comment, but this was
a bug, and the next version of @acronym{POSIX} is anticipated to state
that using empty strings is unspecified. Meanwhile, the @acronym{GNU}
@code{m4} behavior of treating an empty end-comment delimiter as newline
is not portable, as @acronym{BSD} treats it as leaving the previous
end-comment delimiter unchanged. It is also impossible in
@acronym{BSD} implementations to disable comments, even though that is
required by @acronym{POSIX}. For predictable results, never call
@code{changecom} with empty strings for arguments.
|
||
|
||
@item
|
||
Most implementations of @code{m4} give macros a higher precedence than
comments when parsing, meaning that if the start delimiter given to
@code{changecom} (@pxref{Changecom}) starts with a macro name, comments
are effectively disabled. @acronym{POSIX} does not specify what the
precedence is, so the parser in this version of @acronym{GNU} @code{m4}
recognizes comments, then macros, then quoted strings.
|
||
|
||
@item
|
||
Traditional implementations allow argument collection, but not string
|
||
and comment processing, to span file boundaries. Thus, if @file{a.m4}
|
||
contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
|
||
@kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
|
||
gives an error message that the end of file was encountered inside a
|
||
macro with @acronym{GNU} @code{m4}. On the other hand, traditional
|
||
implementations do end of file processing for files included with
|
||
@code{include} or @code{sinclude} (@pxref{Include}), while @acronym{GNU}
|
||
@code{m4} seamlessly integrates the content of those files. Thus
|
||
@code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of
|
||
giving an error.
|
||
|
||
@item
|
||
Traditional @code{m4} treats @code{traceon} (@pxref{Trace}) without
|
||
arguments as a global variable, independent of named macro tracing.
|
||
Also, once a macro is undefined, named tracing of that macro is lost.
|
||
On the other hand, when @acronym{GNU} @code{m4} encounters
|
||
@code{traceon} without
|
||
arguments, it turns tracing on for all existing definitions at the time,
|
||
but does not trace future definitions; @code{traceoff} without arguments
|
||
turns tracing off for all definitions regardless of whether they were
|
||
also traced by name; and tracing by name, such as with @option{-tfoo} at
|
||
the command line or @code{traceon(`foo')} in the input, is an attribute
|
||
that is preserved even if the macro is currently undefined.
|
||
|
||
Additionally, while @acronym{POSIX} requires trace output, it makes no
|
||
demands on the formatting of that output. Parsing trace output is not
|
||
guaranteed to be reliable, even between different releases of
|
||
@acronym{GNU} M4; however, the intent is that any future changes in
|
||
trace output will only occur under the direction of additional
|
||
@code{debugmode} flags (@pxref{Debug Levels}).
|
||
|
||
@item
|
||
@acronym{POSIX} requires @code{eval} (@pxref{Eval}) to treat all
|
||
operators with the same precedence as C@. However, earlier versions of
|
||
@acronym{GNU} @code{m4} followed the traditional behavior of other
|
||
@code{m4} implementations, where bitwise and logical negation (@samp{~}
|
||
and @samp{!}) had lower precedence than equality operators; and where
|
||
equality operators (@samp{==} and @samp{!=}) had the same precedence as
|
||
relational operators (such as @samp{<}). Use explicit parentheses to
|
||
ensure proper precedence. As extensions to @acronym{POSIX},
|
||
@acronym{GNU} @code{m4} gives well-defined semantics to operations that
|
||
C leaves undefined, such as when overflow occurs, when shifting negative
|
||
numbers, or when performing division by zero. @acronym{POSIX} also
|
||
requires @samp{=} to cause an error, but many traditional
|
||
implementations allowed it as an alias for @samp{==}.
|
||
|
||
@item
|
||
@acronym{POSIX} 2001 requires @code{translit} (@pxref{Translit}) to
|
||
treat each character of the second and third arguments literally.
|
||
However, it is anticipated that the next version of @acronym{POSIX} will
|
||
allow the @acronym{GNU} @code{m4} behavior of treating @samp{-} as a
|
||
range operator.
|
||
|
||
@item
|
||
@acronym{POSIX} requires @code{m4} to honor the locale environment
|
||
variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
|
||
@env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
|
||
implemented in @acronym{GNU} @code{m4}.
|
||
|
||
@item
|
||
@acronym{POSIX} states that only unquoted leading newlines and blanks
|
||
(that is, space and tab) are ignored when collecting macro arguments.
|
||
However, this appears to be a bug in @acronym{POSIX}, since most
|
||
traditional implementations also ignore all whitespace (formfeed,
|
||
carriage return, and vertical tab). @acronym{GNU} @code{m4} follows
|
||
tradition and ignores all leading unquoted whitespace.
|
||
|
||
@item
|
||
@cindex @env{POSIXLY_CORRECT}
|
||
A strictly-compliant @acronym{POSIX} client is not allowed to use
|
||
command-line arguments not specified by @acronym{POSIX}. However, since
|
||
this version of M4 ignores @env{POSIXLY_CORRECT} and enables the option
|
||
@option{--gnu} by default (@pxref{Limits control, , Invoking m4}), a
|
||
client desiring to be strictly compliant has no way to disable
|
||
@acronym{GNU} extensions that conflict with @acronym{POSIX} when
|
||
directly invoking the compiled @code{m4}. A future version of
|
||
@acronym{GNU} M4 will honor the environment variable @env{POSIXLY_CORRECT},
|
||
implicitly enabling @option{--traditional} if it is set, in order to
|
||
allow a strictly-compliant client. In the meantime, a client needing
|
||
strict @acronym{POSIX} compliance can use the workaround of invoking a
|
||
shell script wrapper, where the wrapper then adds @option{--traditional}
|
||
to the arguments passed to the compiled @code{m4}.
|
||
@end itemize
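
The difference in how @code{define} treats a stack of definitions,
described in the list above, can be seen with a short sketch. The
results shown are those of @acronym{GNU} @code{m4}. A traditional
implementation that discards the whole stack would instead leave
@code{f} undefined after the @code{popdef}, so the final line would
output @samp{f}.

@example
define(`f', `1')
@result{}
pushdef(`f', `2')
@result{}
define(`f', `3')
@result{}
f
@result{}3
popdef(`f')
@result{}
f
@result{}1
@end example

@noindent
Input that must behave the same everywhere can avoid the difference by
never calling @code{define} on a name that has more than one definition
on its stack.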
|
||
|
||
@node Other Incompatibilities
|
||
@section Other incompatibilities
|
||
|
||
There are a few other incompatibilities between this implementation of
|
||
@code{m4}, and the System V version.
|
||
|
||
@itemize @bullet
|
||
@item
|
||
@acronym{GNU} @code{m4} implements sync lines differently from System V
@code{m4}, when text is being diverted. @acronym{GNU} @code{m4} outputs
the sync lines when the text is being diverted, and System V @code{m4}
outputs them when the diverted text is being brought back.
|
||
|
||
The problem is which lines and file names should be attached to text
|
||
that is being, or has been, diverted. System V @code{m4} regards all
|
||
the diverted text as being generated by the source line containing the
|
||
@code{undivert} call, whereas @acronym{GNU} @code{m4} regards the
|
||
diverted text as being generated at the time it is diverted.
|
||
|
||
The sync line option is used mostly when using @code{m4} as
|
||
a front end to a compiler. If a diverted line causes a compiler error,
|
||
the error messages should most probably refer to the place where the
|
||
diversion was made, and not where it was inserted again.
|
||
|
||
@comment options: -s
|
||
@example
|
||
divert(2)2
|
||
divert(1)1
|
||
divert`'0
|
||
@result{}#line 3 "stdin"
|
||
@result{}0
|
||
^D
|
||
@result{}#line 2 "stdin"
|
||
@result{}1
|
||
@result{}#line 1 "stdin"
|
||
@result{}2
|
||
@end example
|
||
|
||
The current @code{m4} implementation has a limitation that the syncline
|
||
output at the start of each diversion occurs no matter what, even if the
|
||
previous diversion did not end with a newline. This goes contrary to
|
||
the claim that synclines appear on a line by themselves, so this
|
||
limitation may be corrected in a future version of @code{m4}. In the
|
||
meantime, when using @option{-s}, it is wisest to make sure all
|
||
diversions end with a newline.
|
||
|
||
@item
|
||
@acronym{GNU} @code{m4} makes no attempt at prohibiting self-referential
|
||
definitions like:
|
||
|
||
@example
|
||
define(`x', `x')
|
||
@result{}
|
||
define(`x', `x ')
|
||
@result{}
|
||
@end example
|
||
|
||
@cindex rescanning
|
||
There is nothing inherently wrong with defining @samp{x} to
|
||
return @samp{x}. The wrong thing is to expand @samp{x} unquoted,
|
||
because that would cause an infinite rescan loop.
|
||
In @code{m4}, one might use macros to hold strings, as we do for
|
||
variables in other programming languages, further checking them with:
|
||
|
||
@comment ignore
|
||
@example
|
||
ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
|
||
@end example
|
||
|
||
@noindent
|
||
In cases like this one, prohibiting a macro from holding its own name
would be a useless limitation. Of course, this leaves more rope for the
@acronym{GNU} @code{m4} user to hang himself! Rescanning hangs may be
avoided through careful programming, much like endless loops in
traditional programming languages.
|
||
@end itemize
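
The point about self-referential definitions can be seen in a short
session. The following is only a minimal sketch, and the strings passed
to @code{ifelse} are illustrative: as long as @samp{x} is read back
through @code{defn}, or is otherwise kept quoted, a macro that holds its
own name causes no rescanning at all.

@example
define(`x', `x')
@result{}
defn(`x')
@result{}x
ifelse(defn(`x'), `x', `holds its own name', `holds something else')
@result{}holds its own name
@end example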
|
||
|
||
@node Answers
|
||
@chapter Correct version of some examples
|
||
|
||
Some of the examples in this manual are buggy or not very robust, for
|
||
demonstration purposes. Improved versions of these composite macros are
|
||
presented here.
|
||
|
||
@menu
|
||
* Improved exch:: Solution for @code{exch}
|
||
* Improved forloop:: Solution for @code{forloop}
|
||
* Improved foreach:: Solution for @code{foreach}
|
||
* Improved copy:: Solution for @code{copy}
|
||
* Improved m4wrap:: Solution for @code{m4wrap}
|
||
* Improved cleardivert:: Solution for @code{cleardivert}
|
||
* Improved capitalize:: Solution for @code{capitalize}
|
||
* Improved fatal_error:: Solution for @code{fatal_error}
|
||
@end menu
|
||
|
||
@node Improved exch
|
||
@section Solution for @code{exch}
|
||
|
||
The @code{exch} macro (@pxref{Arguments}) as presented requires clients
|
||
to double quote their arguments. A nicer definition, which lets
|
||
clients follow the rule of thumb of one level of quoting per level of
|
||
parentheses, involves adding quotes in the definition of @code{exch}, as
|
||
follows:
|
||
|
||
@example
|
||
define(`exch', ``$2', `$1'')
|
||
@result{}
|
||
define(exch(`expansion text', `macro'))
|
||
@result{}
|
||
macro
|
||
@result{}expansion text
|
||
@end example
|
||
|
||
@node Improved forloop
|
||
@section Solution for @code{forloop}
|
||
|
||
The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
|
||
into an infinite loop if given an iterator that is not parsed as a macro
|
||
name. It does not do any sanity checking on its numeric bounds, and
|
||
only permits decimal numbers for bounds. Here is an improved version,
|
||
shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
|
||
version also optimizes overhead by calling four macros instead of six
|
||
per iteration (excluding those in @var{text}), by not dereferencing the
|
||
@var{iterator} in the helper @code{@w{_forloop}}.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -d -I examples}
|
||
undivert(`forloop2.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# forloop(var, from, to, stmt) - improved version:
|
||
@result{}# works even if VAR is not a strict macro name
|
||
@result{}# performs sanity check that FROM is larger than TO
|
||
@result{}# allows complex numerical expressions in TO and FROM
|
||
@result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
|
||
@result{} `pushdef(`$1')_$0(`$1', eval(`$2'),
|
||
@result{} eval(`$3'), `$4')popdef(`$1')')')
|
||
@result{}define(`_forloop',
|
||
@result{} `define(`$1', `$2')$4`'ifelse(`$2', `$3', `',
|
||
@result{} `$0(`$1', incr(`$2'), `$3', `$4')')')
|
||
@result{}divert`'dnl
|
||
include(`forloop2.m4')
|
||
@result{}
|
||
forloop(`i', `2', `1', `no iteration occurs')
|
||
@result{}
|
||
forloop(`', `1', `2', ` odd iterator name')
|
||
@result{} odd iterator name odd iterator name
|
||
forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
|
||
@result{} 0xa 0xb 0xc
|
||
forloop(`i', `a', `b', `non-numeric bounds')
|
||
@error{}m4:stdin:6: bad expression in eval (bad input): (a) <= (b)
|
||
@result{}
|
||
@end example
|
||
|
||
One other change to notice is that the improved version used @samp{_$0}
rather than @samp{_forloop} to invoke the helper routine. In general,
this is a good practice to follow, because then the set of macros can be
uniformly transformed. The following example shows a transformation
that doubles the current quoting and appends a suffix @samp{2} to each
transformed macro. If @code{forloop} refers to the literal
@samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of
the intended @code{_forloop2}, and the mixing of quoting paradigms leads
to an infinite recursion loop in this example.
|
||
|
||
@comment options: -L9
|
||
@comment status: 1
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -d -L 9 -I examples}
|
||
define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
|
||
@result{}
|
||
define(`double', `define(`$1'`2',
|
||
arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
|
||
@result{}
|
||
double(`forloop')double(`_forloop')defn(`forloop2')
|
||
@result{}ifelse(eval(``($2) <= ($3)''), ``1'',
|
||
@result{} ``pushdef(``$1'')_$0(``$1'', eval(``$2''),
|
||
@result{} eval(``$3''), ``$4'')popdef(``$1'')'')
|
||
forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
|
||
@result{}
|
||
changequote(`[', `]')changequote([``], [''])
|
||
@result{}
|
||
forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
|
||
@result{}
|
||
changequote`'include(`forloop.m4')
|
||
@result{}
|
||
double(`forloop')double(`_forloop')defn(`forloop2')
|
||
@result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
|
||
forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
|
||
@result{}
|
||
changequote(`[', `]')changequote([``], [''])
|
||
@result{}
|
||
forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
|
||
@error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
|
||
@end example
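
The benefit of spelling a recursive call as @samp{$0} is easier to see
on a smaller scale. In the following sketch, @code{count} and
@code{countdown} are hypothetical macros invented purely for
illustration. Because the recursive call is written as @samp{$0}, the
definition keeps working after it is copied to a new name and the
original name is removed.

@example
define(`count', `ifelse(`$1', `0', `done', `$1 $0(decr(`$1'))')')
@result{}
define(`countdown', defn(`count'))undefine(`count')
@result{}
countdown(`3')
@result{}3 2 1 done
@end example

@noindent
Had the body named @code{count} literally, the call to @code{countdown}
would have stopped recursing after its first step, since @code{count}
no longer exists.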
|
||
|
||
One more optimization is still possible. Instead of repeatedly
|
||
assigning a variable then invoking or dereferencing it, it is possible
|
||
to pass the current iterator value as a single argument. Coupled with
|
||
@code{curry} if other arguments are needed (@pxref{Composition}), or
|
||
with helper macros if the argument is needed in more than one place in
|
||
the expansion, the output can be generated with three, rather than four,
|
||
macros of overhead per iteration. Notice how the file
|
||
@file{m4-@value{VERSION}/@/examples/@/forloop3.m4} rearranges the
|
||
arguments of the helper @code{_forloop} to take two arguments that are
|
||
placed around the current value. By splitting a balanced set of
|
||
parentheses across multiple arguments, the helper macro can now be
|
||
shared by @code{forloop} and the new @code{forloop_arg}.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`forloop3.m4')
|
||
@result{}
|
||
undivert(`forloop3.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for
|
||
@result{}# each value between FROM and TO, without define overhead
|
||
@result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1',
|
||
@result{} `_forloop(`$1', eval(`$2'), `$3(', `)')')')
|
||
@result{}# forloop(var, from, to, stmt) - refactored to share code
|
||
@result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
|
||
@result{} `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'),
|
||
@result{} `define(`$1',', `)$4')popdef(`$1')')')
|
||
@result{}define(`_forloop',
|
||
@result{} `$3`$1'$4`'ifelse(`$1', `$2', `',
|
||
@result{} `$0(incr(`$1'), `$2', `$3', `$4')')')
|
||
@result{}divert`'dnl
|
||
forloop(`i', `1', `3', ` i')
|
||
@result{} 1 2 3
|
||
define(`echo', `$@@')
|
||
@result{}
|
||
forloop_arg(`1', `3', ` echo')
|
||
@result{} 1 2 3
|
||
include(`curry.m4')
|
||
@result{}
|
||
forloop_arg(`1', `3', `curry(`pushdef', `a')')
|
||
@result{}
|
||
a
|
||
@result{}3
|
||
popdef(`a')a
|
||
@result{}2
|
||
popdef(`a')a
|
||
@result{}1
|
||
popdef(`a')a
|
||
@result{}a
|
||
@end example
|
||
|
||
Of course, it is possible to make even more improvements, such as
|
||
adding an optional step argument, or allowing iteration through
|
||
descending sequences. @acronym{GNU} Autoconf provides some of these
|
||
additional bells and whistles in its @code{m4_for} macro.
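
As a rough sketch of the first of these ideas, a step argument can be
added on top of the @file{forloop2.m4} approach. The names
@code{forloop_step} and @code{_forloop_step} are hypothetical; they are
not shipped with this package, and they are not how Autoconf implements
@code{m4_for}. This version assumes a positive step and an ascending
range, and expands to nothing otherwise.

@example
define(`forloop_step', `ifelse(eval(`($4) > 0 && ($2) <= ($3)'), `1',
  `pushdef(`$1')_$0(`$1', eval(`$2'), eval(`$3'), eval(`$4'),
    `$5')popdef(`$1')')')
@result{}
define(`_forloop_step',
  `define(`$1', `$2')$5`'ifelse(eval(`$2 + $4 <= $3'), `1',
    `$0(`$1', eval(`$2 + $4'), `$3', `$4', `$5')')')
@result{}
forloop_step(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
forloop_step(`i', `1', `10', `0', `never runs')
@result{}
@end example

@noindent
The helper ends recursion by testing whether the next value would pass
the upper bound, rather than by comparing for equality, since a step
larger than one can skip over the bound.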
|
||
|
||
@node Improved foreach
|
||
@section Solution for @code{foreach}
|
||
|
||
The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
|
||
presented earlier each have flaws. First, we will examine and fix the
|
||
quadratic behavior of @code{foreachq}:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`foreachq.m4')
|
||
@result{}
|
||
traceon(`shift')debugmode(`aq')
|
||
@result{}
|
||
foreachq(`x', ``1', `2', `3', `4'', `x
|
||
')dnl
|
||
@result{}1
|
||
@error{}m4trace: -3- shift(`1', `2', `3', `4')
|
||
@error{}m4trace: -2- shift(`1', `2', `3', `4')
|
||
@result{}2
|
||
@error{}m4trace: -4- shift(`1', `2', `3', `4')
|
||
@error{}m4trace: -3- shift(`2', `3', `4')
|
||
@error{}m4trace: -3- shift(`1', `2', `3', `4')
|
||
@error{}m4trace: -2- shift(`2', `3', `4')
|
||
@result{}3
|
||
@error{}m4trace: -5- shift(`1', `2', `3', `4')
|
||
@error{}m4trace: -4- shift(`2', `3', `4')
|
||
@error{}m4trace: -3- shift(`3', `4')
|
||
@error{}m4trace: -4- shift(`1', `2', `3', `4')
|
||
@error{}m4trace: -3- shift(`2', `3', `4')
|
||
@error{}m4trace: -2- shift(`3', `4')
|
||
@result{}4
|
||
@error{}m4trace: -6- shift(`1', `2', `3', `4')
|
||
@error{}m4trace: -5- shift(`2', `3', `4')
|
||
@error{}m4trace: -4- shift(`3', `4')
|
||
@error{}m4trace: -3- shift(`4')
|
||
@end example
|
||
|
||
@cindex quadratic behavior, avoiding
|
||
@cindex avoiding quadratic behavior
|
||
Each successive iteration was adding more quoted @code{shift}
|
||
invocations, and the entire list contents were passing through every
|
||
iteration. In general, when recursing, it is a good idea to make the
|
||
recursion use fewer arguments, rather than adding additional quoted
|
||
uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes
|
||
fewer macros, is less likely to run into machine limits, and most
|
||
importantly, performs faster. The fixed version of @code{foreachq} can
|
||
be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`foreachq2.m4')
|
||
@result{}
|
||
undivert(`foreachq2.m4')dnl
|
||
@result{}include(`quote.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
|
||
@result{}# quoted list, improved version
|
||
@result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
|
||
@result{}define(`_arg1q', ``$1'')
|
||
@result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
|
||
@result{}define(`_foreachq', `ifelse(`$2', `', `',
|
||
@result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
|
||
@result{}divert`'dnl
|
||
traceon(`shift')debugmode(`aq')
|
||
@result{}
|
||
foreachq(`x', ``1', `2', `3', `4'', `x
|
||
')dnl
|
||
@result{}1
|
||
@error{}m4trace: -3- shift(`1', `2', `3', `4')
|
||
@result{}2
|
||
@error{}m4trace: -3- shift(`2', `3', `4')
|
||
@result{}3
|
||
@error{}m4trace: -3- shift(`3', `4')
|
||
@result{}4
|
||
@end example
|
||
|
||
Note that the fixed version calls unquoted helper macros in
|
||
@code{@w{_foreachq}} to trim elements immediately; those helper macros
|
||
in turn must re-supply the layer of quotes lost in the macro invocation.
|
||
Contrast the use of @code{@w{_arg1q}}, which quotes the first list
|
||
element, with @code{@w{_arg1}} of the earlier implementation that
|
||
returned the first list element directly. Additionally, by calling the
|
||
helper method immediately, the @samp{defn(`@var{iterator}')} no longer
|
||
contains unexpanded macros.
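
The difference between the two helpers can be shown in isolation. The
following sketch repeats the definitions of @code{_arg1} and
@code{_arg1q} used in this chapter; the macro @code{active} and the
argument @samp{other} exist only for the demonstration.

@example
define(`_arg1', `$1')define(`_arg1q', ``$1'')define(`active', `ACT')
@result{}
_arg1(`active', `other')
@result{}ACT
_arg1q(`active', `other')
@result{}active
@end example

@noindent
The extra layer of quotes supplied by @code{_arg1q} is what keeps a
list element from being expanded before it is stored in the iterator.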
|
||
|
||
The astute m4 programmer might notice that the solution above still uses
|
||
more memory and macro invocations, and thus more time, than strictly
|
||
necessary. Note that @samp{$2}, which contains an arbitrarily long
|
||
quoted list, is expanded and rescanned three times per iteration of
|
||
@code{_foreachq}. Furthermore, every iteration of the algorithm
|
||
effectively unboxes then reboxes the list, which costs a couple of macro
|
||
invocations. It is possible to rewrite the algorithm for a bit more
|
||
speed by swapping the order of the arguments to @code{_foreachq} in
|
||
order to operate on an unboxed list in the first place, and by using the
|
||
fixed-length @samp{$#} instead of an arbitrary length list as the key to
|
||
end recursion. The result is an overhead of six macro invocations per
|
||
loop (excluding any macros in @var{text}), instead of eight. This
|
||
alternative approach is available as
|
||
@file{m4-@value{VERSION}/@/examples/@/foreachq3.m4}:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`foreachq3.m4')
|
||
@result{}
|
||
undivert(`foreachq3.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
|
||
@result{}# quoted list, alternate improved version
|
||
@result{}define(`foreachq', `ifelse(`$2', `', `',
|
||
@result{} `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
|
||
@result{}define(`_foreachq', `ifelse(`$#', `3', `',
|
||
@result{} `define(`$1', `$4')$2`'$0(`$1', `$2',
|
||
@result{} shift(shift(shift($@@))))')')
|
||
@result{}divert`'dnl
|
||
traceon(`shift')debugmode(`aq')
|
||
@result{}
|
||
foreachq(`x', ``1', `2', `3', `4'', `x
|
||
')dnl
|
||
@result{}1
|
||
@error{}m4trace: -4- shift(`x', `x
|
||
@error{}', `', `1', `2', `3', `4')
|
||
@error{}m4trace: -3- shift(`x
|
||
@error{}', `', `1', `2', `3', `4')
|
||
@error{}m4trace: -2- shift(`', `1', `2', `3', `4')
|
||
@result{}2
|
||
@error{}m4trace: -4- shift(`x', `x
|
||
@error{}', `1', `2', `3', `4')
|
||
@error{}m4trace: -3- shift(`x
|
||
@error{}', `1', `2', `3', `4')
|
||
@error{}m4trace: -2- shift(`1', `2', `3', `4')
|
||
@result{}3
|
||
@error{}m4trace: -4- shift(`x', `x
|
||
@error{}', `2', `3', `4')
|
||
@error{}m4trace: -3- shift(`x
|
||
@error{}', `2', `3', `4')
|
||
@error{}m4trace: -2- shift(`2', `3', `4')
|
||
@result{}4
|
||
@error{}m4trace: -4- shift(`x', `x
|
||
@error{}', `3', `4')
|
||
@error{}m4trace: -3- shift(`x
|
||
@error{}', `3', `4')
|
||
@error{}m4trace: -2- shift(`3', `4')
|
||
@end example
|
||
|
||
In the current version of M4, every instance of @samp{$@@} is rescanned
|
||
as it is encountered. Thus, the @file{foreachq3.m4} alternative uses
|
||
much less memory than @file{foreachq2.m4}, and executes as much as 10%
|
||
faster, since each iteration encounters fewer @samp{$@@}. However, the
|
||
implementation of rescanning every byte in @samp{$@@} is quadratic in
|
||
the number of bytes scanned (for example, making the broken version in
|
||
@file{foreachq.m4} cubic, rather than quadratic, in behavior). A future
|
||
release of M4 will improve the underlying implementation by reusing
|
||
results of previous scans, so that both styles of @code{foreachq} can
|
||
become linear in the number of bytes scanned. Notice how the
|
||
implementation injects an empty argument prior to expanding @samp{$2}
|
||
within @code{foreachq}; the helper macro @code{_foreachq} then ignores
|
||
the third argument altogether, and ends recursion when there are three
|
||
arguments left because there was nothing left to pass through
|
||
@code{shift}. Thus, each iteration only needs one @code{ifelse}, rather
|
||
than the two conditionals used in the version from @file{foreachq2.m4}.
|
||
|
||
@cindex nine arguments, more than
|
||
@cindex more than nine arguments
|
||
@cindex arguments, more than nine
|
||
So far, all of the implementations of @code{foreachq} presented have
|
||
been quadratic with M4 1.4.x. But @code{forloop} is linear, because
|
||
each iteration parses a constant number of arguments. So, it is
|
||
possible to design a variant that uses @code{forloop} to do the
|
||
iteration, then uses @samp{$@@} only once at the end, giving a linear
|
||
result even with older M4 implementations. This implementation relies
|
||
on the @acronym{GNU} extension that @samp{$10} expands to the tenth
|
||
argument rather than the first argument concatenated with @samp{0}. The
|
||
trick is to define an intermediate macro that repeats the text
|
||
@code{define(`$1', `$@var{n}')$2`'}, with @var{n} set to successive
|
||
integers corresponding to each argument. The helper macro
|
||
@code{_foreachq_} is needed in order to generate the literal sequences
|
||
such as @samp{$1} into the intermediate macro, rather than expanding
|
||
them as the arguments of @code{_foreachq}. With this approach, no
|
||
@code{shift} calls are even needed! Even though there are seven macros
|
||
of overhead per iteration instead of six in @file{foreachq3.m4}, the
|
||
linear scaling is apparent at relatively small list sizes. However,
|
||
this approach will need adjustment when a future version of M4 follows
|
||
@acronym{POSIX} by no longer treating @samp{$10} as the tenth argument;
|
||
the anticipation is that @samp{$@{10@}} can be used instead, although
|
||
that alternative syntax is not yet supported.
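
The extension that this approach depends on can be checked on its own.
In the following sketch, @code{tenth} is a hypothetical macro used only
for illustration, and the session assumes that @code{m4} is run without
@option{--traditional}.

@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example

@noindent
An implementation that treats @samp{$10} as the first argument followed
by @samp{0} would produce @samp{a0} for the same call, which is why the
@code{forloop}-based @code{foreachq} is not portable as written.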
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`foreachq4.m4')
|
||
@result{}
|
||
undivert(`foreachq4.m4')dnl
|
||
@result{}include(`forloop2.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
|
||
@result{}# quoted list, version based on forloop
|
||
@result{}define(`foreachq',
|
||
@result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
|
||
@result{}define(`_foreachq',
|
||
@result{}`pushdef(`$1', forloop(`$1', `3', `$#',
|
||
@result{} `$0_(`1', `2', indir(`$1'))')`popdef(
|
||
@result{} `$1')')indir(`$1', $@@)')
|
||
@result{}define(`_foreachq_',
|
||
@result{}``define(`$$1', `$$3')$$2`''')
|
||
@result{}divert`'dnl
|
||
traceon(`shift')debugmode(`aq')
|
||
@result{}
|
||
foreachq(`x', ``1', `2', `3', `4'', `x
|
||
')dnl
|
||
@result{}1
|
||
@result{}2
|
||
@result{}3
|
||
@result{}4
|
||
@end example
|
||
|
||
For yet another approach, the improved version of @code{foreach},
|
||
available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
|
||
overquotes the arguments to @code{@w{_foreach}} to begin with, using
|
||
@code{dquote_elt}. Then @code{@w{_foreach}} can just use
|
||
@code{@w{_arg1}} to remove the extra layer of quoting that was added up
|
||
front:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`foreach2.m4')
|
||
@result{}
|
||
undivert(`foreach2.m4')dnl
|
||
@result{}include(`quote.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
|
||
@result{}# parenthesized list, improved version
|
||
@result{}define(`foreach', `pushdef(`$1')_$0(`$1',
|
||
@result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')')
|
||
@result{}define(`_arg1', `$1')
|
||
@result{}define(`_foreach', `ifelse(`$2', `(`')', `',
|
||
@result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
|
||
@result{}divert`'dnl
|
||
traceon(`shift')debugmode(`aq')
|
||
@result{}
|
||
foreach(`x', `(`1', `2', `3', `4')', `x
|
||
')dnl
|
||
@error{}m4trace: -4- shift(`1', `2', `3', `4')
|
||
@error{}m4trace: -4- shift(`2', `3', `4')
|
||
@error{}m4trace: -4- shift(`3', `4')
|
||
@result{}1
|
||
@error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
|
||
@result{}2
|
||
@error{}m4trace: -3- shift(``2'', ``3'', ``4'')
|
||
@result{}3
|
||
@error{}m4trace: -3- shift(``3'', ``4'')
|
||
@result{}4
|
||
@error{}m4trace: -3- shift(``4'')
|
||
@end example
|
||
|
||
It is likewise possible to write a variant of @code{foreach} that
|
||
performs in linear time on M4 1.4.x; the easiest method is probably
|
||
writing a version of @code{foreach} that unboxes its list, then invokes
|
||
@code{_foreachq} as previously defined in @file{foreachq4.m4}.
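
A minimal sketch of that idea follows. The helper @code{_unbox} is a
hypothetical name introduced here, and this definition of
@code{foreach} is not one of the shipped example files. It strips the
parentheses from the list by letting them become the argument list of
@code{_unbox}, then hands the unboxed elements to @code{_foreachq}.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`foreachq4.m4')
@result{}
define(`_unbox', `$@@')
@result{}
define(`foreach', `ifelse(`$2', `', `',
  `_foreachq(`$1', `$3', _unbox$2)')')
@result{}
foreach(`x', `(`a', `b', `c')', `<x>')
@result{}<a><b><c>
foreach(`x', `()', `<x>')
@result{}<>
@end example

@noindent
As with the other @code{foreach} variants, unquoted list elements are
expanded once, while the arguments to @code{_foreachq} are collected.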
|
||
|
||
In summary, recursion over list elements is trickier than it appeared at
|
||
first glance, but provides a powerful idiom within @code{m4} processing.
|
||
As a final demonstration, both list styles are now able to handle
|
||
several scenarios that would wreak havoc on one or both of the original
|
||
implementations. This points out one other difference between the
|
||
list styles. @code{foreach} evaluates unquoted list elements only once,
|
||
in preparation for calling @code{@w{_foreach}}, similarly for
|
||
@code{foreachq} as provided by @file{foreachq3.m4} or
|
||
@file{foreachq4.m4}. But
|
||
@code{foreachq}, as provided by @file{foreachq2.m4},
|
||
evaluates unquoted list elements twice while visiting the first list
|
||
element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}. When
|
||
deciding which list style to use, one must take into account whether
|
||
repeating the side effects of unquoted list elements will have any
|
||
detrimental effects.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`foreach2.m4')
|
||
@result{}
|
||
include(`foreachq2.m4')
|
||
@result{}
|
||
dnl 0-element list:
|
||
foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
|
||
@result{} /@w{ }
|
||
dnl 1-element list of empty element
|
||
foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
|
||
@result{}<> / <>
|
||
dnl 2-element list of empty elements
|
||
foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
|
||
@result{}<><> / <><>
|
||
dnl 1-element list of a comma
|
||
foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
|
||
@result{}<,> / <,>
|
||
dnl 2-element list of unbalanced parentheses
|
||
foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
|
||
@result{}<(><)> / <(><)>
|
||
define(`ab', `oops')dnl using defn(`iterator')
|
||
foreach(`x', `(`a', `b')', `defn(`x')') /dnl
|
||
foreachq(`x', ``a', `b'', `defn(`x')')
|
||
@result{}ab / ab
|
||
define(`active', `ACT, IVE')
|
||
@result{}
|
||
traceon(`active')
|
||
@result{}
|
||
dnl list of unquoted macros; expansion occurs before recursion
|
||
foreach(`x', `(active, active)', `<x>
|
||
')dnl
|
||
@error{}m4trace: -4- active -> `ACT, IVE'
|
||
@error{}m4trace: -4- active -> `ACT, IVE'
|
||
@result{}<ACT>
|
||
@result{}<IVE>
|
||
@result{}<ACT>
|
||
@result{}<IVE>
|
||
foreachq(`x', `active, active', `<x>
|
||
')dnl
|
||
@error{}m4trace: -3- active -> `ACT, IVE'
|
||
@error{}m4trace: -3- active -> `ACT, IVE'
|
||
@result{}<ACT>
|
||
@error{}m4trace: -3- active -> `ACT, IVE'
|
||
@error{}m4trace: -3- active -> `ACT, IVE'
|
||
@result{}<IVE>
|
||
@result{}<ACT>
|
||
@result{}<IVE>
|
||
dnl list of quoted macros; expansion occurs during recursion
|
||
foreach(`x', `(`active', `active')', `<x>
|
||
')dnl
|
||
@error{}m4trace: -1- active -> `ACT, IVE'
|
||
@result{}<ACT, IVE>
|
||
@error{}m4trace: -1- active -> `ACT, IVE'
|
||
@result{}<ACT, IVE>
|
||
foreachq(`x', ``active', `active'', `<x>
|
||
')dnl
|
||
@error{}m4trace: -1- active -> `ACT, IVE'
|
||
@result{}<ACT, IVE>
|
||
@error{}m4trace: -1- active -> `ACT, IVE'
|
||
@result{}<ACT, IVE>
|
||
dnl list of double-quoted macro names; no expansion
|
||
foreach(`x', `(``active'', ``active'')', `<x>
|
||
')dnl
|
||
@result{}<active>
|
||
@result{}<active>
|
||
foreachq(`x', ```active'', ``active''', `<x>
|
||
')dnl
|
||
@result{}<active>
|
||
@result{}<active>
|
||
@end example
|
||
|
||
@ignore
|
||
@comment Not worth putting in the manual, but make sure that foreach
|
||
@comment implementations behave, and that final implementation is
|
||
@comment linear.
|
||
|
||
@comment boxed recursion
|
||
|
||
@comment examples
|
||
@comment options: -Dlimit=10 -Dverbose
|
||
@example
|
||
$ @kbd {m4 -I examples -Dlimit=10 -Dverbose}
|
||
include(`loop.m4')dnl
|
||
@result{} 1 2 3 4 5 6 7 8 9 10
|
||
@end example
|
||
|
||
@comment unboxed recursion
|
||
|
||
@comment examples
|
||
@comment options: -Dlimit=10 -Dverbose -Dalt
|
||
@example
|
||
$ @kbd {m4 -I examples -Dlimit=10 -Dverbose -Dalt}
|
||
include(`loop.m4')dnl
|
||
@result{} 1 2 3 4 5 6 7 8 9 10
|
||
@end example
|
||
|
||
@comment foreach via forloop recursion
|
||
|
||
@comment examples
|
||
@comment options: -Dlimit=10 -Dverbose -Dalt=4
|
||
@example
|
||
$ @kbd {m4 -I examples -Dlimit=10 -Dverbose -Dalt=4}
|
||
include(`loop.m4')dnl
|
||
@result{} 1 2 3 4 5 6 7 8 9 10
|
||
@end example
|
||
|
||
@comment examples
|
||
@comment options: -Dlimit=2500 -Dalt=4
|
||
@example
|
||
$ @kbd {m4 -I examples -Dlimit=2500 -Dalt=4}
|
||
include(`loop.m4')dnl
|
||
@end example
|
||
|
||
@comment examples
|
||
@comment options: -Dlimit=10000 -Dalt=4
|
||
@example
|
||
$ @kbd {m4 -I examples -Dlimit=10000 -Dalt=4}
|
||
define(`foo', `divert`'len(popdef(`_foreachq')_foreachq($@@))')dnl
|
||
define(`debug', `pushdef(`_foreachq', defn(`foo'))')
|
||
@result{}
|
||
include(`loop.m4')dnl
|
||
@result{}48894
|
||
@end example
|
||
|
||
@end ignore
|
||
|
||
@node Improved copy
|
||
@section Solution for @code{copy}
|
||
|
||
The macro @code{copy} presented above
|
||
is unable to handle builtin tokens with M4 1.4.x, because it tries to
|
||
pass the builtin token through the macro @code{curry}, where it is
|
||
silently flattened to an empty string (@pxref{Composition}). Rather
|
||
than using the problematic @code{curry} to work around the limitation
|
||
that @code{stack_foreach} expects to invoke a macro that takes exactly
|
||
one argument, we can write a new macro that lets us form the exact
|
||
two-argument @code{pushdef} call sequence needed, so that we are no
|
||
longer passing a builtin token through a text macro.
|
||
|
||
@deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @
|
||
@var{sep})
|
||
@deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @
|
||
@var{post}, @var{sep})
|
||
For each of the @code{pushdef} definitions associated with @var{macro},
|
||
expand the sequence @samp{@var{pre}`'definition`'@var{post}}.
|
||
Additionally, expand @var{sep} between definitions.
|
||
@code{stack_foreach_sep} visits the oldest definition first, while
|
||
@code{stack_foreach_sep_lifo} visits the current definition first. The
|
||
expansion may dereference @var{macro}, but should not modify it. There
|
||
are a few special macros, such as @code{defn}, which cannot be used as
|
||
the @var{macro} parameter.
|
||
@end deffn
|
||
|
||
Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is
|
||
equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(',
|
||
`)')}. By supplying explicit parentheses, split among the @var{pre} and
|
||
@var{post} arguments to @code{stack_foreach_sep}, it is now possible to
|
||
construct macro calls with more than one argument, without passing
|
||
builtin tokens through a macro call. It is likewise possible to
|
||
directly reference the stack definitions without a macro call, by
|
||
leaving @var{pre} and @var{post} empty. Thus, in addition to fixing
|
||
@code{copy} on builtin tokens, it also executes with fewer macro
|
||
invocations.
|
||
|
||
The new macro also adds a separator that is only output after the first
|
||
iteration of the helper @code{_stack_reverse_sep}, implemented by
|
||
prepending the original @var{sep} to @var{pre} and omitting a @var{sep}
|
||
argument in subsequent iterations. Note that the empty string that
|
||
separates @var{sep} from @var{pre} is provided as part of the fourth
|
||
argument when originally calling @code{_stack_reverse_sep}, and not by
|
||
writing @code{$4`'$3} as the third argument in the recursive call; while
|
||
the other approach would give the same output, it does so at the expense
|
||
of increasing the argument size on each iteration of
|
||
@code{_stack_reverse_sep}, which results in quadratic instead of linear
|
||
execution time. The improved stack walking macros are available in
|
||
@file{m4-@value{VERSION}/@/examples/@/stack_sep.m4}:
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`stack_sep.m4')
|
||
@result{}
|
||
define(`copy', `ifdef(`$2', `errprint(`$2 already defined
|
||
')m4exit(`1')',
|
||
`stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl
|
||
pushdef(`a', `1')pushdef(`a', defn(`divnum'))
|
||
@result{}
|
||
copy(`a', `b')
|
||
@result{}
|
||
b
|
||
@result{}0
|
||
popdef(`b')
|
||
@result{}
|
||
b
|
||
@result{}1
|
||
pushdef(`c', `1')pushdef(`c', `2')
|
||
@result{}
|
||
stack_foreach_sep_lifo(`c', `', `', `, ')
|
||
@result{}2, 1
|
||
undivert(`stack_sep.m4')dnl
|
||
@result{}divert(`-1')
|
||
@result{}# stack_foreach_sep(macro, pre, post, sep)
|
||
@result{}# Invoke PRE`'defn`'POST with a single argument of each definition
|
||
@result{}# from the definition stack of MACRO, starting with the oldest, and
|
||
@result{}# separated by SEP between definitions.
|
||
@result{}define(`stack_foreach_sep',
|
||
@result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl
|
||
@result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')')
|
||
@result{}# stack_foreach_sep_lifo(macro, pre, post, sep)
|
||
@result{}# Like stack_foreach_sep, but starting with the newest definition.
|
||
@result{}define(`stack_foreach_sep_lifo',
|
||
@result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl
|
||
@result{}`_stack_reverse_sep(`tmp-$1', `$1')')
|
||
@result{}define(`_stack_reverse_sep',
|
||
@result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0(
|
||
@result{} `$1', `$2', `$4$3')')')
|
||
@result{}divert`'dnl
|
||
@end example
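
The three extra arguments can also be combined in a single call. The
following sketch uses a hypothetical stack @code{c} and arbitrary
bracket characters for @var{pre} and @var{post}, to show where each
piece lands and that the stack is left intact afterwards.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`stack_sep.m4')
@result{}
pushdef(`c', `1')pushdef(`c', `2')pushdef(`c', `3')
@result{}
stack_foreach_sep(`c', `[', `]', `, ')
@result{}[1], [2], [3]
stack_foreach_sep_lifo(`c', `[', `]', `, ')
@result{}[3], [2], [1]
c
@result{}3
@end example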
|
||
|
||
@ignore
|
||
@comment Not worth putting in the manual, but make sure that
|
||
@comment stack_foreach_sep has linear performance.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd {m4 -I examples}
|
||
include(`forloop3.m4')include(`stack_sep.m4')dnl
|
||
forloop(`i', `1', `10000', `pushdef(`s', i)')
|
||
@result{}
|
||
define(`colon', `:')define(`dash', `-')
|
||
@result{}
|
||
len(stack_foreach_sep(`s', `dash', `', `colon'))
|
||
@result{}58893
|
||
@end example
|
||
@end ignore
|
||
|
||
@node Improved m4wrap
|
||
@section Solution for @code{m4wrap}
|
||
|
||
The replacement @code{m4wrap} versions presented above, designed to
|
||
guarantee FIFO or LIFO order regardless of the underlying M4
|
||
implementation, share a bug when dealing with wrapped text that looks
|
||
like parameter expansion. Note how the invocation of
|
||
@code{m4wrap@var{n}} interprets these parameters, while using the
|
||
builtin preserves them for their intended use.
|
||
|
||
@comment examples
|
||
@example
|
||
$ @kbd{m4 -I examples}
|
||
include(`wraplifo.m4')
|
||
@result{}
|
||
m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
|
||
')
|
||
@result{}
|
||
builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
|
||
')
|
||
@result{}
|
||
^D
|
||
@result{}bar:-a-a,b-2-
|
||
@result{}m4wrap0:---0-
|
||
@end example
|
||
|
||
Additionally, the computation of @code{_m4wrap_level} and creation of
|
||
multiple @code{m4wrap@var{n}} placeholders in the original examples is
|
||
more expensive in time and memory than strictly necessary. Notice how
|
||
the improved version grabs the wrapped text via @code{defn} to avoid
|
||
parameter expansion, then undefines @code{_m4wrap_text}, before
|
||
stripping a level of quotes with @code{_arg1} to expand the text. That
|
||
way, each level of wrapping reuses the single placeholder, which starts
|
||
each nesting level in an undefined state.
|
||
|
||
Finally, it is worth emulating the @acronym{GNU} M4 extension of saving
|
||
all arguments to @code{m4wrap}, separated by a space, rather than saving
|
||
just the first argument. This is done with the @code{joinall} macro
|
||
documented previously (@pxref{Shift}). The improved LIFO example is
|
||
shipped as @file{m4-@value{VERSION}/@/examples/@/wraplifo2.m4}, and can
|
||
easily be converted to a FIFO solution by swapping the adjacent
|
||
invocations of @code{joinall} and @code{defn}.
|
||
|
||
@comment examples
@example
$ @kbd{m4 -I examples}
include(`wraplifo2.m4')
@result{}
undivert(`wraplifo2.m4')dnl
@result{}dnl Redefine m4wrap to have LIFO semantics, improved example.
@result{}include(`join.m4')dnl
@result{}define(`_m4wrap', defn(`m4wrap'))dnl
@result{}define(`_arg1', `$1')dnl
@result{}define(`m4wrap',
@result{}`ifdef(`_$0_text',
@result{} `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))',
@result{} `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
@result{}define(`_$0_text', joinall(` ', $@@))')')dnl
m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
')
@result{}
m4wrap(`lifo text
m4wrap(`nested', `', `$@@
')')
@result{}
^D
@result{}lifo text
@result{}foo:-a-a,b-2-
@result{}nested  $@@
@end example
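As a sketch of the FIFO conversion mentioned above, the following
redefinition differs from @file{wraplifo2.m4} only in the order of
@code{defn} and @code{joinall} in the first branch of the @code{ifdef},
so that new text is appended after the text already saved.  This
variant is typed in directly rather than taken from a shipped file, and
it assumes that @file{join.m4} can be found in the include path.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`join.m4')
@result{}
define(`_m4wrap', defn(`m4wrap'))dnl
define(`_arg1', `$1')dnl
define(`m4wrap',
`ifdef(`_$0_text',
       `define(`_$0_text', defn(`_$0_text')joinall(` ', $@@))',
       `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
define(`_$0_text', joinall(` ', $@@))')')dnl
m4wrap(`first
')
@result{}
m4wrap(`second
')
@result{}
^D
@result{}first
@result{}second
@end example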
@node Improved cleardivert
@section Solution for @code{cleardivert}

The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands,
be called without arguments to clear all pending diversions.  That is
because using @code{undivert} with an empty string for an argument is
different from using it with no arguments at all.  Compare the earlier
definition with one that takes the number of arguments into account:
@example
define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
@result{}
divert(`1')one
divert
@result{}
cleardivert
@result{}
undivert
@result{}one
@result{}
define(`cleardivert',
`pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
`undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
@result{}
divert(`2')two
divert
@result{}
cleardivert
@result{}
undivert
@result{}
@end example
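The distinction comes down to how arguments are counted: a macro name
followed by empty parentheses receives one empty argument rather than
zero arguments, and that is the form produced by expanding @samp{$@@}
inside parentheses when @code{cleardivert} is called with no arguments.
A minimal sketch, using a hypothetical helper named @code{nargs} to
report @samp{$#}, and then the two forms of @code{undivert} in
isolation:

@example
define(`nargs', `$#')
@result{}
nargs
@result{}0
nargs()
@result{}1
divert(`1')one
divert
@result{}
undivert()
@result{}
undivert
@result{}one
@result{}
@end example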
@node Improved capitalize
@section Solution for @code{capitalize}

The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
not allow clients to follow the quoting rule of thumb.  Consider the
three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
difference between calling @code{capitalize} with the expansion of a
macro, expanding the result of a case change, and changing the case of a
double-quoted string:
@comment examples
@example
$ @kbd{m4 -I examples}
include(`capitalize.m4')dnl
define(`active', `act1, ive')dnl
define(`Active', `Act2, Ive')dnl
define(`ACTIVE', `ACT3, IVE')dnl
upcase(active)
@result{}ACT1,IVE
upcase(`active')
@result{}ACT3, IVE
upcase(``active'')
@result{}ACTIVE
downcase(ACTIVE)
@result{}act3,ive
downcase(`ACTIVE')
@result{}act1, ive
downcase(``ACTIVE'')
@result{}active
capitalize(active)
@result{}Act1
capitalize(`active')
@result{}Active
capitalize(``active'')
@result{}_capitalize(`active')
define(`A', `OOPS')
@result{}
capitalize(active)
@result{}OOPSct1
capitalize(`active')
@result{}OOPSctive
@end example
First, when @code{capitalize} is called with more than one argument, it
throws away the later arguments, whereas @code{upcase} and
@code{downcase} use @samp{$*} to collect them all.  The fix is simple:
use @samp{$*} consistently.

Next, with single-quoting, @code{capitalize} outputs a single character,
a set of quotes, then the rest of the characters, making it impossible
to invoke @code{Active} after the fact, and allowing the alternate macro
@code{A} to interfere.  Here, the solution is to use additional quoting
in the helper macros, then pass the final over-quoted output string
through @code{_arg1} to remove the extra quoting and finally invoke the
concatenated portions as a single string.

Finally, when passed a double-quoted string, the nested macro
@code{_capitalize} is never invoked because it ends up nested inside
quotes.  This one is the toughest to fix.  In short, we have no idea how
many levels of quotes are in effect on the substring being altered by
@code{patsubst}.  If the replacement string cannot be expressed entirely
in terms of literal text and backslash substitutions, then we need a
mechanism to guarantee that the helper macros are invoked outside of
quotes.  In other words, this sounds like a job for @code{changequote}
(@pxref{Changequote}).  By changing the active quoting characters, we
can guarantee that replacement text injected by @code{patsubst} always
occurs in the middle of a string that has exactly one level of
over-quoting using alternate quotes; so the replacement text closes the
quoted string, invokes the helper macros, then reopens the quoted
string.  In turn, that means the replacement text has unbalanced quotes,
necessitating another round of @code{changequote}.

In the fixed version below (also shipped as
@file{m4-@value{VERSION}/@/examples/@/capitalize2.m4}), @code{capitalize}
uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
strings are chosen so as to be less likely to appear in the text being
converted).  The helpers @code{_to_alt} and @code{_from_alt} merely
reduce the number of characters required to perform a
@code{changequote}, since the definition changes twice.  The outermost
pair of helper invocations means that @code{patsubst} and
@code{_capitalize_alt} are invoked with alternate quoting; the innermost
pair is used so that the third argument to @code{patsubst} can contain
an unbalanced @samp{]>>}/@samp{<<[} pair.  Note that @code{upcase} and
@code{downcase} must be redefined as @code{_upcase_alt} and
@code{_downcase_alt}, since they contain nested quotes but are invoked
with the alternate quoting scheme in effect.
@comment examples
@example
$ @kbd{m4 -I examples}
include(`capitalize2.m4')dnl
define(`active', `act1, ive')dnl
define(`Active', `Act2, Ive')dnl
define(`ACTIVE', `ACT3, IVE')dnl
define(`A', `OOPS')dnl
capitalize(active; `active'; ``active''; ```actIVE''')
@result{}Act1,Ive; Act2, Ive; Active; `Active'
undivert(`capitalize2.m4')dnl
@result{}divert(`-1')
@result{}# upcase(text)
@result{}# downcase(text)
@result{}# capitalize(text)
@result{}# change case of text, improved version
@result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
@result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
@result{}define(`_arg1', `$1')
@result{}define(`_to_alt', `changequote(`<<[', `]>>')')
@result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
@result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
@result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
@result{}define(`_capitalize_alt',
@result{} `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
@result{} <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
@result{}define(`capitalize',
@result{} `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
@result{} _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
@result{}divert`'dnl
@end example
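The effect of the final pass through @code{_arg1} can be seen in
isolation.  In this minimal sketch, which repeats the definitions of
@code{_arg1} and @code{Active} used above, adjacent quoted strings at
the top level are merely output side by side, whereas collecting them as
a macro argument joins them into a single string that is rescanned and
therefore recognized as a macro name.

@example
define(`_arg1', `$1')
@result{}
define(`Active', `Act2, Ive')
@result{}
`A'`ctive'
@result{}Active
_arg1(`A'`ctive')
@result{}Act2, Ive
@end example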
@node Improved fatal_error
@section Solution for @code{fatal_error}

The @code{fatal_error} macro (@pxref{M4exit}) is not robust when run
under versions of @acronym{GNU} M4 earlier than 1.4.8, where invoking
@code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} resulted in
an empty string, and @code{@w{__line__}} resulted in @samp{0} even
though all files start at line 1.  Furthermore, versions earlier than
1.4.6 did not support the @code{@w{__program__}} macro.  If you want
@code{fatal_error} to work across the entire 1.4.x release series, a
better implementation would be:
@comment status: 1
@example
define(`fatal_error',
`errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
`:ifelse(__line__, `0', `',
`__file__:__line__:')` fatal error: $*
')m4exit(`1')')
@result{}
m4wrap(`divnum(`demo of internal message')
fatal_error(`inside wrapped text')')
@result{}
^D
@error{}m4:stdin:6: Warning: excess arguments to builtin `divnum' ignored
@result{}0
@error{}m4:stdin:6: fatal error: inside wrapped text
@end example
@c ========================================================== Appendices

@node Copying This Package
@appendix How to make copies of the overall M4 package
@cindex License, code

This appendix covers the license for copying the source code of the
overall M4 package.  This manual is under a different set of
restrictions, covered later (@pxref{Copying This Manual}).

@menu
* GNU General Public License::  License for copying the M4 package
@end menu

@node GNU General Public License
@appendixsec License for copying the M4 package
@cindex GPL, GNU General Public License
@cindex GNU General Public License
@cindex General Public License (GPL), GNU
@include gpl-3.0.texi
@node Copying This Manual
@appendix How to make copies of this manual
@cindex License, manual

This appendix covers the license for copying this manual.  Note that
some of the longer examples in this manual are also distributed in the
directory @file{m4-@value{VERSION}/@/examples/}, where a more
permissive license is in effect when copying just the examples.

@menu
* GNU Free Documentation License::  License for copying this manual
@end menu

@node GNU Free Documentation License
@appendixsec License for copying this manual
@cindex FDL, GNU Free Documentation License
@cindex GNU Free Documentation License
@cindex Free Documentation License (FDL), GNU
@include fdl-1.3.texi
@node Indices
@appendix Indices of concepts and macros

@menu
* Macro index::                 Index for all @code{m4} macros
* Concept index::               Index for many concepts
@end menu

@node Macro index
@appendixsec Index for all @code{m4} macros

This index covers all @code{m4} builtins, as well as several useful
composite macros.  References are exclusively to the places where a
macro is introduced the first time.

@printindex fn

@node Concept index
@appendixsec Index for many concepts

@printindex cp

@bye

@c Local Variables:
@c coding: iso-8859-1
@c fill-column: 72
@c ispell-local-dictionary: "american"
@c indent-tabs-mode: nil
@c whitespace-check-buffer-indent: nil
@c End: