marytts.util.string
Class PrintfFormat

java.lang.Object
  extended by marytts.util.string.PrintfFormat

public class PrintfFormat
extends java.lang.Object

PrintfFormat allows the formatting of an array of objects embedded within a string. Primitive types must be passed using wrapper types. The formatting is controlled by a control string.

A control string is a Java string that contains a control specification. The control specification starts at the first percent sign (%) in the string, provided that this percent sign

  1. is not escaped, that is, it is not one half of a %% pair,
  2. is not at the end of the format string, and
  3. precedes a sequence of characters that parses as a valid control specification.

A control specification usually takes the form:

 % ['-+ #0]* [0..9]* { . [0..9]* }+
                { [hlL] }+ [idfgGoxXeEcs]
There are variants of this basic form that are discussed below.
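
To make the basic form concrete, the following is a minimal sketch that splits a specification into its parts with a regular expression. The class and method names (ControlSpecSketch, describe) are hypothetical and are not part of PrintfFormat. The pattern assumes that the brace groups in the form above are optional, and it deliberately ignores %% escapes and the %n$ and * variants.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of marytts.
final class ControlSpecSketch {
    // Groups: 1 = flags, 2 = field width, 3 = precision, 4 = size modifier, 5 = conversion character.
    private static final Pattern SPEC = Pattern.compile(
            "%(['\\-+ #0]*)(\\d*)(?:\\.(\\d*))?([hlL]?)([idfgGoxXeEcs])");

    /** Returns a readable breakdown of the first specification found, or null if none matches. */
    static String describe(String format) {
        Matcher m = SPEC.matcher(format);
        if (!m.find()) {
            return null;
        }
        // The precision group is absent (null) when the specification has no period.
        String precision = m.group(3) == null ? "" : m.group(3);
        return "flags=[" + m.group(1) + "] width=[" + m.group(2) + "] precision=[" + precision
                + "] size=[" + m.group(4) + "] conversion=[" + m.group(5) + "]";
    }

    public static void main(String[] args) {
        // prints: flags=[-0] width=[8] precision=[3] size=[l] conversion=[f]
        System.out.println(describe("Total: %-08.3lf units"));
    }
}
```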

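The format is composed of zero or more directives, each of which is one of the following:

  1. ordinary characters, which are copied to the output unchanged,
  2. escape sequences, which represent non-graphic characters, and
  3. conversion specifications, each of which results in fetching zero or more arguments.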

The results are undefined if there are insufficient arguments for the format. Usually an unchecked exception will be thrown. If the format is exhausted while arguments remain, the excess arguments are evaluated but are otherwise ignored. In format strings containing the % form of conversion specifications, each argument in the argument list is used exactly once.

Conversions can be applied to the nth argument after the format in the argument list, rather than to the next unused argument. In this case, the conversion character % is replaced by the sequence %n$, where n is a decimal integer giving the position of the argument in the argument list.

In format strings containing the %n$ form of conversion specifications, each argument in the argument list is used exactly once.
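
The %n$ form can be illustrated with the JDK's own formatter, which accepts the same notation and also counts arguments from 1. This sketch uses String.format as a stand-in and does not call PrintfFormat, so it only shows the notation; the class name NumberedArgumentSketch is hypothetical.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; uses java.util.Formatter, not PrintfFormat.
final class NumberedArgumentSketch {
    /** Formats the two arguments in reverse order by selecting them by position. */
    static String swap(String first, String second) {
        // %2$s selects the second argument, %1$s selects the first.
        return String.format(Locale.ROOT, "%2$s, %1$s", first, second);
    }

    public static void main(String[] args) {
        System.out.println(swap("Allan", "Jacobs")); // prints: Jacobs, Allan
    }
}
```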

Escape Sequences

The following table lists escape sequences and associated actions on display devices capable of the action.

Sequence  Name             Description
\\        backslash        None.
\a        alert            Attempts to alert the user through audible or visible notification.
\b        backspace        Moves the printing position to one column before the current position, unless the current position is the start of a line.
\f        form-feed        Moves the printing position to the initial printing position of the next logical page.
\n        newline          Moves the printing position to the start of the next line.
\r        carriage-return  Moves the printing position to the start of the current line.
\t        tab              Moves the printing position to the next implementation-defined horizontal tab position.
\v        vertical-tab     Moves the printing position to the start of the next implementation-defined vertical tab position.

Conversion Specifications

Each conversion specification is introduced by the percent sign character (%). After the character %, the following appear in sequence:

Zero or more flags (in any order), which modify the meaning of the conversion specification.

An optional minimum field width. If the converted value has fewer characters than the field width, it is padded with spaces, on the left by default, or on the right if the left-adjustment flag (-), described below, is given. The field width takes the form of a decimal integer. If the conversion character is s, the field width is the minimum number of characters to be printed.

An optional precision that gives the minimum number of digits to appear for the d, i, o, x, or X conversions (the field is padded with leading zeros); the number of digits to appear after the radix character for the e, E, and f conversions; the maximum number of significant digits for the g and G conversions; or the maximum number of characters to be written from a string in s and S conversions. The precision takes the form of an optional decimal digit string, where a null digit string is treated as 0. If a precision appears with a c conversion character, the precision is ignored.

An optional h specifies that a following d, i, o, x, or X conversion character applies to a type short argument (the argument will be promoted according to the integral promotions and its value converted to type short before printing).

An optional l (ell) specifies that a following d, i, o, x, or X conversion character applies to a type long argument.

A field width or precision may be indicated by an asterisk (*) instead of a digit string. In this case, an integer argument supplies the field width or precision. The argument that is actually converted is not fetched until the conversion letter is seen, so the arguments specifying field width or precision must appear before the argument (if any) to be converted. If the precision argument is negative, it will be changed to zero. A negative field width argument is taken as a - flag, followed by a positive field width.
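
The two rules for negative asterisk arguments can be sketched as a small normalization step. The code below is an illustration under stated assumptions, not the class's actual implementation: the class and method names are hypothetical, and the method simply builds the explicit specification that a %*.*<conversion> with the given arguments would be equivalent to.

```java
// Hypothetical helper for illustration only; not part of marytts.
final class AsteriskArgumentSketch {
    /**
     * Builds the explicit specification equivalent to "%*.*" followed by the
     * conversion character, given the width and precision arguments.
     */
    static String toExplicitSpec(int widthArg, int precisionArg, char conversion) {
        StringBuilder spec = new StringBuilder("%");
        if (widthArg < 0) {
            // A negative width is taken as a - flag followed by a positive width.
            spec.append('-');
            widthArg = -widthArg;
        }
        if (precisionArg < 0) {
            // A negative precision is changed to zero.
            precisionArg = 0;
        }
        if (widthArg > 0) {
            // A width of zero is omitted so that it is not read as the 0 flag.
            spec.append(widthArg);
        }
        spec.append('.').append(precisionArg).append(conversion);
        return spec.toString();
    }

    public static void main(String[] args) {
        System.out.println(toExplicitSpec(8, 2, 'f'));   // prints: %8.2f
        System.out.println(toExplicitSpec(-8, -1, 'f')); // prints: %-8.0f
    }
}
```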

In format strings containing the %n$ form of a conversion specification, a field width or precision may be indicated by the sequence *m$, where m is a decimal integer giving the position in the argument list (after the format argument) of an integer argument containing the field width or precision.

The format can contain either numbered argument specifications (that is, %n$ and *m$), or unnumbered argument specifications (that is % and *), but normally not both. The only exception to this is that %% can be mixed with the %n$ form. The results of mixing numbered and unnumbered argument specifications in a format string are undefined.

Flag Characters

The flags and their meanings are:

'
The integer portion of the result of a decimal conversion (%i, %d, %f, %g, or %G) will be formatted with thousands' grouping characters. For other conversions the flag is ignored. The non-monetary grouping character is used.
-
The result of the conversion is left-justified within the field. (It will be right-justified if this flag is not specified.)
+
The result of a signed conversion always begins with a sign (+ or -). (It will begin with a sign only when a negative value is converted if this flag is not specified.)
<space>
If the first character of a signed conversion is not a sign, a space character will be placed before the result. This means that if the space character and + flags both appear, the space flag will be ignored.
#
The value is to be converted to an alternative form. For c, d, i, and s conversions, the flag has no effect. For o conversion, it increases the precision to force the first digit of the result to be a zero. For x or X conversion, a non-zero result has 0x or 0X prefixed to it, respectively. For e, E, f, g, and G conversions, the result always contains a radix character, even if no digits follow the radix character (normally, a decimal point appears in the result of these conversions only if a digit follows it). For g and G conversions, trailing zeros will not be removed from the result as they normally are.
0
For d, i, o, x, X, e, E, f, g, and G conversions, leading zeros (following any indication of sign or base) are used to pad to the field width; no space padding is performed. If the 0 and - flags both appear, the 0 flag is ignored. For d, i, o, x, and X conversions, if a precision is specified, the 0 flag will be ignored. For c conversions, the flag is ignored.
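
Most of these flags have the same spelling and effect in the JDK's java.util.Formatter, so their behavior can be tried without PrintfFormat. The sketch below uses String.format as a stand-in; the class name FlagSketch and its fmt helper are hypothetical. One known difference is that the JDK formatter spells the grouping flag as a comma rather than an apostrophe, so that flag is left out here.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; uses java.util.Formatter, not PrintfFormat.
final class FlagSketch {
    /** Formats with a fixed locale so the output does not depend on the platform default. */
    static String fmt(String format, Object... args) {
        return String.format(Locale.ROOT, format, args);
    }

    public static void main(String[] args) {
        System.out.println(fmt("[%5d]", 42));  // [   42]  right-justified by default
        System.out.println(fmt("[%-5d]", 42)); // [42   ]  - left-justifies
        System.out.println(fmt("[%+d]", 42));  // [+42]    + always prints a sign
        System.out.println(fmt("[% d]", 42));  // [ 42]    space in place of a missing sign
        System.out.println(fmt("[%05d]", 42)); // [00042]  0 pads with leading zeros
        System.out.println(fmt("[%#x]", 255)); // [0xff]   # adds the 0x prefix
        System.out.println(fmt("[%#o]", 8));   // [010]    # forces a leading zero
    }
}
```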

Conversion Characters

Each conversion character results in fetching zero or more arguments. The results are undefined if there are insufficient arguments for the format. Usually, an unchecked exception will be thrown. If the format is exhausted while arguments remain, the excess arguments are ignored.

The conversion characters and their meanings are:

d,i
The int argument is converted to a signed decimal in the style [-]dddd. The precision specifies the minimum number of digits to appear; if the value being converted can be represented in fewer digits, it will be expanded with leading zeros. The default precision is 1. The result of converting 0 with an explicit precision of 0 is no characters.
o
The int argument is converted to unsigned octal format in the style ddddd. The precision specifies the minimum number of digits to appear; if the value being converted can be represented in fewer digits, it will be expanded with leading zeros. The default precision is 1. The result of converting 0 with an explicit precision of 0 is no characters.
x
The int argument is converted to unsigned hexadecimal format in the style dddd; the letters abcdef are used. The precision specifies the minimum number of digits to appear; if the value being converted can be represented in fewer digits, it will be expanded with leading zeros. The default precision is 1. The result of converting 0 with an explicit precision of 0 is no characters.
X
Behaves the same as the x conversion character except that letters ABCDEF are used instead of abcdef.
f
The floating point number argument is written in decimal notation in the style [-]ddd.ddd, where the number of digits after the radix character (shown here as a decimal point) is equal to the precision specification. A Locale is used to determine the radix character to use in this format. If the precision is omitted from the argument, six digits are written after the radix character; if the precision is explicitly 0 and the # flag is not specified, no radix character appears. If a radix character appears, at least 1 digit appears before it. The value is rounded to the appropriate number of digits.
e,E
The floating point number argument is written in the style [-]d.ddde{+-}dd (the symbols {+-} indicate either a plus or minus sign), where there is one digit before the radix character (shown here as a decimal point) and the number of digits after it is equal to the precision. A Locale is used to determine the radix character to use in this format. When the precision is missing, six digits are written after the radix character; if the precision is 0 and the # flag is not specified, no radix character appears. The E conversion will produce a number with E instead of e introducing the exponent. The exponent always contains at least two digits. However, if the value to be written requires an exponent greater than two digits, additional exponent digits are written as necessary. The value is rounded to the appropriate number of digits.
g,G
The floating point number argument is written in style f or e (or in style E in the case of a G conversion character), with the precision specifying the number of significant digits. If the precision is zero, it is taken as one. The style used depends on the value converted: style e (or E) will be used only if the exponent resulting from the conversion is less than -4 or greater than or equal to the precision. Trailing zeros are removed from the result. A radix character appears only if it is followed by a digit.
c,C
The integer argument is converted to a char and the result is written.
s,S
The argument is taken to be a string and bytes from the string are written until the end of the string or the number of bytes indicated by the precision specification of the argument is reached. If the precision is omitted from the argument, it is taken to be infinite, so all characters up to the end of the string are written.
%
Write a % character; no argument is converted.
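
Several of these conversion characters behave the same way in the JDK's java.util.Formatter, which makes it a convenient stand-in for a quick illustration. The sketch below does not call PrintfFormat; the class name ConversionSketch and its fmt helper are hypothetical. The JDK formatter does not accept the i conversion and rejects a precision on integer conversions, so those documented behaviors are not shown.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; uses java.util.Formatter, not PrintfFormat.
final class ConversionSketch {
    /** Formats with a fixed locale so the radix character is always a period. */
    static String fmt(String format, Object... args) {
        return String.format(Locale.ROOT, format, args);
    }

    public static void main(String[] args) {
        System.out.println(fmt("%d", -17));        // -17
        System.out.println(fmt("%o", 8));          // 10
        System.out.println(fmt("%x", 255));        // ff
        System.out.println(fmt("%X", 255));        // FF
        System.out.println(fmt("%f", 1.5));        // 1.500000 (six digits by default)
        System.out.println(fmt("%.2f", 3.14159));  // 3.14
        System.out.println(fmt("%e", 12345.678));  // 1.234568e+04
        System.out.println(fmt("%c", 65));         // A
        System.out.println(fmt("%.3s", "printf")); // pri (precision limits the characters written)
        System.out.println(fmt("100%%"));          // 100%
    }
}
```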

If a conversion specification does not match one of the above forms, an IllegalArgumentException is thrown and the instance of PrintfFormat is not created.

If a floating point value is the internal representation for infinity, the output is [+]Infinity, where Infinity is either Infinity or Inf, depending on the desired output string length. Printing of the sign follows the rules described above.

If a floating point value is the internal representation for "not-a-number," the output is [+]NaN. Printing of the sign follows the rules described above.

In no case does a non-existent or small field width cause truncation of a field; if the result of a conversion is wider than the field width, the field is simply expanded to contain the conversion result.
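
The no-truncation rule amounts to padding only when the converted text is shorter than the field width. The following is a minimal sketch of that rule with space padding only; the class and method names are hypothetical, and it is not taken from the PrintfFormat source.

```java
// Hypothetical helper for illustration only; not part of marytts.
final class FieldWidthSketch {
    /** Pads the converted text with spaces up to the field width, never shortening it. */
    static String pad(String converted, int width, boolean leftJustify) {
        if (converted.length() >= width) {
            // The field expands to hold the whole conversion result.
            return converted;
        }
        StringBuilder padding = new StringBuilder();
        for (int i = converted.length(); i < width; i++) {
            padding.append(' ');
        }
        return leftJustify ? converted + padding : padding + converted;
    }

    public static void main(String[] args) {
        System.out.println("[" + pad("42", 5, false) + "]");     // [   42]
        System.out.println("[" + pad("42", 5, true) + "]");      // [42   ]
        System.out.println("[" + pad("123456", 3, false) + "]"); // [123456]
    }
}
```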

The behavior is like printf. One exception is that the minimum number of exponent digits is 3 instead of 2 for e and E formats when the optional L is used before the e, E, g, or G conversion character. The optional L does not imply conversion to a long long double.

The biggest divergence from the C printf specification is in the use of 16 bit characters. This allows the handling of characters beyond the small ASCII character set and allows the utility to interoperate correctly with the rest of the Java runtime environment.

Omissions from the C printf specification are numerous. All the known omissions are present because Java never uses bytes to represent characters and does not have pointers:

Most of this specification is quoted from the Unix man page for the sprintf utility.

Version:
1
  Release 1: Initial release.
  Release 2: Asterisk field widths and precisions
             %n$ and *m$
             Bug fixes
               g format fix (2 digits in e form corrupt)
               rounding in f format implemented
               round up when digit not printed is 5
               formatting of -0.0f
               round up/down when last digits are 50000...
Author:
Allan Jacobs

Constructor Summary
PrintfFormat(java.util.Locale locale, java.lang.String fmtArg)
          Constructs an array of control specifications possibly preceded, separated, or followed by ordinary strings.
PrintfFormat(java.lang.String fmtArg)
          Constructs an array of control specifications possibly preceded, separated, or followed by ordinary strings.
 
Method Summary
 java.lang.String sprintf()
          Format nothing.
 java.lang.String sprintf(double x)
          Format a double.
 java.lang.String sprintf(int x)
          Format an int.
 java.lang.String sprintf(long x)
          Format a long.
 java.lang.String sprintf(java.lang.Object x)
          Format an Object.
 java.lang.String sprintf(java.lang.Object[] o)
          Format an array of objects.
 java.lang.String sprintf(java.lang.String x)
          Format a String.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PrintfFormat

public PrintfFormat(java.lang.String fmtArg)
             throws java.lang.IllegalArgumentException
Constructs an array of control specifications possibly preceded, separated, or followed by ordinary strings. Control strings begin with unpaired percent signs. A pair of successive percent signs designates a single percent sign in the format.

Parameters:
fmtArg - Control string.
Throws:
java.lang.IllegalArgumentException - if the control string is null, zero length, or otherwise malformed.

PrintfFormat

public PrintfFormat(java.util.Locale locale,
                    java.lang.String fmtArg)
             throws java.lang.IllegalArgumentException
Constructs an array of control specifications possibly preceded, separated, or followed by ordinary strings. Control strings begin with unpaired percent signs. A pair of successive percent signs designates a single percent sign in the format.

Parameters:
fmtArg - Control string.
Throws:
java.lang.IllegalArgumentException - if the control string is null, zero length, or otherwise malformed.

Method Detail

sprintf

public java.lang.String sprintf(java.lang.Object[] o)
Format an array of objects. Byte, Short, Integer, Long, Float, Double, and Character arguments are treated as wrappers for primitive types.

Parameters:
o - The array of objects to format.
Returns:
The formatted String.

sprintf

public java.lang.String sprintf()
Format nothing. Just use the control string.

Returns:
the formatted String.

sprintf

public java.lang.String sprintf(int x)
                         throws java.lang.IllegalArgumentException
Format an int.

Parameters:
x - The int to format.
Returns:
The formatted String.
Throws:
java.lang.IllegalArgumentException - if the conversion character is f, e, E, g, G, s, or S.

sprintf

public java.lang.String sprintf(long x)
                         throws java.lang.IllegalArgumentException
Format a long.

Parameters:
x - The long to format.
Returns:
The formatted String.
Throws:
java.lang.IllegalArgumentException - if the conversion character is f, e, E, g, G, s, or S.

sprintf

public java.lang.String sprintf(double x)
                         throws java.lang.IllegalArgumentException
Format a double.

Parameters:
x - The double to format.
Returns:
The formatted String.
Throws:
java.lang.IllegalArgumentException - if the conversion character is c, C, s, S, d, i, x, X, or o.

sprintf

public java.lang.String sprintf(java.lang.String x)
                         throws java.lang.IllegalArgumentException
Format a String.

Parameters:
x - The String to format.
Returns:
The formatted String.
Throws:
java.lang.IllegalArgumentException - if the conversion character is neither s nor S.

sprintf

public java.lang.String sprintf(java.lang.Object x)
                         throws java.lang.IllegalArgumentException
Format an Object. Convert wrapper types to their primitive equivalents and call the appropriate internal formatting method. Convert Strings using an internal formatting method for Strings. Otherwise use the default formatter (use toString).

Parameters:
x - the Object to format.
Returns:
the formatted String.
Throws:
java.lang.IllegalArgumentException - if the conversion character is inappropriate for formatting an unwrapped value.
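
The dispatch described for sprintf(java.lang.Object) can be pictured as a chain of type checks. The sketch below only classifies the argument and reports which path it would take. It is an assumption about the general shape of such a dispatch, not the actual PrintfFormat code, and the class and method names are hypothetical.

```java
// Hypothetical helper for illustration only; not part of marytts.
final class ObjectDispatchSketch {
    /** Reports which formatting path an argument would take, together with its unwrapped value. */
    static String describe(Object x) {
        if (x instanceof Byte || x instanceof Short || x instanceof Integer || x instanceof Long) {
            // Integral wrappers are unwrapped to a primitive integer value.
            return "integer:" + ((Number) x).longValue();
        } else if (x instanceof Float || x instanceof Double) {
            // Floating point wrappers are unwrapped to a primitive double value.
            return "floating:" + ((Number) x).doubleValue();
        } else if (x instanceof Character) {
            return "char:" + ((Character) x).charValue();
        } else if (x instanceof String) {
            return "string:" + x;
        }
        // Anything else falls back to its string representation.
        return "object:" + String.valueOf(x);
    }

    public static void main(String[] args) {
        System.out.println(describe(Integer.valueOf(7)));      // integer:7
        System.out.println(describe(Double.valueOf(2.5)));     // floating:2.5
        System.out.println(describe(Character.valueOf('c')));  // char:c
        System.out.println(describe("text"));                  // string:text
        System.out.println(describe(new StringBuilder("sb"))); // object:sb
    }
}
```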