Chapter 11

Formatting Your Output



The Perl programs you've seen so far produce output using the print function, which writes raw, unformatted text to a file.

Perl also enables you to produce formatted output, using print formats and the built-in function write. Today's lesson describes how to define print formats and how to use them to produce formatted output.

Defining a Print Format

The following is an example of a simple print format:

format MYFORMAT =
===================================
Here is the text I want to display.
===================================
.

This defines the print format MYFORMAT.

The syntax for print formats is

format formatname =
lines_of_output
.

The special keyword format tells the Perl interpreter that the following lines are a print-format definition. The formatname is a placeholder for the name of the print format being defined (for example, MYFORMAT). This name must start with an alphabetic character and can consist of any sequence of letters, digits, or underscores.

The lines_of_output consists of one or more lines of text that are to be printed when the print format is utilized; these lines are sometimes called picture lines. In the MYFORMAT example, there are three lines of text printed: two lines containing = characters, and the line

Here is the text I want to display.

A print-format definition is terminated with a line containing a period character. This line can contain nothing else; there can be no white space, and the period must be the first character on the line.

Like subroutines, print-format definitions can appear anywhere in program code (even, for example, in the middle of a conditional statement). However, it usually is best to cluster them either at the beginning or the end of the program.

Displaying a Print Format

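To display output using a print format, you need to do two things: set the system variable $~ to the name of the print format you want to use, and then call the built-in function write.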

Listing 11.1 is an example of a simple program that displays output using a print format.


Listing 11.1. A program that uses a print format.
1:  #!/usr/local/bin/perl
2:
3:  $~ = "MYFORMAT";
4:  write;
5:
6:  format MYFORMAT =
7:  ===================================
8:  Here is the text I want to display.
9:  ===================================
10: .

$ program11_1
===================================
Here is the text I want to display.
===================================
$

Line 3 of this program assigns the character string MYFORMAT to the system variable $~. This tells the Perl interpreter that MYFORMAT is the print format to use when calling write.

Line 4 calls write, which sends the text defined in MYFORMAT to the standard output file.

Lines 6-10 contain the definition of the print format MYFORMAT.

NOTE
If you don't specify a print format by assigning to $~, the Perl interpreter assumes that the print format to use has the same name as the file variable being written to. In this example program, if line 3 had not specified MYFORMAT as the print format to use, the Perl interpreter would have tried to use a print format named STDOUT when executing the call to write in line 4, because the call to write is writing to the standard output file.
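
The default described in this note can be sketched as follows. This is a minimal illustration rather than one of the chapter's listings: the print format is deliberately named STDOUT, so write should find it without any assignment to $~.

```perl
#!/usr/local/bin/perl

# No assignment to $~ appears here. write looks for a print format
# with the same name as the file variable it writes to: STDOUT.
write;

format STDOUT =
===================================
Here is the text I want to display.
===================================
.
```

If the format were renamed to something else, the call to write would fail unless $~ were set to the new name first.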

Displaying Values in a Print Format

Of course, the main reason to use print formats is to format values stored in scalar variables or array variables to produce readable output. Perl enables you to do this by specifying value fields as part of a format definition.

Each value field specifies a value: the name of a scalar variable, for example, or an expression. When the write statement is invoked, the value is displayed in the format specified by the value field.

Listing 11.2 shows how value fields work. This program keeps track of the number of occurrences of the letters a, e, i, o, and u in a text file.


Listing 11.2. A program that uses value fields to print output.
1:  #!/usr/local/bin/perl
2:
3:  while ($line = <STDIN>) {
4:          $line =~ s/[^aeiou]//g;
5:          @vowels = split(//, $line);
6:          foreach $vowel (@vowels) {
7:                  $vowelcount{$vowel} += 1;
8:          }
9:  }
10: $~ = "VOWELFORMAT";
11: write;
12:
13: format VOWELFORMAT =
14: ==========================================================
15: Number of vowels found in text file:
16:           a: @<<<<<   e: @<<<<<
17:           $vowelcount{"a"}, $vowelcount{"e"}
18:           i: @<<<<<   o: @<<<<<
19:           $vowelcount{"i"}, $vowelcount{"o"}
20:           u: @<<<<<
21:           $vowelcount{"u"}
22: ==========================================================
23: .

$ program11_2
This is a test file.
This test file contains some vowels.
The quick brown fox jumped over the lazy dog.
^D
==========================================================
Number of vowels found in text file:
          a: 3        e: 10
          i: 7        o: 7
          u: 2
==========================================================
$

This program reads one line of input at a time. Line 4 removes everything that is not a, e, i, o, or u from the input line, and line 5 splits the remaining characters into the array @vowels. Each element of @vowels is one character of the input line.

Lines 6-8 count the vowels in the input line by examining the elements of @vowels and adding to the associative array %vowelcount.

Line 10 sets the current print format to VOWELFORMAT; line 11 prints using VOWELFORMAT.

The print format VOWELFORMAT is defined in lines 13-23. Line 16 is an example of a print format line that contains value fields; in this case, two value fields are defined. Each value field has the format @<<<<<, which indicates six left-justified characters. (For a complete description of the possible value fields, see the section called "Choosing a Value-Field Format," later today.)

When one or more value fields appear in a print-format line, the next line must define the value or values to be printed in this value field. Because line 16 defines two value fields, line 17 defines the two values to be printed. These values are $vowelcount{"a"} and $vowelcount{"e"}, which are the number of occurrences of a and e, respectively.

Similarly, line 18 defines two more value fields to be printed, and line 19 indicates that the values to be printed in these fields are $vowelcount{"i"} and $vowelcount{"o"}. Finally, line 20 defines a fifth value field, and line 21 specifies that $vowelcount{"u"} is to be printed in this field.

NOTE
Three things to note about the values that are specified for value-field formats:
  • The lines containing values to be printed are not themselves printed. For example, in Listing 11.2, lines 16, 18, and 20 are printed, but lines 17, 19, and 21 are not.
  • The Perl interpreter ignores spacing when it looks for values corresponding to value fields. Many people prefer to line up their values with the corresponding value fields on the previous line, but there is no need to do so.
  • The number of values specified must match the number of value fields defined on the previous line.
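
As a small sketch of these three points (the variable names $name and $count and the format name ITEMLINE are invented for this illustration), the following program supplies two values for two value fields without lining them up under the fields:

```perl
#!/usr/local/bin/perl

$name = "widget";
$count = 12;
$~ = "ITEMLINE";
write;

format ITEMLINE =
Item: @<<<<<<<<<  Count: @>>>
$name,$count
.
```

Only the picture line is printed. The line containing $name and $count supplies the values and does not appear in the output, and its spacing has no effect on where the values are displayed.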

Creating a General-Purpose Print Format

One disadvantage of print formats as defined in Perl is that scalar-variable names are included as part of the definition. For example, in the following definition, the scalar variable $winnum is built into the print format definition MYFORMAT:

format MYFORMAT =
==========================================================
The winning number is @<<<<<<!
$winnum
==========================================================
.

When write is called with this print format, as in the following, you have to remember that $winnum is being used by MYFORMAT.

$~ = "MYFORMAT";
write;

If, later on, you accidentally delete all references to $winnum in the program, the call to write will stop working properly.

One way to get around this problem is to call write from within a subroutine, and to use variables local to the subroutine in the print format that write uses. Listing 11.3 is a program that does this. It reads text from the standard input file and prints out the number of occurrences of the five most frequently occurring letters.


Listing 11.3. A program that calls write from within a subroutine.
1:  #!/usr/local/bin/perl
2:
3:  while ($line = <STDIN>) {
4:          $line =~ tr/A-Z/a-z/;
5:          $line =~ s/[^a-z]//g;
6:          @letters = split(//, $line);
7:          foreach $letter (@letters) {
8:                  $lettercount{$letter} += 1;
9:          }
10: }
11:
12: $~ = "WRITEHEADER";
13: write;
14: $count = 0;
15: foreach $letter (reverse sort occurrences
16:                 (keys(%lettercount))) {
17:         &write_letter($letter, $lettercount{$letter});
18:         last if (++$count == 5);
19: }
20:
21: sub occurrences {
22:         $lettercount{$a} <=> $lettercount{$b};
23: }
24: sub write_letter {
25:         local($letter, $value) = @_;
26:
27:         $~ = "WRITELETTER";
28:         write;
29: }
30: format WRITEHEADER =
31: The five most frequently occurring letters are:
32: .
33: format WRITELETTER =
34:         @:  @<<<<<<
35:         $letter, $value
36: .


$ program11_3
This is a test file.
This test file contains some input.
The quick brown fox jumped over the lazy dog.
^D
The five most frequently occurring letters are:
        t:  10
        e:  9
        i:  8
        s:  7
        o:  6
$

Like the vowel-counting program in Listing 11.2, this program processes one line of input at a time. Line 4 translates all uppercase alphabetic characters into lowercase, so that they can be included in the letter count. Line 5 gets rid of all characters that are not letters, including any white space.

Line 6 splits the line into its individual letters; lines 7-9 examine each letter and increment the appropriate letter counters, which are stored in the associative array %lettercount.

Lines 12 and 13 print the following line by setting the current print format to WRITEHEADER and calling write:

The five most frequently occurring letters are:

Except in very special cases, never mix calls to write with calls to print. Your program should use one printing function or the other, not both.

Lines 15-19 sort the array %lettercount in order of occurrence. The first letter to appear in the foreach loop is the letter that appears most often in the file. To sort the array in order of occurrence, lines 15 and 16 specify that sorting is to be performed according to the rules defined in the subroutine occurrences. This subroutine tells the Perl interpreter to use the values of the associative array elements as the sort criterion.

Line 17 passes the letter and its occurrence count to the subroutine write_letter. This subroutine sets the current print format to WRITELETTER; this print format refers to the local scalar variables $letter and $value, which contain the values passed to write_letter by line 17. This means that each call to write_letter prints the letter and value currently being examined by the foreach loop.

Note that the first value field in the print format WRITELETTER contains only a single character, @. This indicates that the value field is only one character long (which makes sense, because this is a single letter).

Line 18 ensures that the foreach loop quits after the five most frequently used letters have been examined and printed.

TIP
Some programs, such as the one in Listing 11.3, use more than one print-format definition. To make it easier to see which print format is being used by a particular call to write, always keep the print format specification statement and the write call together. For example:
$~ = "WRITEFORMAT";
write;
Here, it is obvious that the call to write is using the print format WRITEFORMAT.

Formats and Local Variables

In Listing 11.3, you might have noticed that the subroutine write_letter calls write to write out a letter and its value:

sub write_letter {
        local($letter, $value) = @_;

        $~ = "WRITELETTER";
        write;
}

This subroutine works properly even though the WRITELETTER print format is defined outside the subroutine.

Note, however, that local variables defined using my cannot be written out using a print format unless the format is defined inside the subroutine. (To see this for yourself, change line 25 of Listing 11.3 to the following and run the program again:

my($letter,$value) = @_;

You will notice that the letter counts do not appear.) This limitation is a result of the way local variables defined using my are stored by the Perl interpreter. To avoid this difficulty, use local instead of my when you define local variables that are to be written out using write. (For a discussion of local and my, see Day 9, "Using Subroutines.")

Perl 4 users will not run into this problem, because my is not defined for that version of the language.

NOTE
In versions of Perl 5 earlier than version 5.001, local variables defined using my cannot be written out at all. Even in version 5.001, variables defined using my might not behave in the way you expect them to. As a consequence, it is best to avoid using my with print formats.
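
The scoping rule can be sketched as follows. This example assumes a reasonably recent Perl 5, not the early versions mentioned in the note. The variables are declared with my at file level, before the format definition, so the format is defined within their scope and can refer to them.

```perl
#!/usr/local/bin/perl

# Assumption: a recent Perl 5. Because $letter and $value are declared
# with my before the format definition in the same file, the format is
# defined inside their scope.
my ($letter, $value) = ("e", 9);

$~ = "WRITELETTER";
write;

format WRITELETTER =
        @:  @<<<<<<
$letter, $value
.
```

If the my declaration were moved inside a subroutine while the format stayed outside it, the format would no longer refer to these variables, which is the situation described for Listing 11.3.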

Choosing a Value-Field Format

Now that you know how print formats and write work, it's time to look at the value-field formats that are available. Table 11.1 lists these formats.

Table 11.1. Valid value-field formats.

Field      Value-field format
@<<<       Left-justified output
@>>>       Right-justified output
@|||       Centered output
@##.##     Fixed-precision numeric
@*         Multiline text

NOTE
In left-justified output, the value being displayed appears at the left end of the value field. In right-justified output, the value being displayed appears at the right end of the value field.

In each of the field formats, the first character is a line-fill character. It indicates whether text formatting is required. If the @ character is specified as the line fill character, text formatting is not performed. (For a discussion of text formatting, see the section titled "Formatting Long Character Strings," later today.)

In all cases, except for the multiline value field @*, the width of the field is equal to the number of characters specified. The @ character is included when counting the number of characters in the value field. For example, the following field is five characters wide (one @ character and four > characters):

@>>>>

Similarly, the following field is seven characters wide (four before the decimal point, two after the decimal point, and the decimal point itself):

@###.##

Listing 11.4 illustrates how you can use the value field formats to produce a neatly printed report. The report is redirected to a file for later printing.


Listing 11.4. A program that uses the various value-field formats.
1:  #!/usr/local/bin/perl
2:
3:  $company = <STDIN>;
4:  $~ = "COMPANY";
5:  write;
6:
7:  $grandtotal = 0;
8:  $custline = <STDIN>;
9:  while ($custline ne "") {
10:         $total = 0;
11:         ($customer, $date) = split(/#/, $custline);
12:         $~ = "CUSTOMER";
13:         write;
14:         while (1) {
15:                 $orderline = <STDIN>;
16:                 if ($orderline eq "" || $orderline =~ /#/) {
17:                         $custline = $orderline;
18:                         last;
19:                 }
20:                 ($item, $cost) = split(/:/, $orderline);
21:                 $~ = "ORDERLINE";
22:                 write;
23:                 $total += $cost;
24:         }
25:         &write_total ("Total:", $total);
26:         $grandtotal += $total;
27: }
28: &write_total ("Grand total:", $grandtotal);
29:
30: sub write_total {
31:         local ($totalstring, $total) = @_;
32:         $~ = "TOTAL";
33:         write;
34: }
35:
36: format COMPANY =
37: ************* @|||||||||||||||||||||||||||||| *************
38: $company
39: .
40: format CUSTOMER =
41: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<                @>>>>>>>>>>>>
42: $customer, $date
43: .
44: format ORDERLINE =
45:           @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<           @####.##
46: $item, $cost
47: .
48: format TOTAL =
49: @<<<<<<<<<<<<<<                                   @#####.##
50: $totalstring, $total
51:
52: .


$ program11_4 >report
Consolidated Widgets, Inc.
John Doe#Feb 11, 1994
1 flying widget:171.42
1 crawling widget:89.99
Mary Smith#May 4, 1994
2 swimming widgets:203.43
^D
$

The following report is written to the report file:

*************   Consolidated Widgets, Inc.    *************
John Doe                                       Feb 11, 1994
          1 flying widget                            171.42
          1 crawling widget                           89.99
Total:                                               261.41

Mary Smith                                      May 4, 1994
          2 swimming widgets                         203.43
Total:                                               203.43

Grand total:                                         464.84

This program starts off by reading the company name from the standard input file and then writing it out. Line 5 writes the company name using the print format COMPANY, which uses a centered output field to display the company name in the center of the line.

After the company name has been printed, the program starts processing data for one customer at a time. Each customer record is assumed to consist of a customer name and date followed by lines of orders. The customer name record uses a # character as the field separator, and the order records use : characters as the separator; this enables the program to distinguish one type of record from the other.

Line 13 prints the customer information using the CUSTOMER print format. This format contains two fields: a left-justified output field for the customer name, and a right-justified output field for the date of the transaction.

Line 22 prints an order line using the ORDERLINE print format. This print format also contains two fields: a left-justified output field indicating the item ordered, and a numeric field to display the cost of the item.

The value field format @####.## indicates that the cost is to be displayed as a floating-point number. This number is defined as containing at most five digits before the decimal point, and two digits after.

Finally, the print format TOTAL prints the customer total and the grand total. Because this print format is used inside a subroutine, the same print format can be used to print both totals.

Normally, any floating-point number you print is rounded when necessary. For example, when you print 43.999 in the value field @#.##, it appears as 44.00.
However, a floating-point number whose last decimal place is 5 might or might not round as you expect. For example, if you are writing using the value field @#.##, some numbers whose third and last decimal place is 5 will round up and others will not. This happens because some floating-point numbers cannot be stored exactly, and the nearest equivalent number that can be stored is a slightly smaller number (which rounds down, not up).
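
The rounding behavior can be made visible with a short sketch. It uses the built-in function sprintf instead of a numeric value field, on the assumption that both round a stored value in the same way, and it assumes a machine that stores numbers as IEEE double-precision values.

```perl
#!/usr/local/bin/perl

# Assumption: sprintf is used here only to expose the stored values;
# a numeric value field such as @#.## is expected to round likewise.
$up = sprintf("%.2f", 43.999);       # rounds up to 44.00
$down = sprintf("%.2f", 2.675);      # stored slightly below 2.675
$stored = sprintf("%.20f", 2.675);   # shows the value actually stored
print ("$up\n$down\n$stored\n");
```

The second line of output is 2.67, not 2.68, because the nearest number that can be stored for 2.675 is slightly smaller than 2.675. The third line shows that stored value.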

Printing Value-Field Characters

As you have seen, certain characters such as @, <, and > are treated as value fields when they are encountered in print formats. Listing 11.5 shows how to actually print one of these special characters using write.


Listing 11.5. A program that prints a value-field character.
1:  #!/usr/local/bin/perl
2:
3:  format SPECIAL =
4:  This line contains the special character @.
5:  "@"
6:  .
7:
8:  $~ = "SPECIAL";
9:  write;

$ program11_5
This line contains the special character @.
$

The print format line in line 4 contains the special character @, which is a one-character value field. Line 5 specifies that the string @ is to be displayed in this value field when the line is printed.

Using the Multiline Field Format

Listing 11.6 uses the multiline field format @* to write a character string over several lines.


Listing 11.6. A program that writes a string using the multiline field format.
1:  #!/usr/local/bin/perl
2:
3:  @input = <STDIN>;
4:  $string = join("", @input);
5:  $~ = "MULTILINE";
6:  write;
7:
8:  format MULTILINE =
9:  ****** contents of the input file: ******
10: @*
11: $string
12: *****************************************
13: .

$ program11_6
Here is a line of input.
Here is another line.
Here is the last line.
^D
****** contents of the input file: ******
Here is a line of input.
Here is another line.
Here is the last line.
*****************************************
$

Line 3 reads the entire input file into the array variable @input. Each element of the list stored in @input is one line of the input file.

Line 4 joins the input lines into a single character string, stored in $string. This character string still contains the newline characters that end each line.

Line 6 calls write using the print format MULTILINE. The @* value field in this print-format definition indicates that the value stored in $string is to be written out using as many lines as necessary. This ensures that the entire string stored in $string is written out.

If a character string contains a newline character, the only way to display the entire string using write is to use the @* multiline value field. If you use any other value field, only the part of the string preceding the first newline character is displayed.

Writing to Other Output Files

So far, all of the examples that have used the function write have written to the standard output file. However, you can use write also to send output to other files.

The simplest way to do this is to pass the file to write to as an argument to write. For example, to write to the file represented by the file variable MYFILE using the print format MYFILE, you can use the following statement:

write (MYFILE);

Here, write writes to the file named MYFILE using the default print format, which is also MYFILE. This is tidy and efficient, but somewhat restricting because, in this case, you can't use $~ to choose the print format to use.

The $~ system variable only works with the default file variable, which is the file variable to which write sends output. To change the default file variable, and therefore change the file that $~ affects, call the built-in function select, as follows:

select (MYFILE);

select sets the default file variable to use when writing. For example, to write to the file represented by the file variable MYFILE using the print format MYFORMAT, you can use the following statements:

select(MYFILE);
$~ = "MYFORMAT";
write;

Here, the built-in function select indicates that the file to be written to is the file represented by the file variable MYFILE. The statement

$~ = "MYFORMAT";

selects the print format to be associated with this particular file handle; in this case, the print format MYFORMAT is now associated with the file variable MYFILE.

NOTE
This is worth repeating: Each file variable has its own current print format. An assignment to $~ only changes the print format for the current file variable (the last one passed to select).
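
A minimal sketch of this rule follows. To avoid creating a file, it uses the standard error file variable STDERR as the second file variable; the scalar variables $errformat and $outformat are invented for the illustration, and no print format named MYFORMAT needs to exist because write is never called.

```perl
#!/usr/local/bin/perl

select (STDERR);
$~ = "MYFORMAT";          # changes the print format for STDERR only
$errformat = $~;
select (STDOUT);
$outformat = $~;          # STDOUT still has its default print format
print ("STDERR uses $errformat; STDOUT uses $outformat\n");
```

The program reports that STDERR uses MYFORMAT while STDOUT still uses the default print format named STDOUT.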

Because select has changed the file to be written to, the call to write no longer writes to the standard output file. Instead, it writes to MYFILE. Calls to write continue to write to MYFILE until the following statement is seen:

select(STDOUT);

This statement resets the write file to be the standard output file.

Changing the write file using select not only affects write; it also affects print. For example, consider the following:
select (MYFILE);
print ("Here is a line of text.\n");
This call to print writes to MYFILE, not to the standard output file. As with write, calls to print continue to write to MYFILE until another call to select is seen.

The select function is useful if you want to be able to use the same subroutine to write to more than one file at a time. Listing 11.7 is an example of a simple program that does this.


Listing 11.7. A program that uses the select function.
1:  #!/usr/local/bin/perl
2:
3:  open (FILE1, ">file1");
4:  $string = "junk";
5:  select (FILE1);
6:  &writeline;
7:  select (STDOUT);
8:  &writeline;
9:  close (FILE1);
10:
11: sub writeline {
12:         $~ = "WRITELINE";
13:         write;
14: }
15:
16: format WRITELINE =
17:         I am writing @<<<<< to my output files.
18:                      $string
19: .


$ program11_7
        I am writing junk   to my output files.
$

Line 5 of this program calls select, which sets the default file variable to FILE1. Now, all calls to write or print write to FILE1, not the standard output file.

Line 6 calls writeline to write a line. This subroutine sets the current print format for the default file variable to WRITELINE. This means that the file FILE1 now is using the print format WRITELINE, and, therefore, the subroutine writes the following line to the file FILE1 (which is file1):

I am writing junk   to my output files.

Line 7 sets the default file variable back to the standard output file variable, STDOUT. This means that write and print now send output to the standard output file. Note that the current print format for the standard output file is STDOUT (the default), not WRITELINE; the assignment to $~ in the subroutine writeline affects only FILE1, not STDOUT.

Line 8 calls writeline again; this time, the subroutine writes a line to the standard output file. The assignment

$~ = "WRITELINE";

in line 12 associates the print format WRITELINE with the standard output file. This means that WRITELINE is now associated with both STDOUT and FILE1.

At this point, the call to write in line 13 writes the line of output that you see on the standard output file.

DO, whenever possible, call select and assign to $~ immediately before calling write, as follows:
select (MYFILE);
$~ = "MYFORMAT";
write;
Keeping these statements together makes it clear which file is being written to and which print format is being used.
DON'T use select and $~ indiscriminately, because you might lose track of which print format goes with which file variable, and you might forget which file variable is the default for printing.

Saving the Default File Variable

When select changes the default file variable, it returns an internal representation of the file variable that was last selected. For example:

$oldfile = select(NEWFILE);

This call to select is setting the current file variable to NEWFILE. The old file variable is now stored in $oldfile. To restore the previous default file variable, you can call select as follows:

select ($oldfile);

At this point, the default file variable reverts back to its original value (what it was before NEWFILE was selected).

The internal representation of the file variable returned by select is not necessarily the name of the file variable.

You can use the return value from select to create subroutines that write to the file you want to write with, using the print format you want to use, without affecting the rest of the program. For example:

sub write_to_stdout {
        local ($savefile, $saveformat);
        $savefile = select(STDOUT);
        $saveformat = $~;
        $~ = "MYFORMAT";
        write;
        $~ = $saveformat;
        select($savefile);
}

This subroutine calls select to set the default output file to STDOUT, the standard output file. The return value from select, the previous default file, is saved in $savefile.

Now that the default output file is STDOUT, the next step is to save the current print format being used to write to STDOUT. The subroutine does this by saving the present value of $~ in another local variable, $saveformat. After this is saved, the subroutine can set the current print format to MYFORMAT. The call to write now writes to the standard output file using MYFORMAT.

After the call to write is complete, the subroutine puts things back the way they were. The first step is to reset $~ to the value stored in $saveformat. The final step is to set the default output file back to the file variable whose representation is saved in $savefile.

Note that this final call to select must appear after the assignment that restores $~. If the call to select had been first, the assignment to $~ would change the print format associated with the original default file variable, not STDOUT.

As you can see, this subroutine doesn't need to know what the default values outside the subroutine are. Also, it does not affect the default values outside the subroutine.

Specifying a Page Header

If you are sending your output to a printer, you can make your output look smarter by supplying text to appear at the top of every page in your output. This special text is called a page header.

If a page header is defined for a particular output file, write automatically paginates the output to that file. When the number of lines printed is greater than the length of a page, write starts a new page.

To define a page header for a file, create a print format definition with the name of filename_TOP, where filename is a placeholder for the name of the file variable corresponding to the file to which you are writing. For example, to define a header for writing to standard output, define a print format named STDOUT_TOP, as follows:

format STDOUT_TOP =
Consolidated Widgets Inc. 1994 Annual Report
.

In this case, when the Perl interpreter starts a new page of standard output, the contents of the print format STDOUT_TOP are printed automatically.

Print formats that generate headers can contain value fields which are replaced by scalar values, just like any other print format. One particular value that is often used in page headers is the current page number, which is stored in the system variable $%. For example:

format STDOUT_TOP =
Page @<<.
$%
.

In this case, when the first page is printed, the program prints the following header at the top of the page:

Page 1.

NOTE
By default, $% is initially set to zero and is incremented every time a new page begins.
To change the pagination, change the value of $% before (or during) printing.
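
The way write paginates and increments $% can be sketched with a deliberately tiny page. The four-line page length, the format name BODYLINE, and the variable $item are assumptions made for this illustration only; the page length is set by assigning to the system variable $=.

```perl
#!/usr/local/bin/perl

$= = 4;                 # hypothetical, very short page: 4 lines
$~ = "BODYLINE";
foreach $item ("one", "two", "three", "four", "five") {
        write;
}

format STDOUT_TOP =
Page @<<
$%
.
format BODYLINE =
Item: @<<<<<
$item
.
```

Each page holds the one-line header and three lines of output, so the five items are spread over two pages. The header for the second page shows page number 2, because $% is incremented when the new page begins.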

Changing the Header Print Format

To change the name of the print format that prints a page header for a particular file, change the value stored in the special system variable $^.

As with $~, only the value for the current default file can be changed. For example, to use the print format MYHEADER as the header format for the file MYFILE, add the following statements:

$oldfile = select(MYFILE);
$^ = "MYHEADER";
select($oldfile);

These statements set MYFILE to be the current default file, change the header for MYFILE to be the print format MYHEADER, and then reset the current default file to its original value.

Setting the Page Length

By default, the page length is 60 lines. To specify a different page length, change the value stored in the system variable $=:

$= = 66;     # set the page length to 66 lines

If the new page length is to apply to the first page, this assignment must appear before the first write statement.

If the page length is changed in the middle of the program, the new page length will not be used until a new page is started.

Listing 11.8 shows how you can set the page length and define a page-header print format for your output file.


Listing 11.8. A program that sets the length and print format for a page.
1:  #!/usr/local/bin/perl
2:
3:  open (OUTFILE, ">file1");
4:  select (OUTFILE);
5:  $~ = "WRITELINE";
6:  $^ = "TOP_OF_PAGE";
7:  $= = 60;
8:  while ($line = <STDIN>) {
9:          write;
10: }
11: close (OUTFILE);
12:
13: format TOP_OF_PAGE =
14:                                     - page @<
15:                                              $%
16: .
17: format WRITELINE =
18: @>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
19: $line
20: .


Suppose that you supply the following input:

$ program11_8
Here is a line of input.
Here is another line.
Here is the last line.
^D
$

The following output is written to the file file1:

                                    - page 1
                 Here is a line of input.
               Here is another line.
              Here is the last line.

Line 3 opens the file file1 for output and associates it with the file variable OUTFILE.

Line 4 sets the current default file to OUTFILE. Now, when write or print is called with no file variable supplied, the output is sent to OUTFILE.

Line 5 indicates that WRITELINE is the print format to be used when writing to the file OUTFILE. To do this, it assigns WRITELINE to the system variable $~. This assignment does not affect the page header.

Line 6 indicates that TOP_OF_PAGE is the print format to be used when printing the page headers for the file OUTFILE. This assignment does not affect the print format used to write to the body of the page.

Line 7 sets the page length to 60 lines. This page length takes effect immediately, because no output has been written to OUTFILE.
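
To watch the header format being reused, it helps to make the page very short. The following sketch is an illustration under stated assumptions, not a replacement for Listing 11.8: it reuses the format names from that listing, sets the page length to 5, generates its own six lines instead of reading STDIN, and writes to an in-memory file (which assumes Perl 5.8 or later). The variables $pages and $visible are hypothetical.

```perl
use strict;
use warnings;

our $line;                     # value printed by the WRITELINE format
my  $pages = "";               # in-memory "file" (assumes Perl 5.8 or later)

format TOP_OF_PAGE =
- page @<
$%
.

format WRITELINE =
@<<<<<<<<<<<<<<<<<<<
$line
.

open(OUTFILE, ">", \$pages) or die("Cannot open in-memory file: $!");
my $oldfile = select(OUTFILE);
$~ = "WRITELINE";
$^ = "TOP_OF_PAGE";
$= = 5;                        # tiny page: one header line plus four body lines
foreach my $number (1 .. 6) {
    $line = "line $number";
    write;
}
select($oldfile);
close(OUTFILE);

# Make the page break (a form-feed character) visible before printing.
(my $visible = $pages) =~ s/\f/<new page>\n/g;
print($visible);
```

With six lines and a five-line page, the header should appear twice, and the second header should be preceded by a form-feed character.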

Using print with Pagination

Normally, you won't want to use print if you are using pagination, because the Perl interpreter keeps track of the current line number on the page by monitoring the calls to write. If you must use a call to print in your program and you want to ensure that the page counter includes the call in its line count, adjust the system variable $-. This system variable indicates the number of lines between the current line and the bottom of the page. When $- reaches 0, a top-of-form character is generated, which starts a new page.

The following is a code fragment that calls print and then adjusts the $- variable:

print ("Here is a line of output\n");

$- -= 1;

Subtracting 1 from $- accounts for the one line that print wrote, so the page counter is correct again.
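
If print is called in several places, the adjustment can be kept in one subroutine. The following is a sketch under stated assumptions: the subroutine name print_counted is hypothetical, it counts newline characters so that multiline strings are charged correctly, and it writes to an in-memory file (which assumes Perl 5.8 or later) so that $- can be examined before and after the call.

```perl
use strict;
use warnings;

our $line = "written with write";
my  $page = "";                # in-memory "file" (assumes Perl 5.8 or later)

format PAGE_TOP =
Report
.

format BODY =
@<<<<<<<<<<<<<<<<<<<<<<<
$line
.

# Hypothetical helper: print text to the current default file and
# subtract the lines it occupies from that file's $- counter.
sub print_counted {
    my ($text) = @_;
    print($text);
    my $lines = ($text =~ tr/\n//);    # number of newline characters printed
    $- -= $lines;
}

open(OUTFILE, ">", \$page) or die("Cannot open in-memory file: $!");
my $oldfile = select(OUTFILE);
$~ = "BODY";
$^ = "PAGE_TOP";
$= = 10;
write;                         # one header line and one body line
my $before = $-;
print_counted("written with print\nand a second line\n");
my $after = $-;
select($oldfile);
close(OUTFILE);

print("Lines left before print: $before, after: $after\n");
```

The helper assumes that every line it prints ends with a newline character; text without a trailing newline would be undercounted.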

Formatting Long Character Strings

As you've seen, the @* value field prints multiple lines of text. However, this field prints the output exactly as it is stored in the character string. For example, consider Listing 11.9, which uses @* to write a multiline character string.


Listing 11.9. A program that illustrates the limitations of the @* value field.
1:  #!/usr/local/bin/perl

2:  

3:  $string = "Here\nis an unbalanced line of\ntext.\n";

4:  $~ = "OUTLINE";

5:  write;

6:  

7:  format OUTLINE =

8:  @*

9:  $string

10: .


$ program11_9

Here

is an unbalanced line of

text.

$

This call to write displays the character string stored in $string exactly as is. Perl enables you to define value fields in print-format definitions that format text. To do this, replace the initial @ character in the value field with a ^ character. When text formatting is specified, the Perl interpreter tries to fit as many words as possible into the output line.

Listing 11.10 is an example of a simple program that does this.


Listing 11.10. A program that uses a value field that does formatting.
1:  #!/usr/local/bin/perl

2:  

3:  $string = "Here\nis an unbalanced line of\ntext.\n";

4:  $~ = "OUTLINE";

5:  write;

6:  

7:  format OUTLINE =

8:  ^<<<<<<<<<<<<<<<<<<<<<<<<<<<

9:  $string

10: .


$ program11_10

Here is an unbalanced line

$

Line 5 calls write using the print format OUTLINE. This print format contains a value field that specifies that formatting is to take place; this means that the Perl interpreter tries to fit as many words as possible into the line of output. In this case, the first line Here and the four-word string is an unbalanced line fit into the output line.

Note that there are two characters left over in the output line after these five words have been filled in. These characters are not filled, because the next word is not short enough to fit into the space remaining. Only entire words are filled.

One other feature of the line-filling operation is that the substring printed out is actually deleted from the scalar variable $string. This means that the value of $string is now of\ntext.\n. This happens because subsequent lines of output in the same print-format definition can be used to print the rest of the string.

NOTE
Because the line-filling write operation updates the value used, the value must be contained in a scalar variable and cannot be the result of an expression.

To see how multiple lines of formatted output work, look at Listing 11.11. This program reads a quotation from the standard input file and writes it out on three formatted lines of output.


Listing 11.11. A program that writes out multiple formatted lines of output.
1:  #!/usr/local/bin/perl

2:  

3:  @quotation = <STDIN>;

4:  $quotation = join("", @quotation);

5:  $~ = "QUOTATION";

6:  write;

7:  

8:  format QUOTATION =

9:  Quotation for the day:

10: -----------------------------

11:    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

12:    $quotation

13:    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

14:    $quotation

15:    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

16:    $quotation

17: -----------------------------

18: .


$ program11_11

Any sufficiently advanced programming

language is indistinguishable from magic.

^D

Quotation for the day:

-----------------------------

   Any sufficiently advanced programming language is   

   indistinguishable from magic.                       



-----------------------------

$

The print format QUOTATION defines three value fields on which formatting is to be employed. Each of the three value fields uses the value of the scalar variable $quotation.

Before write is called, $quotation contains the entire quotation with newline characters appearing at the end of each input line. When write is called, the first value field in the print format uses as much of the quotation as possible. This means that the following substring is written to the standard output file:

Any sufficiently advanced programming language is

After the substring is written, it is removed from $quotation, which now contains the following:

indistinguishable from magic.

Because the written substring has been removed from $quotation, the remainder of the string can be used in subsequent output lines. Because the next value field in the print format also wants to use $quotation, the remainder of the string appears on the second output line and is deleted. $quotation is now the empty string.

This means that the third value field, which also refers to $quotation, is replaced by the empty string, and a blank line is written out.

The scalar variable containing the output to be printed is changed by a write operation. If you need to preserve the information, copy it to another scalar variable before calling write.
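
The effect on the variable is easy to observe. The following is a minimal sketch, assuming a hypothetical format named FIRSTLINE with one short text-formatting field and an in-memory output file (which assumes Perl 5.8 or later); it copies the quotation into $saved before calling write and then prints what was written, what remains, and the untouched copy.

```perl
use strict;
use warnings;

our $quotation = "Any sufficiently advanced programming language " .
                 "is indistinguishable from magic.";
my  $output = "";              # in-memory "file" (assumes Perl 5.8 or later)

format FIRSTLINE =
^<<<<<<<<<<<<<<<<<<<
$quotation
.

my $saved = $quotation;        # keep a copy: write shortens $quotation

open(OUTFILE, ">", \$output) or die("Cannot open in-memory file: $!");
my $oldfile = select(OUTFILE);
$~ = "FIRSTLINE";
write;                         # prints as many whole words as fit the field
select($oldfile);
close(OUTFILE);

print("Printed:   $output");
print("Remaining: $quotation\n");
print("Saved:     $saved\n");
```

After the call to write, $quotation should begin with the first word that did not fit, while $saved still holds the whole quotation.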

Eliminating Blank Lines When Formatting

You can eliminate blank lines such as the one generated by Listing 11.11. To do this, put a ~ character at the beginning of any output line that is to be printed only when needed.

Listing 11.12 modifies the quotation-printing program to print lines only when they are not blank.


Listing 11.12. A program that writes out multiple formatted lines of output and suppresses blank lines.
1:  #!/usr/local/bin/perl

2:  

3:  @quotation = <STDIN>;

4:  $quotation = join("", @quotation);

5:  $~ = "QUOTATION";

6:  write;

7:  

8:  format QUOTATION =

9:  Quotation for the day:

10: -----------------------------

11: ~  ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

12:    $quotation

13: ~  ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

14:    $quotation

15: ~  ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

16:    $quotation

17: -----------------------------

18: .


$ program11_12

Any sufficiently advanced programming

language is indistinguishable from magic.

^D

Quotation for the day:

-----------------------------

   Any sufficiently advanced programming language is   

   indistinguishable from magic.                       

-----------------------------

$

If the quotation is too short to require all the lines, the remaining lines are suppressed rather than printed as blank lines. In this case, the quotation requires only two lines of output, so the third isn't printed.

The program is identical to the one in Listing 11.11 in all other respects. In particular, the value of $quotation after the call to write is still the empty string.

Supplying an Indefinite Number of Lines

While Listing 11.12 suppresses blank lines, it imposes an upper limit of three lines. Quotations longer than three lines are not printed in their entirety. To indicate that the formatted output is to use as many lines as necessary, specify two ~ characters at the beginning of the output line containing the value field. Listing 11.13 modifies the quotation program to allow quotations of any length.


Listing 11.13. A program that writes out as many formatted lines of output as necessary.
1:  #!/usr/local/bin/perl

2:  

3:  @quotation = <STDIN>;

4:  $quotation = join("", @quotation);

5:  $~ = "QUOTATION";

6:  write;

7:  

8:  format QUOTATION =

9:  Quotation for the day:

10: -----------------------------

11: ~~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

12:    $quotation

13: -----------------------------

14: .


$ program11_13

Any sufficiently advanced programming

language is indistinguishable from magic.

^D

Quotation for the day:

-----------------------------

   Any sufficiently advanced programming language is   

   indistinguishable from magic.                       

-----------------------------

$

The ~~ characters at the beginning of the output line indicate that multiple copies of the output line are to be supplied. The output line is printed repeatedly until there is nothing more to print.

In Listing 11.13, two copies of the line are needed.
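
The number of copies depends on the field width as well as on the text. The following sketch is an illustration, not one of the chapter's listings: it assumes a hypothetical format named WRAP with a much narrower field than Listing 11.13, and an in-memory output file (which assumes Perl 5.8 or later) so that the lines produced can be counted.

```perl
use strict;
use warnings;

our $text = "Any sufficiently advanced programming language " .
            "is indistinguishable from magic.";
my  $wrapped = "";             # in-memory "file" (assumes Perl 5.8 or later)

format WRAP =
~~ ^<<<<<<<<<<<<<<<<<<<
$text
.

my $original = $text;          # the fill operation empties $text

open(OUTFILE, ">", \$wrapped) or die("Cannot open in-memory file: $!");
my $oldfile = select(OUTFILE);
$~ = "WRAP";
write;                         # the ~~ line repeats until $text is used up
select($oldfile);
close(OUTFILE);

my @lines = split(/\n/, $wrapped);
print($wrapped);
print("Lines used: ", scalar(@lines), "\n");
```

A single call to write produces every line, and $text is empty afterward, so the copy in $original is the only place the full text survives.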

Formatting Output Using printf

If you want to write output that looks reasonable without going to all the trouble of using write and print formats, Perl provides a built-in function, printf, that prints formatted output.

NOTE
If you are familiar with the C programming language, the behavior of printf in Perl will be familiar; the Perl printf and the C printf are basically the same.

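The arguments passed to the printf function are the string to be printed, which can contain one or more field specifiers, followed by one value for each field specifier that appears in the string.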

When printf sees a field specifier, it substitutes the corresponding value in the printf argument list. The representation of the substituted value in the string depends on the field specifier that is supplied.

Field specifiers consist of the % character followed by a single character that represents the format to use when printing. Table 11.2 lists the field-specifier formats and the field-specifier character that represents each.

Table 11.2. Field specifiers for printf.

Specifier   Description
%c          Single character
%d          Integer in decimal (base-10) format
%e          Floating-point number in scientific notation
%f          Floating-point number in "normal" (fixed-point) notation
%g          Floating-point number in compact format
%o          Integer in octal (base-8) format
%s          Character string
%u          Unsigned integer
%x          Integer in hexadecimal (base-16) format

Here is a simple example of a call to printf:

printf("The number I want to print is %d.\n", $number);

The string to be printed contains one field specifier, %d, which represents an integer. The value stored in $number is substituted for the field specifier and printed.
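
Several of the specifiers in Table 11.2 can be compared by printing the same value in different ways. The following is a small sketch with values chosen only for illustration; the variable $hex and the use of sprintf (which accepts the same field specifiers as printf but returns the string instead of printing it) are additions that do not appear in the chapter.

```perl
use strict;
use warnings;

my $number = 255;

printf("Decimal:     %d\n", $number);      # 255
printf("Octal:       %o\n", $number);      # 377
printf("Hexadecimal: %x\n", $number);      # ff
printf("Character:   %c\n", 65);           # A (character code 65 in ASCII)
printf("Fixed-point: %f\n", 1234.5);       # 1234.500000
printf("String:      %s\n", "Perl");       # Perl

# sprintf formats the value the same way but returns the result.
my $hex = sprintf("%x", $number);
print("Stored for later use: $hex\n");
```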

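Field specifiers also support a variety of options, which appear between the % and the format character. A number sets the minimum field width: %8d prints an integer in at least eight characters, padded with blanks on the left. A - before the width left-justifies the value in the field, as in %-10s. A 0 before the width pads a number with leading zeros instead of blanks, as in %05d. A period followed by a number sets the number of digits printed after the decimal point, as in %8.3f.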

If a floating-point number contains more digits than the field specifier wants, the number is rounded to the number of decimal places needed. For example, if 43.499 is being printed using the field %5.2f, the number actually printed is 43.50.
As with the write value field @##.##, printf might not always round up when it is handling numbers whose last decimal place is 5. This happens because some floating-point numbers cannot be stored exactly, and the nearest equivalent number that can be stored is a slightly smaller number (which rounds down, not up). For example, 43.495 when printed by %5.2f might print 43.49, depending on how 43.495 is stored.
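
Field width, justification, zero padding, and precision are easiest to see side by side. The following sketch uses sprintf, which accepts the same field specifiers as printf, and wraps each result in square brackets so that the padding is visible; the variable names and the sample values other than 43.499 are hypothetical.

```perl
use strict;
use warnings;

# Each result is wrapped in [ ] so that the padding is visible.
my $width     = sprintf("[%6d]",   42);        # [    42]
my $left      = sprintf("[%-8s]",  "perl");    # [perl    ]
my $zeros     = sprintf("[%06d]",  42);        # [000042]
my $precision = sprintf("[%8.3f]", 3.14159);   # [   3.142]
my $rounded   = sprintf("[%5.2f]", 43.499);    # [43.50]

print("$width\n$left\n$zeros\n$precision\n$rounded\n");
```

The values used here are far enough from a rounding boundary that the stored approximation does not affect the result; a value such as 43.495 offers no such guarantee.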

NOTE
You can use printf to print to other files. To do this, specify the file variable corresponding to the file to which you want to print, just as you would with print or write:
printf MYFILE ("I am printing %d.\n", $value);
If no file variable is specified, printf sends its output to the current default file, so changing the current default file using select affects printf.
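
Both ways of directing printf output can be tried in one short program. The following is a sketch under stated assumptions: MYFILE is opened on an in-memory file (which assumes Perl 5.8 or later) instead of a file on disk, and the variables $log and $value are hypothetical.

```perl
use strict;
use warnings;

my $log   = "";                # in-memory "file" (assumes Perl 5.8 or later)
my $value = 7;

open(MYFILE, ">", \$log) or die("Cannot open in-memory file: $!");

# Explicit file variable: the default file is not involved.
printf MYFILE ("I am printing %d.\n", $value);

# No file variable: output goes to the current default file.
my $oldfile = select(MYFILE);
printf("Now the default file receives %d.\n", $value + 1);
select($oldfile);

close(MYFILE);
print($log);
```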

Summary

Perl enables you to format your output using print-format definitions and the built-in function write. In print-format definitions, you can specify value fields that are to be replaced by either the contents of scalar variables or the values of expressions.

Value fields indicate how to print the contents of a scalar variable or the value of an expression. With a value field, you can specify that the value is to be left justified (blanks added on the right), right justified (blanks added on the left), centered, or displayed as a floating-point number.

You also can define value fields that format a multiline character string. Blank lines can be suppressed, and the field can be defined to use as many output lines as necessary.

The built-in function select enables you to change the default file to which write and print send output.

You can break your output into pages by defining a special header print format that prints header information at the top of each page.

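The following system variables enable you to control how write sends output to a file: $~ holds the name of the print format used for the current default file, $^ holds the name of its header print format, $= holds the page length, $% holds the current page number, and $- holds the number of lines remaining on the current page.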

The built-in function printf enables you to format an individual line of text using format specifiers.

Q&A

Q: Which is better, write or printf?
A: It depends on what you want to do. If you want to print reports or control pagination, you'll need to use write. If you just want individual lines of output to look neat, printf might be what you need.
Q: How do I generate a page break?
A: To do this, set $- to zero. This generates a top-of-form character.
Q: Why do value fields that format text modify the contents of the scalar variable containing the text?
A: When formatted text is printed, the printed text is removed from the scalar variable, and the part of the string that is not printed is retained. This enables you to use other calls to write to print the remainder of the text. In fact, you can print the rest of the text in the scalar variable using a completely different print format.
Q: How many print formats can I define?
A: Basically, as many as you like, provided the resulting Perl program can still fit in your machine.

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered and exercises to give you experience in using what you've learned. Try to understand the quiz and exercise answers before you go on to tomorrow's lesson.

Quiz

  1. Define value fields that print the following:
    a.    Ten left-justified characters
    b.    Five right-justified characters
    c.    Two centered characters
    d.    A floating-point number with five digits before the decimal point and three after it
    e.    A field that prints as many formatted lines of 30 left-justified characters as necessary
  2. What do these fields print?
    a.    @<<<<
    b.    @||||||
    c.    @
    d.    @*
    e.    ~ ^>>>>>>>>>
  3. What do these printf field specifiers print?
    a.    %5d
    b.    %11.4f
    c.    %010d
    d.    %-12s
    e.    %x
  4. Why do certain floating-point numbers have round-off problems?
  5. How do you create a page header for an output file?

Exercises

  1. Write a program that prints the powers of 2 from 2**1 to 2**10. Use write and a print format to print them three to a line. Align the lines so that the right end of each number is lined up with the right end of the corresponding number on the previous line.
  2. Repeat Exercise 1 using printf.
  3. Write a program that reads text and formats it into 40-character lines, left-justified. Put lines of asterisks above and below the text.
  4. Write a program that reads a set of dollar values such as 71.43 (one per line). Write out two values per line (the first and second on the first line, and so on). Total each of the resulting columns, and produce a grand total.
  5. BUG BUSTER: What is wrong with the following program?
    #!/usr/local/bin/perl

    format STDOUT =
    @*
    .
    while ($line = <STDIN>) {
        chop ($line);
        if ($line eq "") {
            print ("<blank line>\n");
            next;
        }
        write;
    }