A function is a self-contained computation that accepts a number of arguments as input and returns some value. Awk has a number of built-in functions in two groups: arithmetic and string functions. Awk also provides user-defined functions, which allow you to expand upon the built-in functions by writing your own.

Nine of the built-in functions can be classified as arithmetic functions. Most of them take a numeric argument and return a numeric value. Table 9.1 summarizes these arithmetic functions.

Awk Function | Description |
---|---|
cos(x) | Returns cosine of x (x is in radians). |
exp(x) | Returns e to the power x. |
int(x) | Returns truncated value of x. |
log(x) | Returns natural logarithm (base-e) of x. |
sin(x) | Returns sine of x (x is in radians). |
sqrt(x) | Returns square root of x. |
atan2(y,x) | Returns arctangent of y/x in the range -π to π. |
rand() | Returns pseudo-random number r, where 0 <= r < 1. |
srand(x) | Establishes new seed for rand(). |

The trigonometric functions **cos()**
and **sin()** work the same way, taking a single argument
that is the size of an angle in radians and returning
the cosine or sine for that angle. (To convert from degrees
to radians, multiply the number by π/180.)
The trigonometric function **atan2()**
takes two arguments and returns the arctangent of their quotient.
The expression

atan2(0, -1)

produces π.
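As a minimal sketch of the degree-to-radian conversion described above, the shell function below obtains π from **atan2()** and applies the π/180 factor. The function name `deg2rad` and the variable names `deg` and `pi` are made up for illustration; they do not come from the text.

```sh
# deg2rad: hypothetical helper name, not from the text.
# Converts an angle in degrees to radians, taking pi from atan2(0, -1).
deg2rad() {
    awk -v deg="$1" 'BEGIN {
        pi = atan2(0, -1)                   # arctangent of 0/-1 is pi
        printf("%.5f\n", deg * pi / 180)    # degrees times pi/180
    }'
}

deg2rad 180     # prints 3.14159
deg2rad 90      # prints 1.57080
```

The result can then be passed to **sin()** or **cos()**, which expect radians.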

The function **exp()** uses the natural
exponential, which is also known as base-*e*
exponentiation. The expression

exp(1)

returns 2.71828, the base of the natural
logarithms, referred to as *e*.
Thus, **exp**(*x*) is *e* to the *x*-th power.

The **log()** function gives the inverse of the **exp()**
function, the natural logarithm of *x*.
The **sqrt()** function takes a single argument and returns
the (positive) square root of that argument.
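The relationship between these three functions can be checked with a short sketch. The function name `math_demo` is hypothetical, and the base-10 line relies on the standard change-of-base identity, which the text does not cover; treat it as an assumption added for illustration.

```sh
# math_demo: hypothetical name. Prints three values for a positive number x.
math_demo() {
    awk -v x="$1" 'BEGIN {
        printf("%.4f\n", exp(log(x)))       # exp() undoes log(), giving x back
        printf("%.4f\n", log(x) / log(10))  # base-10 logarithm via change of base
        printf("%.4f\n", sqrt(x))           # positive square root
    }'
}

math_demo 100
# prints:
# 100.0000
# 2.0000
# 10.0000
```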

The **int()** function truncates a numeric value by removing
digits to the right of the decimal point.
Look at the following two statements:

    print 100/3
    print int(100/3)

The output from these statements is shown below:

    33.3333
    33

The **int()** function simply truncates;
it does not round up or down.
(Use the **printf** format "%.0f" to perform
rounding.)[57]

[57] The way **printf** does rounding is discussed in Appendix B, "Quick Reference for awk".
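To make the difference between truncating and rounding concrete, here is a small sketch that prints both results side by side. The function name `trunc_vs_round` is invented for this example.

```sh
# trunc_vs_round: hypothetical name. Prints int(x), then x rounded by printf.
trunc_vs_round() {
    awk -v x="$1" 'BEGIN {
        # int() drops the fractional part; "%.0f" rounds to the nearest integer
        printf("%d %.0f\n", int(x), x)
    }'
}

trunc_vs_round 2.7     # prints 2 3
trunc_vs_round -2.7    # prints -2 -3
```

Note that for negative values **int()** truncates toward zero, so the two results can differ in either direction.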

The **rand()** function generates a pseudo-random floating-point
number between 0 and 1. The **srand()** function sets the seed
or starting point for random number generation. If **srand()** is
called without an argument, it uses the time of day to generate
the seed. With an argument *x*, **srand()** uses *x* as
the seed.

If you don't call **srand()** at all, awk acts as if **srand()**
had been called with a constant argument before your program
started, causing you to get the same starting point every time
you run your program.
This is useful if you want reproducible behavior for testing, but
inappropriate if you really do want your program to behave
differently every time.
Look at the following script:

    # rand.awk -- test random number generation
    BEGIN {
        print rand()
        print rand()
        srand()
        print rand()
        print rand()
    }

We print the result of the **rand()** function twice,
and then call the **srand()** function before printing
the result of the **rand()** function two more times.
Let's run the script.

    $ awk -f rand.awk
    0.513871
    0.175726
    0.760277
    0.263863

Four random numbers are generated. Now look what happens when we run the program again:

    $ awk -f rand.awk
    0.513871
    0.175726
    0.787988
    0.305033

The first two "random" numbers are identical to the numbers
generated in the previous run of the program while the last
two numbers are different.
The last two numbers are different because the call to
**srand()** gave the random number generator a new seed.

The return value of the **srand()** function is the previous seed, that is, the seed that was in use before the call.
This can be used to keep track of sequences of random numbers,
and re-run them if needed.
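The following sketch shows one way a sequence might be replayed, assuming an explicit seed is acceptable for your purposes. The function name `replay` and the variable names are hypothetical. It seeds the generator, draws two numbers, reseeds with the same value, and checks that the same two numbers come back.

```sh
# replay: hypothetical name. Checks that reseeding repeats the sequence.
replay() {
    awk -v seed="$1" 'BEGIN {
        srand(seed)
        a = rand(); b = rand()
        prev = srand(seed)      # reseed; the return value is the previous seed
        c = rand(); d = rand()
        if (a == c && b == d && prev == seed)
            print "replayed"
        else
            print "differs"
    }'
}

replay 42    # prints replayed
```

In a program that seeds from the time of day, the value returned by a later **srand()** call could be saved and supplied as the seed in a later run to reproduce the same numbers.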

To show how to use **rand()**,
we'll look at a script that implements a "quick-pick" for a lottery game.
This script, named **lotto**, picks *x*
numbers from a series of numbers 1 to *y*.
Two arguments can be supplied on the command line: how many numbers to pick
(the default is 6) and the highest number in the series (the default
is 30).
Using the default values for *x* and *y*, the script generates
six unique random numbers between 1 and 30.
The numbers are sorted for readability from lowest to highest
and output.
Before looking at the script itself, let's run the program:

    $ lotto
    Pick 6 of 30
    9 13 25 28 29 30
    $ lotto 7 35
    Pick 7 of 35
    1 6 9 16 20 22 27

The first example uses the default values to print six random numbers from 1 to 30. The second example prints seven random numbers out of 35.

The full **lotto** script is fairly complicated, so before looking
at the entire script, let's look at a smaller script that
generates a single random number in a series:

    awk -v TOPNUM=$1 '
    # pick1 - pick one random number out of y
    # main routine
    BEGIN {
        # seed random number using time of day
        srand()
        # get a random number
        select = 1 + int(rand() * TOPNUM)
        # print pick
        print select
    }'

The shell script expects a single argument from the command line
and this is passed into the program as "TOPNUM=$1,"
using the -v option.
All the action happens in the **BEGIN** procedure. Since there are
no other statements in the program, awk exits when the **BEGIN**
procedure is done.

The main routine first calls the **srand()** function
to seed the random number generator. Then we get a
random number by calling the **rand()** function:

select = 1 + int(rand() * TOPNUM)

It might be helpful to see this expression broken up so each part of it is obvious.

Statement | Result |
---|---|
print r = rand() | 0.467315 |
print r * TOPNUM | 14.0195 |
print int(r * TOPNUM) | 14 |
print 1 + int(r * TOPNUM) | 15 |

Because the **rand()** function returns a number *r* where 0 <= *r* < 1,
multiplying it by **TOPNUM** gives a number from 0 up to, but not
including, **TOPNUM**. Truncating that number with **int()** leaves an
integer from 0 to **TOPNUM** - 1, and adding 1 shifts the range to
1 through **TOPNUM**. The addition is necessary
because **rand()** could return 0. In this example, the
random number that is generated is 15.
You could use this program to pick a single number from any range
that starts at 1, such as a number between 1 and 100.

    $ pick1 100
    83
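One way to convince yourself that the expression stays within bounds is to draw many numbers and track the smallest and largest values seen. This is only a sketch: the function name `range_check`, the variables `lo` and `hi`, and the count of 10000 draws are arbitrary choices made for illustration.

```sh
# range_check: hypothetical name. Draws 10000 numbers and reports whether
# every one of them fell between 1 and TOPNUM.
range_check() {
    awk -v TOPNUM="$1" 'BEGIN {
        srand()
        lo = TOPNUM + 1     # start above any valid pick
        hi = 0              # start below any valid pick
        for (i = 1; i <= 10000; i++) {
            n = 1 + int(rand() * TOPNUM)
            if (n < lo) lo = n
            if (n > hi) hi = n
        }
        if (lo >= 1 && hi <= TOPNUM)
            print "in range"
        else
            print "out of range"
    }'
}

range_check 6    # prints in range
```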

The **lotto** script must "pick one" multiple times. Basically,
we need to set up a **for** loop to execute the **rand()**
function as many times as needed. One
of the reasons this is difficult is that we have to
worry about duplicates. In other words, it is possible
for a number to be picked again; therefore we have to
keep track of the numbers already picked.

Here's the **lotto** script:

    awk -v NUM=$1 -v TOPNUM=$2 '
    # lotto - pick x random numbers out of y
    # main routine
    BEGIN {
        # test command line args; NUM = $1, how many numbers to pick
        #     TOPNUM = $2, last number in series
        if (NUM <= 0)
            NUM = 6
        if (TOPNUM <= 0)
            TOPNUM = 30
        # print "Pick x of y"
        printf("Pick %d of %d\n", NUM, TOPNUM)
        # seed random number using time and date; do this once
        srand()
        # loop until we have NUM selections
        for (j = 1; j <= NUM; ++j) {
            # loop to find a not-yet-seen selection
            do {
                select = 1 + int(rand() * TOPNUM)
            } while (select in pick)
            pick[select] = select
        }
        # loop through array and print picks.
        for (j in pick)
            printf("%s ", pick[j])
        printf("\n")
    }'

Unlike the previous program, this one looks for two command-line
arguments, indicating *x* numbers out of
*y*. The main routine looks to see if these
numbers were supplied and if not, assigns default values.

There is only one array, **pick**, for holding the random numbers that
are selected. Each number is guaranteed to be in the desired range,
because the result of **rand()** (a value between 0 and 1) is
multiplied by **TOPNUM**, truncated, and then increased by 1.
The heart of the script is a loop that occurs **NUM** times
to assign **NUM** elements to the **pick** array.

To get a new non-duplicate random number, we
use an inner loop that generates selections
and tests to see if they are in the **pick** array.
(Using the **in** operator is much faster than looping through the
array comparing subscripts.)
As long as **select in pick** is true, the number
has already been picked, so the
selection is a duplicate and the **do** loop generates another one.
When **select in pick** is false, the loop ends and
we assign **select** to an element of the **pick** array.
From then on, an **in** test for that number
returns true, so the **do** loop will reject it if it comes up again.

Finally, the program loops through the **pick** array and
prints the elements.
This version of the **lotto** script leaves one thing out.
See if you can tell what it is if we run it again:

    $ lotto 7 35
    Pick 7 of 35
    5 21 9 30 29 20 2

That's right, the numbers are not sorted. We'll defer showing the code for the sort routine until we discuss user-defined functions. While it's not necessary to have written the sorting code as a function, it makes a lot of sense. One reason is that you can tackle a more generalized problem and retain the solution for use in other programs. Later on, we will write a function that sorts the elements of an array.

Note that the **pick** array isn't ready for sorting, since its
indices are the same as its values, not numbers in order.
We would have to set up a separate array for sorting by our sort
function:

    # create a numerically indexed array for sorting
    i = 1
    for (j in pick)
        sortedpick[i++] = pick[j]
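To illustrate why the numerically indexed copy helps, the sketch below builds a **pick**-style array from its arguments, copies it into **sortedpick**, and orders it with a plain insertion sort. This is not the sort function the text defers to later; the function name `sort_picks`, the `list` and `tmp` variables, and the choice of insertion sort are all assumptions made for this example.

```sh
# sort_picks: hypothetical name. Arguments stand in for the picked numbers.
sort_picks() {
    awk -v list="$*" 'BEGIN {
        # rebuild an array shaped like pick: index equals value
        n = split(list, tmp, " ")
        for (i = 1; i <= n; i++)
            pick[tmp[i]] = tmp[i]

        # create a numerically indexed array for sorting
        n = 0
        for (j in pick)
            sortedpick[++n] = pick[j]

        # insertion sort: shift larger elements right, then drop v in place
        for (i = 2; i <= n; i++) {
            v = sortedpick[i] + 0
            for (k = i - 1; k >= 1 && sortedpick[k] + 0 > v; k--)
                sortedpick[k + 1] = sortedpick[k]
            sortedpick[k + 1] = v
        }

        for (i = 1; i <= n; i++)
            printf("%s%s", sortedpick[i], (i < n) ? " " : "\n")
    }'
}

sort_picks 5 21 9 30 29 20 2    # prints 2 5 9 20 21 29 30
```

Because the sort walks indices 1 through n, it depends on the contiguous numbering that **sortedpick** provides and that **pick** lacks.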

The **lotto** program is set up to do everything in the **BEGIN** block.
No input is processed.
You could, however,
revise this script to read a list of names from a file and
for each name generate a "quick-pick."
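A rough sketch of such a revision follows. It assumes one name per input line and fixes **NUM** and **TOPNUM** at the defaults of 6 and 30; the function name `lotto_names` and the output layout are invented here, and the picks are left unsorted as in the script above.

```sh
# lotto_names: hypothetical name. Reads names from the files given as
# arguments (or from standard input) and prints a quick-pick for each name.
lotto_names() {
    awk -v NUM=6 -v TOPNUM=30 '
    BEGIN { srand() }           # seed once, before any input is read
    {
        split("", pick)         # empty the array before each new name
        for (j = 1; j <= NUM; ++j) {
            # loop to find a not-yet-seen selection
            do {
                select = 1 + int(rand() * TOPNUM)
            } while (select in pick)
            pick[select] = select
        }
        printf("%s:", $0)
        for (j in pick)
            printf(" %s", pick[j])
        printf("\n")
    }' "$@"
}

printf 'Alice\nBob\n' | lotto_names
```

The main change from the original is that the selection loop moves out of the **BEGIN** block into the main input loop, and the **pick** array is cleared for each line so that one person's numbers do not restrict the next person's.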

Copyright © 2003 O'Reilly & Associates. All rights reserved.