Four major issues occur when using a timer: the kind of time actually being measured (CPU versus wall clock), the clock's resolution, the overhead of calling the timer, and clock rollover. Timers themselves come in two flavors. An elapsed timer is used as
time_1 = elapsedtime()
(Stuff to be timed goes here)
time_for_Stuff = elapsedtime() - time_1

while a delta timer is used as

time_1 = deltatime()
(Stuff to be timed goes here)
time_for_Stuff = deltatime()

Usually you want to find elapsed CPU time instead of wall clock time. This is particularly true for large-scale scientific computing, where your job may spend a large amount of wall clock time swapped out. Also, if one is available, an elapsed time clock is preferable to a delta time clock. You can always implement one given the other (and it is a useful exercise to figure out how!), but the common approach to timing parts of a program is to time a few sections that you suspect account for most of the time, and then to subtract the sum of those from the overall time to find the time spent in "everything else".
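Here is a minimal sketch of that sectional approach, using the standard Fortran intrinsic cpu_time(), which returns elapsed CPU time in seconds; the section names and variables are my own illustration, not part of the original example:

program time_sections
  implicit none
  double precision :: t0, t1, t2, t3
  double precision :: time_a, time_b, time_total, time_else

  call cpu_time(t0)                 ! start of the whole program

  call cpu_time(t1)
  ! ... section A, suspected to be expensive, goes here ...
  call cpu_time(t2)
  time_a = t2 - t1

  call cpu_time(t1)
  ! ... section B, also suspected to be expensive, goes here ...
  call cpu_time(t2)
  time_b = t2 - t1

  ! ... everything else the program does ...

  call cpu_time(t3)
  time_total = t3 - t0
  time_else  = time_total - (time_a + time_b)   ! "everything else"

  print *, 'section A       :', time_a
  print *, 'section B       :', time_b
  print *, 'everything else :', time_else
end program time_sections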
The terms "timing block" or "timing interval" refer to the chunk of code between two calls to a timer function, the part specified as "(Stuff to be timed goes here)" above. That chunk may be a loop or an invocation of a function. Clock resolutions typically range from a few nanoseconds, which is considered "high resolution", up to 0.01 seconds, which is considered pretty sloppy. The overhead (function call penalty) for calling a timer on modern (≥ 2012 CE) machines is now almost negligible, but that changes from year to year, so measuring it yourself is the only method guaranteed to keep working in the future.
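Since the actual resolution and overhead vary from system to system, here is a rough sketch of how one might measure both. It probes the intrinsic cpu_time() (my choice of timer for the illustration), estimating resolution as the smallest nonzero change observed and overhead as the average cost of a call:

program timer_probe
  implicit none
  integer, parameter :: n = 1000000
  integer :: i
  double precision :: t_prev, t_now, t_start, t_end, resolution

  ! Resolution: smallest nonzero increment seen over many calls.
  resolution = huge(1.0d0)
  call cpu_time(t_prev)
  do i = 1, n
     call cpu_time(t_now)
     if (t_now > t_prev) then
        resolution = min(resolution, t_now - t_prev)
        t_prev = t_now
     end if
  end do
  print *, 'apparent resolution (seconds):', resolution

  ! Overhead: time n back-to-back calls and divide by n.
  call cpu_time(t_start)
  do i = 1, n
     call cpu_time(t_now)
  end do
  call cpu_time(t_end)
  print *, 'overhead per call (seconds):  ', (t_end - t_start) / dble(n)
end program timer_probe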
To avoid problems with resolution and overhead, follow the general rules whenever possible:
Why not use standard statistical techniques for judging the quality of a timing? The brief answer is that you should, but it requires rather sophisticated statistical methods and cannot be done blindly. Timings rarely follow a normal (Gaussian) curve - why? Furthermore, they frequently cluster around discrete quanta corresponding to system events happening or not happening (like swaps). The real question is why not look at all of the timing data? Plotting 100k data points in Matlab takes < 0.25 seconds on a five-year-old workstation from 2006, and plotting 1M data points takes about 0.29 seconds. It's dumb not to look at the data for outliers and strange values when it can be done faster than you can read this sentence. You can right-click and download the Matlab script used to find those timings, and try some timings yourself to see how much data your system can realistically handle.
C/C++ have several timers, but the resolution is sometimes only claimed (by the header file or man pages) to be 0.01 seconds, the POSIX standard. In practice, many C/C++ systems have much better resolution than that; you have to measure it to find out the actual value.
The Fortran language standard requires vendors to provide a subroutine (a function that returns void, for C/C++ folks) that returns the clock's resolution and other information. This is OS-independent, making it practical to use across platforms as well. The routine that provides wall clock time is
subroutine system_clock(count, count_rate, count_max)
integer :: count, count_rate, count_max
where the three arguments are integers. Declaring them instead as

integer(kind=selected_int_kind(18)) :: count, count_rate, count_max

(which makes them 8-byte integers) can change what is reported; one compiler then returns count_rate = 1000000 = 1M.
The gfortran compiler on the same platform returns count_rate = 1000.
count_max is the maximum value of count; this is typically
2147483647 or
9223372036854775807, numbers that should be immediately recognizable.
As a hint, add one to each and then take the log base 2.
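To see what your own compiler and platform provide, a few lines suffice. This little probe (my addition) calls system_clock with 8-byte integer arguments and prints the results:

program clock_info
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer(kind=i8) :: count, count_rate, count_max

  call system_clock(count, count_rate, count_max)
  print *, 'count_rate =', count_rate, 'ticks per second'
  print *, 'count_max  =', count_max
  print *, 'resolution =', 1.0d0/dble(count_rate), 'seconds'
end program clock_info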
The count_max is important because of ...
Negative times. Scientific codes often run for a long time, possibly days or even weeks. Even on a smaller scale it often happens that the clock rolls over, and the difference between end time and start time is a negative number. When that happens, if the rollover occurred only once, the correct time is recovered by adding count_max to the difference, as in the code fragment at the end of this section.
If the code fragment being timed takes a really long time, then the clock may have rolled over multiple times. In that case insert timer calls into the timed section, and try to keep track of how many rollovers occurred. My recommendation is to use the Unix epoch time in C (which won't roll over until 2038), and date_and_time() in Fortran. And arrange for 2038 to be a vacation year for yourself.
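For the Fortran route, here is a sketch of how date_and_time() can be turned into a rollover-free wall-clock reading; the conversion from calendar fields to a day number (the Fliegel-Van Flandern formula) and the helper name are my own additions, not anything the standard provides:

program epoch_style_timer
  implicit none
  integer :: v(8)
  double precision :: t_start, t_end

  call date_and_time(values=v)
  t_start = wall_seconds(v)
  ! (Stuff to be timed goes here)
  call date_and_time(values=v)
  t_end = wall_seconds(v)
  print *, 'elapsed wall clock (seconds):', t_end - t_start

contains

  ! Convert date_and_time() values to seconds on a continuous scale:
  ! a Gregorian day number (Fliegel-Van Flandern) times 86400, plus the
  ! time of day, so differences never go negative for realistic runs.
  ! Note: this uses local time, so a daylight-saving shift mid-run
  ! would bias the result.
  double precision function wall_seconds(v)
    integer, intent(in) :: v(8)   ! year, month, day, zone, hour, min, sec, msec
    integer :: a, y, m, jdn
    a = (14 - v(2)) / 12
    y = v(1) + 4800 - a
    m = v(2) + 12*a - 3
    jdn = v(3) + (153*m + 2)/5 + 365*y + y/4 - y/100 + y/400 - 32045
    wall_seconds = 86400.0d0*jdn + 3600.0d0*v(5) + 60.0d0*v(6) &
                   + dble(v(7)) + 1.0d-3*dble(v(8))
  end function wall_seconds

end program epoch_style_timer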
Another option is to use 64-bit integers for counting seconds. Those won't roll over
until 4 December 292277026596 CE, and it's a safe bet none of us will be around then.
call system_clock(count_start, count_rate, count_max)
...
call system_clock(count_end, count_rate, count_max)
time_used = count_end - count_start
! a non-positive difference means the clock rolled over; add count_max to correct it
if (time_used <= 0) time_used = time_used + count_max
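The corrected count is still in clock ticks; dividing by count_rate converts it to seconds (a follow-up line of my own):

elapsed_seconds = dble(time_used) / dble(count_rate)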