sub 0~+~tau ) |~+~{0.8~<<~epsilon sub e sup 2 ( tau )~>> } over sqrt { <<~epsilon sub i sup 2 ( tau )~>> }> , where <$Eepsilon sub i ( tau )> and <$Eepsilon sub e ( tau )> are the accumulated error of the ith clock and entire clock ensemble, respectively. The accumulated error of the entire ensemble is <$E<<~epsilon sub e sup 2 ( tau )~>>~=~left [ sum from i=1 to n~1 over { <<~epsilon sub i sup 2 ( tau )~>> } right ] sup {~-1}>. Finally, the weight factor for the ith clock is calculated as <$Ew sub i ( tau )~=~ { <<~epsilon sub e sup 2 ( tau )~>> } over { <<~epsilon sub i sup 2 ( tau )~>> }> . When all estimators and weight factors have been updated, the origin of the estimation interval is shifted and the new value of t0 becomes the old value of <$Et sub 0 ~+~ tau>. While not entering into the above calculations, it is useful to estimate the frequency error, since the ensemble clocks can be located some distance from each other and become isolated for some time due to network failures. The frequency-offset error in Ri is equivalent to the fractional frequency yi, <$Ey sub i~=~{ nu sub i~-~nu sub I } over nu sub I> measured between the ith timescale and the standard timescale I. Temporarily dropping the subscript i for clarity, consider a sequence of N independent frequency-offset samples <$Ey(j)~ (j~=~1,~2,~... ,~N)> where the interval between samples is uniform and equal to T. Let <$Etau> be the nominal interval over which these samples are averaged. The Allan variance <$Esigma sub y sup 2 ( N,~T,~tau )> [ALL74a] is defined as <$E<< sigma sub y sup 2 ( N,~T,~tau )~>>~=~<< ~ 1 over { N~-~1 }~ left [ sum from j=1 to N~y (j) sup 2~-~1 over N~left ( sum from j=1 to N~y(j) right ) sup 2 right ]~>>> , A particularly useful formulation is <$EN~=~2> and <$ET~=~tau>: <$E<< sigma sub y sup 2 (N~=~2,~T~=~tau ,~tau )>>~==~sigma sub y sup 2 ( tau )~=~<< {[y(j~+~1)~-~y(j)] sup 2 } over 2 >>> , so that <$Esigma sub y sup 2 ( tau )~=~1 over {2(N~-~1)}sum from { j = 1 } to {n-1 }~[y(j~+~1)~-~y(j)] sup 2> . While the Allan variance has found application when estimating errors in ensembles of cesium clocks, its application to NTP is limited due to the
computation and storage burden. As described in the next section, it is possible to estimate errors with some degree of confidence using normal byproducts of NTP processing algorithms. Application to NTP The NTP clock model is somewhat less complex than the general model described above. For instance, at the present level of development it is not necessary to separately estimate the time and frequency of all peer clocks, only the time and frequency of the local clock. If the timekeeping reference is the local clock itself, then the offsets available in the peer.offset peer variables can be used directly for the <$ET sub ij> quantities above. In addition, the NTP local-clock model incorporates a type-II phase-locked loop, which itself reliably estimates frequency errors and corrects accordingly. Thus, the requirement for estimating frequency is entirely eliminated. There remains the problem of how to determine a robust and easily computable error estimate <$Eepsilon sub i>. The method described above, although analytically justified, is most difficult to implement. Happily, as a byproduct of the NTP clock-filter algorithm, a useful error estimate is available in the form of the dispersion. As described in the NTP specification, the dispersion includes the absolute value of the weighted average of the offsets between the chosen offset sample and the <$En~-~1> other samples retained for selection. The effectiveness of this estimator was compared with the above estimator by simulation using observed timekeeping data and found to give quite acceptable results. The NTP clock-combining algorithm can be implemented with only minor modifications to the algorithms as described in the NTP specification. Although elsewhere in the NTP specification the use of general-purpose multiply/divide routines has been successfully avoided, there seems to be no way to avoid them in the clock-combining algorithm. However, for best performance the local-clock algorithm described elsewhere in this document should be implemented as well, since the combining algorithms result in a modest increase in phase noise which the revised local-clock algorithm is designed to suppress. Clock-Combining Procedure The result of the NTP clock-selection procedure is a set of survivors (there must be at least one) that represent truechimers, or correct clocks. When clock combining is not implemented, one of these peers, chosen as the most likely candidate, becomes the synchronization source and its computed offset becomes the final clock correction. Subsequently, the system variables are adjusted as described in the NTP clock-update procedure. When clock combining is implemented, these actions are unchanged, except that the final clock correction is computed by the clock-combining procedure.
The clock-combining procedure is called from the clock-select procedure. It constructs from the variables of all surviving peers the final clock correction <$ETHETA>. The estimated error required by the algorithms previously described is based on the synchronization distance <$ELAMBDA> computed by the distance procedure, as defined in the NTP specification. The reciprocal of <$ELAMBDA> is the weight of each clock-offset contribution to the final clock correction. The following pseudo-code describes the procedure. begin clock-combining procedure <$Etemp1~<<-~0>; <$Etemp2~<<-~0>; for (each peer remaining on the candidate list) /* scan all survivors */ <$ELAMBDA~<<-~roman distance (peer)>; <$Etemp~<<-~1 over roman {peer.stratum~times~NTP.MAXDISPERSE~+~LAMBDA }>; <$Etemp1~<<-~temp1~+~temp>; /* update weight and offset */ <$Etemp2~<<-~temp2~+~temp~times~roman peer.offset>; endif; <$ETHETA~<<-~temp2 over temp1>; /* compute final correction */ end clock-combining procedure; The value <$ETHETA> is the final clock correction used by the local- clock procedure to adjust the clock. Appendix G. Computer Clock Modelling and Analysis A computer clock includes some kind of reference oscillator, which is stabilized by a quartz crystal or some other means, such as the power grid. Usually, the clock includes a prescaler, which divides the oscillator frequency to a standard value, such as 1 MHz or 100 Hz, and a counter, implemented in hardware, software or some combination of the two, which can be read by the processor. For systems intended to be synchronized to an external source of standard time, there must be some means to correct the phase and frequency by occasional vernier adjustments produced by the timekeeping protocol. Special care is necessary in all timekeeping system designs to insure that the clock indications are always monotonically increasing; that is, system time never <169>runs backwards.<170> Computer Clock Models The simplest computer clock consists of a hardware latch which is set by overflow of a hardware counter or prescaler, and causes a processor interrupt or tick. The latch is reset when acknowledged by the processor, which then increments the value of a software clock counter. The phase of the clock is adjusted by adding periodic corrections to the
counter as necessary. The frequency of the clock can be adjusted by changing the value of the increment itself, in order to make the clock run faster or slower. The precision of this simple clock model is limited to the tick interval, usually in the order of 10 ms; although in some systems the tick interval can be changed using a kernel variable. This software clock model requires a processor interrupt on every tick, which can cause significant overhead if the tick interval is small, say in the order less 1 ms with the newer RISC processors. Thus, in order to achieve timekeeping precisions less than 1 ms, some kind of hardware assist is required. A straightforward design consists of a voltage- controlled oscillator (VCO), in which the frequency is controlled by a buffered, digital/analog converter (DAC). Under the assumption that the VCO tolerance is 10-4 or 100 parts-per-million (ppm) (a reasonable value for inexpensive crystals) and the precision required is 100 <$Emu roman s> (a reasonable goal for a RISC processor), the DAC must include at least ten bits. A design sketch of a computer clock constructed entirely of hardware logic components is shown in Figure 10a<$&fig10>. The clock is read by first pulsing the read signal, which latches the current value of the clock counter, then adding the contents of the clock-counter latch and a 64-bit clock-offset variable, which is maintained in processor memory. The clock phase is adjusted by adding a correction to the clock-offset variable, while the clock frequency is adjusted by loading a correction to the DAC latch. In principle, this clock model can be adapted to any precision by changing the number of bits of the prescaler or clock counter or changing the VCO frequency. However, it does not seem useful to reduce precision much below the minimum interrupt latency, which is in the low microseconds for a modern RISC processor. If it is not possible to vary the oscillator frequency, which might be the case if the oscillator is an external frequency standard, a design such as shown in Figure 10b may be used. It includes a fixed-frequency oscillator and prescaler which includes a dual-modulus swallow counter that can be operated in either divide-by-10 or divide-by-11 modes as controlled by a pulse produced by a programmable divider (PD). The PD is loaded with a value representing the frequency offset. Each time the divider overflows a pulse is produced which switches the swallow counter from the divide-by-10 mode to the divide-by-11 mode and then back again, which in effect <169>swallows<170> or deletes a single pulse of the prescaler pulse train. The pulse train produced by the prescaler is controlled precisely over a small range by the contents of the PD. If programmed to emit pulses at a low rate, relatively few pulses are swallowed per second and the frequency counted is near the upper limit of its range; while, if programmed to emit pulses at a high rate, relatively many pulses are swallowed and the frequency counted is near the lower limit. Assuming some degree of freedom in the choice of oscillator frequency and
prescaler ratios, this design can compensate for a wide range of oscillator frequency tolerances. In all of the above designs it is necessary to limit the amount of adjustment incorporated in any step to insure that the system clock indications are always monotonically increasing. With the software clock model this is assured as long as the increment is never negative. When the magnitude of a phase adjustment exceeds the tick interval (as corrected for the frequency adjustment), it is necessary to spread the adjustments over mulitple tick intervals. This strategy amounts to a deliberate frequency offset sustained for an interval equal to the total number of ticks required and, in fact, is a feature of the Unix clock model discussed below. In the hardware clock models the same considerations apply; however, in these designs the tick interval amounts to a single pulse at the prescaler output, which may be in the order of 1 ms. In order to avoid decreasing the indicated time when a negative phase correction occurs, it is necessary to avoid modifying the clock-offset variable in processor memory and to confine all adjustments to the VCO or prescaler. Thus, all phase adjustments must be performed by means of programmed frequency adjustments in much the same way as with the software clock model described previously. It is interesting to conjecture on the design of a processor assist that could provide all of the above functions in a compact, general-purpose hardware interface. The interface might consist of a multifunction timer chip such as the AMD 9513A, which includes five 16-bit counters, each with programmable load and hold registers, plus an onboard crystal oscillator, prescaler and control circuitry. A 48-bit hardware clock counter would utilize three of the 16-bit counters, while the fourth would be used as the swallow counter and the fifth as the programmable divider. With the addition of a programmable-array logic device and architecture-specific host interface, this compact design could provide all the functions necessary for a comprehensive timekeeping system. The Fuzzball Clock Model The Fuzzball clock model uses a combination of hardware and software to provide precision timing with a minimum of software and processor overhead. The model includes an oscillator, prescaler and hardware counter; however, the oscillator frequency remains constant and the hardware counter produces only a fraction of the total number of bits required by the clock counter. A typical design uses a 64-bit software clock counter and a 16-bit hardware counter which counts the prescaler output. A hardware-counter overflow causes the processor to increment the software counter at the bit corresponding to the frequency <$E2 sup N f sub p>, where N is the number of bits of the hardware counter and fp is the counted frequency at the prescaler output. The processor reads the clock counter by first generating a read pulse, which latches the
hardware counter, and then adding its contents, suitably aligned, to the software counter. The Fuzzball clock can be corrected in phase by adding a (signed) adjustment to the software clock counter. In practice, this is done only when the local time is substantially different from the time indicated by the clock and may violate the monotonicity requirement. Vernier phase adjustments determined in normal system operation must be limited to no more than the period of the counted frequency, which is 1 kHz for LSI-11 Fuzzballs. In the Fuzzball model these adjustments are performed at intervals of 4 s, called the adjustment interval, which provides a maximum frequency adjustment range of 250 ppm. The adjustment opportunities are created using the interval-timer facility, which is a feature of most operating systems and independent of the time-of-day clock. However, if the counted frequency is increased from 1 kHz to 1 MHz for enhanced precision, the adjustment frequency must be increased to 250 Hz, which substantially increases processor overhead. A modified design suitable for high precision clocks is presented in the next section. In some applications involving the Fuzzball model, an external pulse- per-second (pps) signal is available from a reference source such as a cesium clock or GPS receiver. Such a signal generally provides much higher accuracy than the serial character string produced by a radio timecode receiver, typically in the low nanoseconds. In the Fuzzball model this signal is processed by an interface which produces a hardware interrupt coincident with the arrival of the pps pulse. The processor then reads the clock counter and computes the residual modulo 1 s of the clock counter. This represents the local-clock error relative to the pps signal. Assuming the seconds numbering of the clock counter has been determined by a reliable source, such as a timecode receiver, the offset within the second is determined by the residual computed above. In the NTP local- clock model the timecode receiver or NTP establishes the time to within <F128M>?<F255D>128 ms, called the aperture, which guarantees the seconds numbering to within the second. Then, the pps residual can be used directly to correct the oscillator, since the offset must be less than the aperture for a correctly operating timecode receiver and pps signal. The above technique has an inherent error equal to the latency of the interrupt system, which in modern RISC processors is in the low tens of microseconds. It is possible to improve accuracy by latching the hardware time-of-day counter directly by the pps pulse and then reading the counter in the same way as usual. This requires additional circuitry to prioritize the pps signal relative to the pulse generated by the program to latch the counter. The Unix Clock Model The Unix 4.3bsd clock model is based on two system calls, settimeofday
and adjtime, together with two kernel variables tick and tickadj. The settimeofday call unceremoniously resets the kernel clock to the value given, while the adjtime call slews the kernel clock to a new value numerically equal to the sum of the present time of day and the (signed) argument given in the adjtime call. In order to understand the behavior of the Unix clock as controlled by the Fuzzball clock model described above, it is helpful to explore the operations of adjtime in more detail. The Unix clock model assumes an interrupt produced by an onboard frequency source, such as the clock counter and prescaler described previously, to deliver a pulse train in the 100-Hz range. In priniciple, the power grid frequency can be used, although it is much less stable than a crystal oscillator. Each interrupt causes an increment called tick to be added to the clock counter. The value of the increment is chosen so that the clock counter, plus an initial offset established by the settimeofday call, is equal to the time of day in microseconds. The Unix clock can actually run at three different rates, one corresponding to tick, which is related to the intrinsic frequency of the particular oscillator used as the clock source, one to <$Etick~+~tickadj> and the third to <$Etick~-~tickadj>. Normally the rate corresponding to tick is used; but, if adjtime is called, the argument <$Edelta> given is used to calculate an interval <$EDELTA t~=~delta~tick over tickadj> during which one or the other of the two rates are used, depending on the sign of <$Edelta>. The effect is to slew the clock to a new value at a small, constant rate, rather than incorporate the adjustment all at once, which could cause the clock to be set backward. With common values of <$Etick~=~10> ms and <$Etickadj~=~5~mu roman s>, the maximum frequency adjustment range is <$E+- tickadj over tick~=~+- {5~roman x~10 sup -6} over {10 sup -2}> or <F128M>æ<F255D>500 ppm. Even larger ranges may be required in the case of some workstations (e.g., SPARCstations) with extremely poor component tolerances. When precisions not less than about 1 ms are required, the Fuzzball clock model can be adapted to the Unix model by software simulation, as described in Section 5 of the NTP specification, and calling adjtime at each adjustment interval. When precisions substantially better than this are required, the hardware microsecond clock provided in some workstations can be used together with certain refinements of the Fuzzball and Unix clock models. The particular design described below is appropriate for a maximum oscillator frequency tolerance of 100 ppm (.01%), which can be obtained using a relatively inexpensive quartz crystal oscillator, but is readily scalable for other assumed tolerances. The clock model requires the capability to slew the clock frequency over the range <F128M>æ<F255D>100 ppm with an intrinsic oscillator frequency error as great as <F128M>æ<F255D>100 ppm. Figure 11<$&fig11> shows the
timing relationships at the extremes of the requirements envelope. Starting from an assumed offset of nominal zero and an assumed error of +100 ppm at time 0 s, the line AC shows how the uncorrected offset grows with time. Let <$Esigma> represent the adjustment interval and a the interval AB, in seconds, and let r be the slew, or rate at which corrections are introduced, in ppm. For an accuracy specification of 100 <$Emu roman s>, then <$Esigma~<<=~{100~mu roman s} over {100~roman ppm}~+~{100~mu roman s} over {(r~-~100)~roman ppm}~=~r over {r~-~100}> . The line AE represents the extreme case where the clock is to be steered <F128M>-<F255D>100 ppm. Since the slew must be complete at the end of the adjustment interval, <$Ea~<<=~{(r~-~200)~sigma} over r>. These relationships are satisfied only if <$Er~>>~200~roman ppm> and <$Esigma~<<~2~roman s>. Using <$Er~=~300~roman ppm> for convenience, <$Esigma~=~1.5~roman s> and <$Ea~<<=~0.5~roman s>. For the Unix clock model with <$Etick~=~10~roman ms>, this results in the value of <$Etickadj~=~3~mu roman s>. One of the assumptions made in the Unix clock model is that the period of adjustment computed in the adjtime call must be completed before the next call is made. If not, this results in an error message to the system log. However, in order to correct for the intrinsic frequency offset of the clock oscillator, the NTP clock model requires adjtime to be called at regular adjustment intervals of <$Esigma> s. Using the algorithms described here and the architecture constants in the NTP specification, these adjustments will always complete. Mathematical Model of the NTP Logical Clock The NTP logical clock can be represented by the feedback-control model shown in Figure 12<$&fig12>. The model consists of an adaptive- parameter, phase-lock loop (PLL), which continuously adjusts the phase and frequency of an oscillator to compensate for its intrinsic jitter, wander and drift. A mathematical analysis of this model developed along the lines of [SMI86] is presented in following sections, along with a design example useful for implementation guidance in operating-systems environments such as Unix and Fuzzball. Table 9<$&tab9> summarizes the quantities ordinarily treated as variables in the model. By convention, <$Ev> is used for internal loop variables, <$Etheta> for phase, <$Eomega> for frequency and <$Etau> for time. Table 10<$&tab10> summarizes those quantities ordinarily fixed as constants in the model. Note that these are all expressed as a power of two in order to simplify the implementation. In Figure 12 the variable <$Etheta sub r> represents the phase of the reference signal and <$Etheta sub o> the phase of the voltage-controlled
oscillator (VCO). The phase detector (PD) produces a voltage <$Ev sub d> representing the phase difference <$Etheta sub r~-~theta sub o> . The clock filter functions as a tapped delay line, with the output <$Ev sub s> taken at the tap selected by the clock-filter algorithm described in the NTP specification. The loop filter, represented by the equations given below, produces a VCO correction voltage <$Ev sub c>, which controls the oscillator frequency and thus the phase <$Etheta sub o>. The PLL behavior is completely determined by its open-loop, Laplace transfer function <$EG(s)> in the s domain. Since both frequency and phase corrections are required, an appropriate design consists of a type-II PLL, which is defined by the function <$EG(s)~=~{omega sub c sup 2} over {tau sup 2 s sup 2}~( 1 ~+~{tau s} over omega sub z )> , where <$Eomega sub c> is the crossover frequency (also called loop gain), <$Eomega sub z> is the corner frequency (required for loop stability) and <$Etau> determines the PLL time constant and thus the bandwidth. While this is a first-order function and some improvement in phase noise might be gained from a higher-order function, in practice the improvement is lost due to the effects of the clock-filter delay, as described below. The open-loop transfer function <$EG(s)> is constructed by breaking the loop at point a on Figure 12 and computing the ratio of the output phase <$Etheta sub o (s)> to the reference phase <$Etheta sub r (s)>. This function is the product of the individual transfer functions for the phase detector, clock filter, loop filter and VCO. The phase detector delivers a voltage <$Ev sub d (t)~=~ theta sub r (t)>, so its transfer function is simply <$EF sub d (s)~=~1>, expressed in V/rad. The VCO delivers a frequency change <$EDELTA omega ~=~{roman d~theta sub o (t)} over {roman dt}~=~alpha {v sub c (t)}>, where <$Ealpha> is the VCO gain in rad/V-sec and <$Etheta sub o (t)~=~alpha~int v sub c (t)~dt>. Its transfer function is the Laplace transform of the integral, <$EF sub o (s)~=~alpha over s>, expressed in rad/V. The clock filter contributes a stochastic delay due to the clock-filter algorithm; but, for present purposes, this delay will be assumed a constant T, so its transfer function is the Laplace transform of the delay, <$EF sub s (s)~=~e sup {- Ts}>. Let <$EF(s)> be the transfer function of the loop filter, which has yet to be determined. The open-loop transfer function <$EG(s)> is the product of these four individual transfer functions: <$EG(s)~=~{omega sub c sup 2} over {tau sup 2 s sup 2}~( 1 ~+~{tau s} over omega sub z )~=~F sub d (s) F sub s (s) F(s) F sub o (s)~=~1e sup {-Ts}~F(s)~alpha over s> . For the moment, assume that the product <$ETs> is small, so that <$Ee sup {-Ts}~approx ~1>. Making the following substitutions,
<$Eomega sub c sup 2~=~alpha over { K sub f}~~~~> and <$E~~~~omega sub z~=~K sub g over {K sub f}> and rearranging yields <$EF(s)~=~1 over {K sub g~tau}~+~1 over {K sub f~tau sup 2 s }> , which corresponds to a constant term plus an integrating term scaled by the PLL time constant <$Etau>. This form is convenient for implementation as a sampled-data system, as described later. With the parameter values given in Table 10, the Bode plot of the open- loop transfer function <$EG(s)> consists of a <196>12 dB/octave line which intersects the 0-dB baseline at <$Eomega sub c~=~2 sup -12> rad/s, together with a +6 dB/octave line at the corner frequency <$Eomega sub z~=~2 sup -14> rad/s. The damping factor <$Ezeta~=~omega sub c over {2 omega sub z}~=~2> suggests the PLL will be stable and have a large phase margin together with a low overshoot. However, if the clock-filter delay T is not small compared to the loop delay, which is approximately equal to <$E1 over omega sub c>, the above analysis becomes unreliable and the loop can become unstable. With the values determined as above, T is ordinarily small enough to be neglected. Assuming the output is taken at <$Ev sub s>, the closed-loop transfer function <$EH(s)> is <$EH(s)~==~{v sub s (s)} over {theta sub r (s)}~=~{F sub d (s) e sup {- Ts}} over {1~+~G(s)}> . If only the relative response is needed and the clock-filter delay can be neglected, <$EH(s)> can be written <$EH(s)~=~1 over {1~+~G(s)}~=~s sup 2 over {s sup 2~+~omega sub c sup 2 over {omega sub z~tau} s~+~omega sub c sup 2 over tau sup 2}> . For some input function <$EI(s)> the output function <$EI(s)H(s)> can be inverted to find the time response. Using a unit-step input <$EI(s)~=~1 over s> and the values determined as above, This yields a PLL risetime of about 52 minutes, a maximum overshoot of about 4.8 percent in about 1.7 hours and a settling time to within one percent of the initial offset in about 8.7 hours. Parameter Management A very important feature of the NTP PLL design is the ability to adapt its behavior to match the prevailing stability of the local oscillator and transmission conditions in the network. This is done using the <$Ealpha> and <$Etau> parameters shown in Table 10. Mechanisms for doing this are described in following sections. Adjusting VCO Gain (<$Ebold alpha>)
The <$Ealpha> parameter is determined by the maximum frequency tolerance of the local oscillator and the maximum jitter requirements of the timekeeping system. This parameter is usually an architecture constant and fixed during system operation. In the implementation model described below, the reciprocal of <$Ealpha>, called the adjustment interval <$Esigma>, determines the time between corrections of the local clock, and thus the value of <$Ealpha>. The value of <$Esigma> can be determined by the following procedure. The maximum frequency tolerance for board-mounted, uncompensated quartz- crystal oscillators is probably in the range of 10-4 (100 ppm). Many if not most Internet timekeeping systems can tolerate jitter to at least the order of the intrinsic local-clock resolution, called precision in the NTP specification, which is commonly in the range from one to 20 ms. Assuming 10-3 s peak-to-peak as the most demanding case, the interval between clock corrections must be no more than <$Esigma~=~10 sup -3 over {2 roman~x~10 sup -4}~=~5> sec. For the NTP reference model <$Esigma~=~4> sec in order to allow for known features of the Unix operating-system kernel. However, in order to support future anticipated improvements in accuracy possible with faster workstations, it may be useful to decrease <$Esigma> to as little as one-tenth the present value. Note that if <$Esigma> is changed, it is necessary to adjust the parameters <$EK sub f> and <$EK sub g> in order to retain the same loop bandwidth; in particular, the same <$Eomega sub c> and <$Eomega sub z>. Since <$Ealpha> varies as the reciprocal of <$Esigma>, if <$Esigma> is changed to something other than 22, as in Table 10, it is necessary to divide both <$EK sub f> and <$EK sub g> by <$Esigma over 4> to obtain the new values. Adjusting PLL Bandwidth (<$Ebold tau>) A key feature of the type-II PLL design is its capability to compensate for the intrinsic frequency errors of the local oscillator. This requires a initial period of adaptation in order to refine the frequency estimate (see later sections of this appendix). The <$Etau> parameter determines the PLL time constant and thus the loop bandwidth, which is approximately equal to <$E{omega sub c} over tau>. When operated with a relatively large bandwidth (small <$Etau>), as in the analysis above, the PLL adapts quickly to changes in the input reference signal, but has poor long term stability. Thus, it is possible to accumulate substantial errors if the system is deprived of the reference signal for an extended period. When operated with a relatively small bandwidth (large <$Etau>), the PLL adapts slowly to changes in the input reference signal, and may even fail to lock onto it. Assuming the frequency estimate has stabilized, it is possible for the PLL to coast for an extended period without external corrections and without accumulating significant error.
In order to achieve the best performance without requiring individual tailoring of the loop bandwidth, it is necessary to compute each value of <$Etau> based on the measured values of offset, delay and dispersion, as produced by the NTP protocol itself. The traditional way of doing this in precision timekeeping systems based on cesium clocks, is to relate <$Etau> to the Allan variance, which is defined as the mean of the first-order differences of sequential samples measured during a specified interval <$Etau>, <$Esigma sub y sup 2 ( tau )~=~1 over {2(N~-~1)}sum from { i = 1 } to {N-1 }~[y(i~+~1)~-~y(i)] sup 2> , where y is the fractional frequency measured with respect to the local timescale and N is the number of samples. In the NTP local-clock model the Allan variance (called the compliance, h in Table 11) is approximated on a continuous basis by exponentially averaging the first-order differences of the offset samples using an empirically determined averaging constant. Using somewhat ad-hoc mapping functions determined from simulation and experience, the compliance is manipulated to produce the loop time constant and update interval. The NTP Clock Model The PLL behavior can also be described by a set of recurrence equations, which depend upon several variables and constants. The variables and parameters used in these equations are shown in Tables 9, 10 and 11<$&tab11>. Note the use of powers of two, which facilitates implementation using arithmetic shifts and avoids the requirement for a multiply/divide capability. A capsule overview of the design may be helpful in understanding how it operates. The logical clock is continuously adjusted in small increments at fixed intervals of <$Esigma>. The increments are determined while updating the variables shown in Tables 9 and 11, which are computed from received NTP messages as described in the NTP specification. Updates computed from these messages occur at discrete times as each is received. The intervals <$Emu> between updates are variable and can range up to about 17 minutes. As part of update processing the compliance h is computed and used to adjust the PLL time constant <$Etau>. Finally, the update interval <$Erho> for transmitted NTP messages is determined as a fixed multiple of <$Etau>. Updates are numbered from zero, with those in the neighborhood of the ith update shown in Figure 13<$&fig13>. All variables are initialized at <$Ei~=~0> to zero, except the time constant <$Etau (0)~=~tau>, poll interval <$Emu (0)~=~tau> (from Table 10) and compliance <$Eh (0)~=~K sub s>. After an interval <$Emu (i)> (<$Ei~>>~0>) from the previous update the ith update arrives at time <$Et(i)> including the time offset <$Ev sub s (i)>. Then, after an interval <$Emu (i~+~1)> the
<$Ei+1 roman th> update arrives at time <$Et(i~+~1)> including the time offset <$Ev sub s (i~+~1)>. When the update <$Ev sub s (i)> is received, the frequency error <$Ef(i~+~1)> and phase error <$Eg(i~+~1)> are computed: <$Ef(i~+~1)~=~f(i)~+~{mu (i) v sub s (i)} over {tau (i) sup 2 }> ,<$E~~~~~g(i~+~1)~=~{v sub s (i)} over {tau (i)}> . Note that these computations depend on the value of the time constant <$Etau (i)> and poll interval <$Emu (i)> previously computed from the <$Ei-1 roman th> update. Then, the time constant for the next interval is computed from the current value of the compliance <$Eh(i)> <$Etau (i~+~1)~=~roman max [K sub s~-~|~h(i)|,~1]> . Next, using the new value of <$Etau>, called <$Etau prime> to avoid confusion, the poll interval is computed <$Erho (i~+~1)~=~K sub u~tau prime> . Finally, the compliance <$Eh(i~+~1)> is recomputed for use in the <$Ei+1 roman th> update: <$Eh(i~+~1)~=~h(i)~+~{K sub t~tau prime v sub s (i)~-~h(i) }over K sub h> . The factor <$Etau prime> in the above has the effect of adjusting the bandwidth of the PLL as a function of compliance. When the compliance has been low over some relatively long period, <$Etau prime> is increased and the bandwidth is decreased. In this mode small timing fluctuations due to jitter in the network are suppressed and the PLL attains the most accurate frequency estimate. On the other hand, if the compliance becomes high due to greatly increased jitter or a systematic frequency offset, <$Etau prime> is decreased and the bandwidth is increased. In this mode the PLL is most adaptive to transients which can occur due to reboot of the system or a major timing error. In order to maintain optimum stability, the poll interval <$Erho> is varied directly with <$Etau>. A model suitable for simulation and parameter refinement can be constructed from the above recurrence relations. It is convenient to set the temporary variable <$Ea~=~g(i~+~1)>. At each adjustment interval <$Esigma> the quantity <$Ea over K sub g~+~{f(i~+~1)} over K sub f> is added to the local-clock phase and the quantity <$Ea over K sub g> is subtracted from a. For convenience, let n be the greatest integer in <$E{mu (i)} over sigma>; that is, the number of adjustments that occur in the ith interval. Thus, at the end of the ith interval just before the <$Ei+1 roman th> update, the VCO control voltage is: <$Ev sub c (i~+~1)~=~v sub c (i)~+~{[1~-~(1~-~1 over K sub g ) sup n ]}~{g(i~+~1)} ~+~n over {K sub f }~{ f(i~+~1)}~.>
Detailed simulation of the NTP PLL with the values specified in Tables 9, 10 and 11 and the clock filter described in the NTP specification results in the following characteristics: For a 100-ms phase change the loop reaches zero error in 39 minutes, overshoots 7 ms at 54 minutes and settles to less than 1 ms in about six hours. For a 50-ppm frequency change the loop reaches 1 ppm in about 16 hours and 0.1 ppm in about 26 hours. When the magnitude of correction exceeds a few milliseconds or a few ppm for more than a few updates, the compliance begins to increase, which causes the loop time constant and update interval to decrease. When the magnitude of correction falls below about 0.1 ppm for a few hours, the compliance begins to decrease, which causes the loop time constant and update interval to increase. The effect is to provide a broad capture range exceeding 4 s per day, yet the capability to resolve oscillator skew well below 1 ms per day. These characteristics are appropriate for typical crystal-controlled oscillators with or without temperature compensation or oven control. Appendix H. Analysis of Errors and Correctness Principles Introduction This appendix contains an analysis of errors arising in the generation and processing of NTP timestamps and the determination of delays and offsets. It establishes error bounds as a function of measured roundtrip delay and dispersion to the root (primary reference source) of the synchronization subnet. It also discusses correctness assertions about these error bounds and the time-transfer, filtering and selection algorithms used in NTP. The notation <$Ew~=~[u,~v]> in the following describes the interval in which u is the lower limit and v the upper limit, inclusive. Thus, <$Eu~=~min (w)~<<=~v~=~max (w)>, and for scalar a, <$Ew~+~a~=~[u~+~a,~v~+~a]>. Table 12<$&tab12> shows a summary of other notation used in the analysis. The notation <$E<<~x~>>> designates the (infinite) average of x, which is usually approximated by an exponential average, while the notation <$Ex hat> designates an estimator for x. The lower-case Greek letters <$Etheta>, <$Edelta> and <$Eepsilon> are used to designate measurement data for the local clock to a peer clock, while the upper-case Greek letters <$ETHETA>, <$EDELTA> and <$EEPSILON> are used to designate measurement data for the local clock relative to the primary reference source at the root of the synchronization subnet. Exceptions will be noted as they arise. Timestamp Errors The standard second (1 s) is defined as <169>9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the cesium-133 atom<170> [ALL74b], which implies a granularity of about 1.1x10-10 s. Other intervals can be
determined as rational multiples of 1 s. While NTP time has an inherent resolution of about 2.3x10-10 s, local clocks ordinarily have resolutions much worse than this, so the inherent error in resolving NTP time relative to the 1 s can be neglected. In this analysis the local clock is represented by a counter/divider which increments at intervals of s seconds and is driven by an oscillator which operates at frequency <$Ef sub c~=~n over s> for some integer n. A timestamp <$ET(t)> is determined by reading the clock at an arbitrary time t (the argument t will be usually omitted for conciseness). Strictly speaking, s is not known exactly, but can be assumed bounded from above by the maximum reading error <$Erho>. The reading error itself is represented by the random variable r bounded by the interval <$E[-~rho ,~0]>, where <$Erho> depends on the particular clock implementation. Since the intervals between reading the same clock are almost always independent of and much larger than s, successive readings can be considered independent and identically distributed. The frequency error of the clock oscillator is represented by the random variable f bounded by the interval <$E[-~phi ,~phi ]>, where <$Ephi> represents the maximum frequency tolerance of the oscillator throughout its service life. While f for a particular clock is a random variable with respect to the population of all clocks, for any one clock it ordinarily changes only slowly with time and can usually be assumed a constant for that clock. Thus, an NTP timestamp can be represented by the random variable T: <$ET~=~t~+~r~+~f tau> , where t represents a clock reading, <$Etau> represents the time interval since this reading and minor approximations inherent in the measurement of <$Etau> are neglected. In order to assess the nature and expected magnitude of timestamp errors and the calculations based on them, it is useful to examine the characteristics of the probability density functions (pdf) <$Ep sub r (x)> and <$Ep sub f (x)> for r and f respectively. Assuming the clock reading and counting processes are independent, the pdf for r is uniform over the interval <$E[-~rho ,~0]>. With conventional manufacturing processes and temperature variations the pdf for f can be approximated by a truncated, zero-mean Gaussian distribution with standard deviation <$Esigma>. In conventional manufacturing processes <$Esigma> is maneuvered so that the fraction of samples rejected outside the interval <$E[-~phi ,~phi ]> is acceptable. The pdf for the total timestamp error <$Eepsilon (x)> is thus the sum of the r and f contributions, computed as <$Eepsilon (x)~ =~ int~from {- inf } to inf p sub r (t) p sub f (x~-~t) d t> , which appears as a bell-shaped curve, symmetric about <$E-~rho over 2>
and bounded by the interval <$E[ min (r)~+~min (f tau ),~max (r)~+~max (f tau )]~=~[-~rho ~-~phi tau ,~phi tau ]> . Since f changes only slowly over time for any single clock, <$Eepsilon~==~[ min (r)~+~f tau ,~max (r)~+~f tau ]~=~ [-~ rho ,~0]~+~f tau> , where <$Eepsilon> without argument designates the interval and <$Eepsilon (x)> designates the pdf. In the following development subscripts will be used on various quantities to indicate to which entity or timestamp the quantity applies. Occasionally, <$Eepsilon> will be used to designate an absolute maximum error, rather than the interval, but the distinction will be clear from context. Measurement Errors In NTP the roundtrip delay and clock offset between two peers A and B are determined by a procedure in which timestamps are exchanged via the network paths between them. The procedure involves the four most recent timestamps numbered as shown in Figure 14<$&fig14>, where the <$Etheta sub 0> represents the true clock offset of peer B relative to peer A. The <$ET sub 1> and <$ET sub 4> timestamps are determined relative to the A clock, while the <$ET sub 2> and <$ET sub 3> timestamps are determined relative to the B clock. The measured roundtrip delay <$Edelta> and clock offset <$Etheta> of B relative to A are given by <$Edelta~=~(T sub 4~-~T sub 1 )~-~(T sub 3~-~T sub 2 )~~~~and~~~~theta~=~{(T sub 2~-~T sub 1 )~+~(T sub 3~-~T sub 4 )} over 2> . The errors inherent in determining the timestamps T1, T2, T3 and T4 are, respectively, <$Eepsilon sub 1~=~[-~rho sub A ,~0]>, <$E~epsilon sub 2~=~[-~rho sub B ,~0]>, <$E~epsilon sub 3~=~[-~rho sub B ,~0]~+~f sub B (T sub 3 ~-~T sub 2 )>, <$E~epsilon sub 4~=~[-~rho sub A ,~0]~+~f sub A (T sub 4 ~-~T sub 1 )> . For specific peers A and B, where <$Ef sub A> and <$Ef sub B> can be considered constants, the interval containing the maximum error inherent in determining <$Edelta> is given by <$E[ min ( epsilon sub 4 )~-~max ( epsilon sub 1 )~-~max ( epsilon sub 3 )~+~min ( epsilon sub 2 ),~ max ( epsilon sub 4 )~-~min ( epsilon sub 1 )~-~min ( epsilon sub 3 )~+~max ( epsilon sub 2 )]> <$E=~[-~rho sub A~-~rho sub B ,~rho sub A ~+~rho sub B ]~+~f sub A (T sub 4~-~T sub 1 )~-~f sub B (T sub 3~-~T sub 2 )> .
In the NTP local clock model the residual frequency errors <$Ef sub A> and <$Ef sub B> are minimized through the use of a type-II phase-lock loop (PLL). Under most conditions these errors will be small and can be ignored. The pdf for the remaining errors is symmetric, so that <$Edelta hat~=~<< delta >>> is an unbiased maximum-likelihood estimator for the true roundtrip delay, independent of the particular values of <$Erho sub A> and <$Erho sub B>. However, in order to reliably bound the errors under all conditions of component variation and operational regimes, the design of the PLL and the tolerance of its intrinsic oscillator must be controlled so that it is not possible under any circumstances for <$Ef sub A> or <$Ef sub B> to exceed the bounds <$E[-~phi sub A ,~phi sub A ]> or <$E[-~phi sub B ,~phi sub B ]>, respectively. Setting <$Erho~=~max ( rho sub A ,~rho sub B )> for convenience, the absolute maximum error <$Eepsilon sub delta> inherent in determining roundtrip delay <$Edelta> is given by <$Eepsilon sub delta~==~rho~+~phi sub A (T sub 4~-~T sub 1 )~+~phi sub B (T sub 3~-~T sub 2 )> , neglecting residuals. As in the case for <$Edelta>, where <$Ef sub A> and <$Ef sub B> can be considered constants, the interval containing the maximum error inherent in determining <$Etheta> is given by <$E{[ min ( epsilon sub 2 )~-~max ( epsilon sub 1 )~+~min ( epsilon sub 3 )~-~max ( epsilon sub 4 ),~ max ( epsilon sub 2 )~-~min ( epsilon sub 1 )~+~max ( epsilon sub 3 )~-~min ( epsilon sub 4 )]} over 2> <$E=~[ -~rho sub B ,~rho sub A ]~+~{f sub B (T sub 3~-~T sub 2 )~-~f sub A (T sub 4 ~-~T sub 1 )} over 2> . Under most conditions the errors due to <$Ef sub A> and <$Ef sub B> will be small and can be ignored. If <$Erho sub A~=~rho sub B~=~rho>; that is, if both the A and B clocks have the same resolution, the pdf for the remaining errors is symmetric, so that <$Etheta hat~=~<< theta >>> is an unbiased maximum-likelihood estimator for the true clock offset <$Etheta sub 0>, independent of the particular value of <$Erho>. If <$Erho sub A~!=~rho sub B>, <$E<< theta >>> is not an unbiased estimator; however, the bias error is in the order of <$E{rho sub A~-~rho sub B } over 2> . and can usually be neglected. Again setting <$Erho~=~max ( rho sub A ,~rho sub B )> for convenience, the absolute maximum error <$Eepsilon sub theta> inherent in determining clock offset <$Etheta> is given by <$Eepsilon sub theta~==~{rho~+~phi sub A (T sub 4~-~T sub 1 )~+~phi sub B (T sub 3~-~T sub 2 )} over 2 > .
Network Errors In practice, errors due to stochastic network delays usually dominate. In general, it is not possible to characterize network delays as a stationary random process, since network queues can grow and shrink in chaotic fashion and arriving customer traffic is frequently bursty. However, it is a simple exercise to calculate bounds on clock offset errors as a function of measured delay. Let <$ET sub 2~-~T sub 1~=~a> and <$ET sub 3~-~T sub 4~=~b>. Then, <$Edelta~=~a~-~b~~~~ and ~~~~theta~=~{a~+~b} over 2> . The true offset of B relative to A is called <$Etheta sub 0> in Figure 14. Let x denote the actual delay between the departure of a message from A and its arrival at B. Therefore, <$Ex~+~theta sub 0~=~T sub 2~-~T sub 1~==~a>. Since x must be positive in our universe, <$Ex~=~a~-~theta sub 0~>>=~0>, which requires <$Etheta sub 0~<<=~a>. A similar argument requires that <$Eb~<<=~theta sub 0>, so surely <$Eb~<<=~theta sub 0~<<=~a>. This inequality can also be expressed <$Eb~=~{a~+~b} over 2~-~{a~-~b} over 2~<<=~theta sub 0~<<=~{a~+~b} over 2~+~{a~-~b} over 2~=~a> , which is equivalent to <$Etheta~-~delta over 2~<<=~theta sub 0~<<=~theta~+~delta over 2> . In the previous section bounds on delay and offset errors were determined. Thus, the inequality can be written <$Etheta~-~epsilon sub theta~-~{delta~+~epsilon sub delta} over 2~<<=~theta sub 0~<<=~theta~+~epsilon sub theta~+~{delta~+~ epsilon sub delta } over 2> , where <$Eepsilon sub theta> is the maximum offset error and <$Eepsilon sub delta> is the maximum delay error derived previously. The quantity <$Eepsilon~=~epsilon sub theta~+~epsilon sub delta over 2~=~rho~+~phi sub A (T sub 4~-~T sub 1 )~+~phi sub B (T sub 3~-~T sub 2 )> , called the peer dispersion, defines the maximum error in the inequality. Thus, the correctness interval I can be defined as the interval <$EI~=~[ theta~-~delta over 2~-~epsilon ,~theta~+~delta over 2~+~epsilon ]> , in which the clock offset <$EC~=~theta> is the midpoint. By construction, the true offset <$Etheta sub 0> must lie somewhere in this interval.
Inherited Errors As described in the NTP specification, the NTP time server maintains the local clock <$ETHETA>, together with the root roundtrip delay <$EDELTA> and root dispersion <$EEPSILON> relative to the primary reference source at the root of the synchronization subnet. The values of these variables are either included in each update message or can be derived as described in the NTP specification. In addition, the protocol exchange and clock-filter algorithm provide the clock offset <$Etheta> and roundtrip delay <$Edelta> of the local clock relative to the peer clock, as well as various error accumulations as described below. The following discussion establishes how errors inherent in the time-transfer process accumulate within the subnet and contribute to the overall error budget at each server. An NTP measurement update includes three parts: clock offset <$Etheta>, roundtrip delay <$Edelta> and maximum error or dispersion <$Eepsilon> of the local clock relative to a peer clock. In case of a primary clock update, these values are usually all zero, although <$Eepsilon> can be tailored to reflect the specified maximum error of the primary reference source itself. In other cases <$Etheta> and <$Edelta> are calculated directly from the four most recent timestamps, as described in the NTP specification. The dispersion <$Eepsilon> includes the following contributions: 1. Each time the local clock is read a reading error is incurred due to the finite granularity or precision of the implementation. This is called the measurement dispersion <$Erho>. 2. Once an offset is determined, an error due to frequency offset or skew accumulates with time. This is called the skew dispersion <$Ephi tau>, where <$Ephi> represents the skew-rate constant (<$Eroman NTP.MAXSKEW over NTP.MAXAGE> in the NTP specification) and <$Etau> is the interval since the dispersion was last updated. 3 When a series of offsets are determined at regular intervals and accumulated in a window of samples, as in the NTP clock-filter algorithm, the (estimated) additional error due to offset sample variance is called the filter dispersion <$Eepsilon sub sigma>. 4. When a number of peers are considered for synchronization and two or more are determined to be correctly synchronized to a primary reference
source, as in the NTP clock-selection algorithm, the (estimated) additional error due to offset sample variance is called the selection dispersion <$Eepsilon sub xi>. Figure 15<$&fig15> shows how these errors accumulate in the ordinary course of NTP processing. Received messages from a single peer are represented by the packet variables. From the four most recent timestamps T1, T2, T3 and T4 the clock offset and roundtrip delay sample for the local clock relative to the peer clock are calculated directly. Included in the message are the root roundtrip delay <$EDELTA prime> and root dispersion <$EEPSILON prime> of the peer itself; however, before sending, the peer adds the measurement dispersion <$Erho> and skew dispersion <$Ephi tau>, where these quantities are determined by the peer and <$Etau> is the interval according to the peer clock since its clock was last updated. The NTP clock-filter procedure saves the most recent samples <$Etheta sub i> and <$Edelta sub i> in the clock filter as described in the NTP specification. The quantities <$Erho> and <$Ephi> characterize the local clock maximum reading error and frequency error, respectively. Each sample includes the dispersion <$Eepsilon sub i~=~rho~+~phi (T sub 4~-~T sub 1 )>, which is set upon arrival. Each time a new sample arrives all samples in the filter are updated with the skew dispersion <$Ephi tau sub i>, where <$Etau sub i> is the interval since the last sample arrived, as recorded in the variable peer.update. The clock-filter algorithm determines the selected clock offset <$Etheta> (peer.offset), together with the associated roundtrip delay <$Edelta> (peer.delay) and filter dispersion <$Eepsilon sub sigma>, which is added to the associated sample dispersion <$Eepsilon sub i> to form the peer dispersion <$Eepsilon> (peer.dispersion). The NTP clock-selection procedure selects a single peer to become the synchronization source as described in the NTP specification. The operation of the algorithm determines the final clock offset <$ETHETA> (local clock), roundtrip delay <$EDELTA> (sys.rootdelay) and dispersion <$EEPSILON> (sys.rootdispersion) relative to the root of the synchronization subnet, as shown in Figure 15. Note the inclusion of the selected peer dispersion and skew accumulation since the dispersion was last updated, as well as the select dispersion <$Eepsilon sub xi> computed by the clock-select algorithm itself. Also, note that, in order to preserve overall synchronization subnet stability, the final clock offset <$ETHETA> is in fact determined from the offset of the local clock relative to the peer clock, rather than the root of the subnet. Finally, note that the packet variables <$EDELTA prime> and <$EEPSILON prime> are in fact determined from the latest message received, not at the precise time the offset selected by the clock-filter algorithm was determined. Minor errors arising due to these simplifications will be ignored. Thus, the total dispersion accumulation relative to the root of the synchronization subnet is
<$EEPSILON~=~epsilon~+~phi tau~+~epsilon sub xi~+~| THETA |~+~EPSILON prime > , where <$Etau> is the time since the peer variables were last updated and <$E| THETA |> is the initial absolute error in setting the local clock. The three values of clock offset, roundtrip delay and dispersion are all additive; that is, if <$ETHETA sub i>, <$EDELTA sub i> and <$EEPSILON sub i> represent the values at peer i relative to the root of the synchronization subnet, the values <$ETHETA sub j (t)~==~THETA sub i~+~theta sub j (t)> , <$EDELTA sub j (t)~==~DELTA sub i~+~delta sub j> , <$EEPSILON sub j (t)~==~EPSILON sub i~+~epsilon sub i~+~epsilon sub j (t)> , represent the clock offset, roundtrip delay and dispersion of peer j at time t. The time dependence of <$Etheta sub j (t)> and <$Eepsilon sub j (t)> represents the local-clock correction and dispersion accumulated since the last update was received from peer i, while the term <$Eepsilon sub i> represents the dispersion accumulated by peer i from the time its clock was last set until the latest update was sent to peer j. Note that, while the offset of the local clock relative to the peer clock can be determined directly, the offset relative to the root of the synchronization subnet is not directly determinable, except on a probabilistic basis and within the bounds established in this and the previous section. The NTP synchronization subnet topology is that of a tree rooted at the primary server(s). Thus, there is an unbroken path from every time server to the primary reference source. Accuracy and stability are proportional to synchronization distance <$ELAMBDA>, defined as <$ELAMBDA~==~EPSILON~+~DELTA over 2> . The selection algorithm favors the minimum-distance paths and thus maximizes accuracy and stability. Since <$ETHETA sub 0>, <$EDELTA sub 0> and <$EEPSILON sub 0> are all zero, the sum of the clock offsets, roundtrip delays and dispersions of each server along the minimum- distance path from the root of the synchronization subnet to a given server i are the clock offset <$ETHETA sub i>, roundtrip delay <$EDELTA sub i> and dispersion <$EEPSILON sub i> inherited by and characteristic of that server. Correctness Principles In order to minimize the occurrence of errors due to incorrect clocks and maximize the reliability of the service, NTP relies on multiple peers and disjoint peer paths whenever possible. In the previous development it was shown that, if the primary reference source at the root of the synchronization subnet is in fact a correct clock, then the true offset <$Etheta sub 0> relative to that clock must be contained in
the interval <$E[ THETA~-~LAMBDA ,~THETA~+~LAMBDA ]~==~[ THETA~-~EPSILON~-~DELTA over 2 ,~THETA~+~EPSILON~+~DELTA over 2 ]> . When a number of clocks are involved, it is not clear beforehand which are correct and which are not; however, as cited previously, there are a number of techniques based on clustering and filtering principles which yield a high probability of detecting and discarding incorrect clocks. Marzullo and Owicki [MAR85] devised an algorithm designed to find an appropriate interval containing the correct time given the confidence intervals of m clocks, of which no more than f are considered incorrect. The algorithm finds the smallest single intersection containing all points in at least <$Em~-~f> of the given confidence intervals. Figure 16<$&fig16> illustrates the operation of this algorithm with a scenario involving four clocks A, B, C and D, with the calculated time (shown by the <F128>ì<F255> symbol) and confidence interval shown for each. These intervals are computed as described in previous sections of this appendix. For instance, any point in the A interval may possibly represent the actual time associated with that clock. If all clocks are correct, there must exist a nonempty intersection including all four intervals; but, clearly this is not the case in this scenario. However, if it is assumed that one of the clocks is incorrect (e.g., D), it might be possible to find a nonempty intersection including all but one of the intervals. If not, it might be possible to find a nonempty intersection including all but two of the intervals and so on. The algorithm proposed by DEC for use in the Digital Time Service [DEC89] is based on these principles. For the scenario illustrated in Figure 16, it computes the interval for <$Em~=~4> clocks, three of which turn out to be correct and one not. The low endpoint of the intersection is found as follows. A variable f is initialized with the number of presumed incorrect clocks, in this case zero, and a counter i is initialized at zero. Starting from the lowest endpoint, the algorithm increments i at each low endpoint, decrements i at each high endpoint, and stops when <$Ei~>>=~m~-~f>. The counter records the number of intersections and thus the number of presumed correct clocks. In the example the counter never reaches four, so f is increased by one and the procedure is repeated. This time the counter reaches three and stops at the low endpoint of the intersection marked DTS. The upper endpoint of this intersection is found using a similar procedure. This algorithm will always find the smallest single intersection containing points in at least one of the original <$Em~-~f> confidence intervals as long as the number of incorrect clocks is less than half the total <$Ef~<<~m over 2>. However, some points in the intersection may not be contained in all <$Em~-~f> of the original intervals; moreover, some or all of the calculated times (such as for C in Figure 16) may lie outside the intersection. In the NTP clock-selection
procedure the above algorithm is modified so as to include at least <$Em~-~f> of the calculated times. In the modified algorithm a counter c is initialized at zero. When starting from either endpoint, c is incremented at each calculated time; however, neither f nor c are reset between finding the low and high endpoints of the intersection. If after both endpoints have been found <$Ec~>>~f>, f is increased by one and the entire procedure is repeated. The revised algorithm finds the smallest intersection of <$Em~-~f> intervals containing at least <$Em~-~f> calculated times. As shown in Figure 16, the modified algorithm produces the intersection marked NTP and including the calculated time for C. In the NTP clock-selection procedure the peers represented by the clocks in the final intersection, called the survivors, are placed on a candidate list. In the remaining steps of the procedure one or more survivors may be discarded from the list as outlyers. Finally, the clock-combining algorithm described in Appendix F provides a weighted average of the remaining survivors based on synchronization distance. The resulting estimates represent a synthetic peer with offset between the maximum and minimum offsets of the remaining survivors. This defines the clock offset <$ETHETA>, total roundtrip total delay <$EDELTA> and total dispersion <$EEPSILON> which the local clock inherits. In principle, these values could be included in the time interface provided by the operating system to the user, so that the user could evaluate the quality of indications directly. Appendix I. Selected C-Language Program Listings Following are C-language program listings of selected algorithms described in the NTP specification. While these have been tested as part of a software simulator using data collected in regular operation, they do not necessarily represent a standard implementation, since many other implementations could in principle conform to the NTP specification. Common Definitions and Variables The following definitions are common to all procedures and peers. #define NMAX 40 /* max clocks */ #define FMAX 8 /* max filter size */ #define HZ 1000 /* clock rate */ #define MAXSTRAT 15 /* max stratum */ #define MAXSKEW 1 /* max skew error per MAXAGE */ #define MAXAGE 86400 /* max clock age */ #define MAXDISP 16 /* max dispersion */ #define MINCLOCK 3 /* min survivor clocks */ #define MAXCLOCK 10 /* min candidate clocks */ #define FILTER .5 /* filter weight
*/ #define SELECT .75 /* select weight */ The folowing are peer state variables (one set for each peer). double filtp[NMAX][FMAX]; /* offset samples */ double fildp[NMAX][FMAX]; /* delay samples */ double filep[NMAX][FMAX]; /* dispersion samples */ double tp[NMAX]; /* offset */ double dp[NMAX]; /* delay */ double ep[NMAX]; /* dispersion */ double rp[NMAX]; /* last offset */ double utc[NMAX]; /* update tstamp */ int st[NMAX]; /* stratum */ The following are system state variables and constants. double rho = 1./HZ; /* max reading error */ double phi = MAXSKEW/MAXAGE; /* max skew rate */ double bot, top; /* confidence interval limits */ double theta; /* clock offset */ double delta; /* roundtrip delay */ double epsil; /* dispersion */ double tstamp; /* current time */ int source; /* clock source */ int n1, n2; /* min/max clock ids */ The folowing are temporary lists shared by all peers and procedures. double list[3*NMAX]; /* temporary list*/ int index[3*NMAX]; /* index list */ Clock<196>Filter Algorithm /* clock filter algorithm n = peer id, offset = sample offset, delay = sample delay, disp = sample dispersion; computes tp[n] = peer offset, dp[n] = peer delay, ep[n] = peer dispersion
*/ void filter(int n, double offset, double delay, double disp) { int i, j, k, m; /* int temps */ double x; /* double temps */ for (i = FMAX<196>1; i >> 0; i<196> <196>) { /* update/shift filter */ filtp[n][i] = filtp[n][i<196>1]; fildp[n][i] = fildp[n][i<196>1]; filep[n][i] = filep[n][i<196>1]+phi*(tstamp<196>utc[n]); } utc[n] = tstamp; filtp[n][0] = offset<196>tp[0]; fildp[n][0] = delay; filep[n][0] = disp; m = 0; /* construct/sort temp list */ for (i = 0; i << FMAX; i++) { if (filep[n][i] >>= MAXDISP) continue; list[m] = filep[n][i]+fildp[n][i]/2.; index[m] = i; for (j = 0; j << m; j++) { if (list[j] >> list[m]) { x = list[j]; k = index[j]; list[j] = list[m]; index[j] = index[m]; list[m] = x; index[m] = k; } } m = m+1; } if (m <<= 0) ep[n] = MAXDISP; /* compute filter dispersion */ else { ep[n] = 0; for (i = FMAX<196>1; i >>= 0; i<196> <196>) { if (i << m) x = fabs(filtp[n][index[0]]<196>filtp[n][index[i]]); else x = MAXDISP; ep[n] = FILTER*(ep[n]+x); } i = index[0]; ep[n] = ep[n]+filep[n][i]; tp[n] = filtp[n][i]; dp[n] = fildp[n][i]; } return; } Interval Intersection Algorithm /*
compute interval intersection computes bot = lowpoint, top = highpoint (bot >> top if no intersection) */ void dts() { int f; /* intersection ceiling */ int end; /* endpoint counter */ int clk; /* falseticker counter */ int i, j, k, m, n; /* int temps */ double x, y; /* double temps */ m = 0; i = 0; for (n = n1; n <<= n2; n++) { /* construct endpoint list */ if (ep[n] >>= MAXDISP) continue; m = m+1; list[i] = tp[n]<196>dist(n); index[i] = <196>1; /* lowpoint */ for (j = 0; j << i; j++) { if ((list[j] >> list[i]) || ((list[j] == list[i]) && (index[j] >> index[i]))) { x = list[j]; k = index[j]; list[j] = list[i]; index[j] = index[i]; list[i] = x; index[i] = k; } } i = i+1; list[i] = tp[n]; index[i] = 0; /* midpoint */ for (j = 0; j << i; j++) { if ((list[j] >> list[i]) || ((list[j] == list[i]) && (index[j] >> index[i]))) { x = list[j]; k = index[j]; list[j] = list[i]; index[j] = index[i]; list[i] = x; index[i] = k; } } i = i+1; list[i] = tp[n]+dist(n); index[i] = 1; /* highpoint */ for (j = 0; j << i; j++) { if ((list[j] >> list[i]) || ((list[j] == list[i]) && (index[j] >> index[i]))) { x = list[j]; k = index[j]; list[j] =
list[i]; index[j] = index[i]; list[i] = x; index[i] = k; } } i = i+1; } if (m <<= 0) return; for (f = 0; f << m/2; f++) { /* find intersection */ clk = 0; end = 0; /* lowpoint */ for (j = 0; j << i; j++) { end = end<196>index[j]; bot = list[j]; if (end >>= (m<196>f)) break; if (index[j] == 0) clk = clk+1; } end = 0; /* highpoint */ for (j = i<196>1; j >>= 0; j<196> <196>) { end = end+index[j]; top = list[j]; if (end >>= (m<196>f)) break; if (index[j] == 0) clk = clk+1; } if (clk <<= f) break; } return; } Clock<196>Selection Algorithm /* select best subset of clocks in candidate list bot = lowpoint, top = highpoint; constructs index = candidate index list, m = number of candidates, source = clock source, theta = clock offset, delta = roundtrip delay, epsil = dispersion */ void select() { double xi; /* max select dispersion */ double eps; /* min peer dispersion */ int i, j, k, n; /* int temps */ double x, y, z; /* double temps */ m = 0; for (n = n1; n <<= n2; n++) { /* make/sort candidate list */ if ((st[n] >> 0) && (st[n] << MAXSTRAT) && (tp[n] >>=
bot) && (tp[n] <<= top)) { list[m] = MAXDISP*st[n]+dist(n); index[m] = n; for (j = 0; j << m; j++) { if (list[j] >> list[m]) { x = list[j]; k = index[j]; list[j] = list[m]; index[j] = index[m]; list[m] = x; index[m] = k; } } m = m+1; } } if (m <<= 0) { source = 0; return; } if (m >> MAXCLOCK) m = MAXCLOCK; while (1) { /* cast out falsetickers */ xi = 0.; eps = MAXDISP; for (j = 0; j << m; j++) { x = 0.; for (k = m<196>1; k >>= 0; k<196> <196>) x = SELECT*(x+fabs(tp[index[j]]<196>tp[index[k]])); if (x >> xi) { xi = x; i = j; /* max(xi) */ } x = ep[index[j]]+phi*(tstamp<196>utc[index[j]]); if (x << eps) eps = x; /* min(eps) */ } if ((xi <<= eps) || (m <<= MINCLOCK)) break; if (index[i] == source) source = 0; for (j = i; j << m<196>1; j++) index[j] = index[j+1]; m = m<196>1; } i = index[0]; /* declare winner */ if (source != i) if (source == 0) source = i; else if (st[i] << st[source]) source = i; theta = combine(); delta = dp[i]; epsil = ep[i]+phi*(tstamp<196>utc[i])+xi; return; } Clock<196>Combining Procedure /*
compute weighted ensemble average index = candidate index list, m = number of candidates; returns combined clock offset */ double combine() { int i; /* int temps */ double x, y, z; /* double temps */ z = 0. ; y = 0.; for (i = 0; i << m; i++) { /* compute weighted offset */ j = index[i]; x = dist(j)); z = z+tp[j]/x; y = y+1./x; } return z/y; /* normalize */ } Subroutine to Compute Synchronization Distance /* compute synchronization distance n = peer id; returns synchronization distance */ double dist(int n) { return ep[n]+phi*(tstamp<196>utc[n])+fabs(dp[n])/2.; } Security considerations see Section 3.6 and Appendix C Author's address David L. Mills Electrical Engineering Department University of Delaware Newark, DE 19716 Phone (302) 451<196>8247 EMail mills@udel.edu