Re: time() glitch on 2.4.18: solved

Jim Paris (jim@jtan.com)
Sun, 3 Nov 2002 14:32:16 -0500


> The problem is with time(). Every second, for approximately 1.1ms,
> time() reports a value that is about 2^32 microseconds (4295 seconds,
> or about an hour and a quarter) in the future. The glitches always
> occur between a change of seconds.

I finally found the problem.

My i8253 was out of phase. The 16-bit timer value is read in two
8-bit reads from the 8253 in arch/i386/kernel/time.c, and this value
should be between 0 and LATCH-1. My kernel was getting the MSB and
LSB reversed, and so the read values were usually too high, and
delay_at_last_interrupt ended up negative. As a result, a small,
effectively random amount was subtracted from usecs in
do_gettimeofday, so my clock kept making small jumps backwards, and
occasionally jumped forward by 2^32 microseconds when usecs was small
enough for the subtraction to wrap around.
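
The mechanism described above can be sketched as a small userspace
model. This is not the kernel code: struct pit_model, pit_read_byte,
pit_read_count, delay_usecs and add_offset are hypothetical names, and
LATCH = 11932 and TICK_SIZE = 10000 are assumed values for a 100 Hz
i386 kernel. The delay formula is also an assumption, written only to
show how a count above LATCH-1 turns into a negative offset.

```c
#include <stdint.h>

#define LATCH     11932   /* assumed: 1193180 Hz PIT clock / HZ, HZ = 100 */
#define TICK_SIZE 10000   /* assumed: microseconds per timer tick */

/* Hypothetical model of one 8253 channel read as LSB, then MSB. */
struct pit_model {
    uint16_t latched;  /* counter value being read out */
    int msb_next;      /* 0: next read returns LSB, 1: next returns MSB */
};

/* Stand-in for a one-byte read of port 0x40; toggles the read phase. */
static uint8_t pit_read_byte(struct pit_model *pit)
{
    uint8_t byte = pit->msb_next ? (uint8_t)(pit->latched >> 8)
                                 : (uint8_t)(pit->latched & 0xff);
    pit->msb_next = !pit->msb_next;
    return byte;
}

/* Two 8-bit reads; the caller assumes the first byte is the LSB. */
static int pit_read_count(struct pit_model *pit)
{
    int count = pit_read_byte(pit);       /* treated as LSB */
    count |= pit_read_byte(pit) << 8;     /* treated as MSB */
    return count;
}

/* Assumed conversion from counter value to microseconds since the tick. */
static int delay_usecs(int count)
{
    int d = ((LATCH - 1) - count) * TICK_SIZE;
    return (d + LATCH / 2) / LATCH;       /* negative when count > LATCH-1 */
}

/* Adding a negative delay to a small unsigned usecs wraps modulo 2^32. */
static uint32_t add_offset(uint32_t usecs, int delay)
{
    return usecs + (uint32_t)delay;
}
```

With the model in phase (msb_next = 0), a counter value of 0x0150 is
read back as 0x0150. Out of phase (msb_next = 1), the same value is
read as 0x5001, which is above LATCH, so delay_usecs() goes negative
and add_offset() on a small usecs lands just below 2^32.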

To fix it, I just read a single byte from port 0x40. If I read
another single byte, the problem returns (I've confirmed this on
multiple systems, so it's not just a quirk of my 8253).

After 180 days of uptime, it's not surprising that there would have
been one read of the port that failed, triggering the problem, so I
think the kernel should detect and fix this. We could just check for
it: if the returned count > LATCH, read an extra byte from port 0x40,
as I did. Or, use the method in do_slow_gettimeoffset, which
basically resets the 8253's counter if count > LATCH.
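
The first option could look roughly like the following, again as a
userspace model and not a kernel patch. struct pit_model,
pit_read_byte, pit_read_count and pit_read_count_checked are
hypothetical names, LATCH = 11932 is an assumed value for a 100 Hz
i386 kernel, and the final clamp to LATCH-1 is an assumption added as
a fallback.

```c
#include <stdint.h>

#define LATCH 11932   /* assumed: 1193180 Hz PIT clock / HZ, HZ = 100 */

/* Hypothetical model of one 8253 channel read as LSB, then MSB. */
struct pit_model {
    uint16_t latched;  /* counter value being read out */
    int msb_next;      /* 0: next read returns LSB, 1: next returns MSB */
};

/* Stand-in for a one-byte read of port 0x40; toggles the read phase. */
static uint8_t pit_read_byte(struct pit_model *pit)
{
    uint8_t byte = pit->msb_next ? (uint8_t)(pit->latched >> 8)
                                 : (uint8_t)(pit->latched & 0xff);
    pit->msb_next = !pit->msb_next;
    return byte;
}

/* Two 8-bit reads; the caller assumes the first byte is the LSB. */
static int pit_read_count(struct pit_model *pit)
{
    int count = pit_read_byte(pit);
    count |= pit_read_byte(pit) << 8;
    return count;
}

/* If the count is implausible, read one extra byte to flip the phase. */
static int pit_read_count_checked(struct pit_model *pit)
{
    int count = pit_read_count(pit);

    if (count > LATCH) {
        (void)pit_read_byte(pit);      /* extra read: back to LSB-first */
        count = pit_read_count(pit);
        if (count > LATCH)
            count = LATCH - 1;         /* still implausible: clamp */
    }
    return count;
}
```

One limitation of this check: a swapped value is not always above
LATCH (0x0101 reads the same either way, and 0x2000 swaps to 0x0020),
so a single out-of-phase read can slip through. The phase stays wrong
until a later read produces a count above LATCH and triggers the fix.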

Any comments?

-jim