Re: (reiserfs hang at boot) where is the kernel debugger?

Rik van Riel (riel@conectiva.com.br)
Sun, 1 Oct 2000 23:50:17 -0300 (BRST)


On Sun, 1 Oct 2000, David Ford wrote:
> Rik van Riel wrote:
>
> > > How broken is it? I have a test9-pre7 system that exhibits an
> > > elusive bug, reiserfs hangs at boot time, and all I need is a
> > > backtrace on the D state processes.
> >
> > Could be a VM bug. ;)
>
> It could, but I strongly doubt it. We've seen this bug [very]
> infrequently for the last year.

OK, then it's almost certainly not VM related...
(which also means I can't fix it in little time)

> I'm rather hesitant to trust my findings, kdb says all the
> processes are in schedule+nn. Yes, it can be related but I'm a
> wee bit dubious.

schedule() is simply the last kernel function each process
entered before it got scheduled away ;)

The second-to-last function is the one you're interested
in ...
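
As a rough illustration of that rule, here is a minimal Python sketch
that skips the scheduler frame and reports the next one. It assumes the
frames are listed innermost first, as in the kdb summary quoted below;
the helper name and the frame format are hypothetical, not kdb output.

```python
# Hypothetical helper: frames are listed innermost first (PC, -1, -2, ...).
SCHEDULER_FRAMES = {"schedule"}

def first_interesting_frame(frames):
    """Return the innermost frame that is not the scheduler, or None."""
    for name in frames:
        # Drop any "+0x..." offset and trailing "()" so that
        # "schedule+0x1a" and "schedule()" both reduce to "schedule".
        base = name.split("+", 1)[0].strip().rstrip("()")
        if base not in SCHEDULER_FRAMES:
            return base
    return None

if __name__ == "__main__":
    print(first_interesting_frame(["schedule()", "down()", "down_fail()"]))
```

Every sleeping process shows schedule() on top, so that frame carries
no information; the frame beneath it says what the process was waiting
for.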

> I'd post all my trace findings but I really don't want to cross
> type all that right now -- I left my serial cable at the office.
> I'll post the traces tomorrow. The simple version of them is as
> follows:
>
> PC: schedule()
> -1: down()
> -2: down_fail()
> ...
>
> Some processes have devfs calls following this, some have
> typical kernel init calls etc., but the common factor in all of
> them is they all sit in schedule and they all have the same PC
> location.

Then I guess something was trying to take the same
semaphore twice and deadlocked, taking the rest of
the system with it...
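
To show what "taking the same semaphore twice" means, here is a small
user-space sketch in Python. It is only an analogy: threading.Semaphore
stands in for a kernel semaphore, the function name is made up, and a
non-blocking probe replaces the second down() so the example terminates
instead of hanging.

```python
import threading

def second_acquire_would_block():
    """Take a binary semaphore, then probe whether taking it again would block."""
    sem = threading.Semaphore(1)   # one slot, used here like a mutex
    sem.acquire()                  # first "down()": succeeds, count drops to 0
    # Non-blocking probe in place of a second "down()". A blocking call
    # here would sleep forever, since only this thread could release.
    got_it = sem.acquire(blocking=False)
    if got_it:
        sem.release()              # undo the probe if it unexpectedly succeeded
    sem.release()                  # release the first hold
    return not got_it

if __name__ == "__main__":
    print(second_acquire_would_block())
```

In the kernel case the second down() has no such escape hatch: the
holder sleeps in schedule() waiting on itself, and every other process
that needs the semaphore queues up behind it.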

> During normal operation of the machine, -T shows processes
> having PCs of 0x00000000 and 0x7f000000 which strikes me as a
> bit odd.
>
> For example:
>
> sshd S 7FFFFFFF 0 247 88 248 (NOTLB)
> 121
> sig: 0 0000000000000000 0000000000000000 : X
> bash S 00000000 0 248 247 263 (NOTLB)
> sig: 0 0000000000000000 0000000000010000 : X

Sysrq-T is broken on x86 ;((((((((

(very much to my dismay ... this is one of the best
debugging helps we have^Whad and I could have used
it quite well)

regards,

Rik

--
"What you're running that piece of shit Gnome?!?!"
       -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/ http://www.surriel.com/
