Re: [PATCH] fix loop with disabled tasklets

Andrea Arcangeli (andrea@suse.de)
Mon, 12 Nov 2001 14:57:38 +0100


On Mon, Nov 12, 2001 at 08:42:27AM +0100, Mathijs Mohlmann wrote:
>
>
> On Mon, 12 Nov 2001, Andrea Arcangeli wrote:
> > Mathijs, can you verify that? If my theory is right need_resched isn't
> > set even if ksoftirqd loops forever. It could be one of those two
> > possibilities:
> >
> > 1) the timer irq isn't running yet
> > 2) "current" isn't functional
>
> well, i'm at work now, no sparc32's here. But during my debugging i made
> a piece of code that counted to times a process was selected by schedule.
> A custom SysRq key read the values. What i saw during the deadlock was
> that there where three processes running allready and scheduling seemed to
> work find. keventd got selected everytime i pressed SysRq and ksoftirqd

so "current" was _always_ "keventd" every time you press SYSRQ+P? sounds
like you deadlocked in keventd then.

> got selected during idle time.
>
> This (to me) proves that "current" is working just fine. Not quite sure
> about the timer irq. I will look at that tonight. But i'm pretty sure
> need_resched is set again and again.
>
> Even if the timer irq is working fine, the sun should not enable the
> keyboard irq without the tasklet being enabled. Initializing the keyboard

scheduling the tasklet without it being enabled is perfect, there's no
problem at all if the scheduler is functional by the time
spwan_ksoftirqd is executed.

All issues mentioned so far about the softirq should only to deal with
performance, not with deadlocking. If it deadlocks that's a sparc32 bug,
messing the softirq code can only hide the real bug.

> tasklet enabled got the sun to boot just fine for me.
>
> But i feel that this is fixing the symptoms. If this turns out to be the
> problem, it should be fixed, but i feel the softirq code needs some good
> looking over with respect to disabled tasklets as well. (tasklet_enable

As we said several times so far, tasklets should remain disabled only
for a little time during production (aka during the benchmarks), and
it's a matter of the user to tasklet_kill only enabled tasklets (it's
like quitting a function without a local_irq_restore, no surprise if irq
remains stuck if you forget to reenable irq/tasklets).

As Linus said tasklet_disable is the same thing of __cli() (or if you
prefer as Alexey said, it's like local_bh_disable()), you shouldn't __cli()
or local_bh_disable() for a long time, it's worthless to optimize for
very long critical sections, tasklet_disable is just to serialize within
a critical section. The boot case doesn't matter either, it doesn't
matter if at boot we waste a couple of cpu cycles, boot time won't
change anyways because of that (boot time is all cpu idle time waiting
those scsi things).

But again, those softirq issues are only performance issues, they cannot
explain a deadlock in spwan_ksoftirqd, that is a sparc32 bug, period.

So please fix the bug, then we can discuss the rest, but whatever we
change it won't be for the boot process, the only thing that matters is
production, even if ksoftirqd loops for seconds during the boot process
it doesn't matter at all.

> without a tasket_schedule and then a tasklet_kill will loop as well,
> wont it?)
>
> me (everything IMVHO)

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/