linux-kernel - Re: floppy.c soft lockup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20070613161746.GB275@tv-sign.ru>
Date:	Wed, 13 Jun 2007 20:17:46 +0400
From:	Oleg Nesterov <oleg@...sign.ru>
To:	Mark Hounschell <dmarkh@....rr.com>
Cc:	Matt Mackall <mpm@...enic.com>,
	Andrew Morton <akpm@...ux-foundation.org>, markh@...pro.net,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>
Subject: Re: floppy.c soft lockup

Sorry for delay,

On 06/07, Mark Hounschell wrote:
>
> >From an earlier thread member:
> 
> >> Mark writes:
> >> Again I don't understand why flush_scheduled_work() running on behalf
> >> of a process affinitized to processor-1 requires cooperation from
> >> events/2 (affinitized to processor-2)
> >> when there is an events/1 already affinitized to processor 1?
> 
> >Oleg write:
> >flush_workqueue() blocks until any scheduled work on any CPU has run to
> >completion. If we have some work_struct pending on CPU 2, it can be
> >completed only when events/2 executes it.
> 
> Could not flush_scheduled_work() just follow the affinity mask of the
> task that caused the call to begin with. If calling task had a cpu-mask
> of 3 then flush_scheduled_work() would do the events/0 and events/1
> thing and if the calling task had an affinity mask of 1 then only
> events/0 would be done?
> 
> In other words changing what Oleg says above just slightly:
> 
> flush_workqueue() blocks until any scheduled work on any CPU in the
> calling tasks affinity mask has run to completion?

No, we can't do this, this makes flush_workqueue() meaningless.

Even if we could, this can't help. Suppose that a kernel thread takes some
global lock (for example, in our case cache_reap() takes cache_chain_mutex)
and then it is preempted by RT task which doesn't relinquish CPU.

So this problem is "wider", flush_workqueue() was just a random victim.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/