linux-kernel - Re: 2.6.22-rc6 spurious hangs

Message-ID: <d120d5000706290959w2feed4eehec7f5a115fc5de04@mail.gmail.com>
Date:	Fri, 29 Jun 2007 12:59:46 -0400
From:	"Dmitry Torokhov" <dmitry.torokhov@...il.com>
To:	"Ingo Molnar" <mingo@...e.hu>
Cc:	"Oleg Nesterov" <oleg@...sign.ru>,
	"Thomas Sattler" <tsattler@....de>,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
	"Alan Cox" <alan@...rguk.ukuu.org.uk>,
	"Daniel Mack" <daniel@...u.de>,
	"Holger Waechtler" <holger@...u.de>,
	"Mauro Carvalho Chehab" <mchehab@...radead.org>,
	"Mariusz Kozlowski" <m.kozlowski@...land.pl>,
	v4l-dvb-maintainer@...uxtv.org
Subject: Re: 2.6.22-rc6 spurious hangs

On 6/29/07, Ingo Molnar <mingo@...e.hu> wrote:
>
> * Oleg Nesterov <oleg@...sign.ru> wrote:
>
> > > > ->disconnect_pending is used without any locks/barriers, perhaps
> > > > this is the reason.
> >
> > I misread cinergyt2_release, it checks !->disconnect_pending, so it is
> > very clear why cinergyt2_query_rc() tries to take the mutex.
> >
> > > > I'll try to look further tomorrow. In any case, cinergyT2 should not
> > > > use flush_scheduled_work() at all.
> > >
> > > would the hack below be worth trying, to see whether there are any
> > > further problems?
> [...]
> > I don't think we can just kill flush_scheduled_work(). We can use
> > cancel_rearming_delayed_work() instead of
> > cancel_delayed_work()+flush_scheduled_work()
> >
> > Still we can't do this under cinergyt2->sem, because cinergyt2_query()
> > takes it too. This all looks very wrong to me, I hope maintaners can
> > explain.
>
> i've Cc:-ed the maintainers.
>

Well, I'm not really the maintainer, but I think the short-term solution
(at least for the RC part) is to alter cinergyt2_query_rc() to take
cinergyt2->sem only around cinergyt2_command(). The rest of the
polling function need not be protected as it does not run concurrently
with itself.
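
To illustrate the narrower locking, here is a rough userspace sketch with
a pthread mutex standing in for cinergyt2->sem. It is not the driver
code: struct rc_dev, rc_command() and rc_query() are hypothetical names
made up for this example, and rc_command() only mimics the role of
cinergyt2_command().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; not the real struct. */
struct rc_dev {
	pthread_mutex_t lock;	/* plays the role of cinergyt2->sem */
	int hw_value;		/* pretend device register, shared with other paths */
	int last_key;		/* touched only by the polling function */
};

/* Stand-in for cinergyt2_command(); the caller must hold dev->lock. */
static int rc_command(struct rc_dev *dev, int *key)
{
	assert(key != NULL);
	*key = dev->hw_value;
	return 0;
}

/* One polling pass: the lock covers only the device command. */
static int rc_query(struct rc_dev *dev)
{
	int key;
	int err;

	pthread_mutex_lock(&dev->lock);
	err = rc_command(dev, &key);
	pthread_mutex_unlock(&dev->lock);
	if (err)
		return err;

	/*
	 * No lock needed here: the poller never runs concurrently with
	 * itself, and last_key is private to it.
	 */
	dev->last_key = key;
	return 0;
}
```

The point is only that the lock is dropped again before the poller
processes the result, so it is held for as short a time as possible.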

The longer-term solution is to convert cinergyt2 to use the polled input
device framework (as in the attached patch - untested). Unfortunately it
depends on adding suspend/resume support to polled devices now that
workqueues are once again not being frozen.

Overall the driver makes me cringe...

- Why does it take cinergyt2->sem in its cinergyt2_poll() function
before calling poll_wait()? Most of the places that try to wake up
the polling process (such as FE_SET_FRONTEND) attempt to take the same
mutex and will promptly deadlock unless I am missing something.

- cinergyt2_query() wakes up pollers before altering poll condition.

- cinergyt2->in_use is racy...
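
On the wake-up ordering point, a deliberately simplified, single-threaded
model may help show why the order matters. It assumes the woken poller
re-checks the poll condition right away (as it could on another CPU);
all names below are hypothetical and none of this is driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an event source with one sleeping poller. */
struct event_source {
	bool pending_events;	/* the poll condition */
	bool poller_saw_event;	/* what the poller observed when woken */
};

/* Models the woken poller immediately re-checking the condition. */
static void wake_up_pollers(struct event_source *src)
{
	assert(src != NULL);
	src->poller_saw_event = src->pending_events;
}

/* Wrong order: wake first, update the condition afterwards. */
static bool notify_wake_first(struct event_source *src)
{
	wake_up_pollers(src);
	src->pending_events = true;
	return src->poller_saw_event;
}

/* Right order: update the condition, then wake the pollers. */
static bool notify_update_first(struct event_source *src)
{
	src->pending_events = true;
	wake_up_pollers(src);
	return src->poller_saw_event;
}
```

In the first variant the poller finds nothing to report and goes back
to sleep, so the event is noticed only at some later wake-up, if at all.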

-- 
Dmitry

View attachment "cinergyt-to-polldev.patch" of type "text/plain" (11978 bytes)