Message-ID: <CAG-2HqVBnXUKSRBrJE=gEKA3St5KMfdgAbx2vRfpF3qw_teLOg@mail.gmail.com>
Date:	Thu, 2 Oct 2014 08:12:37 +0200
From:	Tom Gundersen <teg@...m.no>
To:	"Luis R. Rodriguez" <mcgrof@...e.com>
Cc:	"Luis R. Rodriguez" <mcgrof@...not-panic.com>,
	Michal Hocko <mhocko@...e.cz>,
	Greg KH <gregkh@...uxfoundation.org>,
	Dmitry Torokhov <dmitry.torokhov@...il.com>,
	Takashi Iwai <tiwai@...e.de>, Tejun Heo <tj@...nel.org>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	Robert Milasan <rmilasan@...e.com>, werner@...e.com,
	Oleg Nesterov <oleg@...hat.com>, hare <hare@...e.com>,
	Benjamin Poirier <bpoirier@...e.de>,
	Santosh Rastapur <santosh@...lsio.com>, pmladek@...e.cz,
	dbueso@...e.com, LKML <linux-kernel@...r.kernel.org>,
	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
	Joseph Salisbury <joseph.salisbury@...onical.com>,
	Kay Sievers <kay@...y.org>,
	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
	Tim Gardner <tim.gardner@...onical.com>,
	Pierre Fersing <pierre-fersing@...rref.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Nagalakshmi Nandigama <nagalakshmi.nandigama@...gotech.com>,
	Praveen Krishnamoorthy <praveen.krishnamoorthy@...gotech.com>,
	Sreekanth Reddy <sreekanth.reddy@...gotech.com>,
	Abhijit Mahajan <abhijit.mahajan@...gotech.com>,
	Casey Leedom <leedom@...lsio.com>,
	Hariprasad S <hariprasad@...lsio.com>,
	"mpt-fusionlinux.pdl" <MPT-FusionLinux.pdl@...gotech.com>,
	Linux SCSI List <linux-scsi@...r.kernel.org>,
	netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH v1 5/5] driver-core: add driver asynchronous probe support

On Tue, Sep 30, 2014 at 5:24 PM, Luis R. Rodriguez <mcgrof@...e.com> wrote:
>> > commit e64fae5573e566ce4fd9b23c68ac8f3096603314
>> > Author: Kay Sievers <kay.sievers@...y.org>
>> > Date:   Wed Jan 18 05:06:18 2012 +0100
>> >
>> >     udevd: kill hanging event processes after 30 seconds
>> >
>> >     Some broken kernel drivers load firmware synchronously in the module init
>> >     path and block modprobe until the firmware request is fulfilled.
>> >     <...>
>>
>> This was a workaround to avoid a deadlock between udev and the kernel.
>> The 180 s timeout was already in place before this change, and was not
>> motivated by firmware loading. Also note that this patch was not about
>> "tracking device drivers", just about avoiding dead-lock.
>
> Thanks, can you elaborate on how a deadlock can occur if the kmod
> worker is not at some point sigkilled?

This was only relevant when udev did the firmware loading. modprobe
would wait for the kernel, which would wait for the firmware loading,
which would wait for modprobe. This is no longer a problem as udev
does not do firmware loading any more.
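
To make the circular wait concrete, here is a minimal sketch that models
the chain as a wait-for graph and looks for a cycle. The node names are
illustrative labels only, not udev or kernel identifiers, and the second
graph simply assumes that the firmware load no longer waits on anything
udev does.

```python
def find_cycle(waits_for, start):
    """Follow 'waits for' edges from start; return the cycle as a list, or None."""
    path = []
    seen = {}  # node -> position in path
    node = start
    while node is not None:
        if node in seen:
            # We came back to a node already on the path: that is the cycle.
            return path[seen[node]:] + [node]
        seen[node] = len(path)
        path.append(node)
        node = waits_for.get(node)
    return None


# Hypothetical model of the old situation described above.
old_udev = {
    "modprobe": "kernel module init",
    "kernel module init": "firmware load",
    "firmware load": "modprobe",  # udev could not serve the request while blocked
}

# Assumed model once udev no longer does firmware loading: the chain just ends.
without_udev_firmware = {
    "modprobe": "kernel module init",
    "kernel module init": "firmware load",
}

print(find_cycle(old_udev, "modprobe"))
print(find_cycle(without_udev_firmware, "modprobe"))
```

The first call reports the modprobe -> kernel -> firmware -> modprobe loop;
the second finds no cycle.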

> Is the issue that if there is no extra worker available and all are
> idling on sleep / synchronous long work boot will potentially hang
> unless a new worker becomes available to do more work?

Correct.

> If so I can
> see the sigkill helping for hanging tasks but it doesn't necessarily
> mean its a good idea to kill modules loading taking a while. Also
> what if the sigkill is just avoided for *just* kmod workers?

Depending on the number of devices you have, I suppose we could still
exhaust the workers.
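
A toy simulation of that exhaustion, purely as a sketch: the pool size,
the durations, the `exempt` set and the function name are all made up for
illustration and do not reflect how udevd schedules its workers.

```python
import heapq


def unstarted_events(num_workers, durations, kill_after=None,
                     exempt=frozenset(), horizon=1000):
    """Return indices of queued events that never get a worker before `horizon`.

    durations[i] is how long event i blocks its worker; None means it never
    returns. kill_after, if set, frees the worker after that many seconds,
    except for events listed in `exempt` (e.g. hypothetical kmod events).
    """
    free_at = [0] * num_workers  # min-heap: time at which each worker is free
    heapq.heapify(free_at)
    unstarted = []
    for i, duration in enumerate(durations):
        start = heapq.heappop(free_at)
        if start >= horizon:
            # No worker frees up in time; the event stays queued.
            unstarted.append(i)
            heapq.heappush(free_at, start)
            continue
        busy = float("inf") if duration is None else duration
        if kill_after is not None and i not in exempt:
            busy = min(busy, kill_after)
        heapq.heappush(free_at, start + busy)
    return unstarted


# Two workers, two events that hang forever, then two quick events.
events = [None, None, 1, 1]
print(unstarted_events(2, events))                                  # no kill
print(unstarted_events(2, events, kill_after=30))                   # kill all
print(unstarted_events(2, events, kill_after=30, exempt={0, 1}))    # spare "kmod"
```

With the kill in place every event gets a worker; without it, or with the
hanging events exempted, the two quick events are never started.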

>> The way I see it, the current status from systemd's side is: our
>> short-term work-around is to increase the timeout, and at the moment
>> it appears no long-term solution is needed (i.e., it seems like the
>> right thing to do is to make sure insmod can be near instantaneous, it
>> appears people are working towards this goal, and so far no examples
>> have cropped up showing that it is fundamentally impossible (once/if
>> they do, we should of course revisit the problem)).
>
> That again would be reactive behaviour, what would prevent avoiding the
> sigkill only for kmod workers? Is it known the deadlock is immiment?
> If the amount of workers for kmod that would hit the timeout is
> considered low I don't see how that's possible and why not just lift
> the sigkill.

Making kmod a special case is of course possible. However, as long as
there is no fundamental reason why kmod should get this special
treatment, this just looks like a work-around to me. We already have a
work-around, which is to increase the global timeout. If you still
think we should do something different in systemd, it is probably best
to take the discussion to systemd-devel to make sure all the relevant
people are involved.

Cheers,

Tom