Open Source and information security mailing list archives
Message-Id: <200905292026.46093.rjw@sisk.pl>
Date:	Fri, 29 May 2009 20:26:45 +0200
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Pavel Machek <pavel@....cz>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	kernel list <linux-kernel@...r.kernel.org>,
	akpm@...ux-foundation.org, fengguang.wu@...el.com,
	nigel@...onice.net, mm-commits@...r.kernel.org
Subject: Re: breaking drivers with low probability Re: [merged] pm-suspend-do-not-shrink-memory-before-suspend.patch removed from -mm tree

On Friday 29 May 2009, Pavel Machek wrote:
> 
> On Fri 2009-05-29 00:32:07, Rafael J. Wysocki wrote:
> > On Thursday 28 May 2009, Pavel Machek wrote:
> > > 
> > > On Thu 2009-05-28 23:14:41, Rafael J. Wysocki wrote:
> > > > On Thursday 28 May 2009, Pavel Machek wrote:
> > > > > 
> > > > > > > > > ...i.e. 0 pages free. OTOH... I don't think you audited all the
> > > > > > > > > > drivers to verify they can handle it, nor have you attempted to contact all
> > > > > > > > > > the driver authors to warn them that their suspend/resume routines can now
> > > > > > > > > be called with 0 free pages.
> > > > > > > > 
> > > > > > > > Are you sure we can actually get to this point with 0 free pages?
> > > > > > > 
> > > > > > > If I recall how mm works; yes I believe it is possible to hit this
> > > > > > > with 0 free pages if you are unlucky. (Heavy memory pressure with some
> > > > > > > network packet storm just before suspend...).
> > > > > > > 
> > > > > > > Do you think 0 pages free here is impossible?
> > > > > > 
> > > > > > I think it's just extremely unlikely, which is why I'm asking for a test case.
> > > > > > If you have one, we can see what it takes to trigger and put a safeguard
> > > > > > against _that_.
> > > > > 
> > > > > No, I do not have a test case, and I agree that it is quite
> > > > > unlikely. But I dislike adding bugs in unlikely cases.
> > > > > 
> > > > > > > If so, what do you think minimum number of free pages here is and why?
> > > > > > 
> > > > > > Seriously, I don't know.  Only the drivers know how much memory they are
> > > > > > going to need and _they_ should allocate it in advance.  When we get to
> > > > > > their suspend callbacks it's already too late.
> > > > > 
> > > > > Tell that to the driver authors. At least one driver does allocate in
> > > > > _suspend(), and probably more.
> > > > > 
> > > > > > Still, even if I knew, I think it would be better to just allocate that memory
> > > > > > before we freeze tasks and then free it instead of using the current approach.
> > > > > 
> > > > > Agreed, it would be better. 
> > > > > 
> > > > > OTOH providing 4MB as a safety area for the drivers that don't do that
> > > > > seems quite reasonable. Deleting the safety area would be fine, but I
> > > > > > believe we need to fix the drivers first, or at least ask driver
> > > > > > writers to get them fixed.
> > > > 
> > > > Or perhaps we can see if it's really necessary.
> > > 
> > > How? We already know this bug is pretty unlikely to be caught by testing.
> > > 
> > > > > IOW I believe the patch should be reverted.
> > > > 
> > > > Linus is supporting this change and it's going to be easy enough to revert if
> > > > it's confirmed to cause any problems.  Which I seriously doubt.
> > > 
> > > I already found one bug you introduced... by code inspection. (Will
> > > you at least fix that?).
> > 
> > No, you didn't.  You only pointed out that there may be a problem in certain
> > circumstances, but the probability of these circumstances happening in
> > practice is close to zero.
> 
> IOW you added a bug that is hard to trigger.
> 
> > > I'm pretty sure there are more. You tell me
> > > that "it can be reverted if it proves problematic". 
> > > 
> > > I already proved it problematic by code inspection.
> > 
> > No, you didn't prove anything.  Sorry.
> 
> Would you explain how much memory is guaranteed to be free for
> drivers?  We know video/s1d13xxxfb.c needs some memory.
> 
> > > Please revert it.
> > 
> > If I know the exact mechanism by which we can exhaust memory before suspend
> > so that casual allocations with kmalloc() from drivers' suspend callbacks will fail.
> > Possible failure scenario, perhaps?
> 
> Just
> 
> 0) create memory pressure from userland so that free memory goes down
> to min_free_kbytes (GFP_KERNEL allocations)
> 
> 1) hit network driver over fast enough network to eat remaining memory
> with GFP_ATOMIC allocations
> 
> 2) suspend with video/s1d13xxxfb.c loaded and your patch.

Well, you don't need video/s1d13xxxfb.c for this test.  Just put
kmalloc(something) into any driver's ->suspend() routine and the
corresponding kfree() into its ->resume().
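
For illustration only, here is a userspace toy model of steps 0) to 2) quoted
above, combined with the kmalloc()-in-->suspend() test. It is a sketch, not
kernel code: toy_pool, toy_alloc_pages(), toy_suspend(), toy_resume() and
run_scenario() are all hypothetical names, and the 4 KiB page size used to turn
the 4MB safety area into 1024 pages is an assumption. The model ignores reclaim
completely and lets atomic allocations drain the pool to zero, so it encodes
the pessimistic scenario under discussion rather than showing that the real
allocator can reach it.

```c
#include <assert.h>
#include <errno.h>

/* Toy page pool: two counters, not the real page allocator. */
struct toy_pool {
	long free_pages;	/* pages currently free */
	long min_pages;		/* stand-in for the min_free_kbytes watermark */
};

enum toy_gfp { TOY_GFP_KERNEL, TOY_GFP_ATOMIC };

/*
 * Take `pages` from the pool.  Ordinary allocations stop at the watermark;
 * atomic ones are (by assumption) allowed to go all the way down to zero.
 * Returns 0 on success, -ENOMEM on failure.
 */
static int toy_alloc_pages(struct toy_pool *pool, long pages, enum toy_gfp gfp)
{
	long floor = (gfp == TOY_GFP_ATOMIC) ? 0 : pool->min_pages;

	if (pool->free_pages - pages < floor)
		return -ENOMEM;
	pool->free_pages -= pages;
	return 0;
}

static void toy_free_pages(struct toy_pool *pool, long pages)
{
	assert(pages >= 0);
	pool->free_pages += pages;
}

/* Hypothetical driver that allocates in suspend and frees in resume. */
struct toy_driver {
	long pages_needed;
	int holding;		/* nonzero while the suspend buffer is held */
};

static int toy_suspend(struct toy_driver *drv, struct toy_pool *pool)
{
	int err = toy_alloc_pages(pool, drv->pages_needed, TOY_GFP_KERNEL);

	if (err)
		return err;
	drv->holding = 1;
	return 0;
}

static void toy_resume(struct toy_driver *drv, struct toy_pool *pool)
{
	if (drv->holding) {
		toy_free_pages(pool, drv->pages_needed);
		drv->holding = 0;
	}
}

/*
 * Run steps 0) to 2).  `shrink_pages` is how many pages are freed before
 * the driver's suspend callback runs: 0 models "no shrinking before
 * suspend", 1024 models a 4MB safety area with 4 KiB pages.
 * Returns what the driver's suspend callback returned.
 */
static int run_scenario(long total_pages, long min_pages, long shrink_pages)
{
	struct toy_pool pool = { total_pages, min_pages };
	struct toy_driver drv = { 1, 0 };
	long user_held = 0;
	int err;

	/* 0) userland pressure: ordinary allocations down to the watermark */
	while (toy_alloc_pages(&pool, 1, TOY_GFP_KERNEL) == 0)
		user_held++;

	/* 1) packet storm: atomic allocations eat what is left */
	while (toy_alloc_pages(&pool, 1, TOY_GFP_ATOMIC) == 0)
		;

	/* optional shrinking: give back at most what userland was holding */
	if (shrink_pages > user_held)
		shrink_pages = user_held;
	toy_free_pages(&pool, shrink_pages);

	/* 2) suspend, then resume to undo the allocation */
	err = toy_suspend(&drv, &pool);
	toy_resume(&drv, &pool);
	return err;
}
```

In this model run_scenario(4096, 512, 0) makes the suspend callback fail with
-ENOMEM, while run_scenario(4096, 512, 1024) lets it succeed. Whether steps 0)
and 1) can really leave the machine in that state is exactly what a test with
a real driver would have to show.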

So, have you tried it?  That would have been your test case, wouldn't it?

Best,
Rafael