Message-ID: <20090528224756.GJ11477@elf.ucw.cz>
Date:	Fri, 29 May 2009 00:47:56 +0200
From:	Pavel Machek <pavel@....cz>
To:	"Rafael J. Wysocki" <rjw@...k.pl>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	kernel list <linux-kernel@...r.kernel.org>,
	akpm@...ux-foundation.org, fengguang.wu@...el.com,
	nigel@...onice.net, mm-commits@...r.kernel.org
Subject: Re: breaking drivers with low probability Re: [merged]
	pm-suspend-do-not-shrink-memory-before-suspend.patch removed from
	-mm tree


On Fri 2009-05-29 00:32:07, Rafael J. Wysocki wrote:
> On Thursday 28 May 2009, Pavel Machek wrote:
> > 
> > On Thu 2009-05-28 23:14:41, Rafael J. Wysocki wrote:
> > > On Thursday 28 May 2009, Pavel Machek wrote:
> > > > 
> > > > > > > > ...i.e. 0 pages free. OTOH... I don't think you audited all the
> > > > > > > > drivers to verify they can handle it, nor did you attempt to contact all
> > > > > > > > the driver authors to warn them their suspend/resume routines can now
> > > > > > > > be called with 0 free pages.
> > > > > > > 
> > > > > > > Are you sure we can actually get to this point with 0 free pages?
> > > > > > 
> > > > > > > If I recall how mm works, yes, I believe it is possible to hit this
> > > > > > with 0 free pages if you are unlucky. (Heavy memory pressure with some
> > > > > > network packet storm just before suspend...).
> > > > > > 
> > > > > > Do you think 0 pages free here is impossible?
> > > > > 
> > > > > I think it's just extremely unlikely, which is why I'm asking for a test case.
> > > > > If you have one, we can see what it takes to trigger and put a safeguard
> > > > > against _that_.
> > > > 
> > > > No, I do not have a test case, and I agree that it is quite
> > > > unlikely. But I dislike adding bugs in unlikely cases.
> > > > 
> > > > > > If so, what do you think minimum number of free pages here is and why?
> > > > > 
> > > > > Seriously, I don't know.  Only the drivers know how much memory they are
> > > > > going to need and _they_ should allocate it in advance.  When we get to
> > > > > their suspend callbacks it's already too late.
> > > > 
> > > > Tell that to the driver authors. At least one driver does allocate in
> > > > _suspend(), and probably more.
> > > > 
> > > > > Still, even if I knew, I think it would be better to just allocate that memory
> > > > > before we freeze tasks and then free it instead of using the current approach.
> > > > 
> > > > Agreed, it would be better. 
> > > > 
> > > > OTOH providing 4MB as a safety area for the drivers that don't do that
> > > > seems quite reasonable. Deleting the safety area would be fine, but I
> > > > believe we need to fix the drivers, first, or at least ask driver
> > > > writers to get them fixed.
> > > 
> > > Or perhaps we can see if it's really necessary.
> > 
> > How? We already know this bug is pretty unlikely to be caught by testing.
> > 
> > > > IOW I believe the patch should be reverted.
> > > 
> > > Linus is supporting this change and it's going to be easy enough to revert if
> > > it's confirmed to cause any problems.  Which I seriously doubt.
> > 
> > I already found one bug you introduced... by code inspection. (Will
> > you at least fix that?).
> 
> No, you didn't.  You only pointed out that there may be a problem in certain
> circumstances, but the probability of these circumstances happening in
> practice is close to zero.

IOW you added a bug that is hard to trigger.

> > I'm pretty sure there are more. You tell me
> > that "it can be reverted if it proves problematic". 
> > 
> > I already proved it problematic by code inspection.
> 
> No, you didn't prove anything.  Sorry.

Would you explain how much memory is guaranteed to be free for
drivers?  We know video/s1d13xxxfb.c needs some memory.

> > Please revert it.
> 
> If I know the exact mechanism by which we can exhaust memory before suspend
> so that casual allocations with kmalloc() from drivers' suspend callbacks will fail.
> Possible failure scenario, perhaps?

Just

0) create memory pressure from userland so that free memory goes down
to min_free_kbytes (GFP_KERNEL allocations)

1) hit network driver over fast enough network to eat remaining memory
with GFP_ATOMIC allocations

2) suspend with video/s1d13xxxfb.c loaded and your patch.
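
The three steps above can be sketched as a toy userspace model. This is
only an illustration of the argument, not kernel code: every name in it
(toy_mm, toy_alloc_kernel, toy_alloc_atomic, toy_shrink_before_suspend,
toy_run_scenario) is hypothetical, the page counts are made up, and the
watermark behaviour is simplified to "sleeping allocations stop at the
watermark, atomic ones may go all the way down", which is the assumption
being argued about in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_mm {
	size_t free_pages;	/* pages currently free */
	size_t min_free;	/* watermark in pages, stand-in for min_free_kbytes */
};

/* GFP_KERNEL-like request in this model: never goes below the watermark. */
static bool toy_alloc_kernel(struct toy_mm *mm, size_t pages)
{
	if (mm->free_pages < pages || mm->free_pages - pages < mm->min_free)
		return false;
	mm->free_pages -= pages;
	return true;
}

/* GFP_ATOMIC-like request in this model: may use the last pages, cannot reclaim. */
static bool toy_alloc_atomic(struct toy_mm *mm, size_t pages)
{
	if (mm->free_pages < pages)
		return false;
	mm->free_pages -= pages;
	return true;
}

/* Stand-in for shrinking memory before suspend: leave at least 'reserve' pages free. */
static void toy_shrink_before_suspend(struct toy_mm *mm, size_t reserve)
{
	if (mm->free_pages < reserve)
		mm->free_pages = reserve;
}

/* Returns true if the allocation made from the driver's suspend callback succeeds. */
bool toy_run_scenario(bool shrink_first)
{
	struct toy_mm mm = { .free_pages = 8192, .min_free = 256 };
	const size_t reserve = 1024;	/* 4MB worth of 4KB pages */
	const size_t driver_needs = 16;	/* made-up figure for the driver */

	/* 0) userland pressure drains memory down to the watermark */
	while (toy_alloc_kernel(&mm, 1))
		;
	assert(mm.free_pages == mm.min_free);

	/* 1) packet storm eats what is left with atomic allocations */
	while (toy_alloc_atomic(&mm, 1))
		;

	/* 2) suspend, with or without the safety area */
	if (shrink_first)
		toy_shrink_before_suspend(&mm, reserve);
	return toy_alloc_atomic(&mm, driver_needs);
}
```

In this model toy_run_scenario(false) fails the driver's allocation and
toy_run_scenario(true) does not; whether a real system can be driven to
the same state is exactly the open question.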

> > Testing _cannot_ prove it problematic. From analysis, we already know
> > suspend with 0 free pages is pretty unlikely.
> 
> So what's the point, really?

The point is that you can't assume GFP_ATOMIC allocations
work (suspend allocations run under similar rules, because swapping is
unavailable). And you added that assumption. Bad.
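
For comparison, the "allocate in advance" approach mentioned earlier in
the thread could look roughly like the sketch below. It is a userspace
stand-in, not the code of video/s1d13xxxfb.c or of any real driver:
struct toy_fb and the toy_fb_* functions are hypothetical, and malloc()
merely plays the role of an allocation done while reclaim is still
possible. The point it shows is that the suspend step itself allocates
nothing, so it cannot fail for lack of memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver state, for illustration only. */
struct toy_fb {
	unsigned char regs[64];		/* pretend hardware registers */
	unsigned char *saved_regs;	/* save area reserved ahead of time */
};

/* Runs early, while an allocation failure can still be reported cleanly. */
static int toy_fb_prepare(struct toy_fb *fb)
{
	fb->saved_regs = malloc(sizeof(fb->regs));
	return fb->saved_regs ? 0 : -1;
}

/* Suspend only copies into the buffer reserved by toy_fb_prepare(). */
static int toy_fb_suspend(struct toy_fb *fb)
{
	if (!fb->saved_regs)
		return -1;	/* prepare step was skipped or failed */
	memcpy(fb->saved_regs, fb->regs, sizeof(fb->regs));
	return 0;
}

/* Resume restores the registers from the save area. */
static void toy_fb_resume(struct toy_fb *fb)
{
	memcpy(fb->regs, fb->saved_regs, sizeof(fb->regs));
}

/* Drops the save area once it is no longer needed. */
static void toy_fb_release(struct toy_fb *fb)
{
	free(fb->saved_regs);
	fb->saved_regs = NULL;
}
```

A caller would run toy_fb_prepare() before the point where allocations
become unreliable, then toy_fb_suspend(), toy_fb_resume() and finally
toy_fb_release().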

> In fact, the existing code doesn't solve any problem, because we don't know how
> much memory is going to be necessary anyway.  So, it doesn't eliminate the
> issue if there is any, it only makes it a bit more difficult to trigger.

4MB is certainly enough for the video/s1d13xxxfb.c driver, so you
added at least one bug.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html