Date:	Thu, 27 Mar 2014 04:20:11 +0000
From:	<jimmie.davis@...com.com>
To:	<luto@...capital.net>, <umgwanakikbuti@...il.com>
CC:	<oneukum@...e.de>, <artem_fetishev@...m.com>,
	<peterz@...radead.org>, <kosaki.motohiro@...fujitsu.com>,
	<linux-kernel@...r.kernel.org>
Subject: RE: Bug 71331 - mlock yields processor to lower priority process



-----Original Message-----
From: Andy Lutomirski [mailto:luto@...capital.net] 
Sent: Wednesday, March 26, 2014 7:40 PM
To: Davis, Bud @ SSG - Link; umgwanakikbuti@...il.com
Cc: oneukum@...e.de; artem_fetishev@...m.com; peterz@...radead.org; kosaki.motohiro@...fujitsu.com; linux-kernel@...r.kernel.org
Subject: Re: Bug 71331 - mlock yields processor to lower priority process

On 03/21/2014 07:50 AM, jimmie.davis@...com.com wrote:
> 
> ________________________________________
> From: Mike Galbraith [umgwanakikbuti@...il.com]
> Sent: Friday, March 21, 2014 9:41 AM
> To: Davis, Bud @ SSG - Link
> Cc: oneukum@...e.de; artem_fetishev@...m.com; peterz@...radead.org; kosaki.motohiro@...fujitsu.com; linux-kernel@...r.kernel.org
> Subject: RE: Bug 71331 - mlock yields processor to lower priority process
> 
> On Fri, 2014-03-21 at 14:01 +0000, jimmie.davis@...com.com wrote:
> 
>> If you call mlock () from a SCHED_FIFO task, you expect it to return
>> when done.  You don't expect it to block, and your task to be
>> pre-empted.
> 
> Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
> how do they get home, and what should we do meanwhile?
> 
> -Mike
> 
> Two options.
> 
> #1. Return with a status value of EAGAIN.
> 
> or 
> 
> #2.  Don't return until you can do it.
> 
> If SCHED_FIFO is used, and mlock() is called, the intention of the user is very clear.  Run this task until
> it is completed or it blocks (and until a little while ago, mlock() did not block).
> 
> SCHED_FIFO users don't care about fairness.  They want the system to do what it is told.

I use mlock in real-time processes, but I do it in a separate thread.

Seriously, though, what do you expect the kernel to do?  When you call
mlock on a page that isn't present, the kernel will *read* that page.
mlock will, therefore, block until the IO finishes.
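
For what it's worth, a minimal sketch of that "do it in a separate thread" pattern might look like the following (POSIX threads assumed; the lock_helper/locked names are placeholders, not code from any real system).  The RT thread still waits, but on a semaphore at a point of its own choosing rather than unexpectedly inside mlock():

/* Sketch: lock memory from a helper thread so the RT thread never
 * sleeps on whatever page-in I/O mlock() may trigger.
 * Error handling is reduced to perror() for brevity. */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <sys/mman.h>

static sem_t locked;            /* signalled once memory is pinned */
static char  buf[1 << 20];      /* region the RT code will touch   */

static void *lock_helper(void *unused)
{
    (void)unused;
    if (mlock(buf, sizeof(buf)) != 0)   /* may block on disk/NFS I/O */
        perror("mlock");
    sem_post(&locked);                  /* tell the RT thread it is safe */
    return NULL;
}

int main(void)
{
    pthread_t helper;

    sem_init(&locked, 0, 0);
    pthread_create(&helper, NULL, lock_helper, NULL);

    /* ... switch the current thread to SCHED_FIFO here ... */

    sem_wait(&locked);      /* explicit wait until the pages are pinned */
    /* real-time work on buf starts here */

    pthread_join(helper, NULL);
    return 0;
}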

Some time around 3.9, the behavior changed a little bit: IIRC mlock used
to hold mmap_sem while sleeping.  Or maybe just mmap with MCL_FUTURE did
that.  In any case, the mlock code is less lock-happy than it was.  Is
it possible that you have two threads, and the non-mlock-calling thread
got blocked behind mlock, so it looked better?

--Andy

===================================================================================================================


Andy,

The example code submitted to bugzilla (chase back through the thread a bit; there is a reference) shows the problem.

Two threads, TaskA (high priority) and TaskB (low priority), are assigned to the same processor, explicitly to guarantee that only one of them can execute at a time.  TaskA becomes eligible to run.  As part of its processing (which normally ends with a call to sem_wait()), it calls mlock().  TaskA then blocks, and TaskB begins running.  But wait, the system is designed so that TaskA will run until it is done (hence SCHED_FIFO and a priority higher than TaskB's).  TaskA, the higher priority task, is suspended and TaskB starts running.  And in the code that led me on this endeavor :) {consisting of a lot of Ada threads}, the result was a segfault caused by data TaskA had left half-processed.
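
A condensed sketch of that setup (not the bugzilla reproducer, and in C rather than Ada; the task names, priorities, and region size are just for illustration; it needs CAP_SYS_NICE and a sufficient memlock rlimit to run):

/* Two SCHED_FIFO threads pinned to CPU 0.  If mlock() sleeps in TaskA,
 * the scheduler hands CPU 0 to the lower-priority TaskB. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>

static volatile int b_ran;              /* did the low-priority task run? */
static char region[16 << 20];           /* pages TaskA wants resident     */

static void *task_a(void *unused)       /* high priority (FIFO 60) */
{
    (void)unused;
    /* If any of these pages have to be brought in, mlock() sleeps here. */
    mlock(region, sizeof(region));
    printf("TaskA: mlock returned; TaskB ran meanwhile: %s\n",
           b_ran ? "yes" : "no");
    return NULL;
}

static void *task_b(void *unused)       /* low priority (FIFO 10) */
{
    (void)unused;
    b_ran = 1;      /* on one CPU this should not happen before TaskA is done */
    return NULL;
}

static pthread_t spawn_fifo(void *(*fn)(void *), int prio)
{
    pthread_t tid;
    pthread_attr_t attr;
    struct sched_param sp = { .sched_priority = prio };
    cpu_set_t cpu0;

    CPU_ZERO(&cpu0);
    CPU_SET(0, &cpu0);                  /* both tasks share CPU 0 */

    pthread_attr_init(&attr);
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &sp);
    pthread_attr_setaffinity_np(&attr, sizeof(cpu0), &cpu0);
    pthread_create(&tid, &attr, fn, NULL);
    return tid;
}

int main(void)
{
    pthread_t a = spawn_fifo(task_a, 60);   /* TaskA first, higher priority */
    pthread_t b = spawn_fifo(task_b, 10);

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}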

This is what I call 'blocking': the thread is no longer running and the scheduler puts someone else on the processor.  I don't mean 'takes a long time to return'.  Taking a long time is fine; the system design relies on priority-based scheduling and CPU affinity to ensure ordered access to application data.

mlock() now blocks.  I don't care how long mlock() takes; what I care about is the lower priority process pre-empting me.  Only a limited number of syscalls block; those that do are documented and usually offer a way to choose between blocking and non-blocking behavior.

Can I change the system to deal with mlock() being a blocking syscall?  Yes, but this is a situation where working code that conforms to the API has stopped working.
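
For illustration only, one way to adapt is to pin everything once at start-up, before the thread takes on SCHED_FIFO, so the real-time path never calls mlock() at all.  This is a sketch of that idea, not a description of the existing application:

#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    struct sched_param sp = { .sched_priority = 50 };

    /* May sleep on I/O, but we are still an ordinary SCHED_OTHER task. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("mlockall");

    /* Only now take on real-time priority; from here on mlock() is
     * never called on the time-critical path.                        */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler");

    /* ... real-time work ... */
    return 0;
}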

Thanks for looking at it.

Regards,
Bud Davis






