Message-ID: <20150720175145.GH21558@kvack.org>
Date: Mon, 20 Jul 2015 13:51:45 -0400
From: Benjamin LaHaise <bcrl@...ck.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Jeff Moyer <jmoyer@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Joonsoo Kim <js1304@...il.com>,
Fengguang Wu <fengguang.wu@...el.com>,
Johannes Weiner <hannes@...xchg.org>,
Stephen Rothwell <sfr@...b.auug.org.au>,
linux-next@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
On Mon, Jul 20, 2015 at 07:33:11PM +0200, Oleg Nesterov wrote:
> Hi Jeff,
>
> On 07/20, Jeff Moyer wrote:
> >
> > Hi, Oleg,
> >
> > Oleg Nesterov <oleg@...hat.com> writes:
> >
> > > Shouldn't we account aio events/pages somehow, say per-user, or in
> > > mm->pinned_vm ?
> >
> > Ages ago I wrote a patch to account the completion ring to a process'
> > memlock limit:
> > "[patch] aio: remove aio-max-nr and instead use the memlock rlimit to
> > limit the number of pages pinned for the aio completion ring"
> > http://marc.info/?l=linux-aio&m=123661380807041&w=2
> >
> > The problem with that patch is that it modifies the user/kernel
> > interface. It could be done over time, as Andrew outlined in that
> > thread, but I've been reluctant to take that on.
>
> See also the usage of mm->pinned_vm and user->locked_vm in perf_mmap(),
> perhaps aio can do the same...
>
> > If you just mean we should account the memory so that the right process
> > can be killed, that sounds like a good idea to me.
>
> Not sure we actually need this. I only meant that this looks confusing
> because this memory is actually locked but the kernel doesn't know this.
>
> And btw, I forgot to mention that I triggered OOM on the testing machine
> with only 512 MB of RAM, and aio-max-nr was huge. So, once again, while this
> all doesn't look right to me, I do not think this is the real problem.
>
> Except for the fact that an unprivileged user can steal all aio-max-nr events.
> This is probably worth fixing in any case.
>
>
>
> And if we accept the fact that this memory is locked and if we properly account
> it, then maybe we can just kill aio_migratepage(), aio_private_file(), and
> change aio_setup_ring() to simply use install_special_mapping(). This will
> greatly simplify the code. But let me remind you that I know nothing about aio,
> so please don't take my thoughts seriously.
No, you can't get rid of that code. Page migration is required when
CPUs or memory are offlined and data needs to be moved to another node.
Similarly, support for mremap() is required for container migration and
restoration.

As for accounting locked memory, we don't do that for memory pinned by
O_DIRECT either. Given how little memory aio can pin compared to
O_DIRECT or mlock(), accounting for what aio has pinned is unlikely to
make any real difference in the big picture. A single O_DIRECT I/O can
pin megabytes of memory.
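
To put rough numbers on that comparison, here is a minimal userspace sketch of the arithmetic. It is not the kernel's actual sizing code: the 4 KiB page size, the 32-byte ring header, and the 32-byte completion event are assumptions for illustration, and ring_pages() and buffer_pages() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* All three sizes below are assumptions for illustration only. */
#define PAGE_SIZE_ASSUMED 4096u  /* assumed page size in bytes */
#define RING_HEADER_BYTES 32u    /* assumed size of the completion ring header */
#define IO_EVENT_BYTES    32u    /* assumed size of one completion event slot */

static_assert((PAGE_SIZE_ASSUMED & (PAGE_SIZE_ASSUMED - 1)) == 0,
              "assumed page size must be a power of two");

/* Pages needed for a completion ring with nr_events slots (hypothetical helper). */
static size_t ring_pages(size_t nr_events)
{
    size_t bytes = RING_HEADER_BYTES + nr_events * IO_EVENT_BYTES;

    /* Round up to whole pages. */
    return (bytes + PAGE_SIZE_ASSUMED - 1) / PAGE_SIZE_ASSUMED;
}

/* Pages spanned by a page-aligned I/O buffer of len bytes (hypothetical helper). */
static size_t buffer_pages(size_t len)
{
    return (len + PAGE_SIZE_ASSUMED - 1) / PAGE_SIZE_ASSUMED;
}
```

Under these assumptions, ring_pages(128) comes to 2 pages, while buffer_pages(1 << 20) for a single 1 MiB O_DIRECT buffer comes to 256 pages.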
-ben
> Oleg.
--
"Thought is the essence of where you are now."