Message-Id: <20180906054955.GB27492@rapoport-lnx>
Date: Thu, 6 Sep 2018 08:49:56 +0300
From: Mike Rapoport <rppt@...ux.vnet.ibm.com>
To: Pasha Tatashin <Pavel.Tatashin@...rosoft.com>
Cc: Daniel Jordan <daniel.m.jordan@...cle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Aaron Lu <aaron.lu@...el.com>,
"alex.kogan@...cle.com" <alex.kogan@...cle.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"boqun.feng@...il.com" <boqun.feng@...il.com>,
"brouer@...hat.com" <brouer@...hat.com>,
"dave@...olabs.net" <dave@...olabs.net>,
"dave.dice@...cle.com" <dave.dice@...cle.com>,
Dhaval Giani <dhaval.giani@...cle.com>,
"ktkhai@...tuozzo.com" <ktkhai@...tuozzo.com>,
"ldufour@...ux.vnet.ibm.com" <ldufour@...ux.vnet.ibm.com>,
"paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>,
"shady.issa@...cle.com" <shady.issa@...cle.com>,
"tariqt@...lanox.com" <tariqt@...lanox.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"tim.c.chen@...el.com" <tim.c.chen@...el.com>,
"vbabka@...e.cz" <vbabka@...e.cz>,
"longman@...hat.com" <longman@...hat.com>,
"yang.shi@...ux.alibaba.com" <yang.shi@...ux.alibaba.com>,
"shy828301@...il.com" <shy828301@...il.com>,
Huang Ying <ying.huang@...el.com>,
"subhra.mazumdar@...cle.com" <subhra.mazumdar@...cle.com>,
Steven Sistare <steven.sistare@...cle.com>,
"jwadams@...gle.com" <jwadams@...gle.com>,
"ashwinch@...gle.com" <ashwinch@...gle.com>,
"sqazi@...gle.com" <sqazi@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
"walken@...gle.com" <walken@...gle.com>,
"rientjes@...gle.com" <rientjes@...gle.com>,
"junaids@...gle.com" <junaids@...gle.com>,
Neha Agarwal <nehaagarwal@...gle.com>,
Pavel Emelyanov <xemul@...tuozzo.com>,
Andrei Vagin <avagin@...tuozzo.com>
Subject: Re: Plumbers 2018 - Performance and Scalability Microconference
Hi,
On Wed, Sep 05, 2018 at 07:51:34PM +0000, Pasha Tatashin wrote:
>
> On 9/5/18 2:38 AM, Mike Rapoport wrote:
> > On Tue, Sep 04, 2018 at 05:28:13PM -0400, Daniel Jordan wrote:
> >> Pavel Tatashin, Ying Huang, and I are excited to be organizing a performance and scalability microconference this year at Plumbers[*], which is happening in Vancouver this year. The microconference is scheduled for the morning of the second day (Wed, Nov 14).
> >>
> >> We have a preliminary agenda and a list of confirmed and interested attendees (cc'ed), and are seeking more of both!
> >>
> >> Some of the items on the agenda as it stands now are:
> >>
> >> - Promoting huge page usage: With memory sizes becoming ever larger, huge pages are becoming more and more important to reduce TLB misses and the overhead of memory management itself--that is, to make the system scalable with the memory size. But there are still some remaining gaps that prevent huge pages from being deployed in some situations, such as huge page allocation latency and memory fragmentation.
> >>
> >> - Reducing the number of users of mmap_sem: This semaphore is frequently used throughout the kernel. In order to facilitate scaling this longstanding bottleneck, these uses should be documented and unnecessary users should be fixed.
> >>
> >> - Parallelizing cpu-intensive kernel work: Resolve problems of past approaches including extra threads interfering with other processes, playing well with power management, and proper cgroup accounting for the extra threads. Bonus topic: proper accounting of workqueue threads running on behalf of cgroups.
> >>
> >> - Preserving userland during kexec with a hibernation-like mechanism.
> >
> > Just some crazy idea: have you considered using checkpoint-restore as a
> > replacement or an addition to hibernation?
>
> Hi Mike,
>
> Yes, this is one way I was thinking about, and use kernel to pass the
> application stored state to new kernel in pmem. The only problem is that
> we waste memory: when there is not enough system memory to copy and pass
> application state to new kernel this scheme won't work. Think about DB
> that occupies 80% of system memory and we want to checkpoint/restore it.
>
> So, we need to have another way, where the preserved memory is the
> memory that is actually used by the applications, not copied. One easy
> way is to give each application that has a large state that is expensive
> to recreate a persistent memory device and let applications to keep its
> state on that device (say /dev/pmemN). The only problem is that memory
> on that device must be accessible just as fast as regular memory without
> any file system overhead and hopefully without need for DAX.
Like hibernation, checkpoint persists the state, so it won't require
additional memory. At restore time, the memory state is recreated from
the persistent checkpoint, which is of course slower than regular
memory access, but it won't differ much from resuming from hibernation.
Maybe it would be possible to preserve application state if we extend
suspend-to-RAM -> resume with the ability to load a new kernel during
resume...
> I just want to get some ideas of what people are thinking about this,
> and what would be the best way to achieve it.
>
> Pavel
>
>
> >
> >> These center around our interests, but having lots of topics to choose from ensures we cover what's most important to the community, so we would like to hear about additional topics and extensions to those listed here. This includes, but is certainly not limited to, work in progress that would benefit from in-person discussion, real-world performance problems, and experimental and academic work.
> >>
> >> If you haven't already done so, please let us know if you are interested in attending, or have suggestions for other attendees.
> >>
> >> Thanks,
> >> Daniel
> >>
> >> [*] https://blog.linuxplumbersconf.org/2018/performance-mc/
> >>
> >
--
Sincerely yours,
Mike.