Message-ID: <CANaxB-z57KoCNawGEkmpoiHV_iCaYr8YiOc2zQiTHM4fso0ABQ@mail.gmail.com>
Date: Thu, 12 Dec 2024 22:33:21 -0800
From: Andrei Vagin <avagin@...il.com>
To: Jeff Xu <jeffxu@...omium.org>
Cc: akpm@...ux-foundation.org, keescook@...omium.org, jannh@...gle.com,
torvalds@...ux-foundation.org, adhemerval.zanella@...aro.org, oleg@...hat.com,
linux-kernel@...r.kernel.org, linux-hardening@...r.kernel.org,
linux-mm@...ck.org, jorgelo@...omium.org, sroettger@...gle.com,
ojeda@...nel.org, adobriyan@...il.com, anna-maria@...utronix.de,
mark.rutland@....com, linus.walleij@...aro.org, Jason@...c4.com,
deller@....de, rdunlap@...radead.org, davem@...emloft.net, hch@....de,
peterx@...hat.com, hca@...ux.ibm.com, f.fainelli@...il.com, gerg@...nel.org,
dave.hansen@...ux.intel.com, mingo@...nel.org, ardb@...nel.org,
Liam.Howlett@...cle.com, mhocko@...e.com, 42.hyeyoo@...il.com,
peterz@...radead.org, ardb@...gle.com, enh@...gle.com, rientjes@...gle.com,
groeck@...omium.org, mpe@...erman.id.au,
Dmitry Safonov <0x7f454c46@...il.com>, Mike Rapoport <mike.rapoport@...il.com>,
Alexander Mikhalitsyn <aleksandr.mikhalitsyn@...onical.com>, Andrei Vagin <avagin@...gle.com>
Subject: Re: [PATCH v4 1/1] exec: seal system mappings
On Wed, Dec 11, 2024 at 2:47 PM Jeff Xu <jeffxu@...omium.org> wrote:
>
> Hi Andrei
>
> Thanks for your email.
> I was hoping to get some feedback from CRIU devs, and happy to see you
> reaching out..
>
...
> I have been thinking of other alternatives, but those would require
> more understanding on CRIU use cases.
> One of my questions is: Would CRIU target an individual process? or
> entire systems?
It targets individual processes that have been forked from the main
CRIU process.
>
> If it is an individual process, we could use prctl to opt-in/opt-out
> certain processes. There could be two alternatives.
> 1> Opt-in solution: process must set prctl.seal_criu_mapping, this
> needs to be set before execve() because sealing is applied at execve()
> call.
> 2> opt-out solution: The system will by default seal all of the system
> mappings, but individual processes can opt-out by setting
> prctl.not_seal_criu_mappings. This also needs to be set before
> execve() call.
I like the idea and I think the opt-out solution should work for CRIU.
CRIU will be able to call this prctl and re-execute itself.

Let me give you a bit of context on how CRIU works. When CRIU restores
processes, it recreates the process tree by forking itself. It then
restores all mappings in each process, but not yet at their proper
addresses. After that, each process unmaps the CRIU mappings from its address
space and remaps its restored mappings to their proper addresses. So CRIU should
be able to move system mappings, and to seal them if they were sealed before
the dump.
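
To illustrate the remap step, here is a minimal sketch in C of moving a staged
mapping to its final address with mremap(2). This is not CRIU's actual code:
the helper names move_mapping() and demo_restore_one_page() are made up for
illustration, and the "final" address is just a reserved anonymous range
standing in for the address recorded at dump time. If the mapping being moved
had been sealed with mseal(2), this mremap() call would be expected to fail
with EPERM, which is why sealing system mappings at execve() gets in the way.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for mremap() and the MREMAP_* flags */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Move the mapping [src, src + len) to dst, replacing whatever is mapped
 * at dst. Returns the new address, or MAP_FAILED on error.
 * (Hypothetical helper, not a CRIU function.)
 */
static void *move_mapping(void *src, size_t len, void *dst)
{
	return mremap(src, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dst);
}

/*
 * Stage one page at a temporary address, then move it to its "proper"
 * address. Returns 0 on success, -1 on failure.
 */
static int demo_restore_one_page(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	char *staged, *moved;
	void *target;
	int ok;

	/* Step 1: restore the contents at whatever address the kernel picks. */
	staged = mmap(NULL, len, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (staged == MAP_FAILED)
		return -1;
	strcpy(staged, "restored");

	/* Assumption: this reserved range stands in for the dumped address. */
	target = mmap(NULL, len, PROT_NONE,
		      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (target == MAP_FAILED) {
		munmap(staged, len);
		return -1;
	}

	/* Step 2: move the staged mapping over the target range. */
	moved = move_mapping(staged, len, target);
	if (moved == MAP_FAILED) {
		munmap(staged, len);
		munmap(target, len);
		return -1;
	}

	ok = (moved == target) && strcmp(moved, "restored") == 0;
	munmap(moved, len);
	return ok ? 0 : -1;
}
```

The same call shape applies to vdso/vvar: the restorer needs mremap() on those
ranges to succeed, so they must not yet be sealed at that point.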

BTW, it isn't just about CRIU. gVisor, and maybe some other sandbox solutions,
will be affected by this change too. gVisor uses stub processes to represent
guest address spaces. In a stub process, it unmaps all system mappings.
>
> For both cases, we will want to identify what type of mapping CRIU
> cares about, i.e. maybe CRIU doesn't care about uprobe and vsyscall ?
> and only care about vdso/vvar/sigpage ?
For now, it handles only the vdso/vvar/sigpage mappings. It doesn't care
about vsyscall because it is always mapped at a fixed address.
gVisor should be able to unmap all system mappings from a process's
address space.
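
As a rough sketch of which mappings that means in practice, the C snippet below
picks the vdso/vvar/sigpage entries out of /proc/<pid>/maps by name. It is only
an illustration, not how CRIU does it: the function names and the name list are
assumptions of mine, and other architectures or kernel versions may use
additional names. [vsyscall] is deliberately left out, since it never needs to
be moved.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Names from the discussion above; the list is an assumption, not exhaustive. */
static const char *const relocatable[] = { "[vdso]", "[vvar]", "[sigpage]" };

/*
 * Return 1 if a /proc/<pid>/maps line names one of the mappings above.
 * Line layout: start-end perms offset dev inode   pathname
 */
static int is_relocatable_system_mapping(const char *line)
{
	int off = 0;
	const char *name;
	size_t len, i;

	/* Skip the five fixed fields; %n records where the pathname starts. */
	sscanf(line, "%*[0-9a-f]-%*[0-9a-f] %*s %*[0-9a-f] %*s %*[0-9] %n",
	       &off);
	if (off <= 0)
		return 0;	/* not a well-formed maps line */

	name = line + off;
	len = strcspn(name, "\n");
	for (i = 0; i < sizeof(relocatable) / sizeof(relocatable[0]); i++) {
		if (strlen(relocatable[i]) == len &&
		    strncmp(name, relocatable[i], len) == 0)
			return 1;
	}
	return 0;
}

/* Count matching mappings in the current process, or -1 if maps is unreadable. */
static int count_relocatable_system_mappings(void)
{
	char line[4352];	/* room for a PATH_MAX pathname plus the fields */
	FILE *f = fopen("/proc/self/maps", "r");
	int n = 0;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		n += is_relocatable_system_mapping(line);
	fclose(f);
	return n;
}
```

A per-mapping opt-out, if there is one, would presumably need to cover exactly
the entries this kind of scan finds.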
Thanks,
Andrei