lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABi2SkUfXrRLg7+ZrLjMEQzh4FtU0EAfckdiDuza1mFKzi0SAg@mail.gmail.com>
Date: Wed, 16 Oct 2024 20:43:02 -0700
From: Jeff Xu <jeffxu@...omium.org>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>, jeffxu@...omium.org, akpm@...ux-foundation.org, 
	keescook@...omium.org, jannh@...gle.com, torvalds@...ux-foundation.org, 
	adhemerval.zanella@...aro.org, oleg@...hat.com, linux-kernel@...r.kernel.org, 
	linux-hardening@...r.kernel.org, linux-mm@...ck.org, jorgelo@...omium.org, 
	sroettger@...gle.com, ojeda@...nel.org, adobriyan@...il.com, 
	anna-maria@...utronix.de, mark.rutland@....com, linus.walleij@...aro.org, 
	Jason@...c4.com, deller@....de, rdunlap@...radead.org, davem@...emloft.net, 
	hch@....de, peterx@...hat.com, hca@...ux.ibm.com, f.fainelli@...il.com, 
	gerg@...nel.org, dave.hansen@...ux.intel.com, mingo@...nel.org, 
	ardb@...nel.org, mhocko@...e.com, 42.hyeyoo@...il.com, peterz@...radead.org, 
	ardb@...gle.com, enh@...gle.com, rientjes@...gle.com, groeck@...omium.org, 
	lorenzo.stoakes@...cle.com
Subject: Re: [RFC PATCH v2 1/1] exec: seal system mappings

On Wed, Oct 16, 2024 at 6:10 PM Liam R. Howlett <Liam.Howlett@...cle.com> wrote:
>
> > +     exec.seal_system_mappings = [KNL]
> > +                     Format: { never | always }
> > +                     Seal system mappings: vdso, vvar, sigpage, uprobes,
> > +                     vsyscall.
> > +                     This overwrites KCONFIG CONFIG_SEAL_SYSTEM_MAPPINGS_*
> > +                     - 'never':  never seal system mappings.
>
> Not true, uprobes are sealed when 'never' is selected.
>
Thanks. I forgot to uprobes from the description in Kconfig and
kernel-parameters.txt, will update.

> > +                     - 'always': always seal system mappings.
> > +                     If not specified or invalid, default is the KCONFIG value.
> > +                     This option has no effect if CONFIG_64BIT=n
> > +
> >       early_page_ext [KNL,EARLY] Enforces page_ext initialization to earlier
> >                       stages so cover more early boot allocations.
> >                       Please note that as side effect some optimizations
> > diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
> > index 2fb7d53cf333..20a3000550d2 100644
> > --- a/arch/x86/entry/vsyscall/vsyscall_64.c
> > +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
> > @@ -32,6 +32,7 @@
> >  #include <linux/mm_types.h>
> >  #include <linux/syscalls.h>
> >  #include <linux/ratelimit.h>
> > +#include <linux/fs.h>
> >
> >  #include <asm/vsyscall.h>
> >  #include <asm/unistd.h>
> > @@ -366,8 +367,12 @@ void __init map_vsyscall(void)
> >               set_vsyscall_pgtable_user_bits(swapper_pg_dir);
> >       }
> >
> > -     if (vsyscall_mode == XONLY)
> > -             vm_flags_init(&gate_vma, VM_EXEC);
> > +     if (vsyscall_mode == XONLY) {
> > +             unsigned long vm_flags = VM_EXEC;
> > +
> > +             update_seal_exec_system_mappings(&vm_flags);
> > +             vm_flags_init(&gate_vma, vm_flags);
> > +     }
> >
> >       BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) !=
> >                    (unsigned long)VSYSCALL_ADDR);
> > diff --git a/fs/exec.c b/fs/exec.c
> > index 77364806b48d..5030879cda47 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
>
> Does it make sense for this to live in exec?  Couldn't you put it in the
> mm/mseal.c file?  It's vma flags for mappings and you've put it in
> fs/exec?
>
If you are referring to utilities related to kernel cmdline, they
should be in this file.

> > @@ -68,6 +68,7 @@
> >  #include <linux/user_events.h>
> >  #include <linux/rseq.h>
> >  #include <linux/ksm.h>
> > +#include <linux/fs_parser.h>
> >
> >  #include <linux/uaccess.h>
> >  #include <asm/mmu_context.h>
> > @@ -2159,3 +2160,55 @@ fs_initcall(init_fs_exec_sysctls);
> >  #ifdef CONFIG_EXEC_KUNIT_TEST
> >  #include "tests/exec_kunit.c"
> >  #endif
> > +
> > +#ifdef CONFIG_64BIT
> > +/*
> > + * Kernel cmdline overwrite for CONFIG_SEAL_SYSTEM_MAPPINGS_X
> > + */
> > +enum seal_system_mappings_type {
> > +     SEAL_SYSTEM_MAPPINGS_NEVER,
> > +     SEAL_SYSTEM_MAPPINGS_ALWAYS
> > +};
> > +
> > +static enum seal_system_mappings_type seal_system_mappings __ro_after_init =
> > +     IS_ENABLED(CONFIG_SEAL_SYSTEM_MAPPINGS_ALWAYS) ? SEAL_SYSTEM_MAPPINGS_ALWAYS :
> > +     SEAL_SYSTEM_MAPPINGS_NEVER;
> > +
> > +static const struct constant_table value_table_sys_mapping[] __initconst = {
> > +     { "never", SEAL_SYSTEM_MAPPINGS_NEVER},
> > +     { "always", SEAL_SYSTEM_MAPPINGS_ALWAYS},
> > +     { }
> > +};
> > +
> > +static int __init early_seal_system_mappings_override(char *buf)
> > +{
> > +     if (!buf)
> > +             return -EINVAL;
> > +
> > +     seal_system_mappings = lookup_constant(value_table_sys_mapping,
> > +                     buf, seal_system_mappings);
> > +
> > +     return 0;
> > +}
> > +
> > +early_param("exec.seal_system_mappings", early_seal_system_mappings_override);
> > +
> > +static bool seal_system_mappings_enabled(void)
> > +{
> > +     if (seal_system_mappings == SEAL_SYSTEM_MAPPINGS_ALWAYS)
> > +             return true;
> > +
> > +     return false;
> > +}
>
> This function seems unnecessary, it is called from another 3-4 line
> function only.
>
It is more readable this way.

> > +
> > +void update_seal_exec_system_mappings(unsigned long *vm_flags)
> > +{
> > +     if (!(*vm_flags & VM_SEALED) && seal_system_mappings_enabled())
>
> Why !(*vm_flags & VM_SEALED) here?
>
If vm_flags is already sealed, then there is no need to check
seal_system_mappings_enabled.

> > +             *vm_flags |= VM_SEALED;
> > +
> > +}
>
> Instead of passing a pointer around and checking enabled, why don't you
> have a function that just returns the VM_SEALED or 0 and just or it into
> the flags?  This seems very heavy for what it does, why did you do it
> this way?
>
Why is that heavy ? passing a pointer for updating variables is natural.

> The name is also very long and a bit odd, it could be used for other
> reasons, but you have _system_mappings on the end, and you use seal but
> it's mseal (or vm_seal)?  Would mseal_flag() work?
>
It could be longer :-)
 it means update_sealing_flag_for_executable_system_mappings.
mseal_flag is too short and not descriptive.

> > +#else
> > +void update_seal_exec_system_mappings(unsigned long *vm_flags)
> > +{
> > +}
> > +#endif /* CONFIG_64BIT */
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 42444ec95c9b..6e44aca4b24b 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
>
> Again, I don't understand why fs.h is the place for mseal definitions?
>
include/linux/fs.h contains other exec related function signatures. So
it is better to keep them at the same header.

> > @@ -3079,6 +3079,7 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos);
> >  extern ssize_t kernel_write(struct file *, const void *, size_t, loff_t *);
> >  extern ssize_t __kernel_write(struct file *, const void *, size_t, loff_t *);
> >  extern struct file * open_exec(const char *);
> > +extern void update_seal_exec_system_mappings(unsigned long *vm_flags);
>
> We are dropping extern where possible now.
>
extern can be dropped, it appears not causing link error.
> >
> >  /* fs/dcache.c -- generic fs support functions */
> >  extern bool is_subdir(struct dentry *, struct dentry *);
> > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> > index c47a0bf25e58..e9876fae8887 100644
> > --- a/kernel/events/uprobes.c
> > +++ b/kernel/events/uprobes.c
> > @@ -1506,7 +1506,7 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
> >       }
> >
> >       vma = _install_special_mapping(mm, area->vaddr, PAGE_SIZE,
> > -                             VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO,
> > +                             VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO|VM_SEALED,
> >                               &xol_mapping);
>
> Changing all uprobes seems like something that should probably be
> mentioned more than just the note at the end of the change log, even if
> you think it won't have any impact.  The note is even hidden at the end
> of a paragraph.
>
> I would go as far as splitting this patch out as its own so that the
> subject line specifies that all uprobes will be VM_SEALED now.
>
> Maybe it's fine but maybe it isn't and you've buried it so that it will
> be missed by virtually everyone.
>
I will  add "It is sealed from creation." in the above "uprobe" section.

>
> >       if (IS_ERR(vma)) {
> >               ret = PTR_ERR(vma);
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 57fd5ab2abe7..d4717e34a60d 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2133,6 +2133,7 @@ struct vm_area_struct *_install_special_mapping(
> >       unsigned long addr, unsigned long len,
> >       unsigned long vm_flags, const struct vm_special_mapping *spec)
> >  {
> > +     update_seal_exec_system_mappings(&vm_flags);
> >       return __install_special_mapping(mm, addr, len, vm_flags, (void *)spec,
> >                                       &special_mapping_vmops);
>
> If you were to return a flag, you could change the vm_flags argument to
> vm_flags | mseal_flag()
>
passing pointer seems to be the most efficient way.

Thanks
-Jeff

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ