lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpGiH9goRPcHS03NnhYBj-vvrfa702ZZiCK5q-MFa-MShQ@mail.gmail.com>
Date: Tue, 14 Jan 2025 21:41:57 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>, Suren Baghdasaryan <surenb@...gle.com>, 
	Wei Yang <richard.weiyang@...il.com>, akpm@...ux-foundation.org, peterz@...radead.org, 
	willy@...radead.org, lorenzo.stoakes@...cle.com, david.laight.linux@...il.com, 
	mhocko@...e.com, vbabka@...e.cz, hannes@...xchg.org, mjguzik@...il.com, 
	oliver.sang@...el.com, mgorman@...hsingularity.net, david@...hat.com, 
	peterx@...hat.com, oleg@...hat.com, dave@...olabs.net, paulmck@...nel.org, 
	brauner@...nel.org, dhowells@...hat.com, hdanton@...a.com, hughd@...gle.com, 
	lokeshgidra@...gle.com, minchan@...gle.com, jannh@...gle.com, 
	shakeel.butt@...ux.dev, souravpanda@...gle.com, pasha.tatashin@...een.com, 
	klarasmodin@...il.com, corbet@....net, linux-doc@...r.kernel.org, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v9 16/17] mm: make vma cache SLAB_TYPESAFE_BY_RCU

On Tue, Jan 14, 2025 at 7:58 PM Liam R. Howlett <Liam.Howlett@...cle.com> wrote:
>
> * Suren Baghdasaryan <surenb@...gle.com> [250114 22:15]:
> > On Tue, Jan 14, 2025 at 6:27 PM Wei Yang <richard.weiyang@...il.com> wrote:
> > >
> > > On Fri, Jan 10, 2025 at 08:26:03PM -0800, Suren Baghdasaryan wrote:
> > >
> > > >diff --git a/kernel/fork.c b/kernel/fork.c
> > > >index 9d9275783cf8..151b40627c14 100644
> > > >--- a/kernel/fork.c
> > > >+++ b/kernel/fork.c
> > > >@@ -449,6 +449,42 @@ struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
> > > >       return vma;
> > > > }
> > > >
> > > >+static void vm_area_init_from(const struct vm_area_struct *src,
> > > >+                            struct vm_area_struct *dest)
> > > >+{
> > > >+      dest->vm_mm = src->vm_mm;
> > > >+      dest->vm_ops = src->vm_ops;
> > > >+      dest->vm_start = src->vm_start;
> > > >+      dest->vm_end = src->vm_end;
> > > >+      dest->anon_vma = src->anon_vma;
> > > >+      dest->vm_pgoff = src->vm_pgoff;
> > > >+      dest->vm_file = src->vm_file;
> > > >+      dest->vm_private_data = src->vm_private_data;
> > > >+      vm_flags_init(dest, src->vm_flags);
> > > >+      memcpy(&dest->vm_page_prot, &src->vm_page_prot,
> > > >+             sizeof(dest->vm_page_prot));
> > > >+      /*
> > > >+       * src->shared.rb may be modified concurrently when called from
> > > >+       * dup_mmap(), but the clone will reinitialize it.
> > > >+       */
> > > >+      data_race(memcpy(&dest->shared, &src->shared, sizeof(dest->shared)));
> > > >+      memcpy(&dest->vm_userfaultfd_ctx, &src->vm_userfaultfd_ctx,
> > > >+             sizeof(dest->vm_userfaultfd_ctx));
> > > >+#ifdef CONFIG_ANON_VMA_NAME
> > > >+      dest->anon_name = src->anon_name;
> > > >+#endif
> > > >+#ifdef CONFIG_SWAP
> > > >+      memcpy(&dest->swap_readahead_info, &src->swap_readahead_info,
> > > >+             sizeof(dest->swap_readahead_info));
> > > >+#endif
> > > >+#ifndef CONFIG_MMU
> > > >+      dest->vm_region = src->vm_region;
> > > >+#endif
> > > >+#ifdef CONFIG_NUMA
> > > >+      dest->vm_policy = src->vm_policy;
> > > >+#endif
> > > >+}
> > >
> > > Would this be difficult to maintain? We should make sure not miss or overwrite
> > > anything.
> >
> > Yeah, it is less maintainable than a simple memcpy() but I did not
> > find a better alternative. I added a warning above the struct
> > vm_area_struct definition to update this function every time we change
> > that structure. Not sure if there is anything else I can do to help
> > with this.
>
> Here's a horrible idea.. if we put the ref count at the end or start of
> the struct, we could set the ref count to zero and copy the other area
> in one mmecpy().
>
> Even worse idea, we could use a pointer_of like macro to get the position
> of the ref count in the vma struct, set the ref count to zero and
> carefully copy the other two parts in two memcpy() operations.

I implemented this approach in v3 of this patchset here:
https://lore.kernel.org/all/20241117080931.600731-5-surenb@google.com/
like this:

#define VMA_BEFORE_LOCK offsetof(struct vm_area_struct, vm_lock)
#define VMA_LOCK_END(vma) \
        (((void *)(vma)) + offsetofend(struct vm_area_struct, vm_lock))
#define VMA_AFTER_LOCK \
        (sizeof(struct vm_area_struct) - offsetofend(struct
vm_area_struct, vm_lock))

static inline void vma_copy(struct vm_area_struct *new, struct
vm_area_struct *orig)
{
        /* Preserve vma->vm_lock */
        data_race(memcpy(new, orig, VMA_BEFORE_LOCK));
        data_race(memcpy(VMA_LOCK_END(new), VMA_LOCK_END(orig),
VMA_AFTER_LOCK));
}

If this looks more maintainable I can revive it.
Maybe introduce a more generic function to copy any structure
excluding a specific field and use it like this:

copy_struct_except(new, orig, struct vm_area_struct, vm_refcnt);

Would that be better?

>
> Feel free to disregard these ideas as it is late here and I'm having
> fun thinking up bad ways to make this "more" maintainable.
>
> Either of these would make updating the struct easier, but very painful
> to debug when it goes wrong (or reading the function).
>
> Thanks,
> Liam

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ