lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 28 Nov 2023 09:51:08 -0500
From:   Gregory Price <gregory.price@...verge.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Gregory Price <gourry.memverge@...il.com>, linux-mm@...ck.org,
        linux-doc@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-api@...r.kernel.org, linux-arch@...r.kernel.org,
        linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
        arnd@...db.de, tglx@...utronix.de, luto@...nel.org,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        x86@...nel.org, hpa@...or.com, tj@...nel.org, ying.huang@...el.com
Subject: Re: [RFC PATCH 06/11] mm/mempolicy: modify do_mbind to operate on
 task argument instead of current

On Tue, Nov 28, 2023 at 03:11:06PM +0100, Michal Hocko wrote:
> On Wed 22-11-23 16:11:55, Gregory Price wrote:
> [...]
> > + * Like get_vma_policy and get_task_policy, must hold alloc/task_lock
> > + * while calling this.
> > + */
> > +static struct mempolicy *get_task_vma_policy(struct task_struct *task,
> > +					     struct vm_area_struct *vma,
> > +					     unsigned long addr, int order,
> > +					     pgoff_t *ilx)
> [...]
> 
> You should add lockdep annotation for alloc_lock/task_lock here for clarity and 
> also...  
> > @@ -1844,16 +1899,7 @@ struct mempolicy *__get_vma_policy(struct vm_area_struct *vma,
> >  struct mempolicy *get_vma_policy(struct vm_area_struct *vma,
> >  				 unsigned long addr, int order, pgoff_t *ilx)
> >  {
> > -	struct mempolicy *pol;
> > -
> > -	pol = __get_vma_policy(vma, addr, ilx);
> > -	if (!pol)
> > -		pol = get_task_policy(current);
> > -	if (pol->mode == MPOL_INTERLEAVE) {
> > -		*ilx += vma->vm_pgoff >> order;
> > -		*ilx += (addr - vma->vm_start) >> (PAGE_SHIFT + order);
> > -	}
> > -	return pol;
> > +	return get_task_vma_policy(current, vma, addr, order, ilx);
> 
> I do not think that all get_vma_policy take task_lock (just random check
> dequeue_hugetlb_folio_vma->huge_node->get_vma_policy AFAICS)
> 

hm, i might have gotten turned around on this one.  Forgot to check for
external references to get_vma_policy.  I thought I considered it, but i
clearly did not leave myself any notes if I did.

This pattern is troublesome, we're holding the task lock during the
callback stack in __get_vma_policy - just incase that returns NULL so we
can return the task policy instead.  If that vma is shared, it will take
the vma shared policy lock (sp->lock)

I almost want to change this interface to return NULL if the VMA doesn't
have one, and change callers to fetch the task policy explicitly instead
of implicitly returning the task policy.  At least then we'd only take
the task lock on an explicit access to the *Task* policy.

> Also I do not see policy_nodemask to be handled anywhere. That one is
> used along with get_vma_policy (sometimes hidden like in
> alloc_pages_mpol). It has a dependency on
> cpuset_nodemask_valid_mems_allowed. That means that e.g. mbind on a
> remote task would be constrained by current task cpuset when allocating
> migration targets for the target task. I am wondering how many other
> dependencies like that are lurking there.

bah!

thought i dug all these out, but i missed alloc_migration_target_by_mpol
from do_mbind.

I'll need to take another look at the calls to cpusets interfaces to
make sure i dig this out.  The number of hidden accesses to current is
really nasty :[ 

> -- 
> Michal Hocko
> SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ