lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190716125309.GA10672@andrea>
Date:   Tue, 16 Jul 2019 14:53:09 +0200
From:   Andrea Parri <andrea.parri@...rulasolutions.com>
To:     Daniel Jordan <daniel.m.jordan@...cle.com>
Cc:     Steffen Klassert <steffen.klassert@...unet.com>,
        Herbert Xu <herbert@...dor.apana.org.au>, boqun.feng@...il.com,
        paulmck@...ux.ibm.com, peterz@...radead.org,
        linux-arch@...r.kernel.org, linux-crypto@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] padata: use smp_mb in padata_reorder to avoid orphaned
 padata jobs

Hi Daniel,

My two cents (summarizing some findings we discussed privately):


> I think adding the full barrier guarantees the following ordering, and the
> memory model people can correct me if I'm wrong:
> 
> CPU21                      CPU22
> ------------------------   --------------------------
> UNLOCK pd->lock
> smp_mb()
> LOAD reorder_objects
>                            INC reorder_objects
>                            spin_unlock(&pqueue->reorder.lock) // release barrier
>                            TRYLOCK pd->lock
> 
> So if CPU22 has incremented reorder_objects but CPU21 reads 0 for it, CPU21
> should also have unlocked pd->lock so CPU22 can get it and serialize any
> remaining jobs.

This information inspired me to write down the following litmus test:
(AFAICT, you independently wrote a very similar test, which is indeed
quite reassuring! ;D)

C daniel-padata

{ }

P0(atomic_t *reorder_objects, spinlock_t *pd_lock)
{
	int r0;

	spin_lock(pd_lock);
	spin_unlock(pd_lock);
	smp_mb();
	r0 = atomic_read(reorder_objects);
}

P1(atomic_t *reorder_objects, spinlock_t *pd_lock, spinlock_t *reorder_lock)
{
	int r1;

	spin_lock(reorder_lock);
	atomic_inc(reorder_objects);
	spin_unlock(reorder_lock);
	//smp_mb();
	r1 = spin_trylock(pd_lock);
}

exists (0:r0=0 /\ 1:r1=0)

It seems worth noticing that this test's "exists" clause is satisfiable
according to the (current) memory consistency model.  (Informally, this
can be explained by noticing that the RELEASE from the spin_unlock() in
P1 does not provide any order between the atomic increment and the read
part of the spin_trylock() operation.)  FWIW, uncommenting the smp_mb()
in P1 would suffice to prevent this clause from being satisfiable; I am
not sure, however, whether this approach is feasible or ideal... (sorry,
I'm definitely not too familiar with this code... ;/)

Thanks,
  Andrea

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ