Message-ID: <X75roagtWe3e96Y2@mtj.duckdns.org>
Date:   Wed, 25 Nov 2020 09:35:13 -0500
From:   Tejun Heo <tj@...nel.org>
To:     Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc:     axboe@...nel.dk, baolin.wang7@...il.com,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/7] blk-iocost: Add a flag to indicate if need update hwi

On Wed, Nov 25, 2020 at 10:15:38PM +0800, Baolin Wang wrote:
> 
> > Hello,
> > 
> > On Tue, Nov 24, 2020 at 11:33:33AM +0800, Baolin Wang wrote:
> > > @@ -1445,7 +1447,8 @@ static void iocg_kick_waitq(struct ioc_gq *iocg, bool pay_debt,
> > >   	 * after the above debt payment.
> > >   	 */
> > >   	ctx.vbudget = vbudget;
> > > -	current_hweight(iocg, NULL, &ctx.hw_inuse);
> > > +	if (need_update_hwi)
> > > +		current_hweight(iocg, NULL, &ctx.hw_inuse);
> > 
> > So, if you look at the implementation of current_hweight(), it's
> > 
> > 1. If nothing has changed, read out the cached values.
> > 2. If something has changed, recalculate.
> 
> Yes, correct.
> 
> > 
> > and the "something changed" test is a single memory read (most likely L1 hot
> > at this point) and a test for equality. IOW, the change you're suggesting
> > isn't much of an optimization. Maybe the compiler can do a somewhat better
> > job of arranging the code and it's a register load rather than a memory load,
> > but given that it's already a relatively cold wait path, this is unlikely to
> > make any actual difference. And that's how current_hweight() is meant to be
> > used.
> 
> What I want to avoid is the 'atomic_read(&ioc->hweight_gen)' in
> current_hweight(), because this is not a register load and is always a memory
> load. A flag, by contrast, can be kept in a register and is lighter than a
> memory load.
> 
> But after thinking about it more, I think we can just move the
> "current_hweight(iocg, NULL, &ctx.hw_inuse);" call to the correct place, without
> introducing a new flag, to optimize the code. What do you think of the code below?

I don't find this discussion very meaningful. We're talking about a
theoretical optimization of one memory load in a path which likely isn't hot
enough for it to make any difference. If you can show that this matters,
please do. Otherwise, what are we doing?

Thanks.

-- 
tejun
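
For readers following the thread, the two-step behaviour described in the
quoted text (read out the cached values if nothing has changed, recalculate
otherwise) can be sketched in user-space C. This is a minimal illustration of
a generation-counter cache, not the kernel's current_hweight(): every name
here (weight_src, weight_cache, set_weights, cached_hweight, HWEIGHT_FULL) is
hypothetical, it uses C11 atomics instead of the kernel's atomic_read(), it
assumes a single thread, and it assumes the total weight is never zero.

```c
#include <stdatomic.h>
#include <stdint.h>

#define HWEIGHT_FULL 65536u	/* hypothetical scale: 65536 == 100% */

/* Hypothetical shared state; gen is bumped whenever the weights change. */
struct weight_src {
	atomic_int gen;		/* generation counter */
	uint32_t weight;	/* this node's weight */
	uint32_t total;		/* sum of all sibling weights, assumed non-zero */
};

/* Hypothetical per-user cache of the last computed value. */
struct weight_cache {
	int gen;		/* generation the cached value belongs to */
	uint32_t hweight;	/* cached result */
	unsigned int recalcs;	/* slow-path counter, for illustration only */
};

/* Change the weights and invalidate every cache by bumping the generation. */
static void set_weights(struct weight_src *src, uint32_t weight, uint32_t total)
{
	src->weight = weight;
	src->total = total;
	atomic_fetch_add(&src->gen, 1);
}

static uint32_t cached_hweight(struct weight_src *src, struct weight_cache *c)
{
	/* The "something changed" test: one load and one comparison. */
	int gen = atomic_load(&src->gen);

	if (gen == c->gen)
		return c->hweight;	/* 1. nothing changed: cached value */

	/* 2. something changed: recalculate and remember the generation */
	c->hweight = (uint32_t)((uint64_t)src->weight * HWEIGHT_FULL / src->total);
	c->gen = gen;
	c->recalcs++;
	return c->hweight;
}
```

Initialising the cache with a generation that the source can never report
(for example -1 while the counter starts at 0) forces the first call through
the slow path. Each later call costs one load and one comparison until
set_weights() runs again, which is the cost being weighed in this thread.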
