Date:   Wed, 08 Apr 2020 09:36:16 +1000
From:   Benjamin Herrenschmidt <benh@...nel.crashing.org>
To:     Tao Ren <rentao.bupt@...il.com>
Cc:     Felipe Balbi <balbi@...nel.org>, linux-aspeed@...ts.ozlabs.org,
        Andrew Jeffery <andrew@...id.au>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        openbmc@...ts.ozlabs.org, linux-usb@...r.kernel.org,
        linux-kernel@...r.kernel.org, Stephen Boyd <swboyd@...omium.org>,
        Joel Stanley <joel@....id.au>, taoren@...com,
        Chunfeng Yun <chunfeng.yun@...iatek.com>,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v3] usb: gadget: aspeed: improve vhub port irq handling

On Mon, 2020-04-06 at 23:02 -0700, Tao Ren wrote:
> I ran some testing on my ast2400 and ast2500 BMC and looks like the
> for() loop runs faster than for_each_set_bit_from() loop in my
> environment. I'm not sure if something needs to be revised in my test
> code, but please kindly share your suggestions:
> 
> I use get_cycles() to calculate execution time of 2 different loops, and
> ast_vhub_dev_irq() is replaced with barrier() to avoid "noise"; below
> are the results:
> 
>   - when downstream port number is 5 and only 1 irq bit is set, it takes
>     ~30 cycles to finish for_each_set_bit() loop, and 20-25 cycles to
>     finish the for() loop.
> 
>   - if downstream port number is 5 and all 5 bits are set, then
>     for_each_set_bit() loop takes ~50 cycles and for() loop takes ~25
>     cycles.
> 
>   - when I increase downstream port number to 16 and set 1 irq bit, the
>     for_each_set_bit() loop takes ~30 cycles and for() loop takes 25
>     cycles. It's a little surprise to me because I thought for() loop
>     would cost 60+ cycles (3 times of the value when port number is 5).
> 
>   - if downstream port number is 16 and all irq status bits are set,
>     then for_each_set_bit() loop takes 60-70 cycles and for() loop takes
>     30+ cycles.

I suspect the CPU doesn't have an efficient find-zero-bit primitive;
check the generated asm. If that is the case, I would go back to the
simple for loop.
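
For reference, a rough userspace sketch of the two loop shapes being
compared. This is not the driver code and not the kernel's
for_each_set_bit() macro: handle_port_irq() is a hypothetical stand-in
for ast_vhub_dev_irq(), and __builtin_ctz() (a GCC/Clang builtin) is
assumed as the bit-scan primitive. On a CPU without a native
count-trailing-zeros instruction the compiler has to synthesize it,
which is the kind of thing to look for in the generated asm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ast_vhub_dev_irq(): records handled ports. */
static uint32_t handled_mask;

void handle_port_irq(unsigned int port)
{
	handled_mask |= UINT32_C(1) << port;
}

/* Plain loop: test every port bit in turn, set or not. */
unsigned int scan_plain(uint32_t istat, unsigned int num_ports)
{
	unsigned int i, n = 0;

	assert(num_ports <= 32);
	for (i = 0; i < num_ports; i++) {
		if (istat & (UINT32_C(1) << i)) {
			handle_port_irq(i);
			n++;
		}
	}
	return n;
}

/* Bit-scan loop: jump straight to each set bit, lowest first. */
unsigned int scan_set_bits(uint32_t istat, unsigned int num_ports)
{
	unsigned int n = 0;

	assert(num_ports <= 32);
	/* Ignore status bits above the last port. */
	if (num_ports < 32)
		istat &= (UINT32_C(1) << num_ports) - 1;

	while (istat) {
		/* Index of the lowest set bit; istat is non-zero here. */
		unsigned int i = (unsigned int)__builtin_ctz(istat);

		handle_port_irq(i);
		istat &= istat - 1;	/* clear the bit just handled */
		n++;
	}
	return n;
}
```

With only a handful of ports, the plain loop is a few shift/test/branch
instructions per iteration, so the bit-scan version only wins if the
scan itself is cheap on the target.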

Cheers,
Ben.

