Message-ID: <7c0bcc9e-ab76-b356-7da6-6eb8b3868610@arm.com>
Date:   Fri, 19 Jun 2020 20:17:05 +0100
From:   Robin Murphy <robin.murphy@....com>
To:     Syed Nayyar Waris <syednwaris@...il.com>, linus.walleij@...aro.org,
        akpm@...ux-foundation.org
Cc:     linux-gpio@...r.kernel.org, linux-kernel@...r.kernel.org,
        vilhelm.gray@...il.com, michal.simek@...inx.com,
        bgolaszewski@...libre.com, andriy.shevchenko@...ux.intel.com,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v8 4/4] gpio: xilinx: Utilize for_each_set_clump macro

On 2020-06-15 13:54, Syed Nayyar Waris wrote:
> This patch reimplements the xgpio_set_multiple function in
> drivers/gpio/gpio-xilinx.c to use the new for_each_set_clump macro.
> Instead of looping for each bit in xgpio_set_multiple
> function, now we can check each channel at a time and save cycles.
> 
> Cc: Bartosz Golaszewski <bgolaszewski@...libre.com>
> Cc: Michal Simek <michal.simek@...inx.com>
> Signed-off-by: Syed Nayyar Waris <syednwaris@...il.com>
> Signed-off-by: William Breathitt Gray <vilhelm.gray@...il.com>
> ---
> Changes in v8:
>   - No change.
> 
> Changes in v7:
>   - No change.
> 
> Changes in v6:
>   - No change.
> 
> Changes in v5:
>   - Minor change: Inline values '32' and '64' in code for better
>     code readability.
> 
> Changes in v4:
>   - Minor change: Inline values '32' and '64' in code for better
>     code readability.
> 
> Changes in v3:
>   - No change.
> 
> Changes in v2:
>   - No change.
> 
>   drivers/gpio/gpio-xilinx.c | 62 ++++++++++++++++++++------------------
>   1 file changed, 32 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
> index 67f9f82e0db0..e81092dea27e 100644
> --- a/drivers/gpio/gpio-xilinx.c
> +++ b/drivers/gpio/gpio-xilinx.c
> @@ -136,39 +136,41 @@ static void xgpio_set(struct gpio_chip *gc, unsigned int gpio, int val)
>   static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
>   			       unsigned long *bits)
>   {
> -	unsigned long flags;
> +	unsigned long flags[2];
>   	struct xgpio_instance *chip = gpiochip_get_data(gc);
> -	int index = xgpio_index(chip, 0);
> -	int offset, i;
> -
> -	spin_lock_irqsave(&chip->gpio_lock[index], flags);
> -
> -	/* Write to GPIO signals */
> -	for (i = 0; i < gc->ngpio; i++) {
> -		if (*mask == 0)
> -			break;
> -		/* Once finished with an index write it out to the register */
> -		if (index !=  xgpio_index(chip, i)) {
> -			xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> -				       index * XGPIO_CHANNEL_OFFSET,
> -				       chip->gpio_state[index]);
> -			spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
> -			index =  xgpio_index(chip, i);
> -			spin_lock_irqsave(&chip->gpio_lock[index], flags);
> -		}
> -		if (__test_and_clear_bit(i, mask)) {
> -			offset =  xgpio_offset(chip, i);
> -			if (test_bit(i, bits))
> -				chip->gpio_state[index] |= BIT(offset);
> -			else
> -				chip->gpio_state[index] &= ~BIT(offset);
> -		}
> +	u32 *const state = chip->gpio_state;
> +	unsigned int *const width = chip->gpio_width;

Immutable pointers to mutable data are pretty unusual, especially for 
temporary local variables. Let me share my thought process upon seeing this:

- hmm, is "* const" simply a mistake that's meant to be "const *"?
- <scan the rest of the function> no, updating chip->gpio_state seems 
appropriate, so it can't be that.
- does anything take the address of either of these variables that might 
justify it?
- <scan the rest of the function again> nope, they're only ever used by 
value
- hmm, maybe it's just paranoia, but in that case why isn't width "const 
* const" since chip->gpio_width shouldn't need to be modified?
- hmm...

And at that point I've spent nearly a minute parsing what should have 
been some trivial definitions of local shorthand variables. Defensive 
programming is all very well, but the distraction to readers (I can't be 
the only one) can easily outweigh any perceived value in trying to 
harden against theoretical future developer error in a straightforward 
~30-line function.
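
To spell out the distinction for anyone skimming, here is a minimal 
userspace sketch of what each const placement actually permits. The 
demo_* names and the struct are made up for illustration and are not 
the driver's real definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-channel data. */
struct demo_chip {
	uint32_t gpio_state[2];
	unsigned int gpio_width[2];
};

/* "T *const": the pointer cannot be re-aimed, the data stays writable. */
void demo_set_bit(struct demo_chip *chip, unsigned int ch, unsigned int bit)
{
	uint32_t *const state = chip->gpio_state;

	assert(ch < 2 && bit < 32);
	/* "state = 0;" would be rejected by the compiler here. */
	state[ch] |= UINT32_C(1) << bit;
}

/* "const T *": the data is read-only through this pointer. */
unsigned int demo_total_width(const struct demo_chip *chip)
{
	const unsigned int *width = chip->gpio_width;

	/* "width[0] = 0;" would be rejected by the compiler here. */
	return width[0] + width[1];
}
```

Only the second form protects the data, which is presumably what one 
would want for gpio_width; the first merely stops the local shorthand 
from being reassigned.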

> +	unsigned long offset, clump;
> +	size_t index;
> +
> +	DECLARE_BITMAP(old, 64);
> +	DECLARE_BITMAP(new, 64);
> +	DECLARE_BITMAP(changed, 64);
> +
> +	spin_lock_irqsave(&chip->gpio_lock[0], flags[0]);
> +	spin_lock_irqsave(&chip->gpio_lock[1], flags[1]);

Why _irqsave on the inner lock? (think about it...)
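
For anyone who would rather not think about it: by the time the second 
lock is taken, the first _irqsave has already disabled local 
interrupts, so the inner save can only ever record "disabled" and the 
matching restore changes nothing. A plain spin_lock()/spin_unlock() 
pair would do for the inner lock. A toy userspace model of that, 
assuming a single flag is an adequate stand-in for the CPU interrupt 
state (the model_* names are invented here and are not kernel APIs):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one flag stands in for the local interrupt-enable state. */
static bool model_irqs_enabled = true;

/* Models the "save" half: report the previous state, then disable. */
bool model_irq_save(void)
{
	bool was_enabled = model_irqs_enabled;

	model_irqs_enabled = false;
	return was_enabled;
}

/* Models the "restore" half: put back whatever was saved. */
void model_irq_restore(bool was_enabled)
{
	model_irqs_enabled = was_enabled;
}

bool model_irqs_on(void)
{
	return model_irqs_enabled;
}

/* Nest two save/restore pairs and return what the inner save observed. */
bool model_nested_save_sees_enabled(void)
{
	bool outer = model_irq_save();
	bool inner = model_irq_save();	/* interrupts are already off here */

	assert(!inner);			/* so this can never be "enabled" */
	model_irq_restore(inner);	/* no-op: still disabled */
	model_irq_restore(outer);	/* the only restore that matters */
	return inner;
}
```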

> +
> +	bitmap_set_value(old, state[0], 0, width[0]);
> +	bitmap_set_value(old, state[1], width[0], width[1]);
> +	bitmap_replace(new, old, bits, mask, gc->ngpio);
> +
> +	bitmap_set_value(old, state[0], 0, 32);
> +	bitmap_set_value(old, state[1], 32, 32);
> +	state[0] = bitmap_get_value(new, 0, width[0]);
> +	state[1] = bitmap_get_value(new, width[0], width[1]);
> +	bitmap_set_value(new, state[0], 0, 32);
> +	bitmap_set_value(new, state[1], 32, 32);
> +	bitmap_xor(changed, old, new, 64);
> +
> +	for_each_set_clump(offset, clump, changed, 64, 32) {
> +		index = offset / 32;
> +		xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> +				index * XGPIO_CHANNEL_OFFSET,
> +				state[index]);
>   	}

TBH this looks like a rather overcomplicated and horribly inefficient 
way of doing:

	if (((u32 *)changed)[0])
		xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
				state[0]);
	if (((u32 *)changed)[1])
		xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
				XGPIO_CHANNEL_OFFSET, state[1]);

(and doing the changed/state update itself one word at a time for each 
condition would probably be a fair bit more efficient in terms of 
minimising spilling to the stack on 32-bit machines)

I can see this API having merit if the clumps are a weird size or 
expected to be significantly sparse in the bitmap, but making 
out-of-line calls to an iterator which itself involves another 
out-of-line call and an integer division, all just to process two halves 
of a 64-bit value, seems... unnecessarily silly :/
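
As a rough illustration of the word-at-a-time idea, here is a userspace 
sketch rather than a drop-in replacement. Assumptions, all mine: the 
mask and bits arrive as a single 64-bit value with channel 0 in the low 
width[0] bits and channel 1 directly above it, a plain array stands in 
for xgpio_writereg(), and locking is left out entirely. The demo_* 
names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CHANNELS 2

/* Hypothetical model of the two-channel device state. */
struct demo_xgpio {
	uint32_t state[DEMO_CHANNELS];		/* cached output state */
	unsigned int width[DEMO_CHANNELS];	/* GPIOs per channel, <= 32 */
	uint32_t reg[DEMO_CHANNELS];		/* stand-in for the data registers */
	unsigned int writes;			/* number of register writes issued */
};

/* Extract 'width' bits starting at bit 'start' (width <= 32). */
static uint32_t demo_field(uint64_t v, unsigned int start, unsigned int width)
{
	uint64_t field_mask = (UINT64_C(1) << width) - 1;

	return (uint32_t)((v >> start) & field_mask);
}

/* Update each channel in turn; returns how many registers were written. */
unsigned int demo_set_multiple(struct demo_xgpio *chip, uint64_t mask,
			       uint64_t bits)
{
	unsigned int before = chip->writes;
	unsigned int start = 0;
	unsigned int ch;

	for (ch = 0; ch < DEMO_CHANNELS; ch++) {
		uint32_t m, b, old, next;

		assert(chip->width[ch] <= 32);
		m = demo_field(mask, start, chip->width[ch]);
		b = demo_field(bits, start, chip->width[ch]);
		old = chip->state[ch];
		next = (old & ~m) | (b & m);

		/* Only touch the hardware if this channel actually changed. */
		if (next != old) {
			chip->state[ch] = next;
			chip->reg[ch] = next;
			chip->writes++;
		}
		start += chip->width[ch];
	}
	return chip->writes - before;
}
```

Each channel needs only a handful of 32-bit operations and a compare, 
with no intermediate 64-bit bitmaps and no iterator calls.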

[drive-by review since I had a "packing small values into bitmaps" 
use-case and wondered if there might be anything interesting here]

Robin.

>   
> -	xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> -		       index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
> -
> -	spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
> +	spin_unlock_irqrestore(&chip->gpio_lock[1], flags[1]);
> +	spin_unlock_irqrestore(&chip->gpio_lock[0], flags[0]);
>   }
>   
>   /**
> 
