lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <53A94EBD.101@ozlabs.ru>
Date:	Tue, 24 Jun 2014 20:11:09 +1000
From:	Alexey Kardashevskiy <aik@...abs.ru>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Alex Williamson <alex.williamson@...hat.com>
CC:	linuxppc-dev@...ts.ozlabs.org, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org, Alexander Graf <agraf@...e.de>,
	Nikunj A Dadhania <nikunj@...ux.vnet.ibm.com>
Subject: Re: [PATCH] vfio: Fix endianness handling for emulated BARs

On 06/21/2014 09:12 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-06-19 at 21:21 -0600, Alex Williamson wrote:
> 
>> Working on big endian being an accident may be a matter of perspective
> 
>  :-)
> 
>> The comment remains that this patch doesn't actually fix anything except
>> the overhead on big endian systems doing redundant byte swapping and
>> maybe the philosophy that vfio regions are little endian.
> 
> Yes, that works by accident because technically VFIO is a transport and
> thus shouldn't perform any endian swapping of any sort, which remains
> the responsibility of the end driver which is the only one to know
> whether a given BAR location is a a register or some streaming data
> and in the former case whether it's LE or BE (some PCI devices are BE
> even ! :-)
> 
> But yes, in the end, it works with the dual "cancelling" swaps and the
> overhead of those swaps is probably drowned in the noise of the syscall
> overhead.
> 
>> I'm still not a fan of iowrite vs iowritebe, there must be something we
>> can use that doesn't have an implicit swap.
> 
> Sadly there isn't ... In the old day we didn't even have the "be"
> variant and readl/writel style accessors still don't have them either
> for all archs.
> 
> There is __raw_readl/writel but here the semantics are much more than
> just "don't swap", they also don't have memory barriers (which means
> they are essentially useless to most drivers unless those are platform
> specific drivers which know exactly what they are doing, or in the rare
> cases such as accessing a framebuffer which we know never have side
> effects). 
> 
>>  Calling it iowrite*_native is also an abuse of the namespace.
> 
> 
>>  Next thing we know some common code
>> will legitimately use that name. 
> 
> I might make sense to those definitions into a common header. There have
> been a handful of cases in the past that wanted that sort of "native
> byte order" MMIOs iirc (though don't ask me for examples, I can't really
> remember).
> 
>>  If we do need to define an alias
>> (which I'd like to avoid) it should be something like vfio_iowrite32.


Ping?

We need to make a decision whether to move those xxx_native() helpers
somewhere (where?) or leave the patch as is (as we figured out that
iowriteXX functions implement barriers and we cannot just use raw
accessors) and fix commit log to explain everything.

Thanks!




>> Thanks,
> 
> Cheers,
> Ben.
> 
>> Alex
>>
>>>> ===
>>>>
>>>> any better?
>>>>
>>>>
>>>>
>>>>
>>>>>>> Suggested-by: Benjamin Herrenschmidt <benh@...nel.crashing.org>
>>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@...abs.ru>
>>>>>>> ---
>>>>>>>  drivers/vfio/pci/vfio_pci_rdwr.c | 20 ++++++++++++++++----
>>>>>>>  1 file changed, 16 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
>>>>>>> index 210db24..f363b5a 100644
>>>>>>> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
>>>>>>> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
>>>>>>> @@ -21,6 +21,18 @@
>>>>>>>  
>>>>>>>  #include "vfio_pci_private.h"
>>>>>>>  
>>>>>>> +#ifdef __BIG_ENDIAN__
>>>>>>> +#define ioread16_native		ioread16be
>>>>>>> +#define ioread32_native		ioread32be
>>>>>>> +#define iowrite16_native	iowrite16be
>>>>>>> +#define iowrite32_native	iowrite32be
>>>>>>> +#else
>>>>>>> +#define ioread16_native		ioread16
>>>>>>> +#define ioread32_native		ioread32
>>>>>>> +#define iowrite16_native	iowrite16
>>>>>>> +#define iowrite32_native	iowrite32
>>>>>>> +#endif
>>>>>>> +
>>>>>>>  /*
>>>>>>>   * Read or write from an __iomem region (MMIO or I/O port) with an excluded
>>>>>>>   * range which is inaccessible.  The excluded range drops writes and fills
>>>>>>> @@ -50,9 +62,9 @@ static ssize_t do_io_rw(void __iomem *io, char __user *buf,
>>>>>>>  				if (copy_from_user(&val, buf, 4))
>>>>>>>  					return -EFAULT;
>>>>>>>  
>>>>>>> -				iowrite32(le32_to_cpu(val), io + off);
>>>>>>> +				iowrite32_native(val, io + off);
>>>>>>>  			} else {
>>>>>>> -				val = cpu_to_le32(ioread32(io + off));
>>>>>>> +				val = ioread32_native(io + off);
>>>>>>>  
>>>>>>>  				if (copy_to_user(buf, &val, 4))
>>>>>>>  					return -EFAULT;
>>>>>>> @@ -66,9 +78,9 @@ static ssize_t do_io_rw(void __iomem *io, char __user *buf,
>>>>>>>  				if (copy_from_user(&val, buf, 2))
>>>>>>>  					return -EFAULT;
>>>>>>>  
>>>>>>> -				iowrite16(le16_to_cpu(val), io + off);
>>>>>>> +				iowrite16_native(val, io + off);
>>>>>>>  			} else {
>>>>>>> -				val = cpu_to_le16(ioread16(io + off));
>>>>>>> +				val = ioread16_native(io + off);
>>>>>>>  
>>>>>>>  				if (copy_to_user(buf, &val, 2))
>>>>>>>  					return -EFAULT;
>>>>>>
>>>>>>
>>>>>>

-- 
Alexey
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ