Date:   Mon, 23 Jan 2017 09:44:35 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Alexander Duyck' <alexander.duyck@...il.com>
CC:     David Miller <davem@...emloft.net>,
        "maowenan@...wei.com" <maowenan@...wei.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "jeffrey.t.kirsher@...el.com" <jeffrey.t.kirsher@...el.com>
Subject: RE: [PATCH v2 net-next] net:add one common config
 ARCH_WANT_RELAX_ORDER to support relax ordering.

Alexander Duyck
> Sent: 19 January 2017 15:55
...
> >> The Relaxed Ordering attribute doesn't get applied across the board.
> >> It ends up being limited to a subset of the transactions if I recall
> >> correctly.  In this case it is the Tx descriptor write back, and the
> >> Rx data write back.  We don't apply the RO bit to any other
> >> transactions.
> >>
> >> In the case of Tx descriptor there is no harm in allowing it to be
> >> reordered because we only really read the DD bit so we don't care
> >> about the ordering of the write back.  In the case of the Rx data the
> >> Rx descriptor essentially acts as a flush since it is sent without the
> >> RO bit set.  So all the writes before it must be completed before the
> >> Rx descriptor write back.
> >
> > In which case why not set it unconditionally for all architectures?
> >
> > I'm surprised (I often am) that allowing those re-orderings makes
> > any significant difference.
> > Unfortunately you need a PCIe analyser to see what is really happening
> > and they don't come cheap.
> >
> > What I do vaguely remember is that some hosts don't always implement
> > the 'normal' re-ordering of reads and read completions.
> > Re-ordering of reads allows descriptor reads to overtake transmit
> > traffic which is likely to make a difference.
> 
> I think part of the issue, at least in the case of SPARC, is that the
> handling of the memory writes in the PCIe root complex is impacted by
> the RO attribute.  On the bus itself it doesn't matter much, but at
> the root complex it can become expensive to have to wait on a partial
> write to complete while there are other writes pending.  This is why
> the IOMMU for SPARC now has a WEAK_ORDERING attribute you can add so
> that it can write the data in whatever order it wants in relation to
> other writes in that region.

I hope the IOMMU only ever reorders writes that have the RO bit set.
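
To make that concern concrete, here is a minimal sketch of a simplified
posted-write ordering rule: a write with RO set may pass earlier writes, and a
write without RO may not pass any earlier write. The names Write, legal_orders
and always_last are hypothetical, made up for this illustration. The model is an
assumption that ignores reads, completions and everything else in the real PCIe
ordering rules, so treat it as a toy rather than a description of any root
complex or IOMMU.

```python
from itertools import permutations
from typing import NamedTuple


class Write(NamedTuple):
    """A posted memory write as issued by the device (hypothetical model)."""
    name: str
    ro: bool  # True if the Relaxed Ordering attribute is set on this write


def legal_orders(issued):
    """Yield every completion order allowed by the simplified rule.

    Assumed rule: for two writes issued as i then j, j may complete before i
    only if j carries the RO attribute.
    """
    n = len(issued)
    for perm in permutations(range(n)):
        # pos[k] is the completion position of the k-th issued write.
        pos = {idx: p for p, idx in enumerate(perm)}
        allowed = all(
            issued[j].ro or pos[i] < pos[j]
            for i in range(n)
            for j in range(i + 1, n)
        )
        if allowed:
            yield [issued[k].name for k in perm]


def always_last(issued, name):
    """True if the named write completes last in every legal order."""
    return all(order[-1] == name for order in legal_orders(issued))


# Rx data written with RO set, followed by the Rx descriptor without RO.
rx = [Write("data0", True), Write("data1", True), Write("rx_desc", False)]

# Same sequence, but the descriptor write is also allowed to be reordered.
rx_all_ro = [Write("data0", True), Write("data1", True), Write("rx_desc", True)]

if __name__ == "__main__":
    print(always_last(rx, "rx_desc"))
    print(always_last(rx_all_ro, "rx_desc"))
```

In this model the descriptor write acts as a flush only while it is ordered
strictly: the two data writes can swap with each other, but neither can land
after the descriptor. If whatever sits downstream reorders the non-RO write as
well, the descriptor can become visible before the data it describes.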

Has anyone tried cache invalidates on the rx buffers?
Might make the writes less expensive.
Or is the issue with NUMA rather than the cache?

	David
