netdev - Re: [PATCH v2 net-next] net:add one common config ARCH_WANT_RELAX

Date:   Mon, 23 Jan 2017 08:28:05 -0800
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     David Laight <David.Laight@...lab.com>,
        Tushar Dave <tushar.n.dave@...cle.com>
Cc:     David Miller <davem@...emloft.net>,
        "maowenan@...wei.com" <maowenan@...wei.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "jeffrey.t.kirsher@...el.com" <jeffrey.t.kirsher@...el.com>
Subject: Re: [PATCH v2 net-next] net:add one common config ARCH_WANT_RELAX_ORDER
 to support relax ordering.

On Mon, Jan 23, 2017 at 1:44 AM, David Laight <David.Laight@...lab.com> wrote:
> Alexander Duyck
>> Sent: 19 January 2017 15:55
> ...
>> >> The Relaxed Ordering attribute doesn't get applied across the board.
>> >> It ends up being limited to a subset of the transactions if I recall
>> >> correctly.  In this case it is the Tx descriptor write back, and the
>> >> Rx data write back.  We don't apply the RO bit to any other
>> >> transactions.
>> >>
>> >> In the case of Tx descriptor there is no harm in allowing it to be
>> >> reordered because we only really read the DD bit so we don't care
>> >> about the ordering of the write back.  In the case of the Rx data the
>> >> Rx descriptor essentially acts as a flush since it is sent without the
>> >> RO bit set.  So all the writes before it must be completed before the
>> >> Rx descriptor write back.
>> >
>> > In which case why not set it unconditionally for all architectures?
>> >
>> > I'm surprised (I often am) that allowing those re-orderings makes
>> > any significant difference.
>> > Unfortunately you need a PCIe analyser to see what is really happening
>> > and they don't come cheap.
>> >
>> > What I do vaguely remember is that some hosts don't always implement
>> > the 'normal' re-ordering of reads and read completions.
>> > Re-ordering of reads allows descriptor reads to overtake transmit
>> > traffic which is likely to make a difference.
>>
>> I think part of the issue, at least in the case of SPARC, is that the
>> handling of the memory writes in the PCIe root complex is impacted by
>> the RO attribute.  On the bus itself it doesn't matter much, but at
>> the root complex it can become expensive to have to wait on a partial
>> write to complete while there are other writes pending.  This is why
>> the IOMMU for SPARC now has a WEAK_ORDERING attribute you can add so
>> that it can write the data in whatever order it wants in relation to
>> other writes in that region.
>
> I hope the IOMMU only ever reorders writes that have the RO bit set.

I'm assuming it only applies to DMA regions mapped with
DMA_ATTR_WEAK_ORDERING.  Since drivers have to request that attribute
explicitly, it will likely only apply to DMA regions that could have
the RO bit set.
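
To make the ordering argument from earlier in the thread concrete (Rx
data is written with RO set, the Rx descriptor write back is sent
without it and so acts as a flush), here is a toy userspace sketch.  It
is not driver or IOMMU code.  The struct, the function name and the
simplified rule are all assumptions made for illustration: a write with
RO clear may not complete ahead of any write issued before it, while a
write with RO set may pass earlier writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one posted memory write; all names here are hypothetical. */
struct posted_write {
	int id;        /* position in issue order, 0 = issued first */
	bool relaxed;  /* true if the Relaxed Ordering (RO) bit is set */
};

/*
 * 'done' lists n writes in the order they completed.  Return true if that
 * order is legal under the simplified rule assumed above:
 *   - a write with RO set may pass writes issued before it;
 *   - a write with RO clear may not, so every write issued before it
 *     must already have completed.
 */
bool completion_order_is_legal(const struct posted_write *done, size_t n)
{
	assert(done != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (done[i].relaxed)
			continue;  /* RO write: allowed to pass earlier writes */

		/* Strictly ordered write: nothing issued earlier may finish later. */
		for (size_t j = i + 1; j < n; j++) {
			if (done[j].id < done[i].id)
				return false;
		}
	}
	return true;
}
```

With two Rx data writes (ids 0 and 1, RO set) followed by the Rx
descriptor write back (id 2, RO clear), the completion order 1, 0, 2 is
accepted and 0, 2, 1 is rejected, which is the "descriptor acts as a
flush" behaviour described above.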

> Has anyone tried cache invalidates on the rx buffers?
> Might make the writes less expensive.
> Or is the issue with NUMA rather than cache?

I don't know.  This is all very SPARC specific and I haven't done any
of the work on it.  You might try checking with those responsible for
introducing DMA_ATTR_WEAK_ORDERING for the SPARC architecture.

- Alex