netdev - Re: [PATCH RFC 2/2] virtio_net: enable probing for NEEDS

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAF=yD-+E_+AGtjM5MeXkre9T9BMK3KK06yn=zbuyiXexfTxp1Q@mail.gmail.com>
Date:   Tue, 29 Aug 2017 18:38:31 -0400
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        Jason Wang <jasowang@...hat.com>,
        virtualization@...ts.linux-foundation.org,
        Willem de Bruijn <willemb@...gle.com>,
        virtio-dev@...ts.oasis-open.org
Subject: Re: [PATCH RFC 2/2] virtio_net: enable probing for NEEDS_RESET support

On Tue, Aug 29, 2017 at 5:13 PM, Michael S. Tsirkin <mst@...hat.com> wrote:
> On Tue, Aug 29, 2017 at 05:02:29PM -0400, Willem de Bruijn wrote:
>> + virtio-dev
>>
>> On Tue, Aug 29, 2017 at 4:38 PM, Michael S. Tsirkin <mst@...hat.com> wrote:
>> > On Tue, Aug 29, 2017 at 04:27:41PM -0400, Willem de Bruijn wrote:
>> >> On Tue, Aug 29, 2017 at 4:16 PM, Michael S. Tsirkin <mst@...hat.com> wrote:
>> >> > On Tue, Aug 29, 2017 at 04:07:59PM -0400, Willem de Bruijn wrote:
>> >> >> From: Willem de Bruijn <willemb@...gle.com>
>> >> >>
>> >> >> Implement a mechanism to signal that a virtio device implements the
>> >> >> VIRTIO_CONFIG_S_NEEDS_RESET command.
>> >> >>
>> >> >> Testing for VIRTIO_CONFIG_S_NEEDS_RESET support by issuing the request
>> >> >> and verifying the reset state would require an expensive state change.
>> >> >>
>> >> >> To avoid that, add a status bit to the feature register used during
>> >> >> feature negotiation between host and guest.
>> >> >>
>> >> >> Set the bit for virtio-net, which supports the feature.
>> >> >>
>> >> >> Signed-off-by: Willem de Bruijn <willemb@...gle.com>
>> >> >
>> >> > All virtio 1 devices have the reset feature
>> >>
>> >> I don't quite follow. No device drivers implement this request currently?
>> >
>> > Depends. Spec 1.0 describes the bit and that driver can respond
>> > by reseting the device. You seem to do something else
>> > in this patchset, but as designed in 1.0 it does not seem to need
>> > a feature bit.
>>
>> I see. So support is designed to be best effort?
>>
>> The feature bit is only needed if driver support is optional.
>>
>> The ack response is necessary if the device acts differently
>> depending on whether the reset happened. The device has
>> to reset its local state, too, so I think that this is needed.
>
> That reset should only happen when guest driver resets the device.
> And spec already has a mechanism for that anyway.

And the device can read the ring state to see whether it has reset,
perhaps.

>>
>> >> > so maybe guest does
>> >> > not need this flag. Does device need it? Does device really
>> >> > care that guest can't recover?
>> >>
>> >> If all device drivers support it, then the flag is not needed.
>> >>
>> >> But as long as legacy device drivers can exist that do not support
>> >> this feature, it has to have a way of differentiating between the two.
>> >
>> > Why? Device won't set this unless it's in a bad state. In that case,
>> > setting the bit is harmless even if drivers ignore it.
>>
>> The goal is for the device to be able to rely on the driver reset to get
>> to a good state even if it gets it into a bad state.
>>
>> That allows it to implement optimizations that could get it into that bad
>> state.
>
> I see. I don't think this is what the need reset was designed for.
>
> We can extend it to cover this case but let's add a bit more
> documentation then.

Okay.

> And in particular won't it be better if we could just reset one ring,
> and not the whole device state? This might be nicer so flows on other
> rings aren't disrupted.

Indeed. But that would require a different request, then? It also
depends on the use case. A full device reset is a big hammer,
but if used only to get out of rare edge cases, it is good enough.

>>
>> In particular, in the edge case where the device performs backpressure,
>> takes the descriptor out of the avail ring and does not immediately post
>> it to the used ring.
>>
>> A reset will make the guest free all delayed packets and treat any
>> unsent and unacknowledged as network drops. This allows the
>> device to indeed drop long delayed packets when they eventually
>> surface (e.g., leave a qdisc queue).
>
> In this particular case, won't it be easier for device to just
> report all packets as used, without involving the guest?

When the device can just iterate over the outstanding packet, yeah,
that's actually simpler.

>> This is of course not safe with
>> zerocopy packets.
>
>
> I wonder if we can teach kernel to drop zero copy packets too.

Your point about changing frags[] underneath a cloned skb really does
make that hard. We might be able to mitigate individual specific cases
of high latency, such as TC queue occupancy.

>
> --
> MST