[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1521019321.8260.1.camel@redhat.com>
Date: Wed, 14 Mar 2018 10:22:01 +0100
From: Mohammed Gamal <mgamal@...hat.com>
To: Stephen Hemminger <stephen@...workplumber.org>
Cc: netdev@...r.kernel.org, sthemmin@...rosoft.com, otubo@...hat.com,
linux-kernel@...r.kernel.org, devel@...uxdriverproject.org,
vkuznets@...hat.com, davem@...emloft.net
Subject: Re: [PATCH] hv_netvsc: Make sure out channel is fully opened on send
On Tue, 2018-03-13 at 12:35 -0700, Stephen Hemminger wrote:
> On Tue, 13 Mar 2018 20:06:50 +0100
> Mohammed Gamal <mgamal@...hat.com> wrote:
>
> > Dring high network traffic changes to network interface parameters
> > such as number of channels or MTU can cause a kernel panic with a
> > NULL
> > pointer dereference. This is due to netvsc_device_remove() being
> > called and deallocating the channel ring buffers, which can then be
> > accessed by netvsc_send_pkt() before they're allocated on calling
> > netvsc_device_add()
> >
> > The patch fixes this problem by checking the channel state and
> > returning
> > ENODEV if not yet opened. We also move the call to
> > hv_ringbuf_avail_percent()
> > which may access the uninitialized ring buffer.
> >
> > Signed-off-by: Mohammed Gamal <mgamal@...hat.com>
> > ---
> > drivers/net/hyperv/netvsc.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc.c
> > b/drivers/net/hyperv/netvsc.c
> > index 0265d70..44a8358 100644
> > --- a/drivers/net/hyperv/netvsc.c
> > +++ b/drivers/net/hyperv/netvsc.c
> > @@ -757,7 +757,7 @@ static inline int netvsc_send_pkt(
> > struct netdev_queue *txq = netdev_get_tx_queue(ndev,
> > packet->q_idx);
> > u64 req_id;
> > int ret;
> > - u32 ring_avail = hv_ringbuf_avail_percent(&out_channel-
> > >outbound);
> > + u32 ring_avail;
> >
> > nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT;
> > if (skb)
> > @@ -773,7 +773,7 @@ static inline int netvsc_send_pkt(
> >
> > req_id = (ulong)skb;
> >
> > - if (out_channel->rescind)
> > + if (out_channel->rescind || out_channel->state !=
> > CHANNEL_OPENED_STATE)
> > return -ENODEV;
> >
> > if (packet->page_buf_cnt) {
> > @@ -791,6 +791,7 @@ static inline int netvsc_send_pkt(
> > VMBUS_DATA_PACKET_FLAG_COMP
> > LETION_REQUESTED);
> > }
> >
> > + ring_avail = hv_ringbuf_avail_percent(&out_channel-
> > >outbound);
> > if (ret == 0) {
> > atomic_inc_return(&nvchan->queue_sends);
> >
>
> Thanks for your patch. Yes there are races with the current update
> logic. The root cause goes higher up in the flow; the send queues
> should
> be stopped before netvsc_device_remove is called. Solving it where
> you tried
> to is racy and not going to work reliably.
>
> Network patches should go to netdev@...r.kernel.org
>
> You can't move the ring_avail check until after the vmbus_sendpacket
> because
> that will break the flow control logic.
>
Why? I don't see ring_avail being used before that point.
> Instead, you should just move the avail_read check until just after
> the existing rescind
> check.
>
> Also, you shouldn't need to check for OPENED_STATE, just rescind is
> enough.
That rarely mitigated the race. channel->rescind flag is set on vmbus
exit - called on module unload - and when a rescind offer is received
from the host, which AFAICT doesn't happen on every call to
netvsc_device_remove, so it's quite possible that the ringbuffer is
accessed before it's allocated again on channel open and hence the
check for OPENED_STAT - which is only set after all vmbus data is
initialized.
>
>
>
>
Powered by blists - more mailing lists