lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 29 Jan 2019 19:20:28 +0000
From:   Dexuan Cui <>
To:     kimbrownkd <>
CC:     Michael Kelley <>,
        Long Li <>,
        Sasha Levin <>,
        Stephen Hemminger <>,
        KY Srinivasan <>,
        Haiyang Zhang <>,
        "" <>,
        "" <>
Subject: RE: [PATCH] Drivers: hv: vmbus: Add mutex lock to channel show

> From: Kimberly Brown <>
> > ... 
> > But as you pointed, at least for sub-channels, channel->ringbuffer_page
> > can indeed disappear in vmbus_close() -> ... -> vmbus_free_ring(), and
> > the "attribute->show()" could crash when the race happens.
> > Adding channel_mutex here seems to be able to fix the race for
> > sub-channels, as the same channel_mutex is used in
> vmbus_disconnect_ring().
> >
> > For a primary channel, vmbus_close() -> vmbus_free_ring() can still
> > free channel->ringbuffer_page without notifying the "attribute->show()".
> > We may also need to acquire/release the channel_mutex in vmbus_close()?
> >
> > > Actually, there should be checks that "chan" is not null and that
> > > "chan->state" is CHANNEL_OPENED_STATE within the locked section. I'll
> > > need to fix that.
> > I suppose "chan" can not be NULL here (see the above).
> >
> > Checking "chan->state" may not help to completely fix the race, because
> > there is no locking/synchronization code in
> > vmbus_close_internal() when we test and change "channel->state".
> >
> The calls to vmbus_close_internal() for the subchannels and the primary
> channel are protected with channel_mutex in vmbus_disconnect_ring().
> This prevents "channel->state" from changing while "attribute->show()" is
> running.
Ah, I think you're right. 
> > I guess we may need to check if channel->ringbuffer_page is NULL in
> > the "attribute->show()".
> >
> For the primary channel, vmbus_free_ring() is called after the
> return from vmbus_disconnect_ring(). Therefore, the primary channel's
> state is changed before "channel->ringbuffer_page" is set to NULL.
> Checking the channel state should be sufficient to prevent the ring
> buffers from being freed while "attribute->show()" is running. The
> ring buffers can't be freed until the channel's state is changed, and
> the channel state change is protected by the mutex.
I think you're right (I noticed in a previous mail you mentioned you would
improve your patch to check "chan->state" with the mutax held).

> I think checking that "channel->ringbuffer_page" is not NULL would
> also work, but, as you stated, we would need to aquire/release
> channel_mutex in vmbus_close().
Then it looks unnecessary to check "channel->ringbuffer_page".
> > PS, to prove that a race condition does exist and can really cause a panic or
> > something, I usually add some msleep() delays in different paths so that I
> > can reproduce the crash every time I take a special action, e.g. when I read
> > the sysfs files of a NIC, I try to remove hv_netvsc driver. This way, I can prove
> > a patch can indeed help, at least it can fix the crash which would happen
> > without the patch. :-)
> >
> Thanks! I was able to free the ring buffers while "attribute->show()"
> was running, which caused a null pointer dereference bug. As expected,
> the mutex lock fixed it.

-- Dexuan

Powered by blists - more mailing lists