Message-ID: <20200403133826.GA25401@andrea>
Date: Fri, 3 Apr 2020 15:38:26 +0200
From: Andrea Parri <parri.andrea@...il.com>
To: Vitaly Kuznetsov <vkuznets@...hat.com>
Cc: linux-kernel@...r.kernel.org,
"K . Y . Srinivasan" <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Wei Liu <wei.liu@...nel.org>, linux-hyperv@...r.kernel.org,
Michael Kelley <mikelley@...rosoft.com>,
Dexuan Cui <decui@...rosoft.com>,
Boqun Feng <boqun.feng@...il.com>
Subject: Re: [RFC PATCH 03/11] Drivers: hv: vmbus: Replace the per-CPU
channel lists with a global array of channels

On Mon, Mar 30, 2020 at 02:45:54PM +0200, Vitaly Kuznetsov wrote:
> Andrea Parri <parri.andrea@...il.com> writes:
>
> >> Correct me if I'm wrong, but currently vmbus_chan_sched() accesses
> >> per-cpu list of channels on the same CPU so we don't need a spinlock to
> >> guarantee that during an interrupt we'll be able to see the update if it
> >> happened before the interrupt (in chronological order). With a global
> >> list of relids, who guarantees that an interrupt handler on another CPU
> >> will actually see the modified list?
> >
> > Thanks for pointing this out!
> >
> > The offer/resume path presents implicit full memory barriers, program
> > -order after the array store which should guarantee the visibility of
> > the store to *all* CPUs before the offer/resume can complete (c.f.,
> >
> > tools/memory-model/Documentation/explanation.txt, Sect. #13
> >
> > and assuming that the offer/resume for a channel must complete before
> > the corresponding handler, which seems to be the case considered that
> > some essential channel fields are initialized only later...)
> >
> > IIUC, the spin lock approach you suggested will work and be "simpler";
> > an obvious side effect would be, well, a global synchronization point
> > in vmbus_chan_sched()...
> >
> > Thoughts?
>
> This is, of course, very theoretical as if we're seeing an interrupt for
> a channel at the same time we're writing its relid we're already in
> trouble. I can, however, try to suggest one tiny improvement:

Indeed.  I think the idea (still quite informal) is that:

  1) the mapping of the channel relid is propagated to (visible from)
     all CPUs before add_channel_work is queued (full barrier in
     queue_work()),

  2) add_channel_work is queued before the channel is opened (aka,
     before the channel ring buffer is allocated/initialized and the
     OPENCHANNEL msg is sent and acked by Hyper-V, cf. OPEN_STATE),

  3) the channel is opened before Hyper-V can start sending interrupts
     for the channel, and hence before vmbus_chan_sched() can find the
     channel's relid set in recv_int_page,

  4) vmbus_chan_sched() finds the channel's relid set in recv_int_page
     before it searches/loads from the channel array (full barrier
     in sync_test_and_clear_bit()).
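
To make the chain 1)-4) a bit more concrete, here is a rough userspace
model using C11 atomics.  It is only a sketch and not the driver code:
all the model_* names are made up, a single 64-bit word stands in for
recv_int_page, the seq_cst fence stands in for the full barrier in
queue_work(), and the seq_cst fetch-and stands in for
sync_test_and_clear_bit().  Steps 2) and 3) are not modelled; the caller
is simply assumed to signal the relid only after the mapping is done.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_RELIDS 64	/* hypothetical bound, one bit per relid */

struct model_channel {
	uint32_t relid;
};

/* Global relid -> channel array (replaces the per-CPU lists). */
static _Atomic(struct model_channel *) model_channels[MODEL_MAX_RELIDS];

/* One-word stand-in for recv_int_page. */
static _Atomic uint64_t model_recv_int_page;

/* Step 1: store the mapping, then a full barrier (models queue_work()). */
static void model_map_relid(struct model_channel *ch)
{
	assert(ch->relid < MODEL_MAX_RELIDS);
	atomic_store_explicit(&model_channels[ch->relid], ch,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

/* Step 3: the host marks the relid as pending (assumed to follow the open). */
static void model_host_signal(uint32_t relid)
{
	assert(relid < MODEL_MAX_RELIDS);
	atomic_fetch_or(&model_recv_int_page, UINT64_C(1) << relid);
}

/*
 * Step 4: test-and-clear the pending bit (fully ordered RMW), and only
 * then load from the channel array.  Returns NULL if the bit was not set.
 */
static struct model_channel *model_chan_sched(uint32_t relid)
{
	uint64_t mask;

	assert(relid < MODEL_MAX_RELIDS);
	mask = UINT64_C(1) << relid;

	if (!(atomic_fetch_and(&model_recv_int_page, ~mask) & mask))
		return NULL;

	return atomic_load_explicit(&model_channels[relid],
				    memory_order_relaxed);
}
```

In this model the load in model_chan_sched() can only run after the
pending bit has been observed, and the bit can only be set after the
fence following the array store, which is the ordering argued for above.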

This is for the "normal" (not resuming from hibernation) case; for the
latter, notice that:

  a) vmbus_isr() (and vmbus_chan_sched()) cannot run until
     vmbus_bus_resume() has finished (@resume_noirq callback),

  b) vmbus_bus_resume() cannot complete before nr_chan_fixup_on_resume
     equals 0 in check_ready_for_resume_event()

(and check_ready_for_resume_event() also provides a full barrier).

If this makes sense to you, I'll try to add some of the above in comments.

Thanks,
Andrea
>
> vmbus_chan_sched() now clean the bit in the event page and then searches
> for a channel with this relid; in case we allow the search to
> (temporary) fail we can reverse the logic: search for the channel and
> clean the bit only if we succeed. In case we fail, next time (next IRQ)
> we'll try again and likely succeed. The only purpose is to make sure no
> interrupts are ever lost. This may be an overkill, we may want to try
> to count how many times (if ever) this happens.
>
> Just a thought though.
>
> --
> Vitaly
>
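
For reference, the reversed logic suggested above (search for the
channel first, clear the bit only if the search succeeds) could be
modelled roughly as follows.  Again, this is a userspace sketch with C11
atomics and not the actual vmbus_chan_sched(): the retry_* names, the
one-word interrupt page and the miss counter are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RETRY_MAX_RELIDS 64	/* hypothetical bound, one bit per relid */

struct retry_channel {
	uint32_t relid;
	unsigned int callbacks;	/* how many times the channel was serviced */
};

static _Atomic(struct retry_channel *) retry_channels[RETRY_MAX_RELIDS];
static _Atomic uint64_t retry_int_page;	/* stand-in for the event page */
static atomic_uint retry_misses;	/* counts failed lookups */

/*
 * Look the channel up first; clear the pending bit only on success.
 * On a failed lookup the bit is left set, so a later pass can retry.
 * Returns true if the channel was found and serviced.
 */
static bool retry_chan_sched(uint32_t relid)
{
	struct retry_channel *ch;
	uint64_t mask;

	assert(relid < RETRY_MAX_RELIDS);
	mask = UINT64_C(1) << relid;

	if (!(atomic_load(&retry_int_page) & mask))
		return false;		/* nothing pending for this relid */

	ch = atomic_load(&retry_channels[relid]);
	if (ch == NULL) {
		atomic_fetch_add(&retry_misses, 1);
		return false;		/* keep the bit set, try again later */
	}

	atomic_fetch_and(&retry_int_page, ~mask);
	ch->callbacks++;		/* stands in for running the callback */
	return true;
}
```

The counter is there only to answer the "how many times (if ever)"
question; if it never moves, the extra lookup-before-clear step is
probably not buying anything.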