Message-ID: <MW2PR2101MB10521D93B6CDE4D7D9C435E3D7CA0@MW2PR2101MB1052.namprd21.prod.outlook.com>
Date: Sun, 29 Mar 2020 03:49:06 +0000
From: Michael Kelley <mikelley@...rosoft.com>
To: Andrea Parri <parri.andrea@...il.com>,
vkuznets <vkuznets@...hat.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
KY Srinivasan <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Wei Liu <wei.liu@...nel.org>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
Dexuan Cui <decui@...rosoft.com>,
Boqun Feng <boqun.feng@...il.com>
Subject: RE: [RFC PATCH 03/11] Drivers: hv: vmbus: Replace the per-CPU channel
lists with a global array of channels
From: Andrea Parri <parri.andrea@...il.com> Sent: Saturday, March 28, 2020 11:22 AM
>
> > Correct me if I'm wrong, but currently vmbus_chan_sched() accesses
> > per-cpu list of channels on the same CPU so we don't need a spinlock to
> > guarantee that during an interrupt we'll be able to see the update if it
> > happened before the interrupt (in chronological order). With a global
> > list of relids, who guarantees that an interrupt handler on another CPU
> > will actually see the modified list?
>
> Thanks for pointing this out!
>
> The offer/resume path presents implicit full memory barriers, program-order
> after the array store, which should guarantee the visibility of
> the store to *all* CPUs before the offer/resume can complete (c.f.,
>
> tools/memory-model/Documentation/explanation.txt, Sect. #13
>
> and assuming that the offer/resume for a channel must complete before
> the corresponding handler, which seems to be the case considered that
> some essential channel fields are initialized only later...)
>
> IIUC, the spin lock approach you suggested will work and be "simpler";
> an obvious side effect would be, well, a global synchronization point
> in vmbus_chan_sched()...
>
> Thoughts?
>
Note that this global array is accessed overwhelmingly with reads. Once
the system is initialized, channels only rarely come or go, so writes will
be rare. So the array can be cached in all CPUs, and we need to avoid
any global synchronization points. Leveraging the full semantics of the
memory model (across all architectures) seems like the right approach
to preserve a high level of concurrency.
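
To make the read-mostly argument concrete, here is a rough user-space
sketch using C11 atomics. It is not the code from the patch: the names
MAX_RELIDS, struct channel, chan_publish(), chan_unpublish() and
chan_lookup() are all hypothetical, and the release/acquire pair only
stands in for whatever ordering the offer/resume path actually provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_RELIDS 16	/* hypothetical bound, for illustration only */

/* Hypothetical stand-in for the real channel structure. */
struct channel {
	int relid;
};

/* Global, read-mostly array indexed by relid. */
static _Atomic(struct channel *) channels[MAX_RELIDS];

/*
 * Rare write path: publish a fully initialized channel.
 * The release store orders the channel's initialization before
 * the pointer becomes visible to readers on other CPUs.
 */
static void chan_publish(struct channel *ch)
{
	assert(ch->relid >= 0 && ch->relid < MAX_RELIDS);
	atomic_store_explicit(&channels[ch->relid], ch,
			      memory_order_release);
}

/* Rare write path: remove a channel from the array. */
static void chan_unpublish(int relid)
{
	assert(relid >= 0 && relid < MAX_RELIDS);
	atomic_store_explicit(&channels[relid], NULL,
			      memory_order_release);
}

/*
 * Hot read path: no lock and no store to shared memory, so the
 * array's cache lines can stay shared across all CPUs.
 * Returns NULL for an out-of-range or unpublished relid.
 */
static struct channel *chan_lookup(int relid)
{
	if (relid < 0 || relid >= MAX_RELIDS)
		return NULL;
	return atomic_load_explicit(&channels[relid],
				    memory_order_acquire);
}
```

The reader never writes to shared state, which is what avoids a global
synchronization point. The sketch does not address how a writer would
know that no reader is still using a channel after chan_unpublish();
a real implementation would need to handle that separately.
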
Michael