Message-ID: <20150921164330.GD26912@kroah.com>
Date: Mon, 21 Sep 2015 09:43:30 -0700
From: Greg KH <gregkh@...uxfoundation.org>
To: KY Srinivasan <kys@...rosoft.com>
Cc: Olaf Hering <olaf@...fle.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
"apw@...onical.com" <apw@...onical.com>,
"vkuznets@...hat.com" <vkuznets@...hat.com>,
"jasowang@...hat.com" <jasowang@...hat.com>
Subject: Re: [PATCH 2/5] hv: add helpers to handle hv_util device state
On Mon, Sep 21, 2015 at 04:34:56PM +0000, KY Srinivasan wrote:
>
>
> > -----Original Message-----
> > From: Olaf Hering [mailto:olaf@...fle.de]
> > Sent: Monday, September 21, 2015 3:26 AM
> > To: KY Srinivasan <kys@...rosoft.com>; Greg KH
> > <gregkh@...uxfoundation.org>
> > Cc: linux-kernel@...r.kernel.org; devel@...uxdriverproject.org;
> > apw@...onical.com; vkuznets@...hat.com; jasowang@...hat.com
> > Subject: Re: [PATCH 2/5] hv: add helpers to handle hv_util device state
> >
> > On Sun, Sep 20, Greg KH wrote:
> >
> > > Just use a lock, that's what it is there for.
> >
> > How would that help? It might help because it enforces ordering. But
> > that requires that all three utils get refactored to deal with the
> > introduced locking. I will let KY comment on this.
> >
> > The issue I see with fcopy is that after or while fcopy_respond_to_host
> > runs, an interrupt triggers which also calls into
> > hv_fcopy_onchannelcallback. It was most likely caused by a logic change
> > in "recent" vmbus updates because this did not happen before. At least,
> > the fcopy hang was not seen earlier. Maybe the bug just did not trigger
> > up to now for other reasons...
>
> All util channels are bound to CPU 0. Just forcing all activity on CPU 0 may be the
> simplest solution here. Besides, these are not performance critical services anyway.
>
> The problem you may have run into could be related to the fact that we could potentially
> run the polling function on a CPU other than CPU 0.

Again, this sounds like a locking issue: you have multiple
threads/processes accessing the same data. Even if you bind it all to
one CPU, this shows a real design problem.

Use a lock to fix this properly. That way, when you stop using only one
CPU, the code will "just work", and if you are really only on one CPU
today, there will not be any lock contention.

thanks,

greg k-h
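
The locking approach suggested above can be illustrated with a minimal userspace sketch. It is not the hv_util code: every name in it (demo_dev, demo_try_begin, demo_finish, demo_run) is hypothetical, and a pthread mutex stands in for whatever kernel primitive the driver would actually use (a path that runs in interrupt context would likely need a spinlock rather than a sleeping lock). The idea shown is only that the state check and the state change happen under one lock, so two contexts can never both claim the device, regardless of which CPU they run on.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical device state; names are illustrative, not from hv_util. */
enum demo_state { DEMO_IDLE, DEMO_BUSY };

struct demo_dev {
    pthread_mutex_t lock;    /* protects state and handled */
    enum demo_state state;
    long handled;            /* transactions started so far */
};

static void demo_init(struct demo_dev *d)
{
    int rc = pthread_mutex_init(&d->lock, NULL);

    assert(rc == 0);
    (void)rc;
    d->state = DEMO_IDLE;
    d->handled = 0;
}

/* Callback path: claim the device only if it is idle. Returns 1 on success. */
static int demo_try_begin(struct demo_dev *d)
{
    int claimed = 0;

    pthread_mutex_lock(&d->lock);
    if (d->state == DEMO_IDLE) {
        d->state = DEMO_BUSY;
        d->handled++;
        claimed = 1;
    }
    pthread_mutex_unlock(&d->lock);
    return claimed;
}

/* Response path: release the device so the next request can be accepted. */
static void demo_finish(struct demo_dev *d)
{
    pthread_mutex_lock(&d->lock);
    assert(d->state == DEMO_BUSY);   /* only the claimant may finish */
    d->state = DEMO_IDLE;
    pthread_mutex_unlock(&d->lock);
}

struct demo_job {
    struct demo_dev *dev;
    long rounds;
};

/* Each worker completes job->rounds transactions, retrying while busy. */
static void *demo_worker(void *arg)
{
    struct demo_job *job = arg;

    for (long i = 0; i < job->rounds; i++) {
        while (!demo_try_begin(job->dev))
            sched_yield();
        demo_finish(job->dev);
    }
    return NULL;
}

/* Run two competing threads; returns the total number of transactions. */
static long demo_run(long rounds)
{
    struct demo_dev dev;
    struct demo_job job = { &dev, rounds };
    pthread_t a, b;

    demo_init(&dev);
    pthread_create(&a, NULL, demo_worker, &job);
    pthread_create(&b, NULL, demo_worker, &job);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_mutex_destroy(&dev.lock);
    return dev.handled;
}
```

Each of the two threads completes `rounds` transactions, so demo_run(1000) returns 2000 whether the threads share one CPU or run on several. When only one thread ever touches the device, the lock is simply uncontended.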