Message-ID: <FB42D5CCD7B5934EB1827DB5ED9B850E07085D7F@TK5EX14MBXC104.redmond.corp.microsoft.com>
Date:	Tue, 15 Feb 2011 16:22:20 +0000
From:	KY Srinivasan <kys@...rosoft.com>
To:	Greg KH <gregkh@...e.de>
CC:	Jiri Slaby <jirislaby@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
	"virtualization@...ts.osdl.org" <virtualization@...ts.osdl.org>
Subject: RE: [PATCH 2/3]: Staging: hv: Use native wait primitives



> -----Original Message-----
> From: Greg KH [mailto:gregkh@...e.de]
> Sent: Tuesday, February 15, 2011 9:03 AM
> To: KY Srinivasan
> Cc: Jiri Slaby; linux-kernel@...r.kernel.org; devel@...uxdriverproject.org;
> virtualization@...ts.osdl.org
> Subject: Re: [PATCH 2/3]: Staging: hv: Use native wait primitives
> 
> On Tue, Feb 15, 2011 at 01:35:56PM +0000, KY Srinivasan wrote:
> >
> >
> > > -----Original Message-----
> > > From: Jiri Slaby [mailto:jirislaby@...il.com]
> > > Sent: Tuesday, February 15, 2011 4:21 AM
> > > To: KY Srinivasan
> > > Cc: gregkh@...e.de; linux-kernel@...r.kernel.org;
> > > devel@...uxdriverproject.org; virtualization@...ts.osdl.org
> > > Subject: Re: [PATCH 2/3]: Staging: hv: Use native wait primitives
> > >
> > > On 02/11/2011 06:59 PM, K. Y. Srinivasan wrote:
> > > > > In preparation for getting rid of the osd layer; change
> > > > the code to use native wait interfaces. As part of this,
> > > > fixed the buggy implementation in the osd_wait_primitive
> > > > where the condition was cleared potentially after the
> > > > condition was signalled.
> > > ...
> > > > @@ -566,7 +567,11 @@ int vmbus_establish_gpadl(struct vmbus_channel
> > > *channel, void *kbuffer,
> > > >
> > > >  		}
> > > >  	}
> > > > -	osd_waitevent_wait(msginfo->waitevent);
> > > > +	wait_event_timeout(msginfo->waitevent,
> > > > +				msginfo->wait_condition,
> > > > +				msecs_to_jiffies(1000));
> > > > +	BUG_ON(msginfo->wait_condition == 0);
> > >
> > > The added BUG_ONs all over the code look scary. These shouldn't be
> > > BUG_ONs at all. You should maybe warn and bail out, but not kill the
> > > whole machine.
> >
> > This is Linux code running as a guest on a Windows host; and so the
> > guest cannot tolerate a failure of the host. In the cases where I have
> > chosen to BUG_ON, there is no reasonable recovery possible when the
> > host is non-functional (as determined by a non-responsive host).
> 
> If you have a non-responsive host, wouldn't that imply that this guest
> code wouldn't run at all?  :)

The fact that the host has not responded to a particular transaction
within the expected time interval does not necessarily mean that the
guest code would not be running. Issues on the host side, either
transient or permanent, may cause problems like this. Keep in mind that
Hyper-V is a type 1 hypervisor that schedules all VMs, including the
host, so the guest would still get scheduled.

> 
> Having BUG_ON() in drivers is not a good idea either way.  Please remove
> these in future patches.

In situations where there is no reasonable rollback strategy (for
instance, in one of the cases we are granting the host access to the
guest physical pages), we really have only two options:

1) Wait until the host responds. This wait could potentially be unbounded,
and in fact this was the way the code was to begin with. One of the
reviewers had suggested that the unbounded wait should be corrected.
2) Wait for a specific period and, if the host does not respond
within that period, kill the guest since there is no recovery
possible.

I chose option 2 as part of addressing some of the prior review
comments. If the consensus now is to go back to option 1, I am fine with
that; I will send you a patch to rectify this.
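
To make the two options concrete, here is a rough user-space sketch; it
is not the hv code and not part of the patch. It uses POSIX threads in
place of the kernel wait primitives, and struct msg_wait and the
msg_wait_*() helpers are hypothetical names made up for this
illustration. The bounded variant only reports the timeout; whether the
caller then kills the guest or tries something else is the policy
question discussed above.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the per-message wait state in the patch. */
struct msg_wait {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;		/* plays the role of wait_condition */
};

void msg_wait_init(struct msg_wait *w)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->done = false;
}

/* Called by the responder (the "host" in this analogy). */
void msg_wait_signal(struct msg_wait *w)
{
	pthread_mutex_lock(&w->lock);
	w->done = true;		/* the flag stays set, so an early signal is not lost */
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/* Option 1: wait until the response arrives, however long that takes. */
void msg_wait_forever(struct msg_wait *w)
{
	pthread_mutex_lock(&w->lock);
	while (!w->done)
		pthread_cond_wait(&w->cond, &w->lock);
	pthread_mutex_unlock(&w->lock);
}

/* Option 2: bounded wait. Returns 0 on response, -ETIMEDOUT otherwise. */
int msg_wait_timeout(struct msg_wait *w, long timeout_ms)
{
	struct timespec deadline;
	bool done;
	int rc = 0;

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&w->lock);
	while (!w->done && rc == 0)
		rc = pthread_cond_timedwait(&w->cond, &w->lock, &deadline);
	done = w->done;		/* re-check the flag: a late signal still counts */
	pthread_mutex_unlock(&w->lock);

	return done ? 0 : -ETIMEDOUT;
}
```

In this sketch the waiter never clears the flag, and it only tests the
flag while holding the lock, so a response that arrives before the
waiter goes to sleep is still seen.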

Regards,

K. Y
  
> 
> > > And looking at the code, more appropriate would be completion instead of
> > > wait events.
> > >
> > > And msecs_to_jiffies(1000) == HZ.
> >
> > Agreed. In this first round of cleanup, I chose to keep the primitives
> > as they were in osd.c. Greg, if it is ok with you, I will send you a
> > patch that fixes these issues on top of the patches I have already
> > sent.
> 
> Yes, that is fine.
> 
> thanks,
> 
> greg k-h

