lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <597A1A30-E61B-484B-AB79-31AB83AE7A50@oracle.com>
Date:	Mon, 1 Nov 2010 17:12:48 -0400
From:	Chuck Lever <chuck.lever@...cle.com>
To:	Trond Myklebust <Trond.Myklebust@...app.com>
Cc:	Nick Bowler <nbowler@...iptictech.com>,
	LKML Kernel <linux-kernel@...r.kernel.org>,
	"J. Bruce Fields" <bfields@...hat.com>,
	Linux NFS Mailing List <linux-nfs@...r.kernel.org>
Subject: Re: Regression, bisected: sqlite locking failure on nfs


On Nov 1, 2010, at 4:35 PM, Trond Myklebust wrote:

> On Mon, 2010-11-01 at 15:55 -0400, Chuck Lever wrote:
>> On Nov 1, 2010, at 3:48 PM, Trond Myklebust wrote:
>> 
>>> On Mon, 2010-11-01 at 15:45 -0400, Chuck Lever wrote:
>>>> On Nov 1, 2010, at 3:42 PM, Trond Myklebust wrote:
>>>> 
>>>>> On Mon, 2010-11-01 at 15:22 -0400, Trond Myklebust wrote:
>>>>>> I suspect nlmclnt_lookup_host() is to blame. It appears to be the _only_
>>>>>> thing in the kernel that actually sets this 'srcaddr' field, and it sets
>>>>>> it to
>>>>>> 
>>>>>> const struct sockaddr source = {
>>>>>> 	.sa_family      = AF_UNSPEC,
>>>>>> };
>>>>>> 
>>>>>> You triggered the bug by removing the line
>>>>>> 
>>>>>> 	transport->srcaddr.ss_family = family;
>>>>>> 
>>>>>> from xs_create_sock().
>>>>>> 
>>>>>> Trond
>>>>> 
>>>>> Does this fix the regression?
>>>>> 
>>>>> Trond
>>>>> 
>>>>> ----------------------------------------------------------------------------------------------
>>>>> NLM: Fix a regression in lockd
>>>>> 
>>>>> From: Trond Myklebust <Trond.Myklebust@...app.com>
>>>>> 
>>>>> Nick Bowler reports:
>>>>> There are no unusual messages on the client... but I just logged into
>>>>> the server and I see lots of messages of the following form:
>>>>> 
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> 
>>>>> Bisected to commit 9247685088398cf21bcb513bd2832b4cd42516c4 (SUNRPC:
>>>>> Properly initialize sock_xprt.srcaddr in all cases)
>>>>> 
>>>>> Apparently, removing the 'transport->srcaddr.ss_family = family' from
>>>>> xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly
>>>>> initialising the srcaddr family to AF_UNSPEC.
>>>>> 
>>>>> Reported-by: Nick Bowler <nbowler@...iptictech.com>
>>>>> Signed-off-by: Trond Myklebust <Trond.Myklebust@...app.com>
>>>>> ---
>>>>> 
>>>>> fs/lockd/host.c |    5 -----
>>>>> 1 files changed, 0 insertions(+), 5 deletions(-)
>>>>> 
>>>>> 
>>>>> diff --git a/fs/lockd/host.c b/fs/lockd/host.c
>>>>> index 25e21e4..9ff0c0e 100644
>>>>> --- a/fs/lockd/host.c
>>>>> +++ b/fs/lockd/host.c
>>>>> @@ -238,9 +238,6 @@ struct nlm_host *nlmclnt_lookup_host(const struct sockaddr *sap,
>>>>> 				     const char *hostname,
>>>>> 				     int noresvport)
>>>>> {
>>>>> -	const struct sockaddr source = {
>>>>> -		.sa_family	= AF_UNSPEC,
>>>>> -	};
>>>>> 	struct nlm_lookup_host_info ni = {
>>>>> 		.server		= 0,
>>>>> 		.sap		= sap,
>>>>> @@ -249,8 +246,6 @@ struct nlm_host *nlmclnt_lookup_host(const struct sockaddr *sap,
>>>>> 		.version	= version,
>>>>> 		.hostname	= hostname,
>>>>> 		.hostname_len	= strlen(hostname),
>>>>> -		.src_sap	= &source,
>>>>> -		.src_len	= sizeof(source),
>>>>> 		.noresvport	= noresvport,
>>>>> 	};
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> What about that memcpy() in nlm_lookup_host()?  With this patch, wouldn't it be copying garbage into the host's srcaddr field?
>>>> 
>>> 
>>> It shouldn't. ni->src_len is now zero.
>> 
>> Yech.  All this still assumes that ANYADDR is all zeroes, and that the memory this is going into is already initialized to zeroes.  It's asking for trouble if we re-arrange all this someday.
>> 
>> I've got an untested one line patch that should fix this for any upper layer caller.  Posting now.
>> 
> 
> No! Upper layer callers should simply not be setting .saddress. Pretty
> much the only thing that _should_ be setting .saddress is the lockd
> callback, and possibly the nfsv4 server callback.

Then the take 2 patch description should be updated to reflect your actual intent.  I don't think it's especially clear from your patch description why "nlmclnt_lookup_host() ... initialising the srcaddr family to AF_UNSPEC" is incorrect from an architectural standpoint.  The specifics of "fixing bad behavior" are obvious already, but the architectural problem needs to be documented.

Otherwise I'm OK with the general idea of take 2, since it passes in a NULL pointer when no special bind address is desired, as other callers of rpc_create() do.  Of course, take 2 has to pass Nick's test case too.

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ