netdev - Re: [RFC][PATCH] xfrm: do not leak ESRCH to user space


Message-Id: <1224810300.8667.26.camel@sebastian.kern.oss.ntt.co.jp>
Date:	Fri, 24 Oct 2008 10:05:00 +0900
From:	Fernando Luis Vázquez Cao <fernando@....ntt.co.jp>
To:	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org
Subject: Re: [RFC][PATCH] xfrm: do not leak ESRCH to user space

On Thu, 2008-10-23 at 14:11 -0700, David Miller wrote:
> From: Fernando Luis Vázquez Cao <fernando@....ntt.co.jp>
> Date: Thu, 23 Oct 2008 23:27:19 +0900
> 
> > I noticed that, under certain conditions, ESRCH can be leaked from the
> > xfrm layer to user space through sys_connect. In particular, this seems
> > to happen reliably when the kernel fails to resolve a template either
> > because the AF_KEY receive buffer being used by racoon is full or
> > because the SA entry we are trying to use is in XFRM_STATE_EXPIRED
> > state.
> > 
> > However, since this could be a transient issue it could be argued that
> > EAGAIN would be more appropriate. Besides this error code is not even
> > documented in the man page for sys_connect (as of man-pages 3.07).
> > 
> > What is the expected behavior (I could not find anything in the RFCs)?
> > Should we just fix the connect(2) man page instead?
> 
> I think this case requires some care.
> 
> -EAGAIN tells the caller that it is a temporary failure and that
> retrying can be expected to succeed eventually (some resource is not
> available at the moment).  So applications loop when they see this
> error returned, they will try again.
> 
> But that's not what is happening when ESRCH is signalled.  We found
> no matching policy, and we've done nothing to make such a policy
> be found in the (near) future.  It is more of a hard failure, which
> should not necessarily be retried over and over again.
> 
> So converting this to -EAGAIN doesn't seem correct at all.

That would be so if -ESRCH did not happen to be a transient error.
Looking at the code, the window during which an entry is in
XFRM_STATE_EXPIRED state seems to be about 2 seconds in the worst case.
Connection attempts before and after that window would most likely
result in a successful connection or -EAGAIN, respectively. Would it not
make sense to return -EAGAIN during that 2-second window as well?

Regarding the case where the kernel does not initiate an SA resolution
because the AF_KEY receive buffer is full, I think it fits the
"some resource is not available at the moment" definition of -EAGAIN.
As the buffer gets emptied, chances are that future attempts will succeed.
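
To illustrate what the two cases above mean for an application, here is a
minimal user-space sketch. It assumes the current behavior, where connect(2)
may fail with ESRCH, and it assumes the application chooses to treat ESRCH as
transient, just like EAGAIN. The helper names connect_error_is_transient()
and connect_with_retry() are hypothetical and are not part of any kernel or
libc API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical policy: treat ESRCH like EAGAIN, i.e. as a transient error. */
static bool connect_error_is_transient(int err)
{
    return err == EAGAIN || err == ESRCH;
}

/*
 * Hypothetical helper: call connect() up to max_tries times, sleeping
 * delay_ms milliseconds between attempts while the error looks transient.
 * Returns 0 on success, or -1 with errno set to the last connect() error.
 *
 * Assumption: the socket is still usable after a failed connect(). That is
 * not guaranteed in general, so portable code may prefer to close the
 * socket and create a new one before retrying.
 */
static int connect_with_retry(int fd, const struct sockaddr *addr,
                              socklen_t len, int max_tries, long delay_ms)
{
    struct timespec delay = {
        .tv_sec = delay_ms / 1000,
        .tv_nsec = (delay_ms % 1000) * 1000000L,
    };
    int saved = EINVAL; /* reported if max_tries <= 0 */

    for (int i = 0; i < max_tries; i++) {
        if (connect(fd, addr, len) == 0)
            return 0;
        saved = errno;
        if (!connect_error_is_transient(saved))
            break;              /* hard failure: do not retry */
        if (i + 1 < max_tries)
            nanosleep(&delay, NULL);
    }
    errno = saved;
    return -1;
}
```

With something like connect_with_retry(fd, addr, len, 5, 500), the retries
would span the roughly 2-second window mentioned above. If the kernel returned
-EAGAIN in these cases instead, the ESRCH check would become unnecessary and
applications that already loop on EAGAIN would work unmodified.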

This behavior is somewhat confusing, but if it is deemed correct I think it
deserves to be properly documented in the respective man page. Do you
want me to do that, or should the error code we return to user space be
changed in either of the two cases mentioned above?
