lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Mon, 7 Sep 2015 03:16:31 -0400
From:	Richard Guy Briggs <rgb@...hat.com>
To:	Paul Moore <paul@...l-moore.com>
Cc:	linux-audit@...hat.com, linux-kernel@...r.kernel.org,
	sgrubb@...hat.com, eparis@...hat.com, v.rathor@...il.com,
	ctcard@...mail.com
Subject: Re: [PATCH V1] audit: try harder to send to auditd upon netlink
 failure

On 15/09/04, Paul Moore wrote:
> On Friday, September 04, 2015 05:14:54 AM Richard Guy Briggs wrote:
> > There are several reports of the kernel losing contact with auditd ...
> 
> Even if this doesn't completely solve the problem, I like the extra reporting 
> and robustness of this change.  Some comments inline ...
> 
> > diff --git a/kernel/audit.c b/kernel/audit.c
> > index 1c13e42..4ee114a 100644
> > --- a/kernel/audit.c
> > +++ b/kernel/audit.c
> > @@ -404,19 +404,54 @@ static void audit_printk_skb(struct sk_buff *skb)
> >  	audit_hold_skb(skb);
> >  }
> > 
> > +static char *audit_strerror(int err)
> > +{
> > +	switch (err) {
> > +	case -ECONNREFUSED:
> > +		return "ECONNREFUSED";
> > +	case -EPERM:
> > +		return "EPERM";
> > +	case -ENOMEM:
> > +		return "ENOMEM";
> > +	case -EAGAIN:
> > +		return "EAGAIN";
> > +	case -ERESTARTSYS:
> > +		return "ERESTARTSYS";
> > +	case -EINTR:
> > +		return "EINTR";
> > +	default:
> > +		return "(other)";
> > +	}
> > +}
> 
> See comments below.
> 
> >  static void kauditd_send_skb(struct sk_buff *skb)
> >  {
> >  	int err;
> > +	int attempts = 0;
> > +#define AUDITD_RETRIES 5
> > +
> > +restart:
> >  	/* take a reference in case we can't send it and we want to hold it */
> >  	skb_get(skb);
> 
> Should the restart label go after the skb_get() call?  It seems like if we 
> ever jump to restart we could end up needlessly bumping the skb's refcnt.

I checked specifically for this and all the paths through
netlink_unicast() that return an error consume the skb.  In fact, all
paths through netlink_unicast() consume the skb.

> >  	err = netlink_unicast(audit_sock, skb, audit_nlk_portid, 0);
> >  	if (err < 0) {
> >  		BUG_ON(err != -ECONNREFUSED); /* Shouldn't happen */
> > +		pr_err("netlink_unicast sending to audit_pid=%d returned error: %d, 
> %s\n"
> > +		       , audit_pid, err, audit_strerror(err));
> 
> This is a style nit, but please put the comma on the preceding line.  I know 
> why you did it, but it bothers me.

I blame Hugh Redelmeier with whom I worked on the FreeS/WAN project, who
has been heavily involved with the ANSI C standards committee.  ;-)    I
should be sticking with the prevailing standard of the existing code.

> I'm also debating if audit_strerror() is worth it.  I agree with you that it 
> is a good idea to indicate the specific error code, I'm just not sure we need 
> to bother translating that into a proper errno name.  Can you think of some 
> reason why we would need the errno name as opposed to the integer?

Other than convenience, not really.  The only reason other than that
would be if an unexpected error code is returned it will be mover
obvious to me when reviewing a user report.  Other than that, it is
complete without the text.

> >  		if (audit_pid) {
> > -			pr_err("*NO* daemon at audit_pid=%d\n", audit_pid);
> > -			audit_log_lost("auditd disappeared");
> > -			audit_pid = 0;
> > -			audit_sock = NULL;
> > +			if (err == -ECONNREFUSED || err == -EPERM
> > +			    || ++attempts >= AUDITD_RETRIES) {
> > +				audit_log_lost("audit_pid=%d reset");
> > +				audit_pid = 0;
> > +				audit_sock = NULL;
> > +			} else {
> > +				pr_warn("re-scheduling(#%d) write to audit_pid=%d\n"
> > +					, attempts, audit_pid);
> 
> Same thing with the comma.
> 
> > +				set_current_state(TASK_INTERRUPTIBLE);
> > +				schedule();
> > +				__set_current_state(TASK_RUNNING);
> > +				goto restart;
> 
> See my comment above about the skb's reference count.
> 
> > +			}
> >  		}
> >  		/* we might get lucky and get this in the next auditd */
> >  		audit_hold_skb(skb);
> 
> -- 
> paul moore
> www.paul-moore.com
> 

- RGB

--
Richard Guy Briggs <rbriggs@...hat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ