[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3789288.SQMGBHfSc0@sifl>
Date: Fri, 04 Sep 2015 11:52:01 -0700 (PDT)
From: Paul Moore <paul@...l-moore.com>
To: Richard Guy Briggs <rgb@...hat.com>
Cc: linux-audit@...hat.com, linux-kernel@...r.kernel.org,
sgrubb@...hat.com, eparis@...hat.com, v.rathor@...il.com,
ctcard@...mail.com
Subject: Re: [PATCH V1] audit: try harder to send to auditd upon netlink failure
On Friday, September 04, 2015 05:14:54 AM Richard Guy Briggs wrote:
> There are several reports of the kernel losing contact with auditd ...
Even if this doesn't completely solve the problem, I like the extra reporting
and robustness of this change. Some comments inline ...
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 1c13e42..4ee114a 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -404,19 +404,54 @@ static void audit_printk_skb(struct sk_buff *skb)
> audit_hold_skb(skb);
> }
>
> +static char *audit_strerror(int err)
> +{
> + switch (err) {
> + case -ECONNREFUSED:
> + return "ECONNREFUSED";
> + case -EPERM:
> + return "EPERM";
> + case -ENOMEM:
> + return "ENOMEM";
> + case -EAGAIN:
> + return "EAGAIN";
> + case -ERESTARTSYS:
> + return "ERESTARTSYS";
> + case -EINTR:
> + return "EINTR";
> + default:
> + return "(other)";
> + }
> +}
See comments below.
> static void kauditd_send_skb(struct sk_buff *skb)
> {
> int err;
> + int attempts = 0;
> +#define AUDITD_RETRIES 5
> +
> +restart:
> /* take a reference in case we can't send it and we want to hold it */
> skb_get(skb);
Should the restart label go after the skb_get() call? It seems like if we
ever jump to restart we could end up needlessly bumping the skb's refcnt.
> err = netlink_unicast(audit_sock, skb, audit_nlk_portid, 0);
> if (err < 0) {
> BUG_ON(err != -ECONNREFUSED); /* Shouldn't happen */
> + pr_err("netlink_unicast sending to audit_pid=%d returned error: %d,
%s\n"
> + , audit_pid, err, audit_strerror(err));
This is a style nit, but please put the comma on the preceding line. I know
why you did it, but it bothers me.
I'm also debating if audit_strerror() is worth it. I agree with you that it
is a good idea to indicate the specific error code, I'm just not sure we need
to bother translating that into a proper errno name. Can you think of some
reason why we would need the errno name as opposed to the integer?
> if (audit_pid) {
> - pr_err("*NO* daemon at audit_pid=%d\n", audit_pid);
> - audit_log_lost("auditd disappeared");
> - audit_pid = 0;
> - audit_sock = NULL;
> + if (err == -ECONNREFUSED || err == -EPERM
> + || ++attempts >= AUDITD_RETRIES) {
> + audit_log_lost("audit_pid=%d reset");
> + audit_pid = 0;
> + audit_sock = NULL;
> + } else {
> + pr_warn("re-scheduling(#%d) write to audit_pid=%d\n"
> + , attempts, audit_pid);
Same thing with the comma.
> + set_current_state(TASK_INTERRUPTIBLE);
> + schedule();
> + __set_current_state(TASK_RUNNING);
> + goto restart;
See my comment above about the skb's reference count.
> + }
> }
> /* we might get lucky and get this in the next auditd */
> audit_hold_skb(skb);
--
paul moore
www.paul-moore.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists