[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8735cagcgi.fsf@oracle.com>
Date: Thu, 29 Sep 2022 13:19:57 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Paul Moore <paul@...l-moore.com>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-audit@...hat.com,
eparis@...hat.com, linux-kernel@...r.kernel.org,
boris.ostrovsky@...cle.com, konrad.wilk@...cle.com
Subject: Re: [PATCH 2/3] audit: annotate branch direction for audit_in_mask()
Paul Moore <paul@...l-moore.com> writes:
> On Tue, Sep 27, 2022 at 7:00 PM Ankur Arora <ankur.a.arora@...cle.com> wrote:
>>
>> With sane audit rules, audit logging would only be triggered
>> infrequently. Keeping this in mind, annotate audit_in_mask() as
>> unlikely() to allow the compiler to pessimize the call to
>> audit_filter_rules().
>>
>> This allows GCC to invert the branch direction for the audit_filter_rules()
>> basic block in this loop:
>>
>> list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_EXIT], list) {
>> if (audit_in_mask(&e->rule, major) &&
>> audit_filter_rules(tsk, &e->rule, ctx, NULL,
>> &state, false)) {
>> ...
>>
>> such that it executes the common case in a straight line fashion.
>>
>> On a Skylakex system change in getpid() latency (all results
>> aggregated across 12 boot cycles):
>>
>> Min Mean Median Max pstdev
>> (ns) (ns) (ns) (ns)
>>
>> - 196.63 207.86 206.60 230.98 (+- 3.92%)
>> + 173.11 182.51 179.65 202.09 (+- 4.34%)
>>
>> Performance counter stats for 'bin/getpid' (3 runs) go from:
>> cycles 805.58 ( +- 4.11% )
>> instructions 1654.11 ( +- .05% )
>> IPC 2.06 ( +- 3.39% )
>> branches 430.02 ( +- .05% )
>> branch-misses 1.55 ( +- 7.09% )
>> L1-dcache-loads 440.01 ( +- .09% )
>> L1-dcache-load-misses 9.05 ( +- 74.03% )
>>
>> to:
>> cycles 706.13 ( +- 4.13% )
>> instructions 1654.70 ( +- .06% )
>> IPC 2.35 ( +- 4.25% )
>> branches 430.99 ( +- .06% )
>> branch-misses 0.50 ( +- 2.00% )
>> L1-dcache-loads 440.02 ( +- .07% )
>> L1-dcache-load-misses 5.22 ( +- 82.75% )
>>
>> (Both aggregated over 12 boot cycles.)
>>
>> cycles: performance improves on average by ~100 cycles/call. IPC
>> improves commensurately. Two reasons for this improvement:
>>
>> * one fewer branch mispred: no obvious reason for this
>> branch-miss reduction. There is no significant change in
>> basic-block structure (apart from the branch inversion.)
>>
>> * the direction of the branch for the call is now inverted, so it
>> chooses the not-taken direction more often. The issue-latency
>> for not-taken branches is often cheaper.
>>
>> Signed-off-by: Ankur Arora <ankur.a.arora@...cle.com>
>> ---
>> kernel/auditsc.c | 15 ++++++++-------
>> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> I generally dislike merging likely()/unlikely() additions to code
> paths that can have varying levels of performance depending on runtime
> configuration.
I think that's fair, and in this particular case the benchmark is quite
contrived.
But, just to elaborate a bit more on why that unlikely() clause made
sense to me: it seems to me that audit typically would be triggered for
control syscalls and the ratio between control and non-control ones
would be fairly lopsided.
Let me see if I can rewrite the conditional in a different way to get a
similar effect but I suspect that might be even more compiler dependent.
Also, let me run the audit-testsuite this time. Is there a good test
there that you would recommend that might serve as a more representative
workload?
Thanks
Ankur
> While I appreciate the work you are doing to improve
> audit performance, I don't think this is something I want to merge,
> I'm sorry.
>
>> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
>> index 533b087c3c02..bf26f47b5226 100644
>> --- a/kernel/auditsc.c
>> +++ b/kernel/auditsc.c
>> @@ -789,7 +789,7 @@ static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
>> return AUDIT_STATE_BUILD;
>> }
>>
>> -static int audit_in_mask(const struct audit_krule *rule, unsigned long val)
>> +static bool audit_in_mask(const struct audit_krule *rule, unsigned long val)
>> {
>> int word, bit;
>>
>> @@ -850,12 +850,13 @@ static void audit_filter_syscall(struct task_struct *tsk,
>>
>> rcu_read_lock();
>> list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_EXIT], list) {
>> - if (audit_in_mask(&e->rule, major) &&
>> - audit_filter_rules(tsk, &e->rule, ctx, NULL,
>> - &state, false)) {
>> - rcu_read_unlock();
>> - ctx->current_state = state;
>> - return;
>> + if (unlikely(audit_in_mask(&e->rule, major))) {
>> + if (audit_filter_rules(tsk, &e->rule, ctx, NULL,
>> + &state, false)) {
>> + rcu_read_unlock();
>> + ctx->current_state = state;
>> + return;
>> + }
>> }
>> }
>> rcu_read_unlock();
>> --
>> 2.31.1
--
ankur
Powered by blists - more mailing lists