linux-kernel - Re: [PATCH] audit: accelerate audit rule filter

Message-ID: <10e0cc00a83d4214b7bcfda62c36bb20@huawei.com>
Date:   Thu, 2 Dec 2021 02:25:01 +0000
From:   "zhaozixuan (C)" <zhaozixuan2@...wei.com>
To:     Paul Moore <paul@...l-moore.com>
CC:     "eparis@...hat.com" <eparis@...hat.com>,
        "linux-audit@...hat.com" <linux-audit@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] audit: accelerate audit rule filter

>  On Mon, Nov 29, 2021 at 2:35 AM zhaozixuan (C) <zhaozixuan2@...wei.com> wrote:
> > >On Tue, Nov 23, 2021 at 2:50 AM Zixuan Zhao <zhaozixuan2@...wei.com> wrote:
> > >> We used lat_syscall of lmbench3 to test the performance impact of 
> > >> this patch. We changed the number of rules and run lat_syscall with 
> > >> 1000 repetitions at each test. Syscalls measured by lat_syscall are 
> > >> not monitored by rules.
> > >>
> > >> Before this optimization:
> > >>
> > >>              null     read    write     stat    fstat      open
> > >>   0 rules  1.87ms   2.74ms   2.56ms   26.31ms  4.13ms   69.66ms
> > >>  10 rules  2.15ms   3.13ms   3.32ms   26.99ms  4.16ms   74.70ms
> > >>  20 rules  2.45ms   3.97ms   3.82ms   27.05ms  4.60ms   76.35ms
> > >>  30 rules  2.64ms   4.52ms   3.95ms   30.30ms  4.94ms   78.94ms
> > >>  40 rules  2.83ms   4.97ms   4.23ms   32.16ms  5.40ms   81.88ms
> > >>  50 rules  3.00ms   5.30ms   4.84ms   33.49ms  5.79ms   83.20ms
> > >> 100 rules  4.24ms   9.75ms   7.42ms   37.68ms  6.55ms   93.70ms
> > >> 160 rules  5.50ms   16.89ms  12.18ms  51.53ms  17.45ms  155.40ms
> > >>
> > >> After this optimization:
> > >>
> > >>              null     read    write     stat    fstat      open
> > >>   0 rules  1.81ms   2.84ms   2.42ms  27.70ms   4.15ms   69.10ms
> > >>  10 rules  1.97ms   2.83ms   2.69ms  27.70ms   4.15ms   69.30ms
> > >>  20 rules  1.72ms   2.91ms   2.41ms  26.49ms   3.91ms   71.19ms
> > >>  30 rules  1.85ms   2.94ms   2.48ms  26.27ms   3.97ms   71.43ms
> > >>  40 rules  1.88ms   2.94ms   2.78ms  26.85ms   4.08ms   69.79ms
> > >>  50 rules  1.86ms   3.17ms   3.08ms  26.25ms   4.03ms   72.32ms
> > >> 100 rules  1.84ms   3.00ms   2.81ms  26.25ms   3.98ms   70.25ms
> > >> 160 rules  1.92ms   3.32ms   3.06ms  26.81ms   4.57ms   71.41ms
> > >>
> > >> As the results above show, without the patch the syscall latencies
> > >> increase with the number of rules, while with the patch the
> > >> latencies remain stable. This could help users who add many audit
> > >> rules (for purposes such as attack tracing or process behavior
> > >> recording) but suffer from poor performance.
> > >
> > >I have general concerns about trading memory and complexity for performance gains, but beyond that the numbers you posted above don't yet make sense to me.
> >
> > Thanks for your reply.
> >
> > The memory cost of this patch is less than 4KB (1820 bytes on x64 and
> > 3640 bytes on compatible x86_64), which is trivial in many cases.
> > Besides, syscalls are called frequently on a system, so even a small
> > optimization could bring a significant benefit.
> 
> The tradeoff still exists, even though you feel it is worthwhile.
> 
> > >Why are the latency increases due to rule count not similar across the different syscalls? For example, I would think that if the increase in syscall latency was directly attributed to the audit rule processing then the increase on the "open" syscall should be similar to that of the "null" syscall.  In other phrasing, if we can process 160 rules in ~4ms in the "null" case, why does it take us ~86ms in the "open" case?
> >
> > As to the test result, we did some investigation and identified two
> > reasons:
> > 1. The chosen rule sets were not very suitable. Though they were not
> >    hit by the syscalls being measured, some of them were hit by other
> >    processes, which reduced system performance and affected the test
> >    result;
> > 2. The routine of lat_syscall is much more complicated than we
> >    thought. It calls many other syscalls during the test, which may
> >    cause the result not to be linear.
> >
> > Due to the reasons above, we did another test. We modified audit rule 
> > sets  and made sure they wouldn't be hit at runtime. Then, we added
> >  ktime_get_real_ts64 to auditsc.c to record the time of executing  
> > __audit_syscall_exit. We ran "stat" syscall 10000 times for each rule 
> > set  and recorded the time interval. The result is shown below:
> >
> > Before this optimization:
> >
> > rule set          time
> >   0 rules     3843.96ns
> >   1 rules    13119.08ns
> >  10 rules    14003.13ns
> >  20 rules    15420.18ns
> >  30 rules    17284.84ns
> >  40 rules    19010.67ns
> >  50 rules    21112.63ns
> > 100 rules    25815.02ns
> > 130 rules    29447.09ns
> >
> > After this optimization:
> >
> >  rule set          time
> >   0 rules     3597.78ns
> >   1 rules    13498.73ns
> >  10 rules    13122.57ns
> >  20 rules    12874.88ns
> >  30 rules    14351.99ns
> >  40 rules    14181.07ns
> >  50 rules    13806.45ns
> > 100 rules    13890.85ns
> > 130 rules    14441.45ns
> >
> > As the results show, the interval increases linearly with the number
> > of rules before the optimization, while it remains stable after the
> > optimization. Note that audit skips some operations if there are no
> > rules, so there is a gap between the 0-rule and 1-rule sets.
> 
> It looks like a single rule like the one below could effectively disable this optimization, is that correct?
> 
>   % auditctl -a exit,always -F uid=1001
>   % auditctl -l
>   -a always,exit -S all -F uid=1001

Yes, a rule like this one, which monitors all syscalls, could disable the
optimization. The global array could grow exponentially if we wanted to
handle more audit fields. However, we don't think that kind of rule is
practical, because it might generate a great number of logs and even
lead to log loss.
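
To illustrate why a single "-S all" rule defeats the fast path, here is a minimal user-space sketch of the general idea: a global array indexed by syscall number, consulted before walking the rule list. It is not the code from the patch. The names (sketch_rule, sketch_rules_per_syscall, sketch_needs_filtering) and the table size of 512 are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical table size; the real syscall count is architecture-specific. */
#define SKETCH_NR_SYSCALLS 512

/* Hypothetical stand-in for an audit rule: only its syscall mask matters here. */
struct sketch_rule {
	bool watches[SKETCH_NR_SYSCALLS];
};

/* Global array: how many loaded rules watch each syscall number. */
static unsigned int sketch_rules_per_syscall[SKETCH_NR_SYSCALLS];

/* Start with a rule that watches nothing. */
void sketch_rule_init(struct sketch_rule *rule)
{
	memset(rule, 0, sizeof(*rule));
}

/* Watch a single syscall, similar in spirit to "-S <name>". */
void sketch_rule_watch(struct sketch_rule *rule, unsigned int nr)
{
	assert(nr < SKETCH_NR_SYSCALLS);
	rule->watches[nr] = true;
}

/* Watch every syscall, similar in spirit to "-S all". */
void sketch_rule_watch_all(struct sketch_rule *rule)
{
	for (unsigned int nr = 0; nr < SKETCH_NR_SYSCALLS; nr++)
		rule->watches[nr] = true;
}

/* Loading a rule bumps the counter of every syscall it watches. */
void sketch_add_rule(const struct sketch_rule *rule)
{
	for (unsigned int nr = 0; nr < SKETCH_NR_SYSCALLS; nr++)
		if (rule->watches[nr])
			sketch_rules_per_syscall[nr]++;
}

/* Removing a rule undoes exactly what sketch_add_rule() did. */
void sketch_del_rule(const struct sketch_rule *rule)
{
	for (unsigned int nr = 0; nr < SKETCH_NR_SYSCALLS; nr++) {
		if (rule->watches[nr]) {
			assert(sketch_rules_per_syscall[nr] > 0);
			sketch_rules_per_syscall[nr]--;
		}
	}
}

/*
 * Fast-path check at syscall exit: the rule list only needs to be
 * walked when at least one loaded rule watches this syscall.
 */
bool sketch_needs_filtering(unsigned int nr)
{
	assert(nr < SKETCH_NR_SYSCALLS);
	return sketch_rules_per_syscall[nr] != 0;
}
```

In this sketch, a rule built with sketch_rule_watch_all() makes sketch_needs_filtering() return true for every syscall number, so the rule list is walked on every syscall again. The check then costs one array lookup on top of the original path.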