lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 12 Apr 2013 11:00:39 -0700
From:	Casey Schaufler <casey@...aufler-ca.com>
To:	Łukasz Stelmach <l.stelmach@...sung.com>
CC:	Tomasz Stanislawski <t.stanislaws@...sung.com>,
	linux-security-module@...r.kernel.org, m.szyprowski@...sung.com,
	kyungmin.park@...sung.com, r.krypa@...sung.com,
	linux-kernel@...r.kernel.org,
	Karol Lewandowski <k.lewandowsk@...sung.com>,
	Piotr Bereza <p.bereza@...sung.com>,
	Casey Schaufler <casey@...aufler-ca.com>
Subject: Re: [RFC] security: smack: add hash table for smack for quick label
 searching

On 4/12/2013 8:12 AM, Łukasz Stelmach wrote:
> It was <2013-04-11 czw 19:59>, when Casey Schaufler wrote:
>> On 4/11/2013 1:46 AM, Tomasz Stanislawski wrote:
>>> Hi everyone,
>>> I am a developer working on optimization of the TIZEN system.
>>> Recently, I've discovered a performance issue in SMACK subsystem.
>>> I used the PERF tool to find performance bottlenecks.
>>>
>>> The test scenario was simple. Run multiple applications and
>>> see what the system does using the following command:
>>>
>>>  perf record -a -g
>>>
>>> Next, see the results with the command:
>>>
>>>  perf report -s symbol -g graph,0.5
>>>
>>> Among the many lines, the following ones are especially interesting:
>>>
>>>      5.96%  [k] smk_find_entry                                                                         
>>>             |          
>>>             |--5.06%-- smk_access
>>>             |          |          
>>>             |           --4.99%-- smk_curacc
>>>             |                     |          
>>>             |                     |--3.79%-- smack_ptrace_access_check
> [...]
>
>>> To sum up, the result indicates that the CPU spents circa 8% (2.16% + 5.96%)
>>> of cycles searching for a SMACK label in the smk_find_entry function.
>>> The function iterates through smack_known_list to find an entry.
>>> The further analysis showed that the size of the list can reach even 600.
>>> I measured that it takes circa 200 tries to find an entry on average.
>>> The value was computed as a total number iterations in the smk_find_entry's
>>> loop divided by the times smk_find_entry was called in a time-window of
>>> the length of 10 seconds.
>>>
>>> IMO, this is a serious performance issue which scales badly with
>>> a complexity of the system.
>>>
>>> I implemented a patch that makes a use of a hash table to quicken searching
>>> for SMACK's labels.  The patch is rebased onto the latest v3.9-rc6 kernel.
>>> The code is thread-safe (I hope) because it shares the RCU mechanism
>>> and locks with smack_known_list.
> [...]
>
>>> I hope you find the measurement and the patch useful.
>>> All comments are welcome.
>> NAK
>>
>> There will be no hash tables in Smack.
>>
>> The correct solution is simple.
>>
>> In the task_smack structure there are two Smack label pointers,
>> smk_task and smk_forked. Replace these fields with pointers to
>> the smack_known structures that contain the Smack label pointers
>> used today. This will require trivial changes throughout the
>> Smack code to accommodate the type change and a few logical twists
>> around smk_import. It will eliminate the need for smk_lookup_entry.
> Allow me to join the conversation with an interesting observation.
> I don't know whether hash table is a good solution to the problem Tomasz
> has mentioned it definitely improves performance during loading rules.

Loading rules is a rare event and an uninteresting case.

> Conditions:
>
> --8<---------------cut here---------------start------------->8---
> # number of subjects
> sh-4.1# cat /etc/smack/accesses.d/* | cut -d\  -f1 | sort -u | wc -l
> 379
> # number of rules
> sh-4.1# cat /etc/smack/accesses.d/* | wc -l   
> 23895
> --8<---------------cut here---------------end--------------->8---
>
> Without the patch
>
> --8<---------------cut here---------------start------------->8---
> sh-4.1# time smackctl apply
>
> real    0m1.255s
> user    0m0.115s
> sys     0m0.895s
> --8<---------------cut here---------------end--------------->8---
>
> perfs output is:
>
> --8<---------------cut here---------------start------------->8---
> # ========
> # captured on: Tue Jan  1 09:52:14 2013
> # os release : 3.4.5
> # arch : armv7l
> # nrcpus online : 4
> # nrcpus avail : 4
> # cpudesc : ARMv7 Processor rev 0 (v7l)
> # total memory : 1025456 kB
> # cmdline : /perf record -a -g -- /usr/bin/smackctl apply 
> # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 45, 46, 47, 48 }
> # ========
> #
> # Events: 1K cycles
> #
> # Overhead                              Symbol
> # ........  ..................................
> #
>     47.49%  [k] strcmp                        
>             |          
>             |--41.27%-- smk_find_entry
>             |          |          
>             |          |--28.87%-- smk_import_entry
>             |          |          smk_import
>             |          |          smk_fill_rule
>             |          |          smk_parse_long_rule
>             |          |          smk_write_rules_list.clone.3
>             |          |          smk_write_load2
>             |          |          vfs_write
>             |          |          sys_write
>             |          |          ret_fast_syscall
>             |          |          
>             |           --12.41%-- smk_write_rules_list.clone.3
>             |                     smk_write_load2
>             |                     vfs_write
>             |                     sys_write
>             |                     ret_fast_syscall
>             |          
>             |--4.41%-- smk_import_entry
>             |          smk_import
>             |          smk_fill_rule
>             |          smk_parse_long_rule
>             |          smk_write_rules_list.clone.3
>             |          smk_write_load2
>             |          vfs_write
>             |          sys_write
>             |          ret_fast_syscall
>             |          
>              --1.80%-- smk_write_rules_list.clone.3
>                        smk_write_load2
>                        vfs_write
>                        sys_write
>                        ret_fast_syscall
>     19.21%  [k] smk_find_entry                
>             |          
>             |--12.44%-- smk_import_entry
>             |          smk_import
>             |          smk_fill_rule
>             |          smk_parse_long_rule
>             |          smk_write_rules_list.clone.3
>             |          smk_write_load2
>             |          vfs_write
>             |          sys_write
>             |          ret_fast_syscall
>             |          
>              --6.78%-- smk_write_rules_list.clone.3
>                        smk_write_load2
>                        vfs_write
>                        sys_write
>                        ret_fast_syscall
>
> [ less than 10% per entry ]
>
> --8<---------------cut here---------------end--------------->8---
>
> With the patch
>
> --8<---------------cut here---------------start------------->8---
> sh-4.1# time smackctl apply                                  
>
> real    0m0.349s
> user    0m0.095s
> sys     0m0.250s
> --8<---------------cut here---------------end--------------->8---
>
> still quite long but 3.59598853868 times better.
>
> --8<---------------cut here---------------start------------->8---
> # ========
> # captured on: Tue Jan  1 10:40:45 2013
> # os release : 3.4.5
> # arch : armv7l
> # nrcpus online : 4
> # nrcpus avail : 4
> # cpudesc : ARMv7 Processor rev 0 (v7l)
> # total memory : 1025544 kB
> # cmdline : /perf record -a -g -- smackctl apply 
> # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 1, 2, 3, 4 }
> # ========
> #
> # Events: 606  cycles
> #
> # Overhead                                 Symbol
> # ........  .....................................
> #
>     12.90%  [k] smk_write_rules_list.isra.9      
>             |
>             --- smk_write_load2
>                 vfs_write
>                 sys_write
>                 ret_fast_syscall
>      7.86%  [k] exynos_enter_idle                
>             |
>             --- cpuidle_enter
>                 cpuidle_enter_state
>                 cpuidle_idle_call
>                 cpu_idle
>                |          
>                 --7.69%-- secondary_start_kernel
>                           0x4066df54
>      6.95%  [k] v7_dma_inv_range                 
>             |          
>             |--3.70%-- __dma_page_cpu_to_dev
>             |          arm_dma_map_page
>             |          arm_dma_map_sg
>             |          dw_mci_pre_dma_transfer
>             |          dw_mci_pre_req
>             |          mmc_pre_req
>             |          mmc_start_req
>             |          mmc_blk_issue_rw_rq
>             |          mmc_blk_issue_rq
>             |          mmc_queue_thread
>             |          kthread
>             |          do_exit
>             |          
>              --3.25%-- __dma_page_dev_to_cpu
>                        arm_dma_unmap_page
>                        arm_dma_unmap_sg
>                        dw_mci_post_req
>                        mmc_post_req
>                        mmc_start_req
>                        mmc_blk_issue_rw_rq
>                        mmc_blk_issue_rq
>                        mmc_queue_thread
>                        kthread
>                        do_exit
>      3.56%  [k] _raw_spin_unlock_irq             
>             |          
>             |--1.40%-- mmc_blk_issue_rw_rq
>             |          mmc_blk_issue_rq
>             |          mmc_queue_thread
>             |          kthread
>             |          do_exit
>             |          
>              --1.09%-- finish_task_switch
>                        __schedule
>                        |          
>                         --0.75%-- schedule
>
>
>      3.45%  [.] 0x0000118c                       
>      2.94%  [.] strncpy                          
>      2.67%  [.] vfprintf                         
>      2.44%  [k] strcmp                           
>             |
>             --- smk_find_entry
>                 smk_import_entry
>                 smk_import
>                 smk_fill_rule
>                 smk_parse_long_rule
>                 smk_write_rules_list.isra.9
>                 smk_write_load2
>                 vfs_write
>                 sys_write
>                 ret_fast_syscall
>
> [ less than 2.5%  per entry ]
> --8<---------------cut here---------------end--------------->8---
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ