lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-id: <871uag812q.fsf@samsung.com>
Date:	Fri, 12 Apr 2013 17:12:09 +0200
From:	l.stelmach@...sung.com (Łukasz Stelmach)
To:	Casey Schaufler <casey@...aufler-ca.com>
Cc:	Tomasz Stanislawski <t.stanislaws@...sung.com>,
	linux-security-module@...r.kernel.org, m.szyprowski@...sung.com,
	kyungmin.park@...sung.com, r.krypa@...sung.com,
	linux-kernel@...r.kernel.org,
	Karol Lewandowski <k.lewandowsk@...sung.com>,
	Piotr Bereza <p.bereza@...sung.com>
Subject: Re: [RFC] security: smack: add hash table for smack for quick label
 searching

It was <2013-04-11 czw 19:59>, when Casey Schaufler wrote:
> On 4/11/2013 1:46 AM, Tomasz Stanislawski wrote:
>> Hi everyone,
>> I am a developer working on optimization of the TIZEN system.
>> Recently, I've discovered a performance issue in SMACK subsystem.
>> I used the PERF tool to find performance bottlenecks.
>>
>> The test scenario was simple. Run multiple applications and
>> see what the system does using the following command:
>>
>>  perf record -a -g
>>
>> Next, see the results with the command:
>>
>>  perf report -s symbol -g graph,0.5
>>
>> Among the many lines, the following ones are especially interesting:
>>
>>      5.96%  [k] smk_find_entry                                                                         
>>             |          
>>             |--5.06%-- smk_access
>>             |          |          
>>             |           --4.99%-- smk_curacc
>>             |                     |          
>>             |                     |--3.79%-- smack_ptrace_access_check

[...]

>>
>> To sum up, the result indicates that the CPU spents circa 8% (2.16% + 5.96%)
>> of cycles searching for a SMACK label in the smk_find_entry function.
>> The function iterates through smack_known_list to find an entry.
>> The further analysis showed that the size of the list can reach even 600.
>> I measured that it takes circa 200 tries to find an entry on average.
>> The value was computed as a total number iterations in the smk_find_entry's
>> loop divided by the times smk_find_entry was called in a time-window of
>> the length of 10 seconds.
>>
>> IMO, this is a serious performance issue which scales badly with
>> a complexity of the system.
>>
>> I implemented a patch that makes a use of a hash table to quicken searching
>> for SMACK's labels.  The patch is rebased onto the latest v3.9-rc6 kernel.
>> The code is thread-safe (I hope) because it shares the RCU mechanism
>> and locks with smack_known_list.

[...]

>>
>> I hope you find the measurement and the patch useful.
>> All comments are welcome.
>
> NAK
>
> There will be no hash tables in Smack.
>
> The correct solution is simple.
>
> In the task_smack structure there are two Smack label pointers,
> smk_task and smk_forked. Replace these fields with pointers to
> the smack_known structures that contain the Smack label pointers
> used today. This will require trivial changes throughout the
> Smack code to accommodate the type change and a few logical twists
> around smk_import. It will eliminate the need for smk_lookup_entry.

Allow me to join the conversation with an interesting observation.
I don't know whether hash table is a good solution to the problem Tomasz
has mentioned it definitely improves performance during loading rules.

Conditions:

--8<---------------cut here---------------start------------->8---
# number of subjects
sh-4.1# cat /etc/smack/accesses.d/* | cut -d\  -f1 | sort -u | wc -l
379
# number of rules
sh-4.1# cat /etc/smack/accesses.d/* | wc -l   
23895
--8<---------------cut here---------------end--------------->8---

Without the patch

--8<---------------cut here---------------start------------->8---
sh-4.1# time smackctl apply

real    0m1.255s
user    0m0.115s
sys     0m0.895s
--8<---------------cut here---------------end--------------->8---

perfs output is:

--8<---------------cut here---------------start------------->8---
# ========
# captured on: Tue Jan  1 09:52:14 2013
# os release : 3.4.5
# arch : armv7l
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : ARMv7 Processor rev 0 (v7l)
# total memory : 1025456 kB
# cmdline : /perf record -a -g -- /usr/bin/smackctl apply 
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 45, 46, 47, 48 }
# ========
#
# Events: 1K cycles
#
# Overhead                              Symbol
# ........  ..................................
#
    47.49%  [k] strcmp                        
            |          
            |--41.27%-- smk_find_entry
            |          |          
            |          |--28.87%-- smk_import_entry
            |          |          smk_import
            |          |          smk_fill_rule
            |          |          smk_parse_long_rule
            |          |          smk_write_rules_list.clone.3
            |          |          smk_write_load2
            |          |          vfs_write
            |          |          sys_write
            |          |          ret_fast_syscall
            |          |          
            |           --12.41%-- smk_write_rules_list.clone.3
            |                     smk_write_load2
            |                     vfs_write
            |                     sys_write
            |                     ret_fast_syscall
            |          
            |--4.41%-- smk_import_entry
            |          smk_import
            |          smk_fill_rule
            |          smk_parse_long_rule
            |          smk_write_rules_list.clone.3
            |          smk_write_load2
            |          vfs_write
            |          sys_write
            |          ret_fast_syscall
            |          
             --1.80%-- smk_write_rules_list.clone.3
                       smk_write_load2
                       vfs_write
                       sys_write
                       ret_fast_syscall
    19.21%  [k] smk_find_entry                
            |          
            |--12.44%-- smk_import_entry
            |          smk_import
            |          smk_fill_rule
            |          smk_parse_long_rule
            |          smk_write_rules_list.clone.3
            |          smk_write_load2
            |          vfs_write
            |          sys_write
            |          ret_fast_syscall
            |          
             --6.78%-- smk_write_rules_list.clone.3
                       smk_write_load2
                       vfs_write
                       sys_write
                       ret_fast_syscall

[ less than 10% per entry ]

--8<---------------cut here---------------end--------------->8---

With the patch

--8<---------------cut here---------------start------------->8---
sh-4.1# time smackctl apply                                  

real    0m0.349s
user    0m0.095s
sys     0m0.250s
--8<---------------cut here---------------end--------------->8---

still quite long but 3.59598853868 times better.

--8<---------------cut here---------------start------------->8---
# ========
# captured on: Tue Jan  1 10:40:45 2013
# os release : 3.4.5
# arch : armv7l
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : ARMv7 Processor rev 0 (v7l)
# total memory : 1025544 kB
# cmdline : /perf record -a -g -- smackctl apply 
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 1, 2, 3, 4 }
# ========
#
# Events: 606  cycles
#
# Overhead                                 Symbol
# ........  .....................................
#
    12.90%  [k] smk_write_rules_list.isra.9      
            |
            --- smk_write_load2
                vfs_write
                sys_write
                ret_fast_syscall
     7.86%  [k] exynos_enter_idle                
            |
            --- cpuidle_enter
                cpuidle_enter_state
                cpuidle_idle_call
                cpu_idle
               |          
                --7.69%-- secondary_start_kernel
                          0x4066df54
     6.95%  [k] v7_dma_inv_range                 
            |          
            |--3.70%-- __dma_page_cpu_to_dev
            |          arm_dma_map_page
            |          arm_dma_map_sg
            |          dw_mci_pre_dma_transfer
            |          dw_mci_pre_req
            |          mmc_pre_req
            |          mmc_start_req
            |          mmc_blk_issue_rw_rq
            |          mmc_blk_issue_rq
            |          mmc_queue_thread
            |          kthread
            |          do_exit
            |          
             --3.25%-- __dma_page_dev_to_cpu
                       arm_dma_unmap_page
                       arm_dma_unmap_sg
                       dw_mci_post_req
                       mmc_post_req
                       mmc_start_req
                       mmc_blk_issue_rw_rq
                       mmc_blk_issue_rq
                       mmc_queue_thread
                       kthread
                       do_exit
     3.56%  [k] _raw_spin_unlock_irq             
            |          
            |--1.40%-- mmc_blk_issue_rw_rq
            |          mmc_blk_issue_rq
            |          mmc_queue_thread
            |          kthread
            |          do_exit
            |          
             --1.09%-- finish_task_switch
                       __schedule
                       |          
                        --0.75%-- schedule


     3.45%  [.] 0x0000118c                       
     2.94%  [.] strncpy                          
     2.67%  [.] vfprintf                         
     2.44%  [k] strcmp                           
            |
            --- smk_find_entry
                smk_import_entry
                smk_import
                smk_fill_rule
                smk_parse_long_rule
                smk_write_rules_list.isra.9
                smk_write_load2
                vfs_write
                sys_write
                ret_fast_syscall

[ less than 2.5%  per entry ]
--8<---------------cut here---------------end--------------->8---

-- 
Łukasz Stelmach
Software wizzard
Samsung Poland R&D Center

Al. Armii Ludowej 26, 00-609 Warszawa
http://www.rd.samsung.pl
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ