Message-ID: <49306EA8.1050801@cosmosbay.com>
Date:	Fri, 28 Nov 2008 23:20:24 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Al Viro <viro@...IV.linux.org.uk>,
	David Miller <davem@...emloft.net>,
	"Rafael J. Wysocki" <rjw@...k.pl>, linux-kernel@...r.kernel.org,
	kernel-testers@...r.kernel.org, Mike Galbraith <efault@....de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Linux Netdev List <netdev@...r.kernel.org>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Christoph Hellwig <hch@...radead.org>, rth@...ddle.net,
	ink@...assic.park.msu.ru
Subject: Re: [PATCH 6/6] fs: Introduce kern_mount_special() to mount special vfs

Ingo Molnar wrote:
> * Al Viro <viro@...IV.linux.org.uk> wrote:
> 
>> On Thu, Nov 27, 2008 at 12:32:59AM +0100, Eric Dumazet wrote:
>>> This function arms a flag (MNT_SPECIAL) on the vfs, to avoid
>>> refcounting on permanent system vfs.
>>> Use this function for sockets, pipes, anonymous fds.
>> IMO that's pushing it past the point of usefulness; unless you can show
>> that this really gives considerable win on pipes et.al. *AND* that it
>> doesn't hurt other loads...
> 
> The numbers look pretty convincing:
> 
>>>  (socket8 bench result : from 2.94s to 2.23s)
> 
> And I wouldn't expect it to hurt real-filesystem workloads.
> 
> Here's the contemporary trace of a typical ext3 sys_open():
> 
>  0)               |  sys_open() {
>  0)               |    do_sys_open() {
>  0)               |      getname() {
>  0)      0.367 us |        kmem_cache_alloc();
>  0)               |        strncpy_from_user() {
>  0)               |          _cond_resched() {
>  0)               |            need_resched() {
>  0)      0.363 us |              constant_test_bit();
>  0)      1.047 us |            }
>  0)      1.815 us |          }
>  0)      2.587 us |        }
>  0)      4.022 us |      }
>  0)               |      alloc_fd() {
>  0)      0.480 us |        _spin_lock();
>  0)      0.487 us |        expand_files();
>  0)      2.356 us |      }
>  0)               |      do_filp_open() {
>  0)               |        path_lookup_open() {
>  0)               |          get_empty_filp() {
>  0)      0.439 us |            kmem_cache_alloc();
>  0)               |            security_file_alloc() {
>  0)      0.316 us |              cap_file_alloc_security();
>  0)      1.087 us |            }
>  0)      3.189 us |          }
>  0)               |          do_path_lookup() {
>  0)      0.366 us |            _read_lock();
>  0)               |            path_walk() {
>  0)               |              __link_path_walk() {
>  0)               |                inode_permission() {
>  0)               |                  ext3_permission() {
>  0)      0.441 us |                    generic_permission();
>  0)      1.247 us |                  }
>  0)               |                  security_inode_permission() {
>  0)      0.411 us |                    cap_inode_permission();
>  0)      1.186 us |                  }
>  0)      3.555 us |                }
>  0)               |                do_lookup() {
>  0)               |                  __d_lookup() {
>  0)      0.486 us |                    _spin_lock();
>  0)      1.369 us |                  }
>  0)      0.442 us |                  __follow_mount();
>  0)      3.014 us |                }
>  0)               |                path_to_nameidata() {
>  0)      0.476 us |                  dput();
>  0)      1.235 us |                }
>  0)               |                inode_permission() {
>  0)               |                  ext3_permission() {
>  0)               |                    generic_permission() {
>  0)               |                      in_group_p() {
>  0)      0.410 us |                        groups_search();
>  0)      1.172 us |                      }
>  0)      1.994 us |                    }
>  0)      2.789 us |                  }
>  0)               |                  security_inode_permission() {
>  0)      0.454 us |                    cap_inode_permission();
>  0)      1.238 us |                  }
>  0)      5.262 us |                }
>  0)               |                do_lookup() {
>  0)               |                  __d_lookup() {
>  0)      0.480 us |                    _spin_lock();
>  0)      1.621 us |                  }
>  0)      0.456 us |                  __follow_mount();
>  0)      3.215 us |                }
>  0)               |                path_to_nameidata() {
>  0)      0.420 us |                  dput();
>  0)      1.193 us |                }
>  0) +   23.551 us |              }
>  0)               |              path_put() {
>  0)      0.420 us |                dput();
>  0)               |                mntput() {
>  0)      0.359 us |                  mntput_no_expire();
>  0)      1.050 us |                }
>  0)      2.544 us |              }
>  0) +   27.253 us |            }
>  0) +   28.850 us |          }
>  0) +   33.217 us |        }
>  0)               |        may_open() {
>  0)               |          inode_permission() {
>  0)               |            ext3_permission() {
>  0)      0.480 us |              generic_permission();
>  0)      1.229 us |            }
>  0)               |            security_inode_permission() {
>  0)      0.405 us |              cap_inode_permission();
>  0)      1.196 us |            }
>  0)      3.589 us |          }
>  0)      4.600 us |        }
>  0)               |        nameidata_to_filp() {
>  0)               |          __dentry_open() {
>  0)               |            file_move() {
>  0)      0.470 us |              _spin_lock();
>  0)      1.243 us |            }
>  0)               |            security_dentry_open() {
>  0)      0.344 us |              cap_dentry_open();
>  0)      1.139 us |            }
>  0)      0.412 us |            generic_file_open();
>  0)      0.561 us |            file_ra_state_init();
>  0)      5.714 us |          }
>  0)      6.483 us |        }
>  0) +   46.494 us |      }
>  0)      0.453 us |      inotify_dentry_parent_queue_event();
>  0)      0.403 us |      inotify_inode_queue_event();
>  0)               |      fd_install() {
>  0)      0.440 us |        _spin_lock();
>  0)      1.247 us |      }
>  0)               |      putname() {
>  0)               |        kmem_cache_free() {
>  0)               |          virt_to_head_page() {
>  0)      0.369 us |            constant_test_bit();
>  0)      1.023 us |          }
>  0)      1.738 us |        }
>  0)      2.422 us |      }
>  0) +   60.560 us |    }
>  0) +   61.368 us |  }
> 
> and here's a sys_close():
> 
>  0)               |  sys_close() {
>  0)      0.540 us |    _spin_lock();
>  0)               |    filp_close() {
>  0)      0.437 us |      dnotify_flush();
>  0)      0.401 us |      locks_remove_posix();
>  0)      0.349 us |      fput();
>  0)      2.679 us |    }
>  0)      4.452 us |  }
> 
> I'd be surprised to see a flag show up in that codepath. Eric, does 
> your testing confirm that?

On a socket/pipe, definitely not, because inode->i_sb->s_flags is not contended.

But on a shared inode, it might hurt:

offsetof(struct inode, i_count)=0x24
offsetof(struct inode, i_lock)=0x70
offsetof(struct inode, i_sb)=0x9c
offsetof(struct inode, i_writecount)=0x144

So i_sb sits in a cache line that is probably contended.

I wonder why i_writecount sits so far from i_count; that doesn't make sense.
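
For illustration only (not part of the patch): a minimal sketch of how the offsets above map to cache lines. The 64-byte line size and the assumption that struct inode starts on a line boundary are mine, not measured on the test box, and the helper names are hypothetical. The four offsets alone do not say which neighbouring fields make a given line hot.

```c
#include <stddef.h>

/* Assumed cache-line size; the message above does not state one. */
#define ASSUMED_CACHE_LINE 64u

/* Offsets of struct inode fields as quoted in the message above. */
struct field_offset {
	const char *name;
	size_t offset;
};

const struct field_offset inode_fields[] = {
	{ "i_count",      0x24  },
	{ "i_lock",       0x70  },
	{ "i_sb",         0x9c  },
	{ "i_writecount", 0x144 },
};

/*
 * Index of the cache line holding byte `offset` of an object that is
 * assumed to start on a cache-line boundary.
 */
size_t cache_line_index(size_t offset, size_t line_size)
{
	return offset / line_size;
}

/* Nonzero when both offsets land in the same cache line. */
int same_cache_line(size_t a, size_t b, size_t line_size)
{
	return cache_line_index(a, line_size) == cache_line_index(b, line_size);
}
```

Calling cache_line_index(inode_fields[i].offset, ASSUMED_CACHE_LINE) for each entry gives the line each field falls in; rerunning with 128 shows how the grouping changes on CPUs with larger lines.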

