Message-Id: <20150210122242.4eca36e5d9fd28d401f58513@linux-foundation.org>
Date: Tue, 10 Feb 2015 12:22:42 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: "Wang, Yalin" <Yalin.Wang@...ymobile.com>
Cc: "'viro@...iv.linux.org.uk'" <viro@...iv.linux.org.uk>,
"'linux-fsdevel@...r.kernel.org'" <linux-fsdevel@...r.kernel.org>,
"'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>,
"Gao, Neil" <Neil.Gao@...ymobile.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC V2] test_bit before clear files_struct bits
(cc Linus for CPU-fu)
On Tue, 10 Feb 2015 15:11:37 +0800 "Wang, Yalin" <Yalin.Wang@...ymobile.com> wrote:
> add test_bit() before clear close_on_exec and open_fds,
> by trace __clear_bit(), these 2 place are false in most times,
> we test it so that we don't need clear_bit, and we can win
> in most time.
>
> ...
>
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -209,7 +209,8 @@ static inline void __set_close_on_exec(int fd, struct fdtable *fdt)
>
> static inline void __clear_close_on_exec(int fd, struct fdtable *fdt)
> {
> - __clear_bit(fd, fdt->close_on_exec);
> + if (test_bit(fd, fdt->close_on_exec))
> + __clear_bit(fd, fdt->close_on_exec);
> }
>
> static inline void __set_open_fd(int fd, struct fdtable *fdt)
> @@ -309,7 +310,7 @@ struct files_struct *dup_fd(struct files_struct *oldf, int *errorp)
> struct file *f = *old_fds++;
> if (f) {
> get_file(f);
> - } else {
> + } else if (test_bit(open_files - i, new_fdt->open_fds)) {
> /*
> * The fd may be claimed in the fd bitmap but not yet
> * instantiated in the files array if a sibling thread
The patch is good but I'm still wondering if any CPUs can do this
speedup for us. The CPU has to pull in the target word to modify the
bit and what it *could* do is to avoid dirtying the cacheline if it
sees that the bit is already in the desired state.
However some elapsed-time testing I did on a couple of Intel
machines indicates that these CPUs don't perform that optimisation.
Perhaps there's some reason why they don't, dunno.
Still, I think we should encapsulate the above (common) pattern into
helper functions in include/linux/bitops.h because
- it's cleaner
- it's self-documenting
- it permits us to eliminate the if(test_bit) on any CPU which does
perform the optimisation internally, if such exists.
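A minimal userspace sketch of what such helpers might look like follows. The names sketch_test_bit(), sketch_clear_bit_if_set() and sketch_set_bit_if_clear() are hypothetical, chosen for illustration only; they are not existing include/linux/bitops.h API, and the non-atomic bit manipulation here is a plain-C stand-in for __set_bit()/__clear_bit().

```c
#include <limits.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's BITS_PER_LONG. */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Read-only test: does not write to the word holding the bit. */
static inline bool sketch_test_bit(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

/*
 * Hypothetical helper: clear the bit only if it is currently set, so the
 * word is not written (and the cacheline not dirtied) when nothing changes.
 * Non-atomic, like __clear_bit().
 */
static inline void sketch_clear_bit_if_set(unsigned int nr, unsigned long *addr)
{
	if (sketch_test_bit(nr, addr))
		addr[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG));
}

/*
 * Hypothetical helper: set the bit only if it is currently clear.
 * Non-atomic, like __set_bit().
 */
static inline void sketch_set_bit_if_clear(unsigned int nr, unsigned long *addr)
{
	if (!sketch_test_bit(nr, addr))
		addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}
```

With helpers of this shape, a call site such as __clear_close_on_exec() would make a single call, and an architecture whose CPUs avoid the redundant write on their own could define the helper as a plain unconditional bit operation.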
You actually have measurement results for these (and other)
set-bit-on-already-set-bit call sites. Please include all of that info
in the changelog.