Message-ID: <20150130005158.GE24182@sejong>
Date:	Fri, 30 Jan 2015 09:51:58 +0900
From:	Namhyung Kim <namhyung@...nel.org>
To:	Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Jiri Olsa <jolsa@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	David Ahern <dsahern@...il.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Andi Kleen <andi@...stfloor.org>,
	Stephane Eranian <eranian@...gle.com>,
	Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH 29/42] perf tools: Protect dso cache fd with a mutex

On Thu, Jan 29, 2015 at 01:23:33PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Jan 29, 2015 at 10:19:38PM +0900, Namhyung Kim wrote:
> > On Thu, Jan 29, 2015 at 09:31:07AM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Thu, Jan 29, 2015 at 05:07:10PM +0900, Namhyung Kim wrote:
> > > > When the dso cache is accessed in a multi-threaded environment, it's
> > > > possible for another dso->data.fd to be closed in the middle of an
> > > > operation due to the open file limit.  Protect the file descriptors
> > > > using a separate mutex.
> > > > 
> > > > Signed-off-by: Namhyung Kim <namhyung@...nel.org>
> > > > ---
> > > >  tools/perf/tests/dso-data.c |   5 ++
> > > >  tools/perf/util/dso.c       | 136 +++++++++++++++++++++++++++++---------------
> > > >  2 files changed, 94 insertions(+), 47 deletions(-)
> > > > 
> > > > diff --git a/tools/perf/tests/dso-data.c b/tools/perf/tests/dso-data.c
> > > > index caaf37f079b1..0276e7d2d41b 100644
> > > > --- a/tools/perf/tests/dso-data.c
> > > > +++ b/tools/perf/tests/dso-data.c
> > > > @@ -111,6 +111,9 @@ int test__dso_data(void)
> > > >  	memset(&machine, 0, sizeof(machine));
> > > >  
> > > >  	dso = dso__new((const char *)file);
> > > > +	TEST_ASSERT_VAL("failed to get dso", dso);
> > > > +
> > > > +	dso->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
> > > >  
> > > >  	/* Basic 10 bytes tests. */
> > > >  	for (i = 0; i < ARRAY_SIZE(offsets); i++) {
> > > > @@ -199,6 +202,8 @@ static int dsos__create(int cnt, int size)
> > > >  
> > > >  		dsos[i] = dso__new(file);
> > > >  		TEST_ASSERT_VAL("failed to get dso", dsos[i]);
> > > > +
> > > > +		dsos[i]->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
> > > 
> > > Those two are unrelated, please put them in a separate patch, one that I
> > > can even cherry-pick ahead of the other patches.
> > 
> > It's a consequence of changing dso__data_read_offset() not to call
> > dso__data_fd() for performance reasons.  The binary_type used to be
> > determined inside dso__data_fd(), but now it needs to be set
> > explicitly for this test.
> > 
> > In the original code, dso__data_fd() was called every time we accessed
> > the dso cache just to check for an error, I guess.  But checking the
> > status field is enough.
> 
> Are you saying that this test should not rely on some function called
> further down the call chain it uses, and should instead do as you do
> above?
> 
> I.e. if that is the case, then this stands out as a separate patch; if
> not, i.e. if this really is related to this patch (at first sight it
> doesn't look like it), then the explanation you give should be in the
> patch comment log.

I think it'd be better to add a dso__data_fd() call after dso__new() as
an extra validation step.  With that we don't need to specify the binary
type manually, and we get a more consistent usage pattern.  I'll do it as
a separate patch.
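
Roughly what I have in mind for tools/perf/tests/dso-data.c (just a
sketch to show the idea; the fd variable and the assertion message are
mine):

        dso = dso__new((const char *)file);
        TEST_ASSERT_VAL("failed to get dso", dso);

        /*
         * Open the data file once up front.  dso__data_fd() determines
         * dso->binary_type for us, so the test no longer has to set
         * DSO_BINARY_TYPE__SYSTEM_PATH_DSO by hand, and it also checks
         * that the file can actually be opened.
         */
        fd = dso__data_fd(dso, &machine);
        TEST_ASSERT_VAL("failed to get fd", fd >= 0);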


> 
> > > >  	return 0;
> > > > diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> > > > index 11ece224ef50..ae92046ae2c8 100644
> > > > --- a/tools/perf/util/dso.c
> > > > +++ b/tools/perf/util/dso.c
> > > > @@ -213,6 +213,7 @@ bool dso__needs_decompress(struct dso *dso)
> > > >   */
> > > >  static LIST_HEAD(dso__data_open);
> > > >  static long dso__data_open_cnt;
> > > > +static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
> > > >  
> > > >  static void dso__list_add(struct dso *dso)
> > > >  {
> > > > @@ -240,7 +241,7 @@ static int do_open(char *name)
> > > >  		if (fd >= 0)
> > > >  			return fd;
> > > >  
> > > > -		pr_debug("dso open failed, mmap: %s\n",
> > > > +		pr_debug("dso open failed: %s\n",
> > > >  			 strerror_r(errno, sbuf, sizeof(sbuf)));
> > > >  		if (!dso__data_open_cnt || errno != EMFILE)
> > > 
> > > Ditto, another unrelated patch, please separate.
> > 
> > Ah, okay.  I kept it here since it's just a small change.  But I'll
> > separate it if that helps reviewing.
> 
> Thanks
> 
> > > >  			break;
> > > > @@ -382,7 +383,9 @@ static void check_data_close(void)
> > > >   */
> > > >  void dso__data_close(struct dso *dso)
> > > >  {
> > > > +	pthread_mutex_lock(&dso__data_open_lock);
> > > >  	close_dso(dso);
> > > > +	pthread_mutex_unlock(&dso__data_open_lock);
> > > >  }
> > > >  
> > > >  /**
> > > > @@ -405,6 +408,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> > > >  	if (dso->data.status == DSO_DATA_STATUS_ERROR)
> > > >  		return -1;
> > > >  
> > > > +	pthread_mutex_lock(&dso__data_open_lock);
> > > > +
> > > >  	if (dso->data.fd >= 0)
> > > >  		goto out;
> > > >  
> > > > @@ -427,6 +432,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> > > >  	else
> > > >  		dso->data.status = DSO_DATA_STATUS_ERROR;
> > > >  
> > > > +	pthread_mutex_unlock(&dso__data_open_lock);
> > > >  	return dso->data.fd;
> > > >  }
> > > >  
> > > > @@ -531,52 +537,66 @@ dso_cache__memcpy(struct dso_cache *cache, u64 offset,
> > > >  }
> > > >  
> > > >  static ssize_t
> > > > -dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
> > > > +dso_cache__read(struct dso *dso, struct machine *machine,
> > > > +		u64 offset, u8 *data, ssize_t size)
> > > >  {
> > > >  	struct dso_cache *cache;
> > > >  	struct dso_cache *old;
> > > > -	ssize_t ret;
> > > > -
> > > > -	do {
> > > > -		u64 cache_offset;
> > > 
> > > While I understand that there was no need for this do { } while (0)
> > > construct in the first place, removing it in this patch is not
> > > interesting, as it is both unrelated to this patch and makes it
> > > harder to review by just looking at the patch :-\ Please refrain from
> > > doing this in this patch.
> > 
> > Understood, sorry for bothering! :)
> 
> :-)
> 
> > > A later patch that does _just_ that could be done, if you feel like
> > > doing it.
> > 
> > Okay.
> > 
> > 
> > > 
> > > > +	ssize_t ret = -EINVAL;
> > > > +	u64 cache_offset;
> > > >  
> > > > -		ret = -ENOMEM;
> > > > +	cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> > > > +	if (!cache)
> > > > +		return -ENOMEM;
> > > >  
> > > > -		cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> > > > -		if (!cache)
> > > > -			break;
> > > > +	cache_offset = offset & DSO__DATA_CACHE_MASK;
> > > >  
> > > > -		cache_offset = offset & DSO__DATA_CACHE_MASK;
> > > > -		ret = -EINVAL;
> > > > +	pthread_mutex_lock(&dso__data_open_lock);
> > > >  
> > > > -		if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
> > > > -			break;
> > > > +	/*
> > > > +	 * dso->data.fd might be closed if other thread opened another
> > > > +	 * file (dso) due to open file limit (RLIMIT_NOFILE).
> > > > +	 */
> > > > +	if (dso->data.fd < 0) {
> > > > +		dso->data.fd = open_dso(dso, machine);
> > > 
> > > Also please consider adding a backpointer to machine in the dso object,
> > > since you need to reopen it, so that we don't have to go on passing
> > > machine around to dso_cache__read(), etc.
> > 
> > Yeah, it's a pain to pass a machine pointer around.
> 
> Hey, so the setting of dso->data.fd is protected in this function, and we
> can be sure that it will not be closed _again_ just before we do that
> lseek and the other operations?  I guess so, just checking...

Right.  Accessing dso->data.fd is safe only while holding
dso__data_open_lock.
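
I.e. once the lock is held nothing can close the fd under us, so the
cache read path boils down to this (just a condensed restating of the
hunk above, not the exact code):

        pthread_mutex_lock(&dso__data_open_lock);

        /*
         * Another thread may have closed our fd while opening a
         * different dso and hitting RLIMIT_NOFILE, so reopen if needed.
         */
        if (dso->data.fd < 0)
                dso->data.fd = open_dso(dso, machine);

        if (dso->data.fd >= 0 &&
            lseek(dso->data.fd, cache_offset, SEEK_SET) != -1)
                ret = read(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE);

        pthread_mutex_unlock(&dso__data_open_lock);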


>  
> > > 
> > > This probably needs to be done in the patch that makes dso->data.fd get
> > > closed due to the limit.
> > 
> > I don't know which patch you are referring to.  It already closes an fd
> > if it reaches the limit - what this patch does is protect such
> > concurrent opens and closes when multiple threads are used.
> 
> Ok then, i.e.:
> 
> A) take the lock, close it if over the limit, drop the lock.
> 
> B) take the lock, check if it is closed, open if so, use it, drop the
> lock.
> 
> Is that right?

It's like this:

A) take the lock, see that the fd is still open, use it and drop the lock.

B) take the lock, see that the fd was closed, and reopen it.  If that hits
the open file limit, close the first (roughly LRU) open dso so that the
next open() succeeds.  Then use the fd and drop the lock.  See the
do_open() sketch below.
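
From memory, the close-and-retry in case B lives inside do_open(),
roughly like this (error reporting trimmed; close_first_dso() is the
helper that drops the head of the dso__data_open list, treat the exact
names as approximate):

        static int do_open(char *name)
        {
                int fd;

                do {
                        fd = open(name, O_RDONLY);
                        if (fd >= 0)
                                return fd;

                        /*
                         * Only retry when the failure was EMFILE (too
                         * many open files) and there are still open
                         * dsos we can evict; otherwise give up.
                         */
                        if (!dso__data_open_cnt || errno != EMFILE)
                                break;

                        /*
                         * Close the oldest entry on the dso__data_open
                         * list (roughly LRU) so the retry can succeed.
                         */
                        close_first_dso();
                } while (1);

                return -1;
        }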

Thanks,
Namhyung