linux-kernel - Re: [PATCHv4 01/17] zram: switch to non-atomic entry locking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <cr3ekbanqyzmeqeohwwlfqyced2fvtqh6kzermiahm4blonhd2@fod2itisib5f>
Date: Mon, 3 Feb 2025 16:11:59 +0900
From: Sergey Senozhatsky <senozhatsky@...omium.org>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, 
	Minchan Kim <minchan@...nel.org>, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv4 01/17] zram: switch to non-atomic entry locking

On (25/02/03 12:26), Sergey Senozhatsky wrote:
> On (25/01/31 14:55), Andrew Morton wrote:
> > > +static void zram_slot_write_lock(struct zram *zram, u32 index)
> > > +{
> > > +	atomic_t *lock = &zram->table[index].lock;
> > > +	int old = atomic_read(lock);
> > > +
> > > +	do {
> > > +		if (old != ZRAM_ENTRY_UNLOCKED) {
> > > +			cond_resched();
> > > +			old = atomic_read(lock);
> > > +			continue;
> > > +		}
> > > +	} while (!atomic_try_cmpxchg(lock, &old, ZRAM_ENTRY_WRLOCKED));
> > > +}
> > 
> > I expect that if the calling userspace process has realtime policy (eg
> > SCHED_FIFO) then the cond_resched() won't schedule SCHED_NORMAL tasks
> > and this becomes a busy loop.  And if the machine is single-CPU, the
> > loop is infinite.
> 
> So for that scenario to happen zram needs to see two writes() to the same
> index (page) simultaneously?  Or read() and write() on the same index (page)
> concurrently?

Just to put more details:

1) zram always works with only one particular zram entry index, which
   is provided by an upper layer (e.g. bi_sector)

2) for read

 read()
  zram_read(page, index)
   rlock entry[index]
   decompress entry zshandle page
   runlock entry[index]

   for write

 write()
  zram_write(page, index)
   len = compress page obj
   handle = zsmalloc len
   wlock entry[index]
   entry.handle = handle
   entry.len = len
   wunlock entry[index]

3) at no point zram locks more than entry index
  a) there is no entry cross-locking (entries are not hierarchical)
  b) there is no entry lock nesting (including recursion)


I guess where we actually need zram entry lock is writeback and
recompression.  Writeback moves object from zsmalloc pool to actual
physical storage, freeing zsmalloc memory after that and setting
zram entry[index] handle to the backikng device's block idx, which
needs synchronization.  Recompression does a similar thing, it frees
the old zsmalloc handle and stores recompressed objects under new
zsmalloc handle, it thus updates zram entry[index] handle to point
to the new location, which needs to be synchronized.