Message-ID: <87frg96uxg.wl-tiwai@suse.de>
Date: Mon, 09 Jun 2025 13:00:59 +0200
From: Takashi Iwai <tiwai@...e.de>
To: Christophe Leroy <christophe.leroy@...roup.eu>
Cc: Takashi Iwai <tiwai@...e.de>,
Jaroslav Kysela <perex@...ex.cz>,
Takashi Iwai <tiwai@...e.com>,
linux-kernel@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org,
linux-sound@...r.kernel.org,
Herve Codina <herve.codina@...tlin.com>,
Mark Brown <broonie@...nel.org>
Subject: Re: [PATCH v2] ALSA: pcm: Convert multiple {get/put}_user to user_access_begin/user_access_end()
On Mon, 09 Jun 2025 12:02:00 +0200,
Christophe Leroy wrote:
>
>
>
> On 09/06/2025 at 10:10, Takashi Iwai wrote:
> > On Mon, 09 Jun 2025 10:00:38 +0200,
> > Christophe Leroy wrote:
> >>
> > >> With user access protection (called SMAP on x86 or KUAP on powerpc)
> >> each and every call to get_user() or put_user() performs heavy
> >> operations to unlock and lock kernel access to userspace.
> >>
> >> To avoid that, perform user accesses by blocks using
> >> user_access_begin/user_access_end() and unsafe_get_user()/
> > >> unsafe_put_user() and the like.
> >>
> > >> As an example, before the patch the 9 calls to put_user() at the
> >> end of snd_pcm_ioctl_sync_ptr_compat() imply the following set of
> >> instructions about 9 times (access_ok - enable user - write - disable
> >> user):
> >> 0.00 : c057f858: 3d 20 7f ff lis r9,32767
> >> 0.29 : c057f85c: 39 5e 00 14 addi r10,r30,20
> >> 0.77 : c057f860: 61 29 ff fc ori r9,r9,65532
> >> 0.32 : c057f864: 7c 0a 48 40 cmplw r10,r9
> >> 0.36 : c057f868: 41 a1 fb 58 bgt c057f3c0 <snd_pcm_ioctl+0xbb0>
> >> 0.30 : c057f86c: 3d 20 dc 00 lis r9,-9216
> >> 1.95 : c057f870: 7d 3a c3 a6 mtspr 794,r9
> >> 0.33 : c057f874: 92 8a 00 00 stw r20,0(r10)
> >> 0.27 : c057f878: 3d 20 de 00 lis r9,-8704
> >> 0.28 : c057f87c: 7d 3a c3 a6 mtspr 794,r9
> >> ...
> >>
> >> A perf profile shows that in total the 9 put_user() represent 36% of
> >> the time spent in snd_pcm_ioctl() and about 80 instructions.
> >>
> > >> With this patch everything is done in 13 instructions and represents
> >> only 15% of the time spent in snd_pcm_ioctl():
> >>
> >> 0.57 : c057f5dc: 3d 20 dc 00 lis r9,-9216
> >> 0.98 : c057f5e0: 7d 3a c3 a6 mtspr 794,r9
> >> 0.16 : c057f5e4: 92 7f 00 04 stw r19,4(r31)
> >> 0.63 : c057f5e8: 93 df 00 0c stw r30,12(r31)
> >> 0.16 : c057f5ec: 93 9f 00 10 stw r28,16(r31)
> >> 4.95 : c057f5f0: 92 9f 00 14 stw r20,20(r31)
> >> 0.19 : c057f5f4: 92 5f 00 18 stw r18,24(r31)
> >> 0.49 : c057f5f8: 92 bf 00 1c stw r21,28(r31)
> >> 0.27 : c057f5fc: 93 7f 00 20 stw r27,32(r31)
> >> 5.88 : c057f600: 93 36 00 00 stw r25,0(r22)
> >> 0.11 : c057f604: 93 17 00 00 stw r24,0(r23)
> >> 0.00 : c057f608: 3d 20 de 00 lis r9,-8704
> >> 0.79 : c057f60c: 7d 3a c3 a6 mtspr 794,r9
> >>
> >> Note that here the access_ok() in user_write_access_begin() is skipped
> >> because the exact same verification has already been performed at the
> > >> beginning of the function with the call to user_read_access_begin().
> >>
> >> A couple more can be converted as well but require
> >> unsafe_copy_from_user() which is not defined on x86 and arm64, so
> >> those are left aside for the time being and will be handled in a
> >> separate patch.
> >>
> >> Signed-off-by: Christophe Leroy <christophe.leroy@...roup.eu>
> >> ---
> >> v2: Split out the two hunks using copy_from_user() as unsafe_copy_from_user() is not implemented on x86 and arm64 yet.
> >
> > Thanks for the patch.
> >
> > The idea looks interesting, but the implementation with
> > unsafe_get_user() leads to very ugly goto lines, and that's too bad;
> > it makes the code flow much more difficult to follow.
> >
> > I guess that, in most cases this patch tries to cover, we could just
> > use another temporary variable for the compat struct, copy the fields
> > locally, then run copy_to_user() in one shot instead.
>
> Thanks for looking.
>
> I'll give it a try but I think going through a local intermediate will
> be less performant than direct copy with unsafe_get/put_user().

Yes, but code readability is often more important than minor
optimizations unless it's in a hot path.

thanks,
Takashi
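
As a rough illustration of the trade-off discussed in this thread, the
userspace sketch below contrasts the three approaches: one put_user() per
field, block access with unsafe_put_user() and a goto label, and a local
temporary followed by a single copy_to_user(). Everything in it is an
assumption made for illustration: the mock_* helpers, struct demo_status32,
and the window_opens and mock_fault variables are hypothetical stand-ins,
not the kernel API and not the code from the patch. The mocks only count
how often the user-access window would be opened; they do not model
access_ok() or the real cost of SMAP/KUAP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace mocks, NOT the kernel API. They only count how
 * many times the user-access window would be opened. */
static int window_opens;
static int mock_fault;		/* set to 1 to simulate a faulting access */

/* Illustrative layout only; not the real compat sync_ptr structure. */
struct demo_status32 {
	uint32_t state;
	uint32_t hw_ptr;
	uint32_t appl_ptr;
};

/* Stand-in for put_user(): opens and closes its own window per call. */
static int mock_put_user(uint32_t val, uint32_t *dst)
{
	window_opens++;
	if (mock_fault)
		return -1;
	*dst = val;
	return 0;
}

/* Stand-ins for user_access_begin() and user_access_end(). */
static int mock_user_access_begin(void)
{
	window_opens++;
	return 1;
}

static void mock_user_access_end(void)
{
}

/* Stand-in for unsafe_put_user(): jumps to the label on a fault. */
#define mock_unsafe_put_user(val, dst, label)	\
	do {					\
		if (mock_fault)			\
			goto label;		\
		*(dst) = (val);			\
	} while (0)

/* Stand-in for copy_to_user(): returns the number of bytes NOT copied. */
static size_t mock_copy_to_user(void *dst, const void *src, size_t n)
{
	window_opens++;
	if (mock_fault)
		return n;
	memcpy(dst, src, n);
	return 0;
}

/* 1) One put_user() per field: one window per field. */
static int store_fieldwise(struct demo_status32 *u, uint32_t state,
			   uint32_t hw_ptr, uint32_t appl_ptr)
{
	assert(u);
	if (mock_put_user(state, &u->state) ||
	    mock_put_user(hw_ptr, &u->hw_ptr) ||
	    mock_put_user(appl_ptr, &u->appl_ptr))
		return -1;
	return 0;
}

/* 2) Block access: a single window, at the price of a goto label. */
static int store_unsafe(struct demo_status32 *u, uint32_t state,
			uint32_t hw_ptr, uint32_t appl_ptr)
{
	assert(u);
	if (!mock_user_access_begin())
		return -1;
	mock_unsafe_put_user(state, &u->state, efault);
	mock_unsafe_put_user(hw_ptr, &u->hw_ptr, efault);
	mock_unsafe_put_user(appl_ptr, &u->appl_ptr, efault);
	mock_user_access_end();
	return 0;
efault:
	mock_user_access_end();
	return -1;
}

/* 3) Local temporary plus one copy: a single window and no goto. */
static int store_via_local(struct demo_status32 *u, uint32_t state,
			   uint32_t hw_ptr, uint32_t appl_ptr)
{
	struct demo_status32 tmp;

	assert(u);
	memset(&tmp, 0, sizeof(tmp));	/* no uninitialized bytes copied out */
	tmp.state = state;
	tmp.hw_ptr = hw_ptr;
	tmp.appl_ptr = appl_ptr;
	if (mock_copy_to_user(u, &tmp, sizeof(tmp)))
		return -1;
	return 0;
}
```

In this sketch, variants 2 and 3 each open the window once where variant 1
opens it once per field. Variant 3 avoids the goto label but pays for an
extra local copy, which is the cost Christophe raises above.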