lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9c92913c-c04b-4784-9cdc-5d75b10d2ed9@mev.co.uk>
Date: Mon, 20 Oct 2025 11:45:37 +0100
From: Ian Abbott <abbotti@....co.uk>
To: Nikita Zhandarovich <n.zhandarovich@...tech.ru>,
 syzbot+af53dea94b16396bc646@...kaller.appspotmail.com,
 H Hartley Sweeten <hsweeten@...ionengravers.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 linux-kernel@...r.kernel.org, lvc-project@...uxtesting.org
Subject: Re: [syzbot] [kernel?] divide error in comedi_inc_scan_progress

On 16/10/2025 12:05, Nikita Zhandarovich wrote:
>> Oops: divide error: 0000 [#1] SMP KASAN PTI
>> CPU: 0 UID: 0 PID: 11660 Comm: irq/7-comedi_pa Not tainted syzkaller #0 PREEMPT_{RT,(full)}
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
>> RIP: 0010:comedi_inc_scan_progress+0x1a4/0x430 drivers/comedi/drivers.c:563
> 
> Hi,
> 
> I would like to solicit advice on how to properly address this
> issue [1], if no one minds.
> 
> First, I think both [1] and [2] problems are similar in the way
> they are triggered. While there are no syzkaller-side reproducers for
> either of them (even console logs do not have proper traces of what
> combinations of syscalls provoked wrong division), the blame clearly
> lies with comedi driver-specific interrupt handlers
> (parport_interrupt, das16m1_interrupt etc.).
> 
> Syzkaller at its current state manages to fuzz select comedi drivers
> by manually configuring them via COMEDI_DEVCONFIG ioctl. In the course
> of do_devconfig_ioctl() and, for instance, parport_attach() functions,
> specific irq handlers are enabled (parport_interrupt) and these
> handlers in turn interact with async->cmd->XXX values such as
> async->cmd->chanlist_len.
> 
> However, in the absence of ioctls related to cmd setup, simply
> after a single COMEDI_DEVCONFIG, async (and async->cmd) is
> initialized in __comedi_device_postconfig_async() with kzalloc.
> 
> Thus, when there is an irq is to be dealt with, these "empty"
> comedi_async objects and, specifically async->cmd->XXX, are
> processed leading to erroneous divisions like in [1] and [2].
> 
> The easiest solution, similar to one suggested in [2], is to check for
> divisor with zero values. In case of [1], comedi_inc_scan_progress
> would look something like this:
> 
>      ...
>      if (!(s->subdev_flags & SDF_PACKED) && (cmd->chanlist_len != 0)) {
> 	async->cur_chan += comedi_bytes_to_samples(s, num_bytes);
> 	async->cur_chan %= cmd->chanlist_len;
>      }
>      ...
> 
> Any suggestions are greatly appreciated!

I have a plan to deal with these unexpected interrupts, at least within 
the Comedi core functions.  The basic idea would be to for the Comedi 
core functions called by the ISRs to check that the subdevice is in the 
running state and increment a reference counter (either a refcount_t or 
a struct kref) if it is safe to proceed.  Then it will be allowed to 
assume that the struct comedi_async contents are reasonable until it 
decrements the reference counter.  Some other task may be calling 
do_become_nonbusy() in parallel with the above.  After 
do_become_nonbusy() clears the COMEDI_SRF_RUNNING flag to mark the 
subdevice as not being in the running state, it should wait until it is 
safe to continue before calling comedi_buf_reset().  This will make use 
of the reference counter and a struct completion.  The reference counter 
and struct completion can be stored as members of struct comedi_async.

There may be other parts of individual driver ISRs that use struct 
comedi_async directly and may need changing to do similar checks.  The 
checking and final decrement can done by a couple of new exported functions:

bool comedi_get_is_subdevice_running(struct comedi_subdevice *s);
void comedi_put_is_subdevice_running(struct comedi_subdevice *s);

If comedi_get_is_subdevice_running(s) returns true, the ISR can safely 
access s->async and then should call comedi_put_is_subdevice_running(s) 
when it has finished.


> P.S. To reiterate, I've failed to reproduce this error and this flawed
> analysis is theoretical only.
> 
> [1] https://syzkaller.appspot.com/bug?extid=af53dea94b16396bc646
> [2] https://syzkaller.appspot.com/bug?extid=f6c3c066162d2c43a66c
> 
> Regards,
> Nikita


-- 
-=( Ian Abbott <abbotti@....co.uk> || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ