lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 30 Mar 2022 16:22:53 +0200
From:   Greg KH <gregkh@...uxfoundation.org>
To:     Mukesh Ojha <quic_mojha@...cinc.com>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        linux-kernel@...r.kernel.org
Subject: Re: Possible case of Race in kobject_get_path()

On Wed, Mar 30, 2022 at 06:25:29PM +0530, Mukesh Ojha wrote:
> 
> On 3/30/2022 5:01 PM, Greg KH wrote:
> > On Wed, Mar 30, 2022 at 04:53:41PM +0530, Mukesh Ojha wrote:
> > > On 3/30/2022 4:49 PM, Greg KH wrote:
> > > > On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote:
> > > > > On 3/30/2022 4:31 PM, Greg KH wrote:
> > > > > > On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote:
> > > > > > > On 3/30/2022 3:46 PM, Greg KH wrote:
> > > > > > > > On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote:
> > > > > > > > > Hi All,
> > > > > > > > > 
> > > > > > > > > We are facing one issue where one driver (p1) is trying to register its
> > > > > > > > > device from driver probe
> > > > > > > > > and from another path (p2) dev_set_name(new name) done from driver probe of
> > > > > > > > > the added device whose
> > > > > > > > > new name length can be more than earlier done in (p1 path) which could
> > > > > > > > > result in redzone overwritten issue.
> > > > > > > > I do not understand, what specific driver is doing this so that we can
> > > > > > > > see an example of this problem?
> > > > > > > trying to paste some logs of the race.
> > > > > > > 
> > > > > > > [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone
> > > > > > > overwritten
> > > > > > What kernel version is this?
> > > > > 5.10
> > > > That is very old, is this an issue in 5.17?
> > > > 
> > > > > > > [ 14.244189][ T503]
> > > > > > > -----------------------------------------------------------------------------
> > > > > > > [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f
> > > > > > > @offset=3965. First byte 0x2f instead of 0xcc
> > > > > > > [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1
> > > > > > > cpu=0 pid=503
> > > > > > > [ 14.273111][ T503] kobject_get_path+0x58/0x100
> > > > > > > [ 14.277747][ T503] kobject_uevent_env+0x120/0x744
> > > > > > > [ 14.282639][ T503] device_add+0x98c/0x1028
> > > > > > > [ 14.286925][ T503] *device_register+0x24/0x38*
> > > > > > > [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus]
> > > > > > > [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus]
> > > > > > > [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl]
> > > > > > > [ 14.310007][ T503] platform_drv_probe+0x60/0x180
> > > > > > > [ 14.314812][ T503] really_probe+0x208/0xb64
> > > > > > > [ 14.319184][ T503] driver_probe_device+0x130/0x254
> > > > > > > [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324
> > > > > > > [ 14.329399][ T503] __device_attach+0x170/0x364
> > > > > > > [ 14.334035][ T503] bus_probe_device+0x4c/0x164
> > > > > > > [ 14.338671][ T503] device_add+0xa60/0x1028
> > > > > > > [ 14.342957][ T503] platform_device_add+0x260/0x2c8
> > > > > > > [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804
> > > > > > > [slim_qcom_ngd_ctrl]
> > > > > > > [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1
> > > > > > > cpu=4 pid=518
> > > > > > > [ 14.363018][ T503] do_init_module+0xac/0x494
> > > > > > > [ 14.367475][ T503] load_module+0x35c0/0x3e54
> > > > > > > [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8
> > > > > > > [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30
> > > > > > > [ 14.382328][ T503] el0_svc_common+0xdc/0x294
> > > > > > > [ 14.386786][ T503] el0_svc+0x38/0x9c
> > > > > > > [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0
> > > > > > > 
> > > > > > > 
> > > > > > > [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc
> > > > > > > cc cc cc cc 2f 64 65 .............*/de*
> > > > > > > [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74
> > > > > > > 66 6f 72 6d 2f 73 vices/platform/s
> > > > > > > [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30
> > > > > > > 2e 73 6c 69 6d 2f oc/3340000.slim/
> > > > > > > [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d
> > > > > > > 6e 67 64 2e 31 2f qcom,slim-ngd.1/
> > > > > > > [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73
> > > > > > > 6c 61 76 65 00 00 *btfmslim_slave*..
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 499 static int btfm_slim_probe(struct slim_device *slim)
> > > > > > I do not see that function in Linus's tree right now.  Where does it
> > > > > > come from?
> > > > > Don't you think, it can be caused by any kernel module.
> > > > Kernel code do not protect themselves against other kernel code doing
> > > > abusive things.
> > > > 
> > > > Perhaps fix your out-of-tree code to not do this if no in-kernel code is
> > > > doing this?  Nothing we can do at all about out-of-tree code that we
> > > > have never seen.
> > > 
> > > Thanks for the comment, Greg.
> > > Why do we make dev_set_name() make it available to module apart from the
> > > core ?
> > Because busses can be in a module, just like you should be doing :)
> > 
> > Please submit your code upstream for review.
> 
> I still did not get why one should not do device name change if
> dev_set_name() is available for anyone to do name change while
> kobject_get_path() is running?

I do not see where in the kernel today this is happening.  What code
changes the name of a device like this?

> This is possible race, if we know this is the limitation/issue and would be
> difficult to address , that is fine and should be documented.

Again, why are you doing this?  What needs to change the device name in
this way and what is it racing with?

Without code examples, speculating about all of this is quite difficult
to discuss.

thanks,

greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ