lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 16 Oct 2019 17:28:07 +0200
From:   Stefan Haberland <sth@...ux.ibm.com>
To:     Qian Cai <cai@....pw>, Jan Hoeppner <hoeppner@...ux.ibm.com>,
        Heiko Carstens <heiko.carstens@...ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ibm.com>
Cc:     linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: memory leaks in dasd_eckd_check_characteristics() error paths

On 16.10.19 17:26, Qian Cai wrote:
> On Wed, 2019-10-16 at 16:56 +0200, Stefan Haberland wrote:
>> On 16.10.19 16:09, Qian Cai wrote:
>>> On Wed, 2019-10-16 at 15:29 +0200, Stefan Haberland wrote:
>>>> Hi,
>>>>
>>>> thanks for reporting this.
>>>>
>>>> On 02.10.19 21:33, Qian Cai wrote:
>>>>> For some reasons, dasd_eckd_check_characteristics() received -ENOMEM and then
>>>>> dasd_generic_set_online() emits this message,
>>>>>
>>>>> dasd: 0.0.0122 Setting the DASD online with discipline ECKD failed with rc=-12
>>>>>
>>>>> After that, there are several memory leaks below. There are "config_data" and
>>>>> then stored as,
>>>>>
>>>>> /* store per path conf_data */
>>>>> device->path[pos].conf_data = conf_data;
>>>>>
>>>>> When it processes the error path in  dasd_generic_set_online(), it calls
>>>>> dasd_delete_device() which nuke the whole "struct dasd_device" without freeing
>>>>> the device->path[].conf_data first. 
>>>> Usually dasd_delete_device() calls dasd_generic_free_discipline() which
>>>> takes care of
>>>> the device->path[].conf_data in dasd_eckd_uncheck_device().
>>>> From a first look this looks sane.
>>>>
>>>> So I need to spend a closer look if this does not happen correctly here.
>>> When dasd_eckd_check_characteristics() failed here,
>>>
>>> 	if (!private) {
>>> 		private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
>>> 		if (!private) {
>>> 			dev_warn(&device->cdev->dev,
>>> 				 "Allocating memory for private DASD data "
>>> 				 "failed\n");
>>> 			return -ENOMEM;
>>> 		}
>>> 		device->private = private;
>>>
>>> The device->private is NULL.
>>>
>>> Then, in dasd_eckd_uncheck_device(), it will return immediately.
>>>
>>> 	if (!private)
>>> 		return;
>> Yes but in this case there is no per_path configuration data stored.
>> This is done after the private structure is allocated successfully.
> Yes, you are right. It has been a while so I must lost a bit memory. Actually,
> it looks like in dasd_eckd_check_characteristic() that device->private is set to
> NULL from this path,
>
> 	/* Read Configuration Data */
> 	rc = dasd_eckd_read_conf(device);
> 	if (rc)
> 		goto out_err1;
>
> out_err1:
> 	kfree(private->conf_data);
> 	kfree(device->private);
> 	device->private = NULL;
> 	return rc;
>
> because dasd_eckd_read_conf() returns -ENOMEM and calls,
>
> out_error:
> 	kfree(rcd_buf);
> 	*rcd_buffer = NULL;
> 	*rcd_buffer_size = 0;
> 	return ret;
>
> It will only free its own config_data in one operational path, but there could
> has already allocated in earlier paths in dasd_eckd_read_conf() without any
> rollback and calls return,
>
> 	for (lpm = 0x80; lpm; lpm>>= 1) {
> 		if (!(lpm & opm))
> 			continue;
> 		rc = dasd_eckd_read_conf_lpm(device, &conf_data,
> 					     &conf_len, lpm);
> 		if (rc && rc != -EOPNOTSUPP) {	/* -EOPNOTSUPP is ok */
> 			DBF_EVENT_DEVID(DBF_WARNING, device->cdev,
> 					"Read configuration data returned "
> 					"error %d", rc);
> 			return rc;
> 		}
>
> Later, dasd_eckd_uncheck_device() see private->device is NULL without cleaning
> up. Does it make sense?

Yes, this looks like an error path not handling this correctly.

>
>>
>>>>> Is it safe to free those in
>>>>> dasd_free_device() without worrying about the double-free? Or, is it better to
>>>>> free those in dasd_eckd_check_characteristics()'s goto error handling, i.e.,
>>>>> out_err*?
>>>>>
>>>>> --- a/drivers/s390/block/dasd.c
>>>>> +++ b/drivers/s390/block/dasd.c
>>>>> @@ -153,6 +153,9 @@ struct dasd_device *dasd_alloc_device(void)
>>>>>   */
>>>>>  void dasd_free_device(struct dasd_device *device)
>>>>>  {
>>>>> +       for (int i = 0; i < 8; i++)
>>>>> +               kfree(device->path[i].conf_data);
>>>>> +
>>>>>         kfree(device->private);
>>>>>         free_pages((unsigned long) device->ese_mem, 1);
>>>>>         free_page((unsigned long) device->erp_mem);
>>>>>
>>>>>
>>>>> unreferenced object 0x0fcee900 (size 256):
>>>>>   comm "dasdconf.sh", pid 446, jiffies 4294940081 (age 170.340s)
>>>>>   hex dump (first 32 bytes):
>>>>>     dc 01 01 00 f0 f0 f2 f1 f0 f7 f9 f0 f0 c9 c2 d4  ................
>>>>>     f7 f5 f0 f0 f0 f0 f0 f0 f0 c6 d9 c2 f7 f1 62 33  ..............b3
>>>>>   backtrace:
>>>>>     [<00000000a83b1992>] kmem_cache_alloc_trace+0x200/0x388
>>>>>     [<00000000048ef3e2>] dasd_eckd_read_conf+0x408/0x1400 [dasd_eckd_mod]
>>>>>     [<00000000ce31f195>] dasd_eckd_check_characteristics+0x3cc/0x938
>>>>> [dasd_eckd_mod]
>>>>>     [<00000000f6f1759b>] dasd_generic_set_online+0x150/0x4c0
>>>>>     [<00000000efca1efa>] ccw_device_set_online+0x324/0x808
>>>>>     [<00000000f9779774>] online_store_recog_and_online+0xe8/0x220
>>>>>     [<00000000349a5446>] online_store+0x2ce/0x420
>>>>>     [<000000005bd145f8>] kernfs_fop_write+0x1bc/0x270
>>>>>     [<0000000005664197>] vfs_write+0xce/0x220
>>>>>     [<0000000044a8bccb>] ksys_write+0xea/0x190
>>>>>     [<0000000037335938>] system_call+0x296/0x2b4
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ