Date: Mon, 2 Dec 2024 17:41:34 +0800
From: Zhenlei Huang <zlei@FreeBSD.org>
To: Warner Losh <imp@freebsd.org>
Cc: stable@freebsd.org, Edward Tomasz Napierala <trasz@FreeBSD.org>
Subject: MFC fixes for uninitialized kernel stack variables in sys/cam or do direct fix for pvscsi driver
Message-ID: <0DDE1B66-B794-472D-A901-54FA2FF1E853@FreeBSD.org>

Hi Warner,

Recently I upgraded some ESXi VMs from 13.3 to 13.4 and noticed a weird report for the SAS speed. The boot console shows the following,

```
da0 at pvscsi0 bus 0 scbus2 target 0 lun 0
da0: <VMware Virtual disk 2.0> Fixed Direct Access SPC-4 SCSI device
da0: 4294967.295MB/s transfers
```

But camcontrol reports the correct value,

```
# camcontrol inquiry da0 -R
pass1: 750.000MB/s transfers, Command Queueing Enabled
```

The `4294967.295MB` is actually 0xffff_ffff or -1, but I do not see any logic that sets those values. Finally I managed to get the stack trace,

```
_scsi_announce_periph
scsi_announce_periph_sbuf
xpt_announce_periph_sbuf
dadone_proberc
xpt_done_process
xpt_done_td
fork_exit
fork_trampoline
```

and noticed that the last param `cts` of `_scsi_announce_periph(struct cam_periph *periph, u_int *speed, u_int *freq, struct ccb_trans_settings *cts)` comes from the kernel stack and is not properly initialized. Later I found some commits related to this,

076686fe0703 cam: make sure to clear CCBs allocated on the stack
ec5325dbca62 cam: make sure to clear even more CCBs allocated on the stack
0f206cc91279 cam: add missing zeroing of a stack-allocated CCB.
616a676a0535 cam: clear stack-allocated CCB in the target layer

I applied them to stable/13, rebuilt and rebooted, and now the speed of da0 is reported correctly. I also tried to patch the pvscsi driver with a few lines and it also works as intended.

```
--- a/sys/dev/vmware/pvscsi/pvscsi.c
+++ b/sys/dev/vmware/pvscsi/pvscsi.c
@@ -1444,6 +1444,10 @@ pvscsi_action(struct cam_sim *sim, union ccb *ccb)
 		cts->proto_specific.scsi.flags = CTS_SCSI_FLAGS_TAG_ENB;
 		cts->proto_specific.scsi.valid = CTS_SCSI_VALID_TQ;
 
+		/* Prefer connection speed over sas port speed */
+		/* cts->xport_specific.sas.bitrate = 0; */
+		cts->xport_specific.sas.valid = 0;
+
 		ccb_h->status = CAM_REQ_CMP;
 		xpt_done(ccb);
```

Things are clear now and I know why this weird speed happens, so it is time to decide how to fix it.

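To illustrate the arithmetic behind the bogus value, here is a minimal user-space sketch in C. It is not the kernel code: the struct, the `SKETCH_SAS_VALID_SPEED` flag, and both function names are hypothetical stand-ins. It also assumes that the speed is kept in KB/s and printed as `speed / 1000` followed by a three-digit `speed % 1000`, which is inferred from the console output above rather than quoted from sys/cam.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the CAM types; NOT the kernel definitions. */
#define SKETCH_SAS_VALID_SPEED 0x01u

struct sketch_sas_settings {
	uint32_t valid;   /* bitmask: which fields below are meaningful */
	uint32_t bitrate; /* link speed, assumed to be in KB/s */
};

/*
 * Choose the speed to announce: use the SAS bitrate only when the
 * "valid" mask says it was filled in, else fall back to base_speed.
 */
static uint32_t
sketch_pick_speed(const struct sketch_sas_settings *sas, uint32_t base_speed)
{
	if (sas->valid & SKETCH_SAS_VALID_SPEED)
		return (sas->bitrate);
	return (base_speed);
}

/* Format a KB/s value as "<MB>.<3-digit remainder>MB/s" (assumed format). */
static void
sketch_format_speed(uint32_t speed_kb, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%u.%03uMB/s",
	    (unsigned)(speed_kb / 1000), (unsigned)(speed_kb % 1000));
}
```

Filling a `struct sketch_sas_settings` with `memset(&sas, 0xff, sizeof(sas))` to simulate stack garbage makes the valid bit appear set and the bitrate read as 0xffffffff, which formats as `4294967.295MB/s`. Zeroing the struct first, as the four commits do for the stack-allocated CCBs, leaves the valid bit clear, so a base speed of 750000 KB/s is used and formats as `750.000MB/s`.
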
Fixing the consumer of CAM, aka the pvscsi driver, is quite simple and promising. I did a quick search and it appears other consumers set `cts->xport_specific.sas.valid` correctly. That does not fully convince me, though, as I'm quite new to the CAM subsystem.

Which one do you prefer: MFC the commits to stable/13, or do a direct fix for the pvscsi driver in stable/13?

Best regards,
Zhenlei