Date:      Sat, 09 Dec 2023 17:14:53 +0000
From:      bugzilla-noreply@freebsd.org
To:        ports-bugs@FreeBSD.org
Subject:   [Bug 275659] sysutils/nut: NUT USB communication to UPS fails repeatedly
Message-ID:  <bug-275659-7788@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275659

            Bug ID: 275659
           Summary: sysutils/nut: NUT USB communication to UPS fails
                    repeatedly
           Product: Ports & Packages
           Version: Latest
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: Individual Port(s)
          Assignee: cy@FreeBSD.org
          Reporter: lloydsystems1@tpg.com.au
             Flags: maintainer-feedback?(cy@FreeBSD.org)

Overview:
Network UPS Tools experiences repeated issues with the USB driver losing
communication to the UPS, resulting in stale data and connection failure
errors. On system boot, NUT will start and operate correctly, but loses
communication to the UPS after a few hours. Once communication is dead,
attempts to restart the NUT service fail. The only means of recovery is to
reboot or unplug and reconnect the UPS USB cable.

The information here is also posted on the FreeBSD forum, and includes a
workaround for recovery.
https://forums.freebsd.org/threads/sysutils-nut-usb-communication-failure.91092/


Steps to Reproduce:
Start the NUT services (nut, nut_upsmon, nut_upslog).
Monitor the system log.


Actual Results:
The USB driver loses communication after a few hours. upsd marks the data as
stale, and the following appears in the system log.

upsd[3132]: Data for UPS [smt750] is stale - check driver
upsmon[5991]: Poll UPS [smt750@localhost] failed - Data stale
upsmon[5991]: Communications with UPS smt750@localhost lost

On loss of communication, upsmon is configured to raise a notify event (nocomm
or commbad) to the scheduler, and a 30-second timer will restart the NUT
service via the upssched-cmd script. But the service restart fails because the
driver refuses the connection, as shown in the system log.

upssched-cmd[34596]: Timer event comm-bad triggered: restarting NUT service
upsd[3132]: mainloop: Interrupted system call
upsmon[5991]: Poll UPS [smt750@localhost] failed - Write error: Broken pipe
upsmon[5991]: UPS [smt750@localhost]: connect failed: Connection failure:
Connection refused
usbhid-ups[2912]: libusb1: Could not open any HID devices: insufficient
permissions on everything
upsmon[5991]: UPS [smt750@localhost]: connect failed: Connection failure:
Connection refused
root[37497]: /usr/local/etc/rc.d/nut: WARNING: failed precmd routine for nut
upsmon[5991]: UPS [smt750@localhost]: connect failed: Connection failure:
Connection refused
upsmon[5991]: UPS smt750@localhost is unavailable
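
For anyone trying to reproduce this, the failure sequence can be picked out of
the system log mechanically. The following is a minimal Python sketch, not part
of NUT: the event names and the classify() helper are hypothetical, and the
patterns are taken only from the log lines quoted in this report, so other NUT
versions may word these messages differently.

```python
import re

# Patterns derived from the log lines quoted in this report.
# The event names on the left are hypothetical labels, not NUT terminology.
PATTERNS = {
    "stale": re.compile(
        r"upsd\[\d+\]: Data for UPS \[(?P<ups>[^\]]+)\] is stale"),
    "comm_lost": re.compile(
        r"upsmon\[\d+\]: Communications with UPS (?P<ups>\S+) lost"),
    "connect_failed": re.compile(
        r"upsmon\[\d+\]: UPS \[(?P<ups>[^\]]+)\]: connect failed"),
    "usb_open_failed": re.compile(
        r"usbhid-ups\[\d+\]: libusb1: Could not open any HID devices"),
}


def classify(line):
    """Return (event, ups_name_or_None) for a log line, or None if no match."""
    for event, pattern in PATTERNS.items():
        match = pattern.search(line)
        if match:
            # Patterns without a named "ups" group yield None for the name.
            return event, match.groupdict().get("ups")
    return None


if __name__ == "__main__":
    sample = "upsd[3132]: Data for UPS [smt750] is stale - check driver"
    print(classify(sample))
```

Feeding each log line through classify() gives a timeline of when the data went
stale and when the USB device could no longer be opened.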


Expected Results:
The NUT upsd and USB driver should not lose communication to the UPS for no
apparent reason.


Additional Information:
1. The UPS is an APC Smart-UPS 750.

2. FreeBSD runs in a virtual machine on VMware ESXi. The UPS connects to the
ESXi host by USB cable and to the FreeBSD guest by USB pass-through. The ESXi
host has NUT client software installed and communicates with the FreeBSD NUT
server to shut down the guests and the host on power failure.

3. I had this setup (same server and UPS) running NUT on a CentOS 6 virtual
machine, and it worked perfectly for years without UPS communication issues.
When I moved the NUT server to FreeBSD, I copied the working configuration
(settings changed from defaults) from CentOS.

4. Settings applied in ups.conf
   [smt750]
       driver = usbhid-ups
       port = auto
       desc = "APC Smart-UPS SMT750I"
       offdelay = 180
       ondelay = 240

5. Settings applied in upsd.conf
   LISTEN 127.0.0.1 3493
   LISTEN <the.vm.ip.address> 3493

6. The following was added to /boot/loader.conf so that the kernel USB HID
driver will detach from the device and allow NUT to bind to it.
   usb_quirk_load="YES"
   hw.usb.quirk.0="0x051d 0x0003 0 0xffff UQ_HID_IGNORE"

7. NUT has correct owner and permissions for the device.
   # ll /dev/usb
   crw-rw----  1 root  nut       0x84 Nov  9 17:51 0.4.0

8. Output of usbconfig when everything is working.
   # usbconfig
   ugen0.4: <American Power Conversion Smart-UPS 750 FW:UPS 15.0 / ID18> at
usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (2mA)
   # usbconfig -d ugen0.4 dump_device_desc
   bLength = 0x0012
   bDescriptorType = 0x0001
   bcdUSB = 0x0200
   bDeviceClass = 0x0000  <Probed by interface class>
   bDeviceSubClass = 0x0000
   bDeviceProtocol = 0x0000
   bMaxPacketSize0 = 0x0040
   idVendor = 0x051d
   idProduct = 0x0003
   bcdDevice = 0x0106
   iManufacturer = 0x0001  <American Power Conversion >
   iProduct = 0x0002  <Smart-UPS 750 FW:UPS 15.0 / ID=18>
   iSerialNumber = 0x0003  <xxxxxxxxxxxx  >
   bNumConfigurations = 0x0001

9. After repeated communication failures, the following changes were made.
Extending the poll interval improved the situation, increasing the period
between failures from hours to days. But it does not fix the problem.
   upsd.conf:    MAXAGE 30
   upsmon.conf:  DEADTIME 30
   ups.conf:     pollinterval = 10
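
As a sanity check on item 6, the quirk entry can be compared against the device
descriptor from item 8. The Python sketch below is illustrative only: it
assumes the hw.usb.quirk.N value has the five-field layout described in
usb_quirk(4) (vendor ID, product ID, low revision, high revision, quirk names),
and parse_usb_quirk() and quirk_matches() are hypothetical helper names.

```python
def parse_usb_quirk(value):
    """Split a hw.usb.quirk.N value into its five fields.

    Assumed layout: "VendorId ProductId LowRevision HighRevision UQ_QUIRK,..."
    """
    vendor, product, rev_lo, rev_hi, quirks = value.split(None, 4)
    return {
        # int(x, 0) accepts both "0x051d" and plain "0".
        "vendor": int(vendor, 0),
        "product": int(product, 0),
        "rev_lo": int(rev_lo, 0),
        "rev_hi": int(rev_hi, 0),
        "quirks": [q.strip() for q in quirks.split(",")],
    }


def quirk_matches(quirk, id_vendor, id_product, bcd_device):
    """True if the descriptor values fall inside the quirk's match range."""
    return (quirk["vendor"] == id_vendor
            and quirk["product"] == id_product
            and quirk["rev_lo"] <= bcd_device <= quirk["rev_hi"])


if __name__ == "__main__":
    quirk = parse_usb_quirk("0x051d 0x0003 0 0xffff UQ_HID_IGNORE")
    # idVendor, idProduct and bcdDevice as reported by usbconfig in item 8.
    print(quirk_matches(quirk, 0x051d, 0x0003, 0x0106))
```

With the values from this report the entry covers the device, which suggests
the quirk line itself is not the cause of the communication loss.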


Software Versions:
FreeBSD version 13.2-p4.
nut version     2.8.0_24 (FreeBSD package)

-- 
You are receiving this mail because:
You are the assignee for the bug.


