Date: Thu, 3 Jan 2019 01:25:23 -0800
From: Mark Millard <marklmi@yahoo.com>
To: Kyle Evans <kevans@freebsd.org>, freebsd-emulation@freebsd.org
Cc: freebsd-arm <freebsd-arm@freebsd.org>
Subject: Under qemu-aarch64-static "wc /dev/null" gets "Unsupported ancillary data: 1/0" from a sendmsg attempt: because of wrong cmsg_len type in target_cmsghdr
Message-ID: <22184643-4320-4B7C-86DA-A71DF62D4543@yahoo.com>
[This note follows the investigation sequence,
ending with the important conclusions.]

My test context here is a poudriere-devel bulk -i
for an amd64->aarch64 context.

wc /dev/null or wc //dev/null does:

# wc /dev/null
Unsupported ancillary data: 1/0

and then hangs until I ^C to get back to a prompt.

Here is what ktrace/kdump shows for the process, from
just before the hang through when I hit ^C to stop
the hang:

. . .
 98475 101033 qemu-aarch64-static 0.000340 CALL  sigprocmask[340](SIG_BLOCK,0x7ffffffe3c80,0x7ffffffe3d80)
 98475 101033 qemu-aarch64-static 0.000003 RET   sigprocmask[340] 0
 98475 101033 qemu-aarch64-static 0.000001 CALL  pselect[522](0x6,0,0x7ffffffe3fb0,0,0,0x7ffffffe3d80)
 98475 101033 qemu-aarch64-static 0.000001 RET   pselect[522] 1
 98475 101033 qemu-aarch64-static 0.000001 CALL  sigprocmask[340](SIG_SETMASK,0x7ffffffe3c80,0)
 98475 101033 qemu-aarch64-static 0.000001 RET   sigprocmask[340] 0
 98475 101033 qemu-aarch64-static 0.000042 CALL  write[4](0x2,0x7ffffffe3480,0x20)
 98475 101033 qemu-aarch64-static 0.000036 GIO   fd 2 wrote 32 bytes
       "Unsupported ancillary data: 1/0
       "
 98475 101033 qemu-aarch64-static 0.000003 RET   write[4] 32/0x20
 98475 101033 qemu-aarch64-static 0.000001 CALL  sendmsg[28](0x5,0x7ffffffe3c28,0)
 98475 101033 qemu-aarch64-static 0.000003 RET   sendmsg[28] -1 errno 22 Invalid argument
 98475 101033 qemu-aarch64-static 0.000184 CALL  close[6](0x3)
 98475 101033 qemu-aarch64-static 0.000040 RET   close[6] 0
 98475 101033 qemu-aarch64-static 0.000017 CALL  close[6](0x7)
 98475 101033 qemu-aarch64-static 0.000005 RET   close[6] 0
 98475 101033 qemu-aarch64-static 0.000002 CALL  sigprocmask[340](SIG_BLOCK,0x7ffffffe3c80,0x7ffffffe3d80)
 98475 101033 qemu-aarch64-static 0.000001 RET   sigprocmask[340] 0
 98475 101033 qemu-aarch64-static 0.000001 CALL  pselect[522](0x6,0x7ffffffe3dd0,0,0,0,0x7ffffffe3d80)
 98475 101539 qemu-aarch64-static 0.000089 RET   nanosleep[240] 0
 98475 101539 qemu-aarch64-static 0.000042 CALL  _umtx_op[454](0x86101f008,UMTX_OP_WAIT_UINT_PRIVATE,0,0,0)
 98475 101033 qemu-aarch64-static 15.845396 RET   pselect[522] -1 errno 4 Interrupted system call

Note the qemu-aarch64 generated message and the later:

sendmsg[28] -1 errno 22 Invalid argument

The qemu-*-static code that wrote the message is
from t2h_freebsd_cmsg and is:

        if ((cmsg->cmsg_level == TARGET_SOL_SOCKET) &&
            (cmsg->cmsg_type == SCM_RIGHTS)) {
            int *fd = (int *)data;
            int *target_fd = (int *)target_data;
            int i, numfds = len / sizeof(int);

            for (i = 0; i < numfds; i++) {
                fd[i] = tswap32(target_fd[i]);
            }
        } else if ((cmsg->cmsg_level == TARGET_SOL_SOCKET) &&
            (cmsg->cmsg_type == SCM_TIMESTAMP) &&
            (len == sizeof(struct timeval))) {
            /* copy struct timeval to host */
            struct timeval *tv = (struct timeval *)data;
            struct target_freebsd_timeval *target_tv =
                (struct target_freebsd_timeval *)target_data;
            __get_user(tv->tv_sec, &target_tv->tv_sec);
            __get_user(tv->tv_usec, &target_tv->tv_usec);
        } else {
            gemu_log("Unsupported ancillary data: %d/%d\n",
                cmsg->cmsg_level, cmsg->cmsg_type);
            memcpy(data, target_data, len);
        }

Well, it turns out that qemu-*-static's code has:

struct target_cmsghdr {
    abi_long    cmsg_len;
    int32_t     cmsg_level;
    int32_t     cmsg_type;
};

where for amd64 target_cmsghdr has:

(gdb) p/d sizeof(struct target_cmsghdr)
$2 = 16
(gdb) p/d sizeof(((struct target_cmsghdr *)0)->cmsg_len)
$5 = 8
(gdb) p/d &((struct target_cmsghdr *)0)->cmsg_level
$4 = 8
(gdb) p/d &((struct target_cmsghdr *)0)->cmsg_type
$1 = 12

which does not match the amd64 or aarch64 native:

struct cmsghdr {
        socklen_t       cmsg_len;       /* data byte count, including hdr */
        int             cmsg_level;     /* originating protocol */
        int             cmsg_type;      /* protocol-specific type */
/* followed by  u_char  cmsg_data[]; */
};

because the cmsghdr's cmsg_len is smaller, even on a 64-bit
architecture:

(gdb) p/d sizeof(((struct cmsghdr *)0)->cmsg_len)
$6 = 4

/usr/include/arpa/inet.h:typedef        __socklen_t     socklen_t;
/usr/include/netinet/in.h:typedef       __socklen_t     socklen_t;
/usr/include/netinet6/in6.h:typedef     __socklen_t     socklen_t;
/usr/include/sys/_types.h:typedef       __uint32_t      __socklen_t;
/usr/include/sys/socket.h:typedef       __socklen_t     socklen_t;
. . .
/usr/include/netdb.h:typedef    __socklen_t     socklen_t;

so abi_long does not match socklen_t for 64-bit architectures.

So code such as this, in t2h_freebsd_cmsg:

        cmsg->cmsg_level = tswap32(target_cmsg->cmsg_level);
        cmsg->cmsg_type = tswap32(target_cmsg->cmsg_type);

is not using the correct target offsets when (for example)
aarch64 is the target that it is extracting from: it reads
cmsg_level at offset 8 and cmsg_type at offset 12, while the
target actually stored them at offsets 4 and 8.

For comparison on a 64-bit architecture:

(gdb) p/d sizeof(struct cmsghdr)
$1 = 12
(gdb) p/d &((struct cmsghdr *)0)->cmsg_level
$2 = 4
(gdb) p/d &((struct cmsghdr *)0)->cmsg_type
$3 = 8

I do not yet have a tested change.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)