Date: Thu, 3 Jan 2019 01:25:23 -0800
From: Mark Millard <marklmi@yahoo.com>
To: Kyle Evans <kevans@freebsd.org>, freebsd-emulation@freebsd.org
Cc: freebsd-arm <freebsd-arm@freebsd.org>
Subject: Under qemu-aarch64-static "wc /dev/null" gets "Unsupported ancillary data: 1/0" from a sendmsg attempt: because of wrong cmsg_len type in target_cmsghdr
Message-ID: <22184643-4320-4B7C-86DA-A71DF62D4543@yahoo.com>
[This note follows the investigation sequence,
ending with the important conclusions.]
My test context here is a poudriere-devel bulk -i for an
amd64->aarch64 context.
wc /dev/null or wc //dev/null does:
# wc /dev/null
Unsupported ancillary data: 1/0
and then hangs until I ^C to get back to a prompt.
Here is what ktrace/kdump shows for the process, from just before the
hang through when I hit ^C to stop it:
. . .
98475 101033 qemu-aarch64-static 0.000340 CALL sigprocmask[340](SIG_BLOCK,0x7ffffffe3c80,0x7ffffffe3d80)
98475 101033 qemu-aarch64-static 0.000003 RET sigprocmask[340] 0
98475 101033 qemu-aarch64-static 0.000001 CALL pselect[522](0x6,0,0x7ffffffe3fb0,0,0,0x7ffffffe3d80)
98475 101033 qemu-aarch64-static 0.000001 RET pselect[522] 1
98475 101033 qemu-aarch64-static 0.000001 CALL sigprocmask[340](SIG_SETMASK,0x7ffffffe3c80,0)
98475 101033 qemu-aarch64-static 0.000001 RET sigprocmask[340] 0
98475 101033 qemu-aarch64-static 0.000042 CALL write[4](0x2,0x7ffffffe3480,0x20)
98475 101033 qemu-aarch64-static 0.000036 GIO fd 2 wrote 32 bytes
"Unsupported ancillary data: 1/0
"
98475 101033 qemu-aarch64-static 0.000003 RET write[4] 32/0x20
98475 101033 qemu-aarch64-static 0.000001 CALL sendmsg[28](0x5,0x7ffffffe3c28,0)
98475 101033 qemu-aarch64-static 0.000003 RET sendmsg[28] -1 errno 22 Invalid argument
98475 101033 qemu-aarch64-static 0.000184 CALL close[6](0x3)
98475 101033 qemu-aarch64-static 0.000040 RET close[6] 0
98475 101033 qemu-aarch64-static 0.000017 CALL close[6](0x7)
98475 101033 qemu-aarch64-static 0.000005 RET close[6] 0
98475 101033 qemu-aarch64-static 0.000002 CALL sigprocmask[340](SIG_BLOCK,0x7ffffffe3c80,0x7ffffffe3d80)
98475 101033 qemu-aarch64-static 0.000001 RET sigprocmask[340] 0
98475 101033 qemu-aarch64-static 0.000001 CALL pselect[522](0x6,0x7ffffffe3dd0,0,0,0,0x7ffffffe3d80)
98475 101539 qemu-aarch64-static 0.000089 RET nanosleep[240] 0
98475 101539 qemu-aarch64-static 0.000042 CALL _umtx_op[454](0x86101f008,UMTX_OP_WAIT_UINT_PRIVATE,0,0,0)
98475 101033 qemu-aarch64-static 15.845396 RET pselect[522] -1 errno 4 Interrupted system call
Note the qemu-aarch64 generated message and the later:
sendmsg[28] -1 errno 22 Invalid argument
The qemu-*-static code that wrote the message is from
t2h_freebsd_cmsg and is:
    if ((cmsg->cmsg_level == TARGET_SOL_SOCKET) &&
        (cmsg->cmsg_type == SCM_RIGHTS)) {
        int *fd = (int *)data;
        int *target_fd = (int *)target_data;
        int i, numfds = len / sizeof(int);
        for (i = 0; i < numfds; i++) {
            fd[i] = tswap32(target_fd[i]);
        }
    } else if ((cmsg->cmsg_level == TARGET_SOL_SOCKET) &&
        (cmsg->cmsg_type == SCM_TIMESTAMP) &&
        (len == sizeof(struct timeval))) {
        /* copy struct timeval to host */
        struct timeval *tv = (struct timeval *)data;
        struct target_freebsd_timeval *target_tv =
            (struct target_freebsd_timeval *)target_data;
        __get_user(tv->tv_sec, &target_tv->tv_sec);
        __get_user(tv->tv_usec, &target_tv->tv_usec);
    } else {
        gemu_log("Unsupported ancillary data: %d/%d\n",
            cmsg->cmsg_level, cmsg->cmsg_type);
        memcpy(data, target_data, len);
    }
Well, it turns out that the qemu-*-static code has:
struct target_cmsghdr {
abi_long cmsg_len;
int32_t cmsg_level;
int32_t cmsg_type;
};
where for amd64 target_cmsghdr has:
(gdb) p/d sizeof(struct target_cmsghdr)
$2 = 16
(gdb) p/d sizeof(((struct target_cmsghdr *)0)->cmsg_len)
$5 = 8
(gdb) p/d &((struct target_cmsghdr *)0)->cmsg_level
$4 = 8
(gdb) p/d &((struct target_cmsghdr *)0)->cmsg_type
$1 = 12
which does not match the amd64 or aarch64 native:
struct cmsghdr {
        socklen_t       cmsg_len;       /* data byte count, including hdr */
        int             cmsg_level;     /* originating protocol */
        int             cmsg_type;      /* protocol-specific type */
/* followed by  u_char  cmsg_data[]; */
};
because the cmsghdr's cmsg_len is smaller, even on a 64-bit architecture:
(gdb) p/d sizeof(((struct cmsghdr *)0)->cmsg_len)
$6 = 4
/usr/include/arpa/inet.h:typedef __socklen_t socklen_t;
/usr/include/netinet/in.h:typedef __socklen_t socklen_t;
/usr/include/netinet6/in6.h:typedef __socklen_t socklen_t;
/usr/include/sys/_types.h:typedef __uint32_t __socklen_t;
/usr/include/sys/socket.h:typedef __socklen_t socklen_t;
. . .
/usr/include/netdb.h:typedef __socklen_t socklen_t;
so abi_long does not match socklen_t for 64-bit architectures.
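A minimal sketch of the size mismatch, assuming abi_long is 8 bytes for a 64-bit target. The struct names wide_cmsghdr and narrow_cmsghdr and the helper level_offset_skew are hypothetical stand-ins for illustration, not names from the qemu or FreeBSD sources:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for target_cmsghdr as built for a 64-bit target,
 * under the assumption that abi_long is 8 bytes wide there. */
struct wide_cmsghdr {
    int64_t cmsg_len;
    int32_t cmsg_level;
    int32_t cmsg_type;
};

/* Hypothetical stand-in for the native cmsghdr: socklen_t is a
 * 32-bit unsigned type per the typedefs listed above. */
struct narrow_cmsghdr {
    uint32_t cmsg_len;
    int32_t  cmsg_level;
    int32_t  cmsg_type;
};

/* The two layouts disagree in total size. */
static_assert(sizeof(struct wide_cmsghdr) == 16, "wide header is 16 bytes");
static_assert(sizeof(struct narrow_cmsghdr) == 12, "narrow header is 12 bytes");

/* Byte distance between where the two layouts place cmsg_level. */
size_t level_offset_skew(void)
{
    return offsetof(struct wide_cmsghdr, cmsg_level)
         - offsetof(struct narrow_cmsghdr, cmsg_level);
}
```

With these layouts, level_offset_skew() gives the number of bytes by which a reader using the wide layout overshoots the real cmsg_level field.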
So code such as in t2h_freebsd_cmsg:
    cmsg->cmsg_level = tswap32(target_cmsg->cmsg_level);
    cmsg->cmsg_type = tswap32(target_cmsg->cmsg_type);
is not using the correct target offsets when aarch64 is the target
that it is extracting from (for example).
For comparison on a 64-bit architecture:
(gdb) p/d sizeof(struct cmsghdr)
$1 = 12
(gdb) p/d &((struct cmsghdr *)0)->cmsg_level
$2 = 4
(gdb) p/d &((struct cmsghdr *)0)->cmsg_type
$3 = 8
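The effect of the offset mismatch can be sketched as follows. This is only an illustration under stated assumptions: that the guest passed an SCM_RIGHTS control message, and that the 4 bytes after the 12-byte native header are zero padding. The names narrow_hdr, wide_hdr, misread_header, EX_SOL_SOCKET and EX_SCM_RIGHTS are hypothetical; the two constants mirror FreeBSD's SOL_SOCKET (0xffff) and SCM_RIGHTS (0x01) values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_SOL_SOCKET 0xffff  /* hypothetical name; FreeBSD SOL_SOCKET value */
#define EX_SCM_RIGHTS 0x01    /* hypothetical name; FreeBSD SCM_RIGHTS value */

/* Native layout: 4-byte cmsg_len, fields at offsets 0, 4, 8. */
struct narrow_hdr { uint32_t cmsg_len; int32_t cmsg_level; int32_t cmsg_type; };
/* Layout with an 8-byte cmsg_len: fields at offsets 0, 8, 12. */
struct wide_hdr   { int64_t  cmsg_len; int32_t cmsg_level; int32_t cmsg_type; };

static_assert(sizeof(struct narrow_hdr) == 12, "native header is 12 bytes");
static_assert(sizeof(struct wide_hdr) == 16, "wide header is 16 bytes");

/* Write a native-layout header into a zeroed buffer, then decode the
 * same bytes through the wide layout and report what it sees. */
void misread_header(int32_t *level_out, int32_t *type_out)
{
    unsigned char buf[16] = {0};   /* bytes 12..15 stay zero (assumed padding) */
    struct narrow_hdr n = {
        (uint32_t)(16 + sizeof(int)), EX_SOL_SOCKET, EX_SCM_RIGHTS
    };
    struct wide_hdr w;

    memcpy(buf, &n, sizeof(n));
    memcpy(&w, buf, sizeof(w));

    *level_out = w.cmsg_level;     /* actually the native cmsg_type */
    *type_out  = w.cmsg_type;      /* actually the bytes after the header */
}
```

Under those assumptions the wide reader reports level 1 and type 0, which would be consistent with the "1/0" in the logged message, though the trace above does not by itself confirm which control message the guest sent.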
I do not yet have a tested change.
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
