Date: Sun, 14 Apr 1996 22:09:32 -0400
From: "Louis A. Mamakos" <louie@TransSys.COM>
To: hackers@freebsd.org
Subject: new socket option for timestamps, plus a "bug" fix
Message-ID: <199604150209.WAA06890@whizzo.transsys.com>
Based on the discussion on the mailing list last week regarding SIGIO
and why you'd use it for different applications, I decided to
reimplement the code I had put into 4.3BSD-tahoe some years ago to add
a timestamp socket option. That is, on a UDP socket, you can enable a
timestamp to be associated with a message as it's queued to the socket
buffer.
The code to do this was rather simple. Here are the diffs against a
recent FreeBSD-current kernel. The patch records a timestamp obtained
from microtime(), which the user can retrieve as control information
with the recvmsg() system call. The returned data in the control info
buffer is a struct cmsghdr followed by a struct timeval.
Index: sys/sys/socket.h
===================================================================
RCS file: /usr/local/FreeBSD/cvs/src/sys/sys/socket.h,v
retrieving revision 1.10
diff -u -r1.10 socket.h
--- socket.h 1996/02/07 16:19:02 1.10
+++ socket.h 1996/04/09 03:06:21
@@ -63,6 +63,7 @@
#define SO_LINGER 0x0080 /* linger on close if data present */
#define SO_OOBINLINE 0x0100 /* leave received OOB data in line */
#define SO_REUSEPORT 0x0200 /* allow local address & port reuse */
+#define SO_TIMESTAMP 0x0400 /* timestamp received dgram traffic */
/*
* Additional options, not kept in so_options.
@@ -296,6 +297,7 @@
/* "Socket"-level control message types: */
#define SCM_RIGHTS 0x01 /* access rights (array of int) */
+#define SCM_TIMESTAMP 0x02 /* timestamp (struct timeval) */
/*
* 4.3 compat sockaddr, move to compat file later
Index: sys/netinet/udp_usrreq.c
===================================================================
RCS file: /usr/local/FreeBSD/cvs/src/sys/netinet/udp_usrreq.c,v
retrieving revision 1.21
diff -u -r1.21 udp_usrreq.c
--- udp_usrreq.c 1996/04/04 10:46:44 1.21
+++ udp_usrreq.c 1996/04/09 04:13:28
@@ -95,6 +95,9 @@
struct mbuf *));
static void udp_notify __P((struct inpcb *, int));
static struct mbuf *udp_saveopt __P((caddr_t, int, int));
+#if defined(SO_TIMESTAMP) && defined(SCM_TIMESTAMP)
+static struct mbuf *udp_timestamp __P((void));
+#endif
void
udp_init()
@@ -300,9 +303,20 @@
*/
udp_in.sin_port = uh->uh_sport;
udp_in.sin_addr = ip->ip_src;
- if (inp->inp_flags & INP_CONTROLOPTS) {
+ if (inp->inp_flags & INP_CONTROLOPTS
+#if defined(SO_TIMESTAMP) && defined(SCM_TIMESTAMP)
+ || inp->inp_socket->so_options & SO_TIMESTAMP
+#endif
+ ) {
struct mbuf **mp = &opts;
+#if defined(SO_TIMESTAMP) && defined(SCM_TIMESTAMP)
+ if (inp->inp_socket->so_options & SO_TIMESTAMP) {
+ if (*mp = udp_timestamp())
+ mp = &(*mp)->m_next;
+ }
+#endif
+
if (inp->inp_flags & INP_RECVDSTADDR) {
*mp = udp_saveopt((caddr_t) &ip->ip_dst,
sizeof(struct in_addr), IP_RECVDSTADDR);
@@ -367,6 +381,29 @@
cp->cmsg_type = type;
return (m);
}
+
+#if defined(SO_TIMESTAMP) && defined(SCM_TIMESTAMP)
+static struct mbuf *
+udp_timestamp()
+{
+ register struct cmsghdr *cp;
+ struct mbuf *m;
+ struct timeval tv;
+
+ MGET(m, M_DONTWAIT, MT_CONTROL);
+ if (m == 0)
+ return (struct mbuf *) 0;
+
+ microtime(&tv);
+ cp = (struct cmsghdr *) mtod(m, struct cmsghdr *);
+ cp->cmsg_len =
+ m->m_len = sizeof(*cp) + sizeof(struct timeval);
+ cp->cmsg_level = SOL_SOCKET;
+ cp->cmsg_type = SCM_TIMESTAMP;
+ (void) memcpy(CMSG_DATA(cp), &tv, sizeof(struct timeval));
+ return (m);
+}
+#endif /* defined(SO_TIMESTAMP) && defined(SCM_TIMESTAMP) */
/*
* Notify a udp user of an asynchronous error;
Just One Ugly Thing: It's really distasteful to put this socket option
implementation into netinet/udp_usrreq.c; it really is a socket-level
option and not specific only to UDP sockets. Logically, it belongs in
kern/uipc_socket2.c which is where the data actually gets queued to
the sockbuf. The problem though, is that the functions there all get
called with a "struct sockbuf *", and the function isn't able to find
the enclosing socket structure to look at the socket options which are
enabled. This feels, somehow, like a layering/API kinda problem, but
it's much less clear how to fix this, if you do at all. There's no
reason why this shouldn't also work on AF_UNIX, er, AF_LOCAL flavored
sockets without having to reimplement the code there.
What I noticed in poking around the code is that it seems to only be
possible to return a single element of control information using the
recvmsg() system call. It seems to have been intended to accumulate a
number of distinct entities; for example, look at netinet/udp_usrreq.c
where multiple mbufs can be queued up depending on which socket
options are turned on. Some of this code is #ifdef'ed out at the
moment. If you look at the code in sys/kern/uipc_socket2.c (in the
sbappendaddr() function, for example), you'll see that you can queue
more than one mbuf of control information. And in
sys/kern/uipc_socket.c, multiple mbufs of control information
are almost lovingly extracted from the sockbuf in soreceive() to be
returned to the caller.
However, in sys/kern/uipc_syscalls.c, only the first of these was ever
actually extracted and returned to the user; the rest just get
released. This didn't seem to be "correct", so I whacked on that code
to support returning multiple mbufs of control information into the
user's buffer (since they are self-describing in length, etc.).
Index: sys/kern/uipc_syscalls.c
===================================================================
RCS file: /usr/local/FreeBSD/cvs/src/sys/kern/uipc_syscalls.c,v
retrieving revision 1.16
diff -u -r1.16 uipc_syscalls.c
--- uipc_syscalls.c 1996/03/11 15:37:33 1.16
+++ uipc_syscalls.c 1996/04/10 05:29:08
@@ -636,7 +636,8 @@
register struct iovec *iov;
register int i;
int len, error;
- struct mbuf *from = 0, *control = 0;
+ struct mbuf *m, *from = 0, *control = 0;
+ caddr_t ctlbuf;
#ifdef KTRACE
struct iovec *ktriov = NULL;
#endif
@@ -735,17 +736,29 @@
}
#endif
len = mp->msg_controllen;
- if (len <= 0 || control == 0)
- len = 0;
- else {
- if (len >= control->m_len)
- len = control->m_len;
- else
+ m = control;
+ mp->msg_controllen = 0;
+ ctlbuf = (caddr_t) mp->msg_control;
+
+ while (m && len > 0) {
+ unsigned int tocopy;
+
+ if (len >= m->m_len)
+ tocopy = m->m_len;
+ else {
mp->msg_flags |= MSG_CTRUNC;
- error = copyout((caddr_t)mtod(control, caddr_t),
- (caddr_t)mp->msg_control, (unsigned)len);
+ tocopy = len;
+ }
+
+ if (error = copyout((caddr_t)mtod(m, caddr_t),
+ ctlbuf, tocopy))
+ goto out;
+
+ ctlbuf += tocopy;
+ len -= tocopy;
+ m = m->m_next;
}
- mp->msg_controllen = len;
+ mp->msg_controllen = ctlbuf - mp->msg_control;
}
out:
if (from)
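The copy loop in the patch above can be modeled in user space roughly
as follows. This is only a sketch: struct chunk is a hypothetical
stand-in for an mbuf, memcpy() stands in for copyout(), and
copy_chain is a name invented for this example. It mirrors the
patch's behavior, including the fact that the truncation flag is set
only when a chunk is cut short; chunks that are never reached because
the buffer filled up exactly are dropped without setting it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one mbuf in a chain of control data. */
struct chunk {
	const char *data;
	size_t len;
	struct chunk *next;
};

/*
 * Copy as much of the chain starting at m as fits into dst, which has
 * room for len bytes.  Returns the number of bytes copied, and sets
 * *truncated to 1 if a chunk had to be cut short (the analogue of
 * MSG_CTRUNC in the patch), 0 otherwise.
 */
size_t
copy_chain(const struct chunk *m, char *dst, size_t len, int *truncated)
{
	char *p = dst;

	assert(dst != NULL && truncated != NULL);
	*truncated = 0;
	while (m != NULL && len > 0) {
		size_t tocopy;

		if (len >= m->len)
			tocopy = m->len;
		else {
			*truncated = 1;
			tocopy = len;
		}
		memcpy(p, m->data, tocopy);

		p += tocopy;
		len -= tocopy;
		m = m->next;
	}
	/* Equivalent of the new msg_controllen computation. */
	return (size_t)(p - dst);
}
```

With three 4-byte chunks, a 12-byte buffer receives everything, a
10-byte buffer receives 10 bytes with the truncation flag set, and an
8-byte buffer receives the first two chunks with the flag clear.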
Anyway, this code has been running on my box for a few days, and seems
to be pretty happy. I've modified the ntptrace program in a recent
version of the xntp 3.5c distribution as a test case, and intend to
modify xntpd in that version to use this code. So far, so good.
While we can debate the merits of the timestamp socket option, I think
there is a genuine bug (or at least misimplementation) in not being
able to return more than one chunk of control message info with the
recvmsg system call.
Is there any interest in importing this into the source tree? If so,
I'll be happy to pass along further changes to things like xntpd which
take advantage of this code. Actually, I'd like to see a more recent
version of xntp imported as well, but that's another story.
Louis Mamakos
