From owner-freebsd-net@FreeBSD.ORG Wed Oct 10 12:26:25 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AE1AABB6; Wed, 10 Oct 2012 12:26:25 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238]) by mx1.freebsd.org (Postfix) with ESMTP id 44AD38FC0A; Wed, 10 Oct 2012 12:26:25 +0000 (UTC) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id B18357300A; Wed, 10 Oct 2012 14:46:37 +0200 (CEST) Date: Wed, 10 Oct 2012 14:46:37 +0200 From: Luigi Rizzo To: Gleb Smirnoff Subject: Re: [CFT/Review] net byte order for AF_INET Message-ID: <20121010124637.GA15522@onelab2.iet.unipi.it> References: <20121009154128.GU34622@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20121009154128.GU34622@FreeBSD.org> User-Agent: Mutt/1.4.2.3i Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Oct 2012 12:26:25 -0000 On Tue, Oct 09, 2012 at 07:41:28PM +0400, Gleb Smirnoff wrote: > Hello, > > this is a patch that switches entire IPv4 stack to network > byte order. That means, that at any layer any module should > expect IP header in network byte order. Any host byte order > values can be stored in local variables only and are never stored > into a packet itself. ... > Code reviewing also appreciated. > > -- > Totus tuus, Glebius. I am really grataful you are doing this. A few comments: + as a strategy, i would probably suggest (something you mostly seem to do already) that arithmetic comparisons (even if just for equality) always use the HOST format, so that at some point one needs to change == to > or < this does not cause subtle bugs. This includes comparison with 0 (there is 1-2 instances where we do x == 0 when x is in NET format. This might confuse the reader. I would almost suggest the same for | and & though the chances that one changes | onto something that is endian-sensitive are null. + i hope this change will be merged to STABLE as soon as we are confident that it is complete so we will have an easier time maintaining the code across branches. Some comments inline (first one is probably a bug, the others are just suggestions) > Index: sys/netinet/raw_ip.c > =================================================================== > --- sys/netinet/raw_ip.c (revision 241370) > +++ sys/netinet/raw_ip.c (working copy) > @@ -292,6 +292,7 @@ > * not modify the packet except for some > * byte order swaps. > */ > + ip->ip_len = ntohs(ip->ip_len); > ip->ip_len += off; > > hash = INP_PCBHASH_RAW(proto, ip->ip_src.s_addr, this seems wrong, perhaps you want - ip->ip_len += off; + ip->ip_len = htons(ntohs(ip->ip_len) + off); > Index: sys/netinet/ip_input.c > =================================================================== > --- sys/netinet/ip_input.c (revision 241370) > +++ sys/netinet/ip_input.c (working copy) > @@ -914,22 +904,21 @@ > * Adjust ip_len to not reflect header, > * convert offset of this to bytes. > */ > - ip->ip_len -= hlen; > - if (ip->ip_off & IP_MF) { > + ip->ip_len = htons(ntohs(ip->ip_len) - hlen); > + if (ip->ip_off & htons(IP_MF)) { > /* > * Make sure that fragments have a data length > * that's a non-zero multiple of 8 bytes. > */ > - if (ip->ip_len == 0 || (ip->ip_len & 0x7) != 0) { > + if (ip->ip_len == 0 || (ip->ip_len & htons(0x7)) != 0) { > IPSTAT_INC(ips_toosmall); /* XXX */ > goto dropfrag; > } > m->m_flags |= M_FRAG; > } else > m->m_flags &= ~M_FRAG; > - ip->ip_off <<= 3; > + ip->ip_off = htons(ntohs(ip->ip_off) << 3); > > - > /* > * Attempt reassembly; if it succeeds, proceed. > * ip_reass() will return a different mbuf. + above i would rather use a temp variable ip_len = ntohs(ip->ip_len) and retain the existing expression in native format. + in the chunk below, maybe it makes sense for readability to use a couple of temp variables for ip->ip_off and ip->ip_len and adjust the values in *ip only at the end ? > @@ -1013,14 +1002,15 @@ > * segment, then it's checksum is invalidated. > */ > if (p) { > - i = GETIP(p)->ip_off + GETIP(p)->ip_len - ip->ip_off; > + i = ntohs(GETIP(p)->ip_off) + ntohs(GETIP(p)->ip_len) - > + ntohs(ip->ip_off); > if (i > 0) { > - if (i >= ip->ip_len) > + if (i >= ntohs(ip->ip_len)) > goto dropfrag; > m_adj(m, i); > m->m_pkthdr.csum_flags = 0; > - ip->ip_off += i; > - ip->ip_len -= i; > + ip->ip_off = htons(ntohs(ip->ip_off) + i); > + ip->ip_len = htons(ntohs(ip->ip_len) - i); > } > m->m_nextpkt = p->m_nextpkt; > p->m_nextpkt = m; > @@ -1033,12 +1023,13 @@ > * While we overlap succeeding segments trim them or, > * if they are completely covered, dequeue them. > */ > - for (; q != NULL && ip->ip_off + ip->ip_len > GETIP(q)->ip_off; > - q = nq) { > - i = (ip->ip_off + ip->ip_len) - GETIP(q)->ip_off; > - if (i < GETIP(q)->ip_len) { > - GETIP(q)->ip_len -= i; > - GETIP(q)->ip_off += i; > + for (; q != NULL && ntohs(ip->ip_off) + ntohs(ip->ip_len) > > + ntohs(GETIP(q)->ip_off); q = nq) { > + i = (ntohs(ip->ip_off) + ntohs(ip->ip_len)) - > + ntohs(GETIP(q)->ip_off); > + if (i < ntohs(GETIP(q)->ip_len)) { > + GETIP(q)->ip_len = htons(ntohs(GETIP(q)->ip_len) - i); > + GETIP(q)->ip_off = htons(ntohs(GETIP(q)->ip_off) + i); > m_adj(q, i); > q->m_pkthdr.csum_flags = 0; > break; > @@ -1062,14 +1053,14 @@ > */ > next = 0; > for (p = NULL, q = fp->ipq_frags; q; p = q, q = q->m_nextpkt) { > - if (GETIP(q)->ip_off != next) { > + if (ntohs(GETIP(q)->ip_off) != next) { > if (fp->ipq_nfrags > V_maxfragsperpacket) { > IPSTAT_ADD(ips_fragdropped, fp->ipq_nfrags); > ip_freef(head, fp); > } > goto done; > } > - next += GETIP(q)->ip_len; > + next += ntohs(GETIP(q)->ip_len); > } > /* Make sure the last packet didn't have the IP_MF flag */ > if (p->m_flags & M_FRAG) { thans again luigi