Date: Tue, 7 Feb 2006 13:21:55 +1100 (EST) From: Bruce Evans <bde@zeta.org.au> To: Nate Lawson <nate@root.org> Cc: Alan Cox <alc@cs.rice.edu>, src-committers@freebsd.org, Oleg Bulyzhin <oleg@freebsd.org>, cvs-all@freebsd.org, cvs-src@freebsd.org Subject: Re: cvs commit: src/sys/dev/bge if_bge.c Message-ID: <20060207102727.D21867@delplex.bde.org> In-Reply-To: <43E7D662.6000608@root.org> References: <200602020958.k129wWtc066930@repoman.freebsd.org> <20060202100637.GB24350@lath.rinet.ru> <20060205235817.GQ5499@cs.rice.edu> <20060206221141.GA57775@lath.rinet.ru> <43E7CBD5.5090203@root.org> <20060206223436.GB57775@lath.rinet.ru> <43E7D662.6000608@root.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 6 Feb 2006, Nate Lawson wrote: > Oleg Bulyzhin wrote: >> On Mon, Feb 06, 2006 at 02:21:09PM -0800, Nate Lawson wrote: >>> Oleg Bulyzhin wrote: Other style bugs in the (not quoted) declaration of `sum' are the missing blank line after the declarations and the type of the variable. I think these are only style bugs, except on unsupported machines with 16 bit ints or maybe ones with 1's complement or sign-magnitude ints. Nested declarations don't cost at runtime, at least with gcc, since they are optimized to a fault by allocating stack space for all local variables on entry to the function, so you can't control their layout or stack size much even if you want to. (Code like like { char p[LARGE]; memset(p, 'A', sizeof(p); } { char p[LARGE]; memset(p, 'A', sizeof(p); } takes 2*LARGE bytes of stack space even though gdb doesn't support access to dead variables, so u can't easily see the contents of dead parts of the stack.) >>>> nq = q->m_nextpkt; >>>> q->m_nextpkt = NULL; >>>> m->m_pkthdr.csum_flags &= q->m_pkthdr.csum_flags; >>>> - m->m_pkthdr.csum_data += q->m_pkthdr.csum_data; >>>> + sum = m->m_pkthdr.csum_data + q->m_pkthdr.csum_data; >>>> + m->m_pkthdr.csum_data = (sum & 0xffff) + (sum >> 16); >>>> m_cat(m, q); >>>> } >>>> #ifdef MAC >>> >>> I'm not familiar with this code. So m->m_pkthdr.csum_data is 32 bits? Me too :-). csum_data is plain int, so it is not very suitable for checksums, but it has at least 16 bits so it is large enough to hold the results of checksum calculations (perhaps even in the 1's complement case with 16-bit ints since the relevant checksums are really 1's complement) although not suitable for doing them. >>> Couldn't the same thing be achieved with making it 16 bits since the add >>> will wrap normally? It is apparently used for non-checksums so it might need to be larger than 16 bits. Most of the other uses are to hold the result of offsetof() where its type is bogus in a different way (int is often much smaller than ptrdiff_t, but is large enough in practice since most offsets in objects are not very large; uint16_t would probably be large enough too). >> It will not work cause it's not just a trivial sum it's so called >> "1's complement sum" (refer rfc1071 for details). > > You're right, but aren't you missing the NOT step? > 1. 2's complement sum > 2. Add in carry > 3. 1's complement of result (not) The above is (more clearly) missing part of step 2. There may be another carry in "(sum & 0xffff) + (sum >> 16)". Something like this expression repeated to handle this carry. Call the two add-in-carry steps 2a and 2b. I think in_psdeudo() is supposed to be used to do (1) and (2), and (3) is supposed to be done inline. This method is used in tcp_input(): % th->th_sum = in_pseudo(ip->ip_src.s_addr, % ip->ip_dst.s_addr, % htonl(m->m_pkthdr.csum_data + % ip->ip_len + % IPPROTO_TCP)); % th->th_sum ^= 0xffff; Similarly in udp_input(). All other uses of csum_data seem to be for non-checksums. The above depends on ints having 17 or 18 usable bits so that the sum in htonl()'s arg doesn't overflow. csum_data should be <= 0xffff, and ip_len is u_short so it is <= 0xffff, and IPPROTO_TCP is small, so the sum can't be much larger than 0x1ffff, and it can't be larger than 0x2ffff. in_pseudo() handles both carry steps, but is fairly bogus. Here is the i386 version: % static __inline u_short % in_pseudo(u_int sum, u_int b, u_int c) % { % /* __volatile is necessary because the condition codes are used. */ % __asm __volatile ("addl %1, %0" : "+r" (sum) : "g" (b)); % __asm __volatile ("adcl %1, %0" : "+r" (sum) : "g" (c)); % __asm __volatile ("adcl $0, %0" : "+r" (sum)); The asms here are mostly bogus, if not broken. adcl is only needed if there is carry out of the 32nd bit in a 32-bit u_int, but callers like tcp_input() only pass values that almost fit in a 16-bit u_int. With 32-bit ints there are plenty of bits to spare for the possibly slightly larger values psssed by tcp_input(), so carry rarely occurs in the above and the above can often be reduced to "sum += b + c;". Or just expect the caller to add up small terms and only handle carries here. % % sum = (sum & 0xffff) + (sum >> 16); This is step 2a. % if (sum > 0xffff) % sum -= 0xffff; This is step 2b. Before step 2a, carry bits above the 16th may be represented in all higher bits of `sum' (but probably don't go past the 19th). After step 2a, there is at most 1 carry bit represented in a higher bit (the 17th), and sum <= 0x1fffe. I wonder why the above doesn't repeat the code for step 2a to implement step 2b (a third carry can't occur since 1 + 0xfffe <= 0xffff). This gives branch-free code, and branches were expensive when the original version of the above was written before FreeBSD-1 (in_cksum.h is new but copies old code). I think branch target caches now make the branching code better now, since the double-carry case rarely occurs. % return (sum); Complementation is intentionally left to callers by this interface. % } Modifying the patch to steps 2b and 3 gives: sum = m->m_pkthdr.csum_data + q->m_pkthdr.csum_data; /* (1) */ m->m_pkthdr.csum_data = (sum & 0xffff) + (sum >> 16); /* (2) */ sum = m->m_pkthdr.csum_data; /* for clarity */ m->m_pkthdr.csum_data = (sum & 0xffff) + (sum >> 16); /* (2b) */ m->m_pkthdr.csum_data ^= 0xffff; /* (3) */ Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060207102727.D21867>