From owner-freebsd-atm Tue Nov 11 19:06:14 1997
Return-Path:
Received: (from root@localhost)
    by hub.freebsd.org (8.8.7/8.8.7) id TAA22949
    for atm-outgoing; Tue, 11 Nov 1997 19:06:14 -0800 (PST)
    (envelope-from owner-freebsd-atm)
Received: from eden.dei.uc.pt (eden.dei.uc.pt [193.136.212.3])
    by hub.freebsd.org (8.8.7/8.8.7) with SMTP id TAA22941
    for ; Tue, 11 Nov 1997 19:06:06 -0800 (PST)
    (envelope-from aalves@dei.uc.pt)
Received: from zorg.dei.uc.pt
    by eden.dei.uc.pt (5.65v3.2/1.1.10.5/28Jun97-0144PM)
    id AA16290; Wed, 12 Nov 1997 03:12:17 GMT
Received: from localhost (aalves@localhost)
    by zorg.dei.uc.pt (8.8.5/8.8.5) with SMTP id DAA20785
    for ; Wed, 12 Nov 1997 03:06:46 GMT
Date: Wed, 12 Nov 1997 03:06:46 +0000 (WET)
From: Antonio Luis Alves
Reply-To: Antonio Luis Alves
To: freebsd-atm@FreeBSD.ORG
Subject: ATM driver & udp stream
Message-Id:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-atm@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

I have been working on a port to FreeBSD of a major manufacturer's atm
implementation, which includes classical ip and lan emulation. The most
difficult part was developing the low level code to drive the pci
adapters, as this was my first project working inside the kernel and also
on FreeBSD. At this time most of the work has been done, and I have only
the lan emu code and ILMI left to finish the port.

We have been running the port for several months on Pentium machines with
FreeBSD 2.2.2 Release without any problems, until two weeks ago, when I
started to do some performance tests. One of the tests (a UDP stream) was
crashing the driver, always at the same place in the pdu receive routine.

The driver allocates several buffers at init time to be used by the card
to write the received pdus. The supply routine allocates an mbuf for each
buffer and supplies the card with the dma address of the buffer and
the mbuf handle. When the card interrupts, the pdu receive routine reads
the receive pdu queue.
Each entry in this queue is a receive pdu descriptor. Each descriptor
contains several pieces of information related to the received pdu,
including the segments that form the pdu. Each segment contains the mbuf
handle (allocated by the supply routine) and the number of bytes in the
buffer written by the card. At this time I use 4k buffers and pdu
descriptors with a maximum of 16 segments, to be able to receive a maximum
pdu of 64k. The mbuf handles are read by the card from the supply queue
and written to the segment descriptors after the buffers have been filled
with cell payloads.

The mbuf chain forming the received pdu is built and the pdu is sent
upstairs. Each mbuf in the chain has its ext_buf pointing to the buffer,
and pointers to the external_free_routine and an external_ref_function.
The buffers return to the driver's free buffer pool as soon as the
external_free_routine is called and the reference count allows it.
These buffers are then used again by the supply routine.

When the card is low on buffers, the pdu receive routine does not send the
external buffers upstairs; instead it copies the data to mbufs with
clusters, so the buffers are made available to the card again faster.

All was working well until I ran a UDP stream test. The sender was a
Pentium 233mmx and the receiver a Pentium 166. The test is a normal
netperf UDP_STREAM with default values for the datagram, which is the max
udp datagram, 9216. The driver crashes in the pdu receive routine when
accessing the segments which form the pdu. With the debugger I can see
that the mbuf handles are correct but the data in the mbuf is not.
Sometimes the m_type is MT_FREE and m->next has a valid pointer in it, and
other times all the members of the mbuf are zero. In a normal situation
the mbufs would be of type MT_DATA, and correctly initialized with the
pointers to the external free and reference functions, m_len, ext_size and
ext_buffer.
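
As a rough illustration of the receive path described above (a descriptor
holding up to 16 segments, each carrying an mbuf handle and a byte count
for a 4k buffer), here is a userland sketch in C. The names fake_mbuf,
rx_segment, rx_pdu_desc and rx_build_chain are hypothetical and do not
come from the driver in question; the real kernel struct mbuf has many
more fields, and the real descriptor layout is defined by the card's
firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE  4096u                        /* "4k buffers" */
#define RX_MAX_SEGS  16u                          /* "max 16 segments" */
#define RX_MAX_PDU   (RX_BUF_SIZE * RX_MAX_SEGS)  /* 65536, i.e. 64k */

/* Stand-in for the kernel mbuf; only the fields this sketch needs. */
struct fake_mbuf {
    struct fake_mbuf *m_next;
    uint32_t          m_len;
};

/* One segment: the handle the supply routine gave the card, plus the
 * number of bytes the card wrote into the associated buffer. */
struct rx_segment {
    struct fake_mbuf *handle;
    uint32_t          nbytes;
};

/* Hypothetical receive pdu descriptor. */
struct rx_pdu_desc {
    uint32_t          nsegs;
    struct rx_segment seg[RX_MAX_SEGS];
};

/*
 * Link the segments of one descriptor into an mbuf chain.
 * Returns the head of the chain and stores the pdu length in *total,
 * or returns NULL if the descriptor looks malformed.
 */
static struct fake_mbuf *
rx_build_chain(const struct rx_pdu_desc *d, uint32_t *total)
{
    struct fake_mbuf *head = NULL, *tail = NULL;
    uint32_t i, len = 0;

    if (d->nsegs == 0 || d->nsegs > RX_MAX_SEGS)
        return NULL;
    for (i = 0; i < d->nsegs; i++) {
        struct fake_mbuf *m = d->seg[i].handle;

        if (m == NULL || d->seg[i].nbytes > RX_BUF_SIZE)
            return NULL;
        m->m_len = d->seg[i].nbytes;
        m->m_next = NULL;
        if (tail == NULL)
            head = m;
        else
            tail->m_next = m;
        tail = m;
        len += d->seg[i].nbytes;
    }
    *total = len;
    return head;
}
```

A 9216-byte pdu would arrive in this model as three segments (4096 + 4096
+ 1024 bytes). The sketch only checks the descriptor itself; it says
nothing about whether the mbuf behind each handle is still valid, which is
the failure described in this post.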
According to the firmware guide, the card never touches the mbuf; it just
passes the mbuf handle received from the supply routine to the segments of
the received pdu descriptor.

This problem only happens when there is fragmentation and reassembly of
packets. If I run the test with the udp datagram below the mtu of the
interface (9188) it goes well without any problems. Also, if I pace the
transfer with delays and small bursts, it works for the max datagram. I
also ran tests with a Sun as the sending machine (which is very slow
compared with the Pentiums we have here) and it works ok even with
the max datagram of 9216.

So I suppose there is something related to the full blast of the Pentium
233mmx and the reassembly of the packets at the ip layer. I also think
that at this speed (155 Mbps cards) most of the fragments (mbufs w/
external buffers) are kept on the ip fragment queue, and then, as soon as
the card gets low on buffers, the "copy to mbufs w/ clusters" routine
enters the game and starts asking the system for a lot of mbufs and
clusters.

At this time I am running out of ideas of where to look to find the
problem, so I decided to post here and ask for some ideas.
I have already made lots of modifications to the driver, with lots of
paranoid checks, but without any success.

Sorry for the long post, but I tried to explain as much as possible for a
better evaluation of this problem.

Thank you.
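
To make the fragmentation numbers above concrete, here is a small C
sketch of the arithmetic. It assumes that 9216 is the UDP payload size,
that the IP header carries no options (20 bytes), and that the UDP header
is 8 bytes; the function names are made up for this example.

```c
#include <assert.h>

#define IP_HDR_LEN   20u   /* assumes no IP options */
#define UDP_HDR_LEN   8u

/* Number of IP fragments needed to carry a UDP payload over a given MTU. */
static unsigned
udp_frag_count(unsigned udp_payload, unsigned mtu)
{
    unsigned ip_payload = udp_payload + UDP_HDR_LEN;
    unsigned per_frag;

    if (ip_payload + IP_HDR_LEN <= mtu)
        return 1;                       /* fits in a single packet */
    /* Every fragment except the last carries a multiple of 8 bytes. */
    per_frag = ((mtu - IP_HDR_LEN) / 8u) * 8u;
    return (ip_payload + per_frag - 1u) / per_frag;
}

/* Bytes of IP payload carried by the last fragment. */
static unsigned
udp_last_frag_len(unsigned udp_payload, unsigned mtu)
{
    unsigned ip_payload = udp_payload + UDP_HDR_LEN;
    unsigned per_frag = ((mtu - IP_HDR_LEN) / 8u) * 8u;
    unsigned rem;

    if (ip_payload + IP_HDR_LEN <= mtu)
        return ip_payload;
    rem = ip_payload % per_frag;
    return rem ? rem : per_frag;
}
```

Under those assumptions, udp_frag_count(9216, 9188) is 2 and the second
fragment carries only 56 bytes of IP payload, while any payload of 9160
bytes or less fits in one packet. So every max-size datagram in the test
would leave a nearly full-size first fragment waiting on the reassembly
queue for a tiny second one.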
Antonio Alves

From owner-freebsd-atm Tue Nov 11 20:46:13 1997
Return-Path:
Received: (from root@localhost)
    by hub.freebsd.org (8.8.7/8.8.7) id UAA01879
    for atm-outgoing; Tue, 11 Nov 1997 20:46:13 -0800 (PST)
    (envelope-from owner-freebsd-atm)
Received: from inetfw.sonycsl.co.jp (inetfw.sonycsl.co.jp [203.137.129.4])
    by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id UAA01861
    for ; Tue, 11 Nov 1997 20:46:05 -0800 (PST)
    (envelope-from kjc@csl.sony.co.jp)
Received: from hotaka.csl.sony.co.jp (hotaka.csl.sony.co.jp [43.27.98.57])
    by inetfw.sonycsl.co.jp (8.8.5/3.5W) with ESMTP id NAA24392;
    Wed, 12 Nov 1997 13:45:54 +0900 (JST)
Received: from localhost (localhost [127.0.0.1])
    by hotaka.csl.sony.co.jp (8.8.4/3.3W3) with ESMTP id NAA24120;
    Wed, 12 Nov 1997 13:45:39 +0900 (JST)
Message-Id: <199711120445.NAA24120@hotaka.csl.sony.co.jp>
To: Antonio Luis Alves
cc: freebsd-atm@FreeBSD.ORG
Subject: Re: ATM driver & udp stream
In-reply-to: Your message of "Wed, 12 Nov 1997 03:06:46 GMT."
Date: Wed, 12 Nov 1997 13:45:38 +0900
From: Kenjiro Cho
Sender: owner-freebsd-atm@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

>> All was working well until I made a UDP stream test. The sender was a
>> Pentium 233mmx and on the receive side a Pentium 166. The test is a normal
>> netperf UDP_STREAM with default values for the datagram, which is the max
>> udp datagram 9216. The driver crashes at the pdu receive routine when
>> accessing the segments which form the pdu. With the debugger I can see that
>> the mbuf handles are correct but the data on the mbuf is not. Some times
>> the m_type is MT_FREE and m->next as got a valid pointer on it, and other
>> times all the members of the mbuf are zero. On a normal situation the
>> mbufs would be of type MT_DATA, and correctly initialized with the
>> pointers to the external free and reference function, m_len, ext_size and
>> ext_buffer.
>> From the firmware guide it says the card never touches the mbuf, it just
>> pass the mbuf handle received from the supply routine to the segments on
>> the received pdu descriptor.

It sounds familiar to me..., so you might want to take a look at the
following possibility.

I had a similar experience with Chuck Cranor's ATM driver a long time
ago. It turns out that, under heavy load, a circular list holding
pointers to mbufs gets completely full and wraps around. As a
result, a wrong mbuf that is still in use gets freed. The system crashes
after a while, when it refers to the misplaced mbuf.

IMO, it's a very bad idea to put a free list chain at the top of an
mbuf cluster data area when an mbuf is on the free list. It is very
hard to debug if something goes wrong.

--kj

From owner-freebsd-atm Wed Nov 12 08:22:05 1997
Return-Path:
Received: (from root@localhost)
    by hub.freebsd.org (8.8.7/8.8.7) id IAA14731
    for atm-outgoing; Wed, 12 Nov 1997 08:22:05 -0800 (PST)
    (envelope-from owner-freebsd-atm)
Received: from plains.NoDak.edu (tinguely@plains.NoDak.edu [134.129.111.64])
    by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id IAA14725
    for ; Wed, 12 Nov 1997 08:21:59 -0800 (PST)
    (envelope-from tinguely@plains.NoDak.edu)
Received: (from tinguely@localhost)
    by plains.NoDak.edu (8.8.8/8.8.8) id KAA29115;
    Wed, 12 Nov 1997 10:21:19 -0600 (CST)
Date: Wed, 12 Nov 1997 10:21:19 -0600 (CST)
From: Mark Tinguely
Message-Id: <199711121621.KAA29115@plains.NoDak.edu>
To: aalves@dei.uc.pt
Subject: Re: ATM driver & udp stream
Cc: freebsd-atm@FreeBSD.ORG
Sender: owner-freebsd-atm@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

I did something similar in my IDT NICStAR driver, except I changed the
MBUF routine to keep a permanent association between the MBUF and its
external data buffer. The MBUF virtual address and also the external data
physical address are programmed into the card.
The external data physical address is needed, obviously, so that the card
can store the information into the buffer; the MBUF virtual address is
there to more quickly link this buffer with the rest of the PDU, or to
hand the buffer up to the higher protocols.

With the permanent association between an MBUF and its external data
buffer, I keep a separate free MBUF linked list (stored on m_nextpkt) for
MBUFs that are neither being processed in the stack nor programmed into
the card for new input. Since these MBUFs can be quickly recycled, I did
not want the MBUF routines to break the MBUF to external data association
only to re-establish it. This feature also gives me an easy external data
to MBUF map function.

--mark.

From owner-freebsd-atm Wed Nov 12 08:50:59 1997
Return-Path:
Received: (from root@localhost)
    by hub.freebsd.org (8.8.7/8.8.7) id IAA16986
    for atm-outgoing; Wed, 12 Nov 1997 08:50:59 -0800 (PST)
    (envelope-from owner-freebsd-atm)
Received: from galileo.ravel.ufrj.br (galileo.ravel.ufrj.br [146.164.32.68])
    by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id IAA16893
    for ; Wed, 12 Nov 1997 08:49:58 -0800 (PST)
    (envelope-from rodolfo@galileo.ravel.ufrj.br)
Received: (from rodolfo@localhost)
    by galileo.ravel.ufrj.br (8.8.7/8.8.7) id OAA00538;
    Wed, 12 Nov 1997 14:41:43 -0200 (EDT)
From: Rodolfo Heitor Gevaerd de Faria
Message-Id: <199711121641.OAA00538@galileo.ravel.ufrj.br>
Subject: Re: ATM driver & udp stream
In-Reply-To: from Antonio Luis Alves at "Nov 12, 97 03:06:46 am"
To: aalves@dei.uc.pt
Date: Wed, 12 Nov 1997 14:41:43 -0200 (EDT)
Cc: freebsd-atm@FreeBSD.ORG
X-Mailer: ELM [version 2.4ME+ PL32 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-atm@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Antonio Luis Alves was saying that,
^
^ I have been working on a port to FreeBSD of a major manufacturer atm

If you don't mind, what is the "major manufacturer" which you're porting?
^ implementation which includes classical ip and lan emulation.
^ [...]

Rodolfo H G Faria

From owner-freebsd-atm Wed Nov 12 17:23:58 1997
Return-Path:
Received: (from root@localhost)
    by hub.freebsd.org (8.8.7/8.8.7) id RAA28928
    for atm-outgoing; Wed, 12 Nov 1997 17:23:58 -0800 (PST)
    (envelope-from owner-freebsd-atm)
Received: from eden.dei.uc.pt (eden.dei.uc.pt [193.136.212.3])
    by hub.freebsd.org (8.8.7/8.8.7) with SMTP id RAA28921
    for ; Wed, 12 Nov 1997 17:23:50 -0800 (PST)
    (envelope-from aalves@dei.uc.pt)
Received: from zorg.dei.uc.pt
    by eden.dei.uc.pt (5.65v3.2/1.1.10.5/28Jun97-0144PM)
    id AA01698; Thu, 13 Nov 1997 01:30:04 GMT
Received: from localhost (aalves@localhost)
    by zorg.dei.uc.pt (8.8.5/8.8.5) with SMTP id BAA01437;
    Thu, 13 Nov 1997 01:24:06 GMT
Date: Thu, 13 Nov 1997 01:24:06 +0000 (WET)
From: Antonio Luis Alves
To: Kenjiro Cho
Cc: freebsd-atm@FreeBSD.ORG
Subject: Re: ATM driver & udp stream
In-Reply-To: <199711120445.NAA24120@hotaka.csl.sony.co.jp>
Message-Id:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-atm@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Wed, 12 Nov 1997, Kenjiro Cho wrote:

>
> >> All was working well until I made a UDP stream test. The sender was a
> >> Pentium 233mmx and on the receive side a Pentium 166.
> >> The test is a normal
> >> netperf UDP_STREAM with default values for the datagram, which is the max
> >> udp datagram 9216. The driver crashes at the pdu receive routine when
> >> accessing the segments which form the pdu. With the debugger I can see that
> >> the mbuf handles are correct but the data on the mbuf is not. Some times
> >> the m_type is MT_FREE and m->next as got a valid pointer on it, and other
> >> times all the members of the mbuf are zero. On a normal situation the
> >> mbufs would be of type MT_DATA, and correctly initialized with the
> >> pointers to the external free and reference function, m_len, ext_size and
> >> ext_buffer.
> >> From the firmware guide it says the card never touches the mbuf, it just
> >> pass the mbuf handle received from the supply routine to the segments on
> >> the received pdu descriptor.
>
> It sounds familiar to me..., so you might want to take a look at the
> following possibility.
>
> I had a similar experience with Chuck Cranor's ATM driver long time
> ago. It turns out that, under heavy load, a circular list holding
> pointers to mbufs gets completely full and wrapped around. As a
> result, a wrong mbuf in use gets freed. The system crashes after a
> while when it refers to the misplaced mbuf.
>
> IMO, it's a very bad idea to put a free list chain at the top of a
> mbuf cluster data area when a mbuf is on the free list. It is very
> hard to debug if something goes wrong.
>

After reading your mail, the first thing I thought of was to add a
periodic test to check the mbufs in the circular list of the buffer supply
protocol. This way I could probably get closer to the problem and try to
trace it. This is the only place I can think of at the moment that could
cause the problem, if the mbufs were freed while still owned by the card.
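
The two ideas in this exchange, a supply ring that must not wrap onto
live entries and a periodic check of the entries the card still owns, can
be sketched in userland C roughly as follows. Everything here is
hypothetical: the ring size, the buf_state tag and the function names are
invented for the example, and a real driver would track ownership using
whatever the firmware's supply protocol provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SUPPLY_SLOTS 8u   /* hypothetical ring size; must be a power of two */

enum buf_state { BUF_FREE, BUF_ON_CARD, BUF_IN_STACK };

/* Stand-in for an mbuf plus an ownership tag used only for checking. */
struct tracked_buf {
    enum buf_state state;
};

struct supply_ring {
    struct tracked_buf *slot[SUPPLY_SLOTS];
    unsigned head;   /* next slot the driver fills */
    unsigned tail;   /* next slot the card consumes */
};

static unsigned
ring_used(const struct supply_ring *r)
{
    return r->head - r->tail;
}

/* Post a buffer to the card. Refuses when the ring is full, instead of
 * wrapping around and overwriting a slot the card still owns. */
static bool
ring_supply(struct supply_ring *r, struct tracked_buf *b)
{
    if (ring_used(r) == SUPPLY_SLOTS || b->state != BUF_FREE)
        return false;
    b->state = BUF_ON_CARD;
    r->slot[r->head % SUPPLY_SLOTS] = b;
    r->head++;
    return true;
}

/* The card has filled the oldest buffer; take it off the ring. */
static struct tracked_buf *
ring_consume(struct supply_ring *r)
{
    struct tracked_buf *b;

    if (ring_used(r) == 0)
        return NULL;
    b = r->slot[r->tail % SUPPLY_SLOTS];
    r->tail++;
    b->state = BUF_IN_STACK;
    return b;
}

/* Periodic check: every outstanding entry must still be owned by the
 * card. Returns the number of entries that fail the check. */
static unsigned
ring_audit(const struct supply_ring *r)
{
    unsigned i, bad = 0;

    for (i = r->tail; i != r->head; i++) {
        const struct tracked_buf *b = r->slot[i % SUPPLY_SLOTS];

        if (b == NULL || b->state != BUF_ON_CARD)
            bad++;
    }
    return bad;
}
```

Calling ring_audit() from a timer, or at the top of the receive routine,
would flag a buffer that was freed while still posted to the card before
the receive path dereferences it. It cannot tell who freed it, only
narrow down when.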
However, because the supply protocol used in the driver has not changed
much from the original distribution, which has been running on other
architectures for quite some time, I also suspected that the problem could
come from the kernel mbuf handling code, which could cause the problem
under heavy load.

Anyway, today I found that there is new reassembly code in ip_input.c in
release 2.2.5, and because the problem only showed up when there was
reassembly at the ip layer, I decided to give it a try and upgraded the
Pentium 166 machine to this release. It turns out that the driver does not
crash anymore under 2.2.5. I did this just a few hours ago, but the
various tests I have run so far did not crash the machine. I rebooted the
machine several times and ran around 10 udp-stream tests each time without
any problem.

So it seems the old reassembly code was the key to the problem. However,
it is still not clear to me whether there is a bug in the driver which
shows only under heavy load with the old reassembly code, or whether the
old code was responsible for trashing the mbufs. I say this because I
don't remember seeing anyone on hackers reporting problems with the
reassembly code.

Right now I am going to upgrade all our machines to 2.2.5 and see whether
we get any problems in the next weeks.
Antonio Alves

From owner-freebsd-atm Wed Nov 12 17:36:54 1997
Return-Path:
Received: (from root@localhost)
    by hub.freebsd.org (8.8.7/8.8.7) id RAA00506
    for atm-outgoing; Wed, 12 Nov 1997 17:36:54 -0800 (PST)
    (envelope-from owner-freebsd-atm)
Received: from eden.dei.uc.pt (eden.dei.uc.pt [193.136.212.3])
    by hub.freebsd.org (8.8.7/8.8.7) with SMTP id RAA00501
    for ; Wed, 12 Nov 1997 17:36:50 -0800 (PST)
    (envelope-from aalves@dei.uc.pt)
Received: from zorg.dei.uc.pt
    by eden.dei.uc.pt (5.65v3.2/1.1.10.5/28Jun97-0144PM)
    id AA25566; Thu, 13 Nov 1997 01:42:42 GMT
Received: from localhost (aalves@localhost)
    by zorg.dei.uc.pt (8.8.5/8.8.5) with SMTP id BAA01456;
    Thu, 13 Nov 1997 01:36:58 GMT
Date: Thu, 13 Nov 1997 01:36:58 +0000 (WET)
From: Antonio Luis Alves
To: Mark Tinguely
Cc: freebsd-atm@FreeBSD.ORG
Subject: Re: ATM driver & udp stream
In-Reply-To: <199711121621.KAA29115@plains.NoDak.edu>
Message-Id:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-atm@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Wed, 12 Nov 1997, Mark Tinguely wrote:

> With the permanent association between an MBUF and its external data buffer,
> I keep a seperate free MBUF link list (stored on m_nextpkt) for MBUF that are
> not being processed in the stack, nor are programmed in the card for new input.
> Since these MBUFs can be quickly recycled, I did not want the MBUF routines
> to break the MBUF to external data association only to re-establish it. This
> feature also gives me an easy external data to MBUF map function.
>

I saw your code when you released it, and when my problem showed up I
thought you were probably not experiencing the same type of problem
because of the way you handle the M_PERM mbufs. I liked the way you did
it, because of the lower overhead and the mbuf control it gives.
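
For readers who have not seen that approach, the permanent mbuf-to-buffer
pairing quoted above might look roughly like this in userland C. This is
a sketch under assumptions, not the NICStAR driver's code: perm_mbuf and
the perm_* functions are invented names, and the buffer-to-mbuf mapping by
index arithmetic over one contiguous buffer area is just one possible way
to get the "easy map function" mentioned in the quote.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an mbuf permanently bound to one external buffer. */
struct perm_mbuf {
    struct perm_mbuf *m_nextpkt;   /* reused as the free-list link */
    void             *ext_buf;     /* set once at init, never changed */
};

struct perm_pool {
    struct perm_mbuf *free_head;
    unsigned          nfree;
};

/* Bind mbuf i to buffer i once, and put every mbuf on the free list. */
static void
perm_pool_init(struct perm_pool *p, struct perm_mbuf *mb, char *bufs,
    size_t bufsz, unsigned n)
{
    unsigned i;

    p->free_head = NULL;
    p->nfree = 0;
    for (i = 0; i < n; i++) {
        mb[i].ext_buf = bufs + (size_t)i * bufsz;
        mb[i].m_nextpkt = p->free_head;
        p->free_head = &mb[i];
        p->nfree++;
    }
}

/* Take an mbuf (with its buffer still attached) off the free list. */
static struct perm_mbuf *
perm_get(struct perm_pool *p)
{
    struct perm_mbuf *m = p->free_head;

    if (m != NULL) {
        p->free_head = m->m_nextpkt;
        m->m_nextpkt = NULL;
        p->nfree--;
    }
    return m;
}

/* Return an mbuf to the free list; ext_buf is deliberately left alone. */
static void
perm_put(struct perm_pool *p, struct perm_mbuf *m)
{
    m->m_nextpkt = p->free_head;
    p->free_head = m;
    p->nfree++;
}

/* Map an address inside the buffer area back to its owning mbuf. */
static struct perm_mbuf *
perm_buf_to_mbuf(struct perm_mbuf *mb, char *bufs, size_t bufsz,
    unsigned n, const void *addr)
{
    uintptr_t base = (uintptr_t)bufs, a = (uintptr_t)addr;

    if (a < base || a - base >= bufsz * n)
        return NULL;
    return &mb[(a - base) / bufsz];
}
```

Because the pairing is fixed, recycling a buffer is a list push and pop
with no allocation, and there is no window in which the mbuf header and
its buffer are owned by different parties.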
However, there was a design decision here in our group from the
beginning: to make the port without touching the kernel, to keep our port
as close as possible to the original.

Antonio Alves

From owner-freebsd-atm Wed Nov 12 17:50:58 1997
Return-Path:
Received: (from root@localhost)
    by hub.freebsd.org (8.8.7/8.8.7) id RAA01807
    for atm-outgoing; Wed, 12 Nov 1997 17:50:58 -0800 (PST)
    (envelope-from owner-freebsd-atm)
Received: from eden.dei.uc.pt (eden.dei.uc.pt [193.136.212.3])
    by hub.freebsd.org (8.8.7/8.8.7) with SMTP id RAA01784
    for ; Wed, 12 Nov 1997 17:50:52 -0800 (PST)
    (envelope-from aalves@dei.uc.pt)
Received: from zorg.dei.uc.pt
    by eden.dei.uc.pt (5.65v3.2/1.1.10.5/28Jun97-0144PM)
    id AA05852; Thu, 13 Nov 1997 01:57:16 GMT
Received: from localhost (aalves@localhost)
    by zorg.dei.uc.pt (8.8.5/8.8.5) with SMTP id BAA01471;
    Thu, 13 Nov 1997 01:51:32 GMT
Date: Thu, 13 Nov 1997 01:51:32 +0000 (WET)
From: Antonio Luis Alves
To: Rodolfo Heitor Gevaerd de Faria
Cc: freebsd-atm@FreeBSD.ORG
Subject: Re: ATM driver & udp stream
In-Reply-To: <199711121641.OAA00538@galileo.ravel.ufrj.br>
Message-Id:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-atm@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Wed, 12 Nov 1997, Rodolfo Heitor Gevaerd de Faria wrote:

> Antonio Luis Alves was saying that,
> ^
> ^ I have been working on a port to FreeBSD of a major manufacturer atm
>
> If you don't mind, what is the "major manufacturer" which you're porting ?
>

Unfortunately I can't give more details about the company and our work at
this time, because of an agreement we have. However, I hope this will
change soon, as we are negotiating the possibility of making the port
freely available to the FreeBSD community as a binary-only distribution.
I will let you know and post here as soon as we have more details.

Antonio Alves