Date: Mon, 10 Feb 2003 04:10:06 -0800 (PST)
From: "Pawel Malachowski" <pawmal@unia.3lo.lublin.pl>
To: freebsd-bugs@FreeBSD.org
Subject: Re: kern/48009: dummynet(4) related machine hangs
Message-ID: <200302101210.h1ACA62r005454@freefall.freebsd.org>
The following reply was made to PR kern/48009; it has been noted by GNATS.

From: "Pawel Malachowski" <pawmal@unia.3lo.lublin.pl>
To: Maxim Konovalov <maxim@macomnet.ru>
Cc: bug-followup@freebsd.org, freebsd-ipfw@freebsd.org
Subject: Re: kern/48009: dummynet(4) related machine hangs
Date: Mon, 10 Feb 2003 13:00:33 +0100

 Hello,

 I've checked whether q->numbytes really goes negative, as Mike Hibler
 described in kern/37573, and it does. My machine goes into an infinite
 loop inside the splimp()/splx() sections, which is why it looks frozen.

 I decided to check whether the problem exists on 2 other machines
 (4.7-RELEASE and 4.7-STABLE), and I can confirm that all of them hang
 in the same way when I frequently modify the bandwidth parameter.

 Once again:

 * Install fresh FreeBSD 4.7-RELEASE
 * Recompile GENERIC kernel with IPFW2, rebuild ipfw and libalias
   as described in ipfw(8), reinstall and reboot
 * Do:

     ipfw pipe 10 config bw 0
     ipfw pipe 20 config bw 0
     ipfw add 10 pipe 10 ip from any to any in recv NIC
     ipfw add 20 pipe 20 ip from any to any out xmit NIC

   (my NIC was: rl0, fxp0 -- the NIC used to connect to the LAN)

   Then connect, for example, to a fast FTP server in the LAN and GET
   some big file (hundreds of MB). While this file is downloading, run
   the following script:

     #!/bin/sh
     i=1
     while [ 1 ]
     do
       echo Test number $i \(`date`\).
       echo Step 1.
       ipfw pipe 10 config bw 512kbit/s
       echo Step 2.
       ipfw pipe 20 config bw 512kbit/s
       sleep 1
       echo Step 3.
       ipfw pipe 10 config bw 2Mbit/s
       echo Step 4.
       ipfw pipe 20 config bw 2Mbit/s
       sleep 1
       i=`expr $i + 1`
     done

 * And look: after a while q->numbytes will grow fast up to 2^31-1
   and wrap to a negative value, causing the system to hang.

 What is going on?
 In ip_dummynet.c, in ready_event(), the following line (551) turns
 q->numbytes negative:

     q->numbytes += ( curr_time - q->sched_time ) * p->bandwidth;

 causing dummynet() to go into an infinite loop with ready_event().
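 The wraparound described above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the dummynet code: it assumes the credit counter is a 32-bit signed integer (as the 2^31-1 limit in the report suggests), and the helper names add_credit_wrapping() and add_credit_guarded() are hypothetical. Signed overflow is undefined behaviour in C, so the wrap is emulated with unsigned arithmetic; the guarded variant mirrors the "if (q->numbytes<0)" check from the work-around patch.

```c
#include <stdint.h>

/*
 * Hypothetical model of "numbytes += ticks * bandwidth" on a 32-bit
 * counter. The addition is done in uint32_t so the wraparound is well
 * defined; converting back to int32_t yields a negative value on the
 * usual two's-complement platforms (implementation-defined in C).
 */
static int32_t
add_credit_wrapping(int32_t numbytes, int32_t ticks, int32_t bandwidth)
{
	uint32_t sum = (uint32_t)numbytes +
	    (uint32_t)ticks * (uint32_t)bandwidth;

	return (int32_t)sum;
}

/*
 * Same update followed by the guard used in the work-around:
 * a negative result is treated as bogus credit and reset to 0.
 */
static int32_t
add_credit_guarded(int32_t numbytes, int32_t ticks, int32_t bandwidth)
{
	int32_t n = add_credit_wrapping(numbytes, ticks, bandwidth);

	if (n < 0)
		n = 0;
	return n;
}
```

 For example, starting 1000 below INT32_MAX and adding one tick of credit at 2000000 bit/s makes add_credit_wrapping() return a negative number, while add_credit_guarded() returns 0. Small updates that stay in range are unaffected by the guard.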
 Note that on line 557 we decrease q->numbytes, which normally keeps it
 from growing too much:

     q->numbytes -= len_scaled ;

 (where len_scaled = pkt->dn_m->m_pkthdr.len * 8 * hz)

 When we frequently modify the bandwidth using ipfw pipe config bw xxx,
 q->numbytes starts growing towards the maximum signed integer value.
 This was easy to observe because I added a diagnostic printf to
 config_pipe() showing the current q->numbytes value every time
 config_pipe() was called (that is, every time ipfw pipe config bw xxx
 was used). It looks as if, when we are downloading something big (that
 is, when we have high traffic on pipe 10 (recv)), the q->numbytes
 associated with pipe 20 (xmit!) grows fast.

 This patch (work-around) comes from kern/37573 and was changed a bit
 to apply cleanly to the RELENG_4_7 ip_dummynet.c. It works for me,
 preventing the machine from hanging.

 ====================
 *** ip_dummynet.c.orig	Thu Jan 23 22:06:45 2003
 --- ip_dummynet.c	Sun Feb  9 23:51:49 2003
 ***************
 *** 549,554 ****
 --- 549,559 ----
   	     * setting len_scaled = 0 does the job.
   	     */
   	    q->numbytes += ( curr_time - q->sched_time ) * p->bandwidth;
 + 	    if (q->numbytes<0) {
 + 		/* This shouldn't happen, I clear q->numbytes in config_pipe() */
 + 		printf("Oops, ready_event has a problem with q->numbytes<0.\n");
 + 		q->numbytes=0 ;
 + 	    }
   	while ( (pkt = q->head) != NULL ) {
   	    int len = pkt->dn_m->m_pkthdr.len;
   	    int len_scaled = p->bandwidth ? len*8*hz : 0 ;
 ***************
 *** 1515,1521 ****
   static int
   config_pipe(struct dn_pipe *p)
   {
 !     int s ;
       struct dn_flow_set *pfs = &(p->fs);
   
       /*
 --- 1520,1526 ----
   static int
   config_pipe(struct dn_pipe *p)
   {
 !     int s = 0;
       struct dn_flow_set *pfs = &(p->fs);
   
       /*
 ***************
 *** 1549,1561 ****
   	    x->idle_heap.size = x->idle_heap.elements = 0 ;
   	    x->idle_heap.offset=OFFSET_OF(struct dn_flow_queue, heap_pos);
   	} else
   	    x = b;
 ! 
   	x->bandwidth = p->bandwidth ;
   	x->numbytes = 0; /* just in case... */
   	bcopy(p->if_name, x->if_name, sizeof(p->if_name) );
   	x->ifp = NULL ; /* reset interface ptr */
 ! 
   	x->delay = p->delay ;
   	set_fs_parms(&(x->fs), pfs);
   
 --- 1554,1579 ----
   	    x->idle_heap.size = x->idle_heap.elements = 0 ;
   	    x->idle_heap.offset=OFFSET_OF(struct dn_flow_queue, heap_pos);
   	} else
 + 	{   struct dn_flow_queue *q;
 + 	    int i;
 + 
   	    x = b;
 + 	    s = splimp(); /* protect mods to active pipe/flow set */
 + 
 + 	    /* Obtained from kern/37573 Audit-Trail */
 + 	    /* flush accumulated credit for all queues */
 + 	    for (i = 0 ; i <= x->fs.rq_size ; i++ )
 + 		for (q = x->fs.rq[i] ; q ; q = q->next ) {
 + 		    q->numbytes = 0;
 + 		}
 + 	}
 + 
 ! 
   	x->bandwidth = p->bandwidth ;
   	x->numbytes = 0; /* just in case... */
   	bcopy(p->if_name, x->if_name, sizeof(p->if_name) );
   	x->ifp = NULL ; /* reset interface ptr */
 ! 
   	x->delay = p->delay ;
   	set_fs_parms(&(x->fs), pfs);
   
 ***************
 *** 1571,1578 ****
   		all_pipes = x ;
   	    else
   		a->next = x ;
 - 	    splx(s);
   	}
       } else { /* config queue */
   	struct dn_flow_set *x, *a, *b ;
   
 --- 1589,1596 ----
   		all_pipes = x ;
   	    else
   		a->next = x ;
   	}
 + 	splx(s);
       } else { /* config queue */
   	struct dn_flow_set *x, *a, *b ;
   
 ***************
 *** 1600,1605 ****
 --- 1618,1624 ----
   	    if (pfs->parent_nr != 0 && b->parent_nr != pfs->parent_nr)
   		return EINVAL ;
   	    x = b;
 + 	    s = splimp(); /* protect mods to active pipe/flow set */
   	}
   	set_fs_parms(x, pfs);
   
 ***************
 *** 1615,1622 ****
   		all_flow_sets = x;
   	    else
   		a->next = x;
 - 	    splx(s);
   	}
       }
       return 0 ;
   }
 --- 1634,1641 ----
   		all_flow_sets = x;
   	    else
   		a->next = x;
   	}
 + 	splx(0);
       }
       return 0 ;
   }
 ====================

 -- 
 Paweł Małachowski