Date: Wed, 6 Jun 2007 14:45:22 +0900
From: Pyun YongHyeon <pyunyh@gmail.com>
To: Paul Bielecki <pawciobiel@gmail.com>
Cc: freebsd-net@freebsd.org
Subject: Re: lge fiber-optic loose connection for 1-6s
Message-ID: <20070606054522.GA18286@cdnetworks.co.kr>
In-Reply-To: <2e420cc20706051003k64f829bbhd7fa38c7fc2ee29f@mail.gmail.com>
References: <2e420cc20706051003k64f829bbhd7fa38c7fc2ee29f@mail.gmail.com>

On Tue, Jun 05, 2007 at 06:03:20PM +0100, Paul Bielecki wrote:
> Hello All
>
> I have network connection problems with my small database/samba server.
> Machine is on small shuttle box with lge fiber-optic 1000baseSX on LAN
> and rl0 to VPN connection.
> Server been set up by somebody else, about 4 years ago and have not
> been update since.
> I have 6x FreeBSD +2x linux + 4x M$ servers, but it is only one server
> I have connection problems with.
>
> It is FreeBSD 4.8 stable, Mysql 4.0.12, Samba 2.2.8
>
> Network: 330 machines + network printers; 60 machines including this
> server on 10.0.0.0/24, printers are on 10.0.0.0/22 and the rest lan is
> 10.0.1.0/22, 10.0.2.0/22, 10.0.3.0/22.
> Default gateway is set to host in 10.0.0.0/24.
> rl link is connected to a second FreeBSD box which act only as a VPN,
> network 172.16.12.0/24.
> There is one main switch which connects servers and uplinks from all
> rooms and buildings.
> Almost all windows machines in network are up-to date and all have
> anti virus software installed.
>
> What happen is that occasionally, from 6 to 20 times a day, all
> machines seems to lose connection with this server for 1-6 seconds.
>
> If it happens
> -I can ping google.com or other host in the same network from server
> itself and I have reply (?)
> -I lose my ssh connection to this server
> -there is no errors or warnings in messages apart smbd errors
> -samba gives me lots of "smbd read_data: read failure for 4. Error =
> Operation time out" or smbd_oplock/oplock break.
> -tcpdump shows lots of ACK packtes from to server on 139
>
> I think that having 10.0.0.0/24 and 10.0.0.0/22 as a one big thing
> doesn't help, believe that it should be set up with VLANs but I can't
> change it just like that.
> The second thing is that M$ network is not configured properly, there
> should be one wins server or PDC, no bcasts.
>
> I use to just blindly watch tcpdump -v -s 255 -i lge0 port not 22 and
> port not 139 and not icmp
> but I dont know what should I look for.
>
> Let me know your thoughts and please give me some "tips" how can I
> diagnose what can cause my problems.
>
> some help with tcpdump would be much appreciated too,
> for instance:
> 17:05:49.644256 0.00:01:e6:9d:07:16.452 >
> 0.ff:ff:ff:ff:ff:ff.452:ipx-sap-resp 30c '0001E69D071680DDNPI9D0716'
> addr 0.00:01:e6:9d:07:16
> 17:33:04.521449 802.1d config 8000.00:05:5d:1f:00:80.8002 root
> 8000.00:05:5d:1f:00:80 pathcost 0 age 0 max 20 hello 2 fdelay 15
>
> # printers
> 17:33:07.370377 10.0.0.225.svrloc > HP-DEVICE-DISC.MCAST.NET.svrloc:
> [udp sum ok] udp 151 (ttl 4, id 51568, len 179)
> 17:05:18.409507 10.0.0.237.netbios-dgm > 255.255.255.255.netbios-dgm:
> [udp sum ok] NBT UDP PACKET(138) (ttl 60, id 14452, len 229)
> 17:05:18.757053 10.0.0.218.netbios-dgm > 255.255.255.255.netbios-dgm:
> [udp sum ok] NBT UDP PACKET(138) (ttl 60, id 20727, len 229)
>
> # another samba server to bcast
> 17:05:29.708120 10.0.0.127.33191 > 10.0.3.255.netbios-ns: [udp sum ok]
> NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST (DF) (ttl 64, id 0, len
> 78)
>

I'm unsure what caused this issue, but it seems that lge(4) lacks some
protection against overly fragmented packets. Did you see "watchdog
timeout" messages on the console?
I don't have lge(4) hardware, so it's hard to fix. It seems that
lge(4) needs the following work:
 - endian cleanup
 - bus_dma(9) conversion
 - fragment handling, as the hardware can't handle more than 10
   fragments.

-- 
Regards,
Pyun YongHyeon
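
To illustrate the fragment-handling item in the list above, here is a minimal userland sketch in C. It is not lge(4) driver code: `struct frag`, `MAX_TX_FRAGS`, and the `frag_*` helpers are hypothetical names made up for this example, and only the limit of 10 comes from the message. The idea is that a transmit path counts the fragments of a packet and, when the chain is longer than the hardware can take, copies it into one contiguous buffer before queuing it. In the FreeBSD kernel the analogous step would typically be done on an mbuf chain with m_defrag(9).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Limit taken from the message: at most 10 fragments per packet. */
#define MAX_TX_FRAGS 10

/* Hypothetical stand-in for a kernel mbuf: one fragment of a packet. */
struct frag {
	const unsigned char *data;
	size_t len;
	struct frag *next;
};

/* Count the fragments in a chain. */
static size_t
frag_count(const struct frag *f)
{
	size_t n = 0;

	for (; f != NULL; f = f->next)
		n++;
	return (n);
}

/* Return 1 if the chain can be queued as-is, 0 if it must be coalesced. */
static int
frag_chain_fits(const struct frag *f)
{
	return (frag_count(f) <= MAX_TX_FRAGS);
}

/*
 * Copy a chain into one contiguous heap buffer and store its length in
 * *out_len. Returns NULL if allocation fails; the caller frees the result.
 */
static unsigned char *
frag_coalesce(const struct frag *f, size_t *out_len)
{
	const struct frag *p;
	unsigned char *buf;
	size_t total = 0, off = 0;

	for (p = f; p != NULL; p = p->next)
		total += p->len;

	buf = malloc(total > 0 ? total : 1);
	if (buf == NULL)
		return (NULL);

	for (p = f; p != NULL; p = p->next) {
		if (p->len > 0)
			memcpy(buf + off, p->data, p->len);
		off += p->len;
	}
	assert(off == total);
	*out_len = total;
	return (buf);
}
```

A transmit routine built on this would call `frag_chain_fits()` first and fall back to `frag_coalesce()` only for over-long chains, so the common case pays no extra copy.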
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070606054522.GA18286>