Date: 22 Apr 2008 19:38:32 +0200
From: "Arno J. Klaassen" <arno@heho.snv.jussieu.fr>
To: Mike Tancsa <mike@sentex.net>
Cc: stable@freebsd.org
Subject: Re: nfs-server silent data corruption
Message-ID: <wpzlrlu6w7.fsf@heho.snv.jussieu.fr>
In-Reply-To: <200804221501.m3MF1guW092221@lava.sentex.ca>
References: <wpmyno2kqe.fsf@heho.snv.jussieu.fr> <20080421094718.GY25623@hub.freebsd.org> <wp63ubp8e0.fsf@heho.snv.jussieu.fr> <200804211537.m3LFbaZA086977@lava.sentex.ca> <wpy77650s0.fsf@heho.snv.jussieu.fr> <200804221501.m3MF1guW092221@lava.sentex.ca>

Hello,

Mike Tancsa <mike@sentex.net> writes:

> At 05:57 PM 4/21/2008, Arno J. Klaassen wrote:
>
> > > Hi,
> > > How long does it take for the problem to show up ?
> >
> >Less than an hour in general (running the same client script
> >simultaneously on a 100Mbps linux box and 1Gbps bds6-x86)
>
> I am running my nic at gig speeds only... I recompiled the kernel
> this morning to include cpufreq as well as made sure the cool&quiet
> was enabled in the BIOS.
>
> >for info, I test with args '38 999' (38M, try 999 times) on linux
> >(slightly adapted script BTW) and '138 999' on bsd. The best 'score' I
> >got was 'still 871 iterations to go'
>
> So far I have done 150 loops with an 80MB file and no issues and 200
> loops with a 160MB file. My nfe nic does not support MSI and has its
> own interrupt
>
> # vmstat -i
> interrupt                          total       rate
> irq1: atkbd0                           5          0
> irq4: sio0                          3049          1
> irq16: twe0                       327046        164
> irq19: bge0                       385147        194
> irq21: atapci1                    976355        492
> irq23: nfe0                     11876726       5986
> cpu0: timer                      3966420       1999
> cpu1: timer                      3964392       1998

# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           4          0
irq14: ata0                           69          0
irq20: nfe0                     11650955       5283
irq24: atapci1                        94          0
irq28: atapci2                       178          0
irq29: ahd0                       355704        161
cpu0: timer                      4409020       1999
cpu1: timer                      4391646       1991
cpu2: timer                      4391643       1991
cpu3: timer                      4391641       1991

> I have powerd started up with
> powerd_enable="YES"
> powerd_flags="-a adaptive -b adaptive -n adaptive"

slightly different, I mostly use "-b adaptive -i 90 -n adaptive -r 80"
but the problem shows up without flags as well.

> With the "sleep" in my test script, powerd does seem to be fiddling
> with frequencies as well during the inactivity.

I most often provoke slight swapping for "randomizing" frequency
changes, and run a burnK7 or similar to push the frequency up and down
by hand.

> # sysctl dev. | grep -i fre
> dev.cpu.0.freq: 1800
> dev.cpu.0.freq_levels: 2200/110000 2000/105600 1800/89100 1000/49000
> dev.powernow.0.freq_settings: 2200/110000 2000/105600 1800/89100 1000/49000
> dev.powernow.1.freq_settings: 2200/110000 2000/105600 1800/89100 1000/49000
> dev.cpufreq.0.%driver: cpufreq
> dev.cpufreq.0.%parent: cpu0
> dev.cpufreq.1.%driver: cpufreq
> dev.cpufreq.1.%parent: cpu1

funny, when I do that :

# sysctl dev. | grep -i fre
dev.cpu.0.freq: 995
dev.cpu.0.freq_levels: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.0.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.1.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.2.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.3.freq_settings: 6747/95000 6228/90300 5709/76200 5190/63800 4671/53200 2595/36100
dev.cpufreq.0.%driver: cpufreq
dev.cpufreq.0.%parent: cpu0
dev.cpufreq.1.%driver: cpufreq
dev.cpufreq.1.%parent: cpu1
dev.cpufreq.2.%driver: cpufreq
dev.cpufreq.2.%parent: cpu2
dev.cpufreq.3.%driver: cpufreq
dev.cpufreq.3.%parent: cpu3

especially the dev.powernow.3.freq_settings look weird ...

that said, I once more dug up the old acpi_ppc.c and slightly adapted
it for fbsd7 (basically some name changes and using read_cpu_time()
i.s.o. cp_time) and the problem disappears ...

the algo of acpi_ppc makes it somewhat harder to push up frequencies,
though I doubt that matters.

I tried as well with hint.acpi_throttle.0.disabled="1" in loader.conf
with no luck (using powerd).

I'm out of office tomorrow but will try to find time tomorrow evening
to test with another NIC.

Best,

Arno
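
For anyone who wants to repeat the freq_settings comparison on their own box without eyeballing it, here is a rough Python sketch. It is only an illustration and not part of any FreeBSD tool: the function names are made up, and the sample text is simply the powernow lines from the sysctl output quoted in this message. It parses each dev.powernow.N.freq_settings line into (frequency, power) pairs and reports the units whose table differs from the lowest-numbered unit.

```python
import re

# Sample input: the powernow lines from the sysctl output quoted in this message.
SYSCTL_OUTPUT = """\
dev.powernow.0.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.1.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.2.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.3.freq_settings: 6747/95000 6228/90300 5709/76200 5190/63800 4671/53200 2595/36100
"""

LINE_RE = re.compile(r"^dev\.powernow\.(\d+)\.freq_settings:\s*(.*)$")


def parse_freq_settings(text):
    """Return {unit: [(freq, power), ...]} for every freq_settings line found."""
    settings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore unrelated sysctl lines
        pairs = [
            tuple(int(field) for field in item.split("/"))
            for item in match.group(2).split()
        ]
        settings[int(match.group(1))] = pairs
    return settings


def find_mismatched_units(settings):
    """Return units whose table differs from the lowest-numbered unit's table."""
    if not settings:
        return []
    reference = settings[min(settings)]
    return sorted(unit for unit, pairs in settings.items() if pairs != reference)


if __name__ == "__main__":
    parsed = parse_freq_settings(SYSCTL_OUTPUT)
    for unit in find_mismatched_units(parsed):
        print(f"dev.powernow.{unit} differs from unit {min(parsed)}: {parsed[unit]}")
```

On a live system the text could come from the output of "sysctl dev." instead of the embedded sample. The sketch only flags that a table differs; it does not say which one is correct.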