Date:      22 Apr 2008 19:38:32 +0200
From:      "Arno J. Klaassen" <arno@heho.snv.jussieu.fr>
To:        Mike Tancsa <mike@sentex.net>
Cc:        stable@freebsd.org
Subject:   Re: nfs-server silent data corruption
Message-ID:  <wpzlrlu6w7.fsf@heho.snv.jussieu.fr>
In-Reply-To: <200804221501.m3MF1guW092221@lava.sentex.ca>
References:  <wpmyno2kqe.fsf@heho.snv.jussieu.fr> <20080421094718.GY25623@hub.freebsd.org> <wp63ubp8e0.fsf@heho.snv.jussieu.fr> <200804211537.m3LFbaZA086977@lava.sentex.ca> <wpy77650s0.fsf@heho.snv.jussieu.fr> <200804221501.m3MF1guW092221@lava.sentex.ca>



Hello,

Mike Tancsa <mike@sentex.net> writes:

> At 05:57 PM 4/21/2008, Arno J. Klaassen wrote:
> > > Hi,
> > > How long does it take for the problem to show up ?
> >
> >
> >Less than an hour in general (running the same client script
> >simultaneously on a 100Mbps linux box and 1Gbps bsd6-x86)
> 
> I am running my nic at gig speeds only...   I recompiled the kernel
> this morning to include cpufreq as well as made sure the cool&quiet
> was enabled in the BIOS.
> 
> 
> 
> >for info, I test with args '38 999' (38M, try 999 times) on linux
> >(slightly adapted script BTW) and '138 999' on bsd. The best 'score' I
> >got was 'still 871 iterations to go'
> 
> 
> So far I have done 150 loops with an 80MB file and no issues and 200
> loops with a 160MB file.  My nfe nic does not support MSI and has its
> own interrupt
> 
> # vmstat -i
> interrupt                          total       rate
> irq1: atkbd0                           5          0
> irq4: sio0                          3049          1
> irq16: twe0                       327046        164
> irq19: bge0                       385147        194
> irq21: atapci1                    976355        492
> irq23: nfe0                     11876726       5986
> cpu0: timer                      3966420       1999
> cpu1: timer                      3964392       1998
 

# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           4          0
irq14: ata0                           69          0
irq20: nfe0                     11650955       5283
irq24: atapci1                        94          0
irq28: atapci2                       178          0
irq29: ahd0                       355704        161
cpu0: timer                      4409020       1999
cpu1: timer                      4391646       1991
cpu2: timer                      4391643       1991
cpu3: timer                      4391641       1991
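
For comparing the two listings, a minimal sketch in Python that parses `vmstat -i` output and reports which line a device sits on and whether anything else is listed with it. The function names `parse_vmstat_i` and `irq_of` are made up for illustration, and the sketch assumes that devices sharing a line would appear together after the `irqN:` prefix; the sample text is the listing above.

```python
VMSTAT_OUTPUT = """\
interrupt                          total       rate
irq1: atkbd0                           4          0
irq14: ata0                           69          0
irq20: nfe0                     11650955       5283
irq24: atapci1                        94          0
irq28: atapci2                       178          0
irq29: ahd0                       355704        161
cpu0: timer                      4409020       1999
cpu1: timer                      4391646       1991
cpu2: timer                      4391643       1991
cpu3: timer                      4391641       1991
"""


def parse_vmstat_i(text):
    """Return {source: {"devices": [...], "total": int, "rate": int}}."""
    rows = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # header line or blank line
        source, rest = line.split(":", 1)
        fields = rest.split()
        if len(fields) < 3:
            continue  # not a "names total rate" line
        # Last two fields are the counters; everything before is device names
        # (assumption: shared lines list several names here).
        *names, total, rate = fields
        rows[source.strip()] = {
            "devices": names,
            "total": int(total),
            "rate": int(rate),
        }
    return rows


def irq_of(rows, device):
    """Return (source, row) for the line that lists `device`, or None."""
    for source, row in rows.items():
        if device in row["devices"]:
            return source, row
    return None


rows = parse_vmstat_i(VMSTAT_OUTPUT)
source, row = irq_of(rows, "nfe0")
shared = len(row["devices"]) > 1
print(source, row["rate"], "shared" if shared else "not shared")
```

On the listing above this reports nfe0 alone on its line, as on Mike's box.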
 
> I have powerd started up with
> powerd_enable="YES"
> powerd_flags="-a adaptive -b adaptive -n adaptive"


slightly different, I mostly use "-b adaptive -i 90 -n adaptive -r 80"
but the problem shows up without flags as well.

 
> With the "sleep" in my test script, powerd does seem to be fiddling
> with frequencies as well during the inactivity.

I most often provoke slight swapping for "randomizing" frequency changes,
and run a burnK7 or similar to push the frequency up and down by hand.
 
> # sysctl dev. | grep -i fre
> dev.cpu.0.freq: 1800
> dev.cpu.0.freq_levels: 2200/110000 2000/105600 1800/89100 1000/49000
> dev.powernow.0.freq_settings: 2200/110000 2000/105600 1800/89100 1000/49000
> dev.powernow.1.freq_settings: 2200/110000 2000/105600 1800/89100 1000/49000
> dev.cpufreq.0.%driver: cpufreq
> dev.cpufreq.0.%parent: cpu0
> dev.cpufreq.1.%driver: cpufreq
> dev.cpufreq.1.%parent: cpu1

funny, when I do that:

# sysctl dev. | grep -i fre
dev.cpu.0.freq: 995
dev.cpu.0.freq_levels: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.0.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.1.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.2.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.3.freq_settings: 6747/95000 6228/90300 5709/76200 5190/63800 4671/53200 2595/36100
dev.cpufreq.0.%driver: cpufreq
dev.cpufreq.0.%parent: cpu0
dev.cpufreq.1.%driver: cpufreq
dev.cpufreq.1.%parent: cpu1
dev.cpufreq.2.%driver: cpufreq
dev.cpufreq.2.%parent: cpu2
dev.cpufreq.3.%driver: cpufreq
dev.cpufreq.3.%parent: cpu3

especially the dev.powernow.3.freq_settings look weird: the MHz values
are about 2.6 times those of the other three CPUs, with the same power
figures ...
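
To make the oddity concrete, a minimal sketch in Python that parses the `dev.powernow.N.freq_settings` lines from the output above and flags any unit whose table differs from unit 0. The helper names (`parse_freq_settings`, `odd_units`, `mhz_ratios`) are hypothetical, and the only input format assumed is the `MHz/mW` pair layout shown in the listing.

```python
import re

SYSCTL_OUTPUT = """\
dev.powernow.0.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.1.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.2.freq_settings: 2587/95000 2388/90300 2189/76200 1990/63800 1791/53200 995/36100
dev.powernow.3.freq_settings: 6747/95000 6228/90300 5709/76200 5190/63800 4671/53200 2595/36100
"""

LINE_RE = re.compile(r"dev\.powernow\.(\d+)\.freq_settings:\s*(.*)")


def parse_freq_settings(text):
    """Map powernow unit number -> list of (MHz, mW) tuples."""
    settings = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore unrelated sysctl lines
        unit = int(m.group(1))
        pairs = [tuple(int(v) for v in item.split("/"))
                 for item in m.group(2).split()]
        settings[unit] = pairs
    return settings


def odd_units(settings, reference=0):
    """Units whose table is not identical to the reference unit's."""
    ref = settings[reference]
    return sorted(u for u, pairs in settings.items() if pairs != ref)


def mhz_ratios(settings, unit, reference=0):
    """Per-level ratio of a unit's MHz values to the reference unit's."""
    return [round(a[0] / b[0], 2)
            for a, b in zip(settings[unit], settings[reference])]


settings = parse_freq_settings(SYSCTL_OUTPUT)
mismatched = odd_units(settings)
print("units differing from unit 0:", mismatched)
for unit in mismatched:
    print(unit, mhz_ratios(settings, unit))
```

If every level of the odd unit is off by the same factor, that would point at a scaling problem in how the table is computed rather than at random garbage, though that is only a guess from the numbers.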

that said, I once more dug up the old acpi_ppc.c and slightly
adapted it for fbsd7 (basically some name changes and using
read_cpu_time() instead of cp_time) and the problem disappears ...

the algo of acpi_ppc makes it somewhat harder to push up frequencies,
though I doubt that matters.

I tried as well with hint.acpi_throttle.0.disabled="1" in loader.conf
with no luck (using powerd).

I'm out of office tomorrow but will try to find time tomorrow evening
to test with another NIC.

Best, Arno



Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?wpzlrlu6w7.fsf>