Date: Tue, 15 Jul 2008 22:29:28 -0400
From: Steve Bertrand <steve@ibctech.ca>
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: taskqueue timeout
Message-ID: <487D5D08.9070102@ibctech.ca>
In-Reply-To: <200807151955.m6FJtf77008969@apollo.backplane.com>
References: <487CCD46.8080506@ibctech.ca> <200807151711.m6FHBgVO007481@apollo.backplane.com> <487CF077.2040201@ibctech.ca> <487CFA08.5000308@ibctech.ca> <200807151955.m6FJtf77008969@apollo.backplane.com>
Matthew Dillon wrote:
> :Went from 10->15, and it took quite a bit longer into the backup before
> :the problem cropped back up.

Jumping right into it: there is another post after this one, but I'm
going to try to reply inline.

>     Try 30 or longer.  See if you can make the problem go away entirely.
>     Then fall back to 5 and see if the problem resumes at its earlier
>     pace.

I'm sure 30 will either push the issue out further or make it disappear
entirely, but are there any developers here who can say what this timer
does? i.e., how does changing this timer affect the performance of the
disk subsystem (aside from allowing it to work, of course)?

After I'm done responding to this message, I'll be setting the sysctl
to 30.

>     It could be temperature related.  The drives are being exercised
>     a lot, they could very well be overheating.  To find out add more
>     airflow (a big house fan would do the trick).

Temperature is a good thought, but my current physical setup is this:

- 2U chassis
- multiple fans in the case
- located in my lab (which is essentially beside my desk)
- the case has no lid
- the room is 64 degrees (F), with A/C and circulating fans in this area
- the hard drives are separated reasonably well inside the case

>     It could be that errors are accumulating on the drives, but it seems
>     unlikely that four drives would exhibit the same problem.

That's what I'm thinking. All four drives are exhibiting the same
errors... or, for all intents and purposes, the machine is coughing up
the same errors for all the drives.

>     Also make sure the power supply can handle four drives.  Most power
>     supplies that come with consumer boxes can't under full load if you
>     also have a mid or high-end graphics card installed.  Power supplies
>     that come with OEM slap-together enclosures are not usually much better.

I currently have a 550W PSU in the 2U chassis, which, again, is sitting
open. I have other hardware running in worse conditions on lower-wattage
PSUs that doesn't exhibit this behavior. I need to determine whether this
problem is SATA, ZFS, the motherboard, or code.

>     Specifically, look at the +5V and +12V amperage maximums on the power
>     supply, then check the disk labels to see what they draw, then
>     multiply by 2.  e.g. if your power supply can do 30A@12V and you have
>     four drives each taking 2A@12V (and typically ~half that at 5V), that's
>     4x2x2 = 16A@12V and you would probably be ok.

I'm well within specs, even after V/A tests with the meter; the power
supply is providing ample wattage to each device. (A rough sketch of that
power-budget arithmetic follows below, after this message.)

>     To test, remove two of the four drives, reformat the ZFS to use just 2,
>     and see if the problem recurs with just two drives.

... I knew that was going to come up... my response is "I worked so hard
to get this system with ZFS configured *exactly* how I wanted it".

To test, I'm going to flip the sysctl to 30 as per Matthew's
recommendation and see how far that takes me. At the moment I'm only
testing by backing up one machine on the network. If it fails, I'll clock
the time, and then 'reformat' with two drives.

Is there a technical reason this may work better with only two drives?

Is anyone interested to the point where remote login would be helpful?

Steve
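[Editor's sketch of the 12V power-budget rule of thumb quoted above: drive count, times per-drive 12V draw from the label, times a 2x spin-up margin, compared against the PSU's 12V rail rating. The figures used are the illustrative numbers from Matthew's example (2A@12V per drive, 30A rail), not measurements from the hardware in this thread.]

    # Rough 12V power-budget check, per the rule of thumb quoted above:
    # multiply each drive's labelled 12V draw by 2 (spin-up margin) and
    # compare the total against the PSU's 12V rail rating.
    # Numbers below are the illustrative values from the quoted example,
    # not readings from the actual drives or PSU discussed in the thread.

    def twelve_volt_budget(num_drives: int, amps_per_drive_12v: float,
                           psu_12v_rail_amps: float, margin: float = 2.0) -> None:
        required = num_drives * amps_per_drive_12v * margin
        print(f"Estimated worst-case draw: {required:.1f} A @ 12V "
              f"(rail provides {psu_12v_rail_amps:.1f} A)")
        if required <= psu_12v_rail_amps:
            print("Probably OK on the 12V rail.")
        else:
            print("12V rail looks marginal; suspect the PSU.")

    # Example from the quoted text: four drives at 2A@12V against a 30A rail
    # -> 4 x 2 x 2 = 16A, comfortably under 30A.
    twelve_volt_budget(num_drives=4, amps_per_drive_12v=2.0, psu_12v_rail_amps=30.0)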