Date: Sun, 16 Oct 2022 16:40:35 +0000 From: void <void@f-m.fm> To: freebsd-fs <freebsd-fs@freebsd.org> Subject: Re: zfs with operations like rm -rf takes a very long time recently Message-ID: <64a0af42-630d-4f42-8b8c-921b68b03078@app.fastmail.com> In-Reply-To: <CAOtMX2hjGnpLgs2vD1pTELK-nF9nBORhCj57zyYmkEmHAGohnQ@mail.gmail.com> References: <f25d069a-2ee7-491a-a5ef-a14b973c12e2@app.fastmail.com> <CAOtMX2jW1s8n_FGez4-sCMBnbd-GDUoj9Sx83gaTBTEuW58HQw@mail.gmail.com> <1625dc7b-81c4-4350-8f86-1b65f5a860d9@app.fastmail.com> <CAOtMX2hjGnpLgs2vD1pTELK-nF9nBORhCj57zyYmkEmHAGohnQ@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 16 Oct 2022, at 16:07, Alan Somers wrote: > Gstat is showing that your disk is fully busy. It's also showing read > latency as high as 128 ms, which is extremely slow. I ran another gstat with -I 5s and have omitted -d as per your last email. How did you calculate latency? There's zfs-stats installed too. Would pasting some output here be useful? # gstat -opC -I 5s timestamp,name,q-depth,total_ops/s,read/s,read-KiB/s,ms/read,write/s,write-KiB/s,ms/write,other/s,ms/other,%busy 2022-10-16 17:23:47.080118413,da0,3,0,0,0,0.0,0,0,0.0,0,0.0,0.0 2022-10-16 17:23:52.088745392,da0,4,93,45,246,26.4,49,3974,12.7,0,154.4,97.8 2022-10-16 17:23:57.167503039,da0,3,82,41,164,25.5,40,3193,17.3,1,153.2,99.2 2022-10-16 17:24:02.176759152,da0,3,81,43,176,22.8,38,2704,13.3,0,252.6,98.6 2022-10-16 17:24:07.189364538,da0,2,90,35,140,27.9,55,3371,15.9,1,146.9,99.4 2022-10-16 17:24:12.190874083,da0,3,95,42,170,23.3,52,3486,13.2,0,167.0,96.6 2022-10-16 17:24:17.212769876,da0,4,85,33,134,30.6,52,3508,22.3,1,164.9,100.7 2022-10-16 17:24:22.286986167,da0,3,89,42,168,23.2,46,3149,12.7,1,127.3,97.2 2022-10-16 17:24:27.295992137,da0,3,88,40,160,25.4,48,3978,15.6,0,147.4,101.3 2022-10-16 17:24:32.406662934,da0,2,90,46,254,27.1,44,3554,12.9,0,159.3,97.9 2022-10-16 17:24:37.426033077,da0,3,87,42,177,23.3,45,2835,15.4,1,112.8,98.0 2022-10-16 17:24:42.455203661,da0,3,95,43,170,23.1,52,3770,11.9,0,153.5,98.8 2022-10-16 17:24:47.468741642,da0,3,71,36,144,27.9,35,2510,19.7,1,243.7,99.3 2022-10-16 17:24:52.471973107,da0,3,80,34,138,29.1,46,3630,17.6,0,266.6,99.2 2022-10-16 17:24:57.490425268,da0,3,79,35,141,28.1,43,3372,18.6,1,297.8,98.3 2022-10-16 17:25:02.589730547,da0,4,68,30,123,32.7,36,2318,22.4,3,105.2,97.9 ^C > I suspect a > problem with your disk. FYI, ZFS naturally has a 5-second rhythm > (unless you changed vfs.zfs.txg.timeout), # sysctl vfs.zfs.txg.timeout vfs.zfs.txg.timeout: 5 > so gstat's output is > sometimes more consistent if you use "-I 5s". You can also omit "-d" > for magnetic HDDs, since they don't have anything like TRIM. > I suggest checking dmesg to see if there are any messages about errors > from da0. It would also be worth running "smartctl -a /dev/da0", from > sysutils/smartmontools. Unfortunately, no errors seen in either dmesg or smartctl. there's been a long smartctl test relatively recently, also came back clean. SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 39253 - # 2 Extended offline Completed without error 00% 39176 - # 3 Short offline Completed without error 00% 39172 - ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0 2 Throughput_Performance 0x0005 100 100 050 Pre-fail Offline - 0 3 Spin_Up_Time 0x0027 100 100 001 Pre-fail Always - 2595 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 1487 5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 0 7 Seek_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0 8 Seek_Time_Performance 0x0005 100 100 050 Pre-fail Offline - 0 9 Power_On_Hours 0x0032 002 002 000 Old_age Always - 39254 10 Spin_Retry_Count 0x0033 128 100 030 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 1263 191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 4 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 426 193 Load_Cycle_Count 0x0032 029 029 000 Old_age Always - 710616 194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 54 (Min/Max 16/65) 196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 220 Disk_Shift 0x0002 100 100 000 Old_age Always - 0 222 Loaded_Hours 0x0032 075 075 000 Old_age Always - 10367 223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 0 224 Load_Friction 0x0022 100 100 000 Old_age Always - 0 226 Load-in_Time 0x0026 100 100 000 Old_age Always - 260 240 Head_Flying_Hours 0x0001 100 100 001 Pre-fail Offline - 0 SMART Error Log Version: 1 No Errors Logged
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?64a0af42-630d-4f42-8b8c-921b68b03078>