Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 16 Oct 2022 16:40:35 +0000
From:      void <void@f-m.fm>
To:        freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: zfs with operations like rm -rf takes a very long time recently
Message-ID:  <64a0af42-630d-4f42-8b8c-921b68b03078@app.fastmail.com>
In-Reply-To:  <CAOtMX2hjGnpLgs2vD1pTELK-nF9nBORhCj57zyYmkEmHAGohnQ@mail.gmail.com>
References:  <f25d069a-2ee7-491a-a5ef-a14b973c12e2@app.fastmail.com> <CAOtMX2jW1s8n_FGez4-sCMBnbd-GDUoj9Sx83gaTBTEuW58HQw@mail.gmail.com> <1625dc7b-81c4-4350-8f86-1b65f5a860d9@app.fastmail.com> <CAOtMX2hjGnpLgs2vD1pTELK-nF9nBORhCj57zyYmkEmHAGohnQ@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 16 Oct 2022, at 16:07, Alan Somers wrote:
> Gstat is showing that your disk is fully busy.  It's also showing read
> latency as high as 128 ms, which is extremely slow.  

I ran another gstat with -I 5s and have omitted -d as 
per your last email. How did you calculate latency?

There's zfs-stats installed too. Would pasting some output here be useful?

# gstat -opC -I 5s
timestamp,name,q-depth,total_ops/s,read/s,read-KiB/s,ms/read,write/s,write-KiB/s,ms/write,other/s,ms/other,%busy
2022-10-16 17:23:47.080118413,da0,3,0,0,0,0.0,0,0,0.0,0,0.0,0.0
2022-10-16 17:23:52.088745392,da0,4,93,45,246,26.4,49,3974,12.7,0,154.4,97.8
2022-10-16 17:23:57.167503039,da0,3,82,41,164,25.5,40,3193,17.3,1,153.2,99.2
2022-10-16 17:24:02.176759152,da0,3,81,43,176,22.8,38,2704,13.3,0,252.6,98.6
2022-10-16 17:24:07.189364538,da0,2,90,35,140,27.9,55,3371,15.9,1,146.9,99.4
2022-10-16 17:24:12.190874083,da0,3,95,42,170,23.3,52,3486,13.2,0,167.0,96.6
2022-10-16 17:24:17.212769876,da0,4,85,33,134,30.6,52,3508,22.3,1,164.9,100.7
2022-10-16 17:24:22.286986167,da0,3,89,42,168,23.2,46,3149,12.7,1,127.3,97.2
2022-10-16 17:24:27.295992137,da0,3,88,40,160,25.4,48,3978,15.6,0,147.4,101.3
2022-10-16 17:24:32.406662934,da0,2,90,46,254,27.1,44,3554,12.9,0,159.3,97.9
2022-10-16 17:24:37.426033077,da0,3,87,42,177,23.3,45,2835,15.4,1,112.8,98.0
2022-10-16 17:24:42.455203661,da0,3,95,43,170,23.1,52,3770,11.9,0,153.5,98.8
2022-10-16 17:24:47.468741642,da0,3,71,36,144,27.9,35,2510,19.7,1,243.7,99.3
2022-10-16 17:24:52.471973107,da0,3,80,34,138,29.1,46,3630,17.6,0,266.6,99.2
2022-10-16 17:24:57.490425268,da0,3,79,35,141,28.1,43,3372,18.6,1,297.8,98.3
2022-10-16 17:25:02.589730547,da0,4,68,30,123,32.7,36,2318,22.4,3,105.2,97.9
^C

> I suspect a
> problem with your disk.  FYI, ZFS naturally has a 5-second rhythm
> (unless you changed vfs.zfs.txg.timeout), 

# sysctl vfs.zfs.txg.timeout
vfs.zfs.txg.timeout: 5

> so gstat's output is
> sometimes more consistent if you use "-I 5s".  You can also omit "-d"
> for magnetic HDDs, since they don't have anything like TRIM.

> I suggest checking dmesg to see if there are any messages about errors
> from da0.  It would also be worth running "smartctl -a /dev/da0", from
> sysutils/smartmontools.

Unfortunately, no errors seen in either dmesg or smartctl. there's been a long smartctl test relatively recently, also came back clean.

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     39253         -
# 2  Extended offline    Completed without error       00%     39176         -
# 3  Short offline       Completed without error       00%     39172         -

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       2595
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       1487
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   002   002   000    Old_age   Always       -       39254
 10 Spin_Retry_Count        0x0033   128   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1263
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       4
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       426
193 Load_Cycle_Count        0x0032   029   029   000    Old_age   Always       -       710616
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       54 (Min/Max 16/65)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0
222 Loaded_Hours            0x0032   075   075   000    Old_age   Always       -       10367
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       260
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?64a0af42-630d-4f42-8b8c-921b68b03078>