Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 23 Sep 2024 13:40:27 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject:   OpenZFS/FreeBSD only: potential space leak with async_destroy
Message-ID:  <8e3488d7-9e8a-49a5-9e16-53137eb6d7fc@FreeBSD.org>

next in thread | raw e-mail | index | archive | help

This is a so far theoretical only concern based on a real problem that I 
encountered at work.

It seems that only on FreeBSD ERESTART == TRAVERSE_VISIT_NO_CHILDREN == -1.
 From a quick search, Linux and illumos use positive numbers for ERESTART.

The potential problem is that dsl_process_async_destroys uses bptree_iterate to 
traverse blocks of a destroyed dataset.  bptree_iterate internally uses 
traverse_dataset_destroyed and all the common traversal machinery.
If dsl_scan_free_block_cb decides that the async destroy activity should stop 
(pause) in the current sync pass, then it would return ERESTART.

traverse_dnode can misinterpret that return value as TRAVERSE_VISIT_NO_CHILDREN 
and stop visiting data blocks while continuing the traversal.

As a result:
1. bptree_iterate may continue traversing despite dsl_scan_async_block_should_pause;
2. bptree_iterate may believe that it visited data blocks when, in fact, they 
were skipped.

A test would require destroying a large enough dataset (on a throwaway pool), 
waiting for feature@async_destroy to switch from active to enabled and then
checking that the pool's freeing property is not zero.

What do you think?

-- 
Andriy Gapon




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?8e3488d7-9e8a-49a5-9e16-53137eb6d7fc>