2023-12-03 02:08:29 +00:00

# datashake - level out zpools by rewriting files

## Background

I have a niche problem: my storage server's ZFS pool is lumpy!

```
NAME                        SIZE  ALLOC   FREE  FRAG    CAP  HEALTH
zones                      32.6T  12.2T  20.4T    3%    37%  ONLINE
  mirror                   3.62T  2.21T  1.41T    5%  61.1%  ONLINE
    c0t5000CCA25DE8EBF4d0      -      -      -     -      -  ONLINE
    c0t5000CCA25DEEC08Ad0      -      -      -     -      -  ONLINE
  mirror                   3.62T  2.22T  1.40T    6%  61.3%  ONLINE
    c0t5000CCA25DE6FD92d0      -      -      -     -      -  ONLINE
    c0t5000CCA25DEEC738d0      -      -      -     -      -  ONLINE
  mirror                   3.62T  2.28T  1.34T    6%  63.0%  ONLINE
    c0t5000CCA25DEAA3EEd0      -      -      -     -      -  ONLINE
    c0t5000CCA25DE6F42Ed0      -      -      -     -      -  ONLINE
  mirror                   3.62T  2.29T  1.33T    5%  63.2%  ONLINE
    c0t5000CCA25DE9DB9Dd0      -      -      -     -      -  ONLINE
    c0t5000CCA25DEED5B7d0      -      -      -     -      -  ONLINE
  mirror                   3.62T  2.29T  1.34T    5%  63.1%  ONLINE
    c0t5000CCA25DEB0F42d0      -      -      -     -      -  ONLINE
    c0t5000CCA25DEECB9Dd0      -      -      -     -      -  ONLINE
  mirror                   3.62T   237G  3.39T    1%  6.38%  ONLINE
    c0t5000CCA24CF36876d0      -      -      -     -      -  ONLINE
    c0t5000CCA249D4AA59d0      -      -      -     -      -  ONLINE
  mirror                   3.62T   236G  3.39T    0%  6.36%  ONLINE
    c0t5000CCA24CE9D1CAd0      -      -      -     -      -  ONLINE
    c0t5000CCA24CE954D2d0      -      -      -     -      -  ONLINE
  mirror                   3.62T   228G  3.40T    0%  6.13%  ONLINE
    c0t5000CCA24CE8C60Ed0      -      -      -     -      -  ONLINE
    c0t5000CCA24CE9D249d0      -      -      -     -      -  ONLINE
  mirror                   3.62T   220G  3.41T    0%  5.93%  ONLINE
    c0t5000CCA24CF80849d0      -      -      -     -      -  ONLINE
    c0t5000CCA24CF80838d0      -      -      -     -      -  ONLINE
```

You can probably guess what happened: I had a zpool with five mirrors, and then
expanded it by adding four more mirrors. ZFS doesn't automatically rebalance
existing data, but does skew writes of new data so that more go to the newer
mirrors.

2023-12-05 03:48:27 +00:00

To rebalance the data, the algorithm is straightforward:

* for each file in the dataset:
    * copy the file to a temporary directory in another dataset
    * delete the original file
    * copy from the temporary directory to recreate the original file
    * delete the temporary directory
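
The steps above can be sketched for a single file as follows. This is a minimal illustration in Python, not the actual `datashake` implementation: `rewrite_file` and `scratch_root` are hypothetical names, and `scratch_root` is assumed to be a directory that lives in another dataset. Note that `shutil.copy2` keeps permissions and timestamps but not ownership or hard links, so a real tool would need to handle those separately.

```python
import shutil
import tempfile
from pathlib import Path


def rewrite_file(path: Path, scratch_root: Path) -> None:
    """Rewrite one file by round-tripping it through a scratch directory.

    Hypothetical helper: scratch_root is assumed to be in another dataset.
    """
    # Create a fresh temporary directory under the scratch location.
    tmp_dir = Path(tempfile.mkdtemp(dir=scratch_root))
    tmp_copy = tmp_dir / path.name

    shutil.copy2(path, tmp_copy)   # 1. copy the file out
    path.unlink()                  # 2. delete the original file
    shutil.copy2(tmp_copy, path)   # 3. copy back to recreate the original file

    # 4. Delete the temporary directory. This is only reached if every step
    #    above succeeded; on an error the scratch copy is deliberately left
    #    in place so the file can still be recovered by hand.
    shutil.rmtree(tmp_dir)
```

Between steps 2 and 3 the only copy of the file is the one in the scratch directory, which is why an interrupted run must never clean that directory up automatically.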

As the files get rewritten, not only do the newer mirrors fill up, but the
older mirrors also free up space. Eventually, the utilization of all mirrors
should converge.

## Solution

The `datashake` program aims to automate the rebalancing process, while also
adding some robustness and heuristics.

* Gracefully handle shutdowns (e.g. Ctrl-C) to prevent files from getting lost.
* Keep track of processed files, so that if the program stops and resumes, it
  can skip those files.
* Write a journal of operations so that, if shut down ungracefully, files in
  the temporary directory can be identified and recovered.
* Don't bother processing really small files.
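
To make three of these points concrete, here is a rough Python sketch of a driver loop that stops only between files, remembers which files it has already processed, and skips small ones. It is an illustration under stated assumptions, not how `datashake` itself is written: the names `shake`, `request_stop`, `install_signal_handlers`, and `done_log` are hypothetical, the `MIN_SIZE` threshold is an arbitrary placeholder (the text above only says "really small"), and the operations journal is left out.

```python
import signal
from pathlib import Path
from typing import Callable, Iterable

# Hypothetical threshold; pick whatever "really small" means for your pool.
MIN_SIZE = 64 * 1024

stop_requested = False


def request_stop(signum, frame):
    """Signal handler: only set a flag, so the file in flight can finish."""
    global stop_requested
    stop_requested = True


def install_signal_handlers() -> None:
    """Route Ctrl-C (SIGINT) and SIGTERM to the stop flag."""
    signal.signal(signal.SIGINT, request_stop)
    signal.signal(signal.SIGTERM, request_stop)


def shake(files: Iterable[Path], done_log: Path,
          rewrite: Callable[[Path], None]) -> int:
    """Rewrite each eligible file once; return how many were rewritten.

    done_log holds one processed path per line (assumes no newlines in paths).
    """
    done = set(done_log.read_text().splitlines()) if done_log.exists() else set()
    count = 0
    with done_log.open("a") as log:
        for path in files:
            if stop_requested:
                break  # stop between files, never in the middle of a rewrite
            key = str(path)
            if key in done or path.stat().st_size < MIN_SIZE:
                continue  # already processed, or too small to bother with
            rewrite(path)
            log.write(key + "\n")  # record only after the rewrite succeeded
            log.flush()
            done.add(key)
            count += 1
    return count
```

A caller would run `install_signal_handlers()` once at startup and then pass its own per-file rewrite function as `rewrite`. Because the handler only sets a flag, an interrupt never lands between deleting a file and recreating it.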