Shorten the README; save the story for the blog post.

2023-12-06 18:33:26 -08:00
parent 62dcffc2c9
commit 0346b96449
1 changed file with 14 additions and 58 deletions
--- a/README.md
+++ b/README.md
@@ -2,65 +2,21 @@
 ## Background
-I have a niche problem: my storage server's ZFS pool is lumpy!
+See [this blog post](https://blog.humancabbage.net/posts/datashake) for the
+motivation behind this program. Basically, this program copies files back-and-
+forth between ZFS datasets to attempt to address unbalanced utilization among
+vdevs.
-```
-NAME                        SIZE  ALLOC   FREE  FRAG    CAP  HEALTH
-zones                      32.6T  12.2T  20.4T    3%    37%  ONLINE
- mirror                    3.62T  2.21T  1.41T    5%  61.1%  ONLINE
-  c0t5000CCA25DE8EBF4d0        -      -      -     -      -  ONLINE
-  c0t5000CCA25DEEC08Ad0        -      -      -     -      -  ONLINE
- mirror                    3.62T  2.22T  1.40T    6%  61.3%  ONLINE
-  c0t5000CCA25DE6FD92d0        -      -      -     -      -  ONLINE
-  c0t5000CCA25DEEC738d0        -      -      -     -      -  ONLINE
- mirror                    3.62T  2.28T  1.34T    6%  63.0%  ONLINE
-  c0t5000CCA25DEAA3EEd0        -      -      -     -      -  ONLINE
-  c0t5000CCA25DE6F42Ed0        -      -      -     -      -  ONLINE
- mirror                    3.62T  2.29T  1.33T    5%  63.2%  ONLINE
-  c0t5000CCA25DE9DB9Dd0        -      -      -     -      -  ONLINE
-  c0t5000CCA25DEED5B7d0        -      -      -     -      -  ONLINE
- mirror                    3.62T  2.29T  1.34T    5%  63.1%  ONLINE
-  c0t5000CCA25DEB0F42d0        -      -      -     -      -  ONLINE
-  c0t5000CCA25DEECB9Dd0        -      -      -     -      -  ONLINE
- mirror                    3.62T   237G  3.39T    1%  6.38%  ONLINE
-  c0t5000CCA24CF36876d0        -      -      -     -      -  ONLINE
-  c0t5000CCA249D4AA59d0        -      -      -     -      -  ONLINE
- mirror                    3.62T   236G  3.39T    0%  6.36%  ONLINE
-  c0t5000CCA24CE9D1CAd0        -      -      -     -      -  ONLINE
-  c0t5000CCA24CE954D2d0        -      -      -     -      -  ONLINE
- mirror                    3.62T   228G  3.40T    0%  6.13%  ONLINE
-  c0t5000CCA24CE8C60Ed0        -      -      -     -      -  ONLINE
-  c0t5000CCA24CE9D249d0        -      -      -     -      -  ONLINE
- mirror                    3.62T   220G  3.41T    0%  5.93%  ONLINE
-  c0t5000CCA24CF80849d0        -      -      -     -      -  ONLINE
-  c0t5000CCA24CF80838d0        -      -      -     -      -  ONLINE
+## Usage
+
+```text
+$ datashake --source /tank/stuff --temp /tank/temp --concurrency 2
 ```
-You can probably guess what happened: I had a zpool with five mirrors, and then
-expanded it by adding four more mirrors. ZFS doesn't automatically rebalance
-existing data, but does skew writes of new data so that more go to the newer
-mirrors.
+## Shortcomings
-To rebalance the data, the algorithm is straightforward:
-
-* for file in dataset,
-  * copy the file to a temporary directory in another dataset
-  * delete the original file
-  * copy from the temporary directory to recreate the original file
-  * delete the temporary directory
-As the files get rewritten, not only do the newer mirrors get more full, but
-also the older mirrors free up space. Eventually, the utilization of all mirrors
-should converge.
-## Solution
-The `datashake` program aims to automate the rebalancing process, while also
-adding some robustness and heuristics.
-* Gracefully handle shutdowns (e.g. Ctrl-c) to prevent files from getting lost.
-* Keep track of processed files, so that if the program stops and resumes, it
-  can skip those files.
-* Write a journal of operations so that, if shut down ungracefully, files in
-  the temporary directory can be identified and recovered.
-* Don't bother processing really small files.
+* The way actions and errors are logged in-memory and only persisted at the
+  end is not robust enough. Program crashes or system power loss can cause
+  files to be lost in the temporary directory. In the meantime, the program
+  still writes to `stdout` for each copy operation, so piping the output to
+  `tee` should suffice for now.