2022-10-07: Updated with dual Xeon v4 bladebit.
2023-02-11: Updated with a madmax cuda plotter invalid ndevices error.
2023-02-20: Updated with madmax cuda plotting with 128GB RAM.
The backlog is getting more interesting, but while comparing a Xeon Silver processor to one or two E5-2620v4 processors for some future Chia plotting, I’ve arrived at some benchmarks and a bladebit caveat for the new diskplotter.
The idea is to replace my OG plots with NFT-style plots, while still self-pooling them. At some point I will probably expand my storage again as well.
Links are to original manufacturer specifications. If you find this document useful, feel free to send me a coffee. It might help with the memory upgrades on one or both machines too.
The systems involved:
System one:
- Supermicro SuperServer E403-9P-FN2T (X11SPW-TF motherboard)
- Xeon Silver 4210R (10c20t, 2.4-3.2GHz, 13.75MB cache, 100W TDP)
- 24GB RAM
- 960GB NVMe
- Ubuntu 22.04.1 LTS with current updates as of October 1, 2022
Quick observation: On my Monoprice Stitch power meter, this system goes from about 60W at idle to 160W while plotting with Madmax or Bladebit. Not surprising, but noisy and blowy.
System two:
- BOXX RenderPro 2 (Supermicro X10DRT-L motherboard)
- Dual Xeon E5-2620v4 (each 8c16t, 2.1-3.0GHz, 20MB cache, 85W TDP)
- 32GB RAM
- 500GB NVMe on PCIe card
- 1TB m.2 SATA drive on same PCIe card
- Ubuntu 22.04.1 LTS with current updates as of October 1, 2022
Quick observation: This storage is far from optimal for plotting, but it’s what came with the systems. I will dig into whether I have a larger, faster SSD. Unfortunately this system only has USB 2.0 externally and one low-profile PCIe slot, so I’m a bit limited. I might put a 1TB NVMe drive in the PCIe slot, though, and see how that goes.
System three (I’ve written about this one before):
- Dell Precision Workstation T7910
- Dual Xeon E5-2650Lv4 (each 14c28t)
- 128GB RAM
- 4x 1TB Samsung NVMe drives on the Ultra Quad (PCIe 3.0 x4 per drive) in software RAID-0
- Ubuntu 22.04.1 LTS with current updates as of February 2023
Plotters:
- Chiapos plotter from Chia 1.6.0 built from source
- MadMax plotter (d1a9e88) built from source as of October 2, 2022
- Bladebit v2.0.0-beta1 binary downloaded from Chia github
- All plotters left at default settings unless otherwise noted.
Metrics so far:
System one, Chiapos with 12200MB memory assigned
Time for phase 1 = 10876.922 seconds. CPU (147.640%) Sun Oct 2 19:31:42 2022
Time for phase 2 = 4247.395 seconds. CPU (97.160%) Sun Oct 2 20:42:29 2022
Time for phase 3 = 9153.365 seconds. CPU (95.640%) Sun Oct 2 23:15:03 2022
Time for phase 4 = 635.266 seconds. CPU (97.980%) Sun Oct 2 23:25:38 2022
Total time = 24912.949 seconds. CPU (118.660%) Sun Oct 2 23:25:38 2022
System one, Madmax with -r 10
Phase 1 took 1461.93 sec
Phase 2 took 773.745 sec
Phase 3 took 1241.66 sec, wrote 21866600944 entries to final plot
Phase 4 took 61.6523 sec, final plot size is 108771592628 bytes
Total plot creation time was 3539.07 sec (58.9845 min)
System one, Bladebit with 16GB cache configured
Bladebit plot with 16G cache
Finished Phase 1 in 1744.37 seconds ( 29.1 minutes ).
Finished Phase 2 in 174.39 seconds ( 2.9 minutes ).
Finished Phase 3 in 1501.98 seconds ( 25.0 minutes ).
Finished plotting in 3420.74 seconds ( 57.0 minutes ).
System two with SN_750 NVMe drive (500GB), Bladebit with 24G cache
Finished Phase 1 in 1376.37 seconds ( 22.9 minutes ).
Finished Phase 2 in 148.09 seconds ( 2.5 minutes ).
Finished Phase 3 in 970.59 seconds ( 16.2 minutes ).
Finished plotting in 2495.06 seconds ( 41.6 minutes ).
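To put the four CPU plotting totals above on a common footing, here is a minimal Python sketch that converts each total into minutes per plot and a theoretical plots-per-day figure. The labels are my own shorthand, and the plots-per-day number assumes back-to-back serial plotting with no copy or idle time, which is an idealization rather than something I measured.

```python
# Total plot times in seconds, copied from the logs above.
TOTALS_SEC = {
    "System one, Chiapos": 24912.949,
    "System one, Madmax -r 10": 3539.07,
    "System one, Bladebit 16G cache": 3420.74,
    "System two, Bladebit 24G cache": 2495.06,
}

SECONDS_PER_DAY = 24 * 60 * 60


def summarize(totals):
    """Return {label: (minutes_per_plot, plots_per_day)}.

    plots_per_day is an idealized figure: it assumes one plot at a time,
    started the instant the previous one finishes.
    """
    return {
        label: (sec / 60.0, SECONDS_PER_DAY / sec)
        for label, sec in totals.items()
    }


if __name__ == "__main__":
    for label, (minutes, per_day) in summarize(TOTALS_SEC).items():
        print(f"{label:32s} {minutes:7.1f} min/plot  {per_day:5.1f} plots/day")
```

On these numbers Chiapos takes roughly seven times as long per plot as either Madmax or Bladebit on system one.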
Gigahorse metrics so far:
System three:
./cuda_plot_k32 -C 5 -n 5 -t /nvme/chia/ -2 /nvme/chia/ -d /plots/gigahorse-cuda/ -c xch1xxxxx -f a00fcxxxxx
Total plot creation time was 380.192 sec (6.33654 min)
Total plot creation time was 336.725 sec (5.61209 min)
Total plot creation time was 355.188 sec (5.9198 min)
Total plot creation time was 374.554 sec (6.24257 min)
Total plot creation time was 388.424 sec (6.47374 min)
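Five plots is a small sample, but a mean and range are easier to compare than five separate lines. This is a minimal Python sketch over the totals printed above; the function name is just for illustration.

```python
from statistics import mean

# Per-plot totals in seconds, copied from the five-plot run above.
GIGAHORSE_SEC = [380.192, 336.725, 355.188, 374.554, 388.424]


def run_stats(times):
    """Return (mean, fastest, slowest) in seconds."""
    return mean(times), min(times), max(times)


if __name__ == "__main__":
    avg, fastest, slowest = run_stats(GIGAHORSE_SEC)
    print(f"mean {avg:.1f} s ({avg / 60:.2f} min), "
          f"range {fastest:.1f}-{slowest:.1f} s")
```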
The bladebit diskplot quirk:
If you get this error, there’s a good chance you didn’t specify the destination for the plot.
Allocating memory
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_M_construct null not valid
Aborted (core dumped)
So for example:
./bladebit -n 3 -f <farmerkey> -c <poolcontract> diskplot -t1 /nvme/tmp/ --cache 16G
would give this error. Unlike the other plotters, it does *not* assume that your temp path is your output path if you only specify the temp path. So you’d use:
./bladebit -n 3 -f <farmerkey> -c <poolcontract> diskplot -t1 /nvme/tmp/ --cache 16G /nvme/plots/
instead.
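If you script your plotting, one way to avoid tripping over this is to refuse to build the command line at all when the output directory is missing. The following is a minimal Python sketch; build_diskplot_cmd is a hypothetical helper of my own, and it only uses the flags that appear in the commands above.

```python
def build_diskplot_cmd(farmer_key, pool_contract, temp_dir, out_dir,
                       count=3, cache="16G", binary="./bladebit"):
    """Build the bladebit diskplot argument list used in this post.

    Raises ValueError when out_dir is empty, since leaving it off is what
    produced the std::logic_error shown above.
    """
    if not out_dir:
        raise ValueError("bladebit diskplot needs an explicit output directory")
    return [
        binary, "-n", str(count), "-f", farmer_key, "-c", pool_contract,
        "diskplot", "-t1", temp_dir, "--cache", cache, out_dir,
    ]


if __name__ == "__main__":
    # Placeholder keys, as in the examples above.
    print(" ".join(build_diskplot_cmd("<farmerkey>", "<poolcontract>",
                                      "/nvme/tmp/", "/nvme/plots/")))
```

The returned list can be handed straight to subprocess.run, which avoids shell quoting problems with paths.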
The gigahorse cuda plotter error:
With GPU-enhanced plotting now available in released (binary-only) code from Madmax, I decided to throw a modern GPU into my T7910, repair the post-22.04-upgrade mount failures, and give it a try.
As a reminder from previous posts, this is a dual E5-2650L v4 system with 128GB RAM and 4x 1TB NVMe on the Ultra Quad card. It boots from a 256GB NVMe drive on a PCIe card, and has 4x 8TB SAS drives that don’t seem to be recognized after a few months off. Probably a SATA controller or cable issue, but life goes on.
So I put one of my RTX 3060 LHR cards in, fixed up the NVMe stuff a bit, and went to run cuda_plot_k32. It should do the partial memory plot, but alas, I got an error:
Invalid -r | --ndevices, not enough devices: 0
The card showed up in lspci, but then I realized it needed NVIDIA drivers. So I installed the 530 server bundle and tools, and then the plotter worked.
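A quick pre-flight check along these lines would have caught the missing driver before launching the plotter. This is a minimal Python sketch of my own, not what the plotter does internally. It assumes the nvidia-smi tool that ships with the NVIDIA driver is on the PATH when the driver is installed, and that `nvidia-smi -L` prints one line starting with "GPU " per device.

```python
import shutil
import subprocess


def parse_gpu_list(text):
    """Count lines of `nvidia-smi -L` output that describe a GPU.

    Assumes each device line starts with "GPU ", e.g. "GPU 0: <name> ...".
    """
    return sum(1 for line in text.splitlines() if line.startswith("GPU "))


def count_nvidia_gpus():
    """Return the number of GPUs the NVIDIA driver reports, or 0."""
    if shutil.which("nvidia-smi") is None:
        return 0  # driver tools not installed
    try:
        result = subprocess.run(["nvidia-smi", "-L"], capture_output=True,
                                text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return 0
    if result.returncode != 0:
        return 0  # tool present, but the driver is not usable
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    found = count_nvidia_gpus()
    print(f"NVIDIA GPUs visible to the driver: {found}")
```

A result of 0 while the card still shows up in lspci points at the driver rather than the hardware.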
Alas, the first GPU-enhanced plot seems to have left the machine unresponsive to interactive use. That looks like a memory issue that I’ll have to work out, probably by adding memory.
I will update this with further stats, and maybe make a comparison chart, as testing progresses. I’m also giving serious thought to upgrading the SSD in the dual-E5 machine.
Obligatory disclosure:
While I work for Supermicro at the time of this writing, the servers and all other elements of my home labs and workspaces are my own and have no association with my employer. This post is my own, and my employer probably doesn’t even remember I have a blog, much less approve of it.