Problems expanding a Synology SHR volume on DS1821+ with a faulty SSD cache attached

I got a Synology DS1821+ array about two years ago, planning to finally cascade my other Synology units and let one or two go. So far, that has not happened, but I’m getting closer. 


Synology DS1821+, photo courtesy of Synology. Mine looks like this but with a bit more dust.

The back story of my DS1821+

This is the 8-bay model with a Ryzen V1500B 4-core 2.2GHz processor, support for virtual machines and containers, and a PCIe slot which I filled with a dual port Mellanox ConnectX-3 (CX312A) 10 Gigabit SFP+ card which came in under $100 on eBay. The expansion options include two eSATA ports (usually used for the overly expensive DX expansion units) and four USB 3 ports (one of which now has a 5-bay Terramaster JBOD attached). 

Today I could get a 40 Gigabit card for that price. In fact, I did for another project, for about $90 plus tax with two Mellanox cables, just last month, but I’m not sure it would work in the DS1821+. It’s not too hard to find one of these 10 Gigabit cards for under $50 shipped in the US. Be sure to get the dual plate or low profile version for this Synology array.

I ran it for a while with 64GB RAM (not “supported” but it works), and then swapped that out to upgrade my XPS 15 9570 laptop, putting that machine’s 32GB back into the Synology. I had a couple of 16TB MDD white label hard drives and a 256GB LiteOn SSD as a cache. I know, I know, there are NVMe cache slots on the bottom and you can even use them as a filesystem volume now.

Here’s where something went wrong

Sometime in the past couple of updates, the SSD cache started warning that it was missing but still accessible. I ignored it, since this system doesn’t see a lot of use and I don’t really care about the cache.

Volume expansion attempt, which failed. SSD cache warning showing here as well.

Earlier this month, I got a couple more of the MDD white label drives (actually ST16000NM001G-2KK103 according to Synology Storage Manager). I was able to expand the storage pool, but not the volume.

Successful storage pool expansion
The volume expansion error. No filesystem errors were discovered.

“The system could not expand Volume 1. This may be caused by file system errors. To rescue your data, please sign in to your Synology Account to submit a technical support ticket.”

Well, as I went to the Synology website to enter a ticket, I remembered the SSD issue and wondered if that caused the problem with growing the volume. 

Easier to fix than I had feared

Sure enough, removing the cache made the volume expand normally, bringing me from 93% used to 45% used. Not too bad. 
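For anyone hitting the same wall, it can be worth peeking at the underlying storage over SSH before and after pulling the cache. A read-only sketch, assuming SSH is enabled on DSM and the volume is an LVM-backed SHR pool (/volume1 is the default path; adjust to taste):

```shell
# Read-only checks on a Synology SHR pool (assumes SSH enabled on DSM;
# /volume1 is the default volume path)
cat /proc/mdstat        # md arrays backing the SHR storage pool
sudo vgdisplay          # volume group size vs. free extents
sudo lvdisplay          # logical volume(s) behind the volume
df -h /volume1          # filesystem usage before/after expansion
```

None of these change anything, so they're safe to run while deciding whether to open a ticket.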

 

Where do we go from here?

At some point in the next month or two, I plan to get three more of these 16TB drives, pull the unused 8TB and unused 256GB SSD, and get the system maxed out. 

I’m a bit torn between using this array to replace my Chia farms, at least for a while, and merging my other two substantial Synology arrays onto it, then using one of them (maybe the DS1515+) as the Chia farm with the DX513 and an assortment of external USB drives. Flexfarmer on Docker makes it pretty easy to run Chia farming on a Synology with minimal maintenance.
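As an aside on the Flexfarmer-on-Docker point, Flexpool publishes a container image, and running it on a Synology looks roughly like the sketch below. The image name is flexpool/flexfarmer as of this writing, but the config.yml contents and volume paths here are assumptions, so generate your config per Flexpool’s docs first.

```shell
# Sketch: Flexfarmer in Docker on a Synology (image name per Flexpool's docs;
# config.yml and plot paths below are example assumptions)
docker run -d \
  --name flexfarmer \
  --restart unless-stopped \
  -v /volume1/docker/flexfarmer/config.yml:/config.yml:ro \
  -v /volume1/plots:/plots:ro \
  flexpool/flexfarmer:latest \
  --config /config.yml
```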

Replacing Meraki with TP-Link Omada for the new year

[This post was originally teased on Medium – check it out and follow me there too.]

I’m a big fan of Meraki, but now that I haven’t been an employee of Cisco for over two years, I no longer have the free license renewals or the employee purchase discounts on new products and licenses. So October 28, 2022, was the end of my Meraki era. (Technically a month later, but I needed a plan by October 28 just in case.)

The home network, mostly decabled, that got me through the last 4-5 years.

I needed a replacement solution that wouldn’t set me back over a thousand dollars a year, and my original plan was to use a Sophos SG310 either with the Sophos home firewall version or pfSense or the like. I even got the dual 10gig module for it, so that I could support larger internal networks and work with higher speed connectivity when the WAN links go above 1Gbps. I racked it up with a gigabit PoE switch with 10gig links, and now a patch panel and power switching module.

The not-really-interim network plan. The Pyle power strip and iwillink keystone patch panel stayed in the “final” network rack.

But I didn’t make the time to figure it out and build an equivalent solution in time.

How do you solve a problem like Omada?

Sometime in early to mid 2022 I discovered that TP-Link had a cloud-manageable solution called Omada.

It’s similar in nature to Meraki’s cloud management, but far less polished. On the flip side, licensing 12 Omada devices would cost less than $120/year, vs. about $1500/year (or $3k for 3 years) with Meraki. So I figured I’d give it a try.

The core element of the Omada ecosystem is the router. Currently they have two models, the ER605 at about $60-70, and the ER7206 at about $150. I went with the ER605, one version 1 without USB failover (for home, where I have two wireline ISPs), and one version 2 model with USB failover (for my shop where I have one wireline ISP and plan to set up cellular failover).

You’ll note I said cloud-manageable above. That’s a distinction for Omada compared to Meraki, in that you can manage the Omada devices individually per unit (router, switch, access point), or through a controller model.

The controller has three deployment models:

  • On-site hardware (OC200 at $100, for up to 100 devices, or OC300 at $160, for up to 500 devices)
  • On-site or virtualized software controller, free, self-managed
  • Cloud-based controller, $9.95 per device per year (30 day free trial for up to 10 devices I believe)

I installed the software controller on a VM on my Synology array, but decided to go web-based so I could manage it from anywhere without managing access into my home network.
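If you do want to self-host, the software controller also runs nicely in Docker rather than a full VM. A sketch using the widely used community image mbentley/omada-controller (not an official TP-Link image; the port list and volume paths are assumptions, so check the image’s README):

```shell
# Sketch: Omada software controller in Docker (community image, not TP-Link's;
# ports and volume details are assumptions -- verify against the image docs)
docker run -d \
  --name omada-controller \
  --restart unless-stopped \
  -p 8088:8088 -p 8043:8043 \
  -p 29810:29810/udp \
  -p 29811-29814:29811-29814 \
  -v omada-data:/opt/tplink/EAPController/data \
  -v omada-logs:/opt/tplink/EAPController/logs \
  mbentley/omada-controller:latest
```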

Working out the VPN kinks

The complication to my network is that I have VPN connectivity between home and the shop across town. I also had a VPN into a lab network in the garage. Meraki did this seamlessly with what you could call a cloud witness or gateway – didn’t have to open any holes or even put my CPE into bridge mode. With Omada, I did have to tweak things, and it didn’t go well at first.

I had the Comcast CPE in bridge mode on both ends of the VPN and did the “manual” setup of the VPN, but never established a connection. I tried a lot of things myself, and even asked on the Omada subreddit (to no direct avail).

I came up with a Plan B that included the purchase of a Meraki MX65. I was ready to drop $300-500 to license the MX65 at home, the MX64 at the shop, and the MR56 access point at home to keep things going, with other brands of switches to replace the 4-5 Meraki switches I had in use.

As a hail-mary effort, I posted on one of the Omada subreddits. The indirect help I got from Reddit had me re-read other documentation on TP-Link’s site, wherein I found the trick to the VPN connectivity – IKEv1, not v2. Once I made that change, the link came up, and the “VPN Status” in Insights gave me the connectivity.

The trick to the manual VPN connectivity was IKEv1, not v2

The last trick, which Meraki handled transparently when you specified exported subnets, was routing between the two. I had to go to Settings -> Transmission -> Routing and add a static route with next hop to the other side of the tunnel. Suddenly it worked, and I was able to connect back and forth.

Looking at the old infrastructure

My old Meraki network had 12 devices, including three security appliances, four switches, a cellular gateway, and four access points. The home network used the MX84 as the core, with an MS42p as core switch, an MS220-24 as the “workbench” switch on the other side of the room, and an MS220-8P downstairs feeding the television, TiVo, printers, MR42 access point, and my honey’s workstation, connected via wireless link with a D-Link media access point in client mode. I also had an MS510TXPP from Netgear, primarily to provide 2.5GbE PoE for the Meraki MR56 access point.

There was an SG550XG-8F8T in my core “rack” (a 4U wall-mountable rack sitting on top of the MS42p switch), but it was not in use at the time – I didn’t have any 10GBase-T gear, and the MS42p had four 10GbE SFP+ slots for my needs.

The garage lab had an SG500XG-8F8T behind the Z1 teleworker appliance. TP-Link powerline feeds that network from the home office.

The remote shop had an MX64, an MS220-8P, and an MR18, as well as the MG21E with a Google Fi SIM card.

So there was a lot to replace, and plenty of complication added in the process.

Looking at the new infrastructure

The new core router is the TP-Link ER605, feeding the MS510TXPP switch for mgig and 10gig distribution (including WiFi), with another downlink to a TL-SG2008P switch ($90 at time of purchase) which offers 4 PoE+ ports and integrated monitoring with Omada.

The ER605 has front-facing ports, so I have those cables going into the patch panel to connect Internet uplinks and the PoE switch. On the SG2008P, ports are on the back and LEDs are on the front, so I have all 8 ports going to the patch panel and they feed things from there.

The MS510TXPP has downlinks to the powerline network, an SG500-48X switch across the room connected by 10 Gigabit DAC, and a few other things in the office.

I have the wireless needs fulfilled by a Netgear Nighthawk router in AP mode, and a TP-Link Omada EAP650 access point that needs some tuning. I expect to replace the Nighthawk with the EAP650 at some point, and I have a Motorola Q11 mesh network kit coming soon which could replace much of the wifi in the house.

The downstairs network is still fed by the D-Link wireless bridge (as a client of the Nighthawk), but now it has a random Linksys 8-port switch serving the first floor needs.

The garage lab still has the SG500XG, bridged via powerline, and very limited hardware running due to California electric prices.

In the shop, I have the ER605v2, feeding a random 8-port TP-Link unmanaged switch for now. I’m thinking about getting an Omada switch there, and I recently installed a UeeVii WiFi6 access point (acquired through Amazon Vine, review and photos here) which is more than enough to cover the 500 square feet I need there.

Why’d it take so long to post?

I had found an Etsy seller who made 3d printed rackmount accessories, and I ordered a cablemodem mount, router mount, and a 5-port keystone patch panel. I ordered December 15, shipping label was issued December 21, and I expected it right after Christmas. Alas, after a month and two shipping labels being generated, I had no gear and no useful response from the seller, so I got a refund and went with rack plan B.

I took a 14″ 1U rack shelf like this one (but fewer slots and about half the price) and used zip ties to attach the router and 8-port switch to it. Not a great fit, but inside the CRS08 carpeted rack it’s not so visible.

Where do we go from here?

Right now the networks are stable, except for no wifi in the garage and occasional wifi flakiness in the house. So my next steps will be fixing the home wifi, and probably moving another AP to the garage (possibly even setting up a wireless bridge to replace the powerline connection).

I am looking at some more switching, possibly upgrading the Omada switch to replace the Netgear at home, and then take the existing 8 port Omada to the shop to provide more manageability (and PoE+) over there.

The front runners for the new switch right now are the SX3008F (8 port SFP+ at $230; 16 port SX3016F is $500), SG3428X (24 port gigabit, 4 port SFP+), and the SG3210XHP-M2 (8 port 2.5GbE copper PoE + 2 SFP+ slots at $400, pretty much the same as the Netgear except with no 5GbE ports).

There are a couple of other options, like the $500 SSG3452X which is equivalent to the MS42p, but I’ll have to consider power budget and hardware budget, and what I can get sold from the retired stash this month to further fund the expansion.

I also need to work out client VPN to connect in to both sites. I had client VPN on my travel laptop to the shop for a couple of years, but haven’t tried it with the new platform yet.

TP-Link supposedly has a combination router/controller/limited switch coming out this year, the ER7212, which also offers 110W of PoE across eight gigabit ports. It’s apparently available in Europe for 279 euros. Hopefully it (and other new products) will be released in the US at CES in Las Vegas this week.

I was going to bemoan the lack of 10G ports, but then I saw the ER8411 VPN router with two SFP+ ports (one WAN, one WAN/LAN). Still doesn’t seem to support my 2.5Gbit cable WAN, but it’s at least listed on Amazon albeit out of stock as of this writing.

A quick Chia plotting note and a future searchable tip

2022-10-07: Updated with dual Xeon v4 bladebit.
2023-02-11: Updated with a madmax cuda plotter invalid ndevices error
2023-02-20: Updated with madmax cuda plotting with 128GB RAM 

The backlog is getting more interesting, but in an attempt to compare a Xeon Silver processor to one or two E5-2620v4 processors for some future Chia plotting, I’ve arrived at some benchmarks and a bladebit caveat for the new diskplotter.

The idea is to replace my OG plots with NFT-style plots, while still self-pooling them. At some point I will probably expand my storage again as well. 

Links are to original manufacturer specifications. If you find this document useful, feel free to send me a coffee. It might help with the memory upgrades on one or both machines too.

The systems involved:

System one:

Quick observation: On my Monoprice Stitch power meter, this system goes from about 60W at idle to 160W while plotting with Madmax or Bladebit. Not surprising, but noisy and blowy. 

System two:

Quick observation: This storage is very suboptimal for plotting, but it’s what came with the systems. I will dig into whether I have a larger faster SSD. Unfortunately this system only has USB 2.0 externally, and one low profile PCIe slot, so I’m a bit limited. Might put a 1TB NVMe drive in the PCIe slot though and see how that goes. 

System three (I’ve written about this one before):

  • Dell Precision Workstation T7910
  • Dual Xeon E5-2650Lv4 (each 14c28t)
  • 128GB RAM
  • 4x 1TB Samsung NVMe drives on the Ultra Quad (PCIe 3.0 x4 per drive) in software RAID-0
  • Ubuntu 22.04.1 LTS with current updates as of February 2023

Plotters:

Metrics so far:

System one, Chiapos with 12200MB memory assigned

Time for phase 1 = 10876.922 seconds. CPU (147.640%) Sun Oct 2 19:31:42 2022
Time for phase 2 = 4247.395 seconds. CPU (97.160%) Sun Oct 2 20:42:29 2022
Time for phase 3 = 9153.365 seconds. CPU (95.640%) Sun Oct 2 23:15:03 2022
Time for phase 4 = 635.266 seconds. CPU (97.980%) Sun Oct 2 23:25:38 2022

Total time = 24912.949 seconds. CPU (118.660%) Sun Oct 2 23:25:38 2022

System one, Madmax with -r 10

Phase 1 took 1461.93 sec
Phase 2 took 773.745 sec
Phase 3 took 1241.66 sec, wrote 21866600944 entries to final plot
Phase 4 took 61.6523 sec, final plot size is 108771592628 bytes
Total plot creation time was 3539.07 sec (58.9845 min)

System one, Bladebit with 16GB cache configured

Bladebit plot with 16G cache
Finished Phase 1 in 1744.37 seconds ( 29.1 minutes ).
Finished Phase 2 in 174.39 seconds ( 2.9 minutes ).
Finished Phase 3 in 1501.98 seconds ( 25.0 minutes ).
Finished plotting in 3420.74 seconds ( 57.0 minutes ).

System two with SN750 NVMe drive (500GB), Bladebit with 24G cache

Finished Phase 1 in 1376.37 seconds ( 22.9 minutes ).
Finished Phase 2 in 148.09 seconds ( 2.5 minutes ).
Finished Phase 3 in 970.59 seconds ( 16.2 minutes ).
Finished plotting in 2495.06 seconds ( 41.6 minutes ).

Gigahorse metrics so far:

System three:

./cuda_plot_k32 -C 5 -n 5 -t /nvme/chia/ -2 /nvme/chia/ -d /plots/gigahorse-cuda/ -c xch1xxxxx -f a00fcxxxxx

Total plot creation time was 380.192 sec (6.33654 min)
Total plot creation time was 336.725 sec (5.61209 min)
Total plot creation time was 355.188 sec (5.9198 min)
Total plot creation time was 374.554 sec (6.24257 min)
Total plot creation time was 388.424 sec (6.47374 min)

The bladebit diskplot quirk:

If you get this error, there’s a good chance you didn’t specify the destination for the plot. 

 Allocating memory
terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid
Aborted (core dumped)

So for example:

./bladebit -n 3 -f <farmerkey> -c <poolcontract> diskplot -t1 /nvme/tmp/ --cache 16G 

would give this error. Unlike the other plotters, it does *not* assume that your temp path is your output path if you only specify the temp path. So you’d use:

./bladebit -n 3 -f <farmerkey> -c <poolcontract> diskplot -t1 /nvme/tmp/ --cache 16G /nvme/plots/

instead. 

The gigahorse cuda plotter error:

With GPU-enhanced plotting now available in released (binary-only) code from Madmax, I decided to throw a modern GPU into my T7910, repair the post-22.04-upgrade mount failures, and give it a try. 

As a reminder from previous posts, this is a dual E5-2650L v4 system with 128GB RAM and 4x 1TB NVMe on the Ultra Quad card. It boots from a 256GB NVMe drive on a PCIe card, and has 4x 8TB SAS drives that don’t seem to be recognized after a few months off. Probably a SATA controller or cable issue, but life goes on. 

So I put one of my RTX 3060 LHR cards in, fixed up the NVMe stuff a bit, and went to run cuda_plot_k32. It should do the partial memory plot, but alas, I got an error:

Invalid -r | --ndevices, not enough devices: 0

The card showed up in lspci, but then I realized it needed NVIDIA drivers. So I installed the 530 server bundle and tools, and then the plotter worked. 
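For anyone retracing this, the driver install on Ubuntu 22.04 was roughly the following. The package names are assumed from Ubuntu’s own repos; verify what’s current with `apt search nvidia-driver` before installing.

```shell
# Sketch: NVIDIA 530 server driver install on Ubuntu 22.04
# (package names assumed from Ubuntu's repos; check `apt search nvidia-driver`)
sudo apt update
sudo apt install -y nvidia-driver-530-server nvidia-utils-530-server
sudo reboot
# After reboot, confirm the GPU is visible before re-running the plotter:
nvidia-smi
```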

Alas, the first GPU enhanced plot seems to have wedged the machine against interactive use. Looks like that’s a memory issue that I’ll have to work out, probably by adding memory. 


I will update this with further stats, and maybe make a comparison chart, as testing progresses. I’m also giving serious thought to upgrading the SSD in the dual-E5 machine. 

Obligatory disclosure:

While I work for Supermicro at the time of this writing, the servers and all other elements of my home labs and workspaces are my own and have no association with my employer. This post is my own, and my employer probably doesn’t even remember I have a blog, much less approve of it. 

Pi in the sky: Seven tips for finding the single board computer of your dreams

2022-07-03: Updated for AtomicPi

Raspberry Pi boards have been intermittently available this year. They’re still very useful, but your odds of going into a retailer and picking up a few at list price are about as good as Ethereum hitting $5k this month. In other words, don’t hold your breath.

That being said, this type of single board computer is not completely unobtainable, even in today’s supply-chain-constrained market. Here are seven tips for finding the SBC of your dreams. 

1. Check local retailers

This is a long shot, but for some people in some regions, it may still work. My local shop, Central Computers in Silicon Valley, has had them intermittently for a couple of months at reasonable prices. 

2. Check official distributors 

You can find sellers of the Pi boards on the official Raspberry Pi website. Stock may vary from day to day, and preorders may be possible, so check early and often if you’re pursuing this option. 

3. Check Amazon

Right now, I see a number of shippable Pi 4 boards in 4GB and 8GB on Amazon. They’re pricey, with the 4GB board around $144 and the 8GB board around $195. But if you have to have it for work, or if you’ve found a way to profit majorly from using one of these boards, it may be the way to go.

4. Consider kits

You may be thinking “I don’t need a power supply, a microSD card, a case, and all the other stuff,” but even when backorders weren’t considered, I saw starter kits with the Pi 4 board available in quantity at the above options. Right now, my local shop has the Okdo starter kit with the 8GB board for $160, limit one per customer. The bare board is $90 but out of stock, as are all of the standalone boards. So if you need access to a board soon (hopefully with someone else footing the bill), this is a very viable option. 

5. Can I interest you in a Pi400?

The Raspberry Pi 400 computer is a Pi 4b equivalent in a different form factor. The board should have the same performance as a 4GB Pi4b, and even when boards and kits were unavailable, the Pi 400 was readily available as a standalone unit at about $80 or as a kit with power adapter for $110. Prices on Amazon are a bit higher (like $120 for the standalone or $180 for the kit), but still lower than the 4GB standalone board mentioned at Amazon above.

You won’t be able to use your Pi cases or enclosures with the Pi 400, since it’s wider, but you can consider building your own stand or looking on Thingiverse and the like for 3d-printable enclosures for these boards. 

See Jeff Geerling’s “Raspberry Pi 400 Teardown” blog post and video to see what’s inside and how you might be able to repurpose the board for your needs. 

6. Check your local marketplaces for new or used boards

You may find some boards locally on Craigslist, Facebook Marketplace, Nextdoor, or the like. eBay is also an option, but it may or may not be local. As I write this post, I see boards in my extended area from $200-325 on Craigslist, and surprisingly $120 and up on Facebook. Someone is selling a complete 8-node cluster, including six 8GB and two 2GB boards (plus power supply, network switch, tower case, etc.) for $1000, which is pretty reasonable for the current market.

With these local marketplace options, be sure to buy locally, and if possible, try the board out before paying (if it’s not sealed). With eBay, read the ad carefully and be aware of buyer protections available to you.

7. Look into other small computer options

Raspberry Pi is the most famous card-sized board, probably with the longest run and the best name recognition, but you can also look at options from the RockPi boards to ODROID to the x86-based LattePanda.

Intel NUC (NUC5PPYB/NUC5PPYH) on a 3d-printed stand with memory and HDMI dummy plug.

You may also be able to find bare board Intel NUC systems (like the remnants of the legendary Rabbit doors from a few years ago) that, while not exactly as tiny and requiring a bit more than 3-5 watts, may well do what you need. 

See the Rabbit Overview (October 2020)
and the Rabbit Launch system build (December 2021)

For example, there are some i3 and even i7 boards here on eBay for as low as $95 shipped (searching under the “motherboard” category). When I searched under “Desktops & All-in-Ones” I found some of the old Rabbit boards (quad core Pentium with Gigabit Ethernet) for around $50 each. You’ll have to add a DDR3 SODIMM, a power supply, and probably storage of some sort, but even then you can get a 4GB system for around $100 or so. 


If you don’t need an ultra-modern OS, you can also look into systems like the Jetson Nano (which I believe easily runs Ubuntu 18), or even Jetson TK1 (Ubuntu 14/16) from NVIDIA. These outdated boards are still quite interesting, and have many uses if you can “outsource” the security to a system with a newer platform.

And yet another option I found after posting this – Digital Loggers, a Silicon Valley company better known for their Ethernet-connected power controllers (mentioned in a previous post and used in my shop) are apparently the folks behind the AtomicPi Intel Atom-based single board computer. It takes a little bit more work to power, but for $50 you get a board based on the Atom x5-Z8350 1.44GHz CPU with 2GB RAM and 16GB EMMC on board, a breakout board, and an AI camera module. 

Unlike the other boards mentioned, I have not tried this one, but it’s worth a look if you can handle the limitations and get your 5V 3A power into it yourself. 

Where do we go from here?

I’m realizing I have a few boards that may be worth dusting off and using, or even selling. There’s a Pi 3b+ cluster in need of an expansion, and some other projects in the works for the upcoming holiday weekend. 

What are you doing with single board computers, and have you found any tips and tricks I missed? Share in the comments!

Three ways to build low profile Chia (and forks) nodes

This is another piece on a part of the Chia and cryptocurrency landscapes. See previous posts at https://rsts11.com/crypto

Need to set up a lightweight VPN to get into your low profile node remotely? Check out Stephen Foskett’s writeup on Zerotier. I’m using it on my Pi nodes to reduce NAT layers.
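For reference, getting ZeroTier onto a Pi node is only a couple of commands, using ZeroTier’s official install script. The network ID below is a placeholder, and you’ll still need to authorize the node in ZeroTier Central.

```shell
# Sketch: joining a Pi node to a ZeroTier network (official install script;
# the 16-character network ID is a placeholder)
curl -s https://install.zerotier.com | sudo bash
sudo zerotier-cli join 1234567890abcdef
sudo zerotier-cli listnetworks   # shows OK once the node is authorized
```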

Many if not most Chia farmers run a full node on their farming / plotting machine. Some larger farms will use the remote harvester model, with a single full node and several machines farming plots on local storage. 

If you’re using Flexfarmer from Flexpool, or just want a supplemental node (maybe to speed up your own resyncing, or to supplement decentralization on the Chia network), you might want a dedicated node that doesn’t farm or plot. And for that use case, you don’t really need dual EPYC or AMD Threadripper machines. 

In fact, a well-planned Raspberry Pi 4B 4GB or 8GB system, with an external USB drive, will do quite well for this use. If you want to do a few forks as well, or another blockchain full node, a moderately-recent Intel NUC would do quite well for not much more. 

So here we’ll look at three builds to get you going. Note that any of these can run a full node plus Flexfarmer if you want, or just a full node. 

If you don’t already have Chia software and a full node installed, go ahead and install and sync the node on a full-scale PC first. It may save you five days of waiting. My original build for this use case was to test the blockchain syncing time from scratch.

Syncing from a semi-optimal Pi 4B from scratch took about 8 days, for what it’s worth. One member of the Chia public Keybase forum reported about 28 hours to sync on an Intel Core i5 12600k. 
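If you’re starting from zero, a from-source install on that full-scale PC looks roughly like this, per the chia-blockchain repo’s instructions. Verify against the current README, since the install flow changes between releases.

```shell
# Sketch: Chia full node from source on Ubuntu (per the chia-blockchain README;
# steps may differ by release, so double-check upstream first)
git clone https://github.com/Chia-Network/chia-blockchain.git -b latest --recurse-submodules
cd chia-blockchain
sh install.sh
. ./activate
chia init
chia start node
chia show -s   # watch sync status
```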

Caveat: Raspberry Pi boards are a bit more challenging to find and even harder to find anywhere near the frequently-touted $35 price point, or even under $150. And for Chia nodes, you want a minimum of the 4GB Pi 4B (8GB wouldn’t hurt). So while it’s possible to run on older hardware, it’s not recommended.

 

You might also be able to run on a Pi400 (the Raspberry Pi 4B in a keyboard case, which is much easier to find for $100 or so, complete). I plan to test this soon.

 

Raspberry Pi with external USB SSD. 

This was my initial build, and today it’s running at the Andromedary Instinct providing an accessible full node for about 10-15 watts maximum. 
