Happy 2023!

Here’s hoping that everyone had a Merry Christmas (or whichever holiday you celebrate!) and wishing everyone a happy, healthy and prosperous 2023!

Since my last post back in mid-October, I had my thoughts on what my next upgrade(s) would be – and I changed my mind. The DL360 Gen8 is, frankly, too noisy. It is great for testing things out, but with hybrid work it is a distraction at best and a maddening irritation at worst. Thus, buying the UniFi Aggregation Switch would serve no purpose – for now.

The other driving factor is that my “work” laptop, an old Lenovo Y50-70, started getting far too flaky. I think that there may be some cold solder joints but I am not set up to fix them. And it is getting old. I first thought that the Ubuntu 22.10 upgrade had gone sideways. There had been a lot of upgrades since Ubuntu 16.04 and I don’t limit myself to the LTS releases. But when I went to do a fresh 22.10 install, the install would fail because it could not find the Samsung Evo 850 SSD. I put the SSD in the MiniG3 and the Evo worked fine. The Samsung SSD from the MiniG3 showed the same issues in the Y50-70. After its valiant service, I decided that the Y50 had to be put out to pasture.

I decided to replace it with my IdeaPad L340 (my now-old gaming laptop) with a minimal Windows 11 install – jury’s still out on Windows 11, but it seems to be incrementally improving; maybe Windows 12 will fix it 🙂 – for some specific Windows things. Most of the time it will be booted into Ubuntu.

Of course, that meant the L340 needed a replacement. The Black Friday/Cyber Monday week sales were on and I decided on a Lenovo Legion 5 AMD. It is running an AMD Ryzen 5 6600H, RTX 3060, 16GB DDR5 dual-channel RAM and a 512GB PCIe Gen4 SSD. This is my first AMD system in, what?, 30 years. My last one was an Am386DX-40. Besides, my son had decided he wanted to build his own gaming rig using the Ryzen 5 6600 🙂 It is a nice laptop – runs quick, and battery life for web browsing and YouTube is about 4-5 hours (with an 80% “full” charge limit using Lenovo’s battery conservation mode).

What’s next? Not sure yet. I’m still waiting for ArmA 4…

Posted in Uncategorized | Leave a comment

Ubuntu/Debian and Broadcom BCM5762

I picked up an HP EliteDesk 705 G3 Mini PC as the potential third node for a Proxmox cluster. While I wouldn’t be using HA, I did not want to cause potential problems with a tied quorum vote. For under CDN$100 I got an AMD Pro A10-8770E, 8GB RAM and a 128GB (Samsung OEM) SSD. It is really small and uses next to no power. And, it has no noisy fans.

While the jury is still out on this approach because the DL360 G8’s fans are so annoying, I still installed Proxmox on the 705. Part of this was because I wanted to experiment with only having one NIC and using VLANs. Setting up the VLANs was no real issue – just a little more fooling around on the command line manually configuring the admin interface to work on a VLAN and allow the other VLAN bridges to be available. However, this really messed with my head as I would have the network interfaces drop offline when under heavy load. And sometimes for what seemed to be no reason at all! Of course, since this was my first time using a single NIC, I was placing the blame on me not setting up the VLANs correctly.

Maybe the issue was Proxmox (and Debian?). So, I put on Ubuntu 22.04 server and desktop as well as Linux Mint 21 to test that theory out. No VLANs, just a “normal” installation. Same issue: under load the NIC would go offline and the console (for server) would sow my field with salt – or rather a bunch of errors. Since this occurred with and without VLANs, the error had to be with something other than my VLAN configuration.

After much digging, there seems to be a (long-standing?) nasty kernel bug with the tg3 driver and the Broadcom BCM5762 NIC.

The solution that worked for me was to add iommu=pt to the kernel command line in /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt"

Then just run update-grub and reboot. Problem fixed.
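For anyone who would rather script the edit than do it by hand, here is a minimal sketch. The helper name add_kernel_param is made up for illustration, and it assumes GNU sed and a GRUB_CMDLINE_LINUX_DEFAULT line that is already present in double quotes. Only update-grub and /etc/default/grub come from the post itself.

```bash
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# unless it is already present. Assumes GNU sed and a double-quoted value.
# Usage: add_kernel_param <grub-defaults-file> <parameter>
add_kernel_param() {
    local file="$1" param="$2"
    # Do nothing when the parameter is already on the line (keeps reruns safe).
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"([^\"]* )?${param}( [^\"]*)?\"" "$file"; then
        return 0
    fi
    # Insert the parameter before the closing quote, then drop the leading
    # space that appears when the original value was empty.
    sed -i -E \
        -e "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${param}\"|" \
        -e 's|^(GRUB_CMDLINE_LINUX_DEFAULT=") |\1|' \
        "$file"
}

# Intended use on the real system (run as root, after taking a backup):
#   cp /etc/default/grub /root/grub.bak
#   add_kernel_param /etc/default/grub iommu=pt
#   update-grub && reboot
# After the reboot, this should print the parameter if it took effect:
#   grep -o 'iommu=pt' /proc/cmdline
```

Checking /proc/cmdline after the reboot is the quickest way to confirm that the running kernel actually picked up the option.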

I also read suggestions to blacklist tg3 in /etc/modprobe.d, on the theory that the driver itself was the issue, but that did not work for me.

Posted in aide-mémoire | Leave a comment

Proxmox Update

A couple of notes on where I am with my ESXi to Proxmox move.

What I have done:

  • I finally bit the bullet and deleted all of my iSCSI LUNs for ESXi – they were simply wasting space at this point.
  • I also moved from iSCSI to NFS for my Proxmox VMs. The main difference between iSCSI and NFS is that iSCSI shares data on the block level, and NFS shares data on the file level. Performance is almost the same, but, in some situations, iSCSI can provide better results. For my purposes NFS is fine.
  • And NFS has the benefit that I can access the VMs on the NAS in Synology File Station. There is a level of comfort in being able to “see” all of my VMs without having to set up an iSCSI initiator.
  • I haven’t worked on network bonding for the NFS 10 GbE connections. Frankly, I do not think I will see any benefits and “added complexity equals added problems.”
  • Both my “production” (PROD) Proxmox server (DL380 G9) and my “development” (DEV) Proxmox server (DL360 G8) can access the same NFS shares. That means that my ISOs are shared as well as my Proxmox backups. This also means that I can create a VM on the DEV box and, by backing up the VM to the shared backup location, restore it to the PROD box. While not the smoothest method, it is a rather neat workaround. I do have to be careful about starting any VMs that are visible to both servers, otherwise “bad things” would happen.
  • I have moved a number of full VMs to LXC containers. Things like my reverse proxy, a DNS server, Home Assistant (I don’t really have any devices for this yet but with Z-Wave it will give me something to do 🙂 ), and Pi-Hole. There is really no need to have a full-blown VM for these simple services. Proxmox makes it easy.
  • Pi-Hole – not part of Proxmox (other than that it runs in an LXC container) but I finally got around to setting up Pi-Hole. A couple of additional configs were needed to get it working for my clients, as Pi-Hole sits on one subnet while my wired and wireless devices sit on other subnets. And, WOW!, the number of ads, etc. that are blocked is crazy. The last 24 hours of stats: 51.4% blocked requests?!?!?!

The To-Do (or think-about) list:

  • Adding a UniFi Aggregation switch – while both Proxmox nodes are connected directly to the Synology NAS, they cannot connect directly to each other for storage. One node is using the 172.16.10.0/24 subnet and the other 172.16.20.0/24, with one of the two 10 GbE ports of the NAS assigned to each. With the switch I can have them both on the same subnet (yes, I know that I could have “broken” the 172.16.10.0/24 subnet in two using /25).
  • Adding a “one litre” PC (such as an HP EliteDesk 705 G3) with Proxmox installed and setting up the DL360 and DL380 as a cluster using the 705 to complete the quorum. Allowing the 705 (or whatever) access to storage may be beneficial, but the 705 does not have 10 GbE. I *think* that the main nodes need to have storage on the same subnet to work correctly (I haven’t read up on that, yet). The jury is still out on that given the cost of electricity, heat (not so bad in winter) and noise (the rack sits behind me in my office).
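The /25 split mentioned in the first to-do item is just arithmetic: a /25 leaves 32 - 25 = 7 host bits, so each half of a /24 holds 128 addresses. A small sketch, with split_24 as a made-up helper name:

```bash
# Hypothetical helper: show the two /25 halves of a /24.
# Usage: split_24 <first-three-octets>, e.g. split_24 172.16.10
split_24() {
    local prefix="$1"
    local size=$(( 1 << (32 - 25) ))   # 128 addresses in each /25
    local start
    for start in 0 "$size"; do
        # First and last addresses of each block are network and broadcast.
        printf '%s.%d/25 hosts %s.%d-%s.%d broadcast %s.%d\n' \
            "$prefix" "$start" \
            "$prefix" $((start + 1)) \
            "$prefix" $((start + size - 2)) \
            "$prefix" $((start + size - 1))
    done
}

split_24 172.16.10
```

Both halves use the netmask 255.255.255.128, so each NAS port and its directly attached node would sit in its own half.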


Jumped Off the Dock

Sometimes when you get to the end of the dock, you want to jump into the water. After getting comfortable (enough) with Proxmox I took the plunge and deleted ESXi 7 from my DL380 Gen9 and moved all of my VMs over from the DL360 G8 test server. Everything seems to work just fine.

I did not change the Smart Array P440ar controller into HBA (IT) mode. I left it in RAID mode with one mirror for Proxmox and another mirror for local ISOs. Now, that is a little overkill for the ISOs given they are enterprise SSDs. The other reason why I am not worried about staying in RAID mode is that my VMs are on one of my NASes.

I did leave the (unconverted) ESXi VMs on the NAS and have the ESXi configuration backed up in case I need to revert. I like to have a solid fallback plan…

There is one last major thing I have to sort: I cannot get the bonded 10 GbE connections between the DL380 and the RS1221+ to be stable. I am pretty sure that this is something with the bonding method I selected. There is no switch between the two – it is just NIC-to-NIC (times two). For now, I will let this sit for a while.

Overall, I am impressed with Proxmox. VMs seem snappier than under ESXi. There are some quirks: there is no file manager like the one in ESXi (really nice to have), the need to have different storage types for different purposes (head scratch) and the fact that if you use LVM for iSCSI you cannot take snapshots (what?!?!).


Running on Proxmox 7.2

I took the plunge over the last few nights and migrated all of my VMs from ESXi to Proxmox. Outside of the previously mentioned updates to the NIC configurations, everything went fine. For Windows guests you will need to follow the Proxmox migration instructions (e.g., make sure you manually edit the Windows VM’s Proxmox config file and change scsi0 to sata0). Once you do that, you will be able to boot successfully. You will also have to add the virtio drivers. This is a manual download outside of Proxmox.
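As a rough sketch of the scsi0 to sata0 change: Proxmox keeps each VM’s settings in /etc/pve/qemu-server/<vmid>.conf, so the edit can be done with sed. The helper name scsi_to_sata and the VMID 130 are made up for illustration, and the sketch assumes the imported disk is attached as scsi0.

```bash
# Hypothetical helper: re-attach the first SCSI disk as SATA in a Proxmox VM config.
# Usage: scsi_to_sata <vm-config-file>
scsi_to_sata() {
    local conf="$1"
    # Rename the disk entry itself, and fix any boot order that points at it.
    # Lines such as "scsihw: ..." are left untouched.
    sed -i -E \
        -e 's/^scsi0:/sata0:/' \
        -e '/^(boot|bootdisk):/ s/scsi0/sata0/g' \
        "$conf"
}

# Intended use on the Proxmox host (VMID 130 is only an example),
# with the VM stopped and a copy of the file saved somewhere first:
#   cp /etc/pve/qemu-server/130.conf /root/130.conf.bak
#   scsi_to_sata /etc/pve/qemu-server/130.conf
```

Once the virtio drivers are installed in the guest, the disk could in principle be moved back to a SCSI/virtio bus, but that is a separate step and not something covered above.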

I have the ESXi server shut down so I can restart it if necessary. The only thing I will have to do if I need to go back to ESXi is to get the weather station data.

To do list:

  • Research how to get the 10GbE links working between the DL380G9 and the RS1221+. I’ll be taking my time with this one as I do not want to change the ESXi server to Proxmox until I have some more time with Proxmox and feel comfortable.
  • Get syslog sending to my DS216+II. I just started looking into this and it seems that, unlike ESXi, I need to do this in the underlying Debian 11 operating system that Proxmox runs on. ESXi and XCP-NG are appliance-style Type 1 hypervisors; Proxmox uses KVM, which is usually classed as Type 1 as well, but it sits on a full Debian install, so things like syslog are configured there.
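For the syslog item, one possible approach (a sketch, not something tested on this setup) is an rsyslog drop-in that forwards everything to the NAS. It assumes rsyslog is installed on the Proxmox host and that the NAS is listening for syslog on UDP 514. The helper name, the file name 90-remote.conf and the address 192.0.2.10 are placeholders.

```bash
# Hypothetical helper: write an rsyslog drop-in that forwards all messages.
# Usage: write_syslog_forward <rsyslog.d-directory> <syslog-host> [port]
write_syslog_forward() {
    local dir="$1" host="$2" port="${3:-514}"
    # A single "@" forwards over UDP; "@@" would use TCP instead.
    printf '*.* @%s:%s\n' "$host" "$port" > "$dir/90-remote.conf"
}

# Intended use on the Proxmox host (as root), with the NAS address substituted:
#   write_syslog_forward /etc/rsyslog.d 192.0.2.10
#   systemctl restart rsyslog
```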

…And back to Proxmox

After a couple of weeks trying out XCP-NG, I do not think it is for me. I still think that XCP-NG has great capabilities but it does not really seem polished. Maybe I needed a couple more weeks to get more familiar with XCP-NG but I really wasn’t feeling it. I know that it took some time for me to become confident with ESXi – and that started with ESXi 5 – but XCP-NG seems so “difficult.”

So, back to Proxmox. After a couple of evenings working on my LVM on iSCSI problem, I finally figured it out. The problem I was having with iSCSI was that my ESXi server also had access to the same target as Proxmox. On the Synology, the Proxmox host entry was set up as Linux, which does not allow sharing, and the ESXi host entry was set up as VMware (of course), which does. So while Proxmox could see the target, the Synology (it appears, anyway) would not allow both ESXi and Proxmox to share it. Once I removed ESXi’s access to the new target, connectivity was fine.

The next part was to add LVM (it could have been ZFS but I wanted to keep it simple) so that multiple VMs could reside on the iSCSI share. The last time I tried Proxmox I got iSCSI to work but it would have been a separate LUN for each VM’s disks and that would have been a pain. I can see uses for a LUN dedicated to a VM (security, performance, etc.) but I don’t need that for my home lab. I finally realized that I manually had to add the LUN’s Volume Group to the LVM configuration. I think that this can automatically populate but it didn’t work for me. Once I had that figured out, I could create VMs on the iSCSI LVM space.
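On the command line, putting a shared LVM volume group on an iSCSI LUN could look roughly like the sketch below. This is an illustration rather than the exact steps used above: /dev/sdX, vg_iscsi and iscsi-lvm are placeholder names, and the wrapper only prints the commands unless DRY_RUN=0 is set.

```bash
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: create a volume group on the LUN and register it with Proxmox.
# Usage: setup_iscsi_lvm <block-device> <vg-name> <storage-id>
setup_iscsi_lvm() {
    local dev="$1" vg="$2" storage="$3"
    run pvcreate "$dev"        # initialise the LUN as an LVM physical volume
    run vgcreate "$vg" "$dev"  # create the volume group that will hold VM disks
    # Register the volume group as Proxmox storage for VM and container disks.
    run pvesm add lvm "$storage" --vgname "$vg" --content images,rootdir --shared 1
}

# /dev/sdX stands for whatever device the iSCSI LUN shows up as on the host.
setup_iscsi_lvm /dev/sdX vg_iscsi iscsi-lvm
```

Running it in the default dry-run mode is a cheap way to read the three commands before pointing them at a real device.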

I also took a test VM I created on ESXi (just a simple Ubuntu 20.04.4 LTS install) as a migration test. I exported that VM as an OVF and copied the files to a temp space on Proxmox. Then, running qm importovf, I imported the test VM. Proxmox’s migration instructions were dead on. No Clonezilla in the mix. Nothing against Clonezilla – it is an excellent package – but this seems much simpler to me.

I even forgot to remove the open-vm-tools on the test VM prior to exporting/importing. No problem, just remove the package. Networking, of course, had to be reconfigured. First, there was no network adaptor but that was expected given the type of virtual NIC changed. I just had to add a NIC to the VM’s configuration and reboot. Second, I had to update netplan as the new NIC had a different device ID. Update the YAML file with the right device ID and re-apply the netplan configuration. Bang, DHCP brings back an IP and you have network access. I then installed the qemu-guest-agent and rebooted. VM migrated.
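The netplan step above can be sketched as follows. The helper name, the interface name ens18 and the file name 00-installer-config.yaml are assumptions; check ip -br link and ls /etc/netplan in the guest for the real values. The helper overwrites the given file with a plain DHCP configuration.

```bash
# Hypothetical helper: write a minimal DHCP netplan config for one interface.
# Usage: write_netplan_dhcp <netplan-yaml-file> <interface-name>
write_netplan_dhcp() {
    local file="$1" iface="$2"
    # YAML is indentation-sensitive, so the spacing in this format string matters.
    printf 'network:\n  version: 2\n  ethernets:\n    %s:\n      dhcp4: true\n' \
        "$iface" > "$file"
}

# Intended use inside the migrated guest (as root):
#   ip -br link                         # find the new interface name
#   write_netplan_dhcp /etc/netplan/00-installer-config.yaml ens18
#   netplan apply
```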

Next steps:

  1. I’m going to leave the couple of VMs running for a few days to check stability. The (production) VM I migrated as a test to XCP-NG had networking issues after an hour or so that I couldn’t resolve. It could have been me, but still it was frustrating.
  2. I have to dig deeper into redundant iSCSI. I have my dual 10GbE connectivity between ESXi and Synology but I need to figure out any gotchas.

If this pans out, moving to Proxmox won’t be all that painful!


ArmA III Online on Ubuntu 22.04 – WORKS

After far too many years it looks like ArmA III is working on Linux. While you still have to use Steam’s Proton Experimental branch, performance matches, for me at least, Windows 10.

My desktop PC is an old i7-4790K, Asus B85M-D Plus, Nvidia GTX 1070, 2 x 8GB DDR3 RAM, and a Samsung EVO 850 for Windows and an EVO 870 for Ubuntu. The framerates are the same under both Windows 10 and Ubuntu 22.04 – 60 FPS. I think this limit is due to my monitors.

The only issue is that I had two – two! – Kingston A400 480GB drives that were D.O.A. They would not show up in the BIOS as Kingston drives but as Phison PS3111-S11 with 21MB (yes, MB) of space (which I think is the cache). I tried them on two different motherboards with the same results. I brought the first one back to the store (they only had two in stock) and had no problems exchanging it for another A400. However, when I got back home I had the same issue. Given there were only two in stock, I suspect that they are from the same batch. It has been some time since I had problems with drives from the same batch being wonky – the last time was with two Seagate IronWolf 4TB drives. So, it happens.

When I returned the second A400 I went with a Samsung EVO – all the rest of my SSDs (SATA III and M.2 PCIe) are Samsung so the extra $40 seems worth it. (The only exception is the four Hitachi Ultrastar SSD400M enterprise SSDs in my DL380 Gen9 – but they are enterprise drives :-)). Anyway, I put in the Samsung and the install went more or less flawlessly.

I say more or less flawlessly because I always have at least two partitions – one for / and another for /home – and I forgot to assign /home during the install. No big deal, just a pain in the butt to move /home to its own partition. Oh, and I had installed ArmA so there was about 140GB of ArmA, Steam and mods to move…

Once my partner is back from vacation, we’ll do some missions to see if this works as well as I think it will.

It is so nice to ditch Windows!


VMware ESXi to XCP-NG

Given the pending Broadcom acquisition of VMware, I have been giving some thought to an alternative for my homelab virtualisation solution. Given Broadcom’s previous approaches to acquisitions, and given that I am using the free version of ESXi, I have concerns about whether the free version will continue.

These concerns are more than just for the homelab user. Organisations that are VMware customers are also having concerns. As Ed Scannell of TechTarget SearchDataCenter noted in his article Broadcom’s acquire-and-axe history concerns VMware users (retrieved 2022 July 10): “Broadcom’s history: Acquire and axe”. This makes perfect business sense: sell off or stop the unprofitable/low profitability products, concentrate on your top customers and maximize shareholder return. That is your fiduciary responsibility.

However, that doesn’t help the homelab user. (Let alone the SMB market…) How this all plays out, well time will tell.

I’ve been a big advocate of VMware starting back in the very early 2000’s. Back in the days of VMware GSX (basically what VMware Workstation is now) and ESX. Back then, looking at server utilisation where everything was physical showed a massive underutilization. “Old fogies” like me will remember your NetServers and Proliants with average CPU utilizations around 25% (or 5-10%…).

Basically, I have been using VMware ESX and later ESXi both professionally and personally for about 20 years. (Wow! Where did the years go?)

Anyway, back to XCP-NG…

There are a number of alternatives to ESXi (or vSphere, really): XCP-NG, Proxmox VE, Hyper-V, KVM, etc. For a Microsoft shop, Hyper-V would be a logical choice and Hyper-V has come a long way over the years. For a Citrix shop, Citrix Hypervisor (formerly XenServer) is an obvious option. Proxmox seems really neat. Xen is really cool because it is the basis for both XenServer/Citrix Hypervisor and XCP-NG, with XCP-NG basically being the free, open source version of XenServer. (I’ll be sticking with “XenServer” as “Hypervisor” without the Citrix reads strange.)

In the end, if I’m going open source (free, as in free beer not a free puppy), what are my options? There really seem to be two: XCP-NG or Proxmox. This is a challenge given the maturity of ESXi.

I first tried Proxmox. The interface is really nice. However, after a week of trying to get iSCSI to work (with lots of YouTube videos and Google University) I finally gave up. The problem wasn’t so much with getting iSCSI to work but to work as I want it to work. I want it to work like ESXi where I create a datastore and have all my VMs reside on that datastore. I could easily get a VM to use all the datastore but I wanted a shared datastore; not a LUN (or LUNs) per VM. I tried to get LVM as the filesystem for VM use but for whatever reason I couldn’t get it to work. Basically, after a week I gave up.

The same thing almost happened with XCP-NG. However, after about five days (and more than a little help from Tom Lawrence’s YouTube videos :-)) I have XCP-NG up and running on ye olde DL360 Gen8. I had to replace the onboard P420i RAID controller with my old IBM 1015 in IT mode. The problem with the P420i is not that you cannot change it from RAID to IT mode. Rather, the DL360 Gen8 cannot boot from the P420i when in IT mode. I thought about buying a PCIe M.2 card and using an M.2 SSD I had lying around but I could not get confidence that I would be able to successfully boot from the PCIe card. I thought about finding an HP power/SATA cable but the price on eBay (plus shipping time) was not appealing. I did have to purchase a couple of SAS cables as the HP cables for the P420i weren’t long enough but $25 and 5 days shipping seemed reasonable. The only downside is that the HP drive lights do not work with the non-HP cables. Oh well…

The biggest challenge – as it would be with any transition – is figuring out how X is done on the new platform. I have most things figured out. You can go with the XCP version of XOA but without a subscription you cannot get the patches. I decided to go with Roni Väyrynen’s XenOrchestraInstallerUpdater running on Ubuntu (rather than Debian, because Ubuntu is what I am familiar with). My feeling is that XOA – the management “console” – is not as mature and polished as ESXi. Maybe the new Xen Orchestra Lite (a/k/a XOA-lite) will fix this. Remember, the web-based ESXi management GUI was pretty rough at one time, too. Time will tell.

I have one “production” VM running on XCP-NG. Using Clonezilla I migrated the VM from ESXi to XCP-NG. Just remember to remove the open-vm-tools first (I use Ubuntu for most of my servers). About the only issue I have found so far is that the network interface names change during the migration from the “ensxxx” format to the “traditional” ethx format. No problem, just update Netplan on first boot.

One thing that I have to figure out is Synology iSCSI and redundancy. The Synology iSCSI for Linux seems to be single path only. I’m sure I’ll figure it out over the next few weeks. I only need another GigE PCIe card for the DL360 as I only have four GigE ports – one for XCP-NG, one for VMs, one for DMZ and one for iSCSI.

I also have to figure out how to deal with 10GigE for any migration. Right now ESXi is directly connected to my RS1221+ so I’ll need another dual-port 10GigE card for the DL360, plus a 10GigE switch and four DAC cables. That might take time.

Anyway, I do have options…


Ubuntu 22.04 and Lenovo IdeaPad L340-15IRH (aide-mémoire)

For some reason the default driver for the Realtek RTL8111/8168/8411 wired Ethernet controller in the Lenovo IdeaPad L340-15IRH will not work under Ubuntu 22.04. This happens with both a clean install and upgrading from Ubuntu 21.04/21.10. The solution is to install the r8168-dkms driver and reboot.

***EDIT: I should have added: blacklist the r8169 driver (it seems like this driver is poorly implemented):

sudo sh -c 'echo blacklist r8169 >> /etc/modprobe.d/blacklist.conf'

And, install the r8168 driver:

sudo apt-get install -y r8168-dkms

Posted in aide-mémoire, Uncategorized | Leave a comment

10GbE iSCSI – Nice and Quick

I’ve been running iSCSI between my DL380 Gen9 VMware ESXi 7.0.3 server and Synology RS1221+ NAS at 1GbE. It works okay but as one would expect it is not all that quick. For my birthday I received a Synology E10G21-F2 10GbE dual SFP+ Port adaptor:

The DL380 has a HP FlexFabric 10Gb 2-port 554FLR-SFP Adapter that I got for Christmas:

While I have used OM3 fibre on my UniFi switches, this time I decided to go with DAC cables. Fibre is, well, neat but DAC is simpler with no issues of dirty fibre connectors, etc. Probably a little cheaper as well. I have redundant connections between the ESXi server and the NAS.

The VMs seem to start up a little faster, but that is a little hard to judge. A Windows Server 2022 VM using CrystalDiskMark showed the following results:

On the Synology NAS, here are the results:

Note that these scores are in megabytes per second, so we need to multiply by 8:
* Windows Server: 5,678.64 megabits per second
* Synology NAS: 5,388.8 megabits per second
Not bad with the overhead of a VM.
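The conversion is easy to reproduce. In this small sketch the helper name is made up, and the two MB/s inputs are back-calculated from the megabit totals above (5,678.64 / 8 and 5,388.8 / 8), since the original benchmark screenshots are not shown here.

```bash
# Hypothetical helper: convert megabytes per second to megabits per second.
# Usage: mbytes_to_mbits <MB-per-second>
mbytes_to_mbits() {
    # awk handles the floating-point multiply; 1 byte = 8 bits.
    awk -v mb="$1" 'BEGIN { printf "%.2f\n", mb * 8 }'
}

mbytes_to_mbits 709.83   # Windows Server VM
mbytes_to_mbits 673.6    # Synology NAS
```

Against a nominal 10,000 megabits per second link, that works out to a little over half of line rate for a single sequential test.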

Note that there is no switch involved. I have a direct connection between ESXi and the NAS. Eventually I may add a 10 GbE switch (likely a UniFi USW-Aggregation) and a 10 GbE card to the old DL360 Gen8. I can still connect the DL360 to the iSCSI targets using 1 GbE. The DL360 is only for testing anyway.
