More and more, there is open source software appearing that will do something that used to require a very expensive piece of hardware and/or software. TrueNAS (with TrueNAS Core, formerly FreeNAS) is one such open source project.
With a view to moving my organisation's data to TrueNAS, I decided as a test case to configure an old circa-2012 HPE Proliant SE1220 with TrueNAS. It was quite an adventure, and this naturally led to a blog post.
At this point, I'd like to thank Ben Pridmore, of First Nations Media, for productive discussions and suggestions at all stages of this investigation, and for collaboration with the hardware issues.
I think a lot of organisations have old server hardware lying around, and if not there is incredibly cheap superseded hardware online to play around with.
First, let's go over some terms and background, because the tech moves quickly, and perhaps, like me, this is the first time you've had to look at SAS or HBAs in any kind of depth. I'll assume you know what RAID is, since that is foundational for the whole post.
SATA
I'm sure everyone is familiar with SATA, which has been the hard drive interface of choice for a long time now. SATA stands for Serial ATA (ATA standing for Advanced Technology Attachment - hello, marketing terminology!). SATA took over from PATA, or IDE as it was also known, with PATA being Parallel ATA or originally just called ATA before there was a need to distinguish between serial and parallel versions.
A SATA III interface can deliver 6Gbit/s (600Mbyte/s), with SATA II at half that (3Gbit/s) and SATA I at half again (1.5Gbit/s). SATA can be used only for single drives: one drive per SATA port/cable.
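As a quick sanity check on those figures, here is a minimal Python sketch of the Gbit/s to Mbyte/s conversion. It assumes the 8b/10b line encoding used by SATA, where every 8 bits of data travel as 10 bits on the wire, which is why 6Gbit/s works out to 600Mbyte/s rather than 750Mbyte/s. The function name is my own, and real-world throughput will be lower again because of protocol overhead.

```python
def usable_mbyte_per_s(line_rate_gbit: float) -> float:
    """Convert a raw line rate in Gbit/s to usable payload in Mbyte/s.

    Assumes 8b/10b encoding: 10 line bits carry one 8-bit payload byte.
    """
    line_bits_per_s = line_rate_gbit * 1e9
    payload_bytes_per_s = line_bits_per_s / 10  # 10 line bits per payload byte
    return payload_bytes_per_s / 1e6


if __name__ == "__main__":
    for name, rate in [("SATA I", 1.5), ("SATA II", 3.0), ("SATA III", 6.0)]:
        print(f"{name}: {rate} Gbit/s -> {usable_mbyte_per_s(rate):.0f} Mbyte/s")
```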
SAS and SAS Expanders
SAS evolved out of SCSI, and fulfils a similar role to SATA, but it's a higher end product used in servers and enterprise hardware. It has multiple channels, better bidirectional throughput, higher signalling voltages (hence greater maximum cable length) and a number of other advantages, and the common speeds are currently 3Gbit/s, 6Gbit/s and 12Gbit/s.
The key hardware item to be aware of is the SAS Expander, which is the basis of any server RAID unit, allowing typically up to 16 SATA connections on a backplane to be connected to a single SAS cable. With that cable plugged into a compatible SAS controller, the OS can access individual drives much as if each were connected to the motherboard via its own SATA controller.
See these good articles/posts for details:
The question that immediately came to mind was about bottlenecking, given that we are accessing all those drives through one cable. The above article makes the point that most mechanical drives operate at around 140Mbyte/s (1400Mbit/s), and given multiple channels and that only several drives in the array are likely to be operating at once, in general there is ample bandwidth to avoid saturation.
With SSDs however, the situation is very different. With a typical 500Mbyte/s (5Gbit/s) bandwidth, several SSDs may rapidly saturate a SAS connector. High bandwidth SAS plus low disk numbers may be necessary for smooth operation of an SSD array.
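To put rough numbers on the bottleneck question, here is a back-of-envelope Python sketch. It assumes the SAS cable carries four lanes at 6Gbit/s each (my assumption, not something I measured on this server), uses the same 6Gbit/s = 600Mbyte/s convention as above, and takes the 140Mbyte/s and 500Mbyte/s drive figures from the text. The function names are hypothetical.

```python
def link_capacity_mbyte_per_s(lane_gbit: float, lanes: int) -> float:
    """Aggregate link capacity, using 1 Gbit/s of line rate ~ 100 Mbyte/s."""
    return lane_gbit * 100 * lanes


def max_drives_before_saturation(drive_mbyte_per_s: float,
                                 lane_gbit: float = 6.0,
                                 lanes: int = 4) -> int:
    """How many drives can stream flat out before the link is full."""
    capacity = link_capacity_mbyte_per_s(lane_gbit, lanes)
    return int(capacity // drive_mbyte_per_s)


if __name__ == "__main__":
    # Assumed: 4 lanes x 6 Gbit/s, i.e. about 2400 Mbyte/s in total.
    print("Mechanical drives:", max_drives_before_saturation(140))
    print("SSDs:", max_drives_before_saturation(500))
```

Under those assumptions, twelve mechanical drives all streaming at once would need about 1680Mbyte/s, which fits inside the link, whereas a handful of SSDs would already fill it.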
Host Bus Adapters (HBAs)
The controller cards necessary to manage a SAS expander's drives fall into two categories: HBA and RAID. An HBA card transparently connects the drives on the SAS expander to the motherboard and OS - it doesn't try to provide any management layer, caching or additional smarts. Conversely, a RAID card undertakes the management of the drives into a RAID array - this is 'hardware RAID' - and the motherboard and OS often see only one 'logical' drive, or several, depending on how many logical drives have been set up in the RAID configuration. The RAID card manages all aspects of the RAID array and the OS is simply the 'end user', seeing what the RAID card wants it to.
An HBA or RAID card has operating firmware and a separate firmware BIOS (often referred to as the 'SAS BIOS') that can be accessed during startup (just like the motherboard BIOS - it's easy to get confused!). The SAS BIOS can be used to set up things like boot devices (for HBA) or RAID configuration (for RAID cards).
For many models of card, the operating firmware and the BIOS can be flashed with different versions of the firmware that convert the behaviour to HBA or RAID card. However a card designed to work in one mode may not be as reliable in the other.
SAS Card Compatibility with TrueNAS
The HBA/RAID issue is a central one in the TrueNAS forums. ZFS and hence TrueNAS are designed to perform with direct and full access to the disk hardware through an HBA: TrueNAS is 100% software RAID.
This is at odds with hardware RAID - it is definitely not recommended to use ZFS on top of hardware RAID:
Likewise, you can use a RAID card in JBOD mode and switch off as much RAID functionality as possible, but there will still not be direct access to the individual disks, and this is going to be a red flag.
But remember that a lot of RAID cards can be re-flashed into HBA mode. How about that option?
TrueNAS and ZFS can drive the hardware extremely hard during data rebuilds, and this is likely over time to expose any weaknesses in the controller card.
Here's some of the debate:
The TL;DR of all this is: if you don't want to roll the dice in regard to your data, buy and use a TrueNAS recommended HBA card to replace any RAID card you might have.
The LSI 9211-8i (PCIe 2.0 6Gbit/s), LSI 9207-8i (PCIe 3.0 6Gbit/s) and LSI 9300-8i (PCIe 3.0 12Gbit/s) appear to be the gold standard and are available quite cheaply online.
The post states that 'the LSI 9240-8i, IBM ServeRAID M1015, Dell PERC H200 and H310, and others are readily available on the used market and can be converted to LSI 9211-8i equivalents.'
My server contained a RAID card (the HP SmartArray P212) so I ordered an LSI 9211-8i HBA card second hand online for around $60US.
Anatomy of the Server
First, let's have a quick look at the anatomy of the server in light of the above discussion.
This is a top view of the server. The top area of the picture, inside the green rectangle, is the SAS Expander - an enclosure where the 12 SATA drives go (these are 2TB 7200RPM drives). If you look along the bottom edge of the drives, you can see the edge of a circuit board running along the entire length of the expander. The chassis and circuit board are basically a drop-in unit. They attach to the power supply, all the SATA drives plug directly in to the circuit board, and the whole thing plugs into the rest of the server via a single SAS cable.
There's a photo from the front of the server showing the drives, following the below photo.
The next block down, inside the aqua rectangle, contains eight fans - of no configuration consequence, but they are very loud on startup.
Inside the purple rectangle is the area for two processors and RAM for each (only one is installed). There is a near invisible clear plastic air-directing cover over this area, to which I've taped a couple of screws during disassembly.
The metal box inside the red rectangle is a PCI extender, containing a SAS controller card and a matched pair of hard drives for use as mirrored system drives for the server (these are also attached by cable to the SAS Expander backplane). The third photo contains detail of what's inside.
Below is the server with the PCI extender box removed. It has been flipped over 180 degrees: when fitted, the PCI connectors, seen from the top in the green rectangle, fit downwards into the two black PCI slots towards the top of the image.
The LSI 9211-8i, in the tan rectangle, is shown fitted to the PCI extender slot. Note that the only connection to it is the SAS cable from the SAS expander, which is the long cable with the black braided cover. Below, on the table and in the red rectangle, is the removed HP SmartArray P212, with its memory module and battery (some RAID cards have battery backed RAM to preserve the integrity of their write cache in the event of power failure).
The dual system disks enclosure (aqua rectangle) can be seen poking out from underneath the LSI HBA card. It was tempting to try to remove these from the SAS Expander and plug them directly into two of the six vacant SATA ports on the motherboard, but the enclosure had a small backplane through which power was delivered and I was unsure as to what other smarts might be involved. Rather than reroute power and possibly open up a can of worms in regard to the SATA interfaces, I just left these disks alone.
Setting up TrueNAS
After fitting the LSI 9211-8i HBA card and reassembling the PCI extender chassis, I proceeded to install TrueNAS by creating a bootable USB with the latest version as instructed on the TrueNAS site.
The install went smoothly, all drives were detected, and I was able to mark both the system drives for install, ending up with a mirrored system disk configuration.
On rebooting, however, I found that the server would cycle through all the boot options and end up cycling at network boot, which from experience is where the boot cycle goes to die. I checked the motherboard BIOS and it was set to boot from the HBA card, but wasn't detecting anything bootable.
On googling this, it became clear that the problem was that an unconfigured HBA would just try to boot from the first two available drives, which were very likely to be the data drives. It was necessary to boot into the SAS BIOS and configure the boot order.
Configuring Boot Order in the SAS BIOS
At this point, I did not know what firmware version my HBA card was running and had not fired up the SAS BIOS at all. In hindsight, it would have been good to check this before commencing any operations involving the SAS expander (such as the TrueNAS installation!).
As it happened, there was a problem with the SAS BIOS on the card which prevented me from booting into the BIOS to make these configuration changes, but it appears this problem is mainly specific to HP hardware, so for now I'm going to pretend I didn't have this problem and go ahead with the boot configuration as it should have happened (and did happen once the issue was fixed). I'll return to the other problem, which required removing the HBA card and re-flashing it in another computer, in the next section.
Booting the server takes a while, and eventually the screen displays something like 'hit any key for Option ROM'. At this point, there is no message telling you what keys to hit, but you need to hit Ctrl+C to boot into the SAS BIOS. After a pause, there is a message about the LSI configuration tool, and after a few more keystrokes you are in the SAS BIOS screen.
Once in there, you'll see a single line for the SAS expander, and it's necessary to hit enter a few times to expand the disk tree (there are a few useful YouTube videos covering this whole process). Then you'll see the below.
Bays 12 and 13 here are the system disks (the highlighting obscures the details of the bottom one) and we need to mark them as boot and alternate boot using Alt+B and Alt+A. Hitting Alt+M displays a handy instructional screen showing all the special key codes.
Presumably the motherboard was previously trying to boot from Bay 0, which explains the lack of success.
After saving the config, the server booted straight into TrueNAS and after a bit of further configuration, we were up and running.
Problem with BIOS on HP Hardware and Flashing the HBA Card
Fatal pci express device error B00/D09/F00 E0
| FW | BIOS | DL380 G7 | DL380 G6 | 
| P19 | P19 | works (old) | works (old) | 
| P20/< .07 | P19 | data corrupted(!!AVOID!!) | data corrupted(!!AVOID!!) | 
| P20/< .07 | P20 | data corrupted/DEATH on CONFIG2(!!AVOID!!) | data corrupted/DEATH on CONFIG(!!AVOID!!) | 
| P20/.07 | P20 | works/DEATH on CONFIG2(AVOID!) | works/DEATH on CONFIG (AVOID!) | 
| P20/.07 | P19 | works (THIS!) | works (THIS!) | 
P20/< .07 means all 20.00.XX.00 versions of the firmware earlier than 20.00.07.00.
BIOS versions follow a different numbering scheme, with P19 = 7.37.00.00 and P20 = 7.39.02.00 (my numbers, there might be others)
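For anyone who prefers to check a card against the matrix programmatically, here is a small Python sketch that encodes the table as a lookup. The outcome strings are copied from the DL380 G7 column, and the BIOS version numbers are the two listed above. The function names are my own, and treating any 19.x.x.x firmware as 'P19' is an assumption on my part; versions the table does not mention are reported as not covered rather than guessed at.

```python
from typing import Optional

# BIOS versions named in the text, mapped to their matrix labels.
BIOS_LABELS = {"7.37.00.00": "P19", "7.39.02.00": "P20"}

# (firmware label, BIOS label) -> outcome, from the DL380 G7 column.
MATRIX = {
    ("P19", "P19"): "works (old)",
    ("P20/<.07", "P19"): "data corrupted (!!AVOID!!)",
    ("P20/<.07", "P20"): "data corrupted/DEATH on CONFIG2 (!!AVOID!!)",
    ("P20/.07", "P20"): "works/DEATH on CONFIG2 (AVOID!)",
    ("P20/.07", "P19"): "works (THIS!)",
}


def firmware_label(version: str) -> Optional[str]:
    """Map a firmware version such as '20.00.07.00' to a matrix row label."""
    parts = tuple(int(p) for p in version.split("."))
    if len(parts) != 4:
        return None
    if parts[0] == 19:  # assumption: every 19.x.x.x release counts as P19
        return "P19"
    if parts[0] == 20:
        if parts < (20, 0, 7, 0):
            return "P20/<.07"
        if parts == (20, 0, 7, 0):
            return "P20/.07"
    return None


def lookup(fw_version: str, bios_version: str) -> str:
    """Return the matrix outcome for a firmware/BIOS pair."""
    key = (firmware_label(fw_version), BIOS_LABELS.get(bios_version))
    return MATRIX.get(key, "not covered by the matrix")


if __name__ == "__main__":
    print(lookup("20.00.07.00", "7.39.02.00"))  # firmware and BIOS both at P20
    print(lookup("20.00.07.00", "7.37.00.00"))  # P20/.07 firmware with P19 BIOS
```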
But first, the challenge of being able to run the flash tool. I tried creating an MSDOS boot USB and loaded the DOS version of the flash tool onto it, but as suspected, the server hardware would not recognise this.
At this point, it was really not possible to use the server to boot into the flash tool without installing Windows on it. My two options were to find some really old hardware that would allow DOS boot, or to find a Windows machine that I could fit the HBA card to in order to flash it.
Luckily, I have a modern Windows PC as a spare that I sometimes use for development and gaming. Fitting the HBA card to it was easy (there's no need to attach drives to the HBA card) and I was able to boot into Windows normally.
I copied the three files above into a temporary folder, opened up a CMD window, and ran the flash tool. Using the -listall switch I was able to see immediately that (referring to the firmware matrix in the previously mentioned post) both the firmware and BIOS were at v20.
At this point, I went back to the Broadcom site and downloaded the P19 version of the firmware and BIOS package. I then replaced just the .rom file (from the 'sasbios_rel' folder) in my temporary folder with the P19 version, and ran the update as below. I also ran an additional command, not listed, to delete the firmware first, but it reported errors that seemed to indicate that it was no longer necessary to run this command in the Windows versions. I would nonetheless follow the instructions on the Broadcom site here.
This appeared to have worked as desired. After this, I removed the HBA card from my Windows box, reinserted it into the server, and was then able to boot into the SAS BIOS normally and make the configuration changes as outlined previously.
That's about it for this post. Before I go, I'll include one last useful link of informational videos from the TrueNAS forums. Good background:
https://www.truenas.com/community/resources/informational-videos-mostly-about-sas-hardware.105/
 