Monday, December 6, 2010

In-the-Lab: Default Rights on CIFS Shares

Following up on the last installment on managing CIFS shares, a considerable number of questions have come in about how to establish domain user rights on the share. From these questions it is apparent that my explanation of root-level share permissions could have been clearer. To that end, I want to look at default shares from a Windows SBS Server 2008 environment and translate those settings to a working NexentaStor CIFS share deployment.

Evaluating Default Shares


In SBS Server 2008, a number of default shares are promulgated from the SBS Server. Excluding the "hidden" shares, these include:
  • Address
  • ExchangeOAB
  • NETLOGON
  • Public
  • RedirectedFolders
  • SYSVOL
  • UserShares
  • Printers

Therefore, it follows that a useful exercise in rights deployment might be to recreate a couple of these shares on a NexentaStor system and detail the methodology. I have chosen the NETLOGON and SYSVOL shares as these two represent default shares common to all Windows server environments. Here are their respective permissions:

NETLOGON


From the Windows file browser, the NETLOGON share has default permissions that look like this:

NETLOGON Share permissions

Looking at this same permission set from the command line (ICACLS.EXE), the permissions look like this:

NETLOGON permissions as reported from icacls
The key thing to observe here is the use of Windows built-in users and NT Authority accounts. Also, it is noteworthy that some administrative privileges differ depending on inheritance. For instance, the Administrators group's rights are less than "Full" on the share itself; however, they are "Full" when inherited by sub-directories and files, whereas SYSTEM's permissions are "Full" in both contexts.
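For reference, this view can be reproduced from any domain member by pointing ICACLS.EXE at the share's UNC path - the server name below is hypothetical:

icacls \\SBSSERVER\NETLOGON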

SYSVOL


From the Windows file browser, the SYSVOL share has default permissions that look like this:

SYSVOL network share permissions

Looking at this same permission set from the command line (ICACLS.EXE), the permissions look like this:

SYSVOL permissions from ICACLS.EXE
Note that the Administrators group's privileges are truncated (not "Full") with respect to the rights inherited by sub-directories and files when compared to the NETLOGON share ACL.

Create CIFS Shares in NexentaStor


On a ZFS pool, create a new folder using the Web GUI (NMV) that will represent the SYSVOL share. This will look something like the following:
Creating the SYSVOL share
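For command-line die-hards, a minimal "root shell" sketch of the equivalent folder creation follows - the pool name "tank" is only an example, roughly the same options appear in the NMV folder-creation form, and root-shell changes may be unsupported on the appliance:

# zfs create -o casesensitivity=mixed -o nbmand=on tank/sysvol
# zfs set sharesmb=name=sysvol tank/sysvol

Note that casesensitivity can only be set at creation time, which is why it appears on the create line rather than being set afterward.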

Tuesday, November 16, 2010

Short-Take: New Oracle/Sun ZFS Goodies

I wanted to pass on some information posted by Joerg Moellenkamp at c0t0d0s0.org - some good news for Sun/ZFS users out there about Solaris Express 2010.11 availability, links to details on ZFS encryption features in Solaris 11 Express and clarification on "production use" guidelines. Here are the pull quotes from his posting:
"Darren (Moffat) wrote three really interesting articles about ZFS encryption: The first one is Introducing ZFS Crypto in Oracle Solaris 11 Express. This blog entry gives you a first overview how to use encryption for ZFS datasets. The second one..."

-  Darren Moffat about ZFS encryption, c0t0d0s0.org, 11-16-2010


"There is a long section in the FAQ about licensing and production use: The OTN license just covers development, demo and testing use (Question 14) . However you can use Solaris 11 Express on your production system as well..."

- Solaris 11 Express for production use, c0t0d0s0.org, 11-16-2010


"A lot of changes found their way into the newest release of Solaris, the first release of Oracle Solaris Express 2010.11. The changes are summarized in a lengthy document, however..."

- What's new for the Administrator in Oracle Solaris Express 2010.11, c0t0d0s0.org, 11-15-2010



Follow the links to Joerg's blog for more details and links back to the source articles. Cheers!

In-the-Lab: NexentaStor vs. Grub

In this In-the-Lab segment we're going to look at how to recover from a failed ZFS version update in case you've become ambitious with your NexentaStor installation after the last Short-Take on ZFS/ZPOOL versions. If you used the "root shell" to make those changes, chances are your grub is failing after reboot. If so, this blog can help, but before you read on, observe this necessary disclaimer:
NexentaStor is an appliance operating system, not a general purpose one. The accepted way to manage the system volume is through the NMC shell and NMV web interface. Using a "root shell" to configure the file system(s) is unsupported and may void your support agreement(s) and/or license(s).

That said, let's assume that you updated the syspool filesystem and zpool to the latest versions using the "root shell" instead of the NMC (i.e. following a system update where zfs and zpool warnings declare that your pool and filesystems are too old, etc.) In such a case, the resulting syspool will not be bootable until you update grub (this happens automagically when you use the NMC commands.) When this happens, you're greeted with the following boot prompt:
grub>

Grub is now telling you that it has no idea how to boot your NexentaStor OS. Chances are there are two things that will need to happen before your system boots again:

  1. Your boot archive will need updating, pointing to the latest checkpoint;

  2. Your master boot record (MBR) will need to have grub installed again.


We'll update both in the same recovery session to save time (this assumes you know or have a rough idea about your intended boot checkpoint - it is usually the highest numbered rootfs-nmu-NNN checkpoint, where NNN is a three digit number.) The first step is to load the recovery console. This could have been done from the "Safe Mode" boot menu option if grub was still active. However, since grub is blown-away, we'll boot from the latest NexentaStor CD and select the recovery option from the menu.

Import the syspool


Then, we log in as "root" (empty password). From this "root shell" we can import the existing (disks connected to active controllers) syspool with the following command:
# zpool import -f syspool

Note the use of the "-f" flag to force the import of the pool. Chances are, the pool will not have been "destroyed" or "exported", so zpool will "think" the pool belongs to another system (your boot system, not the rescue system). As a precaution, zpool assumes that the pool is still "in use" by the "other system" and the import is rejected to avoid "importing an imported pool", which would be completely catastrophic.

With the syspool imported, we need to mount the correct (latest) checkpointed filesystem as our boot reference for grub, destroy the local zpool.cache file (in case the pool disks have been moved but are still all there), update the boot archive to correspond to the mounted checkpoint and install grub to the disk(s) in the pool (i.e. each mirror member).

List the Checkpoints


# zfs list -r syspool

From the resulting list, we'll pick our highest-numbered checkpoint; for the sake of this article let's say it's "rootfs-nmu-013" and mount it.

Mount the Checkpoint


# mkdir /tmp/syspool
# mount -F zfs syspool/rootfs-nmu-013 /tmp/syspool

Remove the ZPool Cache File


# cd /tmp/syspool/etc/zfs
# rm -f zpool.cache

Update the Boot Archive


# bootadm update-archive -R /tmp/syspool

Determine the Active Disks


# zpool status syspool

For the sake of this article, let's say the syspool was a three-way mirror and the zpool status returned the following:
  pool: syspool
 state: ONLINE
 scan: resilvered 8.64M in 0h0m with 0 errors on Tue Nov 16 12:34:40 2010
config:

        NAME           STATE     READ WRITE CKSUM
        syspool        ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            c6t13d0s0  ONLINE       0     0     0
            c6t14d0s0  ONLINE       0     0     0
            c6t15d0s0  ONLINE       0     0     0

errors: No known data errors

This enumerates the three disk mirror as being composed of disks/slices c6t13d0s0, c6t14d0s0 and c6t15d0s0. We'll use that information for the grub installation.

Install Grub to Each Mirror Disk


# cd /tmp/syspool/boot/grub
# installgrub -f -m stage1 stage2 /dev/rdsk/c6t13d0s0
# installgrub -f -m stage1 stage2 /dev/rdsk/c6t14d0s0
# installgrub -f -m stage1 stage2 /dev/rdsk/c6t15d0s0

Unmount and Reboot


# umount /tmp/syspool
# sync
# reboot

Now, the system should be restored to a bootable configuration based on the selected system checkpoint. A similar procedure can be found on Nexenta's site when using the "Safe Mode" boot option. If you follow that process, you'll quickly encounter an error - likely intentional and meant to elicit a call to support for help. See if you can spot the step...

Monday, November 15, 2010

Short-Take: ZFS and ZPOOL Versions

As features are added to ZFS, the ZFS (filesystem) code may change and/or the underlying ZFS pool code may change. When features are added, older versions of ZFS/ZPOOL will not be able to take advantage of these new features without the ZFS filesystem and/or pool being updated first.

Since ZFS filesystems exist inside of ZFS pools, the ZFS pool may need to be upgraded before a ZFS filesystem upgrade may take place. For instance, in ZFS pool version 24, support for system attributes was added to ZFS. To allow ZFS filesystems to take advantage of these new attributes, ZFS filesystem version 5 (or higher) is required. The proper order to upgrade would be to bring the ZFS pool up to at least version 24, and then upgrade the ZFS filesystem(s) as needed.

Systems running a newer version of ZFS (pool or filesystem) may "understand" an earlier version. However, older versions of ZFS will not be able to access ZFS streams from newer versions of ZFS.
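To see where a given pool and filesystem currently stand before planning an upgrade, the "version" property can be queried directly (the pool name "syspool" below is just an example):

zpool get version syspool
zfs get version syspool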

For NexentaStor users, here are the current versions of the ZFS filesystem (see "zfs upgrade -v"):
VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS filesystem version
 2   Enhanced directory entries
 3   Case insensitive and File system unique identifier (FUID)
 4   userquota, groupquota properties
 5   System attributes

For NexentaStor users, here are the current versions of the ZFS pool (see "zpool upgrade -v"):
VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history
 5   Compression using the gzip algorithm
 6   bootfs pool property
 7   Separate intent log devices
 8   Delegated administration
 9   refquota and refreservation properties
 10  Cache devices
 11  Improved scrub performance
 12  Snapshot properties
 13  snapused property
 14  passthrough-x aclinherit
 15  user/group space accounting
 16  stmf property support
 17  Triple-parity RAID-Z
 18  Snapshot user holds
 19  Log device removal
 20  Compression using zle (zero-length encoding)
 21  Deduplication
 22  Received properties
 23  Slim ZIL
 24  System attributes
 25  Improved scrub stats
 26  Improved snapshot deletion performance

As versions change, upgrading the ZFS pool and filesystem is possible using the respective upgrade command. To upgrade all imported ZFS pools, issue the following command as root:
zpool upgrade -a

Likewise, to upgrade the ZFS filesystem(s) inside the pool and all child filesystems, issue the following command as root:
zfs upgrade -r -a
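Run without arguments, the upgrade commands make no changes and simply report which pools or filesystems are still on older versions - a safe way to survey a system before committing:

zpool upgrade
zfs upgrade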

The new ZFS features available to these pool and filesystem version(s) will now be available to the upgraded pools/filesystems.

Friday, November 12, 2010

Quick-Take: Is Your Marriage a Happy One?

I came across a recent post by Chad Sakac (VP, VMware Alliance at EMC) discussing the issue of how vendors drive customer specifications down from broader goals to individual features or implementation sets (I'm sure VCE was not in mind at the time.) When it comes to vendors insisting on framing the "client argument" in terms of specific features and proprietary approaches, I have to agree that Chad is spot on. Here's why:

First, it helps when vendors move beyond the "simple thinking" of infrastructure elements as a grid of point solutions and toward an "organic marriage of tools" - often with overlapping qualities. Some marriages begin with specific goals, some develop them along the way and others change course drastically and without much warning. The rigidness of point approaches rarely accommodates growth beyond the set of assumptions that created it in the first place. Likewise, the "laser focus" on specific features detracts from the overall goal: the present and future value of the solution.

When I married my wife, we both knew we wanted kids. Some of our friends married and "never" wanted kids, only to discover a child on the way and subsequent fulfillment through raising them. Still, others saw a bright future strained with incompatibility and the inevitable divorce. Such is the way with marriages.

Second, it takes vision to solve complex problems. Our church (Church of the Highlands in Birmingham, Alabama) takes a very cautious position on the union between souls: requiring that each new couple seeking a marriage give it the due consideration and compatibility testing necessary to have a real chance at a successful outcome. A lot of "problems" we would encounter were identified before we were married, and when they finally popped-up we knew how to identify and deal with them properly.

Couples that see "counseling" as too obtrusive (or unnecessary) have other options. While the initial investment of money is often equivalent, the return on investment is not so certain. Uncovering incompatibilities "after the sale" makes for a difficult and too often doomed outcome (hence, the 50% divorce rate.)

This same drama plays out in IT infrastructures where equally elaborate plans, goals and unexpected changes abound. You date (prospecting and trials), you marry (close) and are either fruitful (happy client), disappointed (unfulfilled promises) or divorce. Often, it's not the plan that failed but the failure to set/manage expectations and address problems that causes the split.

Our pastor could not promise that our marriage would last forever: our success is left to God and the two of us. But he did help us to make decisions that would give us a chance at a fruitful union. Likewise, no vendor can promise a flawless outcome (if they do, get a second opinion), but they can (and should) provide the necessary foundation for a successful marriage of the technology to the business problem.

Third, the value of good advice is not always obvious and never comes without risk. My wife and I were somewhat hesitant on counseling before marriage because we were "in love" and were happy to be blind to the "problems" we might face. Our church made it easy for us: no counseling, no marriage. Businesses can choose to plot a similar course for their clients with respect to their products (especially the complex ones): discuss the potential problems with the solution BEFORE the sale or there is no sale. Sometimes this takes a lot of guts - especially when the competition takes the route of oversimplification. Too often IT sales see identifying initial problems (with their own approach) as too high a risk and too great an obstacle to the sale.

Ultimately, when you give due consideration to the needs of the marriage, you have more options and are better equipped to handle the inevitable trials you will face. Whether it's an unexpected child on the way, or an unexpected up-tick in storage growth, having the tools in-hand to deal with the problem lessens its severity. The point is, being prepared is better than the assumption of perfection.

Finally, the focus has to be what YOUR SOLUTION can bring to the table: not how you think your competition will come-up short. In Chad's story, he's identified vendors disqualifying one another's solutions based on their (institutional) belief (or disbelief) in a particular feature or value proposition. That's all hollow marketing and puffery to me, and I agree completely with his conclusion: vendors need to concentrate on how their solution(s) provide present and future value to the customer and refrain from the "art" of narrowly framing their competitors.

Features don't solve problems: the people using them do. The presence (or absence) of a feature simply changes the approach (i.e. the fallacy of feature parity). As Chad said, it's the TOTALITY of the approach that derives value - and that goes way beyond individual features and products. It's clear to me that a lot of counseling takes place between Sakac's EMC team and their clients to reach those results. Great job, Chad, you've set a great example for your team!

Monday, November 8, 2010

Short-Take: vSphere Multi-core Virtual Machines

Virtual machines were once relegated to a second-class status of single-core vCPU configurations. To get multiple process threads, you had to add one "virtual CPU" for each thread. This approach, while functional, had potentially serious software licensing ramifications. This topic drew some attention on Jason Boche's blog back in July, 2010 with respect to vSphere 4.1.

With vSphere 4 Update 2 and vSphere 4.1 you have the option of using an advanced configuration setting to change the "virtual cores per socket", allowing thread-count needs to have less impact on OS and application licensing. The advanced configuration parameter name is "cpuid.coresPerSocket" (default 1) and acts as a divisor for the virtual hardware setting "CPUs", which must be an integral multiple of the "cpuid.coresPerSocket" value. More on the specifics and limitations of this setting can be found in "Chapter 7, Configuring Virtual Machines" (page 79) of the vSphere Virtual Machine Administrator Guide for vSphere 4.1. [Note: See also VMware KB1010184.]
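As a concrete (hypothetical) example, a VM meant to present two virtual sockets with three cores each would carry entries like the following in its .vmx file - the same values appear in the DICT lines of the VM log excerpts below:

numvcpus = "6"
cpuid.coresPerSocket = "3"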

The value of "cpuid.coresPerSocket" is effectively ignored when "CPUs" is set to 1. In case "cpuid.coresPerSocket" is an imperfect divisor, the power-on operation will fail with the following message in the VI Client's task history:

[caption id="attachment_1726" align="aligncenter" width="328" caption="Virtual core count is imperfect divisor of CPUs"]Power-on-Fail[/caption]

If virtual machine logging is enabled, the following messages (only relevant items listed) will appear in the VM's log (Note: CPUs = 3, cpuid.coresPerSocket = 2):
Nov 08 14:17:43.676: vmx| DICT         virtualHW.version = 7
Nov 08 14:17:43.677: vmx| DICT                  numvcpus = 3
Nov 08 14:17:43.677: vmx| DICT      cpuid.coresPerSocket = 2
Nov 08 14:17:43.727: vmx| VMMon_ConfigMemSched: vmmon.numVCPUs=3
Nov 08 14:17:43.799: vmx| NumVCPUs 3
Nov 08 14:17:44.008: vmx| Msg_Post: Error
Nov 08 14:17:44.008: vmx| [msg.cpuid.asymmetricalCores] The number of VCPUs is not a multiple of the number of cores per socket of your VM, so it cannot be powered on.----------------------------------------
Nov 08 14:17:44.033: vmx| Module CPUID power on failed.

While the configuration guide clearly states (as Jason Boche rightly pointed out in his blog):
The number of virtual CPUs must be divisible by the number of cores per socket. The coresPerSocket setting must be a power of two.

- Virtual Machine Configuration Guide, vSphere 4.1



We've found that "cpuid.coresPerSocket" simply needs to be a perfect divisor of the "CPUs" value. This tracks much better with prior versions of vSphere where "odd numbered" socket/CPU counts were allowed; therefore, odd numbers of cores per socket are allowed, provided the division of CPUs by coresPerSocket is integral. Suffice to say, if the manual says "power of two" (1, 2, 4, 8, etc.) then those are likely the only "supported" configurations available. Any other configuration that "works" (i.e. 3, 5, 6, 7, etc.) will likely be unsupported by VMware in the event of a problem.

That said, odd values of "cpuid.coresPerSocket" do work just fine. Since SOLORI has a large number of AMD-only eco-systems, it is useful to test configurations that match the physical core count of the underlying processors (i.e. 2, 3, 4, 6, 8, 12). For instance, we were able to create a single, multi-core virtual CPU with 3-cores (CPUs = 3, cpuid.coresPerSocket = 3) and run Windows Server 2003 without incident:

[caption id="attachment_1727" align="aligncenter" width="403" caption="Windows Server 2003 with virtual "tri-core" CPU"]Virtual Tri-core CPU[/caption]

It follows, then, that we were likewise able to run a 2P virtual machine with a total of 6-cores (3-per CPU) running the same installation of Windows Server 2003 (CPUs = 6, cpuid.coresPerSocket = 3):

[caption id="attachment_1728" align="aligncenter" width="403" caption="Virtual Dual-processor (2P), Tri-core (six cores total)"]2P Virtual Tri-core[/caption]

Here are the relevant vmware log messages associated with this 2P, six total core virtual machine boot-up:
Nov 08 14:54:21.892: vmx| DICT         virtualHW.version = 7
Nov 08 14:54:21.893: vmx| DICT                  numvcpus = 6
Nov 08 14:54:21.893: vmx| DICT      cpuid.coresPerSocket = 3
Nov 08 14:54:21.944: vmx| VMMon_ConfigMemSched: vmmon.numVCPUs=6
Nov 08 14:54:22.009: vmx| NumVCPUs 6
Nov 08 14:54:22.278: vmx| VMX_PowerOn: ModuleTable_PowerOn = 1
Nov 08 14:54:22.279: vcpu-0| VMMon_Start: vcpu-0: worldID=530748
Nov 08 14:54:22.456: vcpu-1| VMMon_Start: vcpu-1: worldID=530749
Nov 08 14:54:22.487: vcpu-2| VMMon_Start: vcpu-2: worldID=530750
Nov 08 14:54:22.489: vcpu-3| VMMon_Start: vcpu-3: worldID=530751
Nov 08 14:54:22.489: vcpu-4| VMMon_Start: vcpu-4: worldID=530752
Nov 08 14:54:22.491: vcpu-5| VMMon_Start: vcpu-5: worldID=530753

It's clear from the log that each virtual core spawns a new virtual machine monitor thread within the VMware kernel. Confirming the distribution of cores from the OS perspective is somewhat nebulous due to the mismatch of the CPU's ID (follows the physical CPU on the ESX host) and the "arbitrary" configuration set through the VI Client. CPU-z shows how this can be confusing:

[caption id="attachment_1729" align="aligncenter" width="407" caption="CPU#1 as described by CPU-z"]CPU-z output, 1 of 2[/caption]

[caption id="attachment_1730" align="aligncenter" width="406" caption="CPU#2 as described by CPU-z"]CPU-z CPU 2 of 2[/caption]

Note that CPU-z identifies the first 4-cores with what it calls "Processor #1" and the remaining 2-cores with "Processor #2" - this appears arbitrary due to CPU-z's "knowledge" of the physical CPU layout. In (virtual) reality, this assessment by CPU-z is incorrect in terms of cores per CPU; however, it does properly demonstrate the existence of two (virtual) CPUs. Here's the same VM with a "cpuid.coresPerSocket" of 6 (again, not 1, 2, 4 or 8 as supported):

[caption id="attachment_1731" align="aligncenter" width="405" caption="CPU-z demonstrating a 1P, six-core virtual CPU"]Single 6-core (virtual) CPU[/caption]

Note again that CPU-z correctly identifies the underlying physical CPU as an Opteron 2376 (2.3GHz quad-core) but shows 6-cores, 6-threads as configured through VMware. Note also that the "grayed-out" selection for "Processor #1" demonstrates that a single processor is enumerated in virtual hardware. [Note: VMware's KB1030067 demonstrates accepted ways of verifying cores per socket in a VM.]
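If CPU-z isn't handy, a rough way to sanity-check what the guest enumerates is WMI from a command prompt inside the VM - a sketch only, and note that older guests such as Windows Server 2003 may need a hotfix before the core-count properties are populated (KB1030067 remains the officially accepted method):

wmic cpu get DeviceID,NumberOfCores,NumberOfLogicalProcessors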

How does this help with per-CPU licensing in a virtual world? It effectively evens the playing field between physical and virtual configurations. In the past (VI3 and early vSphere 4) multiple virtual threads were only possible through the use of additional virtual sockets. This paradigm did not track with OS licensing and CPU-socket-aware application licensing since the OS/applications would recognize the additional threads as CPU sockets in excess of the license count.

With virtual cores, the underlying CPU configuration (2P, 12 total cores, etc.) can be emulated to the virtual machine layer and deliver thread-count parity to the virtual machine. Since most per-CPU licenses speak to the physical hardware layer, this allows for parity between the ESX host CPU count and the virtual machine CPU count, regardless of the number of physical cores.

Also, in NUMA systems where core/socket/memory affinity is a potential performance issue, addressing physical/virtual parity is potentially important. This could have performance implications for AMD 2400/6100 and Intel 5600 systems where 6 and 12 cores/threads are delivered per physical CPU socket.

Thursday, September 30, 2010

In-the-Lab: Windows Server 2008 R2 Template for VMware

As it turns out, the reasonably simple act of cloning a Windows Server 2008 R2 (insert edition here) installation has been complicated by the number of editions, changes from the 2008 release through 2008 R2, as well as user profile management changes since its release. If you're like me, you like to tweak your templates to limit customization steps in post-deployment. While most of these customizations can now be set up in group policies from AD, the deployment of non-AD members has become a lot more difficult - especially where custom defaults are needed or required.

Here's my quick recipe to build a custom image of Windows Server 2008 R2 that has been tested with Standard, Enterprise and Foundation editions.

Create VM, use VMXNET3 as NIC(s), 40GB "thin" disk, using 2008 R2 Wizard


This is a somewhat "mix to taste" step. We use ISO images and encourage their use. The size of the OS volume will end up being somewhere around 8GB of actual space-on-disk after this step, making 40GB sound like overkill. However, the OS volume will bloat up to 18-20GB pretty quickly after updates, roles and feature additions. Adding application(s) will quickly chew up the rest.

  • Edit Settings... ->

    • Options -> Advanced -> General -> Uncheck "Enable logging"

    • Hardware -> CD/DVD Drive 1 ->

      • Click "Datastore ISO File"

        • Browse to Windows 2008 R2 ISO image



      • Check "Connect at power on"



    • Options -> Advanced -> Boot Options -> Force BIOS Setup

      • Check "The next time the virtual machine boots, force entry into the BIOS setup screen"





  • Power on VM

  • Install Windows Server 2008 R2


Use Custom VMware Tools installation to disable "Shared Folders" feature:


It is important that VMware Tools be installed next, if for no other reason than to make the rest of the process quicker and easier. The additional step of disabling "Shared Folders" is for ESX/vSphere environments where shared folders are not supported. Since this option is installed by default, it can/should be removed in vSphere installations.

  • VM -> Guest -> Install VMware Tools ->

    • Custom -> VMware Device Drivers -> Disable "Shared Folder" feature



  • Restart


Complete Initial Configuration Tasks:


Once the initial installation is complete, we need to complete the 2008 R2 basic configuration. If you are working in an AD environment, this is not the time to join the template to the domain as GPO conflicts may hinder manual template defaults. We've chosen a minimal package installation based on our typical deployment profile. Some features/roles may differ in your organization's template (mix to taste).

  • Set time zone -> Date and Time ->

    • Internet Time -> Change Settings... -> Set to local time source

    • Date and Time -> Change time zone... -> Set to local time zone



  • Provide computer name and domain -> Computer name ->

    • Enterprise Edition: W2K8R2ENT-TMPL

    • Standard Edition: W2K8R2STD-TMPL

    • Foundation Edition: W2K8R2FND-TMPL

    • Note: Don't join to a domain just yet...



  • Restart Later

  • Configure Networking

    • Disable QoS Packet Scheduler



  • Enable automatic updating and feedback

    • Manually configure settings

      • Windows automatic updating -> Change Setting... ->

        • Important updates -> "check for updates but let me choose whether to download and install them"

        • Recommended updates -> Check "Give me recommended updates the same way I receive important updates"

        • Who can install updates -> Uncheck "Allow all users to install updates on this computer"



      • Windows Error Reporting -> Change Setting... ->

        • Select "I don't want to participate, and don't ask me again"



      • Customer Experience Improvement Program -> Change Setting... ->

        • Select "No, I don't want to participate"







  • Download and install updates

    • Bring to current (may require several reboots)



  • Add features (to taste)

    • .NET Framework 3.5.1 Features

      • Check WCF Activation, Non-HTTP Activation

        • Pop-up: Click "Add Required Features"





    • SNMP Services

    • Telnet Client

    • TFTP Client

    • Windows PowerShell Integrated Scripting Environment (ISE)



  • Check for updates after new features

    • Install available updates



  • Enable Remote Desktop

    • System Properties -> Remote

      • Windows 2003 AD

        • Select "Allow connection sfrom computers running any version of Remote Desktop"



      • Windows 2008 AD (optional)

        • Select "Allow connections only from computers runnign Remote Desktop with Network Level Authentication"







  • Windows Firewall

    • Turn Windows Firewall on or off

      • Home or work location settings

        • Turn off Windows Firewall



      • Public network location settings

        • Turn off Windows Firewall







  • Complete Initial Configuration Tasks

    • Check "Do not show this window at logon" and close




Modify and Silence Server Manager


(Optional) Parts of this step may violate your local security policies, however, it's more than likely that a GPO will ultimately override this configuration. We find it useful to have this disabled for "general purpose" templates - especially in a testing/lab environment where the security measures will be defeated as a matter of practice.

  • Security Information -> Configure IE ESC

    • Select Administrators Off

    • Select Users Off



  • Select "Do not show me this console at logon" and close


Modify Taskbar Properties


Making the taskbar usable for your organization is another matter of taste. We like smaller icons and maximizing desktop utility. We also hate being nagged by the notification area...

  • Right-click Taskbar -> Taskbar and Start Menu Properties ->

    • Taskbar -> Check "Use small icons"

    • Taskbar -> Customize... ->

      • Set all icons to "Only show notifications"

      • Click "Turn system icons on or off"

        • Turn off "Volume"





    • Start Menu -> Customize...

      • Uncheck "Use large icons"






Modify default settings in Control Panel


Some Control Panel changes will help "optimize" the performance of the VM by disabling unnecessary features like screen saver and power management. We like to see our corporate logo on server desktops (regardless of performance implications) so now's the time to make that change as well.

  • Control Panel -> Power Options -> High Performance

    • Change plan settings -> Turn off the display -> Never



  • Control Panel -> Sound ->

    • Pop-up: "Would you like to enable the Windows Audio Service?" - No

    • Sound -> Sounds -> Sound Scheme: No Sounds

    • Uncheck "Play Windows Startup sound"



  • Control Panel -> VMware Tools -> Uncheck "Show VMware Tools in the taskbar"

  • Control Panel -> Display -> Change screen saver -> Screen Saver -> Blank, Wait 10 minutes

  • Change default desktop image (optional)

    • Copy your desktop logo background to a public folder (i.e. "c:\Users\Public\Public Pictures")

    • Control Panel -> Display -> Change desktop background -> Browse...

    • Find picture in browser, Picture position stretch




Disable Swap File


Disabling swap will allow the defragment step to be more efficient and will disable VMware's advanced memory management functions. This is only temporary and we'll be enabling swap right before committing the VM to template.

  • Computer Properties -> Visual Effects -> Adjust for best performance

  • Computer Properties -> Advanced System Settings ->

    • System Properties -> Advanced -> Performance -> Settings... ->

    • Performance Options -> Advanced -> Change...

      • Uncheck "Automatically manage paging file size for all drives"

      • Select "No paging file"

      • Click "Set" to disable swap file






Remove hibernation file and set boot timeout


It has been pointed out that the hibernation and timeout settings will get re-enabled by the sysprep operation. Removing the hibernation file will help with defragmentation now. We'll reinforce these steps in the customization wizard later.

  • cmd: powercfg -h off

  • cmd: bcdedit /timeout 5


Disable indexing on C:


Indexing the OS disk can suck performance and increase disk I/O unnecessarily. Chances are, this template (when cloned) will be heavily cached on your disk array so indexing in the OS will not likely benefit the template. We prefer to disable this feature as a matter of practice.

  • C: -> Properties -> General ->

    • Uncheck "Allow files on this drive to have contents indexed in addition to file properties"

    • Apply -> Apply changes to C:\ only (or files and folders, to taste)




Housekeeping


Time to clean-up and prepare for a streamlined template. The first step is intended to aid the copying of "administrator defaults" to "user defaults." If this does not apply, just defragment.

Remove "Default" user settings:

  • C:\Users -> Folder Options -> View -> Show hidden files...

  • C:\Users\Default -> Delete "NTUser.*" Delete "Music, Pictures, Saved Games, Videos"


Defragment

  • C: -> Properties -> Tools -> Defragment Now...

    • Select "(C:)"

    • Click "Defragment disk"




Copy Administrator settings to "Default" user


The "formal" way of handling this step requires a third-party utility. We're giving credit to Jason Samuel for consolidating other bloggers methods because he was the first to point out the importance of the "unattend.xml" file and it really saved us some time. His blog post also includes a link to an example "unattend.xml" file that can be modified for your specific use, as we have.

  • Jason Samuel points out a way to "easily" copy Administrator settings to defaults, by activating the CopyProfile node in an "unattend.xml" file used by sysprep.

  • Copy your "unattend.xml" file to C:\windows\system32\sysprep

  • Edit unattend.xml for environment and R2 version

    • Update offline image pointer to correspond to your virtual CD

      • E.g. wim:d:... -> wim:f:...



    • Update OS offline image source pointer, valid sources are:

      • Windows Server 2008 R2 SERVERDATACENTER

      • Windows Server 2008 R2 SERVERDATACENTERCORE

      • Windows Server 2008 R2 SERVERENTERPRISE

      • Windows Server 2008 R2 SERVERENTERPRISECORE

      • Windows Server 2008 R2 SERVERSTANDARD

      • Windows Server 2008 R2 SERVERSTANDARDCORE

      • Windows Server 2008 R2 SERVERWEB

      • Windows Server 2008 R2 SERVERWEBCORE

      • Windows Server 2008 R2 SERVERWINFOUNDATION



    • Any additional changes necessary



  • NOTE: now would be a good time to snapshot/backup the VM

  • cmd: cd \windows\system32\sysprep

  • cmd: sysprep /generalize /oobe /reboot /unattend:unattend.xml

    • Check "Generalize"

    • Shutdown Options -> Reboot



  • Login

  • Skip Activation

  • Administrator defaults are now system defaults



  • Reset Template Name

    • Computer Properties -> Advanced System Settings -> Computer name -> Change...

      • Enterprise Edition: W2K8R2ENT-TMPL

      • Standard Edition: W2K8R2STD-TMPL

      • Foundation Edition: W2K8R2FND-TMPL



    • If this will be an AD member clone, join template to the domain now



    • Restart





  • Enable Swap files

    • Computer Properties -> Advanced System Settings ->

      • System Properties -> Advanced -> Performance -> Settings... ->

      • Performance Options -> Advanced -> Change...

        • Check "Automatically manage paging file size for all drives"







  • Release IP

    • cmd: ipconfig /release



  • Shutdown

  • Convert VM to template


Convert VM Template to Clone


Use the VMware Customization Wizard to create a re-usable script for cloning the template. Now's a good time to test that your template will create a usable clone. If it fails, go check the "red letter" items and make sure your setup is correct. The following hints will help improve your results.

  • Remove hibernation related files and reset boot delay to 5 seconds in Customization Wizard


  • Remember that the ISO is still mounted by default. Once VM's are deployed from the template, it should be removed after the customization process is complete and additional roles/features are added.


That's the process we have working at SOLORI. It's not rocket science, but if you miss an important step you're likely to be visited by an error in "pass [specialize]" that will have you starting over. Note: this also happens when your AD credentials are bad, your license key is incorrect (version/edition mismatch, typo, etc.) or other nondescript issues - too bad the error code is unhelpful...

Wednesday, September 29, 2010

Short-Take: Jeff Bonwick Leaves Oracle after Two Decades

Jeff Bonwick's last day at Oracle may be September 30, 2010 after two decades with Sun, but his contributions to ZFS and Solaris will live on through Oracle and open source storage for decades to come. In 2007, Bill Moore, Jeff Bonwick (co-founders of ZFS) and Pawel Jakub Dawidek (ported ZFS to FreeBSD) were interviewed by David Brown for the Association for Computing Machinery and discussed the future of file systems. The discussion gave good insights into the visionary thinking behind ZFS and how the designers set out to solve problems that would plague future storage systems.
One thing that has changed, as Bill already mentioned, is that the error rates have remained constant, yet the amount of data and the I/O bandwidths have gone up tremendously. Back when we added large file support to Solaris 2.6, creating a one-terabyte file was a big deal. It took a week and an awful lot of disks to create this file.

Now for comparison, take a look at, say, Greenplum’s database software, which is based on Solaris and ZFS. Greenplum has created a data-warehousing appliance consisting of a rack of 10 Thumpers (SunFire x4500s). They can scan data at a rate of one terabyte per minute. That’s a whole different deal. Now if you’re getting an uncorrectable error occurring once every 10 to 20 terabytes, that’s once every 10 to 20 minutes—which is pretty bad, actually.

- Jeff Bonwick, ACM Queue, November, 2007



But it's quotes like this from Jeff's blog in 2007 that really resonate with my experience:
Custom interconnects can't keep up with Ethernet.  In the time that Fibre Channel went from 1Gb to 4Gb -- a factor of 4 -- Ethernet went from 10Mb to 10Gb -- a factor of 1000.  That SAN is just slowing you down.

Today's world of array products running custom firmware on custom RAID controllers on a Fibre Channel SAN is in for massive disruption. It will be replaced by intelligent storage servers, built from commodity hardware, running an open operating system, speaking over the real network.

- Jeff Bonwick, Sun Blog, April 2007



My old business partner, Craig White, philosopher and network architect at BT, let me in on that secret back in the late 90's. At the time I was spreading Ethernet across a small city while Craig was off to Level3 - spreading gigabit Ethernet across entire continents. He made it clear to me that Ethernet - in its simplicity and utility - was like the loyal mutt that never let you down and always rose to meet a fight. Betting against Ethernet's domination as an interconnect was like betting against the house: ultimately a losing proposition. While there will always be room for exotic interconnects, the remaining 95% of the market will look to Ethernet. Look up "ubiquity" in the dictionary - it's right there next to Ethernet, and it's come a long way since it first appeared on Bob Metcalfe's napkin in '73.

Looking back at Jeff's Sun blog, it's pretty clear that Sun's "near-death experience" had a similarly profound effect on his thinking; and perhaps that change made him ultimately incompatible with the Oracle culture. I doubt a culture that embraces the voracious acquisition and marketing posture of former HP CEO Mark Hurd would likewise embrace the unknown risk and intangible reward framework of openness.
In each case, asking the question with a truly open mind changed the answer.  We killed our more-of-the-same SPARC roadmap and went multi-core, multi-thread, and low-power instead.  We started building AMD and Intel systems.  We launched a wave of innovation in Solaris (DTrace, ZFS, zones, FMA, SMF, FireEngine, CrossBow) and open-sourced all of it.  We started supporting Linux and Windows.  And most recently, we open-sourced Java.  In short, we changed just about everything.  Including, over time, the culture.

Still, there was no guarantee that open-sourcing Solaris would change anything.  It's that same nagging fear you have the first time you throw a party: what if nobody comes?  But in fact, it changed everything: the level of interest, the rate of adoption, the pace of communication.  Most significantly, it changed the way we do development.  It's not just the code that's open, but the entire development process.  And that, in turn, is attracting developers and ISVs whom we couldn't even have spoken to a few years ago.  The openness permits us to have the conversation; the technology makes the conversation interesting.

- Jeff Bonwick, Sun blog, April 2007



This lesson, I fear, cannot be unlearned, and perhaps that's a good thing. There's a side to an engineer's creation that goes way beyond profit and loss, schedules and deadlines, or success and failure. This side probably fits better in the subjective realm of the arts than the objective realm of engineering and capitalism. It's where inspiration and disruptive ideas abide. Reading Bonwick's "farewell" posting, it's clear to see that the inspirational road ahead has more allure than recidivism at Oracle. I'll leave it in his words:
For me, it's time to try the Next Big Thing. Something I haven't fully fleshed out yet. Something I don't fully understand yet. Something way outside my comfort zone. Something I might fail at. Everything worth doing begins that way. I'll let you know how it goes.

- Jeff Bonwick, Sun blog, September 2010


Saturday, September 18, 2010

Short-Take: OpenSolaris mantle assumed by Illumos, OpenIndiana

While Oracle has effectively "closed the source" to key Solaris code by making updates available only when "full releases" are distributed, others in the "formerly OpenSolaris" community are stepping up to carry the mantle for the community. In an internal memo - leaked to the OpenSolaris news group last month - Oracle makes the new policy clear:
We will distribute updates to approved CDDL or other open source-licensed code following full releases of our enterprise Solaris operating system. In this manner, new technology innovations will show up in our releases before anywhere else. We will no longer distribute source code for the entirety of the Solaris operating system in real-time while it is developed, on a nightly basis.

- Oracle Memo to Solaris Engineering, Aug, 2010



Frankly, Oracle clearly sees the issue of continuous availability of code updates as a threat to its control over its "best-of-breed" acquisition in Solaris. It will be interesting to see how long Oracle takes to reverse the decision (and whether or not it will be too late...)

However, at least two initiatives are stepping up to carry the mantle of "freely accessible and open" Solaris code to the community: Illumos and OpenIndiana. Illumos' goal can be summed up as follows:
Well the first thing is that the project is designed here to solve a key problem, and that is that not all of OpenSolaris is really open source. And there's a lot of other potential concerns in the community, but this one is really kind of a core one, and from solving this, I think a lot of other issues can be solved.

- Excerpt, Illumos Announcement Transcript



That said, it's pretty clear that Illumos will be a distinct fork away from "questionable" code (from a licensing perspective.) We already see a lot of chatter/concerns about this in the news/mail groups.

The second announcement comes from the OpenIndiana group (part of the Illumos Foundation) and appears to be to Solaris as CentOS is to RedHat Enterprise Server. OpenIndiana's press release says it like this:
OpenIndiana, an exciting new distribution of OpenSolaris, built by the community, for the community - available for immediate download! OpenIndiana is a continuation of the OpenSolaris legacy and aims to be binary and package compatible with Oracle Solaris 11 and Solaris 11 Express.

- OpenIndiana Press Release, September 2010

Does any of this mean that OpenSolaris is going away or being discontinued? Strictly speaking: no - it lives on as Solaris 11 Express, et al. It does mean that code changes will be more tightly controlled by Oracle, and - from the reaction of the developer community - this exertion of control may slow or eliminate open source contribution to the Solaris/OpenSolaris corpus. Further, Solaris 11 won't be "free for production use" as earlier versions of Solaris were. It also means that distributions and appliance derivatives (like NexentaStor and Nexenta Core) will be able to thrive despite Oracle's tightening.

Illumos has yet to release a distribution. OpenIndiana has distributions available for download today.

Friday, September 17, 2010

Quick-Take: ZFS and Early Disk Failure

Anyone who's discussed storage with me knows that I "hate" desktop drives in storage arrays. When using SAS disks as a standard, that's typically a non-issue because there is rarely a distinction between "desktop" and "server" disks in the SAS world. Therefore, you know I'm talking about the other "S" word - SATA. Here's a tale of SATA woe that I've seen repeatedly cause problems for inexperienced ZFS'ers out there...

When volumes fail in ZFS, the "final" indicator is data corruption. Fortunately, ZFS checksums recognize corrupted data and can take action to correct and report the problem. But that's like treating cancer only after you've experienced the symptoms. In fact, the failing disk will likely begin to "under-perform" well before actual "hard" errors show-up as read, write or checksum errors in the ZFS pool. Depending on the reason for "under-performing" this can affect the performance of any controller, pool or enclosure that contains the disk.

Wait - did he say enclosure? Sure. Just like a bad NIC chattering on a loaded network, a bad SATA device can occupy enough of the available service time for a controller or SAS bus (i.e. JBOD enclosure) to make a noticeable performance drop in otherwise "unrelated" ZFS pools. Hence, detection of such events is an important thing. Here's an example of an old WD SATA disk failing as viewed from the NexentaStor "Data Sets" GUI:

[caption id="attachment_1660" align="aligncenter" width="450" caption="Something is wrong with device c5t84d0..."]Disk Statistics showing failing drive[/caption]

Device c5t84d0 is having some serious problems. Busy time is 7x higher than counterparts, and its average service time is 14x higher. As a member of a RAIDz group, the entire group is being held-back by this "under-performing" member. From this snapshot, it appears that NexentaStor is giving us some good information about the disk from the "web GUI" but this assumption would not be correct. In fact, the "web GUI" is only reporting "real time" data so long as the disk is under load. In the case of a lightly loaded zpool, the statistics may not even be reported.

However, from the command shell, historic and real-time access to per-device performance is available. The output of "iostat -exn" shows the count of all errors for devices since the last time counters were reset, and average I/O loads for each:

[caption id="attachment_1662" align="aligncenter" width="450" caption="Device statistics from 'iostat' show error and I/O history."]Device statistics from 'iostat' show error and I/O history.[/caption]

The output of iostat clearly shows this disk has serious hardware problems. It indicates hardware errors as well as transmission errors for the device recognized as 'c5t84d0' and the I/O statistics - chiefly read, write and average service time - implicate this disk as a performance problem for the associated RAIDz group. So, if the device is really failing, shouldn't there be a log report of such an event? Yes, and here's a snip from the message log showing the error:

[caption id="attachment_1663" align="aligncenter" width="450" caption="SCSI error with ioc_status=0x8048 reported in /var/log/messages for failing device."]SCSI error with ioc_status=0x8048 reported in /var/log/messages[/caption]

However, in this case, the log is not "full" of messages of this sort. In fact, the error only showed up under the stress of an iozone benchmark (run from the NexentaStor 'nmc' console). I can (somewhat safely) conclude this to be a device failure since at least one other disk in this group is of the same make, model and firmware revision as the culprit. The interesting aspect of this "failure" is that it does not result in a read, write or checksum error for the associated zpool. Why? Because the device is only loosely coupled to the zpool as a constituent leaf device, and it also implies that the device errors were recoverable by either the drive or the device driver (mapping around a bad/hard error.)

Since these problems are being resolved at the device layer, the ZFS pool is "unaware" of the problem as you can see from the output of 'zpool status' for this volume:

[caption id="attachment_1661" align="aligncenter" width="450" caption="Problems with disk device as yet undetected at the zpool layer."]zpool status output for pool with undetected failing device[/caption]

This doesn't mean that the "consumers" of the zpool's resources are "unaware" of the problem, as the disk error has manifested itself in the zpool as higher delays, lower I/O through-put and subsequently less pool bandwidth. In short, if the error is persistent under load, the drive has a correctable but catastrophic (to performance) problem and will need to be replaced. If, however, the error goes away, it is possible that the device driver has suitably corrected for the problem and the drive can stay in place.

SOLORI's Take: How do we know if the drive needs to be replaced? Time will establish an error rate. In short, running the benchmark again and watching the error counters for the device will determine if the problem persists. Eventually, the errors will either go away or they won't. For me, I'm hoping that the disk fails to give me an excuse to replace the whole pool with a new set of SATA "eco/green" disks for more lab play. Stay tuned...
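Watching those counters over time is as simple as handing iostat an interval; run from the root shell, something like the following (10-second samples) makes a creeping error count hard to miss:

# iostat -exn 10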

SOLORI's Take: In all of its flavors, 1.5Gbps, 3Gbps and 6Gbps, I find SATA drives inferior to "similarly" spec'd SAS for just about everything. In my experience, the worst SAS drives I've ever used have been more reliable than most of the SATA drives I've used. That doesn't mean there are "no" good SATA drives, but it means that you really need to work within tighter boundaries when mixing vendors and models in SATA arrays. On top of that, the additional drive port and better typical sustained performance make SAS a clear winner over SATA (IMHO). The big exception to the rule is economy - especially where disk arrays are used for on-line backup - but that's another discussion...

Wednesday, September 15, 2010

Short-Take: SQL Performance Notes

Here are some Microsoft SQL performance notes from discussions that inevitably crop up when discussing SQL storage:

  1. Where do I find technical resources for the current version of MS SQL?

  2. I'm new to SQL I/O performance, how can I learn the basics?

  3. The basics talk about SQL 2000, but what about performance considerations due to changes in SQL 2005?

  4. How does using SQL Server 6.x versus SQL Server 7.0 and change storage I/O performance assumptions?

  5. How does TEMPDB affect storage (and memory) requirements and architecture?

  6. How does controller and disk caching affect SQL performance and data integrity?

  7. How can I use NAS for storage of SQL database in a test/lab environment?

  8. What additional considerations are necessary to implement database mirroring in SQL Server?

  9. When do SQL dirty cache pages get flushed to disk?

  10. Where can I find Microsoft's general reference sheet on SQL I/O requirements for more information?


From performance tuning to performance testing and diagnostics:

  1. I've heard that SQLIOStress has been replaced by SQLIOSim: where can I find out about SQLIOSim to evaluate my storage I/O system before application testing?

  2. How do I diagnose and detect "unreported" SQL I/O problems?

  3. How do I diagnose stuck/stalled I/O problems in SQL Server?

  4. What are Bufwait and Writelog Timeout messages in SQL Server indicating?

  5. Can I control SQL Server checkpoint behavior to avoid additional I/O during certain operations?

  6. Where can I get the SQLIO benchmark tool to assess the potential of my current configuration?


That should provide a good half-day's reading for any storage/db admin...

Tuesday, September 14, 2010

Short-Take: iSCSI+Nexenta, Performance Notes

Here are a few performance tips for running iSCSI with NexentaStor in a Windows environment:

  1. When using the Windows iSCSI Software Initiator with some workloads, disabling the Nagle Algorithm on the Windows Server is sometimes recommended;

  2. Tuning TCP window and iSCSI parameters on each side of the connection can deliver better performance;

  3. VMware part of the equation? Adjusting the way VMware handles congestion events could be useful;

  4. On NexentaStor, disable the Nagle Algorithm by setting its limit to "1" (the default value of 4095 leaves it enabled) - see the sketch below.
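Regarding item 4: on the Solaris-based NexentaStor side, that Nagle limit corresponds to the TCP tunable tcp_naglim_def (default 4095). A minimal, non-persistent sketch from a "root shell" - with the usual caveat that root-shell changes may be unsupported on the appliance - would be:

# ndd -set /dev/tcp tcp_naglim_def 1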


For storage applications where latency is a paramount issue, these hints just might help...

Wednesday, August 11, 2010

VMware Management Assistant Panics on Magny Cours

VMware's current version of its vSphere Management Assistant - also known as vMA (pronounced "vee mah") - will crash when run on an ESX host using AMD Magny Cours processors. This behavior was discovered recently when installing the vMA on an AMD Opteron 6100 system (aka. Magny Cours) causing a "kernel panic" on boot after deploying the OVF template. Something of note is the crash also results in 100% vCPU utilization until the VM is either powered-off or reset:

[caption id="attachment_1624" align="aligncenter" width="436" caption="vMA Kernel Panic on Import"]vMA Kernel Panic on Import[/caption]

As it turns out, no amount of tweaking the virtual machine's virtualization settings or OS boot/grub settings (i.e. noapic, etc.) seems to cure the ills for vMA. However, we did discover that the OVF-deployed appliance was configured as a VMware Virtual Machine Hardware Version 4 machine:

[caption id="attachment_1625" align="aligncenter" width="450" caption="vMA 4.1 defaults to Virtual Machine Hardware Version 4"]vMA 4.1 defaults to Hardware Version 4[/caption]

Since our lab vMA deployments have all been upgraded to Virtual Machine Hardware Version 7 for some time (and for functional benefits as well), we tried to update the vMA to Version 7 and try again:

[caption id="attachment_1626" align="aligncenter" width="450" caption="Upgrade vMA Virtual Machine Version..."]Upgrade vMA Virtual Machine Version...[/caption]

This time, with Virtual Hardware Version 7 (and no other changes to the VM), the vMA boots as it should:

[caption id="attachment_1627" align="aligncenter" width="450" caption="vMA Booting after Upgrade to Virtual Hardware Version 7"]vMA Booting after Upgrade to Virtual Hardware Version 7[/caption]

Since the Magny Cours CPU is essentially a pair of tweaked 6-core Opteron CPUs in a single package, we took the vMA into the lab and deployed it to an ESX server running on AMD 2435 6-core CPUs: the vMA booted as expected, even with Virtual Hardware Version 4. A quick check of the community and support boards shows a few issues with older RedHat/CentOS kernels (like vMA's) but no reports of kernel panic with Magny Cours. Perhaps there are just not that many AMD Opteron 6100 deployments out there with vMA yet...

Sunday, July 18, 2010

Quick-Take: NexentaStor AD Shares in 100% Virtual SMB

Here's a maintenance note for SMB environments attempting 100% virtualization and relying on SAN-based file shares to simplify backup and storage management: beware the chicken-and-egg scenario on restart before going home to capture much-needed Zzz's. If your domain controller is virtualized and its VMDK file lives on SAN/NAS, you'll need to restart SMB services on the NexentaStor appliance before leaving the building.

Here's the scenario:

  1. An afterhours SAN upgrade in non-HA environment (maybe Auto-CDP for BC/DR, but no active fail-over);

  2. Shutdown of SAN requires shutdown of all dependent VM's, including domain controllers (AD);

  3. End-user and/or maintenance plans are dependent on CIFS shares from SAN;

  4. Authentication of CIFS shares on NexentaStor is AD-based;


Here's the typical maintenance plan (detail omitted):

  1. Ordered shutdown of non-critical VM's (including UpdateManager, vMA, etc.);

  2. Ordered shutdown of application VM's;

  3. Ordered shutdown of resource VM's;

  4. Ordered shutdown of AD server VM's (minus one, see step 7);

  5. Migrate/vMotion remaining AD server and vCenter to a single ESX host;

  6. Ordered shutdown of ESX hosts (minus one, see step 8);

  7. vSphere Client: Log-out of vCenter;

  8. vSphere Client: Log-in to remaining ESX host;

  9. Ordered shutdown of vCenter;

  10. Ordered shutdown of remaining AD server;

  11. Ordered shutdown of remaining ESX host;

  12. Update SAN;

  13. Reboot SAN to update checkpoint;

  14. Test SAN update - restoring previous checkpoint if necessary;

  15. Power-on ESX host containing vCenter and AD server (see step 8);

  16. vSphere Client: Log-in to remaining ESX host;

  17. Power-on AD server (through to VMware Tools OK);

  18. Restart SMB service on NexentaStor (see the sketch following this list);

  19. Power-on vCenter;

  20. vSphere Client: Log-in to vCenter;

  21. vSphere Client: Log-out of ESX host;

  22. Power-on remaining ESX hosts;

  23. Ordered power-on of remaining VM's;


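For step 18, the restart can be issued from the NexentaStor console once AD/DNS is reachable again. A minimal sketch, assuming root-shell access to the appliance's Solaris-based OS (the NMC/NMV menus offer an equivalent, supported path):

# confirm the CIFS/SMB server state, then restart it so it re-initializes its AD credentials
root@NexentaStor:~# svcs network/smb/server
root@NexentaStor:~# svcadm restart svc:/network/smb/server:default
root@NexentaStor:~# svcs -x network/smb/server
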
A couple of things to note in an AD environment:

  1. NexentaStor requires the use of AD-based DNS for AD integration;

  2. AD-based DNS will not be available at SAN re-boot if all DNS servers are virtual and only one SAN is involved;

  3. Lack of DNS resolution on re-boot will cause a failure for DNS name based NTP service synchronization;

  4. NexentaStor SMB service will fail to properly initialize AD credentials;

  5. VMware 4.1 now pushes AD authentication all the way to ESX hosts, enabling better credential management and security but creating a potential AD dependency as well;

  6. Using auto-startup order on the remaining ESX host for AD and vCenter could automate the process (steps 17 & 19), however, I prefer the "manual" approach after a SAN upgrade in case the upgrade failure is detected only after ESX host is restarted (i.e. storage service interaction in NFS/iSCSI after upgrade).


SOLORI's Take: This is a great opportunity to re-think storage resources in the SMB as the linchpin to 100% virtualization.  Since most SMB's will have a tier-2 or backup NAS/SAN (auto-sync or auto-CDP) for off-rack backup, leveraging a shared LUN/volume from that SAN/NAS for a backup domain controller is a smart move. Since tier-2 SAN's may not have the IOPs to run ALL mission critical applications during the maintenance interval, the presence of at least one valid AD server will promote a quicker RTO, post-maintenance, than coming up cold. [This even works with DAS on the ESX host]. Solution - add the following and you can ignore step 15:

3a. Migrate always-on AD server to LUN/volume on tier-2 SAN/NAS;


24. Migrate always-on AD server from LUN/volume on tier-2 SAN/NAS back to tier-1;


Since even vSphere Essentials Plus now has vMotion (a much-requested and timely addition), collapsing all remaining VM's to a single ESX host is a no-brainer. However, migrating the storage is another issue, one that cannot be resolved without either a shutdown of the VM (off-line storage migration) or an Enterprise/Enterprise Plus edition of vSphere. That is why the migration of the AD server back from tier-2 is reserved for last (step 24) - it will likely need to be shut down to migrate its storage between SAN/NAS appliances.

Friday, July 16, 2010

Quick-Take: vSphere 4, Now with SUSE Enterprise Linux, Gratis

Earlier this month VMware announced that it was expanding its partnership with Novell in order to offer a 1:1 CPU enablement license for SLES. Mike Norman's post at VirtualizationPractice.com discusses the potential "darker side" of the deal, which VMware presents this way:
VMware and Novell are expanding their technology partnership to make it easier for customers to use SLES operating system in vSphere environments with support offerings that will help your organization:

  • Reduce the cost of maintaining SLES in vSphere environments

  • Obtain direct technical support from VMware for both vSphere and SLES

  • Simplify your purchasing and deployment experience


In addition, VMware plans to standardize our virtual appliance-based products on SLES for VMware further simplifying the deployment and ongoing management of these solutions.

  • Customers will receive SLES with one (1) entitlement for a subscription to patches and updates per qualified VMware vSphere SKU. For example, if a customer were to buy 100 licenses of a qualified vSphere Enterprise Plus SKU, that customer would receive SLES with one hundred (100) entitlements for subscription to patches and updates.

  • Customers cannot install SLES with the accompanying patches and updates subscription entitled by a VMware purchase 1) directly on physical servers or 2) in virtual machines running on third party hypervisors.

  • Technical support for SLES with the accompanying patches and updates subscription entitled by a VMware purchase is not included and may be purchased separately from VMware starting in 3Q 2010.


- VMware Website, 6/2010



The part about standardization has been emphasized by us - not VMware - but it seems to be a good fit with VMware's recent acquisition of Zimbra (formerly owned by Yahoo!) and the release of vSphere 4.1 with "cloud scale" implications. That said, the latest version of the VMware Data Recovery appliance has been recast from RedHat to CentOS with AD integration, signaling that it will take some time for VMware to transition to Novell's SUSE Linux.


SOLORI's Take: Linux-based virtual appliances are a great way to extend features and control without increasing license costs. Kudos to VMware for hopping on-board the F/OSS train. Now where's my Linux-based vCenter with a Novell Directory Services for Windows alternative to Microsoft servers?

Thursday, July 15, 2010

ZFS Pool Import Fails After Power Outage

The early summer storms have taken their toll on Alabama, and UPS failures (and short-falls) have been popping-up all over. Add consolidated, shared storage to the equation - along with JBOD's on separate power rails, limited UPS run-time and/or no generator backup - and you've got a recipe for potential data loss. At least, that is what we've been seeing recently.



Even with ZFS pools, data integrity in a power event cannot be guaranteed - especially when employing "desktop" drives and RAID controllers with RAM cache and no BBU (or perhaps a "bad storage admin" who has managed to disable the ZIL). When this happens, NexentaStor (and other ZFS storage devices) may even show all members of the ZFS pool as "ONLINE," as if the pool is simply awaiting a proper import. However, when an import is attempted (either automatically on reboot or manually), the pool fails to import.




From the command line, the suspect pool's status might look like this:


root@NexentaStor:~# zpool import
pool: pool0
id: 710683863402427473
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
        pool0        ONLINE
          mirror-0   ONLINE
            c1t12d0  ONLINE
            c1t13d0  ONLINE
          mirror-1   ONLINE
            c1t14d0  ONLINE
            c1t15d0  ONLINE

Looks good, but the import may fail like this:

root@NexentaStor:~# zpool import pool0
cannot import 'pool0': I/O error

Not good. This probably indicates that something is not right with the array. Let's try to force the import and see what happens:



Nope. Now this is the point where most people start to get nervous, their neck tightens-up a bit and they begin to flip through a mental calendar of backup schedules and catalog backup repositories - I know I do. However, it's the next one that makes most administrators really nervous when trying to "force" the import:



root@NexentaStor:~# zpool import -f pool0
pool: pool0
id: 710683863402427473
status: The pool metadata is corrupted and the pool cannot be opened.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
cannot import 'pool0': I/O error


Really not good. Did it really suggest going to backup? Ouch!



In this case, something must have happened to corrupt metadata - perhaps the non-BBU cache on the RAID device when power failed. Expensive lesson learned? Not yet. The ZFS file system still presents you with options, namely "acceptable data loss" for the period of time accounted for in the RAID controller's cache. Since ZFS writes data in transaction groups and transaction groups normally commit in 20-30 second intervals, that RAID controller's lack of BBU puts some or all of that pending group at risk. Here's how to tell by testing the forced import as if data loss was allowed:



root@NexentaStor:~# zpool import -nfF pool0
Would be able to return data to its state as of Fri May 7 10:14:32 2010.
Would discard approximately 30 seconds of transactions.


or

root@NexentaStor:~# zpool import -nfF pool0
WARNING: can't open objset for pool0

If the first output is acceptable, then proceeding without the "n" option will produce the desired effect: "rewinding" (read: ignoring) the last couple of transaction groups and importing the "truncated" pool. The dry run reports exactly how many "seconds" worth of data cannot be restored. Depending on the bandwidth and utilization of your system, this could be very little data or several MB worth of transactions.
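
As a sketch of that path (pool name from the example above; proceed only if the reported loss is acceptable):

# rewind import - discards the transactions reported by the "-n" dry run above
root@NexentaStor:~# zpool import -fF pool0
# once imported, scrub the pool and review its status to verify what remains
root@NexentaStor:~# zpool scrub pool0
root@NexentaStor:~# zpool status -v pool0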



What to do about the second outcome? From the man page for "zpool import," Sun/Oracle says the following:



zpool import [-o mntopts] [-o property=value] ... [-d dir | -c cachefile] [-D] [-f] [-R root] [-F [-n]] -a
Imports all pools found in the search directories. Identical to the previous command, except that all pools with a sufficient number of devices available are imported. Destroyed pools (pools that were previously destroyed with the "zpool destroy" command) will not be imported unless the -D option is specified.



-o mntopts
Comma-separated list of mount options to use when mounting datasets within the pool. See zfs(1M) for a description of dataset properties and mount options.

-o property=value
Sets the specified property on the imported pool. See the “Properties” section for more information on the available pool properties.

-c cachefile
Reads configuration from the given cachefile that was created with the “cachefile” pool property. This cachefile is used instead of searching for devices.

-d dir
Searches for devices or files in dir. The -d option can be specified multiple times. This option is incompatible with the -c option.

-D
Imports destroyed pools only. The -f option is also required.

-f
Forces import, even if the pool appears to be potentially active.

-F
Recovery mode for a non-importable pool. Attempt to return the pool to an importable state by discarding the last few transactions. Not all damaged pools can be recovered by using this option. If successful, the data from the discarded transactions is irretrievably lost. This option is ignored if the pool is importable or already imported.

-a
Searches for and imports all pools found.

-R root
Sets the “cachefile” property to “none” and the “altroot” property to “root”.

-n
Used with the -F recovery option. Determines whether a non-importable pool can be made importable again, but does not actually perform the pool recovery. For more details about pool recovery mode, see the -F option, above.


No real help here. What the documentation omits is the "-X" option. This option is only valid with the "-F" recovery-mode setting; it is NOT well documented, and suffice it to say it is the last resort before acquiescing to real problem solving... Assuming the standard recovery mode's "depth" of transaction replay is not quite enough to get you over the hump, the "-X" option gives you an "extended replay," seemingly performing a scrub-like search back through the transaction groups (read: potentially time consuming) until it arrives at the last reliable transaction group in the dataset.
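
A hedged sketch of that last-resort invocation (expect it to run for a long time on a large pool, and understand that any transactions it rewinds past are gone for good):

# extended rewind: search further back for the last consistent transaction group
root@NexentaStor:~# zpool import -fFX pool0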

Lessons to be learned from this excursion into pool recovery are as follows:



  1. Enterprise SAS good; desktop SATA could be a trap

  2. Redundant Power + UPS + Generator = Protected; Anything else = Risk

  3. SAS/RAID Controller + Cache + BBU = Fast; SAS/RAID Controller + Cache - BBU = Train Wreck



The data integrity functions in ZFS are solid when used appropriately. When architecting your HOME/SOHO/SMB NAS appliance, pay attention to the hidden risks of "promised performance" that may walk you down the plank towards a tape-restore (or resume-writing) event. Better to leave the 5-15% performance benefit on the table, or to purchase adequate BBU/UPS/generator resources to protect your system in worst-case events. In complex environments, a pending power loss can be properly mitigated through management supervisors and clever scripts that turn down resources in advance of total failure. How valuable is your data?

Thursday, June 17, 2010

VMware vSphere Update 2, View 4.x Issues

Right on the heels of vSphere 4 Update 2, a flurry of Twitter and blog postings suggests caution when upgrading a VMware View 4.x environment. Now, VMware has updated its KB to include a workaround for those needing to move to Update 2 for its feature and hardware-support benefits but who can't afford to break their PCoIP-based View environment: in short, it's all VMware Tools' fault! It seems the SVGA driver in VMware Tools for Update 2 has a problem with PCoIP - preventing desktops from connecting.

Now that makes some sense given the fact that VMware is quite adamant about the ORDER in which VMware Tools, .NET framework and the View Agent must be installed. The current options for PCoIPers out there needing Update 2 are three-fold:

  1. Don't upgrade VMware Tools;

  2. Upgrade VMware Tools with "Advanced Options" including

    1. For x86 guests:
      /s /v "/qn REINSTALLMODE=voums REINSTALL=WYSE,VMXNet,
      VMCI,Mouse,MemCtl,Hgfs,VSS,VMXNet3,vmdesched,ThinPrint,Sync,
      PVSCSI,Debug,BootCamp,Audio,Buslogic,VICFSDK,VAssertSDK,
      Toolbox,Upgrader,GuestSDK,PerfMon,Common,microsoft_dlls_x86,
      ArchSpecific"
      (line breaks included for formatting only)

    2. For x64 guests:
      /s /v "/qn REINSTALLMODE=voums REINSTALL=WYSE,VMXNet,
      VMCI,Mouse,MemCtl,Hgfs,VSS,VMXNet3,vmdesched,ThinPrint,Sync,
      PVSCSI,Debug,BootCamp,Audio,Buslogic,VICFSDK,VAssertSDK,Toolbox,
      Upgrader,GuestSDK,PerfMon,Common,microsoft_dlls_x64,ArchSpecific"
      (line breaks included for formatting only)



  3. Use RDP (not really an option for PCoIP environments)


If you've already been STUNG by the VMware Tools upgrade, the KB has a procedure for back-tracking the SVGA driver after VMware Tools for Update 2 has been installed. Here it is:
For customers that are experiencing the issue and want to continue using PCoIP, use this workaround:

Note: VMware recommends that you have VMware View 4.0.1 installed prior to performing these steps.

  1. Log in to the affected virtual machine(s) with Administrator rights. This is any virtual machine that has the VMware Tools shipped with ESX 4.0 Update 2 installed.

  2. Rename the C:\Program Files\Common Files\VMware\Drivers\Video folder to C:\Program Files\Common Files\VMware\Drivers\Video-OLD.

  3. From a working virtual machine, copy the C:\Program Files\Common Files\VMware\Drivers\Video folder to C:\Program Files\Common Files\VMware\Drivers\ on an affected virtual machine.

  4. Click Start > Settings > Control Panel > System.

  5. Click the Hardware tab and click Device Manager.

  6. Expand Display Adapters.

  7. Choose VMWARE SVGA II. Right-click on it and choose Properties.

  8. Click the Driver tab. You are presented with the SVGA driver properties.

    The version is 11.6.0.34 (Driver Date 01/03/2010).

  9. Click Update Driver. The Hardware Update wizard displays.

  10. When prompted with the question, "Can windows connect to Windows Update to search for software?", click No, not at this time and click Next.

  11. When prompted with the question, "What do you want the wizard to do", click Install from a list or specific location (Advanced) and click Next.

  12. Click Don't search. I will choose the driver to install and click Next.

  13. On the next screen, you are presented a list of two or more VMWARE SVGA II Versions to choose from. For example:

    VMWARE SVGA II Version: 11.6.0.31 (21/09/2009)
    VMWARE SVGA II Version: 11.6.0.32 (24/09/2009)
    VMWARE SVGA II Version: 11.6.0.34 (01/03/2010)

    Choose VMWARE SVGA II Version: 11.6.0.32 (24/09/2009).

  14. Click Next. The driver begins installing.

  15. Click Continue Anyway when Windows notifies you that VMware SVGA II has not passed Windows logo testing.

    The driver install completes.

  16. Click Finish.

  17. Click YES when prompted to restart your computer.

  18. Verify that PCoIP is now working.


Note: VMware is investigating a permanent fix for the issue so our customers can upgrade to ESX 4.0 Update 2 without experiencing this issue.

- VMware KB1022830, 6/17/2010


In-the-Lab: Install VMware Tools on NexentaStor VSA

Physical lab resources can be a challenge to "free-up" just to test a potential storage appliance. With NexentaStor, you can download a pre-configured VMware (or Xen) appliance from NexentaStor.Org, but what if you want to build your own? Here's a little help on the subject:

  1. Download the ISO from NexentaStor.Org (see link above);

  2. Create a VMware virtual machine:

    1. 2 vCPU

    2. 4GB RAM (leaves about 3GB for ARC);

    3. CD-ROM (mapped to the ISO image);

    4. One (optionally two if you want to simulate the OS mirror) 4GB, thin provisioned SCSI disks (LSI Logic Parallel);

    5. Guest Operating System type: Sun Solaris 10 (64-bit)

    6. One E1000 for Management/NAS

    7. (optional) One E1000 for iSCSI



  3. Streamline the guest by disabling unnecessary components:

    1. floppy disk

    2. floppy controller (remove from BIOS)

    3. primary IDE controller (remove from BIOS)

    4. COM ports (remove from BIOS)

    5. Parallel ports (remove from BIOS)



  4. Boot to ISO and install NexentaStor CE

    1. (optionally) choose second disk as OS mirror during install



  5. Register your installation with Nexenta

    1. http://www.nexenta.com/register-eval

    2. (optional) Select "Solori" as the partner



  6. Complete initial WebGUI configuration wizard

    1. If you will join it to a domain, use the domain FQDN (i.e. microsoft.msft)

    2. If you choose "Optimize I/O performance..." remember to re-enable ZFS intent logging under Settings>Preferences>System

      1. Sys_zil_disable = No





  7. Shutdown the VSA

    1. Settings>Appliance>PowerOff



  8. Re-direct the CD-ROM

    1. Connect to Client Device



  9. Power-on the VSA and install VMware Tools (a consolidated command sketch follows this list)

    1. login as admin

      1. assume root shell with "su" and root password



    2. From vSphere Client, initiate the VMware Tools install

    3. cd /tmp

      1. untar VMware Tools with "tar zxvf  /media/VMware\ Tools/vmware-solaris-tools.tar.gz"



    4. cd to /tmp/vmware-tools-distrib

      1. install VMware Tools with "./vmware-install.pl"

      2. Answer with defaults during install



    5. Check that VMware Tools shows an OK status

      1. IP address(es) of interfaces should now be registered

        [caption id="attachment_1580" align="aligncenter" width="300" caption="VMware Tools are registered."][/caption]





  10. Perform a test "Shutdown" of your VSA

    1. From the vSphere Client, issue VM>Power>Shutdown Guest

      [caption id="attachment_1581" align="aligncenter" width="300" caption="System shutting down from VMware Tools request. "][/caption]

    2. Restart the VSA...

      [caption id="attachment_1582" align="aligncenter" width="300" caption="VSA restarting in vSphere "][/caption]



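For reference, the shell portion of step 9 consolidates to a handful of commands. A hedged sketch, assuming the vSphere Client's "Install VMware Tools" action has already mounted the Tools ISO at "/media/VMware Tools" and that you are working from a root shell:

cd /tmp
tar zxvf "/media/VMware Tools/vmware-solaris-tools.tar.gz"
cd /tmp/vmware-tools-distrib
./vmware-install.pl        # accept the defaults at each prompt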

Now VMware Tools has been installed and you're ready to add more virtual disks and build ZFS storage pools. If you get a warning about HGFS not loading properly at boot time:

[caption id="attachment_1586" align="aligncenter" width="300" caption="HGFS module mismatch warning."][/caption]

it is not usually a big deal, but the VMware Host-Guest File System (HGFS) has been known to cause issues in some installations. Since the NexentaStor appliance is not a general-purpose operating system, you should customize the install so it does not use HGFS at all. To disable it, perform the following (a one-line shell equivalent follows the list):

  1. Edit "/kernel/drv/vmhgfs.conf"

    1. Change:     name="vmhgfs" parent="pseudo" instance=0;

    2. To:     #name="vmhgfs" parent="pseudo" instance=0;



  2. Re-boot the VSA

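The same edit can be made in one pass from the shell - a sketch assuming a root shell and a sed that supports in-place editing (keep a copy of the original file in case a future Tools update needs it restored):

cp /kernel/drv/vmhgfs.conf /kernel/drv/vmhgfs.conf.orig
sed -i 's/name="vmhgfs"/#name="vmhgfs"/' /kernel/drv/vmhgfs.conf
reboot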

Upon reboot, there will be no complaint about the offending HGFS module. Remember that, after updating VMware Tools at a future date, the HGFS configuration file will need to be adjusted again. By the way, this process works just as well on the NexentaStor Commercial edition; however, you might want to check with technical support prior to making such changes to a licensed/supported deployment.
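
Finally, with HGFS out of the way and additional virtual disks attached, a first ZFS pool can be built from the root shell. This is only a sketch: the device names below are examples (list yours with "format" first), and the supported route on NexentaStor is the NMV/NMC volume-creation wizard rather than raw zpool commands:

# list candidate disks, then build a mirrored pool from two of them
root@NexentaStor:~# format < /dev/null
root@NexentaStor:~# zpool create pool0 mirror c1t1d0 c1t2d0
root@NexentaStor:~# zpool status pool0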