4-5-13

So finally, after a tortuous five-month wait, I found out I passed the VCAP-DTD the other day. Well done to everyone else who’s got it. I don’t know how many there are, but I’m guessing not that many. There are a few out there in Twitterland who tweeted their success the other day, so I’m going to enjoy its exclusivity for now, until more certified folks come along.

As regards the exam itself, it was a tough old boot. I sat it early January, in fact the first week back after Christmas when my brain was even more Swiss cheese than normal. Owing to Pearson Vue chopping down the number of exam centres near where I live in north west England, I had to travel across the Pennines to Leeds to sit it.

The exam itself is listed as 195 minutes. I think I came in around 10 minutes under that, but as I was so far behind time wise, I just ended up going with my gut feeling on a lot of the answers. It was physically and mentally one of the most demanding exams I’ve ever done. I wasn’t permitted to take any water into the test room, which had no natural light and was very stuffy (even in January!). I also didn’t find the exam centre staff all that welcoming, so I was glad when it was all over.

Advice? Well firstly, you need to already be a VCP-DT to sit this exam. Also, it’s not an exam you can wing. I’ve been consulting for around 3 1/2 years, and that sort of experience is priceless for this sort of exam, as there are a lot of scenario-based questions you need to answer. If you’ve done this for real with living, breathing customers, you’ve already got an advantage.

You also need to know View inside out. What it can do, what it can’t do. How it integrates with other products. Some may disagree, but I think View is pretty unique among VMware products, as it has so many dependencies on third party products such as RSA, load balancers and anti-virus. You need to know how to weave those into the solution, and not just at a superficial level. I’d say you need around one to two years of experience administering View, especially in an enterprise environment. The exam covers a very broad spectrum.

One more tip is to watch the sample video on the VMware Education website on the drag and drop “Visio style” design tool. The blueprint states there are six of these in the exam, and they do take a big chunk of time. Don’t get waylaid here, and keep an eye on the time. The demo can be obtained from http://mylearn.vmware.com/register.cfm?course=149330 and I think the video is around 10 minutes long.

So finally I hammered together some study notes in a couple of days (I didn’t get chance to do any study over Christmas. Well, too pissed most of the time anyway!). The notes were constructed around the beta exam blueprint, so may not perfectly match the final release. That being said, I will publish them to the community. Feel free to get in touch and let me know if you spot any errors or omissions. You may also distribute them; please just provide an acknowledgement if you re-publish any aspect of them. They’re not perfect, but hopefully they’ll aid your preparation.

I submitted a session on the VCAP-DTD for VMworld Barcelona, so if you haven’t voted yet, please do and hopefully I can put together a decent presentation on this there.

Good luck!

VCAP-DTD-Study-Notes

24-3-13

It’s been a week since I popped my duathlon cherry at the Oulton Park Spring Duathlon and now seems a good time and place to take stock on how the event went, how I did and where I go from here.

Firstly the event itself was amazing and kudos to Xtra Mile Events for putting on such a good show. The choice of venue was brilliant. I didn’t take a single photo regrettably, but take my word for it when I say that racking my bike in transition in the pit lane made the hairs on the back of my neck stand on end. And then walking out onto the grid for the start was just immense. I have to say there was a pretty decent turnout of spectators too, considering by the start time it wasn’t any more than 3 or 4C. The race itself was 2 laps running, 9 laps bike and 1 lap running. The track itself is 4.3km long, so it was an 8.6k/38.7k/4.3k split for a total of 51.6k or roughly 32 miles!

I think there were somewhere in the region of 230 entrants for the main race, which was also a national qualifying event. The good news for me was that, looking down the list of entrants, there seemed to be as many novices as there were total tri ninjas (you had to declare if you were a novice, so I did to save getting trampled or embarrassed by whippet thin tri professionals). I was also impressed that there was such a broad range of ages on show, from teenagers through to 50-60.

One of the main things with events such as these is not to be intimidated by people who show up with expensive carbon bikes and all the gear. Here I was with my Decathlon running shoes and my £200 road bike! At the end of the day, it’s about the duathlete, not about the gear you’ve got. By and large I didn’t feel like there was any snobbery about the event, there seemed to be a decent camaraderie.

So in total, I’d been training for the event for six weeks. By my reckoning, I was already reasonably fit and this would be a decent amount of time to get in the miles and be ready for the event itself. How massively wrong I was. It was quite chastening that I set off for the first two laps of running at my usual pace of around 10.5/11k an hour and just about everyone streaked past me. I’d tried to take on board some carbs before I started (a carb bar and some toast in the track restaurant) and some fluids, but I am prone to stitches, so I tried not to take on too much that would slow me down. Imagine my surprise to see competitors in the restaurant tucking into bacon and sausages before the start! I did wonder if they knew something I didn’t, but I couldn’t countenance the idea of running round with all that swilling in my belly.

I had some fluids on my bike, so I presumed that would be OK, I could drink while I rode. Let me tell you, you never realise how hilly these race circuits are until you’re running around them! The leaders were off and snaking into the distance even before I’d hit the 1K marker. It’s not a nice feeling to have bikes already flying past you when you’ve just completed the first run lap! I tried to tell myself to stay calm and run the race I’d trained for. And even though I hadn’t done much road cycling at all in preparation (stationary bike in the gym), I was confident that I would be OK and I could make up some ground there.

I had labels made up on my bike handles so I could keep track of how far I’d gone, as I was paranoid I wouldn’t do enough laps! So I got to the end of the first two laps running and I already felt dead and that there was no way I’d get to the end at this rate. There was just nothing in my legs at all. I’d had a virus a couple of weeks before the race, and even though I felt OK, it had obviously taken a lot more out of me than I thought.

The first transition was a case of me getting my running shoes off, bike shoes on and deciding whether or not to layer up in the cold. As I was burning up anyway, I decided to stay with the long sleeve base layer and tri singlet I had on. It proved to be a good decision as I never really felt cold on the bike.

Endorphins kick in at different times for different people. I’ve heard some say as little as 20/30 minutes. In my case it’s a lot more than that, I’d say around 45 minutes, which was more or less when the bike phase started. I felt pretty good on the bike and I was lapping 9 and 10 minute laps, which looking at the results, was reasonably par with the rest of the field and considering I’d done so little cycle training, pretty good.

The 9 laps on the bike went by reasonably quickly, and I was very happy with the climbs I was doing, I seemed to pass a few people along the way. It helps that you can pretty much gun it all the way around, I can’t recall ever touching the brakes on the way round. I tore off my last lap label and headed to T2. At this point, my troubles were just about to start. I only had one drinks bottle on my bike and I’d sucked it all down during the ride. I had a gel pack, but I was dubious of using it as I’d never used them before and I’d read some horror stories in forums about getting stomach cramps with them as they divert water from your intestine (I don’t know if this is true or not, or an urban myth).

I dismounted my bike before the pit lane, as per the rules. My legs were very heavy, but I expected that. However, once I’d pulled on my running shoes and tried to stand up, I realised I was worse off than I thought. I started to run towards the pit lane exit to rejoin the race, but I was absolutely parched. I had my gel pack and I also knew there was a water station at the end of the pit lane. I grabbed a cup of water and necked it, and then sucked out about a third of my gel pack. It tasted like shit. I just hoped it worked and didn’t give me cramps.

I tried to run but my legs were just numb. I just told myself to put one foot in front of the other and I’d be OK. Those kilometre marker points seemed a million miles apart! I must have looked like one of those Olympic Walkers, who do that funny kind of arse wobbling walk that isn’t quite a canter. Then the worst of it, I was about 100 metres short of the 2K marker board and my legs just locked out totally. I was wracked with cramp and felt like the muscles in my thighs were as big as tennis balls. I stopped briefly to stretch and to his massive credit, a guy who had just passed me and had started to streak away into the distance turned and offered to stretch my legs out. My friend, I don’t know your name, but you’re a fine human being. Pride and embarrassment prevented me from taking up his offer, but I managed to start again and before I knew it, I was more or less there. I crossed the line and just felt so relieved it was over. The photographer asked me to smile and I just wanted to angrily tell him to f**k off! I didn’t, and the finish picture actually came out pretty well.

In terms of my goals, I’d hoped for 2 hrs 30 mins and I finished in 2 hrs 41 minutes, so slightly outside of that. I didn’t finish last though, and I also beat my training partner, which was also my evil secret goal!

So then, having had a week to think about it, would I do it again? Absolutely, yes. It’s being run again in October, and I’ve no doubt I’ll have another go at it. What advice would I give others thinking of doing a duathlon? Firstly, have a go. While there are some pretentious arseholes in the field, they storm away from everyone else and you’ll soon forget about them. The vast majority are in as much pain as you, don’t forget that.

From a training perspective, 6 weeks is nowhere near enough training for physical exertion of this magnitude. If you were doing the sprint version of the race (5 laps bike, 1 lap run) you could probably get away with it. I’d recommend 3 months at least, 3 or 4 days a week training, with plenty of rest in between.

One thing I hadn’t factored in much was “brick” training. This is doing a run when you’ve just finished a long ride and your legs are screaming at you to bugger off and lie down. As the summer approaches (if it ever does here), I’m going to ensure I do a lot more bricks and get my legs used to it. The first run and bike laps were not bad time wise, but the last lap in agonising pain certainly buggered up my timings.

Get hydrated and stay hydrated. One bottle on your bike is not enough. Take at least two, you’ll need it. Maybe even consider a hydration backpack. I read a feature with Mark Cavendish that said if you wait until you are thirsty before you have a drink, it’s already too late. I’d like to think he knows what he’s talking about, so I’d bear that in mind.

Don’t be afraid to carb load before the race. I’m still not sure I’d advocate a full English before you start, but a sausage butty probably wouldn’t do you any harm a couple of hours before the start. Remember it’s 52K, so you need something in the tank. Having also used gel packs now, I’ll probably have one before the start and one at T1 next time, to give me that extra boost. Jelly Babies are probably a good idea too, they’re small and easily stowed, just try not to put them too close to your body where they just go manky!

I’m now having a big look at my training plan for the summer, with a lot more road biking, and I also need to drop another half a stone, I reckon. The power to weight ratio should be about right at that point.

In summary, if you’ve done 10K races and fancy something different, give duathlons a go. If you don’t fancy swimming in a cold duck pond with 200 others, a duathlon is a decent combination of run/bike over a testing distance. Be warned, it’s not a cheap habit (this event was nearly £50), but if you find yourself doing more, consider joining the BTF and save the £5 day licence fee you need to pay. If you’re not sure, dip a toe in the water with the sprint event, which is only a single run lap and 5 on the bike, which should be well within most folks’ compass.

Give yourself plenty of time to train and remember to do lots of brick training. That second transition is the key point of the race where you’re going to sink or swim. Looking forward to October and setting a new PB!

26-08-12

On the move again…

Well Wednesday was my last working day at Xtravirt and I’m preparing to move on to yet another employer (saying that, I’ve not had so many in my career that I’ve lost count!). From 3rd September, I’ll be working at the Marsh and McLennan group in Liverpool, doing various VMware related activities.

As always at times like this, I find myself going to great pains to say that there were no ulterior motives for me leaving Xtravirt. They’re a great bunch and very skilled, so I would without hesitation say that if you ever get chance to work with or for them, take it, you won’t regret it.

For my part, I’d bitten off a little more than I could chew from a lifestyle point of view. I’ve had fun travelling the UK and further afield over the last couple of years, but I realised that I wanted my own bed back and to spend more time at home now my kids are a little older and somewhat more “challenging”. Getting an opportunity in Liverpool is perfect, far enough away but close enough to home that I can commute and see more of one of our finest cities close up. I’m embarrassed to say I’ve been to Liverpool less than half a dozen times in my life.

It also occurred to me after the interview that I could do without that level of stress for a while, so I certainly don’t plan on making any further moves for quite a while. It’s about time I got to own something over a period of time. Consulting is great fun and really takes you out of your comfort zone, but in the end, you walk off into the sunset never knowing how “your baby” turned out. Hopefully that won’t be the case at MMC, I’ll see things start from an idea and go through to design, procurement, delivery and a full life cycle after.

Can’t say I’ll miss motorway service stations too much, but I’ve learned in this life to never say never. There’s a good chance I’ll be back out on the road sometime in the future, but not for now.

 

17-08-12

The joy of SETX!

So this blog posting is a relatively short one, but hopefully it will be useful to those doing some stateless desktop implementations. One of the big problems in a stateless desktop deployment is the issue of identity persistence. By that, I mean that when a user logs into a desktop, the stateless model is comparable to the metaphor of the “next cab off the rank”. We have no idea which desktop will be used and by whom, it’s something of a random process.

This can and does work well in a vast majority of cases – where profile management solutions are in use (such as AppSense, RES or vendor specific implementations from VMware or Citrix), we can inject some user specific information back into the environment at logon and/or application open/close which means that users are not materially impacted by the stateless model and can work as they normally would with all their apps and settings available, irrespective of which desktop they happen to pluck out of the pool.

Where this tends to come unstuck is where the idea of identity needing to be persistent or predictable comes into play. Let me use a real life example. The engagement I’m currently working on is for a customer whose entire business is underpinned by an application on a mainframe server. This is not uncommon for businesses with long histories, and to be honest, it generally “just works” as it’s a solution that’s been in play for a number of years (sometimes decades) and has been refined over time to become an indispensable business tool. That’s all well and good and worked like a charm in the days of green screen “dumb terminals” (which I’m sad to say I’m just about old enough to remember!) and even as Windows and other GUIs came in, we still had the ability to use terminal emulation software to connect back to the mainframe and carry on as usual.

In my particular case, the customer is using a suite of connectivity products called Hummingbird. There isn’t anything spectacularly exotic about this, but the issue then becomes one of identity. In the “fat client” world, it’s easy enough to configure a connection and save it locally, so no matter who logs on, the “All Users” profile means that the correct settings (including LU name, which is the specific slot on the mainframe reserved for this workstation) are always available and never have to be fiddled with.

The difference of course in a stateless environment is that we never really know which desktop a user will connect to the mainframe from. It could be Desktop01, or Desktop02 or even Desktop50. Because of this, we need a way to tie down the LU identity so we know that regardless of which user logs into a thin client and regardless of which virtual desktop they are given, that particular thin client will always use the same connection details to the mainframe.

I thought long and hard about this, and came up with several solutions which, while they worked, never felt truly elegant and required updating each time a new thin client was brought into the environment. When you’re talking about a couple of thousand devices, this design quickly becomes impractical. In this particular environment, we are using Wyse T10 devices which have a factory preset device name (string value starting with WT), which can be altered in the config mode to pretty much anything you like. As well as this, we’re using Citrix XenDesktop 5.6 and AppSense Environment Manager 8.2.

A colleague stumbled across a Windows utility called SETX.EXE. Apparently it’s been around for years, but no-one seems to have heard of it before. Essentially what it does is create an environment variable based on input from a registry key, arguments or a file. Citrix creates entries in the registry for where the desktop session has originated from, which are called ClientName and ClientIPAddress. What we did was to use SETX.EXE to read these values from the registry and store them in a custom environment variable.

What we then did was to copy the Hummingbird configuration files to a network share (one folder per thin client) and use an AppSense Environment Manager policy to copy the appropriate configuration files from the network to the virtual desktop using the User Logon node. The logic was basically thus:-

  • Use SETX.EXE to create the environment variable ClientName and populate it with the corresponding value from the registry
  • Copy \\share\configs\%ClientName%\*.hep to C:\ProgramData\Hummingbird\Connectivity\13.00\Profile\Startup

It’s as simple as it’s elegant and means it doesn’t matter how many thin clients we add or what we call them, as long as the share exists, is populated with the correct files and permissions are correct, when Hummingbird is started, the sessions will start automatically (hence the use of the Startup folder). Hint – to do this, you need to add the “-*” switch to the desktop shortcut.
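The copy step in that logic can be sketched in Python, purely as an illustration. In the real deployment the copy was done by an AppSense Environment Manager action, not a script. The function name `copy_client_profiles` and its parameters are hypothetical, and it assumes the ClientName value has already been captured.

```python
import shutil
from pathlib import Path


def copy_client_profiles(client_name: str, share_root: Path, startup_dir: Path) -> list[Path]:
    """Copy every *.hep file for one thin client into the Startup folder.

    client_name stands in for the value stored in the ClientName variable.
    Returns the list of files that were written.
    """
    source_dir = share_root / client_name
    if not source_dir.is_dir():
        # No folder on the share for this thin client, so nothing to copy.
        return []
    startup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for profile in sorted(source_dir.glob("*.hep")):
        # copy2 keeps timestamps as well as file contents.
        copied.append(Path(shutil.copy2(profile, startup_dir / profile.name)))
    return copied
```

On a virtual desktop this would be called with something like `os.environ["ClientName"]`, `Path(r"\\share\configs")` and `Path(r"C:\ProgramData\Hummingbird\Connectivity\13.00\Profile\Startup")`. Bear in mind that a variable written by SETX is only visible to processes started afterwards, not to the one that ran it.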

I know I’ve rambled a bit, but hopefully SETX.EXE can be a useful Swiss Army Knife tool you can store in your VDI deployment armory for future use!

27-06-12

LoginVSI Pre-Flight Checks

In the previous blog post, I discussed how LoginVSI can help benchmark your VDI or SBC environment and provide some performance metrics on where the performance bottlenecks are likely to occur when the solution is heavily loaded. As discussed previously, you’ll have the following components set up and configured:-

  • LoginVSI share (hosted on a Windows Server or Samba share, where the 20 concurrent connection limit of Windows 7 does not apply)
  • LoginVSI Launcher workstations (with the Launcher setup run in advance)
  • LoginVSI Target desktop pools (with the Target setup run in advance and Microsoft Office installed)
  • Active Directory script run to configure the required LoginVSI users and groups and add the Group Policy settings to those users (turns off UAC, amongst other things)
  • Ensure statistics logging is working properly on vCenter (assuming a vSphere infrastructure)

Once the environment has been configured and you have your pool of desktops spun up, it is recommended that all virtual desktops be left to “sit” idle for a while, so that they reach “steady state” before the tests commence. Steady state is essentially where all desktops have started, launched all start up services (anti-virus scanners, “call home” services or applications, Windows services) and disk activity has settled down to an idle tick, rather than thrashing as it does at start up. It’s worth bearing in mind that if all virtual desktops are on the same datastore, it may take several minutes for steady state to be reached, depending on disk latencies. In my particular tests, I had between 100 and 120 desktops spun up at once and I left the pool to sit for around 20 minutes before running any LoginVSI workloads.

How do you know if steady state has been reached? I used the vSphere client to look at CPU and memory usage of each virtual machine and waited until the utilisation dropped down to a minimum. After a few test runs, you will start to get an idea of where steady state is, as each desktop build is slightly different, depending on applications and services installed. It’s not imperative you do this, but if you read the white papers produced by the major VDI stack vendors (Microsoft, Citrix, VMware, NetApp etc.), you will find this is something they tend to do.
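If you’d rather not eyeball the charts, the same check can be sketched in a few lines of Python. This is only a rough illustration of the idea: the function name, the five sample window and the 5% threshold are all assumptions of mine, and you’d need to tune them to your own desktop build.

```python
from statistics import mean


def reached_steady_state(cpu_samples: list[float], window: int = 5,
                         threshold_pct: float = 5.0) -> bool:
    """Return True when the mean of the last `window` CPU readings
    (in percent) is at or below `threshold_pct`."""
    if len(cpu_samples) < window:
        # Not enough readings yet to make a call.
        return False
    return mean(cpu_samples[-window:]) <= threshold_pct
```

Feed it the CPU utilisation readings for a desktop (or the pool average) in time order and wait until it returns True before starting a workload.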

At this stage, it’s often prudent to perform a few test runs, just to ensure that everything is running as you expect. You can also use these test runs to perform some workload tuning, such as time delays between sessions starting. As discussed in the previous post, if you set this value too aggressively, you can saturate your hypervisor host very quickly, and this can negatively skew results. Plus, is this the reality of how your users will use your VDI environment? Is it likely that you will have 100 users logging in a near simultaneous manner in a three or four minute window? In most cases you’d probably say no. The obvious exception to this would be an educational environment, in which dozens (even hundreds in a University or College setting) of users would login at the same time and start several applications after login. In a commercial or non-academic environment, generally users login over a much larger time frame and even when they’re logged in, they are far more inclined to make long phone calls or make a coffee, resulting in significant periods of idle time.

As a tip, use the calculator built into the Management Console to compute the time delays between the number of sessions and make sure they represent “real life” numbers, such as a login every 6 minutes etc.
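The arithmetic behind a launch interval is simple enough to sanity check yourself. The sketch below is not how the Management Console calculator works internally (I don’t know that), it’s just the basic division, and the function name is made up for the example.

```python
def launch_interval_seconds(sessions: int, window_minutes: float) -> float:
    """Seconds between session launches if `sessions` logins are spread
    evenly across a window of `window_minutes`."""
    if sessions < 1:
        raise ValueError("sessions must be at least 1")
    return window_minutes * 60 / sessions
```

So 100 sessions crammed into a 4 minute window means a new login every 2.4 seconds, whereas spreading the same 100 over an hour gives one every 36 seconds, which is much closer to a typical office morning.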

During my testing with a customer, we would make a single environmental change and then analyse the results – for example, changing the amount of memory given to the virtual desktops (1.5GB vs 2GB, for example), or an extra vCPU, or a change to the underlying storage fabric. In this respect, LoginVSI can also be used to model environmental changes, a “what if” type of analysis. This can be especially useful if you are conducting a performance analysis of new storage to validate a vendor’s claims, or a “what if we add 20 more virtual desktops to this host” scenario.

VSIMax

The end goal is the result of the VSI Max, which is essentially the “tipping point” of performance. This is established in a way that I still don’t truly understand (and I read the explanation several times!), but in essence is calculated by capturing the delay intervals in between performing tasks in the target workload. There are embedded timers within the workloads that spawn activities such as reading Outlook messages, or playing a Flash video and the intervals between activities are randomised, so as to imitate real life usage. A baseline average response time is calculated and when delays increase, the VSI Max value is obtained. This value basically represents the maximum number of virtual desktops per host before performance significantly degrades.
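To make the “baseline plus tipping point” idea a bit more concrete, here is a deliberately simplified Python sketch. It is not the real Login VSI calculation. The function name, the five session baseline and the 1000 ms threshold are all assumptions chosen for illustration only.

```python
from statistics import mean
from typing import Optional


def tipping_point(response_ms: list[float], baseline_sessions: int = 5,
                  threshold_ms: float = 1000.0) -> Optional[int]:
    """Return how many sessions were running before response time first
    exceeded baseline + threshold_ms, or None if it never did.

    response_ms[i] is the measured response time with i + 1 active sessions.
    """
    if len(response_ms) < baseline_sessions:
        raise ValueError("not enough samples to calculate a baseline")
    # Baseline: average response time while the host is lightly loaded.
    baseline = mean(response_ms[:baseline_sessions])
    for session_count, value in enumerate(response_ms, start=1):
        if value > baseline + threshold_ms:
            # The previous session count was the last "healthy" one.
            return session_count - 1
    return None
```

A result of None means the host never hit the threshold during the run, in other words the tipping point is somewhere beyond the number of sessions you launched.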

In our particular test case, we were looking to achieve a density of 100 desktops per vSphere blade. This figure was reached after a capacity planning exercise – so VMware’s Capacity Planner was deployed to a bunch of workstations in a “knowledge worker” use case – users who generally have medium to high task demands – using Outlook to send messages, opening large spreadsheets, manipulating graphics intensive slide decks etc. As a result, 100 desktops was considered an appropriate density based on the Capacity Planner results and the specification of the hypervisor hardware.

The VSIMax validates the design of the solution and gives both the solution architect and the end users/customers confidence that the VDI solution is fit for purpose. The graphic below shows the output from three test runs that validate the design for 100 desktops. You will need to install the VSI Analyser to compare the results, using the Comparison Wizard:-

The comparison of three runs to demonstrate that the design scales to 100 desktops before performance suffers

Running The Tests

I’d recommend running at least three iterations of your test cycle to ensure a reliable result. What you should find is that the results are generally quite close together, and this way you can average out the VSIMax over the three runs of the test. That being said, on odd occasions you may see freak results (generally at the lower end of the performance spectrum) and it’s worth discarding such a result and performing another test iteration. This can happen for a variety of reasons, such as the pool not being in a steady state. Several simultaneous power cycle operations on a pool can also cause performance degradation.
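As a rough sketch of that averaging, the Python below drops any run that falls well below the median before taking the mean. The 20% tolerance and the function name are my own assumptions, not anything LoginVSI prescribes.

```python
from statistics import mean, median


def average_vsimax(runs: list[int], tolerance: float = 0.2) -> float:
    """Average the VSIMax values, ignoring any run that is more than
    `tolerance` (as a fraction) below the median of all runs."""
    if not runs:
        raise ValueError("at least one run is required")
    cutoff = median(runs) * (1 - tolerance)
    kept = [value for value in runs if value >= cutoff]
    return mean(kept)
```

If a run does get dropped, treat that as a prompt to perform another iteration rather than settling for fewer results.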

Analysing Bottlenecks

So let’s say you’ve built your solution to meet the needs of 100 simultaneous virtual desktop connections, but your VSIMax figure averages out well below that figure (worryingly so!). Where do you go from here? This is where performance of the hypervisor host comes into play. In our particular test, the hypervisor in use is vSphere. This is good because vCenter automatically collects performance statistics and stores them in the database, so we don’t need to babysit real time statistics to know where the bottleneck is, we can just look back at them in vCenter.

The main areas to look at first for performance bottlenecks include:-

  • Processor
  • Memory
  • Storage

There are other metrics we can look at, but it’s likely that in a high proportion of cases the bottleneck has been caused by one of the three main resources listed above. Looking at processor first, we can obtain graphs from vCenter for the lifetime of the test run (so please make sure you make a note of the start and stop times of the tests!). Export the information and select the processor, memory and datastore check boxes so we keep data to a minimum to start with.

CPU Performance

Performance output from vCenter for CPU performance during Login VSI tests

Looking at the graph above from vCenter, we can see variable saturation of processor resource. The main takeaway from this result is that CPU utilisation never exceeds ~65%, so we can see quite clearly from the off that CPU is not the limiting factor in this particular test scenario.

Memory Performance

To continue the investigation, we now need to take a look at the memory resource to see if this is the constraining resource. As we can see from the chart, again memory is not the issue. Although memory usage hovers close to the maximum, it stays a little below it.

Memory performance showing that utilisation is constantly under 24GB

20GB of physical RAM is available in the ESXi host, and as we can see by the performance chart, memory is heavily utilised for most of the test but does not max out. So taking into account CPU and memory performance during the testing, we have enough spare capacity in these resources to service 100 virtual desktops. We’re making good progress in ruling out the performance bottleneck, but we haven’t found it yet! Onwards to the datastore performance charts!

Datastore Performance


Looking at the performance charts for the datastore, we can clearly see an issue with performance straight away. The chart shows high latencies for both read and write performance. In the worst case we can see a latency of 247ms for write operations to one datastore in use.

Performance statistics showing a high disk read and write latency for the datastore

So the question here is, what is an acceptable disk latency? In broad terms, the following values are a reasonable rule of thumb :-

  • Sub 10 ms – excellent, should be the target performance level
  • 10-20 ms – indicates a problem, may cause noticeable application/infrastructure issues
  • 20 ms or greater – indicates unacceptable performance, applications and services such as virtual desktops will exhibit significant performance issues
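That rule of thumb is easy to turn into a quick check when you’re trawling through exported statistics. The Python below is only a sketch: the function name and the short labels are mine, and I’ve treated exactly 20 ms as falling into the top band.

```python
def classify_latency(latency_ms: float) -> str:
    """Map a disk latency in milliseconds onto the rule of thumb bands."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if latency_ms < 10:
        return "excellent"
    if latency_ms < 20:
        return "problem"
    return "unacceptable"
```

Run against the worst case figure from the chart above, `classify_latency(247)` lands firmly in the “unacceptable” band.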

Depending on your workload, you may well see spikes in performance at the storage level. These spikes can be acceptable as by definition they are sporadic and rare and generally do not impact long term performance. Microsoft lists acceptable disk latency spikes for SQL Server as 50ms, for example. I don’t know that I especially agree with this number, but they know SQL Server a lot better than I do!

Performance Conclusions

Looking at the performance charts, we can see that the disk is the bottleneck. The latencies at the disk level are quite severe, and would result in a much lower VSIMax value than what was originally planned for. If we can add bandwidth to the disk layer, we can improve the density of virtual desktops per hypervisor host. In this case, we had local SAS disks in a RAID1 configuration. Even though third party storage appliances were in use to try and improve throughput, the physical disks themselves could not sustain the level of performance required.

As such, the desktop pool was moved to SAN based storage, which was Fibre Channel storage on a NetApp device. One LUN was configured to host the desktop pool datastore, in a one to one relationship, as per best practices. As the storage now in use is enterprise grade, we would expect the disk latencies to be significantly reduced. As mentioned before, LoginVSI can be a really useful tool for modelling configuration changes and their impact, and this is a good example. We’ve already proved that CPU and memory are not fully utilised, and that the disk latencies are causing a lower than expected VSIMax value.

Datastore performance statistics when the desktop pool is moved to SAN storage from local disk

The performance graph for a virtual desktop datastore on the NetApp datastore shows a much reduced latency of (on average) under 1 ms. As stated previously, any latency under 10ms is excellent, anything sub 1 ms is jet propelled! Now we have identified and removed the performance bottleneck, our VDI solution will scale to the required number of 100, as per the original design. Obviously CPU, memory and datastore are only a subset of the possible performance metrics we could have obtained, but any bottleneck is most likely to be around those resources.

Also, we could look at such metrics as network, but we’d be most likely to look at those metrics if, for example, mouse movement was delayed or keystrokes were slow. In a LoginVSI test scenario, as the virtual desktops are designed to be “stand alone”, there should be minimal network traffic anyway.

Hopefully the two posts on LoginVSI have provided some guidance on how you can benchmark your environment, and also identify and rectify any bottlenecks that prevent you from scaling to the designed limits. I’d quite like to present this topic as a slide deck at a VMUG somewhere, sometime. Please let me know if that’s something you’d like to see!

15-05-12

As I mentioned in a previous post, I’ve spent the last few weeks working with a product called Login VSI. What does it do? Well, it essentially forms part of a virtual desktop deployment toolkit, in the sense that it helps to benchmark performance of a VDI or SBC (Server Based Computing, such as Remote Desktop Services/Terminal Server) environment and provides accurate end user performance metrics (OS and application response times) to outline the “tipping point” in performance of a VDI deployment.

For those who’ve already done several VDI deployments, you’ll already know the level of detail (and in some senses, educated guesswork) that goes into designing a solution. The types of questions posed include :-

- How many desktops do I need?

- How many IOPS do I need?

- How many physical disks do I need to provide the amount of IOPS?

- What sort of user metrics do I have from desktop assessment phases of the project?

- What are the requirements on the network fabric?

There are a lot more questions along similar lines, but all are important in the design of the solution to ensure it is fit for purpose. Once all numbers have been crunched, a design comes out of the other end that we hope will cut the mustard when it’s put into production.
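To give a flavour of the number crunching behind the IOPS and disk count questions above, here is a minimal sketch of a common rule-of-thumb calculation. The RAID write penalty values are widely quoted approximations rather than anything from this project, and the function names and sample figures are purely illustrative assumptions.

```python
import math

# Commonly quoted rule-of-thumb write penalties per RAID level (assumed values).
RAID_WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4, "raid6": 6}


def backend_iops(desktops, iops_per_desktop, write_ratio, raid_level):
    """Estimate the IOPS the physical disks must deliver.

    Reads cost one backend I/O each; writes are multiplied by the
    RAID write penalty.
    """
    frontend = desktops * iops_per_desktop
    reads = frontend * (1 - write_ratio)
    writes = frontend * write_ratio
    return reads + writes * RAID_WRITE_PENALTY[raid_level]


def disks_needed(total_backend_iops, iops_per_disk):
    """Round up to the number of disks needed to supply the backend IOPS."""
    return math.ceil(total_backend_iops / iops_per_disk)


if __name__ == "__main__":
    # Hypothetical inputs: 100 desktops, 10 IOPS each, 50% writes, RAID1,
    # and disks assumed to deliver 150 IOPS apiece.
    required = backend_iops(100, 10, 0.5, "raid1")
    print(required, disks_needed(required, 150))
```

In practice the per-desktop IOPS and read/write split should come from the desktop assessment phase rather than guesswork.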

Login VSI can help in this instance because it simulates users logging into the SBC/VDI environment and performing tasks expected of end users. As such, there are several pre-defined workloads that can be used to simulate real life examples. For example, the medium workload (which comes with the free licence) simulates a user logging in, browsing their Outlook mailbox, manipulating a Word document, PowerPoint presentation, Excel spreadsheet, PDF document, ZIP archive and website browsing with a Flash component (Kick-Ass trailer, which is a very funny movie if you haven’t seen it already!). Timers are built into the process to simulate random wait times when a user drinks coffee, sends a text or talks to a colleague, for example. There’s nothing so random as a human being, so it’s not precise but it does represent a “scattered” workload as you’d see in reality.
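The idea of random wait times between actions can be sketched in a few lines. To be clear, this is not how Login VSI itself is implemented; the action names, wait bounds and function name are all assumptions made up for illustration.

```python
import random


def simulated_think_times(actions, min_wait_s=2.0, max_wait_s=30.0, seed=None):
    """Pair each action with a random pause, mimicking a distracted user."""
    rng = random.Random(seed)  # a seed makes a run repeatable
    return [(action, rng.uniform(min_wait_s, max_wait_s)) for action in actions]


if __name__ == "__main__":
    # Hypothetical action list loosely based on the medium workload description.
    workload = ["open Outlook", "edit Word document", "browse website"]
    for action, pause in simulated_think_times(workload, seed=1):
        print(f"{action}, then wait {pause:.1f}s")
```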

LoginVSI Architecture

The refreshing approach from Login VSI is that you don’t need to spin up a SQL Server to capture your performance metrics and environment configuration (don’t you just get tired of having to commission a SQL box every time you need to fart?). This means that as well as reduced initial cost, the complexity is lower and the time to be up and running is reduced. All you need to provide are four elements :-

- LoginVSI Share (can be anywhere on the network, but must be reachable and writable by all devices used in the test)

- Login VSI Launcher (Windows machine that can be physical or virtual, which essentially performs the logins and spawns the test workloads)

- Login VSI Target (Windows machine that has MS Office pre-installed, along with some other tools such as Flash Player, BullZip, Internet Explorer)

- Active Directory (a Login VSI OU is created, along with a Group Policy Object and some scripts that get copied into the NETLOGON share)

The good news is that you don’t need to rummage around dusty corners of the internet to get these tools; each of the four parts above comes with its own installer. A handy graphic lifted from Login VSI’s website below illustrates the simple architecture of the product :-

Login VSI Environment Overview

LoginVSI Configuration

One thing that caught me out was that my VSI Share was on a Windows 7 machine. This would be fine on a very small scale, but Windows 7 shares do not permit more than 20 simultaneous connections. When that limit is hit, Login VSI exhibits the behaviour that the target session starts and the user logs in, but the desktop just sits there and does not spawn any application sessions. This had me confused for quite a while as there are no error messages as such. If you go to one of the stalled desktops, unlock KidKeyLock (by typing vsiquit) and type in the UNC path of the VSI Share in Start | Run, you will see an error about the number of concurrent connections to a Windows 7 share. Save yourself a lot of time and put the VSI Share on a Windows server or Samba share!

In a VMware View or XenDesktop environment, you need to run the target setup routine on your master image before you spin up a desktop pool/catalog. This ensures that all of the desktops to be tested have all the appropriate software installed. You also need to ensure you have Microsoft Office installed in advance. Any version from 2003 upwards is fine, but if you’re testing Office 2007, it’s recommended to install SP2 beforehand, as there are some known issues with Outlook that are resolved by this patch.

Once you have your VSI Share, your launcher workstation(s) (each launcher will take a maximum of 50 targets, though my testing tended to work better with a maximum of around 35) and your targets, you’re pretty much set. The next stage from here is to configure your environment using the Management Console. The main points of interest here are configuring the launcher names and configuring the workload settings, such as workload type (light, medium etc.) and also peripheral settings such as the Microsoft Office version (if the wrong version is listed, this can prevent the automated workload from running successfully). The management console itself is pretty straightforward and self explanatory.
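The launcher and share limits mentioned above lend themselves to a quick sanity check before a test run. The sketch below is only that: the function names are hypothetical, and the assumption that every target and every launcher holds one connection to the share is mine, not something taken from the Login VSI documentation.

```python
import math

WINDOWS_7_SHARE_LIMIT = 20  # concurrent connection cap described in the text


def launchers_needed(target_sessions, sessions_per_launcher=35):
    """Launchers required, defaulting to the ~35 sessions that worked well for me."""
    return math.ceil(target_sessions / sessions_per_launcher)


def share_can_cope(target_sessions, share_connection_limit):
    """Assume one share connection per target and per launcher (an assumption)."""
    connections = target_sessions + launchers_needed(target_sessions)
    return connections <= share_connection_limit


if __name__ == "__main__":
    print(launchers_needed(100))                        # 3
    print(share_can_cope(100, WINDOWS_7_SHARE_LIMIT))   # False
```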

Configuring the connection string, number of sessions etc.

The screen shot above shows the test configuration. This is where the workload type is selected (Light, Medium etc.) and also connection settings to the VDI environment. As you can see from the above screen shot, Python is being used to connect to a Citrix XenDesktop web interface. This is because the login screen for Web Interface had been customised, and the Citrix connector for Login VSI could not recognise buttons on the screen such as login and selecting the available desktops. Citrix themselves provide some Python scripts to provide connectivity and these work just fine. In a View environment, the existing Login VSI connector would probably work just fine, as would a “vanilla” XenDesktop environment.

The next step before actually getting to the testing phase is to define your launcher machines (use the Windows NetBIOS name, rather than a DNS name or IP address, or you’ll likely see a few errors) and configure what settings you want for the workload itself. In my experience, the only setting you really need to look at is the Office version string, so 14 for Office 2010, 12 for Office 2007 and 11 for Office 2003. The screenshot below illustrates the settings.

Defining the launcher workstations in the Management Console

Defining the workload settings in the Management Console, including the Office version string.
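If you script any of your test preparation, the Office version strings mentioned above are easy to get wrong, so a small lookup can help. This is just an illustrative sketch; the dictionary and function names are hypothetical, and only the three version mappings come from the text.

```python
# Office release -> version string, as listed in the text above.
OFFICE_VERSION_STRINGS = {"2003": "11", "2007": "12", "2010": "14"}


def office_version_string(release):
    """Return the version string for an Office release, or fail loudly."""
    try:
        return OFFICE_VERSION_STRINGS[str(release)]
    except KeyError:
        known = ", ".join(sorted(OFFICE_VERSION_STRINGS))
        raise ValueError(f"unknown Office release {release!r}; known: {known}") from None


if __name__ == "__main__":
    print(office_version_string(2010))  # 14
```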

Further Steps

You also have the option of creating custom workloads, but this is not something I have experience of and, to be honest, not something I really had a need to use. If you just need some general benchmarks from your VDI environment, the Medium workload is recommended and used by most vendors when they produce performance white papers for their VDI solution (see Microsoft, Citrix and EqualLogic for examples).

At this stage, I’m not going to get too invested in the nuts and bolts of how the whole process works, but needless to say if you’ve got this far, you’re pretty much ready to go. None of the workloads require access to the internet, nor do they require a connection to an Exchange server or any other network location. All workloads are fully isolated and self contained. If you’ve done all the setup and configuration successfully, you’re now at the stage where you can actually run some tests. Consult the Login VSI documentation for session specific settings, such as number of sessions, time delays between starting sessions (try and make this value sensible, so you don’t saturate your VDI hypervisor within a few minutes of starting the test, although if you’re simulating an academic environment, this may be important to you).
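To pick a sensible delay between session starts, one simple approach is to decide how long the whole ramp-up should take and divide it evenly. This is a back-of-the-envelope sketch with a hypothetical function name, not a Login VSI setting or formula.

```python
def launch_interval_seconds(sessions, ramp_up_minutes):
    """Evenly space session starts across the chosen ramp-up window."""
    if sessions < 1:
        raise ValueError("need at least one session")
    return (ramp_up_minutes * 60) / sessions


if __name__ == "__main__":
    # Hypothetical example: 100 sessions ramped up over 50 minutes.
    print(launch_interval_seconds(100, 50))  # 30.0
```

A shorter window gives you more of a logon storm, which is the effect you would want if you were modelling something like a classroom all logging in at once.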

Once you’re ready to start the sessions, you should have the launcher agent running on your launcher workstations (a command prompt box that pings the VSI share for work to do) and all target machines spun up and ready to be logged into. In part two of this blog, I’ll tell you more about how to interpret the results. Stay tuned!

04-05-12

I’ve been in my new role with Xtravirt for seven weeks or so now, and it’s interesting at this point to take a quick checkpoint and look at what I’ve observed already. I’ve been in the End User Computing game for some 14 of the 16 years I’ve been in the IT industry; in fact, it wasn’t even called EUC back then. I think the euphemism was something like “bloody end users!”, but of course we live in enlightened times these days, and we have to give them a more businesslike moniker.

The point is that EUC now is primarily a virtualised environment. Since I started my new role, I’ve been exposed to XenDesktop, XenApp, Login VSI and a whole raft of other tools. The interesting thing about the “new” EUC space is that it forces you to turn traditional desktop approaches on their heads. For example, Windows is written based around the fact that it usually has a local disk all to itself, that it can thrash to its heart’s content, without having to worry about running into other resources trying to land grab from it. In a virtualised EUC environment, this is no longer true. There can be several dozen virtual machines booting at once, logon storms, anti-virus scans and a whole batch of other processes going on simultaneously that are competing for the same disk resource.

Additionally, in the days of server consolidation (the phase 1 of mainstream virtualisation, if you will), capacity planning tools such as VMware Capacity Planner or PlateSpin Recon would be set up to capture performance metrics from physical servers, to essentially baseline what CPU, disk and memory resource was being used, so that the virtual equivalent could be appropriately sized for performance. In the EUC space, this is no longer sufficient. As well as capturing the previous metrics, we also need to look at additional detail based around the end user experience. If logon to a server console is slow, generally no-one but the admin would notice, and as frustrating as it might be, it’s generally tolerated and goes unreported. In the EUC world, when several dozen users log on at the same time and experience is degraded, IT will get to know about it pretty quickly.

As such, the likes of Login VSI help to determine the performance of the EUC experience using real world examples such as Outlook, Flash and manipulation of large spreadsheets. Traditional capacity planning tools are very useful for obtaining basic figures on specifications, but lack the insight to analyse application performance and the impact on a virtual desktop environment.

Away from such matters, it’s also interesting to look at applications. As I remarked at a BrainShare event presentation several years ago (before iPads and VDI in 2007), the apps drive the platform, not the other way around. Generally, users don’t care if it’s Mac, Linux, Windows, iPad, Android or Etch a Sketch, as long as they can get access to their line of business applications in a usable manner. The underlying layer of the OS generally just becomes another commodity item. I didn’t think I was being particularly visionary back then, just a pragmatic view based on the way I approached things as an end user.

Whilst enterprise applications such as Microsoft Office come with tools for the virtual environment, many core business applications are written in house and are proprietary to the business. As such, they tend not to have enterprise deployment tools, extensive user communities or knowledge bases, and are written on the “good enough” principle. Again, these apps are written with the assumption that the endpoint is a largely static thing, that the hostname doesn’t change and that it never moves around the network or across continents. In the virtual EUC space, this is no longer true and we must now be creative in fooling the app into thinking it’s still living in the traditional desktop environment.

It’s been seven weeks of change, steep learning curves and a change of thinking, but I’m enjoying every minute and it’s certainly the challenge I was hoping for.

13-3-12

Yesterday was my last day at NDS8. I can’t believe it’s been a little over two years since I decided to have a go at life on the road in consulting. It’s been a fun and tiring time, and I’ve learned so much. Not just about technologies, but about project management, business process, other people and mostly myself. I’ve managed to get customers out of some of the strangest of scrapes. I’ve worked late and travelled during my own time at weekends, just so I can turn up bright eyed and bushy tailed at 0900 on Monday morning so that the customer is left with a positive impression.

So why the move? Well it’s something I’ve been thinking about for a couple of months and an unbelievable opportunity came up that I simply could not turn down. From Thursday, I will be employed by Xtravirt Consulting, who are VMware’s EMEA Consulting Partner of the Year 2011. As I remarked to them at my interview and without being sycophantic, this to me is the Champions League of virtualisation jobs. It will take me in a direction far beyond where I am now, making me a far better consultant and giving me some real cutting edge technical skills.

It’s no great secret that the Novell space is shrinking and I’ve been really into virtualisation for years, cloud in the last 6/9 months. I’ve had the privilege of attending VMworld EMEA, and just seeing what a vibrant community of partners, vendors and customers has sprung up around VMware just made me want to be a part of it. It’s where all the innovation is happening right now, and the pace of change is electric. It also means I’ll get a lot more involved in things like storage design, which up until now, had only been at a high level.

In this business, if you fail to evolve you soon get left behind and I’m determined that won’t happen to me. I’m excited, nervous and slightly intimidated about my new role, but that’s how it should be as it will put me on my toes and keep me there!

 

08-11-11

For those looking for it, here is my response to Virgin Media’s “Did we answer your question? How are we doing?”, which was too long to publish on Facebook, Twitter or anything else pithy :-

 

It’s usually a joy to ask a question as one finds the customer service representative doesn’t read it before ploughing headlong into an answer. I’d love to engage them in some badinage and spend three hours re-iterating my initial (simple) question, but I worry life is too short. I still feel that offshoring customer service to India is a fundamental mistake.

They’re polite and courteous, but don’t understand spoken English nuances and it’s a problem that won’t go away until CS returns to the UK. You may think you’re saving company pennies doing this, but it doesn’t help customer loyalty and despite everything, has not improved customer service. I’d wager your budget would be better spent hiring a few of the millions unemployed in the UK to provide a better service at probably the same cost.

Of course I open myself to accusations of racism, but we know in our heart of hearts this isn’t true. It’s generally an excuse peddled by the lazy, who don’t wish to engage in meaningful discussion on the topic. I’ve been to India and they’re wonderful people with a fantastic work ethic but offshoring customer facing operations has been fundamentally flawed from the outset, a fact only a handful of UK companies have grasped.

I’ve opined similar sentiments on previous occasions only to be met by a standard corporate response. I’m happy you’ve read this far, so there is no need really to follow up further. Hopefully it’s entertained as well as informed. This is my raison d’etre!

 

01-11-11

So the week before last was spent in the delightful (but blinking expensive) city of Copenhagen. I was there to attend my first VMworld, and what an interesting week it was. I’ve been fortunate enough to attend a couple of Novell BrainShare events (well, four in fact), but the scale is the complete opposite. I don’t think I’m inviting any criticism by saying that the two vendors are at opposite ends of the spectrum – Novell’s fortunes have been on the wane for several years, whereas VMware is an industry (and Wall Street) darling on the relentless rise quarter after quarter.

The purpose of the visit was really twofold. Firstly to network as a VMware partner, to see how we can really go out to market and sell VMware solutions and secondly with my technical head on, trying to re-learn my VMware skills that have gotten pretty rusty over the last couple of years. As such, it was a pretty good success. The problem with any event that has 7,000 delegates is that you’re always going to have a hard time spending any “quality time” with the people who have the answers. Getting the opportunity to talk to the subject matter experts in the exhibition hall is not an easy task!

The main thrust of the event was to showcase VMware’s cloud message. I used to hate the expression and all it stood for, as it stands for everything and nothing. Ask 10 vendors their take on cloud computing, and you get 10 totally different replies. In VMware’s case, it’s been a clear, concise and consistent message for the last couple of years. To them, cloud computing is essentially an agile environment where IT is turned into something akin to utilities such as water, gas and electricity.

In the VMware world, you can build an internal cloud that can expand and contract as load and business requirements fluctuate, or you can offload all of your VMs to a public cloud, such as the one from vCloud partner Colt, or you can have a Hybrid Cloud of the two, where you keep all your important business IP and services on premise, but when occasional spikes in demand occur, you can spin up some extra capacity at a vCloud partner, but still have it as part of your infrastructure. Once the spike falls, you tear down the VMs in the public cloud. I must admit, the more I’ve been thinking about this idea, the more I like it. As I said, it’s been the same message for the last couple of years and it’s an easy one to conceptualise.

On a slightly different track, I went along to a Scott Lowe session for the first time. He’s well known in the VMware space and has written numerous books and articles. What made his session interesting was that for once, it wasn’t too preachy or numbers driven. It really followed the process of how to design a vSphere implementation from the ground up. Rather than concentrating on raw numbers (which many do), he turned it around completely and asked us to look at it from the business perspective. I really liked this approach, as even now, technology solutions end up being a square peg in a round hole and the business has to flex to fit them, rather than it being a business based solution that technology can help with.