Virtual Shared Hardware Accelerated 3D Gaming
This year at VMworld 2012 we ran a 3D gaming lab in the Green Room for the Hands-On Labs as a fun way to show off VMware's new vSGA (Virtual Shared Graphics Acceleration) technology. vSGA was introduced in vSphere 5.1 and will be available through View with our next major release, due out in the first half of 2013.
VMworld in San Francisco was our first attempt at 3D gaming. It took place before vSphere 5.1 became generally available, and we were using an alpha build of View. We (myself, Todd Dayton and Tommy Walker) put the whole environment together in the week leading up to VMworld using loaner hardware donated by SuperMicro, along with internals for those hosts donated by Randy Keener and Nick Geisler, endpoints donated by Dell/Wyse, and Nvidia Quadro 6000 GPUs donated by Aaron Blasius. Aaron Blasius' and Warren Ponder's teams were invaluable in getting us the bits and new fixes along the way to improve performance.
Possibly the hardest part of setting up the "gaming" lab was finding games that would actually *run* on ESXi. Many games would simply dump us back to the desktop with no error (this was particularly true for any racing games we tried). We assume this is because those games are either looking for a supported GPU directly or are trying to use a specific feature on a specific GPU. vSGA supports DirectX 9 and OpenGL 2.1 (?), so it is also possible that some games are trying to use features that we don't support at this time.
For VMworld 2012 San Francisco we settled on two games that had decent performance in the short time we had to try and get things running. Those two games were Minecraft and Borderlands. We ran 6 sessions of each game for a total of 12 desktops across two hosts with a single Nvidia Quadro 6000 in each host. We found, at that time, that we couldn't get adequate performance at resolutions over 800x600, but a late VMtools build got our frame rate up to an average of ~25 fps at 800x600 resolution. We were happy at this point because it was a vast improvement over previous performance.
Here is a very short clip showing the VMworld 2012 San Francisco stations.
After the show was over and we started shutting down stations, we found that the Minecraft sessions were crushing the Nvidia Quadro 6000 GPU: one session alone could use as much as 76% of the GPU at times. This meant that we were most likely bumping into GPU performance limitations that impacted frames per second (FPS) on the Minecraft sessions. The Borderlands sessions used far fewer GPU resources, with a single session generally peaking at no more than 35% of a GPU. Still, this shows that GPU utilization varies widely with the application being used and needs to be tested for any given use case.
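To put those two peak numbers side by side, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every session on a GPU hits its observed peak at the same moment and that six sessions of one game share a single GPU. How the sessions were actually spread across the two hosts may have differed, so treat the output as a worst-case bound and not a measurement.

```python
# Peak single-session GPU utilization reported above (percent of one Quadro 6000).
PEAK_GPU_PERCENT = {"Minecraft": 76, "Borderlands": 35}

# Hypothetical layout for illustration: six sessions of one game on one GPU.
SESSIONS_PER_GPU = 6


def sessions_without_contention(peak_percent: int) -> int:
    """Sessions that fit on one GPU if every session peaks at the same time."""
    return 100 // peak_percent


def worst_case_demand(peak_percent: int, sessions: int) -> int:
    """Total demand, in percent of one GPU, if all sessions peak together."""
    return peak_percent * sessions


for game, peak in PEAK_GPU_PERCENT.items():
    fit = sessions_without_contention(peak)
    demand = worst_case_demand(peak, SESSIONS_PER_GPU)
    print(f"{game}: {fit} session(s) fit at peak; "
          f"{SESSIONS_PER_GPU} sessions could ask for up to {demand}% of one GPU")
```

In practice sessions rarely peak in lockstep, which is why measuring the real workload matters more than this kind of estimate.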
Fast Forward to VMworld 2012: Barcelona
Fast forward a month and we were in Barcelona trying to set up the same environment with different host hardware. In the meantime the engineers had been given our feedback and had been looking into what else might be causing performance bottlenecks.
Simon Long was able to get Counter-Strike: Source running in a VM on ESXi and brought it along on an external drive that we added to the environment. At first we seemed to be bumping into some of the same limitations and could only run the game at 800x600 at a fairly consistent 30 FPS, with a recurring drop in FPS every few seconds that lasted about one second. Warren Ponder happened to drop by and passed this information to Lawrence Spracklen, who was able to turn around and provide us with a change to an advanced setting that removed this bottleneck. That was when we decided to start trying higher resolutions, and the result is what you see in the video at the top of this post: Counter-Strike: Source running at 1920x1080 at an average of 30 FPS, using View and PCoIP, delivered to Dell/Wyse P25s.
Leaps and Bounds From Where We Started
Here is a clip of our initial test after enabling the setting that cured our performance issues.
Finally, here is a short clip of the whole process, from connecting to View through the P25 to actually playing Counter-Strike: Source.
This effort, while short in time, involved so many different individuals that I haven't even called them all out (sorry to anyone whose name I missed). It's funny how gaming piques everyone's interest. ;-) Still, it was wonderful to have the help of so many different people and get so much feedback. I truly believe this 3D gaming effort has helped springboard the shared virtual 3D effort within VMware and will ultimately benefit our customers in their critical business use cases. Now that people have seen what is possible they are starting to approach me with real-world use case questions from their customers. That is probably the most exciting thing to come from this.
Thanks to Dino Cicciarelli for taking the lead to fund this "science project"! The outcome far outweighs the means to make it happen.
**** NEWS FLASH ****
Here is a new post from Lawrence Spracklen that notes just one setting that will help with View video performance. Because this setting is native to Microsoft Windows I don't believe it is something we will control through View or GPOs.
**** UPDATE ****
Forgot to add the juicy techno details that made this happen. I can't release too much around unreleased software unfortunately, but here you go:
- ESXi Host Information (all loaner gear)
  - San Francisco
    - Two (2) SuperMicro hosts - dual-socket 6-core Xeons @ 2.0GHz
    - 128GB of RAM per host (though the VMs didn't need more than 2GB each)
    - A single Nvidia Quadro 6000 GPU w/6GB of VRAM in each host (but the motherboard had four (4) PCIe x16 slots in it... not sure if the PSU could have handled four actively cooled GPUs though)
  - Barcelona
    - Two (2) Dell T620s - single 6-core Xeon E5-2640 @ 2.5GHz (motherboard was dual socket, but it only had one CPU installed)
    - 32GB of RAM per host
    - A single Nvidia Quadro 6000 GPU w/6GB of VRAM in each host (the motherboard had four (4) PCIe x16 slots in it, although only 2 slots per CPU socket. The PSU did not have PCIe power cables, so we had to get creative to make things work for the show... like I said, this was a "science project")
- Endpoint Client Devices
  - Dell/Wyse donated P25 Zero Clients with the new Tera2 chipset as well as Z90 dual-core Windows Embedded Thin Clients.
- vSphere / ESXi
  - San Francisco was an RTM build (I can't remember the build number)
  - Barcelona was the GA build (5.1.0-834536)
- View
  - San Francisco and Barcelona were the same build, but this was an alpha build that I am not able to list. However, I can say that this version will be labeled 5.2 and is due to release in the first half of 2013.
- Network
  - In each case we had a flat local 1Gbps network switch. We saw sessions reach as high as 70Mbps once we had them running at 1080p @ 30fps. This is due to the very high number of pixel changes each second at that resolution and framerate. Trying to get this type of performance is not a good use case for the WAN. ;-)
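
For a rough sense of why 1080p at 30fps is so demanding on the network, the arithmetic can be sketched in Python. Only the 70Mbps peak comes from what we observed; the 24 bits per pixel figure and the idea that a 1Gbps link offers a full 1000Mbps of usable capacity (no protocol overhead) are assumptions made for illustration.

```python
WIDTH, HEIGHT, FPS = 1920, 1080, 30   # resolution and frame rate from the post
BITS_PER_PIXEL = 24                   # assumption: uncompressed 8-bit RGB
OBSERVED_PEAK_MBPS = 70               # peak per-session bandwidth from the post
LINK_MBPS = 1000                      # assumption: 1Gbps link, overhead ignored


def raw_mbps(width: int, height: int, fps: int, bits_per_pixel: int) -> float:
    """Uncompressed video bit rate in megabits per second (1 Mb = 10**6 bits)."""
    return width * height * fps * bits_per_pixel / 1e6


raw = raw_mbps(WIDTH, HEIGHT, FPS, BITS_PER_PIXEL)
reduction = raw / OBSERVED_PEAK_MBPS
sessions_at_peak = LINK_MBPS // OBSERVED_PEAK_MBPS

print(f"Raw 1080p30 stream: {raw:.0f} Mbps")
print(f"Reduction needed to reach {OBSERVED_PEAK_MBPS} Mbps: about {reduction:.0f}:1")
print(f"Sessions at simultaneous peak on a {LINK_MBPS} Mbps link: {sessions_at_peak}")
```

Under these assumptions the raw stream would be roughly 1.5Gbps, so even a 70Mbps peak represents about a 21:1 reduction, and around 14 sessions peaking together would fill a 1Gbps link. Typical WAN links offer far less than that per user.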