How to enable the new security feature of BPDU blocking in vSphere ESXi 5.1 networking and why this is important.
Spanning Tree is a protocol that keeps the Layer Two network free of loops. The switches jointly calculate a loop-free topology and logically disable the links that would create loops, while making sure the fastest links stay up.
This also makes it possible to create redundancy at Layer Two (Ethernet) by building deliberate loops, which are logically shut down during normal operation. If a device fails or a cable is pulled, Spanning Tree brings up an alternative path to maintain connectivity.
The switches communicate through special frames called BPDUs. Following some very specific rules, they select one switch called the Root Switch, which is crucial for the stability of the network. What is important to be aware of is that Spanning Tree by default trusts every incoming BPDU, and that no authentication or other mechanism identifies who the sender of the BPDU frames really is.
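The election can be illustrated with a small sketch. In standard Spanning Tree the switch with the numerically lowest bridge ID (priority first, then MAC address) becomes the root. The switch names, priorities and MAC addresses below are made up for illustration and do not come from the lab setup in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bridge:
    name: str      # hypothetical label, not part of the protocol
    priority: int  # 32768 is the usual default
    mac: str       # bridge MAC address

    def bridge_id(self) -> tuple[int, int]:
        # Priority is compared first, then the MAC as a 48-bit number.
        return (self.priority, int(self.mac.replace(":", ""), 16))


def elect_root(bridges: list[Bridge]) -> Bridge:
    # The numerically lowest bridge ID wins the election.
    return min(bridges, key=Bridge.bridge_id)


switches = [
    Bridge("core-1", 4096, "00:1b:3f:aa:10:01"),
    Bridge("access-1", 32768, "00:1b:3f:aa:20:01"),
]
print(elect_root(switches).name)

# Nothing authenticates a new claimant: a lower bridge ID simply wins.
rogue = Bridge("rogue", 0, "00:00:00:00:00:01")
print(elect_root(switches + [rogue]).name)
```

The first call picks `core-1`; once the made-up `rogue` entry joins with priority 0 and a very low MAC address, it takes over the root role.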
Although it is not common, a malicious user could introduce fake BPDUs into the network and disturb or even destroy network connectivity.
We shall see how this unfortunately is possible even from a Virtual Machine running inside an ESXi host.
The virtual switches in ESXi do not have any Spanning Tree support. They never send BPDU frames themselves and do not process incoming BPDUs from the physical switch.
However, what if a VM sends fake BPDU frames out through the vSwitch? Here we have an ordinary VM, running Windows Server 2003, connected to a vSwitch on ESXi 5.1. Inside this VM a tool is installed that can generate BPDUs, pretending to be a switch.
From inside this VM we will now use this tool to send fake BPDU frames out to the network. These BPDUs are constructed to try to grab the role of Root Switch. With the Priority field set to 0 and the MAC address as low as possible, they have a very high chance of “winning” the Root Bridge role.
DO NOT USE THIS IN A PRODUCTION NETWORK.
In the physical switch Command Line Interface we can see that this frame was successfully passed to the network and that the VM is now acting as the Spanning Tree Root Switch. The ESXi host is connected to port A14, which now has the Root role. This could be used for denial-of-service, traffic sniffing or other unwanted security problems.
The network administrator could try to prevent this by enabling protection against incoming BPDU frames on some switch ports. This setting is called BPDU Guard on Cisco and BPDU Protection on HP Networking devices. It can be enabled on ports where we do not expect any other switches to exist. The setting interprets any incoming BPDU as a sign of either misconfiguration or an attack.
On the physical switch port A14, which attaches to the ESXi host, we will now enable BPDU Protection, which instructs the switch to shut down the port if any incoming BPDUs are observed.
Inside the VM the attack tool is used again. We see that we now lose the network connection, which is caused by the switch port going into disabled mode. The VM is no longer capable of disturbing the rest of the Spanning Tree topology in the physical network.
This can be verified on the physical switch, where we can see that port A14, connected to the ESXi host, is now disabled due to the unexpected BPDU frame.
This might seem to solve the problem, but it actually accomplishes only one goal: protecting the network from the VMs. All other VMs running on the same host are now also disconnected from the network. This means that a malicious user inside a VM could cause a very efficient denial-of-service attack against all other Virtual Machines on the same ESXi host.
In vSphere ESXi 5.1 and later we have a new option: enabling a feature called BPDU Guard on the virtual switch, which observes all frames sent from the VMs and simply discards all Spanning Tree BPDUs. No virtual ports are blocked; the vSwitch just silently drops all BPDUs, which VMs almost never have a valid reason to send.
NOTE: Even though some of the VMware Release Notes and publications for ESXi 5.1 call this function “BPDU Guard”, it is actually more similar to the “BPDU Filter” setting often found on physical switches. With BPDU Filter any incoming BPDUs are simply dropped, but the switch port is not disabled.
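The difference between the two behaviours can be sketched in a few lines. This is only a toy model: the `Port` class and its mode names are hypothetical, and it assumes BPDUs are recognised by the standard Spanning Tree destination address 01:80:C2:00:00:00, which is not necessarily how a physical switch or the vSwitch identifies them.

```python
STP_DEST_MAC = "01:80:c2:00:00:00"  # standard Spanning Tree group address


class Port:
    def __init__(self, name: str, mode: str):
        self.name = name
        self.mode = mode  # "guard", "filter" or "none"
        self.enabled = True
        self.dropped_bpdus = 0

    def receive(self, dest_mac: str) -> bool:
        """Return True if the frame is forwarded, False otherwise."""
        if not self.enabled:
            return False
        is_bpdu = dest_mac.lower() == STP_DEST_MAC
        if is_bpdu and self.mode == "guard":
            self.enabled = False  # the whole port goes down
            return False
        if is_bpdu and self.mode == "filter":
            self.dropped_bpdus += 1  # only the BPDU is discarded
            return False
        return True


guard = Port("physical-A14", "guard")
filt = Port("vswitch-port", "filter")
for port in (guard, filt):
    port.receive(STP_DEST_MAC)  # a VM sends one BPDU
    still_up = port.receive("ff:ff:ff:ff:ff:ff")  # then ordinary traffic
    print(port.name, port.mode, "still forwards traffic:", still_up)
```

After a single BPDU the guarded port stops forwarding everything, which is what disconnects every VM behind a shared uplink, while the filtering port keeps forwarding ordinary frames.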
Enable this on the ESXi 5.1 host on the Configuration tab, under Software – Advanced Settings. Under the “Net” path you will find the option Net.BlockGuestBPDU, set to 0 by default. Change this to 1 to enable it. It could take several hours before the value is actually applied; a reboot of the ESXi host will make it apply immediately. Any BPDU now sent from within the VMs is simply dropped at the vSwitch, but without any event being logged. It would be reasonable for the vSphere administrator to be informed in some way of unexpected BPDUs, but unfortunately this is not implemented yet.
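The same advanced option can also be reached from the ESXi command line. The sketch below only builds the `esxcli` command lines that should correspond to the GUI steps above, so they can be reviewed or pasted into a host shell. The helper function names are hypothetical, and the commands should be verified on a test host before being used anywhere else.

```python
import shlex

OPTION = "/Net/BlockGuestBPDU"  # advanced setting named in this post


def build_set_command(enabled: bool) -> list[str]:
    # 1 enables the BPDU block, 0 restores the default.
    value = "1" if enabled else "0"
    return ["esxcli", "system", "settings", "advanced", "set",
            "-o", OPTION, "-i", value]


def build_list_command() -> list[str]:
    # Shows the current and default value of the option.
    return ["esxcli", "system", "settings", "advanced", "list",
            "-o", OPTION]


print(shlex.join(build_set_command(True)))
print(shlex.join(build_list_command()))
```

Building the argument list separately from running it keeps the sketch harmless: nothing is changed on any host until the printed command is executed deliberately.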
The BPDU block in ESXi is a good feature that adds another layer of security and prevents a malicious VM from disturbing both the physical network and other VMs. It is not entirely straightforward to enable, but it only needs to be set once per host. This is also one of the few new features available on the Standard vSwitch as well as the Distributed vSwitch.
Excellent post, Rickard… ! (as always)
I can’t stop wondering why the blocking isn’t enabled by default, though…
/Rubeck
Thanks for your comment Kim!
I agree that it would be very reasonable to have this setting enabled by default, and the way to activate it now is not exactly obvious.
BPDU blocking not being enabled by default does follow the principle of “least surprise” when upgrading from 5 to 5.1
There are a few (perhaps rare) circumstances where you _do_ want a VM to be able to send and receive BPDUs (a software bridge might be an example).
Anyone upgrading from 5.0 to 5.1 would suddenly have their VM stop working properly, with no obvious explanation.
It is a valid point that keeping the defaults from earlier versions avoids surprising customers, even if, as you say, it should be very rare for any VM to need to transmit BPDU frames.
Perhaps a good way would have been to have the BPDU block enabled by default, but with an easily accessible checkbox in the GUI of the vSwitch.
Shouldn’t BPDU Filtering be used instead of BPDU Guard on the physical switch?
Claes
BPDU Filtering (making the physical switch port ignore any incoming BPDUs) could make sense. However, I am a bit worried that it would make it impossible for the physical switch to detect an internal ESXi loop, e.g. a Virtual Machine with two vNICs in bridging mode.
This could result in a real Layer Two loop breaking down the whole network if broadcast frames are passed into such a bridged VM and then forwarded back out. I would have to test whether ordinary Spanning Tree, without BPDU Filter, could detect and prevent such a situation, however.
It might be that any incoming BPDUs from physical switches are dropped instantly at the vSwitch and by that never able to travel through a bridged VM and back out – and if so the physical switch has less use to ever observe incoming BPDUs from ESXi hosts.
VMware still hasn’t completely solved the problem just by blocking BPDUs, because filtering outgoing BPDUs is not compliant with 802.1D bridging and leaves open the possibility of a loop through the vSwitch.
This could occur when a VM is configured to perform bridging between its virtual network adapters and a loop is accidentally created in which the bridging VM takes part.
Without the feature, the vSwitch is no worse a risk to the network than any other switch that doesn’t support STP. With the feature, the vSwitch is a greater risk to the network.
Accidental misconfigurations are much more common than internal DoS attacks leveraging spoofed STP/VTP/CDP/ARP frames, which require that a VM is already compromised.
VMware could have made the feature less of a liability and more of an improvement if, instead of a “silently block BPDUs” feature, it had implemented one that lets the admin select the VM interfaces on which BPDUs may be received, and that turns off the virtual machine’s virtual network adapter (Connected unchecked) if a BPDU, VTP, CDP, or ARP packet in violation of security policy is received on it.
Thanks for your comment James.
It is clear that there is more work to be done in the vSwitches before they can handle all possible events. While this new feature stops intentional attacks, it will not in any way prevent a loop created by a virtual machine with two vNICs in bridged mode.
Even though Windows Server does send BPDUs when configured for bridging, this cannot always be assumed to be true. A virtual machine could be in bridging mode and remain silent, but still pass frames through and destroy the Layer Two network in a broadcast storm.
One possible solution would be to let the vSwitch pass BPDUs from the physical network up to the VMs. Today incoming BPDUs are dropped too, which means that a bridging VM does not pass “physical” BPDUs through and so does not make the loop detectable from the outside.
Another issue is that vendor-specific loop protection does not work either. Such features often send frames with an unknown destination address to force these frames to be flooded, and thereby detect loops in “dumb” switches or hubs. However, such unknown frames are also dropped by the vSwitch and not distributed up to the VMs.
Perhaps the long-term solution would be a full (Rapid) Spanning Tree implementation in the VMware vSwitches?
Nice article, simple words and understandable, keep it up.
Sorry, we have installed two ZyXEL GS3700 switches. There are 3 ESXi hosts, but since we changed switches there have been some problems. I think that the problem is BPDU packets sent into the network, so I have now activated loop guard and BPDU guard on each switch. I hope that it will work.
Thank you for a good how-to.
Great post. Has anything been introduced in recent versions of vSphere to track the offending VM / vmnic?