I recently ran across an interesting issue in my ACI fabric and there was not that much information available about it yet, so I thought I would share it.
The issue was related to a load balancer configured in HA mode as a concrete device on the ACI fabric. When the load balancer failed over to the passive node, traffic going to the VIP would start failing, and it would take a couple of minutes for the VIP to become responsive again.
In troubleshooting the issue we looked at a packet capture and saw that immediately after the failover the load balancer sent out a gratuitous ARP, as expected. The leaf switch, however, showed no GARPs being received.
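When reading a capture like this, a gratuitous ARP can be recognized by the fact that the sender IP and the target IP in the ARP payload are the same address. Below is a minimal sketch of that check, assuming untagged Ethernet frames carrying IPv4 ARP. The function names `build_garp` and `is_gratuitous_arp` and the example MAC and VIP values are made up for illustration and are not part of any capture tool.

```python
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_FORMAT = "!HHBBH6s4s6s4s"  # 28-byte ARP payload for Ethernet/IPv4


def build_garp(mac: bytes, ip: bytes) -> bytes:
    """Build a gratuitous ARP request frame (6-byte MAC, 4-byte IPv4 address)."""
    eth = BROADCAST_MAC + mac + struct.pack("!H", ETHERTYPE_ARP)
    arp = struct.pack(
        ARP_FORMAT,
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        mac,          # sender MAC
        ip,           # sender IP
        b"\x00" * 6,  # target MAC left unset
        ip,           # target IP equals sender IP: this is what makes it gratuitous
    )
    return eth + arp


def is_gratuitous_arp(frame: bytes) -> bool:
    """Return True if an untagged Ethernet frame carries an ARP with sender IP == target IP."""
    if len(frame) < 42:
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_ARP:
        return False
    htype, ptype, hlen, plen, _op, _sha, spa, _tha, tpa = struct.unpack(
        ARP_FORMAT, frame[14:42]
    )
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return False
    return spa == tpa


# Hypothetical VIP and MAC of the node that just became active.
vip = bytes([10, 0, 0, 50])
active_mac = bytes.fromhex("005056aabbcc")
frame = build_garp(active_mac, vip)
```

Running `is_gratuitous_arp` over the frames seen on the load balancer side and on the leaf side is one way to confirm where the GARP stops.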
The default behavior of an ACI fabric is to do all learning via UDP unicast lookups in the endpoint database located in the spines, so there is no need to broadcast or flood an ARP. However, for things like HA on load balancers and firewalls, or OS-level clustering such as Microsoft Windows Failover Clustering or Linux Heartbeat, the fabric needs to be able to learn based on GARP. A GARP (Gratuitous ARP) is used by devices on the network to proactively update the ARP cache, letting other devices know that the location of a MAC address has changed (advance notification). For the fabric to learn endpoint moves via GARP, we need to enable two non-default features on the Bridge Domain (BD) associated with the End Point Group (EPG): “ARP Flooding” and “EP Move Detection Mode” (GARP Detection Mode). Below are screenshots of the settings I am referring to:
The first screenshot, enabling ARP Flooding, is from “Tenant>Networking>Bridge Domains>YOUR-BD”.
The second screenshot, enabling GARP based detection, is also from “Tenant>Networking>Bridge Domains>YOUR-BD”, but you then need to go to the L3 Configurations tab on the BD.
These screenshots are from an APIC running on the 1.2 codebase.
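The same two Bridge Domain settings can also be pushed through the APIC REST API instead of the GUI. The sketch below only builds the URL and JSON body. It assumes the `fvBD` attributes `arpFlood` and `epMoveDetectMode` as they appear in the APIC object model, so check them against your own APIC version (the API Inspector is handy for that) before relying on them. The host, tenant, and BD names are placeholders, and authentication is left out.

```python
import json


def bd_url(apic_host: str, tenant: str, bd_name: str) -> str:
    """URL of the Bridge Domain managed object (placeholder names, HTTPS assumed)."""
    return f"https://{apic_host}/api/mo/uni/tn-{tenant}/BD-{bd_name}.json"


def bd_garp_payload(bd_name: str) -> dict:
    """Body that turns on ARP flooding and GARP based EP move detection on a BD."""
    return {
        "fvBD": {
            "attributes": {
                "name": bd_name,
                "arpFlood": "yes",           # assumed attribute for "ARP Flooding"
                "epMoveDetectMode": "garp",  # assumed attribute for "EP Move Detection Mode"
            }
        }
    }


# Placeholder values; replace with your own APIC, tenant, and BD.
url = bd_url("apic.example.com", "YOUR-TENANT", "YOUR-BD")
body = json.dumps(bd_garp_payload("YOUR-BD"))
```

The body would be sent as an authenticated POST to `url`. The login step is omitted here because it depends on how your APIC is set up.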
I am very excited and thankful to be selected as one of the Cisco Champions. I look forward to contributing every way I can to the community and I look forward to working with all of the other Cisco Champions over the next year. I would also like to congratulate all of the other Cisco Champions for 2016.
I recently ran into an issue where my ESXi 5.5 hosts started randomly dropping off the network and the only way to get them back was to reboot them. After going through the logs, below are the findings of what happened and why:
2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: [bnx2x_attn_int_deasserted3:4816(vmnic0)]MC assert!
2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: [bnx2x_mc_assert:937(vmnic0)]XSTORM_ASSERT_LIST_INDEX 0x2
2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: [bnx2x_mc_assert:951(vmnic0)]XSTORM_ASSERT_INDEX 0x0 = 0x00020000 0x00010017 0x05aa05b4 0x00010053
2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: [bnx2x_mc_assert:965(vmnic0)]Chip Revision: everest3, FW Version: 7_10_51
2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: [bnx2x_attn_int_deasserted3:4822(vmnic0)]driver assert
2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: [bnx2x_panic_dump:1140(vmnic0)]begin crash dump —————–
2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: [bnx2x_panic_dump:1150(vmnic0)]def_idx(0xfbd2) def_att_idx(0xa) attn_state(0x1) spq_prod_idx(0xcf) next_stats_cnt(0xdd33)
2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: [bnx2x_panic_dump:1155(vmnic0)]DSB: attn bits(0x0) ack(0x1) id(0x0) idx(0xa)
This is a low-level driver crash caused by an MC assert, with no other hypervisor problems at the time. The bnx2x card then begins a crash dump and resets itself. The crash dump from the adapter contains data, but it appears to be useful only to Broadcom/QLogic. The end of the crash dump shows the card being reset.
I found out this morning that I was selected to the vExpert program for 2015. I was really surprised and elated that I was selected to be a part of this great group. I really did not expect to receive this great honor, but I look forward to contributing in every way that I can to the community and I look forward to working with all of the other vExperts over the next year. I would also like to congratulate all of the other vExperts for 2015.
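If you need to check a number of hosts for the same signature, the log lines above are regular enough to scan for. Here is a rough sketch, assuming the vmkernel log format matches the excerpt shown. The function name `find_mc_asserts` and the regular expression are my own and are not part of any VMware tool.

```python
import re

# Matches lines like:
# 2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: [func:4816(vmnic0)]MC assert!
BNX2X_LINE = re.compile(
    r"^(?P<ts>\S+)\s+cpu\d+:\d+\)<\d+>bnx2x: "
    r"\[(?P<func>\w+):(?P<line>\d+)\((?P<nic>vmnic\d+)\)\](?P<msg>.*)$"
)


def find_mc_asserts(lines):
    """Return (timestamp, vmnic) for every bnx2x line that reports an MC assert."""
    hits = []
    for raw in lines:
        match = BNX2X_LINE.match(raw.strip())
        if match and "MC assert" in match.group("msg"):
            hits.append((match.group("ts"), match.group("nic")))
    return hits


sample = [
    "2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: "
    "[bnx2x_attn_int_deasserted3:4816(vmnic0)]MC assert!",
    "2015-06-12T16:23:54.454Z cpu17:33604)<3>bnx2x: "
    "[bnx2x_attn_int_deasserted3:4822(vmnic0)]driver assert",
]
hits = find_mc_asserts(sample)
```

For a real log you would pass the open file object instead of `sample`, since the function only iterates over lines.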
I spent most of the day today troubleshooting an error that I was getting while configuring the vCAC appliance. This error had me, VMware support, and our consultant all scratching our heads. The error that we were getting was:
Invalid “Host Settings” in the remote SSO server. Expected: ssoservername.domain.dom:7444
As it turns out, the SSO server information that you enter is case-sensitive. We finally ran across a very good write-up about the issue and how to resolve it:
It would be very helpful if VMware would note this type of information in the configuration guides or at least provide that information to their support and professional services teams.
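To illustrate the failure mode, here is a small sketch that compares the value you typed against the value in the "Expected:" part of the error and tells you whether the two differ only by case. The function name `explain_host_mismatch` and the mixed-case example are made up for illustration.

```python
def explain_host_mismatch(entered: str, expected: str) -> str:
    """Classify how an entered SSO host setting differs from the expected one."""
    if entered == expected:
        return "match"
    if entered.lower() == expected.lower():
        # Same host and port, different capitalization.
        return "case mismatch"
    return "different value"


expected = "ssoservername.domain.dom:7444"
result = explain_host_mismatch("SSOServerName.domain.dom:7444", expected)
```

Here `result` is `"case mismatch"`, which is the situation described above.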
Have you tried to connect to an SMBv2 NAS appliance via CIFS, using a UNC path, from a Windows 2012 R2 client? If you have, you have most likely run across the Invalid Signature error. The reason for this error is that your NAS does not support SMB 3.0. SMB 3.0 added a feature called “Secure Negotiate”. This feature depends on the error responses from all SMBv2 servers being correctly signed. If the error responses are not correctly signed, the Workstation Service will immediately drop the connection. Microsoft added this feature to combat man-in-the-middle attacks.
There is, however, a way to disable this functionality to allow you to use an SMBv2 NAS. All you need to do is run the following commands on the Windows 2012 R2/Windows 8.1 client machines:
These settings will disable the requirement of the security signature and disable Secure Negotiate.
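For reference, the two settings can also be written directly to the registry. The sketch below is not a reproduction of the commands referred to above. It assumes that both settings live as DWORD values named `RequireSecuritySignature` and `RequireSecureNegotiate` under the LanmanWorkstation parameters key, which is my understanding of where the SMB client keeps them. It has to run elevated on the Windows client, and a restart of the Workstation service or a reboot may be needed before the change takes effect.

```python
# Assumed location of the SMB client (Workstation service) settings.
LANMAN_PARAMS = r"SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters"

# Assumed value names; 0 disables each requirement.
SETTINGS = {
    "RequireSecuritySignature": 0,
    "RequireSecureNegotiate": 0,
}


def apply_settings(settings=SETTINGS):
    """Write the DWORD values under HKLM. Windows only, needs administrator rights."""
    import winreg  # imported here so the module still loads on non-Windows systems

    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, LANMAN_PARAMS, 0, winreg.KEY_SET_VALUE
    ) as key:
        for name, value in settings.items():
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)


if __name__ == "__main__":
    apply_settings()
```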
We are now broadcasting to twitter on @vbootstrap.